Clustering is an unsupervised machine learning technique for discovering patterns in unlabeled data. Examples include grouping similar customers by their behavior, building a spam filter, and identifying fraudulent or criminal activity.
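To make the customer-grouping example concrete, here is a minimal sketch using scikit-learn's KMeans. The two customer features (monthly visits and average spend), the synthetic data, and the choice of two clusters are all assumptions made up for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical customer features: [monthly visits, average spend].
# Two synthetic groups, invented purely for this illustration.
occasional = rng.normal(loc=[2, 20], scale=[0.5, 3], size=(50, 2))
frequent = rng.normal(loc=[12, 150], scale=[1.0, 10], size=(50, 2))
customers = np.vstack([occasional, frequent])

# No labels are passed in: KMeans groups rows by distance to cluster centers.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(customers)

print(labels[:5], labels[-5:])   # cluster id assigned to a few customers
print(kmeans.cluster_centers_)   # one center per discovered group
```

The groups here are deliberately far apart, so raw features work. On real data the features would usually need scaling first, and the number of clusters is something you have to choose.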
Working with an imbalanced dataset can be a tough nut to crack for a data scientist. One way to deal with it is resampling with sklearn.utils.resample, i.e. upsampling the minority class or downsampling the majority class.
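As a rough sketch of both options, the snippet below upsamples and downsamples a toy dataset with sklearn.utils.resample. The 90/10 class split, the random features, and the variable names are assumptions for illustration only.

```python
import numpy as np
from sklearn.utils import resample

# Toy imbalanced dataset (made up): 90 majority rows, 10 minority rows.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.array([0] * 90 + [1] * 10)

X_major, y_major = X[y == 0], y[y == 0]
X_minor, y_minor = X[y == 1], y[y == 1]

# Upsampling: draw minority rows WITH replacement up to the majority size.
X_minor_up, y_minor_up = resample(
    X_minor, y_minor, replace=True, n_samples=len(y_major), random_state=0
)
X_up = np.vstack([X_major, X_minor_up])
y_up = np.concatenate([y_major, y_minor_up])

# Downsampling: draw majority rows WITHOUT replacement down to the minority size.
X_major_down, y_major_down = resample(
    X_major, y_major, replace=False, n_samples=len(y_minor), random_state=0
)
X_down = np.vstack([X_major_down, X_minor])
y_down = np.concatenate([y_major_down, y_minor])

print(np.bincount(y_up))    # class counts after upsampling
print(np.bincount(y_down))  # class counts after downsampling
```

In practice the resampling would normally be applied to the training split only, so that duplicated minority rows do not leak into the evaluation data.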
Conventional image identification using neural networks requires image labeling, which can be a tedious and, worse, expensive activity. Imagine running a company that has large datasets of images: every time we need to build an image identification algorithm for a particular kind of image, we need to label the images as ‘instance’ and ‘not instance’, for example ‘cats’ and ‘not cats’, or ‘dogs’ and ‘not dogs’. The problem is that there are many kinds of ‘not instance’.
Two heads, they say, are better than one. In many machine learning projects we want to harness that synergy through ensemble methods. The voting and stacking classifiers let us combine two or more machine learning models for higher predictive performance.
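A minimal sketch of the two ensembles, using scikit-learn's VotingClassifier and StackingClassifier. The synthetic dataset, the three base models, and their settings are assumptions chosen for illustration. Whether either ensemble beats its best base model depends on the data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    RandomForestClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary classification data (illustrative only).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Base models to combine; both ensembles fit their own clones of these.
base_models = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("knn", KNeighborsClassifier()),
]

# Voting: "soft" voting averages the base models' predicted probabilities.
voting = VotingClassifier(estimators=base_models, voting="soft")
voting.fit(X_train, y_train)

# Stacking: a final estimator learns how to weigh the base models' predictions.
stacking = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
)
stacking.fit(X_train, y_train)

voting_acc = voting.score(X_test, y_test)
stacking_acc = stacking.score(X_test, y_test)
print(f"voting accuracy:   {voting_acc:.3f}")
print(f"stacking accuracy: {stacking_acc:.3f}")
```

Voting is the simpler of the two, since it only averages. Stacking adds a second training step and is worth comparing against voting on held-out data before committing to it.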
Everyone talks about Lagos as the city where dreams are fulfilled and all aspirations are graciously met. That reputation has drawn people to the former capital in overwhelming numbers, and its 20 Local Government Areas are now home to over 16 million people.
Learning never really ends in data science. From the very first day one starts learning data science, through gaining some proficiency and eventually landing a job in the field, the learning continues. As one gets deeper into the art, the kinds of questions asked, one's interests, and so on may change, which also calls for a change in the channel or method of learning.