TLDR: Active learning is a type of machine learning where the computer asks a human for help in labeling new data. It's like a student asking a teacher for answers.
Active learning is a special type of machine learning where the computer can interact with a human (or some other source of information) to get labels for new data points. This is different from traditional machine learning where all the data is labeled beforehand. In active learning, the computer can ask the human for help in labeling new data points. This is useful when there is a lot of unlabeled data available but labeling it manually would be expensive or time-consuming.
There are different scenarios in active learning. In one scenario, called membership query synthesis, the computer generates its own examples and asks the human to label them. For example, if the computer is learning to recognize pictures of animals, it might generate a picture of a leg and ask the human if it belongs to an animal or a human.
In another scenario, called pool-based sampling, the computer selects data points from a pool of unlabeled data and assigns a confidence score to each point. It then selects the points for which it is least confident and asks the human to label them.
In stream-based selective sampling, the computer examines each unlabeled data point one at a time and decides whether to label it or ask the human for help based on its own evaluation of the data.
There are different strategies for selecting which data points to label. Some strategies aim to balance exploration and exploitation, meaning they try to choose examples that will help the computer learn more about the data while also making use of what it already knows. Other strategies aim to reduce the model's uncertainty by selecting examples that would most change the current model or reduce its generalization error. There are also strategies that use the concept of entropy or margin to select the most uncertain or least confident examples.
Active learning can be used in various domains, such as image classification, text classification, and anomaly detection. It can also benefit from crowdsourcing frameworks like Amazon Mechanical Turk, where many humans can be involved in the active learning process.
In summary, active learning is a type of machine learning where the computer can ask a human for help in labeling new data points. It is useful when there is a lot of unlabeled data available and manual labeling is expensive. Different strategies and scenarios can be used to select which data points to label.
See the corresponding article on Wikipedia ยป
Note: This content was algorithmically generated using an AI/LLM trained-on and with access to Wikipedia as a knowledge source. Wikipedia content may be subject to the CC BY-SA license.