LERS (Learning from Examples based on Rough Sets) is a rule induction system developed at the University of Kansas. The first implementation took place in 1988 in Franz Lisp. This early version of LERS had only one algorithm, LEM1 (Learning from Examples Module, version 1), to induce all the rules from the input data. The algorithm proceeds as follows (a minimal sketch in code is given after the steps):

Step 1: Divide the table `T` with m examples into n subtables (t1, t2, …, tn), one subtable for each possible value of the class attribute. (Repeat steps 2 through 8 for each subtable.)
Step 2: Initialize the number of attributes per combination, `j` = 1.
Step 3: For the subtable under consideration, divide the attribute list into distinct combinations, each containing `j` different attributes.
Step 4: For each combination of attributes, count the number of occurrences of attribute values that appear under the same combination of attributes in the unmarked rows of the subtable under consideration and, at the same time, do not appear under the same combination of attributes in any other subtable. Call the first combination with the maximum number of occurrences `MAX`.
Step 5: If `MAX` = null, increase `j` by 1 and go to step 3.
Step 6: Mark all rows of the subtable in which the values of `MAX` appear as classified.
Step 7: Add a rule to the rule set R (of the form IF attribute = "XYZ" THEN decision is YES/NO) whose left-hand side holds the attribute names of `MAX` with their values, joined by AND, and whose right-hand side holds the decision attribute value associated with the subtable.
Step 8: If all rows are marked as classified, move on to the next subtable and go to step 2; otherwise, go to step 4. If no subtable remains, exit with the rule set obtained so far.
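The following is a minimal Python sketch of the step-wise procedure above, not the actual LERS implementation; the function name `induce_rules` and the dict-based row representation are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

def induce_rules(rows, attributes, decision):
    """Sketch of the step-wise procedure above (illustrative, not LERS itself)."""
    # Step 1: one subtable per value of the class attribute.
    subtables = {}
    for row in rows:
        subtables.setdefault(row[decision], []).append(row)
    rules = []
    for label, sub in subtables.items():
        others = [r for r in rows if r[decision] != label]
        unmarked = list(sub)
        j = 1  # Step 2
        while unmarked and j <= len(attributes):
            best, best_count = None, 0
            for combo in combinations(attributes, j):  # Step 3
                # Step 4: value tuples occurring here but in no other subtable.
                other_vals = {tuple(r[a] for a in combo) for r in others}
                counts = Counter(tuple(r[a] for a in combo) for r in unmarked)
                for vals, n in counts.items():
                    if vals not in other_vals and n > best_count:
                        best, best_count = (combo, vals), n
            if best is None:  # Step 5: no discriminating combination found.
                j += 1
                continue
            combo, vals = best
            # Step 6: mark the covered rows as classified.
            unmarked = [r for r in unmarked
                        if tuple(r[a] for a in combo) != vals]
            # Step 7: emit IF <conditions> THEN <decision = label>.
            rules.append((dict(zip(combo, vals)), label))
    return rules
```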
An example showing the use of ILA (Inductive Learning Algorithm): suppose a set of examples with the attributes location type, weather, location, and decision, and seven examples; the task is to generate a set of rules stating under which conditions which decision is made. The generated rule sets take up less memory and train faster. The rule induction modeling operator accepts the training data and provides the rule set as model output. The rule set is a textual output of if-then rules together with accuracy and coverage statistics. Several settings are available in the model operator and can be configured for the desired modeling behavior. Rules are typically based on sets of attribute values, divided into an antecedent and a consequent. A typical if-then rule of the form "if antecedent = true, then consequent = true" is: "If a male employee is over 50 years of age and in a management position, then he has a supplementary pension plan." The support for such a rule is the proportion of tuples in the database that have the attribute values specified in both the antecedent and the consequent. The confidence of a rule is the proportion of tuples having the attribute values specified in the antecedent that also have the attribute values specified in the consequent.
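A minimal sketch of the support and confidence measures just defined, assuming rows are represented as Python dicts and conditions as attribute-to-value mappings (all names and data here are illustrative):

```python
def support_confidence(rows, antecedent, consequent):
    """Support: fraction of all rows matching antecedent AND consequent.
    Confidence: fraction of antecedent-matching rows that also match
    the consequent."""
    matches = lambda row, cond: all(row.get(a) == v for a, v in cond.items())
    n_ante = sum(1 for r in rows if matches(r, antecedent))
    n_both = sum(1 for r in rows
                 if matches(r, antecedent) and matches(r, consequent))
    support = n_both / len(rows) if rows else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence

# The pension-plan rule from the text, over toy data:
rows = [
    {"sex": "male", "age_over_50": True, "management": True, "pension_plan": True},
    {"sex": "male", "age_over_50": True, "management": True, "pension_plan": False},
    {"sex": "female", "age_over_50": False, "management": False, "pension_plan": False},
]
ante = {"sex": "male", "age_over_50": True, "management": True}
cons = {"pension_plan": True}
print(support_confidence(rows, ante, cons))  # (0.333..., 0.5)
```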
In rule induction data models, models and rules are typically created from decision trees. A decision tree, as the name suggests, has decision branches (for example, "solvency" and "marital status = married"), which are read from top to bottom. The decision (the output variable) is placed at the end, e.g., CONTRACTUAL PLAN → YES. The decisions at the end are called leaves, following the same botanical terminology. Rule induction is an area of machine learning in which formal rules are extracted from a set of observations. The extracted rules may represent a complete scientific model of the data or merely local patterns in it.

Grzymala-Busse JW (2002) MLEM2: A new algorithm for the induction of rules from imperfect data. In: Proceedings of the 9th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, IPMU 2002, Annecy, France, pp 243–250

Rule induction is the process of creating sets of rules from raw data, called training data. These rules represent hidden and previously unknown knowledge contained in the training data, and they can be used to classify new cases that were not used for training. One possible application of this methodology is rule-based expert systems.
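Classifying an unseen case with an induced rule set amounts to finding a rule whose conditions the case satisfies. A minimal sketch, reusing the (conditions, decision) rule format of the earlier sketch and a hypothetical first-match strategy:

```python
def classify(rule_set, case, default=None):
    """Return the decision of the first rule whose conditions all hold
    for `case`; fall back to `default` if no rule fires."""
    for conditions, decision in rule_set:
        if all(case.get(a) == v for a, v in conditions.items()):
            return decision
    return default
```

Production systems use more elaborate matching schemes (for example, weighting all matching rules); first-match is simply the easiest strategy to illustrate.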
There are many documented examples of the successful use of rule induction in medicine (e.g., for decision support in diagnosis), finance, the military, and other fields.

Figure 4.24. The RapidMiner method for rule induction.

Quinlan released his new rule/tree induction algorithm, C5.0, in 2000; it contains a number of improvements over C4.5. Rule induction is understood here as an example of supervised learning. Rule induction is one of the fundamental processes of extracting knowledge, in the form of rule sets, from raw data. This process is widely used in machine learning (data mining). A dataset contains cases (examples) that are described by attribute values and have been classified by an expert as members of concepts. Rules are expressions whose left-hand side is a conjunction of attribute-value conditions and whose right-hand side is the decision value, as in step 7 above. Rule induction must then be combined with rule selection in terms of interestingness if it is to have real value in data mining. Determining and evaluating rules typically requires only standard database functions and can be done with embedded SQL. When a database is very large, it is often possible to induce a very large number of rules.
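To make the embedded-SQL remark concrete, here is a hedged sketch using Python's sqlite3 module; the `employees` table, its columns, and the data are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees "
             "(sex TEXT, age INTEGER, role TEXT, pension_plan INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", [
    ("male", 55, "management", 1),
    ("male", 52, "management", 1),
    ("male", 61, "management", 0),
    ("female", 45, "clerk", 0),
])

# Evaluate: IF sex = male AND age > 50 AND role = management THEN pension_plan.
n_ante, n_both = conn.execute(
    "SELECT COUNT(*), SUM(pension_plan) FROM employees "
    "WHERE sex = 'male' AND age > 50 AND role = 'management'").fetchone()
total = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print("support:", n_both / total, "confidence:", n_both / n_ante)
```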
Some of these rules may simply correspond to knowledge already known in the field, while others may be of little interest to the user. Data mining tools must therefore support the selection of interesting rules. Once a partial tree has been built, only one rule is extracted from it. Each leaf corresponds to a possible rule, and we look for the "best" leaf among those subtrees (usually a small minority) that have been expanded into leaves. Experience shows that it is preferable to aim for the most general rule by choosing the leaf that covers the largest number of instances. The main heuristics used in the C4.5 algorithm are the amount of information provided by a rule (or tree branch), calculated by a function called "info", and the overall improvement brought about by a rule/branch, calculated by a function called "gain". The algorithm also has a function for assessing the loss of information caused by missing values in the data. Rule induction uses a set of specific beliefs, in the form of database tuples, as evidence to support a general belief consistent with those specific beliefs. A collection of tuples in the database can form a relationship defined by the values of certain attributes, and the relationships in the database form the basis for rules. Database evidence in support of a rule is therefore used to produce a rule that can be applied generally.
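The "info" and "gain" heuristics correspond to entropy and information gain. A minimal sketch of both, assuming rows are dicts with categorical attributes (the names and data are illustrative):

```python
from collections import Counter
from math import log2

def info(labels):
    """Shannon entropy of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())

def gain(rows, attribute, decision):
    """Reduction in entropy achieved by splitting on `attribute`."""
    base = info([r[decision] for r in rows])
    groups = Counter(r[attribute] for r in rows)
    remainder = sum((n / len(rows)) * info([r[decision] for r in rows
                                            if r[attribute] == v])
                    for v, n in groups.items())
    return base - remainder

rows = [{"weather": "sunny", "decision": "yes"},
        {"weather": "sunny", "decision": "no"},
        {"weather": "rainy", "decision": "no"},
        {"weather": "rainy", "decision": "no"}]
print(gain(rows, "weather", "decision"))  # about 0.31
```

C4.5 itself refines plain gain with the gain ratio and with corrections for missing values, which this sketch omits.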
Figure 6.6 shows rule induction applied to a set of input variables relative to an output variable. The input variables are client_age_over_35yrs, client_has_mortgage, and client_self_employed, and the output variable is P-Plan. The P-Plan percentage indicates the percentage of clients at the given leaf node who actually have a pension plan. As already mentioned, the highest variable in the tree is the most general, in this case client_age_over_35yrs; that is, the client's age is an important aspect in determining whether or not they have a pension plan. The lowest variable in the tree is client_self_employed; that is, the type of employment is a more specific criterion related to taking out a pension plan. The remaining input variables were also given to the rule induction technique, but they were removed from the tree because they did not reach the minimum level of information support. The level of information support can be seen as a kind of measure of relevance or correlation with the business objective (the output label); the algorithm therefore uses this threshold to decide whether or not to include a variable in the tree's data model.

Grzymala-Busse JW (2007) Mining numerical data – A rough set approach. In: Proceedings of RSEISP 2007, the International Conference on Rough Sets and Emerging Intelligent Systems Paradigms, Warsaw, Poland. Lecture Notes in Artificial Intelligence, Vol 4585. Springer, Berlin, pp 12–21
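Reading if-then rules off such a tree means following each root-to-leaf path. A sketch with a hypothetical tree shaped like the one described for Figure 6.6 (the structure and outcomes are invented for illustration):

```python
# ("leaf", outcome) ends a path; (attribute, {value: subtree}) branches.
tree = ("client_age_over_35yrs", {
    False: ("leaf", "P-Plan = NO"),
    True: ("client_self_employed", {
        True: ("leaf", "P-Plan = NO"),
        False: ("leaf", "P-Plan = YES"),
    }),
})

def tree_to_rules(node, conditions=()):
    """Turn every root-to-leaf path into one if-then rule."""
    head = node[0]
    if head == "leaf":
        cond = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {cond} THEN {node[1]}"]
    rules = []
    for value, child in node[1].items():
        rules.extend(tree_to_rules(child, conditions + (f"{head} = {value}",)))
    return rules

for rule in tree_to_rules(tree):
    print(rule)
```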
LEM2 (Learning from Examples Module, Version 2) is the basic rule induction algorithm of the LERS machine learning/data mining system. LEM2, first implemented in 1990, uses the idea of local covering to induce a minimal set of minimal rules describing all concepts in the data.

Grzymala-Busse JW (2003) A comparison of three strategies to rule induction from data with numerical attributes. In: Proceedings of the International Workshop on Rough Sets in Knowledge Discovery (RSKD 2003), in connection with the European Joint Conferences on Theory and Practice of Software, Warsaw, pp 132–140

The following section briefly describes three algorithms (ID3, C4.5, and C5.0) by Ross Quinlan; they represent an evolution over time, as Quinlan successively introduced improvements to his rule induction technology to obtain an algorithm that is now one of the best and most widely used of its kind for general data mining.
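Returning to LEM2's local covering: the sketch below shows the greedy covering idea in deliberately simplified form. It is not the actual LEM2/MLEM2 algorithm; it omits LEM2's restriction to relevant attribute-value pairs and its final condition and rule minimization, and it assumes a consistent decision table. All names are illustrative.

```python
from collections import defaultdict

def blocks(data, attributes):
    """Map each (attribute, value) pair to the set of row indices it covers."""
    blk = defaultdict(set)
    for i, row in enumerate(data):
        for a in attributes:
            blk[(a, row[a])].add(i)
    return blk

def local_covering(data, attributes, decision, target):
    """Greedy local covering in the spirit of LEM2 (simplified sketch)."""
    concept = {i for i, row in enumerate(data) if row[decision] == target}
    blk = blocks(data, attributes)
    uncovered, rules = set(concept), []
    while uncovered:
        goal, conditions, covered = set(uncovered), [], set(range(len(data)))
        # Grow one rule until it covers only members of the concept.
        while not covered <= concept:
            # Prefer the pair overlapping the goal most; break ties by
            # choosing the smaller (more specific) block.
            candidates = [t for t in blk
                          if t not in conditions and blk[t] & goal]
            best = max(candidates,
                       key=lambda t: (len(blk[t] & goal), -len(blk[t])))
            conditions.append(best)
            covered &= blk[best]
            goal &= blk[best]
        rules.append((conditions, target))
        uncovered -= covered
    return rules

data = [{"weather": "sunny", "wind": "low", "play": "yes"},
        {"weather": "sunny", "wind": "high", "play": "no"},
        {"weather": "rainy", "wind": "low", "play": "no"}]
print(local_covering(data, ["weather", "wind"], "play", "yes"))
```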