A decision tree is a model that is both predictive and descriptive; it takes its name from the tree structure in which the learned model is presented. Decision trees are a standard tool in data mining, and they are generally preferred over other nonparametric techniques because of the readability of their learned hypotheses and the efficiency of training and evaluation. They can assist in solving many kinds of problems, and they are especially valuable for sequential decision problems: a tree can represent in detail every path that may prevail within the problem's planning horizon, according to the alternatives available at each decision and the possible outcomes of each uncertain event. (Golub, 1997)
Decision trees are read from left to right. They are generally learned by a top-down growth procedure, which starts from the root node and greedily chooses the split of the data that optimizes some cost function, usually the reduction in a measure of the impurity of the subsamples implicitly defined by the split.
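The greedy split choice can be sketched concretely. The following is a minimal illustration, assuming a binary split on a single numeric feature and Gini impurity as the impurity measure; all names here (`gini`, `best_split`) are illustrative, not from any particular library:

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(xs, ys):
    """Greedily choose the threshold on xs that minimizes the weighted
    impurity of the two subsamples it induces (equivalently, maximizes
    the impurity reduction of the split)."""
    n = len(ys)
    best = (None, float("inf"))
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        cost = len(left) / n * gini(left) + len(right) / n * gini(right)
        if cost < best[1]:
            best = (t, cost)
    return best

# A perfectly separable toy feature: the split at 3 yields two pure subsamples.
threshold, cost = best_split([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1])
```

Each candidate threshold is scored by the impurity of the subsamples it would create, weighted by their sizes, and the best-scoring split is kept.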
After a split is chosen, the subsamples are mapped to the two child nodes. This procedure is then applied recursively to the children, and the tree is grown until some stopping criterion is met. The grown tree is then used as the starting point for a bottom-up search that prunes the tree, eliminating nodes that are redundant or unable to pay for themselves in terms of the cost function. A decision tree can be thought of as a kind of trail map. On the tree, each line represents a trail segment to be traversed. Each point of interconnection, known as a node, indicates where any of...
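The grow-then-prune procedure described above can be sketched as follows. This is a toy illustration: the split rule (midpoint of the feature range) and the pruning rule (collapse a split whose two leaf children predict the same class) are deliberately simple stand-ins for the impurity-based criteria a real learner would use, and all names are assumptions of this sketch:

```python
from collections import Counter

def majority(ys):
    """Most common class label, used as a leaf's prediction."""
    return Counter(ys).most_common(1)[0][0]

def grow(xs, ys, depth=0, max_depth=3):
    """Top-down growth: split recursively until a stopping criterion
    (a pure node or the maximum depth) is met."""
    if len(set(ys)) == 1 or depth == max_depth:
        return {"leaf": majority(ys)}
    # Toy split rule: midpoint of the feature range.
    t = (min(xs) + max(xs)) / 2
    left = [(x, y) for x, y in zip(xs, ys) if x <= t]
    right = [(x, y) for x, y in zip(xs, ys) if x > t]
    if not left or not right:
        return {"leaf": majority(ys)}
    return {
        "split": t,
        "left": grow([x for x, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": grow([x for x, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def prune(node):
    """Bottom-up pruning: prune the children first, then collapse a split
    whose two children are leaves predicting the same class, since such a
    node is redundant."""
    if "leaf" in node:
        return node
    node["left"], node["right"] = prune(node["left"]), prune(node["right"])
    if ("leaf" in node["left"] and "leaf" in node["right"]
            and node["left"]["leaf"] == node["right"]["leaf"]):
        return {"leaf": node["left"]["leaf"]}
    return node

tree = prune(grow([1, 2, 3, 4], [0, 0, 1, 1]))
```

Growth proceeds root-first, while pruning works from the leaves upward, mirroring the two phases described in the text.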