Version 2.0 of the DTREG decision tree generator program has been released. A free demonstration version is available for download from:
http://www.dtreg.com

Here is a summary of the new features:

- DTREG can now generate both single-tree (CART) models and TreeBoost models consisting of a series of trees. TreeBoost is an implementation of stochastic gradient boosting with decision trees as the base functions. It is somewhat similar to AdaBoost, but it is optimized for tree models and introduces random selection of rows as the series is built. The randomization improves prediction accuracy and brings the method closer in spirit to random forests. TreeBoost series are much less prone to overfitting than single-tree models. (A sketch of the general idea follows this list.)

- TreeBoost uses Huber's M-regression loss function, which is robust to noisy or mislabeled data values. (The loss is written out after the list.)

- Charts have been added for lift/gain curves, model size versus error rate, and variable importance.

- Speed improvements have been made.

- DTREG supports V-fold cross-validation with pruning to select the tree size that generalizes best to independent data; random row subsetting can also be used. (A cross-validation sketch appears after the list.)

- Surrogate splitter (predictor) variables are supported to handle missing data values.

- Prior probabilities and misclassification costs can be specified.

- Text variables (for example, "male"/"female") are supported as well as numeric variables.

- Both regression trees with continuous target variables and classification trees with categorical target variables can be generated.
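For readers who have not seen the technique, here is a minimal sketch of stochastic gradient boosting with regression trees as base learners. It uses least-squares loss rather than Huber's loss for brevity and borrows scikit-learn trees as a stand-in for DTREG's internals; the function names and default parameters are illustrative assumptions, not DTREG's implementation.

    # Minimal stochastic gradient boosting sketch (least-squares loss).
    # Illustrative only; not DTREG's code. Function names are hypothetical.
    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def fit_boosted_trees(X, y, n_trees=100, learning_rate=0.1,
                          subsample=0.5, max_depth=3, seed=0):
        """Fit an additive series of shallow regression trees."""
        rng = np.random.default_rng(seed)
        f0 = y.mean()                  # constant initial model
        residual = y - f0              # what remains to be explained
        trees = []
        n = len(y)
        for _ in range(n_trees):
            # Random row selection on each step: the "stochastic" part.
            rows = rng.choice(n, size=max(1, int(subsample * n)),
                              replace=False)
            tree = DecisionTreeRegressor(max_depth=max_depth)
            tree.fit(X[rows], residual[rows])
            # Shrunken gradient step: update residuals on all rows.
            residual = residual - learning_rate * tree.predict(X)
            trees.append(tree)
        return f0, trees

    def predict_boosted(X, f0, trees, learning_rate=0.1):
        return f0 + learning_rate * sum(t.predict(X) for t in trees)

Each tree corrects the residual errors of the series built so far, and the learning rate shrinks every correction, which is what makes such a series resistant to overfitting.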
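The Huber loss that underlies M-regression is quadratic for small residuals and linear for large ones, so outliers and mislabeled values exert only limited influence on the fit. With F the model's prediction and \delta the threshold marking the switch from quadratic to linear, the standard form is:

    L(y, F) =
    \begin{cases}
      \tfrac{1}{2}\,(y - F)^2, & |y - F| \le \delta \\
      \delta \left( |y - F| - \tfrac{\delta}{2} \right), & |y - F| > \delta
    \end{cases}

(The threshold \delta is a tuning quantity; in gradient boosting it is commonly set from a quantile of the current absolute residuals.)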
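In the same spirit, here is a rough sketch of selecting a pruned tree size by V-fold cross-validation. It uses scikit-learn's cost-complexity pruning as a stand-in for DTREG's internal pruning; the function name and parameter choices are assumptions for illustration, not DTREG's interface.

    # V-fold cross-validation over cost-complexity pruning levels.
    # Illustrative stand-in; not DTREG's implementation.
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeRegressor

    def select_pruned_tree(X, y, v_folds=10):
        # Candidate pruning strengths from the full tree's pruning path.
        path = DecisionTreeRegressor().cost_complexity_pruning_path(X, y)
        best_alpha, best_score = 0.0, -np.inf
        for alpha in path.ccp_alphas:
            tree = DecisionTreeRegressor(ccp_alpha=alpha)
            # Mean score across the V held-out folds.
            score = cross_val_score(tree, X, y, cv=v_folds).mean()
            if score > best_score:
                best_alpha, best_score = alpha, score
        # Refit the winning pruning level on all rows.
        return DecisionTreeRegressor(ccp_alpha=best_alpha).fit(X, y)

The pruning level that scores best on the held-out folds identifies the tree size most likely to generalize to independent data, which is the selection rule the announcement describes.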
