On 29-Apr-2004, "Aleks Jakulin" <a_jakulin@hotmail.com> wrote:

> > I'm doing research comparing boosted decision trees to neural
> > networks for various types of predictive analyses.  A boosted
> > decision tree is an ensemble created as a series of small trees
> > that together form an additive model.  I'm using the TreeBoost
> > method of boosting to generate the tree series.  TreeBoost uses
> > stochastic gradient boosting to improve the predictive accuracy
> > of decision tree models (see http://www.dtreg.com/treeboost.htm).
>
> I think Phil exceeded the reasonable limits of Usenet advertising, so
> let me provide a list of cost-free classification tree utilities. I'm
> (*) tagging those that support boasting, bragging (pun intended) or
> some other approach to reducing the model variance.  If you're
> interested in perturbation approaches (boosting, bagging, arcing) I
> recommend looking at Random Forests, the recent approach by L. Breiman
> http://www.stat.berkeley.edu/users/breiman/RandomForests/
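
For readers who have not seen the mechanics behind the terminology
quoted above, stochastic gradient boosting can be sketched in a few
lines.  This is a generic Friedman-style loop written in Python with
scikit-learn; it is not the TreeBoost implementation from dtreg.com,
and the function and parameter names are purely illustrative.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_boosted_trees(X, y, n_stages=200, learning_rate=0.1,
                      subsample=0.5, max_depth=3, seed=0):
    """Stochastic gradient boosting for squared-error regression."""
    rng = np.random.default_rng(seed)
    f0 = float(y.mean())                # stage 0: a constant model
    pred = np.full(len(y), f0)
    trees = []
    for _ in range(n_stages):
        # "stochastic": each small tree sees only a random subsample
        idx = rng.choice(len(y), size=int(subsample * len(y)),
                        replace=False)
        residuals = y[idx] - pred[idx]  # negative gradient of squared loss
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X[idx], residuals)
        pred += learning_rate * tree.predict(X)  # extend the additive model
        trees.append(tree)
    return f0, trees

def predict_boosted_trees(X, f0, trees, learning_rate=0.1):
    # the series of small trees forms one additive model
    return f0 + learning_rate * sum(t.predict(X) for t in trees)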

I'm sorry you were offended by my message, but I appreciate the list of
sites you posted.  I am familiar with about 60% of them, and I will explore
the others.

However, from a brief review of the list of sites you posted, I don't see
any that address the issue I posed, which is a comparison of neural
network models with boosted decision trees across a variety of real-world
applications.  I am quite familiar with the publications by Breiman and
Friedman on boosting, bagging, random forests, etc.; but in those
publications they tend to compare various tree methods with each other,
and they have very few comparisons with NN models.  If you are aware of
any sites or publications with extensive comparisons of NN models against
boosted trees, I would like to see them.  I would prefer comparisons of
NN models with trees boosted using TreeBoost rather than AdaBoost.
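
To be concrete, the kind of comparison I have in mind looks something
like the following: the same cross-validation protocol applied to a
neural network and to gradient-boosted trees.  This is a sketch in
Python with scikit-learn; the dataset is a synthetic stand-in, so the
scores themselves prove nothing.  A real study would substitute real
application data and tune both models.

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# synthetic stand-in for a real-world application
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

models = {
    # feature scaling matters for the NN but not for the trees
    "neural net": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(50,), max_iter=1000,
                      random_state=0)),
    "boosted trees": GradientBoostingClassifier(random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print("%s: %.3f +/- %.3f" % (name, scores.mean(), scores.std()))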

> tagging those that support boasting, bragging (pun intended) or
> some other approach to reducing the model variance.

Bagging reduces the model variance, but it does little to reduce bias,
so it adds little predictive power.  Boosting and random forests
increase predictive power (often by a very significant amount) and also
reduce variance.  So a comparison of NN models with bagged tree models
is not a fair comparison.
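
A quick way to see the difference is to run a single tree, bagged
trees, a random forest, and boosted trees through the same
cross-validation on one problem.  The sketch below uses scikit-learn
and one synthetic dataset, so it illustrates the point rather than
proving it; results will vary across problems.

from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier,
                              GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=5, random_state=1)
models = {
    "single tree":   DecisionTreeClassifier(random_state=1),
    # bagging: average many full trees grown on bootstrap samples
    "bagged trees":  BaggingClassifier(DecisionTreeClassifier(),
                                       n_estimators=100, random_state=1),
    "random forest": RandomForestClassifier(n_estimators=100,
                                            random_state=1),
    # boosting: fit small trees sequentially to the remaining errors
    "boosted trees": GradientBoostingClassifier(n_estimators=100,
                                                random_state=1),
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean())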

> Namely, it turns out that a single classification tree
> represents a single interaction. If you have a mostly linear
> phenomenon, as in many real-life problems, a classification tree will
> represent it as a single humongous interaction, which is not
> particularly clear or sophisticated.

That is true, which is why boosting usually produces more accurate
models than single trees: a series of small trees can add many simple
effects together instead of forcing everything into one deep
interaction.
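
The point can be demonstrated on a mostly linear target: a single deep
tree has to approximate the plane with one large arrangement of
axis-aligned steps, while a series of small boosted trees adds many
simple pieces together and recovers the linear structure much more
accurately.  The sketch below uses scikit-learn on synthetic data and
is illustrative only.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(2)
X = rng.normal(size=(2000, 5))
# mostly linear target with a little noise
y = X @ np.array([1.0, 2.0, -1.0, 0.5, 0.0]) + rng.normal(scale=0.1,
                                                          size=2000)

models = [
    ("single deep tree", DecisionTreeRegressor(random_state=2)),
    ("boosted small trees", GradientBoostingRegressor(max_depth=2,
                                                      n_estimators=300,
                                                      random_state=2)),
]
for name, model in models:
    r2 = cross_val_score(model, X, y, cv=5).mean()  # default scoring: R^2
    print("%s: R^2 = %.3f" % (name, r2))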

-- 
Phil Sherrod
(phil.sherrod 'at' sandh.com)
http://www.dtreg.com  (decision tree modeling)
http://www.nlreg.com  (nonlinear regression)