[R] Deviance function in regression trees

2006-08-29 Thread Solomon Dobrowski
Hello all. I have heard over and over that CART and its various tree-like
brethren are non-parametric techniques. The chapter on tree-based models in
Chambers and Hastie states that tree-based models can be generalized (GTMs)
in a manner similar to GLMs by specifying a different deviance function for
distributions other than the Gaussian error distribution (section 9.4.3).
I have an application in which the response variable is a continuous
variable representing tree counts within a unit area and thus would be best
described by a Poisson distribution. The error distribution for these data
is not Gaussian. If this is the case, will the Gaussian error distribution
used in most regression tree packages be appropriate? Are there ways to
specify the error distribution in R, or should I log-transform the response
variable? If the specification of the error distribution in regression trees
is important, then are these techniques truly non-parametric? Thanks for
your input.


Solomon Dobrowski
Tahoe Environmental Research Center (TERC)
John Muir Institute of the Environment
University of California, Davis
530 754 9354



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Deviance function in regression trees

2006-08-29 Thread Prof Brian Ripley
The short answer is that a Poisson distribution is a discrete 
distribution: if that is appropriate to your data the rpart function (in 
the package of that name) has a suitable option.
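
For readers who want to try this, here is a minimal sketch of what such a fit might look like. It assumes the option referred to above is method = "poisson" in rpart(); the data are simulated, and the names plots, count, elev and slope are hypothetical stand-ins for real plot-level data.

```r
library(rpart)

set.seed(1)
n <- 500

# Simulated, purely illustrative predictors (hypothetical names)
elev  <- runif(n, 1000, 3000)   # elevation of each plot
slope <- runif(n, 0, 40)        # slope of each plot

# Simulated tree counts per unit area: higher mean count at low elevation
count <- rpois(n, ifelse(elev < 2000, 8, 2))
plots <- data.frame(count = count, elev = elev, slope = slope)

# Tree grown with Poisson deviance as the splitting criterion
fit_pois <- rpart(count ~ elev + slope, data = plots, method = "poisson")

# Default regression tree (sum of squares, i.e. Gaussian deviance) for comparison
fit_anova <- rpart(count ~ elev + slope, data = plots, method = "anova")

print(fit_pois)

# For a Poisson tree, predict() returns the estimated event rate in each leaf
new_plots <- data.frame(elev = c(1500, 2500), slope = c(10, 10))
pred <- predict(fit_pois, newdata = new_plots)
pred
```

If the plots differ in area, my understanding of the rpart documentation is that the response can instead be given as a two-column matrix, cbind(area, count), so that the fitted values are rates per unit area; check ?rpart before relying on that.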

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
