Tree-based models (such as RF) are invariant to monotonic transformations of the 
predictor (x) variables, because they use only the ranks of the variables, not 
their actual values.  More specifically, they look for splits at the 
mid-points between adjacent unique values.  Thus the resulting trees are 
basically identical regardless of how you transform the x variables.
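
To illustrate the invariance, here is a minimal sketch in plain Python (not 
the randomForest code itself).  The function best_split and the toy data are 
made up for illustration, and it assumes a single regression split chosen by 
minimum squared error with mid-point thresholds.  It shows the same training 
cases going left whether x or log(x) is used.

```python
import math

def best_split(x, y):
    """Return (threshold, left_cases) for the one split that minimises the
    total squared error; candidate thresholds are the mid-points between
    adjacent unique x values (hypothetical helper, for illustration only)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = None
    for k in range(1, len(order)):
        lo, hi = x[order[k - 1]], x[order[k]]
        if lo == hi:
            continue  # no split is possible between tied values
        left, right = order[:k], order[k:]
        sse = 0.0
        for group in (left, right):
            mean = sum(y[i] for i in group) / len(group)
            sse += sum((y[i] - mean) ** 2 for i in group)
        if best is None or sse < best[0]:
            best = (sse, (lo + hi) / 2, frozenset(left))
    return best[1], best[2]

# Toy data (assumed): y jumps once x exceeds roughly 5.
x = [1.0, 2.0, 5.0, 40.0, 90.0, 300.0]
y = [1.1, 0.9, 1.0, 3.2, 2.9, 3.1]

thr_raw, left_raw = best_split(x, y)
thr_log, left_log = best_split([math.log(v) for v in x], y)

print(left_raw == left_log)  # True: the same training cases go left
print(thr_raw, math.exp(thr_log))  # thresholds differ on the raw scale
```

Only the ordering of x enters the choice of which cases go left, so any 
strictly increasing transformation gives the same partition of the training 
data.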

Of course, the one, probably minor, difference is that the mid-points can 
differ between the original and transformed data.  While this doesn't affect 
how the training data are split, it can affect the predictions on test data 
(although the difference should be slight).
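
A small sketch of the mid-point effect, again in plain Python with made-up 
numbers: the two adjacent training values 5 and 40 and the helper side are 
assumptions for illustration.  The mid-point on the raw scale is the 
arithmetic mean, while the mid-point on the log scale corresponds to the 
geometric mean, so a new x falling between the two can be sent to different 
sides.

```python
import math

lo, hi = 5.0, 40.0  # two adjacent training values of x (assumed)

thr_raw = (lo + hi) / 2                      # mid-point on the raw scale
thr_log = (math.log(lo) + math.log(hi)) / 2  # mid-point on the log scale

def side(value, threshold):
    """Which child a value is sent to (hypothetical helper)."""
    return "left" if value <= threshold else "right"

# The training values 5 and 40 go the same way under both scales;
# a new value of 18 lies between the two thresholds and does not.
for x_new in (5.0, 18.0, 40.0):
    print(x_new, side(x_new, thr_raw), side(math.log(x_new), thr_log))
```

Only new values that fall in the gap between the two thresholds (here between 
about 14.1 and 22.5) are affected, which is why the difference is usually 
slight.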

Transformation of the response variable is quite another thing.  RF needs it 
just as much as other methods do if the situation calls for it.
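
One way to see why the response is different, as a hedged sketch in plain 
Python: a regression tree predicts the mean of the responses in a terminal 
node, and the mean of log(y) transformed back is not the mean of y.  The 
three node values below are made up for illustration.

```python
import math

node_y = [1.0, 10.0, 100.0]  # responses in one terminal node (assumed)

# Prediction when the tree is grown on y: the arithmetic mean.
pred_raw = sum(node_y) / len(node_y)

# Prediction when the tree is grown on log(y) and transformed back:
# the geometric mean, which is smaller for skewed data.
pred_log = math.exp(sum(math.log(v) for v in node_y) / len(node_y))

print(pred_raw, pred_log)
```

The split criterion is also computed from the response values, so the splits 
themselves can change, not just the node predictions.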

Cheers,
Andy
 

> -----Original Message-----
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of gianni lavaredo
> Sent: Monday, December 05, 2011 1:41 PM
> To: r-help@r-project.org
> Subject: [R] explanation why RandomForest don't require a 
> transformations (e.g. logarithmic) of variables
> 
> Dear Researchers,
> 
> Sorry for the easy and common question. I am trying to justify the idea
> that RandomForest doesn't require a transformation (e.g. logarithmic) of
> the variables, comparing this non-parametric method with e.g. linear
> regression. In the literature on my phenomenon I need to apply a
> logarithmic transformation to describe my model, but I found that RF
> doesn't require this approach. Could anyone suggest a text or
> bibliography to study?
> 
> thanks in advance
> 
> Gianni
> 
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 