On Thu, 19 Apr 2007, Florian Koller-Meinfelder wrote:

Dear R-helpers,

I am looking for a segmentation package that gives some "tree identifier"
as output for every observation in the data set (my response variable is
binary). I have skimmed through "rpart", "ada" and "adabag": The output
"trees" gives you the formula, but I have to run several thousand
segmentations on different data sets and it is tricky to use this
information within a macro (the only thing I could think of is to use some
string manipulation on the tree formula and apply it to the data, but I
hope there is an easier way - e.g. if the algorithm created 12 different
trees, a vector that links every observation to one of these 12 segments
would be ideal).


Is this

library("party")
airq <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq,
               controls = ctree_control(maxsurrogate = 3))
where(airct)
  [1] 5 5 5 5 5 5 5 5 3 5 5 5 5 5 5 5 5 5 5 5 5 5 5 6 3 5 6 9 9 6 5 5 5 5 5 8 9
 [38] 6 8 9 8 8 8 8 5 6 6 3 6 8 8 9 3 8 8 6 9 8 8 8 6 3 6 6 8 8 8 8 9 8 9 6 6 5
 [75] 3 5 6 6 5 5 6 3 8 9 8 8 8 8 8 8 8 8 9 6 6 5 5 6 5 3 5 5 3 5 5 5 6 5 5 6 5
[112] 5 3 5 5 5

what you want? `where' returns, for each observation in the learning sample, the number of the terminal node it belongs to.
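
Since you need to do this for many data sets, here is a minimal sketch of how the node numbers might be stored as a segment variable and collected in a loop. The helper `segment_ids`, the column name `segment`, and the two-way split of the data are made up for illustration; only `ctree`, `ctree_control` and `where` come from party. I would expect the same calls to work with a binary response coded as a factor.

```r
library("party")

# Same example tree as above (airquality ships with base R).
airq <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq,
               controls = ctree_control(maxsurrogate = 3))

# Store the terminal-node number of every observation as a segment id.
airq$segment <- where(airct)

# Segment sizes and mean response per segment.
table(airq$segment)
tapply(airq$Ozone, airq$segment, mean)

# Hypothetical helper (not part of party): fit one tree and return
# the vector of terminal-node numbers for the data it was fitted on.
segment_ids <- function(formula, data) {
    where(ctree(formula, data = data))
}

# Illustration only: treat two halves of the data as separate data sets
# and collect one vector of segment ids per data set.
datasets <- list(first = airq[1:58, ], second = airq[59:116, ])
ids <- lapply(datasets, function(d) segment_ids(Ozone ~ Temp + Wind, data = d))
```

Note that the node numbers are only labels for the terminal nodes of one particular tree, so they cannot be compared across trees fitted to different data sets.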

Best wishes,

Torsten


Cheers,
Florian




Florian Koller-Meinfelder
Research Consulting & Development
______________________________

GfK Fernsehforschung GmbH
Nordwestring 101
90319 Nürnberg

Tel     +49 (0)911 395-3554
Fax     +49 (0)911 395-4130
www.gfk.com/gfkfernsehforschung




