Hi,

Let me give a simple example. Assume a dataset containing 6 instances with 1 variable and the class label:
[x1, y]: [0.5, A] [3.2, B] [4.5, B] [1.4, C] [1.6, C] [1.9, C]

Assume that the randomForest algorithm creates this tree (2 levels deep):

Root node question: x1 < 2.2?
  Left terminal node:  [0.5, A] [1.4, C] [1.6, C] [1.9, C]  ->  leaf classification: C
  Right terminal node: [3.2, B] [4.5, B]                    ->  leaf classification: B

If I change the question at the root node to "x1 < 1?", the instances in the left leaf node are no longer passed down the tree correctly. My original question was whether there is a way to re-evaluate the instances, giving:

Root node question: x1 < 1?
  Left terminal node:  [0.5, A]                                      ->  leaf classification: A
  Right terminal node: [3.2, B] [4.5, B] [1.4, C] [1.6, C] [1.9, C]  ->  leaf classification: C

Cheers,
Martin

--- "Liaw, Andy" <[EMAIL PROTECTED]> wrote:

> > From: Martin Lam
> >
> > Dear mailing list members,
> >
> > I was wondering if there was a way to re-evaluate the
> > instances of a tree (in the forest) again after I have
> > manually changed a split point (or split variable) of a
> > decision node. Here's an illustration:
> >
> > library("randomForest")
> >
> > forest.rf <- randomForest(formula = Species ~ .,
> >     data = iris, do.trace = TRUE, ntree = 3,
> >     mtry = 2, norm.votes = FALSE)
> >
> > # I am going to change the split point of the root node
> > # of the first tree to 1
> > forest.rf$forest$xbestsplit[1, ]
> > forest.rf$forest$xbestsplit[1, 1] <- 1
> > forest.rf$forest$xbestsplit[1, ]
> >
> > Because I've changed the split point, some instances in
> > the leaves are no longer where they are supposed to be.
> > Is there a way to reappoint them to the correct leaf?
>
> I'm not sure what you want to do exactly, but I suspect
> you can use predict().
>
> > I was also wondering how I should interpret the output
> > of do.trace:
> >
> > ntree      OOB      1      2      3
> >     1:   3.70%  0.00%  6.25%  5.88%
> >     2:   3.49%  0.00%  3.85%  7.14%
> >     3:   3.57%  0.00%  5.56%  5.26%
> >
> > What's OOB and what do the percentages mean?
>
> OOB stands for `Out-of-bag'.
> Read up on random forests (e.g., the article in R News) to learn
> about it. Those numbers are estimated error rates. The `OOB' column
> is across all data, while the others are for the classes.
>
> Andy
>
> > Thanks in advance,
> >
> > Martin
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html
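Andy's predict() suggestion can be sketched as follows. This is only a sketch under the assumption that `xbestsplit[node, tree]` holds the split point for numeric predictors, as in Martin's original snippet; `predict()` walks each stored tree from the root, so every instance is re-routed through the modified split and there is no need to "reappoint" instances to leaves by hand:

```r
library(randomForest)

set.seed(1)
forest.rf <- randomForest(Species ~ ., data = iris, ntree = 3, mtry = 2)

## Overwrite the split point of the root node of the first tree,
## as in the original post:
forest.rf$forest$xbestsplit[1, 1] <- 1

## predict() traverses the stored trees, so the training data is
## re-routed through the modified split when we re-classify it:
pred <- predict(forest.rf, newdata = iris)
table(pred, iris$Species)
```

One caveat: as far as I can tell, the stored leaf labels (kept in `forest.rf$forest$nodepred`) are not recomputed by `predict()`, so the leaves keep their original majority classes; to get re-labelled leaves as in the example above, you would have to recompute each leaf's majority class from the re-routed instances yourself.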