Re: [R] Couple of Questions about Classification trees

Ed Merkle Thu, 12 Mar 2009 06:48:35 -0700

The issue with the sample size is that there are so many measurements in comparison to the number of meats (1024 measurements on only 231 samples).

Aside from that, you should check out the rpart package. Its commands are similar to those of the tree package, but there are more options for the plots. I don't know immediately how to display misclassification rates, but the text.rpart command can display the numbers of incorrectly- and correctly-classified observations in each node.
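As a rough illustration of the rpart suggestion, here is a minimal sketch. It uses the built-in iris data purely as a stand-in for the meat spectra (an assumption for illustration, not the poster's data). The use.n = TRUE argument of text.rpart adds the per-class counts under each node. With default settings, the dev column of fit$frame should hold the number of misclassified training observations per node, so dividing by n gives a per-node error rate.

```r
library(rpart)

# iris is only a stand-in for the spectroscopic data (assumption)
fit <- rpart(Species ~ ., data = iris, method = "class")

# Draw the tree; use.n = TRUE prints the class counts at each node
plot(fit, margin = 0.1)
text(fit, use.n = TRUE, cex = 0.8)

# Per-node training misclassification rate: for a default classification
# fit, frame$dev is the count of misclassified observations in the node
node_error <- fit$frame$dev / fit$frame$n
print(data.frame(node = rownames(fit$frame),
                 n = fit$frame$n,
                 error = round(node_error, 3)))
```

The counts printed on the plot are in the order of the factor levels of the response, so for the meat data they would follow the levels of meat.type.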


Ed

--
Ed Merkle, PhD
Assistant Professor
Dept. of Psychology
Wichita State University
Wichita, KS, USA 67260

Date: Wed, 11 Mar 2009 13:53:46 -0700 (PDT)
From: Jen_mp3 <jen_...@msn.com>
Subject: Re: [R] Couple of Questions about Classification trees
To: r-help@r-project.org
Message-ID: <22464302.p...@talk.nabble.com>
Content-Type: text/plain; charset=us-ascii

Okay, perhaps I should have been clearer about the data. I'm actually working
on spectroscopic measurements from food authenticity testing. I have five
different types of meat: 55 of chicken, 55 of turkey, 55 of pork, 34 of beef
and 32 of lamb, 231 in total. On each of these 231 meats, 1024
spectroscopic measurements were taken, giving a 231 by 1024 matrix. The
questions I want answered are which of the 1024 measurements are important
for predicting meat type, and which of the different types of meat are
incorrectly classified, i.e. can we tell the difference between chicken and
turkey. So, to carry out a multivariate analysis on the data, I've split it
into two: a training data set and a test data set. The split is half and
half, although I think the larger half (55 goes into 27 and 28) went into
the test data set, which explains the inequalities in the row numbers. By
the way, 1024 is standard, so I can't change that. I can't change the 231
either.

So I created a new column with the meat type for each row.

I end up with the following R code:
library(tree)
meat.tree <- tree(meat.type~., data=train)
Using cv.tree, the lowest misclassification rate is at size 5, so I cut the
number of nodes down to 5 using prune.tree:
prunedtree <- prune.tree(meat.tree, best = 5, method = "misclass")
Then I want to use predict.tree and the test data set.
predicttree <- predict.tree(prunedtree, data = test)
I already said what it produces.
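
A note on the predict step above: as far as I can tell, predict.tree takes new observations through its newdata argument. An argument named data would be swallowed by "..." and the predictions would then be for the training rows. That might explain a mismatch in row numbers. The sketch below shows the general pattern of predicting on a held-out set and tabulating which classes get confused. It rests on two assumptions: iris stands in for the meat data, and rpart is used (its predict method also takes newdata) so that the example runs on a stock R installation.

```r
library(rpart)

set.seed(1)  # reproducible split

# iris stands in for the 231 x 1024 meat data (assumption)
idx   <- sample(nrow(iris), nrow(iris) / 2)
train <- iris[idx, ]
test  <- iris[-idx, ]

fit <- rpart(Species ~ ., data = train, method = "class")

# Note newdata, not data; type = "class" returns predicted labels
pred <- predict(fit, newdata = test, type = "class")

# Rows are the true classes, columns the predicted ones;
# off-diagonal cells show which classes are confused
confusion <- table(observed = test$Species, predicted = pred)
print(confusion)

# Overall test misclassification rate
test_error <- 1 - sum(diag(confusion)) / sum(confusion)
print(test_error)
```

With the tree package the equivalent call would presumably be predict(prunedtree, newdata = test, type = "class"), and the chicken and turkey cells of the resulting table would show whether those two can be told apart.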

Again, how would I display the misclassification rate at each node on the
diagram? I know about misclass.tree(prunedtree, detail = TRUE), but that
doesn't actually display them on the classification tree. It just prints a
bunch of numbers, and it wouldn't look very neat if I had to add them to the
plot later.



