Hi Steve,
Thank you very much for your reply.
 
I’m trying to make the program general so that I can try 
different datasets without changing much of the code (that is, without hard-coding 
the name of the class label, which differs from one dataset to another).
 
Could you please tell me your opinion about this code:-
 
library(e1071)
mydata<-read.delim("the_whole_dataset.txt")
class_label <- names(mydata)[1]   # I’ll always put the class label in the first column.
myformula <- formula(paste(class_label,"~ ."))
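x <- mydata[, -1]   # predictors only: drop the first column (the class label)
mymodel <- svm(myformula, data = mydata, cross = 3)   # the formula interface needs the label column in the data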
summary(mymodel)
################
Do I have to do the same steps with the testing set? i.e. must the testing set 
also not contain the label, but otherwise have the same structure as the 
training set? Is that correct?
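
A minimal sketch of the train/test workflow asked about above, not a definitive recipe. It assumes the class label is always in the first column; the built-in iris data (with Species moved to column 1) stands in for the_whole_dataset.txt, and the one-third hold-out split and the names train, test and test_idx are illustrative assumptions.

```r
library(e1071)

# Stand-in for read.delim("the_whole_dataset.txt"): iris with the label moved to column 1
mydata <- iris[, c(5, 1:4)]
mydata[[1]] <- as.factor(mydata[[1]])   # a factor label makes svm() do classification

# Build the formula from whatever the first column happens to be called
class_label <- names(mydata)[1]
myformula <- as.formula(paste(class_label, "~ ."))

# Hold out a third of the rows as a testing set (the split ratio is an assumption)
set.seed(1)
test_idx <- sample(nrow(mydata), size = floor(nrow(mydata) / 3))
train <- mydata[-test_idx, ]
test  <- mydata[test_idx, ]

# Train on the full training data frame, label column included
mymodel <- svm(myformula, data = train)

# Predict from the predictor columns only, then compare with the held-out labels
pred <- predict(mymodel, test[, -1])
confusion <- table(predicted = pred, actual = test[[1]])
print(confusion)
```

With the formula interface the training data frame keeps its label column, while the testing set only needs the same predictor columns as the training set. Keeping the test labels in a separate column, as here, makes it easy to tabulate predictions against the truth.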
 
Cheers,
Amy

 
> Date: Tue, 5 Jan 2010 21:15:17 -0500
> Subject: Re: [R] svm
> From: mailinglist.honey...@gmail.com
> To: amy_4_5...@hotmail.com
> CC: r-help@r-project.org
> 
> Hi,
> 
> On Tue, Jan 5, 2010 at 7:01 PM, Amy Hessen <amy_4_5...@hotmail.com> wrote:
> >
> > Hi,
> >
> > I understand from help pages that in order to use a data set with svm, I 
> > have to divide it into two files: one for the dataset without the class 
> > label and the other file contains the class label as the following code:-
> 
> This isn't exactly correct ... look at the examples in the ?svm
> documentation a bit closer.
> 
> > library(e1071)
> > x<- read.delim("mydataset_except-class-label.txt")
> > y<- read.delim("mydataset_class-labell.txt")
> > model <- svm(x, y, cross=5)
> > summary(model)
> >
> > but I couldn’t understand how I add “formula” parameter to it? Does formula 
> > contain the class label too??
> 
> Using the first example in ?svm
> 
> attach(iris)
> model <- svm(Species ~ ., data = iris)
> 
> The first argument in the function call is the formula. The "Species"
> column is the class label.
> 
> `iris` is a data.frame ... you can see that it has the label *in it*,
> look at the output of "head(iris)".
> 
> > and what I have to do to use testing set when I don’t use “cross” parameter.
> 
> Just follow the example in ?svm some more, you'll see training a model
> and then testing it on data. The example happens to be the same data
> the model trained on. To use new data, you'll just need a data
> matrix/data.frame with as many columns as your original data, and as
> many rows as you have observations.
> 
> The first step separates the labels from the data (you can do the same
> in your data -- you don't have to have separate test and train files
> that are different -- just remove the labels from it in R):
> 
> attach(iris)
> x <- subset(iris, select = -Species)
> y <- Species
> model <- svm(x, y)
> 
> # test with train data
> pred <- predict(model, x)
> 
> Hope that helps,
> -steve
> 
> -- 
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
> | Memorial Sloan-Kettering Cancer Center
> | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
                                          

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
