Hi Steve,

Thank you very much for your reply. Your code is clearer and more readable
than mine.

Could you please help me with these questions?

1) Is the "formula" argument an alternative to the "y" argument of svm()?

2) I forgot to remove the class label column from the dataset, and I also
named the class label in the formula argument, yet the program still works.
Could you please clarify why?
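
To make question 2 concrete, here is a minimal sketch of what I mean. I am
using the built-in iris data as a stand-in for my own file (that is an
assumption, my real data is different), with Species playing the role of the
class label:

```r
library(e1071)  # provides svm()

# iris stands in for my dataset (assumption); Species is the class label.
data(iris)

# What does "." in the formula expand to? Only the columns that are not
# on the left-hand side, so the label is not used as a predictor.
predictors <- attr(terms(Species ~ ., data = iris), "term.labels")
print(predictors)

# Formula interface: the label column stays inside the data frame.
fit_formula <- svm(Species ~ ., data = iris)

# x/y interface: the label has to be split off by hand.
x <- iris[, names(iris) != "Species"]
y <- iris$Species
fit_xy <- svm(x, y)

# Both calls are given the same four predictors, so I would expect the
# predictions on the training data to agree (almost) everywhere.
pred_formula <- predict(fit_formula, iris)
pred_xy <- predict(fit_xy, x)
agree <- mean(as.character(pred_formula) == as.character(pred_xy))
print(agree)
```

If I understand it correctly, the "." only stands for the other columns, which
would explain why leaving the label in the data still works, but I would
appreciate a confirmation.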
 
Cheers,
Amy
 

> Date: Wed, 6 Jan 2010 18:44:13 -0500
> Subject: Re: [R] svm
> From: mailinglist.honey...@gmail.com
> To: amy_4_5...@hotmail.com
> CC: r-help@r-project.org
> 
> Hi Amy,
> 
> On Wed, Jan 6, 2010 at 4:33 PM, Amy Hessen <amy_4_5...@hotmail.com> wrote:
> > Hi Steve,
> >
> > Thank you very much for your reply.
> >
> > I'm trying to make the program general, so that I can try different
> > datasets without changing much in it (and without knowing the name of
> > the class label, which differs from one dataset to another).
> >
> > Could you please tell me your opinion about this code:
> >
> > library(e1071)
> >
> > mydata<-read.delim("the_whole_dataset.txt")
> >
> > class_label <- names(mydata)[1]   # I'll always put the class label in the first column.
> >
> > myformula <- formula(paste(class_label,"~ ."))
> >
> > x <- subset(mydata, select = - mydata[, 1])
> >
> > mymodel<-(svm(myformula, x, cross=3))
> >
> > summary(mymodel)
> >
> > ################
> 
> Since you're not doing anything funky with the formula, a preference
> of mine is to just skip this way of calling SVM and go "straight" to
> the svm(x,y,...) method:
> 
> R> mydata <- as.matrix(read.delim("the_whole_dataset.txt"))
> R> train.x <- mydata[,-1]
> R> train.y <- mydata[,1]
> 
> R> mymodel <- svm(train.x, train.y, cross=3, type="C-classification")
> ## or
> R> mymodel <- svm(train.x, train.y, cross=3, type="eps-regression")
> 
> As an aside, I also like to be explicit about the type="" parameter to
> tell what I want my SVM to do (regression or classification). If it's
> not specified, svm() picks one based on whether your y vector is a
> factor (it does classification) or not (it does regression).
> 
> > Do I have to do the same steps with the testing set? I.e., the testing set
> > must not contain the label either, but must have the same structure as the
> > training set? Is that correct?
> 
> I guess you'll want to report your accuracy/MSE/something on your
> model for your testing set? Just load the data in the same way, then
> use `predict` to calculate the metric you're after. You'll have to have
> the labels for your data to do that, though, e.g.:
> 
> testdata <- as.matrix(read.delim('testdata.txt'))
> test.x <- testdata[,-1]
> test.y <- testdata[,1]
> preds <- predict(mymodel, test.x)
> 
> Let's assume you're doing classification, so let's report the accuracy:
> 
> acc <- sum(preds == test.y) / length(test.y)
> 
> Does that help?
> -steve
> 
> -- 
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
> | Memorial Sloan-Kettering Cancer Center
> | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
                                          

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
