Can svm/tune in R automatically run a grid search on successively finer
grids over a narrowing range for me?
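
To make the question concrete, here is a rough sketch of the kind of
coarse-to-fine loop I mean, written by hand around tune(). The helper name
refine_cost, its arguments and the iris data are placeholders for
illustration only, not something e1071 provides, and given the CV noise
discussed below the later rounds may well just be chasing noise.

```r
library(e1071)

# Hypothetical helper (NOT part of e1071): calls tune() on a log2-spaced
# cost grid, then re-centres a narrower grid on the best exponent found.
refine_cost <- function(x, y, lo = -15, hi = 15, rounds = 3, points = 7, ...) {
  best <- NULL
  for (i in seq_len(rounds)) {
    exps <- seq(lo, hi, length.out = points)        # exponents for this round
    obj  <- tune(svm, train.x = x, train.y = y,
                 ranges = list(cost = 2^exps),
                 tunecontrol = tune.control(sampling = "cross"), ...)
    best <- log2(obj$best.parameters$cost)          # best exponent this round
    step <- (hi - lo) / (points - 1)                # current grid spacing
    lo <- best - step                               # next round: one grid
    hi <- best + step                               # step either side of best
  }
  list(cost = 2^best, log2_cost = best, error = obj$best.performance)
}

# Illustration on a built-in data set (an assumption, not my real data).
set.seed(1)
x <- as.matrix(iris[, 1:4])
y <- iris$Species
res <- refine_cost(x, y, lo = -10, hi = 10, rounds = 3, points = 5,
                   kernel = "linear")
print(res)
```

Each round keeps the same number of grid points but shrinks the window to
one old grid step on either side of the current best value, so the
resolution gets finer every round.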

On 3/1/06, Liaw, Andy <[EMAIL PROTECTED]> wrote:
>
> Do you know that there is (pseudo-)randomness involved in CV?  Even if
> you fix the parameters and run it multiple times, you're going to get
> different answers, let alone when the parameters change each time.  Hoping
> to narrow the optimal parameters down to that fine a resolution is generally
> not realistic.  Also, there may well be multiple optima in the CV error
> 'surface'.  Take your pick.
>
> Andy
>
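
The run-to-run variation described above can be checked directly. The
following is a minimal sketch, assuming the iris data and a small cost grid
as stand-ins (neither is from this thread): it repeats the same tune() call
under different seeds, so each call draws a different random fold partition,
then compares the selected costs and averages the CV error per cost.

```r
library(e1071)

x <- as.matrix(iris[, 1:4])       # placeholder data, not the poster's
y <- iris$Species
costs <- 2^seq(-6, 2, by = 1)     # placeholder grid

# One tune() call per seed; every call uses a new random 10-fold partition.
runs <- lapply(1:5, function(seed) {
  set.seed(seed)
  tune(svm, x, y,
       ranges = list(cost = costs),
       tunecontrol = tune.control(sampling = "cross"),
       kernel = "linear")
})

# Cost selected in each run; these may or may not agree with each other.
best_per_run <- sapply(runs, function(r) r$best.parameters$cost)
print(best_per_run)

# CV error for each cost (rows) in each run (columns), then summarised.
err <- sapply(runs, function(r) r$performances$error)
mean_err <- rowMeans(err)
print(data.frame(cost = costs,
                 mean_error = mean_err,
                 sd_across_runs = apply(err, 1, sd)))
```

If the spread across runs is comparable to the differences between
neighbouring costs, refining the grid further is unlikely to be meaningful.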
>  -----Original Message-----
> *From:* Michael [mailto:[EMAIL PROTECTED]
> *Sent:* Wednesday, March 01, 2006 3:59 AM
> *To:* Liaw, Andy
> *Cc:* R-help@stat.math.ethz.ch
> *Subject:* Re: [R] does svm have a CV to obtain the best "cost" parameter?
>
> Thanks a lot Andy.
>
> I read that paper and followed the instructions, but ran into a lot of
> peculiarities:
>
> 1. Using the "tune" function for "svm", the CV error as a function of
> "cost" turns out to have multiple local optima rather than a single global
> one. So I don't know which one to follow in order to refine my search grid
> and do a more detailed search in a smaller, more focused range. Please see
> below.
>
> - Detailed performance results:
>            cost      error
> 1  0.0004882813 0.05065909
> 2  0.0005608879 0.05122727
> 3  0.0006442910 0.04895130
> 4  0.0007400960 0.04725000
> 5  0.0008501470 0.04497078
> 6  0.0009765625 0.04497078
> 7  0.0011217757 0.04497078
> 8  0.0012885819 0.04440260
> 9  0.0014801920 0.04155844
> 10 0.0017002941 0.03985065
> 11 0.0019531250 0.04099675
> 12 0.0022435515 0.04327273
> 13 0.0025771639 0.04099675
> 14 0.0029603839 0.03929221
> 15 0.0034005881 0.03986039
> 16 0.0039062500 0.04157143
> 17 0.0044871029 0.04099675
> 18 0.0051543278 0.04042857
> 19 0.0059207678 0.03871753
> 20 0.0068011763 0.03871429
> 21 0.0078125000 0.03985065
> 22 0.0089742059 0.04042532
> 23 0.0103086556 0.04042532
> 24 0.0118415357 0.04099675
> 25 0.0136023526 0.04042532
> 26 0.0156250000 0.04440260
>
>
> 2. I first tried 2^(-15:15) and found the best "cost" to be around
> 2^(-8). Then I reduced the range and ran "tune" on cost values 2^(-11:-6),
> and it returned a best "cost" of 2^(-9), which is different from 2^(-8).
> Then I ran it on 2^seq(-11, -6, by = 0.2), and the best "cost" was found to
> be 2^(-7.2). Each time the best "cost" is at a different value, and with
> the many local optima shown above, I don't know which range I should focus
> on for the next step...
>
>
> The code I've used is as below:
>
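> library(e1071)  # provides svm(), tune() and tune.control()
>
> # 10-fold CV (the tune.control default) over a log2-spaced grid of costs
> obj <- tune(svm, x, y,
>             ranges = list(cost = 2^seq(-11, -6, by = 0.2)),
>             tunecontrol = tune.control(sampling = "cross"),
>             kernel = "linear")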
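
When several costs give nearly the same CV error, one possible way to pick
among them is a one-standard-error style heuristic: take the smallest cost
(the most regularized model) whose error is within one standard error of the
minimum. This is a swapped-in rule of thumb, not something tune() does by
itself, and the sketch below uses the iris data as a placeholder. It relies
on the "dispersion" column of the performances table, which is the standard
deviation of the error across folds under the default tune.control settings.

```r
library(e1071)

set.seed(42)
x <- as.matrix(iris[, 1:4])   # placeholder data, not the poster's
y <- iris$Species
k <- 10                       # number of CV folds

obj <- tune(svm, x, y,
            ranges = list(cost = 2^seq(-11, -6, by = 0.2)),
            tunecontrol = tune.control(sampling = "cross", cross = k),
            kernel = "linear")

perf  <- obj$performances     # columns: cost, error, dispersion
i_min <- which.min(perf$error)

# Assumed heuristic: minimum error plus one standard error of that estimate.
threshold  <- perf$error[i_min] + perf$dispersion[i_min] / sqrt(k)
candidates <- perf[perf$error <= threshold, ]

# Smallest cost among the near-optimal candidates.
chosen <- min(candidates$cost)
print(c(argmin_cost = obj$best.parameters$cost, chosen_cost = chosen))
```

The design choice here is to stop treating small wiggles in the error table
as distinct optima and to prefer the simplest model that is statistically
indistinguishable from the best one.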
>
> ------------------------
>
> What can I do now?
>
> Thanks a lot!
>
> On 2/28/06, Liaw, Andy < [EMAIL PROTECTED]> wrote:
> >
> > You might find
> > http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
> > helpful.
> >
> >
> > Parameter tuning is essential for avoiding overfitting.
> >
> > Andy
> >
> >  -----Original Message-----
> > *From:* Michael [mailto:[EMAIL PROTECTED]
> > *Sent:* Tuesday, February 28, 2006 3:30 PM
> > *To:* Liaw, Andy
> > *Cc:* R-help@stat.math.ethz.ch
> > *Subject:* Re: [R] does svm have a CV to obtain the best "cost"
> > parameter?
> >
> > Hi Andy,
> >
> > Thanks a lot for your answer! So what do I do if the model overfits?
> >
> > Thanks a lot!
> >
> > On 2/28/06, Liaw, Andy < [EMAIL PROTECTED]> wrote:
> > >
> > > From: Michael
> > > >
> > > > Hi all,
> > > >
> > > > I am using the "svm" command in the e1071 package.
> > > >
> > > > Does it have an automatic way of setting the "cost" parameter?
> > >
> > > See ?best.svm in that package.
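
For orientation, a call to best.svm() might look like the sketch below. The
iris data, the cost grid and the linear kernel are placeholder assumptions;
see ?best.svm and ?tune.svm for the full argument list.

```r
library(e1071)

set.seed(1)
# best.svm() runs the grid search through tune.svm() and returns only the
# model refitted with the best parameter combination.
model <- best.svm(Species ~ ., data = iris,
                  cost = 2^(-4:4), kernel = "linear")

print(model$cost)   # the cost value that was selected
```

If the full table of CV errors is wanted rather than just the final model,
tune.svm() takes the same arguments and returns the tuning object.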
> > >
> > > > > I changed a few values for the "cost" parameter but I hope there is a
> > > > > systematic way of obtaining the best "cost" value.
> > > >
> > > > > I noticed that there is a "cross" (cross-validation) parameter in
> > > > > the "svm" function.
> > > > >
> > > > > But I did not see how it can be used to optimize the "cost" parameter.
> > > > >
> > > > > By the way, what does a 0 training error and a high testing error
> > > > > mean? Varying "cross=5", or "cross=10", etc. does not change the
> > > > > training error and testing error at all. How can I improve this?
> > >
> > > > Overfitting, which switching to a different validation method will not solve.
> > >
> > > Andy
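
A small sketch of why "cross" leaves the training and testing errors
unchanged: in svm() it only adds a k-fold accuracy estimate to the returned
object, while the model itself is still fitted on all the training data with
the parameters given. The data split and the deliberately extreme gamma and
cost values below are assumptions chosen to provoke overfitting, not
settings taken from this thread.

```r
library(e1071)

set.seed(1)
test_idx <- sample(nrow(iris), 50)   # placeholder hold-out split
train <- iris[-test_idx, ]
test  <- iris[test_idx, ]

# cross = 5 asks svm() to also report 5-fold CV accuracy on the training
# data; the fitted model is the same as it would be without it.
fit <- svm(Species ~ ., data = train, kernel = "radial",
           gamma = 50, cost = 100, cross = 5)

train_err <- mean(predict(fit, train) != train$Species)
test_err  <- mean(predict(fit, test)  != test$Species)
cv_err    <- 1 - fit$tot.accuracy / 100   # tot.accuracy is a percentage

print(c(train = train_err, cv = cv_err, test = test_err))
```

A training error near zero together with much larger CV and test errors is
the overfitting pattern in question; lowering gamma or cost (for example via
tune()) is what changes it, not the value of "cross".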
> > >
> > > > Thanks a lot!
> > > >
> > > > M.
> > > >
> > > >       [[alternative HTML version deleted]]
> > > >
> > > > ______________________________________________
> > > > R-help@stat.math.ethz.ch mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide!
> > > > http://www.R-project.org/posting-guide.html
> > > >
> > > >
> > >
> > >
> > >
> > > ------------------------------------------------------------------------------
> > > Notice:  This e-mail message, together with any attachments, contains
> > > information of Merck & Co., Inc. (One Merck Drive, Whitehouse Station, New
> > > Jersey, USA 08889), and/or its affiliates (which may be known outside the
> > > United States as Merck Frosst, Merck Sharp & Dohme or MSD and in Japan, as
> > > Banyu) that may be confidential, proprietary copyrighted and/or legally
> > > privileged. It is intended solely for the use of the individual or entity
> > > named on this message.  If you are not the intended recipient, and have
> > > > received this message in error, please notify us immediately by reply
> > > > e-mail and then delete it from your system.
> > >
> > > ------------------------------------------------------------------------------
> > >
