Can svm/tune in R do an automatic grid search for me, with an increasingly finer grid over a shrinking range?
On 3/1/06, Liaw, Andy <[EMAIL PROTECTED]> wrote:
>
> Do you know that there is (pseudo-)randomness involved in CV? Even if
> you fix the parameters and run multiple times, you're going to get different
> answers, let alone changing parameters each time. Hoping to narrow the
> optimal parameters down to that fine a resolution is generally not
> realistic. Also, there may well be multiple optima in the CV error
> `surface'. Take your pick.
>
> Andy
>
> -----Original Message-----
> From: Michael [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, March 01, 2006 3:59 AM
> To: Liaw, Andy
> Cc: R-help@stat.math.ethz.ch
> Subject: Re: [R] does svm have a CV to obtain the best "cost" parameter?
>
> Thanks a lot Andy.
>
> I read that paper and followed the instructions, but met with a lot of
> peculiarities:
>
> 1. In using the "tune" function for "svm", the best "cost" value turns out
> to be multi-peaked, without a single global peak. So I don't know which
> peak to follow in order to refine my search grid and do a more detailed
> search in a smaller/focused range. Please see below.
>
> - Detailed performance results:
>            cost      error
> 1  0.0004882813 0.05065909
> 2  0.0005608879 0.05122727
> 3  0.0006442910 0.04895130
> 4  0.0007400960 0.04725000
> 5  0.0008501470 0.04497078
> 6  0.0009765625 0.04497078
> 7  0.0011217757 0.04497078
> 8  0.0012885819 0.04440260
> 9  0.0014801920 0.04155844
> 10 0.0017002941 0.03985065
> 11 0.0019531250 0.04099675
> 12 0.0022435515 0.04327273
> 13 0.0025771639 0.04099675
> 14 0.0029603839 0.03929221
> 15 0.0034005881 0.03986039
> 16 0.0039062500 0.04157143
> 17 0.0044871029 0.04099675
> 18 0.0051543278 0.04042857
> 19 0.0059207678 0.03871753
> 20 0.0068011763 0.03871429
> 21 0.0078125000 0.03985065
> 22 0.0089742059 0.04042532
> 23 0.0103086556 0.04042532
> 24 0.0118415357 0.04099675
> 25 0.0136023526 0.04042532
> 26 0.0156250000 0.04440260
>
> 2. I first tried 2^(-15:15) and found the best "cost" to be around
> 2^(-8). Then I reduced the range and ran "tune" on cost values 2^(-11:-6),
> which returned a best "cost" value of 2^(-9), different from 2^(-8).
> Then I ran it on 2^seq(-11, -6, by = 0.2), and the best "cost" value was
> found to be 2^(-7.2), with the multiple peaks shown above. Each time the
> best "cost" is at a different value. With so many local optima, I don't
> know what range I should focus on for the next step...
>
> The code I've used is as below:
>
> obj <- tune(svm, x, y,
>             ranges = list(cost = 2^seq(-11, -6, by = 0.2)),
>             tunecontrol = tune.control(sampling = "cross"),
>             kernel = "linear")
>
> ------------------------
>
> What can I do now?
>
> Thanks a lot!
>
> On 2/28/06, Liaw, Andy <[EMAIL PROTECTED]> wrote:
> >
> > You might find
> > http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
> > helpful.
> >
> > Parameter tuning is essential for avoiding overfitting.
> >
> > Andy
> >
> > -----Original Message-----
> > From: Michael [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, February 28, 2006 3:30 PM
> > To: Liaw, Andy
> > Cc: R-help@stat.math.ethz.ch
> > Subject: Re: [R] does svm have a CV to obtain the best "cost"
> > parameter?
> >
> > Hi Andy,
> >
> > Thanks a lot for your answer! So what do I do if the model overfits?
> >
> > Thanks a lot!
> >
> > On 2/28/06, Liaw, Andy <[EMAIL PROTECTED]> wrote:
> > >
> > > From: Michael
> > > >
> > > > Hi all,
> > > >
> > > > I am using the "svm" command in the e1071 package.
> > > >
> > > > Does it have an automatic way of setting the "cost" parameter?
> > >
> > > See ?best.svm in that package.
> > >
> > > > I changed a few values for the "cost" parameter but I hope there is a
> > > > systematic way of obtaining the best "cost" value.
> > > >
> > > > I noticed that there is a "cross" (cross-validation)
> > > > parameter in the "svm" function.
> > > >
> > > > But I did not see how it can be used to optimize the "cost"
> > > > parameter.
> > > >
> > > > By the way, what does a 0 training error and a high testing
> > > > error mean?
> > > > Varying "cross=5", or "cross=10", etc. does not change the
> > > > training error and testing error at all. How to improve?
> > >
> > > Overfitting, which varying the validation method will not solve.
> > >
> > > Andy
> > >
> > > > Thanks a lot!
> > > >
> > > > M.
> > > >
> > > > [[alternative HTML version deleted]]
> > > >
> > > > ______________________________________________
> > > > R-help@stat.math.ethz.ch mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide!
> > > > http://www.R-project.org/posting-guide.html
> > >
> > > ------------------------------------------------------------------------------
> > > Notice: This e-mail message, together with any attachments, contains
> > > information of Merck & Co., Inc. (One Merck Drive, Whitehouse Station, New
> > > Jersey, USA 08889), and/or its affiliates (which may be known outside the
> > > United States as Merck Frosst, Merck Sharp & Dohme or MSD and in Japan, as
> > > Banyu) that may be confidential, proprietary copyrighted and/or legally
> > > privileged. It is intended solely for the use of the individual or entity
> > > named on this message. If you are not the intended recipient, and have
> > > received this message in error, please notify us immediately by reply
> > > e-mail and then delete it from your system.
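As far as I know, tune() in e1071 only evaluates the grid it is handed and does not refine it on its own, so a coarse-to-fine search has to be scripted by hand. Below is a minimal sketch of one way to do that. The helper refine_cost() and its arguments (lo, hi, steps, folds, seed) are made up for illustration and are not part of e1071, and iris is only stand-in data. Because the CV folds are drawn at random, the sketch resets the seed before every call to tune(), so that each stage is scored on the same folds and the "best" cost does not move around merely because the folds changed.

```r
library(e1071)

# Hypothetical helper, not part of e1071: search log2(cost) on a coarse grid,
# then re-search a narrower window around the current best with a finer step.
refine_cost <- function(x, y, lo = -15, hi = 15, steps = c(1, 0.2),
                        folds = 10, seed = 1, ...) {
  history <- list()
  for (step in steps) {
    exps <- seq(lo, hi, by = step)
    set.seed(seed)  # tune() draws its CV folds at random; same seed, same folds
    fit <- tune(svm, train.x = x, train.y = y,
                ranges = list(cost = 2^exps),
                tunecontrol = tune.control(sampling = "cross", cross = folds),
                ...)
    history[[length(history) + 1]] <- fit$performances
    best <- log2(fit$best.parameters$cost)
    lo <- best - step   # next window: one current grid step on either side
    hi <- best + step
  }
  list(best.cost = 2^best, best.error = fit$best.performance,
       history = history)
}

# Stand-in data (iris); replace with your own x and y.
x <- as.matrix(iris[, 1:4])
y <- iris$Species
res <- refine_cost(x, y, lo = -10, hi = 5, kernel = "linear")
res$best.cost
res$history[[2]]   # cost, error and dispersion on the finer grid
```

The performances table also carries a dispersion column (the spread of the error across folds). In line with Andy's remark, differences in error that are small relative to that spread are probably noise, so refining the grid beyond that point is unlikely to tell you much.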