Awesome! Good news, James. Thanks for letting us know. Glad you were able to sort this out.
-steve

On Thu, Aug 19, 2010 at 5:00 PM, Watling,James I <watli...@ufl.edu> wrote:
> Hi Steve--
>
> I spent some more time tuning the model with alternative gamma and cost
> values, but still kept coming back to the same issue re: probabilities. I
> spent some more time playing around with the code, and realized that the
> error did indeed have to do with the ifelse() function I used to feed the
> probabilities into the ascii file. I have rewritten the code with a
> replace() statement, and the probabilities have 'landed' in the correct place
> in the ascii file. The resulting map is exactly what I would expect.
>
> Thanks for your helpful suggestions that forced me to figure this out!
>
> Much appreciated
>
> James
>
>
> -----Original Message-----
> From: Steve Lianoglou [mailto:mailinglist.honey...@gmail.com]
> Sent: Thursday, August 19, 2010 11:39 AM
> To: Watling,James I
> Cc: r-h...@lists.r-project.org
> Subject: Re: [R] probabilities from predict.svm
>
> On Thu, Aug 19, 2010 at 10:56 AM, Watling,James I <watli...@ufl.edu> wrote:
>> Hi Steve--
>>
>> Thanks for your interest in helping me figure this out. I think the problem
>> has to do with the values of the probabilities returned from the use of the
>> model to predict occurrence in a new dataframe.
>
> Ok, so if you're sure this is the problem, and not, say, getting the
> correct values for the predictor variables at a given point, then I'd
> be a bit more thorough when building your model.
>
> Originally you said:
>
>> I have used a training dataset to train the model, and tested it against a
>> validation data set with good results: AUC is high, and the confusion matrix
>> indicates low commission and omission errors.
>
> Maybe your originally "good" AUC was just a function of your train/test
> split?
>
> Why not use all of your data and do something like 10 fold cross
> validation to find:
>
> (1) Your average accuracy over your folds
> (2) The best value for your cost parameter (how did you pick cost=10000?)
> (3) or even the best kernel to use.
>
> Doing 2 and 3 will likely be time consuming. To help with (2) you
> might try looking at the svmpath package:
>
> http://cran.r-project.org/web/packages/svmpath/index.html
>
> It only works on 2-class classification problems, and (I think) using
> a linear kernel (sorry, don't remember off hand, but it's written in
> the package help and linked pubs).
>
> You don't need to use svmpath, but then you'll need to define a "grid"
> of C values (or maybe a 2d grid, if your svm + kernel combo has more
> params) and train over these values ... takes lots of cpu time, but
> not too much human time.
>
> Does that make sense?
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
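
For anyone who lands on this thread with the same symptom: James did not post his code, so the following is only a guess at one way ifelse() can put probabilities in the wrong cells. It assumes the predictions exist only for the non-missing grid cells, so the probability vector is shorter than the grid. The names mask and probs are made up for illustration.

```r
# Hypothetical 6-cell grid: NA marks cells with no data (e.g. outside the study area).
mask  <- c(NA, 1, NA, 1, 1, NA)
# Hypothetical predicted probabilities, one per non-NA cell, in grid order.
probs <- c(0.1, 0.5, 0.9)

# ifelse() recycles 'probs' to the full grid length and then picks values by
# grid position, so the probabilities drift out of alignment.
wrong <- ifelse(is.na(mask), -9999, probs)
print(wrong)   # -9999.0  0.5 -9999.0  0.1  0.5 -9999.0

# replace() assigns the values, in order, to exactly the selected cells.
right <- replace(rep(-9999, length(mask)), !is.na(mask), probs)
print(right)   # -9999.0  0.1 -9999.0  0.5  0.9 -9999.0
```

If the probability vector is already as long as the grid, ifelse() lines up fine, so this only matters when the predictions were made on a subset of the cells.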
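
A rough sketch of the grid search over cost (and gamma) with 10 fold cross validation suggested above, using tune.svm() from e1071. The data here is a stand-in (two iris species in place of presence/absence records), and the grid values are arbitrary choices for illustration, not recommendations.

```r
library(e1071)  # provides svm(), tune.svm(), tune.control()

set.seed(1)  # fold assignment is random, so fix the seed for repeatability

# Stand-in two-class data (assumption): two iris species instead of the
# presence/absence records discussed in the thread.
dat <- droplevels(subset(iris, Species != "setosa"))

# Illustrative 2d grid: 5 cost values x 4 gamma values = 20 combinations,
# each scored by 10 fold cross validation.
fit <- tune.svm(Species ~ ., data = dat,
                kernel = "radial",
                cost   = 10^(-1:3),
                gamma  = 10^(-3:0),
                tunecontrol = tune.control(sampling = "cross", cross = 10))

print(fit$best.parameters)       # gamma/cost pair with the lowest CV error
print(1 - fit$best.performance)  # mean accuracy over the 10 folds for that pair
print(head(fit$performances))    # CV error and dispersion per grid point
```

To compare kernels, the same call can be repeated with a different kernel argument (for example "linear", which has no gamma to tune) and the best.performance values compared.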