Hi,
For a long time I have had problems running an SVM regression. Here is an example with the Ozone data, which represents my own data very well.

data(Ozone, package = "mlbench")
# Drop the first three variables and split the data into two parts
Ozone <- na.omit(Ozone[, -(1:3)])
index <- 1:nrow(Ozone)
testset <- Ozone[104:203, ]
trainset <- Ozone[1:103, ]
names(Ozone)

library(e1071)

# Train an SVM with an RBF kernel, without scaling
tuneobj <- tune.svm(V4 ~ ., data = trainset, gamma = 10^(-6:-3), cost = 10^(1:3))
summary(tuneobj)$best.parameters
svm.noscale <- svm(V4 ~ ., data = trainset, cost = 1000, gamma = 0.001, scale = FALSE)

Parameters:
   SVM-Type:  eps-regression
 SVM-Kernel:  radial
       cost:  1000
      gamma:  0.001
    epsilon:  0.1

Number of Support Vectors:  101

# I get 101 support vectors, which seems bad because I have only 103 training
# observations. When I test on the training set I get good results, but on the
# test set my predictions are pretty bad.

pred.noscale1 <- predict(svm.noscale, newdata = trainset, decision.values = TRUE)
crossprod(pred.noscale1 - trainset$V4) / 103
# [1,] 0.009827706

pred.noscale2 <- predict(svm.noscale, newdata = testset, decision.values = TRUE)
crossprod(pred.noscale2 - testset$V4) / 100
# [1,] 82.97046

# Primal parameters
w <- t(svm.noscale$coefs) %*% svm.noscale$SV
           V5        V6       V7       V8       V9       V10      V11  V12       V13
[1,] 44187.34 -265.8382 3741.839 6359.768 5455.063 -646352.6 317.6211 6456 -23256.67
b <- svm.noscale$rho
# [1] -10.46065

# It looks like overfitting. I suppose the problem comes from not scaling the data
# (the weights on V5 and V10 are very large).
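As a sanity check on what w = t(coefs) %*% SV means, here is a minimal sketch with simulated data (variable names x1, x2 and the data frame d are my own, not from the thread). For a linear kernel without scaling, the prediction can be reconstructed from w and rho by hand; note that in e1071 the intercept is *minus* rho:

```r
library(e1071)

set.seed(1)
x <- matrix(rnorm(200), ncol = 2)
y <- 2 * x[, 1] - 3 * x[, 2] + rnorm(100, sd = 0.1)
d <- data.frame(x1 = x[, 1], x2 = x[, 2], y = y)

# Linear kernel, no scaling: the primal weights are directly recoverable
fit <- svm(y ~ ., data = d, kernel = "linear", scale = FALSE)

w <- t(fit$coefs) %*% fit$SV   # primal weight vector
b <- -fit$rho                  # note the sign: the intercept is minus rho

# Reconstruct the predictions manually and compare with predict()
manual <- as.matrix(d[, c("x1", "x2")]) %*% t(w) + b
all.equal(as.vector(manual), as.vector(predict(fit, d)))  # should be TRUE
```

With scale = FALSE and a linear kernel the two sets of predictions agree, which confirms the sign convention on rho.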
# Now, scaling the data
svm.scale <- svm(V4 ~ ., data = trainset, cost = 1000, gamma = 0.001)

Parameters:
   SVM-Type:  eps-regression
 SVM-Kernel:  radial
       cost:  1000
      gamma:  0.001
    epsilon:  0.1

Number of Support Vectors:  86

# This seems better.
svm.pred1 <- predict(svm.scale, newdata = trainset, decision.values = TRUE)
crossprod(svm.pred1 - trainset$V4) / 103
# [1,] 9.459279

svm.pred2 <- predict(svm.scale, newdata = testset, decision.values = TRUE)
crossprod(svm.pred2 - testset$V4) / 100
# [1,] 26.51138

# Primal parameters
w <- t(svm.scale$coefs) %*% svm.scale$SV
            V5        V6       V7       V8       V9      V10      V11     V12       V13
[1,] -89.03491 -22.88782 146.8991 56.09881 217.0120 43.01645 -8.27661 50.2729 -60.78473
b <- svm.scale$rho
# [1] 18.42264

Looking at prediction alone, the scaled model is good, but I am mainly interested in w. Is it possible to improve this model so that the values in w are lower? I am actually trying to run an SVM-GARCH model, and one of its conditions is that the sum of the w's is < 1 (in my model I have only two independent variables). If you have any idea how to improve the model, or if you find any problem with it, please let me know.

Thanks in advance,
Marlene.

2009/8/31 Noah Silverman <n...@smartmediacorp.com>

> Thanks,
>
> I just remember that with RapidMiner there was always a screen showing the
> effective "weights" assigned to each input variable by the SVM. These
> numbers themselves weren't good for much, except that they really helped to
> visualize the data. It is rather useful to see how much relative weight
> (significance) the SVM assigned to each variable.
>
>
> On 8/31/09 12:54 AM, Achim Zeileis wrote:
> > On Mon, 31 Aug 2009, Noah Silverman wrote:
> >
> >> Steve,
> >>
> >> That doesn't work.
> >>
> >> I just trained an SVM with 80 variables.
> >> svm_model$coefs gives me a list of 10,000 items. My training set is
> >> 30,000 examples of 80 variables, so I have no idea what the 10,000
> >> items represent.
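One caveat worth flagging here, as a sketch with simulated data (names of my own choosing, assuming e1071's usual RBF formulation K(u, v) = exp(-gamma * |u - v|^2)): with a radial kernel, t(coefs) %*% SV is not a primal weight vector in input space at all. The primal solution lives in the implicit feature space, and the model predicts through the kernel expansion over the support vectors:

```r
library(e1071)

set.seed(1)
x <- matrix(rnorm(120), ncol = 2)
y <- sin(x[, 1]) + 0.5 * x[, 2]
d <- data.frame(x1 = x[, 1], x2 = x[, 2], y = y)

fit <- svm(y ~ ., data = d, kernel = "radial", gamma = 0.5, scale = FALSE)

# RBF kernel with the same gamma as the model
K <- function(u, v) exp(-0.5 * sum((u - v)^2))

# f(x) = sum_i coefs_i * K(SV_i, x) - rho
f <- function(xnew) {
  sum(sapply(seq_len(nrow(fit$SV)),
             function(i) fit$coefs[i] * K(fit$SV[i, ], xnew))) - fit$rho
}

# Matches predict() for the radial kernel; no finite w in input space is involved
all.equal(f(as.numeric(d[1, c("x1", "x2")])), unname(predict(fit, d[1, ])))
```

So any constraint such as sum(w) < 1 is only interpretable with a linear kernel (or after deciding what t(coefs) %*% SV is supposed to approximate for a non-linear one).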
> >
> > Presumably, the coefficients of the support vectors times the training
> > labels; see help("svm", package = "e1071"). See also
> >   http://www.jstatsoft.org/v15/i09/
> > for some background information and the different formulations available.
> >
> >> There should be some attribute that lists the "weights" for each of
> >> the 80 variables.
> >
> > Not sure what you are looking for. Maybe David, the author of svm()
> > (and now Cc), can help.
> > Z
> >
> >> --
> >> Noah
> >>
> >> On 8/30/09 7:47 PM, Steve Lianoglou wrote:
> >>> Hi,
> >>>
> >>> On Sun, Aug 30, 2009 at 6:10 PM, Noah
> >>> Silverman <n...@smartmediacorp.com> wrote:
> >>>
> >>>> Hello,
> >>>>
> >>>> I'm using the svm function from the e1071 package.
> >>>>
> >>>> It works well and gives me nice results.
> >>>>
> >>>> I'm very curious to see the actual coefficients calculated for each
> >>>> input variable. (Other packages, like RapidMiner, show you this
> >>>> automatically.)
> >>>>
> >>>> I've tried looking at attributes of the model and do see a
> >>>> "coefficients" item, but printing it returns a NULL result.
> >>>
> >>> Hmm, I don't see a "coefficients" attribute, but rather a "coefs"
> >>> attribute, which I guess is what you're looking for (?)
> >>>
> >>> Run "example(svm)" to its end and type:
> >>>
> >>> R> m$coefs
> >>>              [,1]
> >>>  [1,]  1.00884130
> >>>  [2,]  1.27446460
> >>>  [3,]  2.00000000
> >>>  [4,] -1.00000000
> >>>  [5,] -0.35480340
> >>>  [6,] -0.74043692
> >>>  [7,] -0.87635311
> >>>  [8,] -0.04857869
> >>>  [9,] -0.03721980
> >>> [10,] -0.64696793
> >>> [11,] -0.57894605
> >>>
> >>> HTH,
> >>>
> >>> -steve
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
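A note on the "10,000 items" above: in e1071, coefs has one row per support vector (and k - 1 columns for a k-class problem), so its size tracks the number of support vectors, not the 80 input variables. A minimal sketch with the built-in iris data (the object name m is my own):

```r
library(e1071)
data(iris)

# Multi-class classification: coefs is an n_SV x (n_classes - 1) matrix
m <- svm(Species ~ ., data = iris)

dim(m$coefs)   # one row per support vector, (n_classes - 1) columns
nrow(m$SV)     # same number of rows as coefs
length(m$coefs) == nrow(m$SV) * (nlevels(iris$Species) - 1)
```

So a 30,000-example training set yielding ~10,000 support vectors would produce a coefs object with ~10,000 rows, consistent with what was observed.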