Hi,


For a long time I have had problems running an SVM regression. Here is an example
with the Ozone data, which represents my own data very well.



 data(Ozone, package = "mlbench")

# I drop the first three variables and split the data into two parts

Ozone<- na.omit(Ozone[, -(1:3)])

 index <- 1:nrow(Ozone)

 testset <- Ozone[104:203,]

trainset <- Ozone[1:103, ]

names(Ozone)



library(e1071)

# train SVM with RBF kernel and without scaling

tuneobj = tune.svm(V4 ~ ., data = trainset, gamma = 10^(-6:-3), cost =
10^(1:3))

 summary(tuneobj)$best.parameters

 svm.noscale <- svm(V4 ~ ., data = trainset, cost = 1000, gamma =
0.001,scale=FALSE)



Parameters:

   SVM-Type:  eps-regression

 SVM-Kernel:  radial

       cost:  1000

      gamma:  0.001

    epsilon:  0.1





Number of Support Vectors:  101



# I get 101 support vectors, which seems bad because I have only 103 training
# observations.

# When I test on the training set I get good results, but on the test set my
# predictions are pretty bad.



pred.noscale1 <- predict( svm.noscale, newdata=trainset, decision.values=T)

crossprod(pred.noscale1 -  trainset$V4)/103

  #[1,] 0.009827706



pred.noscale2<- predict( svm.noscale, newdata=testset, decision.values=T)

crossprod(pred.noscale2 -  testset$V4)/100

 #[1,] 82.97046
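
The crossprod(...)/n lines above are mean squared errors with the sample size typed in by hand. A minimal sketch of the same calculation as a helper, so that the divisor follows the data. The name `mse` is hypothetical (it is not an e1071 function), and the numbers in the toy call are made up, not Ozone results:

```r
# Mean squared error between predictions and observed values.
# `mse` is a hypothetical helper name, not part of e1071.
mse <- function(pred, obs) {
  stopifnot(length(pred) == length(obs))
  mean((as.numeric(pred) - as.numeric(obs))^2)
}

# Toy call with made-up numbers (not Ozone results)
toy_mse <- mse(c(1, 2, 3), c(1, 2, 5))
print(toy_mse)
```

With the objects from this post it could be called as mse(pred.noscale2, testset$V4).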





# primal parameters

w <- t(svm.noscale$coefs) %*%svm.noscale$SV



           V5        V6       V7       V8       V9       V10      V11  V12       V13
[1,] 44187.34 -265.8382 3741.839 6359.768 5455.063 -646352.6 317.6211 6456 -23256.67

b=svm.noscale$rho

[1] -10.46065



# It seems that I am overfitting. I suppose the problem comes from not scaling
# the data (V5 and V10 are very large).

# Now with scaled data
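
According to the e1071 documentation, svm() with the default scale = TRUE standardises the predictors and the response to zero mean and unit variance, and reuses the stored centres and scales at prediction time. A rough base-R sketch of that idea on made-up data; the data frames `train` and `test` and their columns `a` and `b` are illustrative only, not the Ozone variables:

```r
set.seed(1)

# Made-up predictors on very different scales (illustrative only)
train <- data.frame(a = rnorm(50, 5700, 100), b = rnorm(50, 0.5, 0.1))
test  <- data.frame(a = rnorm(20, 5700, 100), b = rnorm(20, 0.5, 0.1))

# Standardise the training predictors and keep the centres and scales
train_s <- scale(train)
ctr <- attr(train_s, "scaled:center")
scl <- attr(train_s, "scaled:scale")

# Apply the training centres and scales to the test predictors
test_s <- scale(test, center = ctr, scale = scl)

# After scaling, the training columns have mean 0 and sd 1
print(round(colMeans(train_s), 10))
print(apply(train_s, 2, sd))
```

The test set is deliberately transformed with the training statistics, so its columns will be close to, but not exactly, mean 0 and sd 1.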



 svm.scale <- svm(V4 ~ ., data = trainset, cost = 1000, gamma = 0.001)



Parameters:

   SVM-Type:  eps-regression

 SVM-Kernel:  radial

       cost:  1000

      gamma:  0.001

    epsilon:  0.1



Number of Support Vectors:  86



# It seems better



svm.pred1 <- predict( svm.scale, newdata=trainset, decision.values=T)

 crossprod( svm.pred1 -  trainset$V4)/103

 #[1,] 9.459279



 svm.pred2 <- predict( svm.scale, newdata=testset, decision.values=T)

 crossprod( svm.pred2 - testset$V4)/100

#  26.51138





# primal parameters

w <- t(svm.scale$coefs) %*% svm.scale$SV



            V5        V6       V7       V8       V9      V10      V11     V12       V13
[1,] -89.03491 -22.88782 146.8991 56.09881 217.0120 43.01645 -8.27661 50.2729 -60.78473



b = svm.scale$rho

#[1] 18.42264



Looking only at prediction, the scaled model is good, but I'm mainly
interested in w. Is it possible to improve this model to get lower values for
w? Actually I'm trying to run an SVM-GARCH, and one condition of the model
is that the sum of the w's is < 1 (in my model I have only two independent
variables).
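
One caveat about reading w off the model: t(coefs) %*% SV is the primal weight vector only for a linear kernel. With the radial kernel used above, the weights live in the kernel's feature space and this product does not give per-variable weights. With scale = TRUE the result is also in scaled units. A hedged sketch, on made-up data with a linear kernel, of mapping the weights back to original units and checking them against predict(). The data frame `d` and the names `w_orig` and `b_orig` are hypothetical; the sign of the intercept follows the e1071 help page, which describes `rho` as the negative intercept:

```r
library(e1071)

set.seed(42)
# Made-up data with two predictors on different scales (not the Ozone data)
n <- 200
d <- data.frame(x1 = rnorm(n, 100, 20), x2 = rnorm(n, 0, 0.5))
d$y <- 0.3 * d$x1 + 4 * d$x2 + rnorm(n, sd = 0.5)

# Linear kernel: only here is t(coefs) %*% SV the primal weight vector
fit <- svm(y ~ ., data = d, kernel = "linear", cost = 10)

w_scaled <- drop(t(fit$coefs) %*% fit$SV)  # weights in scaled units
b_scaled <- -fit$rho                       # rho is the negative intercept

# Undo the internal centring and scaling of x and y
sx <- fit$x.scale[["scaled:scale"]]
mx <- fit$x.scale[["scaled:center"]]
sy <- fit$y.scale[["scaled:scale"]]
my <- fit$y.scale[["scaled:center"]]
w_orig <- w_scaled * sy / sx
b_orig <- my + sy * (b_scaled - sum(w_scaled * mx / sx))

# The explicit linear formula should agree with predict()
manual <- drop(as.matrix(d[, c("x1", "x2")]) %*% w_orig) + b_orig
print(all.equal(unname(manual), unname(predict(fit, d))))
print(w_orig)
```

A constraint such as sum(w) < 1 would then be checked on `w_orig` or `w_scaled`, depending on which units the SVM-GARCH condition is stated in.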



If you have any idea how to improve the model, or if you find any problem
with it, please let me know.



Thanks in advance,


Marlene.



2009/8/31 Noah Silverman <n...@smartmediacorp.com>

> Thanks,
>
> I just remember with RapidMiner, there was always a screen showing the
> effective "weights" assigned to each input variable by the SVM.  These
> numbers themselves weren't good for much, except they really helped to
> visualize the data.  It is rather useful to see how much relative weight
> (significance) the SVM assigned to each variable.
>
>
> On 8/31/09 12:54 AM, Achim Zeileis wrote:
> > On Mon, 31 Aug 2009, Noah Silverman wrote:
> >
> >> Steve,
> >>
> >> That doesn't work.
> >>
> >> I just trained an SVM with 80 variables.
> >> svm_model$coefs gives me  a list of 10,000 items.  My training set is
> >> 30,000 examples of 80 variables, so I have no idea what the 10,000
> >> items represent.
> >
> > Presumably, the coefficients of the support vectors times the training
> > labels, see help("svm", package = "e1071"). See also
> >   http://www.jstatsoft.org/v15/i09/
> > for some background information and the different formulations available.
> >
> >> There should be some attribute that lists the "weights" for each of
> >> the 80 variables.
> >
> > Not sure what you are looking for. Maybe David, the author of svm()
> > (and now Cc), can help.
> > Z
> >
> >> --
> >> Noah
> >>
> >> On 8/30/09 7:47 PM, Steve Lianoglou wrote:
> >>> Hi,
> >>>
> >>> On Sun, Aug 30, 2009 at 6:10 PM, Noah
> >>> Silverman<n...@smartmediacorp.com> wrote:
> >>>
> >>>> Hello,
> >>>>
> >>>> I'm using the svm function from the e1071 package.
> >>>>
> >>>> It works well and gives me nice results.
> >>>>
> >>>> I'm very curious to see the actual coefficients calculated for each
> >>>> input
> >>>> variable.  (Other packages, like RapidMiner, show you this
> >>>> automatically.)
> >>>>
> >>>> I've tried looking at attributes for the model and do see a
> >>>> "coefficients"
> >>>> item, but printing it returns an NULL result.
> >>>>
> >>> Hmm .. I don't see a "coefficients" attribute, but rather a "coefs"
> >>> attribute, which I guess is what you're looking for (?)
> >>>
> >>> Run "example(svm)" to its end and type:
> >>>
> >>> R>  m$coefs
> >>>               [,1]
> >>>   [1,]  1.00884130
> >>>   [2,]  1.27446460
> >>>   [3,]  2.00000000
> >>>   [4,] -1.00000000
> >>>   [5,] -0.35480340
> >>>   [6,] -0.74043692
> >>>   [7,] -0.87635311
> >>>   [8,] -0.04857869
> >>>   [9,] -0.03721980
> >>> [10,] -0.64696793
> >>> [11,] -0.57894605
> >>>
> >>> HTH,
> >>>
> >>> -steve
> >>>
> >>>
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >>
>
>
