Re: [R] statdataml question

2007-08-20 Thread David Meyer
Bryan:

> 
> Hi,
> 
> I was wondering if Statdataml is currently the preferred way to
> represent statistical data in XML in R.

This is hard to tell. We think it is a _possible_ way of doing that. Do you 
have some alternatives in mind?

> And also if the Statdataml api
> provides ways to load the XML as a HTTP GET?

You mean directly from the Internet like:

readSDML("http://wi.wu-wien.ac.at/home/meyer/test.sdml")

? Yes, this comes for free since readSDML() uses xmlTreeParse(), which 
supports that.

Best,
David

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nnet 10-fold cross-validation

2007-07-24 Thread David Meyer
Hi all,

there is tune() in the e1071 package for doing this in general, and, 
among others, a tune.nnet() wrapper (see ?tune):


 > tmodel = tune.nnet(Species ~ ., data = iris, size = 1:5)
 > summary(tmodel)

Parameter tuning of `nnet':

- sampling method: 10-fold cross validation

- best parameters:
  size
 1

- best performance: 0.0133

- Detailed performance results:
  size  error dispersion
1    1 0.0133 0.02810913
2    2 0.0267 0.04661373
3    3 0.0267 0.04661373
4    4 0.0200 0.04499657
5    5 0.0267 0.04661373

 > plot(tmodel)
 > tmodel$best.model
a 4-1-3 network with 11 weights
inputs: Sepal.Length Sepal.Width Petal.Length Petal.Width
output(s): Species
options were - softmax modelling

etc.

Best
David



On 7/23/07, S.O. Nyangoma <[EMAIL PROTECTED]> wrote:
 > > Hi
 > > It is clear that to do a classification with svm under 10-fold cross
 > > validation one uses
 > >
 > > svm(Xm, newlabs, type = "C-classification", kernel = "linear",cross =
 > > 10)
 > >
 > > What corresponds to the nnet?
 > > nnet(.,cross=10)?



[R] [R-pkgs] package "relations" updated

2007-07-10 Thread David Meyer
Dear useRs,

Version 0.2 of package "relations" appeared on CRAN and is currently 
propagating to the mirrors. In addition to some bug fixes, the new 
release includes:

   o an introductory vignette showing the main features;

   o new SD fitters for the C ("complete") and A ("antisymmetric")
 families of relations;

   o a fitter for Copeland's method;

   o the relation_classes() function to extract and pretty-print
 (ordered) classes from preferences and equivalences;

   o the function relation_violations() to compute a measure of
 remoteness from a specified property (e.g., symmetry,
 transitivity, etc.).

David and Kurt.





-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Tel: +43-1-313 36 4393
Fax: +43-1-313 36 90 4393
HP:  http://wi.wu-wien.ac.at/~meyer/

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages



[R] Problems with e1071 and SparseM

2007-07-09 Thread David Meyer
Chris:

yes, this is indeed a bug (in predict.svm) - will be fixed in the next 
release of e1071.

Thanks for pointing this out,

David

--

Hello all,


I am trying to use the "svm" method provided by e1071 (Version: 1.5-16)
together with a matrix provided by the SparseM package (Version: 0.73)
but it fails with this message:

 > > model <- svm(lm, lv, scale = TRUE, type = 'C-classification', kernel =
'linear')
Error in t.default(x) : argument is not a matrix

although lm was created before with read.matrix.csr (from the e1071)
package.

I also tried to simply convert a normal matrix to a SparseM matrix and
then pass it, but I get the same error again.

According to the manual of svm(), this is supposed to work though:

"   x: a data matrix, a vector, or a sparse matrix (object of class
   'matrix.csr' as provided by the package 'SparseM')."

Used R version: R version 2.4.0 Patched (2006-11-25 r39997)

Does anyone know how I can use Sparse Matrices with e1071? This would be
really important because the matrix is simply too large to write it out.


Best regards,


Chris



[R] [R-pkgs] New package "proxy" for distances and similarities

2007-07-09 Thread David Meyer
Dear useRs,

a new package for computing distance and similarity matrices made it to 
CRAN, and will propagate to the mirrors soon.

It includes an enhanced version of "dist()" with support for more than 
40 popular similarity and distance measures, both for auto- and 
cross-distances. Some important ones are implemented in C.

The proximity measures are stored in a registry which can easily be 
queried and extended by users at run-time. The simplest way to add a new 
measure is to provide it as a small R function; the package code then 
does the loops at the C level to create the proximity matrix. It is of 
course also possible to use more efficient C implementations, either for 
the distance measure alone or for the whole matrix computation.

Input data is not restricted to matrices: provided the proximity measure 
can handle it, lists and data frames are also accepted.

The formulas for binary proximities can conveniently be specified in the 
a/b/c/d/n format, where the number of concordant/discordant pairs is 
precomputed on the C code level.
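
To make the a/b/c/d/n notation concrete, here is a minimal base-R sketch (not the package's own code). It assumes the usual convention for two binary vectors: a counts positions where both are TRUE, b and c where only one of them is, d where both are FALSE, and n is the total. The helper names binary_counts() and jaccard() are made up for this illustration.

```r
# Sketch: the a/b/c/d/n counts for two binary vectors, in base R.
# 'binary_counts' and 'jaccard' are illustrative names, not part of proxy.
binary_counts <- function(x, y) {
  x <- as.logical(x)
  y <- as.logical(y)
  c(a = sum(x & y),    # present in both
    b = sum(x & !y),   # present only in x
    c = sum(!x & y),   # present only in y
    d = sum(!x & !y),  # absent in both
    n = length(x))
}

# Jaccard similarity written in terms of those counts: a / (a + b + c)
jaccard <- function(x, y) {
  k <- binary_counts(x, y)
  unname(k["a"] / (k["a"] + k["b"] + k["c"]))
}

x <- c(1, 1, 0, 0, 1)
y <- c(1, 0, 0, 1, 1)
binary_counts(x, y)
jaccard(x, y)
```

A measure specified this way only needs the formula in terms of a, b, c, d and n; the counting itself is the part the package is said to precompute.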

We are currently working on support for sparse data.

This is also a "Call for Measures": if you feel that a particular 
similarity or distance measure is missing, please send the formula and a 
reference (or, ideally, the whole registry entry) to one of the package 
maintainers, who will happily add it.

David and Christian.



Re: [R] Question for svm function in e1071

2007-07-06 Thread David Meyer
Adschai:

The function is written in C++, so debugging the source code of the R 
svm() function will not really help. What you can do to make the C-code 
more verbose is the following:

- get the sources of e1071
- in the src/ directory, look up the svm.cpp file
- In line 37, there is:


#if 0
void info(char *fmt,...)

[...]

replace the first line by:

#if 1

- build and install the package again.

Best
David



Sorry that I have many questions today. I am using the svm function on about
180,000 points of training set. It takes a very long time to run. However,
I would like it to print something to make sure that the run is not
dead in between.  Would you please suggest any way to do so?



Re: [R] Question about framework to weighting different classes in SVM

2007-07-05 Thread David Meyer
Adschai:

here is an example for class.weights (isn't it on the help page?):

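  library(e1071)
  data(iris)
  i2 <- iris
  ## merge "virginica" into "versicolor" to get unbalanced classes
  levels(i2$Species)[3] <- "versicolor"
  summary(i2$Species)
  ## weights inversely proportional to the class frequencies
  wts <- 100 / table(i2$Species)
  wts
  m <- svm(Species ~ ., data = i2, class.weights = wts)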

Cheers,
David



[R] [R-pkgs] New package: relations

2007-06-04 Thread David Meyer
Dear useRs,

it is our great pleasure to announce the new package "relations" to
appear on all CRAN-mirrors soon.

This package provides data structures and methods for creating and
manipulating relations, relation ensembles, sets, and tuples. The
feature list includes:

* creation of relations by domain and graph/characteristic
function/incidences,

* extraction of characteristic function and graph,

* predicate functions for the most common standard characteristics,

* operators known from relational algebra theory (such as projection,
selection, cartesian product, joins, etc.),

* transitive/reflexive reduction and closure of a relation,

* relation ensembles for combining relations,

* fitters for determining (possibly all) consensus relations of a
relation ensemble including the Borda and Condorcet methods, as well as
exact solvers for minimizing a criterion function based on the symmetric
difference (Kemeny-Snell) metric.

* a simple plot method for Hasse-diagrams using RGraphviz.


Kurt and David



[R] string edit distance

2007-04-10 Thread David Meyer
It's in package cba (sdists()).
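
As a base-R alternative (a different function, not cba's sdists()), current versions of R ship adist() in package utils, which computes the generalized Levenshtein distance between all pairs of strings. A small sketch using the words from the question:

```r
# Base-R alternative (not cba::sdists): utils::adist() returns the matrix of
# generalized Levenshtein (edit) distances between all pairs of strings.
words <- c("DOG", "DOOG", "GOD", "GOOD", "DOOR")
d <- adist(words)
dimnames(d) <- list(words, words)
d
```

adist() also takes a 'costs' argument for weighting insertions, deletions and substitutions differently; transpositions are not covered by it, so variations of that kind would still need custom code or another package.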

David

--

I have a column of words, for example

"DOG"
"DOOG"
"GOD"
"GOOD"
"DOOR"
...

and I am interested in creating a matrix that contains the string
edit distances between each pair of words.  I am this close  -> '  '
<-   to writing the algorithm myself (which will allow for different
variations on the string edit rules, indels, plus or minus
transpositions, and possibly some variations on that), but I figured
I'd see if anyone on the list has any experience with this and might
already have some shoulders for me to stand on.



Re: [R] training svm

2007-03-18 Thread David Meyer
Oldrich:

> 
> The columns in data passed to svm need to contain only numeral values.

This is not correct: svm() of course also accepts factors and then 
builds a model matrix similar to lm(). But it won't accept, e.g., 
character vectors.
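
A short base-R sketch of what "builds a model matrix" means, using model.matrix() (the mechanism behind lm()) on the built-in iris data. The small data frame 'd' is made up to show the conversion of a character column into a factor; this illustrates the idea and is not the internal code of svm().

```r
# Sketch: a factor is expanded into dummy columns by model.matrix().
mm <- model.matrix(~ Species + Sepal.Length, data = iris)
colnames(mm)

# A character column should be converted to a factor before modelling
# ('d', 'g' and 'v' are hypothetical names):
d <- data.frame(g = c("a", "b", "a"), v = 1:3, stringsAsFactors = FALSE)
d$g <- factor(d$g)
str(d)
```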

> I simply assigned a number to each category of each feature. However,
> there must not be a column where all the numbers are equal 

yes, since the intercept is always included in svm models anyway.

Best
David



[R] distance metrics

2007-03-13 Thread David Meyer
 > Hello:
 > > >
 > > > Does anyone know if there exists a package that handles methods for [
 > for
 > > > dist objects?
 > > >
 > > > I would like to access a dist object using matrix notation
 > > >
 > > > e.g.
 > > >
 > > > dMat = dist(x)
 > > > dMat[i,j]


You can use the [[ operator defined for distance matrices currently in 
package cba, which allows subsetting "dist" objects. (Note that this 
will move to the new "proxy" package on proximity measures very soon).
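
For comparison, element access can also be sketched in base R without any extra package. The helper dist_at() below is hypothetical (it is not the cba/proxy operator); it uses the layout documented in ?dist, where the lower triangle is stored column by column.

```r
# Sketch of element access for a "dist" object in base R.
# 'dist_at' is a hypothetical helper, not the cba/proxy operator.
dist_at <- function(d, i, j) {
  n <- attr(d, "Size")
  if (i == j) return(0)
  if (i > j) { tmp <- i; i <- j; j <- tmp }   # only the lower triangle is stored
  # position of element (i, j), i < j, in the column-wise lower triangle
  d[n * (i - 1) - i * (i - 1) / 2 + j - i]
}

x <- matrix(c(0, 0, 3, 4, 6, 8), ncol = 2, byrow = TRUE)
dMat <- dist(x)
dist_at(dMat, 1, 2)        # same value as as.matrix(dMat)[1, 2]
as.matrix(dMat)[1, 2]
```

Converting with as.matrix() is the simplest route for small problems; the index arithmetic avoids building the full n x n matrix when n is large.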

David



Re: [R] PROC TABULATE with R

2007-03-01 Thread David Meyer
You can also have a look at structable() in vcd, especially the indexing 
functions. The problem with OLAP in R is that you will have to create 
something like a hierarchical factor to handle rollup/drilldown correctly.

And you will have to decide whether it is purely memory-based (fast 
calculations, but limited by memory) or done using databases / SQL (slow).

And finally, you will need some simple GUI to provide interactive use 
(doing OLAP using command-line functions is not really OLAP).
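
As a rough illustration of the memory-based variant, a multi-way table in base R can already act as a small cube. The sketch below uses the built-in Titanic table and base functions only (it does not use structable()), and the terms roll-up and drill-down are applied loosely.

```r
# Sketch: a small in-memory "cube" with base R tables (not vcd::structable()).
cube   <- Titanic                            # 4-way table: Class x Sex x Age x Survived
rollup <- margin.table(cube, c(1, 4))        # roll up to Class x Survived
drill  <- cube["1st", , , ]                  # drill down into one level of Class
ftable(cube, row.vars = c("Class", "Sex"))   # flat display of the full cube
```

This covers plain factors only; hierarchical dimensions (e.g., day within month within year) are exactly the part that would need the additional structure mentioned above.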

Best,
David

-

 > Hi !
 > > >
 > > > with apply or tapply-like functions, is it possible to create
 > > > multidimensional cubes with R ? Like with SAS and its function PROC
 > TABULATE
 > > > or OLAP ?
 > > > Is there some functions or modules to access OLAP databases with R ?
 > > >
 > > > My idea is to create a package for that, since XMLA and JOLAP
 > specifications
 > > > should able us to do so !
 > > >
 > > >

-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Tel: +43-1-313 36 4393
Fax: +43-1-313 36 90 4393
HP:  http://wi.wu-wien.ac.at/~meyer/



[R] training svm

2007-02-27 Thread David Meyer
Hello (whoever you are),

your data looks problematic. What does

head(ne_span_data)

reveal?

BTW, svm() will not handle NA values.
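
A few base-R checks that may help when inspecting such data. The data frame 'd' and its columns are made up for this sketch; the last line shows one thing that can go wrong when a data frame with mixed column types is passed through as.matrix(), as in the code quoted below.

```r
# Sketch with a made-up data frame ('d' and its columns are hypothetical).
d <- data.frame(type = factor(c("A", "B", NA, "A")),
                len  = c(2, NA, 1, 3),
                flag = c(0, 1, 0, 1))
colSums(is.na(d))          # number of NAs per column
sapply(d, class)           # column types as R sees them
d_ok <- na.omit(d)         # keep only the complete rows
nrow(d_ok)

# as.matrix() on mixed columns yields a character matrix, so the
# factor/numeric distinction is lost:
is.character(as.matrix(d))
```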

Best
David

-

Hello. I'm new to R and I'm trying to solve a classification problem. I have
a training dataset of about 40,000 rows and 50 columns. When I try to train
support vector machine, it gives me this error after a few seconds:

  Error in predict.svm(ret, xhold) : Model is empty!

This is the code I use:

  ne_span_data <- as.matrix(read.table('ne_span.data.R.txt', header=TRUE,
row.names='id'))
  library('e1071')
  svm_ne_span_model <- svm(NE_type ~ . , ne_span_data)

it gives me:
Error in predict.svm(ret, xhold) : Model is empty!

A line from the ne_span.data.R.txt file:
  svt OTHER N N I S 2 NA NA NA NA NA A NA NA 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 train-s1m2



[R] "contingency table" for several variables

2007-02-21 Thread David Meyer
David:

 > I'm trying to draw ONE table that summarizes SEVERAL categorical
 > variables according to one classification variable, say “sex”. The
 > result would look like several contingency tables appended one to the
 > other. All the variables belong to a data frame.

 > The summary.formula in Hmisc package does something pretty close and is
 > ready for a LaTeX export, but I need either to get rid of the percentage
 > (or put the count prior to the percentage) in the “reverse” option or to
 > add a chi-square test in the “response” method.


You could have a look at structable() in package vcd.
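
If plain counts are enough, the "several tables appended" layout can also be sketched in base R. This is not structable(); the data frame and its columns (sex, smoker, group) are invented for the example.

```r
# Made-up data; 'sex', 'smoker' and 'group' are hypothetical columns.
d <- data.frame(sex    = factor(c("f", "m", "f", "m", "f", "m")),
                smoker = factor(c("yes", "no", "no", "no", "yes", "yes")),
                group  = factor(c("a", "b", "c", "a", "b", "c")))
vars <- c("smoker", "group")

# one contingency table per variable, each cross-classified by sex
tabs <- lapply(vars, function(v) table(d[[v]], d$sex))

# append the tables and label the rows as variable:level
stacked <- do.call(rbind, tabs)
rownames(stacked) <- paste(rep(vars, sapply(tabs, nrow)),
                           rownames(stacked), sep = ":")
stacked
```

A chi-square test per variable could be added with chisq.test() on each element of 'tabs', keeping in mind that with counts as small as in this toy data the approximation would not be reliable.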


Best
David



[R] naiveBayes question

2007-01-22 Thread David Meyer
Aimin:

The problem is that the columns you choose for training (only 4 
variables) do not match the ones used for prediction (all except y).
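
One way to keep the two in sync is to derive the predictor columns from the model formula. The sketch below is base R only; 'training' is a tiny made-up stand-in for the poster's data frame, reduced to a few of its columns.

```r
# Sketch: take the predictor names from the formula so that the data passed
# to predict() contain the same variables that were used for training.
f <- y ~ aa_three + bas + bcu + aa_ss
predictors <- all.vars(f)[-1]          # drop the response 'y'

# made-up stand-in for the real 'training' data frame
training <- data.frame(aa_three = c("ALA", "PRO"), aa_one = c("A", "P"),
                       aa_ss = c("C", "C"), bas = c(0, 0), bcu = c(0, 0),
                       y = c("outward", "outward"))
newdata <- training[, predictors, drop = FALSE]
names(newdata)
```

With the real data, passing training[, predictors] instead of training[, -13] to predict() should then avoid the mismatch described above.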

David



I try to use naiveBayes

  > p.nb.90<-naiveBayes(y~aa_three+bas+bcu+aa_ss,data=training)
  > pr.nb.90<-table(predict(p.nb.90,training[,-13],type="class"),training[,13])

but I get this error
Error in object$tables[[v]] : subscript out of bounds
  >
head of the data set:
  > head(training)
    pr aa_three aa_one aa_ss aa_pos    aas bas   ams bms        acu bcu     omega       y index
1 1acx      ALA      A     C      1 127.71   0 69.99   0 -0.2498560   0  79.91470 outward  TRUE
2 1acx      PRO      P     C      2  68.55   0 55.44   0 -0.0949008   0  76.60380 outward  TRUE
3 1acx      ALA      A     E      3  52.72   0 47.82   0 -0.0396550   0  52.19970 outward  TRUE
4 1acx      PHE      F     E      4  22.62   0 31.21   0  0.1270330   0 169.52500  inward  TRUE
5 1acx      SER      S     E      5  71.32   0 52.84   0 -0.1312380   0   7.47528 outward  TRUE
6 1acx      VAL      V     E      6  12.92   0 22.40   0  0.1728390   0 149.09400  inward  TRUE

anyone know why?

Aimin



Re: [R] svm plot question

2006-12-09 Thread David Meyer
Aimin:

hard to tell. IMO, without specifying defaults, it could only work with
purely numeric data since factors were wrongly processed.

David.

Aimin Yan wrote:
> thanks, I did get this plot.
> Before I have this problem, I did get a plot by my code.
> However after I change a little my code. it doesn't work.
> It is pity not saving my original code.
> 
> Now the question is the plot I get using your code is different from
> what I got before.
> Moreover I did remember I use plot(m.svm,p5.new,As~Cur)
> 
> Do you know why?
> 
> Thanks,
> 
> Aimin
> 
> At 06:32 AM 12/8/2006, David Meyer wrote:
>> Aimin:
>>
>> 1) Please do not spam the r-help list---one request per issue (and two
>> private mails to the code author) really suffice. Not all contributors
>> to the R-project are on-line 24/24, and have time to provide immediate
>> answers.
>>
>> 2) The error occurs because plot.svm() currently does not set valid
>> defaults for categorical dimensions you are conditioning on for your
>> 2D-plot (in your example: 'P' and 'Aa') which certainly is a bug. I will
>> commit a fix for the next release of e1071. For the time being, you will
>> have to explicitly specify the levels of 'P' and 'Aa':
>>
>> plot(m.svm,p5.new,As~Cur, slice = list(P = factor("821p", levels =
>> levels(P)), Aa = factor("ALA", levels = levels(Aa))))
>>
>> (Note that the defaults for the "slice" argument are completely
>> arbitrary anyway).
>>
>> Thanks for pointing this out,
>>
>> David
>>
>> Aimin Yan wrote:
>> > I have a question about svm in R
>> >
>> > I run the following code, all other is ok,
>> > but plot(m.svm,p5.new,As~Cur) is not ok
>> >
>> > Do you know why?
>> >
>> > install.packages("e1071")
>> > library(e1071)
>> > library(MASS)
>> > p5 <- read.csv("http://www.public.iastate.edu/~aiminy/data/p_5_2.csv")
>> > p5.new<-subset(p5,select=-Ms)
>> > p5.new$Y<-factor(p5.new$Y)
>> > levels(p5.new$Y) <- list(Out=c(1), In=c(0))
>> > attach(p5.new)
>> > m.svm<-svm(Y~P+Aa+As+Cur,data=p5.new)
>> > summary(m.svm)
>> > plot(m.svm,p5.new,As~Cur)
>> >
>> > Here is output:
>> >
>> >> install.packages("e1071")
>> > --- Please select a CRAN mirror for use in this session ---
>> > trying URL
>> >
>> 'http://rh-mirror.linux.iastate.edu/CRAN/bin/windows/contrib/2.4/e1071_1.5-16.zip'
>>
>> >
>> > Content type 'application/zip' length 592258 bytes
>> > opened URL
>> > downloaded 578Kb
>> >
>> > package 'e1071' successfully unpacked and MD5 sums checked
>> >
>> > The downloaded packages are in
>> > C:\Documents and Settings\aiminy\Local
>> > Settings\Temp\RtmpY0B2qb\downloaded_packages
>> > updating HTML package descriptions
>> >> library(e1071)
>> > Loading required package: class
>> >> library(MASS)
>> >> p5 <- read.csv("http://www.public.iastate.edu/~aiminy/data/p_5_2.csv")
>> >> p5.new<-subset(p5,select=-Ms)
>> >> p5.new$Y<-factor(p5.new$Y)
>> >> levels(p5.new$Y) <- list(Out=c(1), In=c(0))
>> >> attach(p5.new)
>> >> m.svm<-svm(Y~P+Aa+As+Cur,data=p5.new)
>> >> summary(m.svm)
>> >
>> > Call:
>> > svm(formula = Y ~ P + Aa + As + Cur, data = p5.new)
>> >
>> >
>> > Parameters:
>> >    SVM-Type:  C-classification
>> >  SVM-Kernel:  radial
>> >        cost:  1
>> >       gamma:  0.04
>> >
>> > Number of Support Vectors:  758
>> >
>> >  ( 382 376 )
>> >
>> >
>> > Number of Classes:  2
>> >
>> > Levels:
>> >  Out In
>> >
>> >
>> >
>> >> plot(m.svm,p5.new,As~Cur)
>> > Error in scale(newdata[, object$scaled, drop = FALSE], center =
>> > object$x.scale$"scaled:center",  :
>> > (subscript) logical subscript too long
>> >>
>> >>
>> >
>> >
>> >
>>
>> -- 
>> Dr. David Meyer
>> Department of Information Systems and Operations
>>
>> Vienna University of Economics and Business Administration
>> Augasse 2-6, A-1090 Wien, Austria, Europe
>> Tel: +43-1-313 36 4393
>> Fax: +43-1-313 36 90 4393
>> HP:  http://wi.wu-wien.ac.at/~meyer/
> 
> 
> 
> 

-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Tel: +43-1-313 36 4393
Fax: +43-1-313 36 90 4393
HP:  http://wi.wu-wien.ac.at/~meyer/



Re: [R] svm plot question

2006-12-08 Thread David Meyer
Aimin:

1) Please do not spam the r-help list---one request per issue (and two
private mails to the code author) really suffice. Not all contributors
to the R-project are on-line 24/24, and have time to provide immediate
answers.

2) The error occurs because plot.svm() currently does not set valid
defaults for categorical dimensions you are conditioning on for your
2D-plot (in your example: 'P' and 'Aa') which certainly is a bug. I will
commit a fix for the next release of e1071. For the time being, you will
have to explicitly specify the levels of 'P' and 'Aa':

plot(m.svm,p5.new,As~Cur, slice = list(P = factor("821p", levels =
levels(P)), Aa = factor("ALA", levels = levels(Aa))))

(Note that the defaults for the "slice" argument are completely
arbitrary anyway).

Thanks for pointing this out,

David

Aimin Yan wrote:
> I have a question about svm in R
> 
> I run the following code, all other is ok,
> but plot(m.svm,p5.new,As~Cur) is not ok
> 
> Do you know why?
> 
> install.packages("e1071")
> library(e1071)
> library(MASS)
> p5 <- read.csv("http://www.public.iastate.edu/~aiminy/data/p_5_2.csv")
> p5.new<-subset(p5,select=-Ms)
> p5.new$Y<-factor(p5.new$Y)
> levels(p5.new$Y) <- list(Out=c(1), In=c(0))
> attach(p5.new)
> m.svm<-svm(Y~P+Aa+As+Cur,data=p5.new)
> summary(m.svm)
> plot(m.svm,p5.new,As~Cur)
> 
> Here is output:
> 
>> install.packages("e1071")
> --- Please select a CRAN mirror for use in this session ---
> trying URL
> 'http://rh-mirror.linux.iastate.edu/CRAN/bin/windows/contrib/2.4/e1071_1.5-16.zip'
> 
> Content type 'application/zip' length 592258 bytes
> opened URL
> downloaded 578Kb
> 
> package 'e1071' successfully unpacked and MD5 sums checked
> 
> The downloaded packages are in
> C:\Documents and Settings\aiminy\Local
> Settings\Temp\RtmpY0B2qb\downloaded_packages
> updating HTML package descriptions
>> library(e1071)
> Loading required package: class
>> library(MASS)
>> p5 <- read.csv("http://www.public.iastate.edu/~aiminy/data/p_5_2.csv")
>> p5.new<-subset(p5,select=-Ms)
>> p5.new$Y<-factor(p5.new$Y)
>> levels(p5.new$Y) <- list(Out=c(1), In=c(0))
>> attach(p5.new)
>> m.svm<-svm(Y~P+Aa+As+Cur,data=p5.new)
>> summary(m.svm)
> 
> Call:
> svm(formula = Y ~ P + Aa + As + Cur, data = p5.new)
> 
> 
> Parameters:
>    SVM-Type:  C-classification
>  SVM-Kernel:  radial
>        cost:  1
>       gamma:  0.04
> 
> Number of Support Vectors:  758
> 
>  ( 382 376 )
> 
> 
> Number of Classes:  2
> 
> Levels:
>  Out In
> 
> 
> 
>> plot(m.svm,p5.new,As~Cur)
> Error in scale(newdata[, object$scaled, drop = FALSE], center =
> object$x.scale$"scaled:center",  :
> (subscript) logical subscript too long
>>
>>
> 
> 
> 

-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Tel: +43-1-313 36 4393
Fax: +43-1-313 36 90 4393
HP:  http://wi.wu-wien.ac.at/~meyer/



Re: [R] vcd package, assoc()

2006-12-06 Thread David Meyer
Yes, he will :)

Thanks, Uwe!

David

Uwe Ligges schrieb:
>
>
> Nicolas Mazziotta wrote:
>> R version is R 2.2.1 (kubuntu dapper package)
>
> So update your outdated version of R at first!
> At least to R-2.4.0, even better to R-2.4.0 patched which will soon 
> become R-2.4.1.
>
> Anyway, David, the vcd maintainer, is certainly going to fix the 
> DESCRIPTION's "Depends" entry.
>
> Best,
> Uwe Ligges
>
>
>> Le mercredi 06 décembre 2006 08:26, Uwe Ligges a écrit :
>>>> Error in unit.c(mar[4], unit(1, "null"), mar[2], legend_width) :
>>>> It is invalid to combine unit objects with other types
>>> Works for me. Which version of R is this? R-2.4.0, R-patched or 
>>> R-devel?
>>>
>>
>
>
>

-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Tel: +43-1-313 36 4393
Fax: +43-1-313 36 90 4393 
HP:  http://wi.wu-wien.ac.at/~meyer/



[R] package e1071 - class probabilities

2006-09-30 Thread David Meyer
Vince:

the implementations for both are different, so this might happen
(although undesirably).

Can you provide me an example with data (off-list)?

David



[R] comparing 2 odds ratios

2006-06-27 Thread David Meyer
Hi,

you can have a look at fourfold() and oddsratio() in package vcd.
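
For a quick numerical comparison in base R, one common large-sample approach is a z-test on the difference of the two log odds ratios. The sketch below is not what fourfold() does; it uses invented counts and assumes that the two 2x2 tables come from independent samples, which would not hold if both tests were applied to the same patients.

```r
# Sketch: large-sample z-test for equality of two odds ratios, assuming the
# two 2x2 tables come from independent samples. All counts are made up.
log_or <- function(tab) {
  est <- log(tab[1, 1] * tab[2, 2] / (tab[1, 2] * tab[2, 1]))
  se  <- sqrt(sum(1 / tab))            # Woolf standard error of the log odds ratio
  c(estimate = est, se = se)
}

# rows: test positive / negative, columns: disease present / absent
t1 <- matrix(c(40, 10, 15, 35), 2, byrow = TRUE)   # test 1
t2 <- matrix(c(30, 20, 20, 30), 2, byrow = TRUE)   # test 2

l1 <- log_or(t1)
l2 <- log_or(t2)
z <- unname((l1["estimate"] - l2["estimate"]) / sqrt(l1["se"]^2 + l2["se"]^2))
p <- 2 * pnorm(-abs(z))                # two-sided p-value
c(z = z, p.value = p)
```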

Best,
David

Hi there, is there any way to compare 2 odds ratios? I
have two tests that are supposed to detect a disease
presence. So for each test, I can compute an odds
ratio. My problem is how can I compare the 2 tests by
testing whether the 2 odds ratios are the same?

-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Tel: +43-1-313 36 4393
Fax: +43-1-313 36 90 4393 
HP:  http://wi.wu-wien.ac.at/~meyer/



[R] Getting SVM minimized function value

2006-04-17 Thread David Meyer

Pau,

the objective value currently is not returned by libsvm; I will drop
Chih-Jen Lin, the author of libsvm, a note on that (it's actually not much
work).

However, the C++ code can create some debugging output which includes the
objective value, so at least you can get it on the screen. To do so, you
need to activate a "switch" in src/svm.cpp in the package sources. Almost at
the beginning of the file, you will find:


#if 0
void info(char *fmt,...)
{
    va_list ap;
    va_start(ap,fmt);
    vprintf(fmt,ap);
    va_end(ap);
}
void info_flush()
{
    fflush(stdout);
}
#else
void info(char *fmt,...) {}
void info_flush() {}
#endif

Just change the 

#if 0

to 

#if 1

and re-build + re-install the package.

HTH,
David



Hello, I have been searching for a way to get the resulting optimized
function value of a trained SVM model (svm from the package e1071) but
I have not succeeded.
Does anyone know a way to get that value?
Pau


-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Tel: +43-1-313 36 4393
Fax: +43-1-313 36 90 4393 
HP:  http://wi.wu-wien.ac.at/~meyer/



[R] Predict function for 'newdata' of different dimension in svm

2006-03-31 Thread David Meyer
Sandra,

hard to tell where the error message originates from without having the data
at hand (perhaps you could provide that to me off-list?), but I am almost
sure things will work when you train the model the "standard" way:

cd1.svm<-svm(Acode~EXT+TOF, data = boot.dist.dat, cost=100, gamma=20)

and then do the predictions.
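
The likely reason can be shown with lm() as a stand-in for svm() (an assumption made here so that the sketch runs with base R alone; the data are simulated). When the formula refers to columns as train$x, predict() keeps evaluating those expressions against the training data frame and cannot pick up the columns of 'newdata'.

```r
# Sketch using lm() as a stand-in for svm(); data are simulated.
set.seed(1)
train <- data.frame(x = rnorm(50))
train$y <- 2 * train$x + rnorm(50)
new <- data.frame(x = rnorm(175))

bad  <- lm(train$y ~ train$x)          # variables hard-wired to 'train'
good <- lm(y ~ x, data = train)        # variables looked up in 'data' / 'newdata'

length(suppressWarnings(predict(bad, newdata = new)))   # still 50: 'new' is ignored
length(predict(good, newdata = new))                    # 175
```

Without suppressWarnings(), the first predict() call warns that 'newdata' has 175 rows while the variables found have 50, which mirrors the length mismatch in the error message quoted below.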

Best,
David

-
I am using the "predict" function on a support vector machine (svm)
object, and I don't understand why I can't predict on a dataset with more
observations than the training dataset.

I think this problem is a generic "predict" problem, but I'm not sure.

The original svm was fit on 50 observations.

cd1.svm<-svm(boot.dist.dat$Acode~boot.dist.dat$EXT+boot.dist.dat$TOF,cost=100,gamma=20)

## for these training data,
> names(boot.dist.dat)
[1] "TOF"   "EXT"   "Acode"
> dim(boot.dist.dat)
[1] 50  3

Now I want to use the svm classifier on a new dataset with 175
observations:

new.dat<-data.frame(TOF=Cd1[cand.adult,]$TOF,EXT=Cd1[cand.adult,]$EXT,
Acode=rep(0,175),row.names=NULL)

## for the new dataset,
> names(new.dat)
[1] "TOF"   "EXT"   "Acode"
> dim(new.dat)
[1] 175   3

Now try to predict:

> predict(cd1.svm,newdata=new.dat)

Error in "names<-.default"(`*tmp*`, value = c("1", "2", "3", "4", "5",  :
'names' attribute [175] must be the same length as the vector [50]

What am I missing?  Why would the row names have to be the same?

Thanks so much,
Sandra McBride
-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Tel: +43-1-313 36 4393
Fax: +43-1-313 36 90 4393 
HP:  http://wi.wu-wien.ac.at/~meyer/



[R] Fixed legend in vcd/mosaicplot

2006-03-22 Thread David Meyer
Dieter,

there is no way of fixing the range of the residuals yet; we will add
something like a ylim argument to legend_foo().

Thanks for pointing this out,

David

PS: there is no mosaicplot() function in vcd, but a mosaic() ...


-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



Re: [R] getting probabilities from SVM

2006-02-17 Thread David Meyer
> Finally figured it out.  You have to extract it from the attributes.
> Tricky.  Thanks anyway.

> attr(pred, "prob")[1:10,]

Correct. 

Just for the record, the rationale behind this `tricky' design:

In addition to probabilities, predict.svm() (more precisely: libsvm) can also 
compute the decision values. Common ways to handle `polymorphic' prediction types 
are, e.g., using a `type' argument in the predict() function, or returning all 
variants in one list object. With a `type' argument, you need several calls to 
predict() if you need, say, hard predictions _and_ the probabilities. On the 
other hand, the probability and decision value features were added to libsvm 
only when svm() in e1071 had already been around for a while, so returning a 
list instead of a vector would have broken a lot of code. So I decided to keep 
the `standard' predict behavior and to `hide' special predictions in an 
attribute. If the latter had been available from the beginning, I probably 
would have used the `type' approach.
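
For illustration, a minimal sketch of this design on the iris data (the data
set and object names are assumptions, not taken from the thread). Both kinds
of `special' predictions come back as attributes of the ordinary prediction
vector; attr(pred, "prob") in the quoted message works because attr() matches
attribute names partially by default.

```r
library(e1071)

data(iris)

## Probability estimates must be requested when the model is trained ...
m <- svm(Species ~ ., data = iris, probability = TRUE)

## ... and again when predicting; decision values can be requested as well.
pred <- predict(m, iris, probability = TRUE, decision.values = TRUE)

head(pred)                           # hard class predictions (a factor)
head(attr(pred, "probabilities"))    # one column per class
head(attr(pred, "decision.values"))  # one column per pair of classes
```

A single call thus yields the hard predictions together with both extras,
without changing the type of the returned object.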

Cheers,
David


On 2/16/06, roger bos <[EMAIL PROTECTED]> wrote:
>
> I am using SVM to classify categorical data and I would like the
> probabilities instead of the classification.  ?predict.svm says that its
> only enabled when you train the model with it enabled, so I did that, but it
> didn't work.  I can't even get it to work with iris.  The help file shows
> that probability = TRUE when training the model, but doesn't show an
> example.  Then I try to predict with probabilities, I still only get
> classifications back.  Anyone get this to work and can help me out?

-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



[R] Help with plot.svm from e1071

2006-01-23 Thread David Meyer
Josh,

the problem here is that your code and mine refer to "x" and
non-standard evaluation happens in points(), looking up "x" in the
object supplied to "data". So your code will work when you are using,
e.g., "xx" instead of "x" in the data frame and the call to svm(). I
will fix this ASAP, thanks for pointing this out...
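
Until then, a sketch of the workaround, assuming the only change needed is
to avoid a column called "x" (the names xx and yy are arbitrary choices):

```r
library(mlbench)
library(e1071)

raw <- mlbench.spirals(200, 2)

## Same data as in the original post, but without a column named "x",
## so that nothing in the data frame shadows the model object inside
## the plotting code.
spiral <- data.frame(class = as.factor(raw$classes),
                     xx = raw$x[, 1],
                     yy = raw$x[, 2])

m <- svm(class ~ ., data = spiral)
plot(m, spiral)
```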

Cheers,

David.

--

Hi.

I'm trying to plot a pair of intertwined spirals and an svm that
separates them. I'm having some trouble. Here's what I tried.

> library(mlbench)
> library(e1071)
Loading required package: class
> raw <- mlbench.spirals(200,2)
> spiral <- data.frame(class=as.factor(raw$classes), x=raw$x[,1], y=raw$x[,2])
> m <- svm(class~., data=spiral)
> plot(m, spiral)
Error in -x$index : invalid argument to unary operator

So we delve into e1071:::plot.svm. When I run the code in plot.svm
everything is fine up until
 points(formula, data = data[-x$index, ], pch = dataSymbol,
 col = symbolPalette[colind[-x$index]])
That gives me the same error message, "Error in -x$index : invalid
argument to unary operator". The weird thing is that I can run either
of the those statements in isolation
data[-x$index, ]
symbolPalette[colind[-x$index]]
and neither gives me an error. I looked in the two points functions I
can see (points.default and points.formula) but neither calls x$index.

I was following along the documentation for plot.svm, which has a
simple example (that works)
## a simple example
library(MASS)
data(cats)
m <- svm(Sex~., data = cats)
plot(m, cats)

I don't see what the difference is between their example and mine.


-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



[R] e1071::SVM calculate distance to separating hyperplane

2006-01-05 Thread David Meyer

predict.svm() can give you the decision values which are the distances
you are looking for (up to a scaling constant).
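
A small sketch of how this could look for a two-class problem with a linear
kernel (data set and variable names are assumptions). For the linear kernel
the scaling constant is 1/||w||, where w can be recovered from the
coefficients and support vectors; note that everything lives in the scaled
feature space, since svm() scales the data by default.

```r
library(e1071)

## Two-class subset of iris
d <- subset(iris, Species != "setosa")
d$Species <- factor(d$Species)

m <- svm(Species ~ ., data = d, kernel = "linear")

## Decision values f(x) for every observation
dv <- attr(predict(m, d, decision.values = TRUE), "decision.values")

## Linear kernel only: weight vector w = sum_i coef_i * SV_i
w <- t(m$coefs) %*% m$SV

## Signed distance to the hyperplane = f(x) / ||w||
dist <- dv / sqrt(sum(w^2))
head(dist)
```

For non-linear kernels, w is not available explicitly, so the decision
values themselves are usually what one works with.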

Regards,
David

>Hi,
>I know this question has been posed before, but I didn't find the answer in
>the R-help archive, so please accept my sincere apologies for being
>repetitive:
>How can one (elegantly) calculate the distance between data points (in the
>transformed space, I suppose) and the hyperplane that separates the 2
>categories when using svm() from the e1071 library?

>thanks a lot,
>Hans

-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



Re: [R] str and structable error

2005-11-28 Thread David Meyer
> > 
> > I encountered a behaviour which puzzles me (but 
> > finally I did get what I wanted).
> > 
> > I used structable and strucplot but I wanted to change 
> > names of variables in structable object. I tried to subset 
> > it, use names but to no avail. So I tried str and 
> > expected to get a structure of an object but:
> > 
> > 
> >>sss<-structable(Titanic)
> >>str(sss)
> > 
> > Error in "[.structable"(x, args[[1]], ) : subscript out of 
> > bounds
> 
> Looks like package vcd needs a separate structable method for the str() 
> generic.

yes! Thanks for pointing this out. It's because "[.structable" has a
non-standard behavior. Using:

"[.structable" = function(object, ...) NextMethod()

at the command line, str() would work as expected.

David

> 
> Uwe Ligges
> 
> 
> 
> > Finally I learned, that I need to change attributes of 
> > structable object.
> > 
> > Is this error message OK and I did not read 
> > documentation properly? Or is it normal that str gives 
> > an error on some objects but I just was not so lucky to 
> > meet one?.
> > 
> > W2000, R2.2.0, vcd package Built: R 2.2.0; ; 2005-11-22 14:23:44; windows, 
> > 
> > Best regards.
> > 
> > Petr
> > 
> > Petr Pikal
> > [EMAIL PROTECTED]
> > 
> 
> 


-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



[R] [R-pkgs] vcd package 0.9-5 released

2005-10-20 Thread David Meyer

Dear useRs,

a new version of the vcd package (0.9-5) is now available from CRAN.

Apart from (a lot of) bug fixes, it includes the following new features
(some of them have 'silently' been included in previous bug fix
releases):

* Improved documentation:

  - an introductory vignette on the strucplot framework (including
mosaic, association and sieve plots)
  - special vignettes on using/extending shading and labeling functions

* New function spine() for spinograms and spine plots

* New function cd_plot() for conditional density plots 

* New function cotabplot() for visualizing conditional independence in a
  trellis-like layout, providing panel functions for association,
  mosaic, and sieve plots

* Sieve plots are now integrated in the strucplot framework, sieve()
  replaces sieveplot()

* Extended support for 'structable' objects (textual representation of
  mosaic plots):

  - structable objects can be used as input for mosaic(), sieve(), and
assoc()
  - extract ("[") and replacement ("[<-") functions are available (old
"[[" method removed)
  - methods for t(), dim(), as.matrix(), as.vector(), as.table(), etc.
are available

* New panel function pairs_diagonal_text() for pairs()

* The alternative legend function legend_fixed() now looks more similar
  to the legend of mosaicplot() in base R
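
To give an impression of the new functions, a short sketch (the choice of
the Titanic and Arthritis data sets is ours for illustration; Arthritis
ships with vcd):

```r
library(vcd)

## 'structable' objects: flat representation of a contingency table,
## usable directly as input for the strucplot functions
st <- structable(Titanic)
st
dim(st)
mosaic(st)

## Spinogram and conditional density plot of a factor against a
## numeric variable
data("Arthritis")
spine(Improved ~ Age, data = Arthritis)
cd_plot(Improved ~ Age, data = Arthritis)
```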

Comments are more than welcome!

David, Achim, Kurt.

PS: If you like modern art, try out demo(mondrian)! :)


-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



[R] how to use tune.knn() for dataset with missing values

2005-10-06 Thread David Meyer

Well, since knn() can't handle incomplete data as it says, you can
choose to either omit incomplete observations (e.g., using na.omit()),
or to impute the data if the conditions are met (missingness at random,
...); see, e.g.,  packages cat, mix, norm, and e1071 for that.
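
A sketch of the first option, assuming the BreastCancer data from the
mlbench package is what is being used (the original script does not show
where `train' comes from). Incomplete rows are dropped and the factor
predictors are converted to their integer codes, since knn() needs numeric
input:

```r
library(e1071)
library(mlbench)

data(BreastCancer)

## Drop the Id column, then every observation with a missing value
bc <- na.omit(BreastCancer[, -1])

## knn() needs numeric predictors: use the integer codes of the factors
x <- data.matrix(bc[, names(bc) != "Class"])
y <- bc$Class

tuned <- tune.knn(x, y, k = 1:25,
                  tunecontrol = tune.control(sampling = "cross"))
summary(tuned)
```

Whether dropping rows is acceptable depends on how much data is lost and on
why the values are missing; otherwise imputation is the way to go.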

HTH,
David



Hi Everybody,

I again have a problem using tune.knn(): it gives an error saying that
missing values are not allowed. Here is the script for the BreastCancer
data:

library(e1071)
library(mda)

trdata<-data.frame(train,row.names=NULL)
attach(trdata)

xtr <- subset(trdata, select = -Class)
ytr <- Class

bestpara <- tune.knn(xtr, ytr, k = 1:25,
                     tunecontrol = tune.control(sampling = "cross"))

and here i got the mentioned error.

can anybody help me in this regard...

Thanks & Regards,

Uttam Phulwale
Tata Consultancy Services Limited
Mailto: [EMAIL PROTECTED]
Website: http://www.tcs.com



Re: [R] How to insert a certain model in SVM regarding to fixed kernels

2005-08-12 Thread David Meyer
> David, Please correct me if I am wrong but I think svm partially works
> with dyn although I don't remember what the specific limitations were.

Yes, the fitted values / residuals can be extracted from the trained
model. The 'newdata' argument of predict() is not functional yet for
time series.

Cheers,
David

> It's possible that what works already is enough for Amir. For example,
> 
> library(e1071)
> library(dyn)
> set.seed(1)
> y <- ts(rnorm(100))
> y.svm <- dyn$svm(y ~ lag(y))
> yp <- predict(y.svm)
> ts.plot(y, yp, col = 1:2)
> 
> On 8/12/05, David Meyer <[EMAIL PROTECTED]> wrote:
> > Amir,
> > 
> > >
> > > Suppose that we want to regress for example a certain
> > > autoregressive model using SVM. We have our data and also some
> > > fixed kernels in libSVM behind e1071 in front. The question:
> > > Where can we insert our certain autoregressive model ? During
> > > creating data frame ?
> > 
> > Yes, I think.
> > 
> > > Or perhaps we can make a
> > > relationship between our variables ended to desired autoregressive
> > > model ?
> > 
> > Gabor Grothendieck's `dyn` package provides support for the use of
> > general regression functions for time series analysis, and we are
> > currently struggling to integrate the e1071 interface into that
> > framework (but nothing is ready so far). Is it that kind of support
> > you have been looking for?
> > 
> > Cheers,
> > David
> > 
> > >
> > > Thanks a lot for your help.
> > > Amir Safari
> > >
> > >
> > >
> > >
> > 
> > 
> > --
> > Dr. David Meyer
> > Department of Information Systems and Operations
> > 
> > Vienna University of Economics and Business Administration
> > Augasse 2-6, A-1090 Wien, Austria, Europe
> > Fax: +43-1-313 36x746
> > Tel: +43-1-313 36x4393
> > HP:  http://wi.wu-wien.ac.at/~meyer/
> > 
> >
> 
> 


-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



Re: [R] How to insert a certain model in SVM regarding to fixed kernels

2005-08-12 Thread David Meyer
Amir,

>  
> Suppose that we want to regress for example a certain autoregressive
> model using SVM. We have our data and also some fixed kernels in
> libSVM behind e1071 in front. The question: Where can we insert our
> certain autoregressive model ? During creating data frame ? 

Yes, I think.

> Or perhaps we can make a 
> relationship between our variables ended to desired autoregressive
> model ?

Gabor Grothendieck's `dyn` package provides support for the use of
general regression functions for time series analysis, and we are
currently struggling to integrate the e1071 interface into that
framework (but nothing is ready so far). Is it that kind of support you
have been looking for?

Cheers,
David

>  
> Thanks a lot for your help.
> Amir Safari
>  
>  
> 
> 


-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



Re: [R] setting weights for such a two-class problem in nnet and svm

2005-07-24 Thread David Meyer

Dear Baoqiang,

there is an example on the svm Help page on the use of 'class.weights'. 
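
Along the lines of that example, a sketch for a 98%/2% problem. The
simulated data and the class names "common" and "rare" are made up for
illustration; class.weights takes a named vector with one weight per class
level:

```r
library(e1071)

set.seed(1)

## Simulated, heavily unbalanced two-class data (98% vs. 2%)
n_common <- 490
n_rare   <- 10
x <- rbind(matrix(rnorm(2 * n_common), ncol = 2),
           matrix(rnorm(2 * n_rare, mean = 2), ncol = 2))
y <- factor(rep(c("common", "rare"), c(n_common, n_rare)))

## Errors on the rare class are penalized 49 times as heavily
m <- svm(x, y, class.weights = c(common = 1, rare = 49))

table(predicted = fitted(m), actual = y)
```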

HTH
David




I have such a two-class problem, one class is very large(~98% of total),
and the other is just 2%. According to manual of nnet, I need setup
"weights", so I intend to set 1 for class one, 49 for class 2. How do I
do that? Just weights=49? 
Meanwhile I'd like to try svm(e1071), again, how do I setup
"class.weights"? Thanks.

-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



[R] [R-pkgs] New version of "vcd" package

2005-07-05 Thread David Meyer
Dear useRs,

a completely revised version of the `vcd' ("Visualizing Categorical
Data") package is now available from CRAN. This major revision includes
the following enhancements:

* grid-based:

The package is now entirely based on `grid', the new R graphics system,
thus exploiting its unique functionalities. Powered by grid, it is now
possible, e.g., to simply compose complex plots of available components,
or to modify plot elements after they have been drawn.

* new flexible framework for mosaic and association plots

Extended mosaic and association plots are now integrated in a completely
new generic framework for the visualization of contingency tables
(so-called `strucplots'). The new design modularizes labeling, shading,
spacing, and drawing of legends, and also the cells' content by the use
of panel functions.

Powerful labeling functions offer much more flexibility for adding
labels (e.g., no restrictions on the number of dimensions, flexible
positioning of labels, cell labeling, etc.)

The framework in particular includes many predefined functions for the
creation of residual-based shadings.

Convenience interfaces for various `flavors' of mosaic displays are
available, e.g., doubledecker plots, or visualizations of "loglm"
objects with residual-based shading.

* misc:

In addition, the package features several new data sets, and an
inference function for (conditional) independence of margins in a
contingency table.
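
A few calls to get started (a sketch only; the data sets are the standard
HairEyeColor and Titanic tables shipped with R, chosen here for
illustration):

```r
library(vcd)

## Mosaic and association plots with residual-based shading and legend
mosaic(HairEyeColor, shade = TRUE, legend = TRUE)
assoc(HairEyeColor, shade = TRUE)

## Doubledecker plot with 'Survived' as the dependent variable
doubledecker(Survived ~ ., data = Titanic)
```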


Happy drawing!

David, Achim, Kurt

-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



Re: [R] Running SVM {e1071}

2005-06-29 Thread David Meyer

>  
> Dear David, Dear Friends,
>  
> After any running svm I receive different results of Error estimation
> of 'svm' using 10-fold cross validation. 

using tune.svm(), or the `cross' parameter of svm()?

> What is the reason ? It is caused by the algorithm, libsvm , e1071 or
> something else? 

The splits are chosen randomly.

> Which value can be optimal one? 

The Bayes Error.

> How much run can reach to the optimality.

What do you mean by `How much run'?

> And finally, what is difference between Error estimation of svm using
> 10-fold cross validation and MSE ( Mean Square Error ) ?

the former is an error estimation _procedure_, the latter is an error
_measure_.
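
To see the effect of the random splits, here is a sketch using tune.svm(),
where the splits are drawn in R and can therefore be fixed with set.seed().
The helper run_cv() and the parameter values are made up for illustration:

```r
library(e1071)

## 10-fold cross-validation of one fixed parameter setting,
## with the random splits controlled by the seed
run_cv <- function(seed) {
  set.seed(seed)
  tune.svm(Species ~ ., data = iris, gamma = 0.25, cost = 1,
           tunecontrol = tune.control(sampling = "cross", cross = 10))
}

a <- run_cv(1)
b <- run_cv(1)   # same seed: same splits, same estimate
d <- run_cv(2)   # different seed: different splits

c(a$best.performance, b$best.performance, d$best.performance)
```

The number reported is the error _measure_ averaged over the folds
(misclassification rate for classification, MSE for regression, by
default); the cross-validation itself is only the _procedure_ for
estimating it.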

Cheers,
David



Re: [R] svm and scaling input

2005-06-29 Thread David Meyer

[EMAIL PROTECTED] wrote:
> Dear All,
> 
> I've a question about scaling the input variables for an analysis with
> svm (package e1071). Most of my variables are factors with 4 to 6
> levels but there are also some numeric variables.
> 
> I'm not familiar with the math behind svms, so my assumptions may be
> completely wrong ... or obvious. Will the svm automatically expand the
> factors into a binary matrix? 

yes.

> If I add numeric variables outside the range of 0 to 1 do I have to
> scale them to have 0 to 1 range?

svm() will scale your data by default.
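
A quick way to check both points (a sketch on iris; the model.matrix() call
only illustrates roughly what the expansion of a factor into binary columns
looks like, it is not the exact internal call):

```r
library(e1071)

## scale = TRUE is the default: the centers and scales used are stored
m <- svm(Species ~ ., data = iris)
m$x.scale

## Scaling can be switched off
m_raw <- svm(Species ~ ., data = iris, scale = FALSE)
m_raw$x.scale

## A factor expanded into a binary indicator matrix
head(model.matrix(~ Species - 1, data = iris))
```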

Cheers, 
David

-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



[R] SVM parameters...

2005-06-27 Thread David Meyer

Vivek,

I certainly would agree that every help page, including the one of
svm(), could be improved, but I think it is not _that_ deficient. In
particular, it tells you which parameters are used in the various
kernels available.

Have you read the corresponding article in R News (basically contained
as a vignette in the package)?

In addition, you could have a look at the documentation of libsvm, the
library that is interfaced by the svm()-function in e1071.

Best,
David



hi,

i am really sorry to ask this on the list, but i haven't been able to
find anything on this topic.

i would like to know how the various parameters in the svm function
call in library e1071 work. all the literature that i was able to find
on the internet have been on the mathematics and derivation of
equations of the SVM or some very specific examples that relate to
biostatistics. i have been unable to find a concise description of how
these parameters affect the model.
so far i have been using a brute force method, running all possible
permutations and combinations , but this is taking enormous amounts of
time.

i would be grateful for any help that you could provide. 

thanks and regards ,
vivek.



Re: [R] best.svm

2005-05-24 Thread David Meyer
Stephen:

you need to supply the parameter ranges, your call did not tune anything
at all.
best.svm() is really just a wrapper for tune.svm(...)$best.model. The
help page for 'tune()' will tell you more on the available options.
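
For example (a sketch on iris; the parameter grids are arbitrary and would
need adapting to your data):

```r
library(e1071)

set.seed(1)

## Supplying ranges for gamma and cost makes best.svm() actually search
best <- best.svm(Species ~ ., data = iris,
                 gamma = 2^(-3:1), cost = 2^(0:4))

## The same search via tune.svm(), keeping the full tuning results
set.seed(1)
tuned <- tune.svm(Species ~ ., data = iris,
                  gamma = 2^(-3:1), cost = 2^(0:4))
tuned$best.parameters
```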

HTH,

David


[...]

> svm.model = best.svm(data[1:3000,1:23],data[1:3000,24],tunecontrol =
> tune.control())

[...]

> It didn't produce really good results.
 
> Will best.svm get me the best svm?  Have I given it the wrong
> parameters?



[R] tune.svm in {e1071}

2005-05-20 Thread David Meyer
Amir,

>Dear All ,
>1- I'm trying to access the values of fitted(model) after model <-
>tune.svm( ) but seemingly it is not possible. How can I access the
>fitted values? However, it is possible only after model <- svm( )

tune.svm() is a wrapper to tune() and as such returns a tune-object.
That one _includes_ a "best.model" component containing the "svm"
object. So you want something like:

tuneobj <- tune.svm(...)
model <- tuneobj$best.model
summary(model)

etc.
 
>2- How can I access the other values such as the number of Support
>Vectors, gamma, cost, nu, epsilon, after model <- tune.svm( )?
>Are these not possible? I receive only "Error estimation of 'svm'"
>with the model and summary(model) functions.

Clear from the above, I think.
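
Spelled out as a sketch on iris (data set and parameter grids are
assumptions for illustration):

```r
library(e1071)

set.seed(1)
tuneobj <- tune.svm(Species ~ ., data = iris,
                    gamma = 2^(-2:0), cost = 2^(0:2))

tuneobj$best.parameters      # winning gamma / cost combination

model <- tuneobj$best.model  # an ordinary "svm" object
model$tot.nSV                # total number of support vectors
model$gamma
model$cost
head(fitted(model))          # fitted values on the training data
```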

HTH,
David

>Best Wishes and so many thanks,
>Amir

-- 
Dr. David Meyer
Department of Information Systems and Process Management

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



[R] SVM linear kernel and SV

2005-05-13 Thread David Meyer
> Thank you for your answer,
> but my problem concerns the support vectors. Indeed the two classes
> are well separated and the hyperplane is linear but the support
> vectors aren't aligned in parallel to the hyperplane. And according to
> me,  the support vectors (for each class) should be aligned along the
> linear hyperplane and form the margin (by definition). But it's not the
> case. In fact,  I'd like to understand why they are not aligned. 

Remember the `cost'-penalty controlling for overlapping classes. It has
some effect even in the linearly separable case causing more SVs than
would actually be needed. Try adding e.g. `cost=1000' and you will
obtain a result with only 2 SVs (why not 3? Because 2 SVs here solve the
optimization problem. So in fact the hyperplane in this case is not
uniquely defined, although only in a small range.)
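
The effect is easy to reproduce on any linearly separable data; a sketch
using two iris species and the petal measurements (our choice, not the
data from this thread):

```r
library(e1071)

## setosa vs. versicolor are linearly separable in the petal measurements
d <- subset(iris, Species != "virginica",
            select = c(Petal.Length, Petal.Width, Species))
d$Species <- factor(d$Species)

m_soft <- svm(Species ~ ., data = d, kernel = "linear", cost = 1)
m_hard <- svm(Species ~ ., data = d, kernel = "linear", cost = 1000)

## Number of support vectors under the two penalties
c(cost_1 = m_soft$tot.nSV, cost_1000 = m_hard$tot.nSV)
```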

Best,

David



[R] SVM linear kernel and SV

2005-05-12 Thread David Meyer
Gladys,

> I've used svm() with a linear kernel and I'd like to plot the linear
> hyperplane and the support vectors. I use plot.svm() and, according to
> me, I would have found aligned support vectors (because the hyperplane
> is linear) for each class but it wasn't the case. Could you explain to me
> why?

In how far does the plot give you the impression it wouldn't? The two
classes look pretty separated to me.

> In addition, when I change the option 'scale' (from TRUE to FALSE) the
> results change. 

(Which results?) The plot is, of course, slightly different since the
model is based on different data, but the class predictions (on the
training data) are the same. Why does this surprise you?

> Could you explain to me why? Does the option 'scale' of svm() 
> act on the dataset or on the weight vector w and threshold b?

On the data set, and therefore also on w and b.

Best,
David


-- 
Dr. David Meyer
Department of Information Systems and Process Management

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/




Re: [R] Error using e1071 svm: NA/NaN/Inf in foreign function call

2005-04-28 Thread David Meyer
Joao:

1) The error message you get when setting nu=0 is due to the fact that
no support vectors can be found with that extreme restriction, and this
confuses the predict function (try svm(, fitted = FALSE): the model
returned is empty). In fact, the C++ code interfaced by svm() clearly
allows nu = 0 and nu = 1, although these aren't sensible values. I will
add a check to the R code and drop Chih-Jen Lin, the author of the C++
code, a message -- thanks for pointing this out.

2) The libsvm code is not optimized for polynomial kernels and is known
to perform quite badly in that case (in contrast to the RBF kernel for
which it is very fast). Do you think you need the whole data set for
tuning the parameters?

Best,

David



[R] read.matrix.csr bug (e1071)?

2005-01-29 Thread David Meyer
This is a bug, thanks for pointing this out.

Fixed for the next release of e1071.

David

-


Hello,

I would like to read and write sparse matrices using the
functions write.matrix.csr() and read.matrix.csr()
of the package e1071. Writing is OK but reading back the
matrix fails:

x <- rnorm(100)
m <- matrix(x, 10)
m[m < 0.5] <- 0
m.csr <- as.matrix.csr(m)
write.matrix.csr(m.csr, "sparse.dat")
read.matrix.csr("sparse.dat")

Error in initialize(value, ...) : Can't use object of class "integer"
in new():  Class "matrix.csr" does not extend that class

Is something wrong with the code above or it must be
considered as a bug?

Best regards,

Peter



Re: [R] probabilty calculation in SVM

2005-01-16 Thread David Meyer
Raj:

The references given on the help page will tell you.

Best,
David

-

Hi All,

In package e1071 for SVM based classification, one can get a probability
measure for each prediction. I would like to know which method is used
for calculating this probability. Is it calculated using a logistic link
function?
Thanks for your help.

Regards,

Raj



[R] Rgui.exe - Error while tuning svm

2004-12-23 Thread David Meyer
> If I try to tune my svm with the code:

> Tune <- tune.svm(Data.Train, Class.Train, type="C-classification",
> kernel="radial", gamma = 2^(-1:1), cost = 2^(2:4))

> i get a windows Messagebox with a error in the application "Rgui.exe"
> and the message: "The instruction at 0x6c48174d references memory at
> 0x. The "read" operation could not be performed on the memory."

Which version of e1071 are you using?
There has been a memory leak problem until 1.5-1 which could very well
cause this null pointer exception...

best,
David

-- 
Dr. David Meyer
Department of Information Systems

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



Re: [R] How to interpret and modify "plot.svm"?

2004-12-23 Thread David Meyer

> I updated the e1071 package but still can't find
> the other three arguments for plot.svm. 

It's in e1071 since version 1.5-3. (current version: 1.5-4).

> In addition, I can plot a
> gray-colored contour region by adding the argument "col = c(gray(0.2),
> gray(0.8))". But I failed to change those colored "x" or "o" points
> into the shapes I want. Basically, I don't want to have any color in
> the plot. Could you give me a hint how to do that?

Look at the example on the help page for plot.svm().
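
Adapted to a plot without color, this could look roughly as follows (a
sketch based on the cats example from the help page; the particular symbols
and gray levels are arbitrary choices):

```r
library(e1071)
library(MASS)

data(cats)
m <- svm(Sex ~ ., data = cats)

## svSymbol / dataSymbol choose the plotting symbols for support vectors
## and the remaining points, symbolPalette their colors; 'col' is passed
## on and sets the fill colors of the class regions.
plot(m, cats,
     svSymbol = 4, dataSymbol = 1,
     symbolPalette = c("black", "black"),
     col = gray(c(0.8, 0.5)))
```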

Best,
David.



Re: [R] erro in SVM (packsge "e1071")

2004-12-20 Thread David Meyer
So the error occurs during a call to model.matrix() from svm() because
of the polynomial contrasts--do you get the same error using, e.g.,
lm()?
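
The message can be reproduced without svm() at all. The sketch below uses a
made-up ordered factor with 237 levels (matching the 236 degrees of freedom
in the report) and shows two possible ways around it, neither of which has
been tried on the original data:

```r
## An ordered factor with 237 levels, e.g. something like day of the year
day <- factor(1:237, ordered = TRUE)

## Polynomial contrasts are the default for ordered factors and are
## refused for that many levels
res <- try(contr.poly(nlevels(day)), silent = TRUE)
inherits(res, "try-error")
cat(res)

## Possible workarounds: treat the variable as numeric ...
day_num <- as.integer(day)

## ... or as an unordered factor (treatment contrasts)
day_unordered <- factor(day, ordered = FALSE)
is.ordered(day_unordered)
```

Which of the two is appropriate depends on whether the ordering carries
information you want the model to use.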

best,
David

> The way I call SVM is:
> 
> i <- (-2)
> j <- 4
> learner='svm'
> learner.pars=list(Duracao ~ ., data=orig.data,
>scale=c(FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE,
> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
> FALSE, FALSE, FALSE),
>type='nu-regression', kernel='linear',
>cost=2^(2*i), nu=j/10)
> learner.pars$data <- orig.data[begin.test.pos:(test.pos-1),]
> 
> model <- do.call(learner,learner.pars)
> 
> The variables begin.test.pos and test.pos are indexes for orig.data
> and are working well. in this case begin.test.pos = 1 and test.pos =
> 875.
> 
> orig.data is a data.frame where the second, third and seventh
> parameters are numeric. The first parameter  is a date and all the
> others are factors (some of them ordered). The ordered factors are:
> Dia Semana (week day), DiaAno (day of the year), DiaMes (day of the
> month), SemanaAno (week of the year) and SemanaMes (week of the
> month). The first two lines of the orig.data data.frame are:
>  Data   InicioViagem Duracao DiaSemana  TipoDia
>  EpocaEscolar 
> DiasDesdeUltPagamento DiaAno DiaMes FluxoEntrada FluxoSaida
> 13 2004-01-01250563220quinta-feira   feriado 
> normal 91 1 
> normal   fsp4
> 9  2004-01-01285542866 quinta-feira   feriado 
> normal 91 1 
> normal   fsp4
> Modelo  Motorista SemanaAno SemanaMes
> Servico
> 13Mercedes_O530_N 10701 1 1 
> 10597
> 9  Mercedes_O530_N 11292 1 1 
> 10597
> 
> I am using sliding window with 30 days (around 900 records) for
> training. The error is in the svm function. May be because SVM uses
> other functions, but it happens when I run svm.
> 
> Thanks a lot for the help
> 
> Joao
> ___
> FEUP - Engineering Faculty, Porto University
> Engineering and Industrial Management group
> Tel.: +351 22 508 1639
> Fax: +351 22 508 1538
> 


-- 
Dr. David Meyer
Department of Information Systems

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] erro in SVM (packsge "e1071")

2004-12-19 Thread David Meyer
Joao:

The reported error message is not from e1071.
How *exactly* did you call svm()?

As to the documentation of the nu parameter: yes, this is an omission,
of course, nu is used in nu-regression as well; thanks for pointing this
out.

best,
David

-

Hello,

I am using SVM under e1071 package for nu-regression with 18 parameters.
The variables are ordered factors, factors, date or numeric datatypes. I
use the linear kernel.
It gives the following error that I cannot solve. I tried debug, browser
and all that stuff, but no way.
The error is:

Error in get(ctr, mode = "function", envir = parent.frame())(levels(x), :
    Orthogonal polynomials cannot be represented accurately enough
    for 236 degrees of freedom

I use the nu parameter. However, reading ?svm help it says "parameter
needed for 'nu-classification' and 'one-classification'". It does not
say anything about nu-regression. Is it an omission in the ?svm help
page? Or am I not understanding something?

I believe it has something to do with the calculation of the
eigenvalues. Anyway, how can I overcome this problem? Increasing the
training data (it is around 900 records)?

Thanks for any help

Joao




-- 
Dr. David Meyer
Department of Information Systems

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



[R] How to interpret and modify "plot.svm"?

2004-12-17 Thread David Meyer
Frank:


> Dear R people,

> I am trying to plot the results from running svm in library(e1071). I
> use plot.svm. After searching through the help archives and FAQ, I
> still have several questions:

> 1.  In default, crosses indicate support vectors. But why are there
> two colors of crosses? What do they represent?

The colors represent the classes of the data points. The help page
admittedly doesn't tell you this and deserves improvement.

> 2. I want to draw a white-gray colored plot and modify the different
> colored crosses or circles by different shaped points. Could anyone
> give me a hint?

I just added three arguments to plot.svm() that allow customizing of the
plot symbols. The contour region is controlled by the parameters of the
filled.contour() function used in plot.svm(), so you will need to add
the color.palette argument to plot.svm (which subsequently will be
passed to filled.contour()).

> 3. Is it possible for me to draw a "hyperplane" on the plot?

You can add arbitrary objects to the plot (try lines()); but plot.svm()
doesn't compute the boundaries.

> 4. What is the algorithm to plot the contour region?

see filled.contour(). The input is determined by a grid of predicted
values.

Best,
-d
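
To illustrate the last point: the region plot boils down to predicting on a regular grid and handing the resulting matrix to filled.contour(). A rough sketch in base R, with a glm() standing in for the SVM so that it runs without e1071 (the stand-in model and all variable names are assumptions for illustration, not what plot.svm() does internally):

```r
# Two-class subset of iris, two predictors only
d <- droplevels(iris[iris$Species != "setosa",
                     c("Petal.Length", "Petal.Width", "Species")])

# Stand-in classifier (any model with a predict() method would do)
fit <- glm(Species ~ Petal.Length + Petal.Width, data = d, family = binomial)

# Regular grid over the range of both predictors
xs <- seq(min(d$Petal.Length), max(d$Petal.Length), length.out = 50)
ys <- seq(min(d$Petal.Width),  max(d$Petal.Width),  length.out = 50)
grid <- expand.grid(Petal.Length = xs, Petal.Width = ys)

# Predicted class (0/1) for every grid point, arranged as a matrix
# with rows indexed by xs and columns by ys
z <- matrix(as.numeric(predict(fit, newdata = grid, type = "response") > 0.5),
            nrow = length(xs))
print(dim(z))
```

Something like filled.contour(xs, ys, z, levels = c(-0.5, 0.5, 1.5), col = c("white", "gray80")) should then give a white-gray region plot, onto which points or lines can be added.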


> Thank you very much,

> Frank

-- 
Dr. David Meyer
Department of Information Systems

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



[R] Problem with SVM and scaling

2004-12-17 Thread David Meyer

Ton:

Does preprocessing (scaling, removing constant variables, etc.) "by
hand" of the whole data set *before* splitting resolve things?

You will need the same variable structure in the training and the test
set anyway; scaling is just the first code part that fails on your
data...

g,
-d
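
A sketch of what preprocessing "by hand" could look like in base R: drop the constant columns and scale once on the full data, then split, so that both parts share the same columns. The toy data and the 14/6 split are made up for illustration.

```r
set.seed(1)
# Toy descriptor matrix: columns b and d have zero variance
x <- data.frame(a = rnorm(20), b = rep(1, 20), c = runif(20), d = rep(0, 20))

# Identify and drop constant columns on the WHOLE data set
keep <- vapply(x, function(col) var(col) > 0, logical(1))
x_clean <- scale(x[, keep, drop = FALSE])

# Split afterwards: training and test set now have identical structure
train_idx <- 1:14
train <- x_clean[train_idx, , drop = FALSE]
test  <- x_clean[-train_idx, , drop = FALSE]
print(colnames(train))
print(colnames(test))
```

Since the data are already scaled here, one would probably call svm() with scale = FALSE afterwards.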

-

Hi all -
I am running into a problem with the SVM() method when applying it to
data sets that have descriptors with zero variance. Here is the sequence
of events:

1. I split my data set with 512 descriptors in a training and test set
2. I build an SVM model for the training set. Out of 512 descriptors,
500 have zero variance which I discard before calling the SVM method
3. For the test set, 8 descriptors have zero variance, which I discard
too
4. predict.svm() then fails, because it tries to scale using two vectors
of different size (500 and 504)

Is there a way to get around this?

-- 
Dr. David Meyer
Department of Information Systems

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



[R] svm- class.weights

2004-12-09 Thread David Meyer

Uwe:

[the language of the list is English!]

Try using a *factor* for classification. The described behavior
(segfault when using class.weights with a *numeric* dependent variable)
should be fixed in the current version of e1071 (1.5-2), so please check
if you are using the latest version of e1071.

Best,
-d



[R] tuning SVM's

2004-12-01 Thread David Meyer
Stephen:

Your calls to best.svm() do not tune anything unless you specify the
parameter ranges (see the examples on the help page). Your calls are
just using the defaults which are very unlikely to yield models with
good performance.

[I think some day, I will have to remove the defaults in svm()...]

Another point: why aren't you using classification machines (which is
done automatically by providing a factor as dependent variable)?

There is classAgreement() in e1071, too, you might want to look at.

Cheers,
David
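
As for kappa: it can be recomputed directly from the confusion tables quoted below. A small base R sketch (the helper name kappa_stat is made up; classAgreement() in e1071 should report the same statistic in its kappa component):

```r
# Cohen's kappa from a square confusion matrix (rows: truth, cols: prediction)
kappa_stat <- function(tab) {
  n  <- sum(tab)
  po <- sum(diag(tab)) / n                       # observed agreement
  pe <- sum(rowSums(tab) * colSums(tab)) / n^2   # agreement expected by chance
  (po - pe) / (1 - pe)
}

lab <- list(truth = c("0", "1"), pred = c("0", "1"))
# Tables from the polynomial and the linear kernel runs (filled column-wise)
poly_tab   <- matrix(c(30, 70, 8, 63), nrow = 2, dimnames = lab)
linear_tab <- matrix(c(6, 4, 32, 129), nrow = 2, dimnames = lab)

k_poly   <- kappa_stat(poly_tab)
k_linear <- kappa_stat(linear_tab)
print(round(c(polynomial = k_poly, linear = k_linear), 3))
```

Both values come out at roughly 0.17, in line with the 15 - 17% mentioned, even though the raw accuracies of the two tables differ a lot.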






Hi  
 
I am doing this  sort of thing:
 
POLY:
 
> obj = best.tune(svm, similarity ~., data = training, kernel =
"polynomial")
> summary(obj)
 
Call:
 best.tune(svm, similarity ~ ., data = training, kernel = "polynomial") 
 
Parameters:
   SVM-Type:  eps-regression 
 SVM-Kernel:  polynomial 
   cost:  1 
 degree:  3 
  gamma:  0.04545455 
 coef.0:  0 
epsilon:  0.1 
 
 
Number of Support Vectors:  754
 
> svm.model <- svm(similarity ~., data = training, kernel  =
"polynomial", cost = 1, degree = 3, gamma = 0.04545455, coef.0 = 0,
epsilon = 0.1)
> pred=predict(svm.model, testing)
> pred[pred > .5] = 1
> pred[pred <= .5] = 0  
> table(testing$similarity, pred)
   pred
0  1 
  0 30  8
  1 70 63
 
LINEAR:
 
> obj = best.tune(svm, similarity ~., data = training, kernel =
"linear")
> summary(obj)
 
 
Call:
 best.tune(svm, similarity ~ ., data = training, kernel = "linear") 
 
Parameters:
   SVM-Type:  eps-regression 
 SVM-Kernel:  linear 
   cost:  1 
  gamma:  0.04545455 
epsilon:  0.1 
 
 
Number of Support Vectors:  697
 
> svm.model <- svm(similarity ~., data = training, kernel  = "linear",
cost = 1, gamma = 0.04545455, epsilon = 0.1)
> pred=predict(svm.model, testing)
> pred[pred > .5] = 1
> pred[pred <= .5] = 0  
> table(testing$similarity, pred)
   pred
0   1  
  0   6  32
  1   4 129
 
 
RADIAL:
 
> obj = best.tune(svm, similarity ~., data = training, kernel =
"radial")
> summary(obj)
 
Call:
 best.tune(svm, similarity ~ ., data = training, kernel = "linear") 
 
Parameters:
   SVM-Type:  eps-regression 
 SVM-Kernel:  linear 
   cost:  1 
  gamma:  0.04545455 
epsilon:  0.1 
 
 
Number of Support Vectors:  697
 
> svm.model <- svm(similarity ~., data = training, kernel  = "radial",
cost = 1, gamma = 0.04545455, epsilon = 0.1)
> pred=predict(svm.model, testing)
> pred[pred > .5] = 1
> pred[pred <= .5] = 0  
> table(testing$similarity, pred)
   pred
0  1 
  0 27 11
  1 64 69
 
 
SIGMOID:
 
> obj = best.tune(svm, similarity ~., data = training, kernel =
"sigmoid")
> summary(obj)
 
Call:
 best.tune(svm, similarity ~ ., data = training, kernel = "sigmoid") 
 
Parameters:
   SVM-Type:  eps-regression 
 SVM-Kernel:  sigmoid 
   cost:  1 
  gamma:  0.04545455 
 coef.0:  0 
epsilon:  0.1 
 
 
Number of Support Vectors:  986
 
> svm.model <- svm(similarity ~., data = training, kernel  = "sigmoid",
cost = 1, gamma = 0.04545455, coef.0 = 0, epsilon = 0.1)
> pred=predict(svm.model, testing)
> pred[pred > .5] = 1
> pred[pred <= .5] = 0  
> table(testing$similarity, pred)
   pred
0   1  
  0   8  30
  1  26 107
>
 
and then taking out the kappa statistic to see if I am getting anything
significant.
 
I get kappas of 15 - 17% - I don't think that is very good.  I know
kappa is really for comparing the outcomes of two taggers but it seems a
good way to measure if your results might be by chance.
 
Two questions:
 
Any comments on Kappa and what it might be telling me?
 
What can I do to tune my kernels further?
 
Stephen
-- 
Dr. David Meyer
Department of Information Systems

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



[R] Re: What is nu-regression for svm?

2004-09-20 Thread David Meyer
Look up the reference given at the help-page, it tells you exactly what
nu-regression does.

best,
-d

> Date: Fri, 17 Sep 2004 11:24:03 +0100
> From: [EMAIL PROTECTED]
> Subject: [R] What is nu-regression for svm?
> To: [EMAIL PROTECTED]
> Message-ID: <[EMAIL PROTECTED]>
> Content-Type: text/plain; charset=ISO-8859-1
> 
> 
> Does anyone knows what is the nu-regression option for the type
> parameter in svm (from package e1071)? I cannot find any explanation
> on that and I have a reasonable understanding on svm fundamentals.
> 
> Thanks
> 
> Joao Moreira
>
 
- 
David Meyer
Department of Information Systems

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/Wer_sind_wir/meyer/



[R] Re: R library(e1071) question: definition of performance in tune.* functions

2004-07-13 Thread David Meyer
Tae-Hoon:

> When we run tune.* for parameter tuning, we get performance value.
> Can you tell me what the definition of it is?

The values returned by tune() are Mean Squared Errors in case of
regression, and simple rates (*no* percentages) in case of
classification. As Andy already suggested, you might want to check if
your target variable is indeed a factor. In case it is and you still get
values greater than 1, drop me a mail with a piece of code (and data)
enabling me to reproduce the phenomenon. 

Best,
David
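
Spelled out in base R, the two kinds of performance value look roughly like this (toy numbers, made up for illustration):

```r
# Regression: mean squared error, unbounded above
y     <- c(1.0, 2.0, 3.0, 4.0)
y_hat <- c(1.5, 2.0, 2.0, 4.0)
mse <- mean((y - y_hat)^2)

# Classification: misclassification rate, always between 0 and 1
truth <- factor(c("a", "a", "b", "b", "b"))
pred  <- factor(c("a", "b", "b", "b", "a"), levels = levels(truth))
err <- mean(pred != truth)

print(c(mse = mse, error.rate = err))
```

So a value above 1 suggests the target was treated as numeric (regression) rather than as a factor.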

-- 
Dr. David Meyer
Department of Information Systems

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/Wer_sind_wir/meyer/



Re: [R] reading a "sparse" matrix into R

2004-04-28 Thread David Meyer
Have you considered the read.matrix.csr() function in pkg. e1071? It
uses another sparse input format, but perhaps you can easily transform
your data in the supported one. Also, in my experience, data frames are
not the best basis for a sparse format since they might turn out to be
very memory consuming and slow... The sparse formats provided by the
SparseM package are better suited for this.

-d
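
One way to get from such an adjacency list to a triplet (row, column) form that sparse-matrix constructors usually accept, sketched in base R only. The input lines are typed in from the example below; in practice they would come from readLines() on the file, and the object names are made up:

```r
# Adjacency list as in the example: first token is the node, the rest its connections
lines <- c("A B C D E", "B A C D", "C A E", "D A", "E A F", "F E", "G", "H")

parts  <- strsplit(lines, "[[:space:]]+")
nodes  <- vapply(parts, `[`, character(1), 1)
n_conn <- lengths(parts) - 1          # number of connections per node

# Long edge list: one row per (node, connection) pair
edges <- data.frame(from = rep(nodes, n_conn),
                    to   = unlist(lapply(parts, `[`, -1)),
                    stringsAsFactors = FALSE)

# Integer row/column indices of the non-zero entries
i <- match(edges$from, nodes)
j <- match(edges$to, nodes)
print(head(cbind(i, j)))
```

The index vectors i and j (plus a vector of ones) are the kind of input a sparse-matrix class expects, which avoids ever building the full 47k x 47k matrix.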

Date: Tue, 27 Apr 2004 17:10:09 -0400
From: "Aaron J. Mackey" <[EMAIL PROTECTED]>
Subject: [R] reading a "sparse" matrix into R
To: [EMAIL PROTECTED]
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset=US-ASCII; format=flowed


I have a 47k x 47k adjacency matrix that is very sparse (at most 30 
entries per row); my textual representation therefore is simply an 
adjacency list of connections between nodes for each row, e.g.

nodeconnections
A   B   C   D   E
B   A   C   D
C   A   E
D   A
E   A   F
F   E
G
H

I'd like to import this into a dataframe of node/connection 
(character/vector-of-characters) pairs.  I've experimented with scan, 
but haven't been able to coax it to work.  I can also "hack" it with 
strsplit() myself, but I thought there might be a more elegant way.

Thanks,

-Aaron



[R] SVM question

2004-03-22 Thread David Meyer
>I have a question concerning the svm in the e1071 package.
>I trained the svm by a set of samples, doing a 10-fold cross validation.
>The summary function then prints out the total accuracy and single
>accuracies, works fine.

>My question is then: Is it possible to get classification results per
>cross validation out of the svm? I mean e.g. numbers about the true
>positives, fp, fn, tn?

No, because the accuracies are not computed using a confusion matrix.
The computation is internally done in C.

>How do I get a list of the classified examples ? 


-- 
Dr. David Meyer
Department of Information Systems

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/Wer_sind_wir/meyer/



[R] Re: R-help Digest, Vol 13, Issue 14

2004-03-15 Thread David Meyer
You could look at

@Article{ e1071-papers:meyer+leisch+hornik:2003,
  author= {David Meyer and Friedrich Leisch and Kurt Hornik},
  title = {The Support Vector Machine under Test},
  journal   = {Neurocomputing},
  year  = 2003,
  month = {September},
  pages = {169--186},
  volume= 55
}

which compares a lot of classification and regression methods available
in R. The purpose obviously was to assess SVMs, but the error rates can
be compared independently from that. Generally, the performance of
nnet() was acceptable, but ensemble methods have been quite competitive
as well.

Best,
David

---

I was wondering if anybody ever tried to compare the classification
accuracy of nnet to other (rpart, tree, bagging) models. From what I
know, there is no reason to expect a significant difference in 
classification accuracy between these models, yet in my particular case
I get about 10% error rate for tree, rpart and bagging model and 80% 
error rate for nnet, applied to the same data.

Thanks.



[R] SVM unbalanced classes

2004-03-09 Thread David Meyer
You might consider using the `class.weights' argument of svm().

Best,

David.
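
A possible way to set this up, assuming the argument is class.weights (a named vector, one weight per class) as in current e1071. The weights below are simply inverse class frequencies, which is one common choice and not a recommendation from the package; the unbalanced subset of iris is made up for the example. The svm() call is only run if e1071 is installed:

```r
# Unbalanced two-class toy problem: 50 versicolor vs. 10 virginica
d <- iris[c(51:100, 101:110), ]
d$Species <- factor(d$Species)          # factor response, unused level dropped

# Weights inversely proportional to class frequency
counts <- table(d$Species)
w <- setNames(as.numeric(max(counts) / counts), names(counts))
print(w)

if (requireNamespace("e1071", quietly = TRUE)) {
  m <- e1071::svm(Species ~ ., data = d, class.weights = w)
  print(table(predicted = predict(m, d), actual = d$Species))
}
```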


Hi!

I am using R 1.8.1 and the svm of the e1071 package for classification.
The problem is that I have unbalanced classes e.g. the first one is much
bigger than the second one and therefore the svm is biased to the first
class.
If I manually adjust the class size the bias disappears.
The question is then how to include this unequal class distribution to
the svm (e.g. via weights or costs)?

Yours,
Frank
-- 
Frank G. Zoellner
AG Angewandte Informatik
Technische Fakult"at
Universit"at Bielefeld
phone: +49(0)521-106-2951
fax:   +49(0)521-106-2992
email: [EMAIL PROTECTED]



Re: [R] svm in e1071 package: polynomial vs linear kernel

2003-11-03 Thread David Meyer


On Mon, 3 Nov 2003 [EMAIL PROTECTED] wrote:

> I am trying to understand what is the difference between linear and 
> polynomial kernel:
> 
>   linear: u'*v
> 
>   polynomial: (gamma*u'*v + coef0)^degree
> 
> It would seem that polynomial kernel with gamma = 1; coef0 = 0 and degree 
> = 1
> should be identical to linear kernel, however it gives me significantly 
> different results  for very simple
> data set, with linear kernel significantly outperforming polynomial 
> kernel.
> 
> *** mse, r2 = 0.5, 0.9 for linear
> *** mse, r2 = 1.8, 0.1 for polynomial
> 
> What am I missing ?

Well: perhaps, that you should pass *all* parameters from your cv.svm
function to the call of svm()?

g.,
-d
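
To spell this out: the two kernels do coincide for gamma = 1, coef0 = 0, degree = 1, which a few lines of base R confirm. Below that is a variant of the quoted cv.svm that forwards the kernel parameters to svm(). The name cv_svm, the added coef0 argument and the ncol()-based gamma default are my own choices for the sketch, and the function needs e1071 when it is actually called:

```r
# Kernel definitions as given in the question
lin_kernel  <- function(u, v) sum(u * v)
poly_kernel <- function(u, v, gamma, coef0, degree) (gamma * sum(u * v) + coef0)^degree

set.seed(1)
u <- rnorm(3); v <- rnorm(3)
same <- all.equal(lin_kernel(u, v),
                  poly_kernel(u, v, gamma = 1, coef0 = 0, degree = 1))
print(same)

# Cross-validation wrapper that passes ALL kernel parameters on to svm()
cv_svm <- function(formula, data, ntry = 3, kernel = "linear", scale = FALSE,
                   cross = 3, gamma = 1 / (ncol(data) - 1), degree = 3,
                   coef0 = 0) {
  mse <- 0; r2 <- 0
  for (n in seq_len(ntry)) {
    fit <- e1071::svm(formula, data = data, scale = scale, kernel = kernel,
                      cross = cross, gamma = gamma, degree = degree,
                      coef0 = coef0)
    mse <- mse + fit$tot.MSE
    r2  <- r2 + fit$scorrcoeff
  }
  c(mse = mse / ntry, r2 = r2 / ntry)
}
```

With the parameters forwarded, cv_svm(y ~ ., df, kernel = "polynomial", gamma = 1, degree = 1) would be expected to behave much like the linear kernel, up to the randomness of the cross-validation splits.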

> 
> Ryszard
> 
> P.S.
> 
> Here are my results:
> 
> # simple cross validation function
> cv.svm <- function(formula, data, ntry = 3, kernel = "linear", scale = 
> FALSE, cross = 3,
>gamma = 1/(dim(data)-1), degree = 3) {
>mse <- 0; r2 <- 0
>for (n in 1:ntry) {
>   svm.model <- svm(formula , data = data, scale = scale, kernel = 
> kernel,
>cross = cross)
>   mse <- mse + svm.model$tot.MSE
>   r2  <- r2 + svm.model$scorrcoeff
>}
>mse <- mse/ntry; r2 <- r2/ntry; result <- c(mse, r2)
>cat(sprintf("cv.svm> mse, r2 = %5.3f %5.3f\n", mse, r2))
>return (result)
> }
> 
> # define data set
> 
> x1 <- rnorm(9); x2 <- rnorm(9)
> df <- data.frame(y = 2*x1 + x2, x1, x2)
> 
> #  invoke cv.svm() for linear and polynomial kernels few times
> 
> > r <- cv.svm( y ~ ., df, kernel = "polynomial", gamma = 1, degree = 1, 
> ntry = 32)
> cv.svm> mse, r2 = 1.888 0.162
> > r <- cv.svm( y ~ ., df, kernel = "polynomial", gamma = 1, degree = 1, 
> ntry = 32)
> cv.svm> mse, r2 = 1.867 0.146
> > r <- cv.svm( y ~ ., df, kernel = "polynomial", gamma = 1, degree = 1, 
> ntry = 32)
> cv.svm> mse, r2 = 1.818 0.105
> > r <- cv.svm( y ~ ., df, kernel = "linear", gamma = 1, degree = 1, ntry = 
> 32)
> cv.svm> mse, r2 = 0.525 0.912
> > r <- cv.svm( y ~ ., df, kernel = "linear", gamma = 1, degree = 1, ntry = 
> 32)
> cv.svm> mse, r2 = 0.537 0.878
> > r <- cv.svm( y ~ ., df, kernel = "linear", gamma = 1, degree = 1, ntry = 
> 32)
> cv.svm> mse, r2 = 0.528 0.913
> 


Re: [R] problem with tune.svm

2003-10-31 Thread David Meyer


On Fri, 31 Oct 2003 [EMAIL PROTECTED] wrote:

> > rng <- list(gamma = 2^(-1:1), cost = 2^(2:4))
> > rng
> $gamma
> [1] 0.5 1.0 2.0
> 
> $cost
> [1]  4  8 16
> 
> > obj <- tune.svm(pIC50 ~ ., data = data, ranges = rng)
> Error in tune(svm, train.x = x, data = data, ranges = ranges, ...) :
> formal argument "ranges" matched by multiple actual arguments

The function `tune.svm' has no `ranges' argument, use `gamma' and `cost'
separately. The idea is to make `tune.foo' a `vectorized' function of
`foo' in the parameters. If you want to preconstruct a list, use

tune(svm, ..., ranges = ...)

instead.

g.,
-d
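
Side by side, the two calling styles might look as follows. This is a sketch using iris instead of the original data, and the e1071 part only runs if the package is installed:

```r
# Parameter grid as in the question: 3 x 3 = 9 combinations
rng  <- list(gamma = 2^(-1:1), cost = 2^(2:4))
grid <- expand.grid(rng)
print(nrow(grid))

if (requireNamespace("e1071", quietly = TRUE)) {
  library(e1071)
  set.seed(1)
  # Style 1: tune.svm() takes the parameter vectors directly
  obj1 <- tune.svm(Species ~ ., data = iris, gamma = rng$gamma, cost = rng$cost)
  # Style 2: generic tune() takes the preconstructed list via `ranges'
  obj2 <- tune(svm, Species ~ ., data = iris, ranges = rng)
  print(obj1$best.parameters)
  print(obj2$best.parameters)
}
```

The best parameters of the two runs need not agree exactly, since each call draws its own cross-validation folds.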

> 
> Any idea why ???
> 
> Ryszard


Re: [R] svm from e1071 package

2003-10-29 Thread David Meyer
> This suggests to me that data are scrambled each time - the last time I 
> looked at libsvm python interface
> this is what was done. Is this the same here (I hope) ?

yes.

g.,
David



Re: [R] Logit reality check

2003-09-28 Thread David Meyer
> > If I try the model below, R seems to grumble with a complaint.
> >
> > glm(cbind(Y,1-Y) ~ X, family = binomial)
> >
> > non-integer counts in a binomial glm! in: eval(expr, envir, enclos)
> >

For binomial models (as described in the help page), the response must be
either a factor or an n x 2 matrix with the numbers of successes and
failures, not the proportions.

g.,
David
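
For example, with made-up grouped data (20 trials per x value), the two-column form looks like this. Proportions can also be used as the response, but then the number of trials has to be supplied through `weights':

```r
# Toy grouped data: number of successes out of n trials at each x
x    <- 1:5
n    <- rep(20, 5)
succ <- c(2, 5, 9, 14, 18)

# Response as an n x 2 matrix of (successes, failures)
fit1 <- glm(cbind(succ, n - succ) ~ x, family = binomial)

# Equivalent: proportions as response, trial counts as weights
fit2 <- glm(succ / n ~ x, family = binomial, weights = n)

print(coef(fit1))
print(coef(fit2))
```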



Re: [R] How to detect which function is used for e.g. printing an object of a given class

2003-09-24 Thread David Meyer
> Is there an alternative way of "dispatching" the printing, such that
> the usual print method for loglm is used after doing what is special
> for hllm?
You might want to have a look at `NextMethod()'

Best,
David
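
A toy illustration of the idea, with made-up class names hllm_demo and loglm_demo standing in for the real hllm and loglm classes:

```r
# "Parent" print method (stand-in for the usual loglm printing)
print.loglm_demo <- function(x, ...) {
  cat("loglm-style body\n")
  invisible(x)
}

# "Child" method: do what is special for the subclass, then hand over
print.hllm_demo <- function(x, ...) {
  cat("hllm-specific header\n")
  invisible(NextMethod())
}

obj <- structure(list(), class = c("hllm_demo", "loglm_demo"))
out <- capture.output(print(obj))
print(out)
```

NextMethod() dispatches on the remaining classes of the object, so the order in the class vector matters.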


Re: [R] problem with HoltWinters

2003-09-03 Thread David Meyer
HoltWinters() by default fits a seasonal model, and therefore needs 
three complete cycles for the starting values. But your third cycle is 
incomplete, so in short, you haven't got enough data to fit a seasonal 
model, unless you provide all starting values using l.start, b.start, 
and s.start. I will change the code to give a better error message in 
such cases.

Best,
David.
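
For the archive, a sketch using the 30 values from the attachment: the series covers only 2.5 cycles, and a non-seasonal fit (gamma = FALSE) is one way to get a model from it. The smoothing constants alpha and beta are fixed at arbitrary values here just to keep the example simple; they are not tuned and not a recommendation.

```r
x <- c(1117001, 1017287, 1195142, 1049729, 1409147, 1267002,
       1579907, 1563127, 1195597, 1290688, 1104137, 1027022,
       1228333, 1062520, 1080117, 1171998, 1383951, 1141008,
       1604061, 1446024, 1276017, 1262232, 1048522, 1174157,
       1068221, 1045052, 1164273, 1091765, 1272330, 1305676)
data.ts <- ts(x, start = c(2001, 1), frequency = 12)

# Number of seasonal cycles available: 30 / 12 = 2.5
cycles <- length(data.ts) / frequency(data.ts)
print(cycles)

# Non-seasonal Holt model (level + trend only), illustrative constants
fit  <- HoltWinters(data.ts, alpha = 0.3, beta = 0.1, gamma = FALSE)
pred <- predict(fit, n.ahead = 6)
print(pred)
```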
On 2003.09.03 18:32, Luis Miguel Almeida da Silva wrote:
The data goes in attachment. I used ts to create data.ts

data.ts<-ts(data=data,start=c(2001,1),frequency=12)

	-Original Message-
	From: David Meyer [mailto:[EMAIL PROTECTED]
	Sent: Wed 03/09/2003 17:21
	To: Luis Miguel Almeida da Silva
	Cc: [EMAIL PROTECTED]
	Subject: Re: [R] problem with HoltWinters
	 
	 

	How did you construct `data.ts'? Can you send me the file?
	 
	best,
	David
	 
	On 2003.09.03 15:57, Luis Miguel Almeida da Silva wrote:
	> Dear helpers
	>
	> I'm having a problem with function HoltWinters from package ts. I
	> have a time series that I want to fit an Holt-Winters model and make
	> predictions for the next values. I've already built an object of
	> class ts to serve as input to HoltWinters. But then I get an error;
	> I've used HoltWinters a lot of times and this never happened
	>
	> > data.HW<-HoltWinters(data.ts)
	> Error in model.frame(formula, rownames, variables, varnames, extras,
	> extranames,  :
	> variable lengths differ
	>
	> This is the data
	>
	> > data.ts
	>          Jan     Feb     Mar     Apr     May     Jun     Jul     Aug     Sep     Oct     Nov     Dec
	> 2001 1117001 1017287 1195142 1049729 1409147 1267002 1579907 1563127 1195597 1290688 1104137 1027022
	> 2002 1228333 1062520 1080117 1171998 1383951 1141008 1604061 1446024 1276017 1262232 1048522 1174157
	> 2003 1068221 1045052 1164273 1091765 1272330 1305676
	>
	> Do you know what is happening?
	>
	> Thank you
	> Luis

1117001
1017287
1195142
1049729
1409147
1267002
1579907
1563127
1195597
1290688
1104137
1027022
1228333
1062520
1080117
1171998
1383951
1141008
1604061
1446024
1276017
1262232
1048522
1174157
1068221
1045052
1164273
1091765
1272330
1305676


Re: [R] problem with HoltWinters

2003-09-03 Thread David Meyer
How did you construct `data.ts'? Can you send me the file?

best,
David
On 2003.09.03 15:57, Luis Miguel Almeida da Silva wrote:
Dear helpers

I'm having a problem with function HoltWinters from package ts. I have
a time series that I want to fit an Holt-Winters model and make
predictions for the next values. I've already built an object of class
ts to serve as input to HoltWinters. But then I get an error; I've
used HoltWinters a lot of times and this never happened

> data.HW<-HoltWinters(data.ts)
Error in model.frame(formula, rownames, variables, varnames, extras,
extranames,  :
variable lengths differ

This is the data

> data.ts
         Jan     Feb     Mar     Apr     May     Jun     Jul     Aug     Sep     Oct     Nov     Dec
2001 1117001 1017287 1195142 1049729 1409147 1267002 1579907 1563127 1195597 1290688 1104137 1027022
2002 1228333 1062520 1080117 1171998 1383951 1141008 1604061 1446024 1276017 1262232 1048522 1174157
2003 1068221 1045052 1164273 1091765 1272330 1305676

Do you know what is happening?

Thank you
Luis


Re: [R] R, geochemistry, ternary diagrams

2003-07-15 Thread David Meyer
And in package vcd function ternaryplot().

g.,
-d
On 2003.07.15 09:33, Tobias Verbeke wrote:
> Are there enough geochemists using R already that he'd find
> like-minded people to discuss technical issues with if he _did_
> switch to R? Is there a package somewhere already that does ternary
> and other geochemistry diagrams?
Another possibility for a ternary plot
was mentioned by Prof Ripley in
http://maths.newcastle.edu.au/~rking/R/help/02b/3637.html

> library(MASS)
> example(Skye)
gives code and an example

HTH,

Tobias



Re: [R] Can't load e1071

2003-06-24 Thread David Meyer
Andrew,

1) The current R version is 1.7.1
2) Which version of `e1071' are you using?
3) Does the `e1071.so' file exist (in e1071/libs)?
best,
David.
On 2003.06.24 21:43, Andrew Perrin wrote:
After upgrading to 1.7.0 under debian linux, I can't get e1071 working
properly.
The first problem I had was that g++-3.0 was the standard compiler but
wasn't installed, so I installed it. e1071 then installed correctly,
but I
get the following:
[EMAIL PROTECTED]:~/afshome/papers/authoritarian/R$ R

R : Copyright 2003, The R Development Core Team
Version 1.7.0  (2003-04-16)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type `license()' or `licence()' for distribution details.
R is a collaborative project with many contributors.
Type `contributors()' for more information.
Type `demo()' for some demos, `help()' for on-line help, or
`help.start()' for a HTML browser interface to help.
Type `q()' to quit R.
[Previously saved workspace restored]

> library(e1071)
Error in dyn.load(x, as.logical(local), as.logical(now)) :
unable to load shared library
"/usr/local/lib/R/site-library/e1071/libs/e1071.so":
  /usr/local/lib/R/site-library/e1071/libs/e1071.so: cannot
dynamically
load executable
Error in library(e1071) : .First.lib failed
any suggestions? Thanks.

--
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
[EMAIL PROTECTED] * andrew_perrin (at) unc.edu


Re: [R] formula (joint, conditional independence, etc.) - mosaicplots

2003-06-13 Thread David Meyer
On 2003.06.13 02:11, g wrote:
> Hi,
>
> Can someone set me straight as to how to write formulas in R to
> indicate:
>
> complete independence [A][B][C]

Freq ~ A + B + C

> joint independence [AB][C]

Freq ~ A * B + C

> conditional independence [AC][BC]

Freq ~ A * C + B * C

> nway interaction [AB][AC][BC]

Freq ~ A * B * C - A:B:C

You might have a look at demo(mosaic) in package vcd.

g.,
-d
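
As a sanity check, the four formulas above can be fitted as Poisson log-linear models on the HairEyeColor data that ships with R. Mapping A, B, C to Hair, Eye, Sex is just my choice for the illustration:

```r
# 4 x 4 x 2 table as a data frame with columns Hair, Eye, Sex, Freq
d <- as.data.frame(HairEyeColor)

fits <- list(
  complete    = glm(Freq ~ Hair + Eye + Sex,               family = poisson, data = d),
  joint       = glm(Freq ~ Hair * Eye + Sex,               family = poisson, data = d),
  conditional = glm(Freq ~ Hair * Sex + Eye * Sex,         family = poisson, data = d),
  no3way      = glm(Freq ~ Hair * Eye * Sex - Hair:Eye:Sex, family = poisson, data = d)
)

# Residual degrees of freedom and deviance of each independence model
res_df <- sapply(fits, df.residual)
print(res_df)
print(round(sapply(fits, deviance), 1))
```

The residual degrees of freedom shrink as more interaction terms are allowed, which is a quick way to check that each formula encodes the intended model.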

?

For example, if I have 4 factors:
hair colour, eye colour, age, sex
does
>  mosaicplot( frequency ~ hair + eye + age + sex)
mean that the model fitted is of complete independence of all factors
[hair][eye][age][sex]?
So does
> mosaicplot(frequency ~ hair + eye)
mean that the model is of conditional independence
[hairAgeSex][eyeAgeSex]?
How does the operator *  as in
> mosaicplot( frequency ~ hair * eye)
or
> mosaicplot( frequency ~ hair * eye + age)
equate to in the type of independence model used?
Thanks in advance for any elucidation!

Gina



Re: [R] Code for Support Vector Clustering Algorithm

2003-06-12 Thread David Meyer
On 2003.06.12 11:57, Ramzi Feghali wrote:
> No comment,

yes, please *do* comment!

The help page clearly says the implementation can carry out:

- classification,
- regression, and
- density estimation

*no* clustering.

David.

> svm  package:e1071  R Documentation
>
> Support Vector Machines
>
> Description:
>
>  `svm' is used to train a support vector machine. It can be used to
>  carry out general regression and classification (of nu and
>  epsilon-type), as well as density-estimation. A formula interface
>  is provided.
David Meyer <[EMAIL PROTECTED]> wrote:
On 2003.06.12 09:15, Uwe Ligges wrote:
> Iouri Tipenko wrote:
>
>> Dear R-Users,
>> I'm a master student in Mathematics and Statistics at Carleton
>> University, Ottawa, Canada.
>> I'm studying Clustering methods including different related
>> algorithms. One of them is Support Vector Clustering algorithm.
>> I was wondering whether anybody implemented this algorithm and
could
>> help me with the S-Plus or R computer code that I could use in my
>> simulations.
>> I would really appreciate your help or any advise on where I can
get
>> this code.
>
> There is an implementation of Support Vector Machines in package
> e1071, which is available on CRAN.
...but it does not include Support Vector *clustering*.

David



Re: [R] Code for Support Vector Clustering Algorithm

2003-06-12 Thread David Meyer
On 2003.06.12 09:15, Uwe Ligges wrote:
> Iouri Tipenko wrote:
>
>> Dear R-Users,
>> I'm a master student in Mathematics and Statistics at Carleton
>> University, Ottawa, Canada.
>> I'm studying Clustering methods including different related
>> algorithms. One of them is Support Vector Clustering algorithm.
>> I was wondering whether anybody implemented this algorithm and could
>> help me with the S-Plus or R computer code that I could use in my
>> simulations.
>> I would really appreciate your help or any advice on where I can get
>> this code.
>
> There is an implementation of Support Vector Machines in package
> e1071, which is available on CRAN.

...but it does not include Support Vector *clustering*.

David



Re: [R] Error Compiling e1071

2003-06-07 Thread David Meyer
> 
> I am trying to compile the package e1071 (version 1.3-11) with R CMD
> INSTALL. I tried with R 1.7.0 on Redhat Linux 2.4.7-10 and R 1.6.2 on
> Linux 2.4.9-34smp but keep getting the same error message during
> configure :
> 
> WARNING: g++ 2.96 cannot reliably be used with this package. Please use
> a different compiler.
> 
> Can anyone help me with this or at least point me in the right direction
> ? Thank you very much.

We added this warning because g++ 2.96 breaks the C++ code of `libsvm'
contained in the `e1071' package. If you don't use SVMs, you might
interpret this warning simply as a general upgrade suggestion :)

Best,
David.

> 
> Regards, Adai.


Re: [R] Goodness of fit tests

2003-03-29 Thread David Meyer
Try `goodfit' in package `vcd'.
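
For reference, the manual procedure described in the question quoted below
(group into classes, compute expected frequencies, sum (o_i - e_i)^2 / e_i,
then get the p-value from pchisq) can be sketched in base R roughly as
follows. This is only a sketch: the helper name `poisson_gof', the pooling
of all values >= max_class into one class, and the simulated data are
assumptions made for illustration, not something taken from `vcd'.

```r
# Chi-square goodness-of-fit test for a Poisson distribution with a GIVEN
# (not estimated) rate, following the manual steps from the question.
# `poisson_gof' and `max_class' are hypothetical names for this sketch.
poisson_gof <- function(x, lambda, max_class = 5) {
  # Classes 0, 1, ..., max_class - 1, plus one pooled class ">= max_class"
  classes  <- pmin(x, max_class)
  observed <- tabulate(classes + 1, nbins = max_class + 1)

  # Class probabilities: dpois() for single values, upper tail for the pool
  p <- c(dpois(0:(max_class - 1), lambda),
         ppois(max_class - 1, lambda, lower.tail = FALSE))
  expected <- length(x) * p

  stat <- sum((observed - expected)^2 / expected)
  df   <- length(observed) - 1   # no parameter was estimated from the data

  list(statistic = stat, df = df,
       p.value   = pchisq(stat, df, lower.tail = FALSE),
       observed  = observed, expected = expected)
}

set.seed(1)
x   <- rpois(200, lambda = 2)    # simulated data, for illustration only
res <- poisson_gof(x, lambda = 2)
print(res$p.value)
```

If the rate were estimated from the data instead of given, one further
degree of freedom would have to be subtracted. The pooled upper class is
there to keep the expected counts from getting too small.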

g.,
-d

Mag. David Meyer
Vienna University of Technology
Department of Statistics and Probability Theory
Wiedner Hauptstrasse 8-10, A-1040 Vienna/AUSTRIA
Tel.: (+431) 58801/10772
Fax.: (+431) 58801/10798

On Sat, 29 Mar 2003, Fernando Henrique Ferraz wrote:

>   
> I have a dataset which I want to model using a Poisson distribution, with a given
> parameter. I would like to know what is the proper way to do a 'goodness of fit'
> test using R.
> I know the steps I'd take if I were to do it 'manually': grouping the numbers
> into classes, calculating the expected frequencies using 'ppois', then calculating
> Chi_2_obs = Sum((e_i - o_i)^2 / e_i) (where e_i represents the expected frequencies
> and o_i the observed ones) and then finally calculating the p-value (using pchisq).
> I've read a lot of documentation, also tried googling for 'goodness of fit R' but
> it did not help, most of it is only about 'regression analysis'. Does anyone know if
> there is a simpler way to do this?
> 
> 
> Thank you, 


Re: [R] RODBC and Excel in Widows

2003-03-26 Thread David Meyer
You might look at Thomas Baier's DCOM interface as an alternative to the
odbc-method for accessing EXCEL-files.

-d

"r.ghezzo" wrote:
> 
> HI,
>  no, sorry, so far nobody has answered. So it probably does not have a solution.
>  Excel is from you.know.who
> 
> >===== Original Message From Meinhard Ploner <[EMAIL PROTECTED]> =====
> >Hello!
> >Did you resolve the problem?
> >I'm interested in the solution, too.
> >Meinhard
> >
> >On Thursday, March 13, 2003, at 07:21  PM, R. Heberto Ghezzo wrote:
> >
> >> Hello, I have some problems with RODBC and Excel in Win98
> >> I am using R 1.6.2 and just upgraded RODBC to the last version on CRAN.
> >> I have an Excel file with columns Number, Name, Sex, Age, FEV1 on Sheet
> >> 1 and Number, Age, FEV1, Name, Sex on Sheet 2.
> >> Now I open the channel to the file
> >>> chan1 <- odbcConnectExcel("c:/testOdbc.xls")
> >>> tables(chan1)
> >> and the list appears with the 2 tables
> >>> aa <- sqlFetch(chan1,"Sheet1")
> >> and aa has the Number, Name and Sex columns correct but Age and FEV1
> >> are all NAs
> >>> bb <- sqlFetch(chan1,"Sheet2")
> >> and bb is correct!
> >> So all numeric columns after a column of characters become NAs
> >> Is this an Excel problem or an sql problem.? I did not find anything in
> >> the r-help archives relative to this problem.
> >> Thanks for any help
> 
> R. Heberto Ghezzo Ph.D.
> Meakins-Christie Labs
> McGill University
> Montreal - Que - Canada


Re: [R] where is kurtosis??

2003-03-08 Thread David Meyer

E.g., in package e1071.
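
A minimal sketch of what this amounts to, in case it helps: the excess
kurtosis is the fourth central moment divided by the squared second
central moment, minus 3. The helper `excess_kurtosis' below is a
hypothetical name used only for this illustration. The e1071 call is
guarded so the example also runs without the package, and `type = 1' is
chosen on the assumption that it selects the same moment-based definition.

```r
# Excess kurtosis from central moments: m4 / m2^2 - 3
# (`excess_kurtosis' is a hypothetical helper for this sketch)
excess_kurtosis <- function(x) {
  d  <- x - mean(x)
  m2 <- mean(d^2)   # second central moment
  m4 <- mean(d^4)   # fourth central moment
  m4 / m2^2 - 3
}

x <- c(2, 4, 4, 4, 5, 5, 7, 9)
print(excess_kurtosis(x))   # -0.21875

# The one-command version, if e1071 is installed
if (requireNamespace("e1071", quietly = TRUE)) {
  print(e1071::kurtosis(x, type = 1))
}
```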

best,
-d

On Sat, 8 Mar 2003, Shutnik wrote:

>  Dear friends,
>  I try to get started with R and can’t estimate kurtosis of a random sample by using 
> one command. I have installed R 1.6.2. Please help.
> 
>  Max


Re: [R] svm

2003-02-06 Thread David Meyer
Christian Hennig wrote:
> 
> Hello list,
> 
> I want to apply svm from library e1071, and I want to supply class weights.
> I do not really understand the help entry (and there is no example):
> 
> class.weights: a named vector of weights for the different classes,
>   used for asymmetric class sizes. Not all factor levels have to
>   be supplied (default weight: 1). All components have to be
>   named.
> 
> I have two classes, factor levels are 1 (2000 cases, say) and 2 (1000
> cases). What should the entry for class.weights look like? (I'm more
> interested in the syntax than what the weight should be, but if you know,
> please tell me...)

For example, consider the two classes `male' and `female':

svm(..., class.weights = c(male=0.4, female=0.6))
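
For factor levels that are numbers, as in the question (levels 1 and 2),
the names have to be quoted, e.g. c("1" = 1, "2" = 2). A sketch under
stated assumptions follows: the inverse-frequency weights are just one
common heuristic and not a recommendation from this thread, the data are
simulated, the class sizes are scaled down from 2000/1000 to 200/100, and
the svm() call is guarded so the example also runs without e1071.

```r
set.seed(42)

# Two classes with factor levels "1" and "2" (scaled down from 2000/1000)
y <- factor(rep(c("1", "2"), times = c(200, 100)))
# Simulated predictors: both columns are shifted by the class index
x <- matrix(rnorm(600), ncol = 2) + as.integer(y)

# Inverse-frequency weights (an assumption for illustration); the names
# must match the factor levels exactly
counts <- table(y)
wts <- as.numeric(max(counts) / counts)
names(wts) <- names(counts)
print(wts)   # same as c("1" = 1, "2" = 2)

if (requireNamespace("e1071", quietly = TRUE)) {
  fit <- e1071::svm(x, y, type = "C-classification", kernel = "linear",
                    class.weights = wts)
  print(table(predicted = predict(fit, x), actual = y))
}
```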

g.,
-d

> 
> Best,
> Christian
> 
> --
> ***
> Christian Hennig
> Seminar fuer Statistik, ETH-Zentrum (LEO), CH-8092 Zuerich (currently)
> and Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
> [EMAIL PROTECTED], http://stat.ethz.ch/~hennig/
> [EMAIL PROTECTED], http://www.math.uni-hamburg.de/home/hennig/
> ###
> I recommend www.boag.de



Re: [R] svm regression in R

2003-01-31 Thread David Meyer
Christoph Helma wrote:
> 
> Hello,
> 
> I have a question concerning SVM regression in R. I intend to use SVMs for feature
> selection (and knowledge discovery). For this purpose I will need to extract the
> weights that are associated with my features. I understand from a previous thread on
> SVM classification that predictive models can be derived from SVs, coefficients and
> rhos, but it is unclear to me how to transfer this information to the regression
> problem. Can anyone help in this respect (I am *not* an SVM expert)?

That's pretty simple.
The ``decision'' (predictor) function for regression is as follows:

f(x) = \sum_{i=1}^{l} alpha_i * K(x_i, x) - rho

where `alpha_i' are the coefficients of the SVs, `x_i' are the SVs
themselves, and `l' the number of SVs.
Note that `rho' must be *subtracted* because libsvm returns -b for some
reason.
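
A rough sketch of the formula above in R, under a few assumptions: the
helper names `svr_predict' and `linear_kernel' are made up for this
illustration, the toy numbers are arbitrary, and the optional e1071 part
sets scale = FALSE because svm() scales the data by default, in which case
the stored SVs live in the scaled space. For a linear kernel the sum
collapses to per-feature weights w = sum_i alpha_i * x_i, which is probably
what you want for feature selection.

```r
# f(x) = sum_{i=1}^{l} alpha_i * K(x_i, x) - rho
# (`svr_predict' and `linear_kernel' are hypothetical helpers)
svr_predict <- function(newx, sv, alpha, rho, kernel) {
  apply(newx, 1, function(x) {
    k <- apply(sv, 1, function(s) kernel(s, x))   # K(x_i, x) for every SV
    sum(alpha * k) - rho
  })
}
linear_kernel <- function(u, v) sum(u * v)

# Toy numbers, for illustration only
sv    <- rbind(c(1, 0), c(0, 2))   # support vectors x_i, one per row
alpha <- c(0.5, -0.25)             # coefficients alpha_i
rho   <- 0.1
newx  <- rbind(c(2, 1), c(0, 0))

toy <- svr_predict(newx, sv, alpha, rho, linear_kernel)
print(toy)                         # 0.4 -0.1

# Linear kernel only: feature weights w = sum_i alpha_i * x_i
w <- drop(t(alpha) %*% sv)
print(w)                           # 0.5 -0.5

# Optional comparison against a fitted model, if e1071 is installed
if (requireNamespace("e1071", quietly = TRUE)) {
  set.seed(1)
  X <- matrix(rnorm(100), ncol = 2)
  y <- drop(X %*% c(1, -2)) + rnorm(50, sd = 0.1)
  m <- e1071::svm(X, y, type = "eps-regression", kernel = "linear",
                  scale = FALSE)
  manual <- svr_predict(X, m$SV, drop(m$coefs), m$rho, linear_kernel)
  # Expected to agree with predict() up to numerical tolerance
  print(all.equal(as.numeric(manual), as.numeric(predict(m, X))))
}
```

For other kernels the same function should work with a different `kernel'
argument, but there are no explicit per-feature weights then.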

Best,

David.

> 
> Thanks,
> Christoph
> --
> :: christoph helma
> :: computational toxicologist
> :: university freiburg
> :: georges koehler allee 079, d-79110 freiburg/br
> :: phone ++49-761-203-8013, fax -8007
> :: [EMAIL PROTECTED]
> :: http://www.informatik.uni-freiburg.de/~helma/