[R] BreastCancer Dataset for Classification in kknn

2010-06-01 Thread Nitin
Dear All,

I'm getting an error while trying to apply the BreastCancer dataset
(package=mlbench) to kknn (package=kknn) that I don't understand, as I'm new
to R.
The code is as follows:

rm(list = ls())
library(mlbench)
data(BreastCancer)
library(kknn)

# Drop incomplete rows, then split into odd rows (train) and even rows (test)
BCancer = na.omit(BreastCancer)
d  = dim(BCancer)[1]
i1 = seq(1, d, 2)
i2 = seq(2, d, 2)

t1 = BCancer[i1, ]
t2 = BCancer[i2, ]
y2 = BCancer[i2, 11]

# Values of k and the kernels to try
x = 10
k = array(1:x, dim = c(x, 1))
ker = array(c("rectangular", "triangular", "epanechnikov", "biweight",
              "triweight", "cos", "inv", "gaussian"), dim = c(8, 1))

# Percentage of misclassified test rows for one k and one kernel
f = function(x, ker) {
  BreastCancer.kknn <- kknn(Class ~ ., train = t1, test = t2, k = x,
                            kernel = ker, distance = 1)
  fit = fitted(BreastCancer.kknn)

  z <- (fit == y2)
  z.e <- (100 - (length(y2) - length(z[!z])) / length(y2) * 100)
}

err.k = function(ker) {
  error.BreastCancer = apply(k, 1, function(y) f(y, ker))
}

err.ker = apply(ker, 1, err.k)
colnames(err.ker) = c("rectangular", "triangular", "epanechnikov",
                      "biweight", "triweight", "cos", "inv", "gaussian")
print(err.ker)

It throws an error:

Error in as.matrix(learn[, ind == i]) :
  (subscript) logical subscript too long
In addition: Warning messages:
1: In model.matrix.default(mt, mf) : variable 'Id' converted to a factor
2: In model.matrix.default(mt, test) : variable 'Id' converted to a factor

I tried the code with other datasets in the mlbench package and most of them
work. What is the mistake here for this particular dataset, and how can I
solve it?

Thanks
Nitin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
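
A note on the two warnings quoted above: both say that model.matrix converted
the variable 'Id' to a factor, once for the training frame and once for the
test frame. A factor built separately from each subset gets its own levels, so
the two design matrices can end up with different columns. Whether that is what
triggers the "logical subscript too long" error inside kknn is an assumption,
not something the thread confirms. The base-R sketch below only illustrates the
mechanism on a made-up data frame; toy, train and test are hypothetical names,
not part of the original post.

```r
# Toy stand-in for BreastCancer: a character Id (with one repeated value),
# one numeric predictor and a two-level class label. All values are invented.
toy <- data.frame(
  Id    = c("a", "b", "c", "d", "a", "f"),
  x     = c(1, 2, 3, 4, 5, 6),
  Class = factor(c("benign", "malignant", "benign",
                   "malignant", "benign", "malignant")),
  stringsAsFactors = FALSE
)

train <- toy[c(1, 3, 5), ]   # odd rows
test  <- toy[c(2, 4, 6), ]   # even rows

# With Id in the formula, each subset produces its own dummy columns for Id.
mm_train <- model.matrix(Class ~ ., data = train)
mm_test  <- model.matrix(Class ~ ., data = test)
print(colnames(mm_train))
print(colnames(mm_test))

# Without Id, both subsets yield the same columns.
keep <- names(toy) != "Id"
mm_train_noid <- model.matrix(Class ~ ., data = train[, keep])
mm_test_noid  <- model.matrix(Class ~ ., data = test[, keep])
print(identical(colnames(mm_train_noid), colnames(mm_test_noid)))
```

The repeated Id value matters for the illustration: it gives the two subsets a
different number of distinct identifiers, so the column counts differ as well
as the column names.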


Re: [R] BreastCancer Dataset for Classification in kknn

2010-06-01 Thread Joris Meys
Hi Nitin,

It can be solved by splitting your data a bit differently. You need more
training data than evaluation data, e.g.:
i1 = 1:400
i2 = 401:d

Then it works on my computer. I have no clue where the error originates,
though.

Cheers
Joris

-- 
Joris Meys
Statistical Consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

Coupure Links 653
B-9000 Gent

tel : +32 9 264 59 87
joris.m...@ugent.be
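
An alternative that keeps the original odd/even split is to leave the Id column
out of the model. This is a minimal sketch under an assumption the thread does
not confirm: that the error comes from Id being turned into a factor (as the
two warnings report) rather than from the relative sizes of the training and
test sets. The helper error_rate and the names bc, odd and even are
hypothetical. The fit runs only if mlbench and kknn are installed, and it has
not been checked against the package versions from 2010.

```r
# Hypothetical helper: percentage of misclassified rows.
error_rate <- function(predicted, actual) {
  100 * mean(as.character(predicted) != as.character(actual))
}

# Skip the fit quietly when either package is missing.
have_pkgs <- requireNamespace("mlbench", quietly = TRUE) &&
  requireNamespace("kknn", quietly = TRUE)

if (have_pkgs) {
  library(kknn)
  data(BreastCancer, package = "mlbench")

  # Assumption: Id is the culprit, so drop it before removing incomplete rows.
  bc <- na.omit(BreastCancer[, names(BreastCancer) != "Id"])

  odd  <- seq(1, nrow(bc), by = 2)   # training rows, as in the original post
  even <- seq(2, nrow(bc), by = 2)   # test rows

  fit <- kknn(Class ~ ., train = bc[odd, ], test = bc[even, ],
              k = 10, kernel = "rectangular", distance = 1)
  print(error_rate(fitted(fit), bc$Class[even]))
}
```

Id is a sample identifier, so leaving it out should not cost any predictive
information. If the error persists with Id removed, the assumption above is
wrong and the cause lies elsewhere.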