On 08.09.2011 14:19, Максим Иванов wrote:
Hello.

I found that the behavior of the knn function
(http://stat.ethz.ch/R-manual/R-devel/library/class/html/knn.html)
looks very strange.
Consider this toy example:
library(class)
train <- matrix(nrow=5000, ncol=2, data=rnorm(10000, 0, 1))
test <- matrix(nrow=10, ncol=2, data=rnorm(20, 0, 1))
cl <- rep(c(0, 1), 2500)
knn(train,test,cl,1)
  [1] 1 1 0 0 1 0 1 1 0 1
Levels: 0 1

It works properly if you pass any number of nearest neighbours (the 4th
parameter) from 1 to 499.
But if you run it with a number of nearest neighbours >= 500, there is an error:
knn(train,test,cl,500)
error in knn(train, test, cl, 500) : too many ties in knn

This happens no matter what data you have. Even if you run it with an odd
number of nearest neighbours, say 501 (so there just can't be any ties),
there will be exactly the same error.
Am I missing something?


Yes, the source code. In the source package, ./src/class.c, line 89:
#define MAX_TIES 1000

That means the author (who is on a well-deserved vacation and may not answer at once) decided that it is extremely unlikely that someone will run knn with such a large number of neighbours k. If you really have an application where this makes sense, just edit the source code and increase that number, then install the package from source yourself.

Uwe Ligges
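
For readers who need a large k but would rather not rebuild the package, a brute-force classifier in base R is one possible workaround. The sketch below is only an illustration: knn_simple is a hypothetical helper, not part of the class package, and it differs from class::knn in that vote ties go to the first factor level instead of being broken at random.

```r
# Hypothetical helper (not from the class package): brute-force kNN in base R.
# It has no fixed limit on ties, at the cost of O(n) work per test point.
knn_simple <- function(train, test, cl, k) {
  cl <- as.factor(cl)
  stopifnot(nrow(train) == length(cl), k >= 1, k <= nrow(train))
  pred <- apply(test, 1, function(x) {
    # Squared Euclidean distance from x to every training row
    d <- rowSums(sweep(train, 2, x)^2)
    # Indices of the k closest training rows
    nn <- order(d)[seq_len(k)]
    # Majority vote; which.max() picks the first level on a vote tie
    votes <- table(cl[nn])
    names(votes)[which.max(votes)]
  })
  factor(pred, levels = levels(cl))
}

# Same shape of data as the toy example above, with k = 501
set.seed(1)
train <- matrix(rnorm(10000), nrow = 5000, ncol = 2)
test <- matrix(rnorm(20), nrow = 10, ncol = 2)
cl <- rep(c(0, 1), 2500)
knn_simple(train, test, cl, 501)
```

This returns a factor with one prediction per test row, like knn does. It is much slower than the compiled code on large data, so it is only a stopgap.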





______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
