> This is a summary of discussions between Shengqiao Li and me,
> entered here as a reference for future requests on knn regression
> or missing value imputation based on a nearest neighbor approach.
There are several functions that can be used for 'nearest neighbor' classification, such as knn, knn1 (in package class), knn3(caret), kknn(kknn), ipredknn(ipred), sknn(klaR), or gknn(cba). Using these functions for 'nearest neighbor' regression would be difficult.

There is actually just one knn-like function that can be applied to continuous variables:

    kknn(kknn)
        uses a formula and looks at the type of the target variable;
        if the target variable is continuous, it returns a regression
        result for each row in the learning set

And there are two implementations of functions that simply return the indices and distances of the k nearest neighbors for further processing:

    ann(yaImpute)
        constructs kd- or bd-trees to find the k nearest neighbors and
        returns the indices and distances of those neighbors
        (it may kill the whole R process when matrices are too big)
        [Remark: Watch out, the default distance is the sum of squares]

    knnFinder(knnFinder)
        constructs a kd-tree to find the k nearest neighbors;
        has so many bugs and quirks that it is almost unusable;
        not maintained anymore (perhaps should be removed from CRAN)

The other approach is to use a distance function and sort 'manually' to find the nearest neighbors and their values for the target variable. 'dist' itself is not really appropriate, as it can only be applied to _one_ matrix, whereas here we need something like dist(A, B). Combining A and B into one matrix is often not feasible because it needs too much memory.

    dists(cba)
        computes a distance matrix between the rows of two matrices;
        can be a bit slow for very big matrices (slower than 'dist')
        [Remark: the default distance is the square root of the sum of squares]

I would appreciate hearing from you if I missed something.
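For illustration, the 'manual' approach described above (a distance function between two matrices, followed by sorting) might look roughly like the following in base R. This is only a minimal sketch, not code from any of the packages mentioned: the function names cross_dist2 and knn_reg are made up for this example, the distance is assumed to be squared Euclidean, and the prediction is assumed to be the plain unweighted mean of the k nearest target values.

```r
# Squared Euclidean distances between rows of A (queries) and rows of B
# (reference points), via |a|^2 + |b|^2 - 2 * a.b; no combined matrix needed.
# (hypothetical helper, not part of any package)
cross_dist2 <- function(A, B) {
  A <- as.matrix(A)
  B <- as.matrix(B)
  d2 <- outer(rowSums(A^2), rowSums(B^2), "+") - 2 * tcrossprod(A, B)
  pmax(d2, 0)  # clip tiny negative values caused by rounding
}

# knn regression: for each row of test_x, average the target values of the
# k nearest rows of train_x (hypothetical helper, unweighted mean assumed)
knn_reg <- function(train_x, train_y, test_x, k = 3) {
  d2 <- cross_dist2(test_x, train_x)
  # column j holds the indices of the k nearest training rows for query j
  idx <- apply(d2, 1, function(r) order(r)[seq_len(k)])
  idx <- matrix(idx, nrow = k)  # keep a k x n shape even when k == 1
  colMeans(matrix(train_y[idx], nrow = k))
}

# Small example with one predictor
train_x <- matrix(c(0, 1, 2, 3, 10), ncol = 1)
train_y <- c(0, 1, 2, 3, 10)
test_x  <- matrix(c(0.9, 9), ncol = 1)
pred <- knn_reg(train_x, train_y, test_x, k = 2)
print(pred)  # 0.5 6.5
```

The full distance matrix has nrow(test_x) * nrow(train_x) entries, so for big matrices one would probably have to process test_x in chunks of rows to keep memory use bounded.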
Hans Werner Borchers
ABB Corporate Research