[R] NaiveBayes fails with one input variable (caret and klaR packages)
Hello,

We have a system that creates thousands of regression/classification models, and in cases where we have only one input variable NaiveBayes throws an error. Maybe I am mistaken and should not expect to be able to build a model with only one input variable.

We use R version 2.6.0 (2007-10-03). We use caret (v4.1.19), but have tested similar code with klaR (v0.5.8), because caret relies on the NaiveBayes implementation from klaR. I get different error messages from caret than from klaR, so I provide the code for both.

Here is the code, which uses the iris dataset.

> library(klaR);
Loading required package: MASS
> X<-iris["Sepal.Length"];
> Y<-iris["Species"];
> mnX<-as.matrix (X);
> mnY<-as.matrix (Y);
> cY<-factor(mnY);
> d <- data.frame (cbind(mnX,cY));
> m<-NaiveBayes(cY~mnX, data=d);
> predict(m);
Error in as.vector(x, mode) : invalid argument 'mode'

> library(caret);
Loading required package: lattice
> mCaret<-train(mnX,cY,method="nb",trControl = trainControl(method = "cv", number = 10));
Loading required package: class
Fitting: usekernel=TRUE
Fitting: usekernel=FALSE
> predicted <- predict(mCaret, newdata=mnX);
Error in 1:nrow(newdata) : NA/NaN argument

We use caret to call NaiveBayes, and we do not get any error messages when the number of input variables is greater than 1.

Cheers
DK

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
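One plausible cause, offered as an assumption rather than a diagnosis confirmed by the post: errors that appear only with a single predictor often come from code that subsets a one-column matrix or data frame without drop = FALSE, so the object silently collapses to a plain vector and calls such as nrow() no longer return a number. The base R sketch below only illustrates that mechanism; the variable names are hypothetical and it does not call klaR or caret.

```r
# One-column inputs taken from iris, as in the post.
X_mat <- as.matrix(iris["Sepal.Length"])
X_df  <- iris["Sepal.Length"]

# Default subsetting drops the column dimension and yields a plain vector.
dropped_mat <- X_mat[, 1]
dropped_df  <- X_df[, 1]

# drop = FALSE keeps the matrix / data frame structure.
kept_mat <- X_mat[, 1, drop = FALSE]
kept_df  <- X_df[, 1, drop = FALSE]

# A plain vector has no dimensions, so nrow() returns NULL for it.
print(is.null(nrow(dropped_mat)))
print(dim(kept_mat))
print(class(kept_df))
```

If this is indeed what happens inside the modelling code, passing the single predictor as a one-column data frame with a column name may be worth trying, though that is untested here.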
[R] nearZeroVar in caret fails
I am using R version 2.6.0 on Linux (CentOS 4.5) and have a problem executing the nearZeroVar function in the caret package. I am using the latest release of caret, v4.17.

I have a matrix X with 266 rows and 4 columns, and when I call nearZeroVar from the caret package I get the following error message:

> C <- nearZeroVar(X);
Error in table(data, useNA = "no") :
  all arguments must have the same length
Calls: nearZeroVar -> apply -> FUN -> table

I executed the commands in nearZeroVar step by step and found that it fails when it tries

> t<- table(X,useNa = "no")
Error in table(X, useNa = "no") :
  all arguments must have the same length

If I try without useNa="no" it works fine:

> t<- table(X)
> t
X
   0    1    9   11   12   14   17   18   21   22   37   39   66  123
1026   10    1    5    8    1    1    1    1    1    2    4    1    2

When I looked at the code for table, there is no mention of useNa:

> table
function (..., exclude = c(NA, NaN), dnn = list.names(...), deparse.level = 1)
{

Could this be a problem with my current version of R, 2.6? According to its specification, caret depends on R > 2.5.1.

Thanks in advance
DK
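A hedged reading of the error: the table signature quoted above has no useNA argument, so a named argument it does not recognise is absorbed by `...` and treated as one more factor to cross-tabulate. Its length (1) then differs from the length of the data, which would produce exactly this message. The sketch below reproduces the mechanism in base R with a deliberately unrecognised argument name (useNa, as typed in the post) and then uses exclude, which does appear in the quoted signature. The vector x is made-up illustration data.

```r
# Made-up data with one missing value.
x <- c(0, 0, 1, 9, NA)

# An argument name that table() does not recognise falls into `...`
# and is treated as a second factor of length 1, hence the error.
res <- tryCatch(
  table(x, useNa = "no"),
  error = function(e) conditionMessage(e)
)
print(res)

# `exclude` is part of the signature quoted in the post and drops NA/NaN.
counts <- table(x, exclude = c(NA, NaN))
print(counts)
```

On an R installation whose table lacks useNA, calling table(X, exclude = c(NA, NaN)) directly, or upgrading R to a version that the installed caret release supports, would be the two obvious things to try.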
[R] predict.rpart question
Dear All,

I have a question regarding predict.rpart. I use rpart to build classification and regression trees, and I work with data that has a relatively large number of input variables (predictors). For example, I build an rpart model like this:

rpartModel <- rpart(Y ~ X, method="class", minsplit=1, minbucket=nMinBucket, cp=nCp);

and get the predictors used in building the model like this:

colnamesUsed <- unique(rownames(rpartModel$splits));

When I later apply the rpart model to new data, I strip the unnecessary columns from the input data and keep only the X columns that exist in colnamesUsed. Unfortunately, I get an error message like this:

Error: variable 'X' was fitted with type "nmatrix.3522" but type "nmatrix.19" was supplied

The error message is correct. The documentation clearly specifies that the predictors referred to on the right side of formula(object) must be present by name in newdata, but I wonder why, if they are not used?

Thanks
DK
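Some context that may help, hedged as my understanding rather than an authoritative answer: with Y ~ X and X a matrix, the whole matrix is a single model-frame variable whose recorded type includes its column count ("nmatrix.3522"), so a narrower matrix fails the type check before any split is evaluated. Also, the rows of rpartModel$splits cover competitor and surrogate splits as well as primary ones, whereas rpartModel$frame$var lists the variables of the primary splits. The sketch below assumes the predictors can be kept as named data frame columns and refits on the used columns only; iris and all object names are illustrative stand-ins, not the poster's data.

```r
library(rpart)

# Fit on a data frame so each predictor is a separately named variable.
fit <- rpart(Species ~ ., data = iris, method = "class")

# Variables used in primary splits ("<leaf>" marks terminal nodes).
used <- setdiff(unique(as.character(fit$frame$var)), "<leaf>")
print(used)

# predict() rebuilds the model frame from the original formula,
# so newdata must still contain every predictor named in it.
pred_full <- predict(fit, newdata = iris, type = "class")

# To predict from fewer columns, refit on the used predictors only.
small <- iris[, c(used, "Species")]
fit_small <- rpart(Species ~ ., data = small, method = "class")
pred_small <- predict(fit_small,
                      newdata = small[, used, drop = FALSE],
                      type = "class")
```

The refitted tree is not guaranteed to be identical to the original one, so it would be sensible to compare the two sets of predictions before relying on the smaller model.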
[R] genetic algorithms in solving classification problems
Hello,

I am planning to use genetic algorithms to solve classification problems. As far as I can see, genalg is the only package I can use. There are also the gafit and rgenoud packages, but they can be used for regression and optimisation problems, not for classification.

Is there any other R package that I can use? Any ideas on how to apply a GA to a classification problem without reinventing the wheel would be much appreciated. I can see that there is some good stuff in C++ for Matlab, but I am keen to do it in R.

Thanks.

Damjan Krstajic
Director
Research Centre for Cheminformatics
www.rcc.org.yu
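One common way to connect a GA to classification is to let it search over feature subsets, with a classifier's accuracy as the fitness. The base R sketch below is only an illustration of that idea under stated assumptions: it uses iris, a simple nearest-centroid classifier, training accuracy as fitness (a real application would use cross-validation), tournament selection, one-point crossover and bit-flip mutation. All names and parameter values (fitness, pop_size, n_gen, p_mut) are hypothetical choices, not taken from any package mentioned in the post.

```r
set.seed(1)

X <- as.matrix(iris[, 1:4])
y <- iris$Species

# Fitness: training accuracy of a nearest-centroid classifier
# restricted to the columns selected by the logical mask.
fitness <- function(mask) {
  if (!any(mask)) return(0)
  Xs <- X[, mask, drop = FALSE]
  # One centroid per class (rows = classes, columns = features).
  centroids <- apply(Xs, 2, function(col) tapply(col, y, mean))
  # Squared distance from every sample to every class centroid.
  d <- sapply(seq_len(nrow(centroids)), function(k) {
    rowSums(sweep(Xs, 2, centroids[k, ])^2)
  })
  pred <- levels(y)[max.col(-d, ties.method = "first")]
  mean(pred == y)
}

n_bits   <- ncol(X)
pop_size <- 20
n_gen    <- 15
p_mut    <- 0.1

# Random initial population: one logical mask per row.
pop <- matrix(runif(pop_size * n_bits) < 0.5, nrow = pop_size)

for (gen in seq_len(n_gen)) {
  fit <- apply(pop, 1, fitness)
  elite <- pop[which.max(fit), ]          # elitism: keep the best mask
  pick <- function() {                    # tournament selection, size 2
    i <- sample(pop_size, 2)
    pop[i[which.max(fit[i])], ]
  }
  children <- t(replicate(pop_size - 1, {
    a <- pick()
    b <- pick()
    pt <- sample(n_bits - 1, 1)           # one-point crossover
    child <- c(a[seq_len(pt)], b[(pt + 1):n_bits])
    xor(child, runif(n_bits) < p_mut)     # bit-flip mutation
  }))
  pop <- rbind(elite, children)
}

fit  <- apply(pop, 1, fitness)
best <- pop[which.max(fit), ]
print(colnames(X)[best])
print(max(fit))
```

The same loop could wrap any classifier by swapping the body of fitness; only the encoding of a candidate solution as a logical vector is specific to feature selection.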