Hi,

I am a graduate student applying published R scripts to compare the classification accuracy of two predictive models, one built using discriminant function analysis (DFA) and one using random forests (RF). A link to the webpage for these scripts is provided below. The purpose of these models is to predict the biotic integrity of streams.

Specifically, I am trying to compare the classification accuracy (i.e., prediction of group membership) of the DFA and RF models using k-fold cross-validation for the following metrics: area under the ROC curve (AUC), percent correctly classified, specificity, sensitivity, and Kappa. I would also like to obtain the F statistic, Wilks' lambda, and MSE or RMSE for the random forest models, as the script does not contain code to produce these.

I think I need to use the caret package to obtain the classification accuracy, but I keep getting error messages when I apply the train function to my data. I am relatively new to R, and my thesis committee is unable to help because they are also unfamiliar with R, so I thought it best to ask here. Would someone be willing to help me?
Thanks,
Robin

http://www.epa.gov/wed/pages/models/rivpacs/rivpacs.htm

> TrainDataDFAgrps2 <-predcal
> TrainClassesDFAgrps2 <-grp.2;
> DFAgrps2Fit1 <- train(TrainDataDFAgrps2, TrainClassesDFAgrps2,
+     method = "lda",
+     tuneLength = 10,
+     trControl = trainControl(method = "cv"));
Error in train.default(TrainDataDFAgrps2, TrainClassesDFAgrps2, method = "lda", :
  wrong model type for regression

> RFgrps2Fit1 <- train(TrainDataRFgrps2, TrainClassesRFgrps2,
+     method = "rf",
+     tuneLength = 10,
+     trControl = trainControl(method = "cv"));
There were 50 or more warnings (use warnings() to see the first 50)

Clip of predcal (same length as grp.2, but too much data to display all):

> predcal
     Reference_Test HUC12_AREA_HA_log10 ELEV_m M_Slp_sqt Precip_mm Temp_CX10
2370              R                 3.7  588.0       2.2      1751       148
559               R                 4.0  643.1       1.8      1674       141
2062              R                 4.0  643.1       1.8      1674       141
2467              R                 4.0  643.1       1.8      1674       141
1176              R                 3.9  694.3       2.4      1534       131
1840              R                 3.9  694.3       2.4      1534       131
2052              R                 3.9  694.3       2.4      1534       131
1174              R                 4.1  605.0       2.1      1382       138
1841              R                 4.1  605.0       2.1      1382       138
2051              R                 4.1  605.0       2.1      1382       138
1831              R                 4.1  363.9       1.7       937       156

Grps.2:

> grp.2
  [1] 1 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 1 2 1 1
 [45] 2 2 1 1 1 1 1 1 1 2 2 1 1 1 2 2 1 2 2 1 1 1 2 2 2 2 2 2 1 1 1 2 2 2 1 2 2 2 2 2 2 2 2 1
 [89] 1 2 2 2 2 2 1 1 2 2 2 1 2 1 2 2 1 2 1 1 2
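
A note on the "wrong model type for regression" error above: grp.2 prints as plain numbers (1 and 2), and caret's train treats a numeric outcome as a regression problem, which "lda" does not support. The same cause would plausibly explain the warnings from the "rf" call. The following is only a minimal sketch of one possible fix, converting the classes to a factor before calling train. It uses synthetic stand-in data, not the real predcal and grp.2. The names n, grp, predictors, classes, ctrl, fit, and cm are hypothetical, and the sketch assumes the caret and MASS packages are installed.

```r
library(caret)  # method = "lda" also relies on the MASS package

set.seed(1)

# Synthetic stand-ins for predcal and grp.2 (hypothetical data, 109 sites)
n   <- 109
grp <- rep(c(1, 2), length.out = n)
predictors <- data.frame(
  ELEV_m    = rnorm(n, mean = 600 + 50 * grp, sd = 60),
  Precip_mm = rnorm(n, mean = 1500 - 100 * grp, sd = 200)
)

# Key step: a factor outcome makes train() fit a classification model
classes <- factor(grp, levels = c(1, 2), labels = c("grp1", "grp2"))

# 10-fold cross-validation, keeping the held-out predictions
ctrl <- trainControl(method = "cv", number = 10, savePredictions = "final")

fit <- train(x = predictors, y = classes, method = "lda", trControl = ctrl)

# Cross-validated accuracy (percent correctly classified) and Kappa
print(fit$results[, c("Accuracy", "Kappa")])

# Sensitivity and specificity from the pooled held-out predictions
cm <- confusionMatrix(data = fit$pred$pred, reference = fit$pred$obs)
print(cm$byClass[c("Sensitivity", "Specificity")])
```

With the real data, the equivalent change would be passing factor(grp.2) as the class vector. The Reference_Test column in predcal is not numeric, so it may also need to be dropped from the predictors first. This sketch does not cover AUC, Wilks' lambda, or the F statistic.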