Does caret have a bug calculating ROC with earth?

When I use caret with earth on any of my data sets, the ROC that caret reports never varies across values of nprune. That could mean earth is finding the same model each time (for example, because nprune is set too high). But if that were the case, sensitivity and specificity would not vary either, and they do. I also verified that nprune is not too high: the unconstrained earth fit selects 9 terms, and I tuned nprune over 1 to 9.

I am attaching sample output from R 2.14.0 on Windows 7 64-bit with earth 3.2 and caret 5.07. I don't have this problem with caret and ctree.

Andrew
R version 2.14.0 (2011-10-31)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # install and load packages, as needed
> for (pkg in c('caret','earth','mlbench', 'e1071')) {
+   if (!require(pkg, character.only=T)) {install.packages(pkg)}
+   require(pkg, character.only=T)
+ }
Loading required package: caret
Loading required package: lattice
Loading required package: reshape
Loading required package: plyr

Attaching package: reshape

The following object(s) are masked from package:plyr:

    rename, round_any

Loading required package: cluster
Loading required package: foreach
Loading required package: iterators
Loading required package: codetools
foreach: simple, scalable parallel programming from Revolution Analytics
Use Revolution R for scalability, fault tolerance and more.
http://www.revolutionanalytics.com
Loading required package: earth
Loading required package: leaps
Loading required package: plotmo
Loading required package: plotrix
Loading required package: mlbench
Loading required package: e1071
Loading required package: class

Attaching package: class

The following object(s) are masked from package:reshape:

    condense

>
> # system information
> installed.packages()[c('earth','caret'),'Version']
     earth      caret
   "3.2-1" "5.07-001"
>
>
> # prepare data
> data(etitanic)
> mydata <- etitanic
> mydata$survived <- as.factor(ifelse(etitanic$survived==1, 'T', 'F'))
> summary(mydata)
 pclass    survived     sex           age               sibsp            parch
 1st:284   F:619    female:388   Min.   : 0.1667   Min.   :0.0000   Min.   :0.0000
 2nd:261   T:427    male  :658   1st Qu.:21.0000   1st Qu.:0.0000   1st Qu.:0.0000
 3rd:501                         Median :28.0000   Median :0.0000   Median :0.0000
                                 Mean   :29.8811   Mean   :0.5029   Mean   :0.4207
                                 3rd Qu.:39.0000   3rd Qu.:1.0000   3rd Qu.:1.0000
                                 Max.   :80.0000   Max.   :8.0000   Max.   :6.0000
>
> # show natural maximum pruning is 9
> fit <- earth(survived ~ ., data=mydata)
> summary(fit, style="max")
Call: earth(formula=survived~., data=mydata)

T =
  1.094732
  - 0.2113713 * max(0, pclass2nd - 0)
  - 0.3413489 * max(0, pclass3rd - 0)
  - 0.4851343 * max(0, sexmale - 0)
  - 0.004222467 * max(0, age - 10)
  + 0.02569032 * max(0, 10 - age)
  - 0.09699376 * max(0, sibsp - 1)
  - 0.06266133 * max(0, parch - 1)
  - 0.09015484 * max(0, 1 - parch)

Selected 9 of 10 terms, and 6 of 6 predictors
Importance: sexmale, pclass3rd, age, pclass2nd, sibsp, parch
Number of terms at each degree of interaction: 1 8 (additive model)
GCV 0.1519922    RSS 153.8581    GRSq 0.3720351    RSq 0.3911174
>
> # custom metric
> twoClassSummaryPlus <- function (data,
+                                  lev = NULL,
+                                  model = NULL)
+
+ {
+   out1 <- twoClassSummary(data, lev, model)
+   out2 <- defaultSummary(data, lev, model)
+   #browser() # debug
+   #print(out1)
+   #print(dim(data))
+   c(out1, out2)
+ }
>
>
> # tne
> train_earth <- function(nprune)
+ {
+   # prepare tuning parameters
+   grid <- expand.grid(.degree=c(1), .nprune=nprune)
+
+   trControl<- trainControl(summaryFunction = twoClassSummaryPlus,
+                            classProbs = T,
+                            verboseIter=T)
+
+   # tune
+   mydata.best <- train(survived ~ .,
+                        data = mydata,
+                        method = "earth",
+                        trControl = trControl,
+                        metric="Sens",
+                        tuneGrid=grid)
+
+   # show tuned
+   print(mydata.best)
+ }
>
> train_earth(c(1:9)) # ROC is constant
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Fitting: degree=1, nprune=9
Aggregating results
Selecting tuning parameters
Fitting model on full training set
1046 samples
   6 predictors
   2 classes: 'F', 'T'

No pre-processing
Resampling: Bootstrap (25 reps)

Summary of sample sizes: 1046, 1046, 1046, 1046, 1046, 1046, ...

Resampling results across tuning parameters:

  nprune  ROC    Sens   Spec   Accuracy  Kappa  ROC SD  Sens SD  Spec SD  Accuracy SD  Kappa SD
  1       0.843  1      0      0.588     0      0.0154  0        0        0.0239       0
  2       0.843  0.845  0.684  0.779     0.537  0.0154  0.0209   0.0318   0.0191       0.0393
  3       0.843  0.845  0.685  0.779     0.537  0.0154  0.0217   0.0326   0.0191       0.0392
  4       0.843  0.846  0.694  0.784     0.547  0.0154  0.0232   0.0343   0.0203       0.0412
  5       0.843  0.842  0.714  0.789     0.561  0.0154  0.0236   0.0344   0.0184       0.037
  6       0.843  0.848  0.718  0.794     0.57   0.0154  0.0222   0.0349   0.0182       0.0367
  7       0.843  0.84   0.727  0.793     0.57   0.0154  0.0279   0.0357   0.0163       0.0324
  8       0.843  0.84   0.723  0.792     0.567  0.0154  0.0276   0.0375   0.0161       0.0317
  9       0.843  0.84   0.721  0.791     0.565  0.0154  0.026    0.0389   0.0161       0.0322

Tuning parameter 'degree' was held constant at a value of 1
Sens was used to select the optimal model using the largest value.
The final values used for the model were degree = 1 and nprune = 1.
There were 15 warnings (use warnings() to see them)
>
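A possible cross-check, offered only as a sketch: fit earth directly at each nprune, outside of caret, and compute the AUC by hand from the predicted probabilities. If these values differ across nprune while caret's ROC column stays fixed, that would suggest the constant value comes from how the class probabilities are produced during resampling rather than from the models themselves. The helper auc_rank is a made-up name for this illustration, and passing glm=list(family=binomial) to earth to get probabilities is my assumption about a reasonable setup, not necessarily what caret does internally. The AUC here is computed on the training data, so it is not expected to match the resampled 0.843.

```r
# Rank-based (Mann-Whitney) AUC, base R only.
# scores: numeric scores; is_pos: logical, TRUE for the positive class.
# auc_rank is a hypothetical helper written for this check.
auc_rank <- function(scores, is_pos) {
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  r <- rank(scores)  # tied scores receive average ranks
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# The part below needs the earth package (which also provides etitanic).
if (requireNamespace("earth", quietly = TRUE)) {
  library(earth)
  data(etitanic)
  mydata <- etitanic
  mydata$survived <- as.factor(ifelse(etitanic$survived == 1, "T", "F"))

  # nprune = 1 is the intercept-only model: every case gets the same
  # score, so its AUC is 0.5 by construction and it is skipped here.
  for (k in 2:9) {
    # Assumption: a binomial GLM on the earth basis gives P(survived == "T")
    fit <- earth(survived ~ ., data = mydata, degree = 1, nprune = k,
                 glm = list(family = binomial))
    p <- as.numeric(predict(fit, newdata = mydata, type = "response"))
    cat(sprintf("nprune=%d  training AUC=%.3f\n",
                k, auc_rank(p, mydata$survived == "T")))
  }
}
```

The rank formula is used so that the check does not depend on any ROC package; any AUC implementation should give the same ordering of models.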
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.