Re: [R] how is the model resample performance calculated by caret?

2014-02-28 Thread Max Kuhn
On Fri, Feb 28, 2014 at 1:13 AM, zhenjiang zech xu
zhenjiang...@gmail.com wrote:
 Dear all,

 I ran 10-fold cross-validation, repeated 5 times, using the partial least
 squares regression model provided by the caret package. Can anyone tell me how
 the values in plsTune$resample are calculated? Are they predicted on each
 hold-out set using a model trained on the rest of the data with the optimal
 parameter tuned from the previous cross-validation?

Yes, those values are the performance estimates on each hold-out set
for the final model's tuning parameter value. There is an option in
trainControl() (returnResamp = "all") that will have it return the
resamples from all candidate models too.
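
For readers following along, here is a minimal sketch of that option. The
data are simulated and the sizes, seed and tuneLength are assumptions made
only for illustration; they are not the poster's 524 x 615 data set. It needs
the caret and pls packages installed.

```r
# Sketch only: simulated data stand in for the poster's data set.
library(caret)   # the 'pls' package must also be installed

set.seed(1)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
y <- as.numeric(x[, 1:3] %*% c(3, -2, 1) + rnorm(n))

ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 5,
                     returnResamp = "all",         # the default is "final"
                     selectionFunction = "oneSE")  # one SE rule, as in the post

plsTune <- train(x, y, method = "pls",
                 preProcess = c("center", "scale"),
                 tuneLength = 5, trControl = ctrl)

# With returnResamp = "all" there is one row per (ncomp, fold, repeat).
dim(plsTune$resample)

# Keeping only the selected ncomp gives the rows that
# returnResamp = "final" would have returned.
final <- merge(plsTune$resample, plsTune$bestTune)
nrow(final)
```

With 5 candidate values of ncomp and 10 x 5 resamples, the full table should
have 250 rows and the subset for the selected ncomp 50 rows.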

 So in the following example, the 10-fold cross-validation repeated 5 times
 first gives ncomp = 2 as the best value; then a model is built with ncomp = 2
 on the training data and used to predict the hold-out data, giving an RMSE
 and R-squared. Is my understanding correct?

It is.
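
The per-hold-out calculation described above can be sketched in base R. This
is an illustration under stated assumptions, not caret's internal code: the
data are simulated, lm() stands in for a PLS fit with ncomp held fixed, and
the folds are drawn by plain random assignment. RMSE and R-squared follow the
definitions I understand caret's postResample() to use (root mean squared
error, and squared correlation between observed and predicted values).

```r
# Minimal base-R sketch of per-hold-out performance in repeated k-fold CV.
# Assumptions: simulated data; lm() stands in for PLS with ncomp fixed.
set.seed(42)
n <- 120
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 2 * dat$x1 - dat$x2 + rnorm(n)

k <- 10; repeats <- 5
res <- do.call(rbind, lapply(seq_len(repeats), function(r) {
  fold <- sample(rep(seq_len(k), length.out = n))   # random fold labels
  do.call(rbind, lapply(seq_len(k), function(f) {
    held <- fold == f
    fit  <- lm(y ~ x1 + x2, data = dat[!held, ])    # train on the other k-1 folds
    pred <- predict(fit, newdata = dat[held, ])     # predict the hold-out fold
    obs  <- dat$y[held]
    data.frame(RMSE     = sqrt(mean((obs - pred)^2)),
               Rsquared = cor(obs, pred)^2,
               Resample = sprintf("Fold%02d.Rep%d", f, r))
  }))
}))

head(res)                                 # one row per hold-out set
colMeans(res[, c("RMSE", "Rsquared")])    # analogous to one row of the tuning table
```

Each row comes from a model fit to the other nine folds of that repeat. As
far as I understand caret, the rows in plsTune$resample are the ones already
computed this way for the selected ncomp during tuning, so no second
resampling pass is involved.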

Max



 > plsTune
 524 samples
 615 predictors

 Pre-processing: centered, scaled
 Resampling: Cross-Validation (10 fold, repeated 5 times)

 Summary of sample sizes: 472, 472, 471, 471, 471, 471, ...

 Resampling results across tuning parameters:

   ncomp  RMSE  Rsquared  RMSE SD  Rsquared SD
   1      16.8  0.434     1.47     0.0616
   2      14.3  0.612     2.21     0.0768
   3      13.5  0.704     6.33     0.145
   4      14.6  0.706     9.29     0.163
   5      15.2  0.703     10.9     0.172
   6      16.5  0.69      13.4     0.181
   7      18.4  0.672     17.8     0.194
   8      20    0.651     20.4     0.199
   9      20.9  0.634     20.9     0.199
   10     22.1  0.613     22.1     0.197
   11     23.3  0.599     23.8     0.198
   12     24    0.588     24.7     0.198
   13     24.9  0.572     25.2     0.197
   14     25.8  0.557     26.2     0.194
   15     26.2  0.544     25.8     0.191
   16     26.6  0.532     25.5     0.187

 RMSE was used to select the optimal model using  the one SE rule.
 The final value used for the model was ncomp = 2.
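
A short worked check of why the one SE rule lands on ncomp = 2 here, using the
rounded values printed in the table above. The sketch assumes the standard
error is the resampled SD divided by the square root of the number of
resamples (50 here), which is how I understand caret's oneSE() to compute it;
the variable names are mine.

```r
# Tuning results as printed in the post (rounded). RMSE for ncomp 8 and 12
# is read as 20 and 24 from the fused columns in the archived output.
tab <- data.frame(
  ncomp  = 1:16,
  RMSE   = c(16.8, 14.3, 13.5, 14.6, 15.2, 16.5, 18.4, 20, 20.9, 22.1,
             23.3, 24, 24.9, 25.8, 26.2, 26.6),
  RMSESD = c(1.47, 2.21, 6.33, 9.29, 10.9, 13.4, 17.8, 20.4, 20.9, 22.1,
             23.8, 24.7, 25.2, 26.2, 25.8, 25.5)
)
num_resamples <- 50   # 10 folds x 5 repeats

best   <- which.min(tab$RMSE)                     # numerically best row (ncomp = 3)
limit  <- tab$RMSE[best] + tab$RMSESD[best] / sqrt(num_resamples)
chosen <- min(tab$ncomp[tab$RMSE <= limit])       # simplest model within one SE

limit    # about 14.4
chosen   # 2
```

The lowest RMSE is 13.5 at ncomp = 3; adding one standard error (6.33 /
sqrt(50), about 0.9) gives a limit of about 14.4, and ncomp = 2 with RMSE 14.3
is the simplest model under that limit.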

 > plsTune$resample
    ncomp RMSE     Rsquared  Resample
 1  2 13.61569 0.6349700 Fold06.Rep4
 2  2 16.02091 0.5808985 Fold05.Rep1
 3  2 12.59985 0.6008357 Fold03.Rep5
 4  2 13.20069 0.6296245 Fold02.Rep3
 5  2 12.43419 0.6560434 Fold04.Rep2
 6  2 15.36510 0.5954177 Fold04.Rep5
 7  2 12.70028 0.6894489 Fold03.Rep2
 8  2 13.34882 0.6468300 Fold09.Rep3
 9  2 14.80217 0.5575010 Fold08.Rep3
 10 2 19.03705 0.4907630 Fold05.Rep4
 11 2 14.26704 0.6579390 Fold10.Rep2
 12 2 13.79060 0.5806663 Fold05.Rep3
 13 2 14.83641 0.5918039 Fold05.Rep2
 14 2 12.48721 0.7011439 Fold01.Rep3
 15 2 14.98765 0.5866102 Fold07.Rep4
 16 2 10.88100 0.7597167 Fold06.Rep1
 17 2 13.60705 0.6321377 Fold08.Rep5
 18 2 13.42618 0.6136031 Fold08.Rep4
 19 2 13.26066 0.6784586 Fold07.Rep1
 20 2 13.20623 0.6812341 Fold03.Rep3
 21 2 18.54275 0.4404729 Fold08.Rep2
 22 2 11.80312 0.7177681 Fold05.Rep5
 23 2 18.56271 0.4661072 Fold03.Rep1
 24 2 13.54879 0.5850439 Fold10.Rep3
 25 2 14.10859 0.5994811 Fold06.Rep5
 26 2 13.68329 0.6701091 Fold01.Rep5
 27 2 16.12123 0.5401200 Fold10.Rep1
 28 2 12.92250 0.6917220 Fold06.Rep3
 29 2 12.94366 0.6400066 Fold06.Rep2
 30 2 12.39889 0.6790578 Fold01.Rep2
 31 2 13.48499 0.6759649 Fold01.Rep1
 32 2 12.52938 0.6728476 Fold03.Rep4
 33 2 16.43352 0.5795160 Fold09.Rep5
 34 2 12.53991 0.6550694 Fold09.Rep4
 35 2 12.78708 0.6304606 Fold08.Rep1
 36 2 13.97559 0.6655688 Fold04.Rep3
 37 2 15.31642 0.5124997 Fold09.Rep2
 38 2 15.24194 0.5324943 Fold09.Rep1
 39 2 12.90107 0.6318960 Fold04.Rep1
 40 2 13.59574 0.6277869 Fold01.Rep4
 41 2 19.73633 0.4154821 Fold07.Rep5
 42 2 12.03759 0.6537381 Fold02.Rep5
 43 2 15.47139 0.5597097 Fold02.Rep4
 44 2 22.55060 0.3816672 Fold07.Rep3
 45 2 14.57875 0.6269560 Fold07.Rep2
 46 2 13.02385 0.6395148 Fold02.Rep2
 47 2 13.81020 0.6116137 Fold02.Rep1
 48 2 13.46100 0.6200828 Fold04.Rep4
 49 2 13.95487 0.6709253 Fold10.Rep5
 50 2 12.65981 0.6606435 Fold10.Rep4
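
The link between this listing and the tuning table can be checked directly:
averaging the 50 hold-out RMSE values should reproduce the ncomp = 2 row. The
vector below is copied from the RMSE column above; only the variable name is
mine.

```r
# RMSE column of plsTune$resample, rows 1 to 50, as posted.
rmse <- c(13.61569, 16.02091, 12.59985, 13.20069, 12.43419,
          15.36510, 12.70028, 13.34882, 14.80217, 19.03705,
          14.26704, 13.79060, 14.83641, 12.48721, 14.98765,
          10.88100, 13.60705, 13.42618, 13.26066, 13.20623,
          18.54275, 11.80312, 18.56271, 13.54879, 14.10859,
          13.68329, 16.12123, 12.92250, 12.94366, 12.39889,
          13.48499, 12.52938, 16.43352, 12.53991, 12.78708,
          13.97559, 15.31642, 15.24194, 12.90107, 13.59574,
          19.73633, 12.03759, 15.47139, 22.55060, 14.57875,
          13.02385, 13.81020, 13.46100, 13.95487, 12.65981)

mean(rmse)   # about 14.25, printed as 14.3 in the table
sd(rmse)     # about 2.21, the "RMSE SD" column
```

Both values agree with the ncomp = 2 row of the table (RMSE 14.3, RMSE SD
2.21) after rounding.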

 Best,
 Zhenjiang


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[R] how is the model resample performance calculated by caret?

2014-02-27 Thread zhenjiang zech xu
Dear all,

I ran 10-fold cross-validation, repeated 5 times, using the partial least
squares regression model provided by the caret package. Can anyone tell me how
the values in plsTune$resample are calculated? Are they predicted on each
hold-out set using a model trained on the rest of the data with the optimal
parameter tuned from the previous cross-validation? So in the following
example, the 10-fold cross-validation repeated 5 times first gives ncomp = 2
as the best value; then a model is built with ncomp = 2 on the training data
and used to predict the hold-out data, giving an RMSE and R-squared. Is my
understanding correct?


> plsTune
524 samples
615 predictors

Pre-processing: centered, scaled
Resampling: Cross-Validation (10 fold, repeated 5 times)

Summary of sample sizes: 472, 472, 471, 471, 471, 471, ...

Resampling results across tuning parameters:

  ncomp  RMSE  Rsquared  RMSE SD  Rsquared SD
  1      16.8  0.434     1.47     0.0616
  2      14.3  0.612     2.21     0.0768
  3      13.5  0.704     6.33     0.145
  4      14.6  0.706     9.29     0.163
  5      15.2  0.703     10.9     0.172
  6      16.5  0.69      13.4     0.181
  7      18.4  0.672     17.8     0.194
  8      20    0.651     20.4     0.199
  9      20.9  0.634     20.9     0.199
  10     22.1  0.613     22.1     0.197
  11     23.3  0.599     23.8     0.198
  12     24    0.588     24.7     0.198
  13     24.9  0.572     25.2     0.197
  14     25.8  0.557     26.2     0.194
  15     26.2  0.544     25.8     0.191
  16     26.6  0.532     25.5     0.187

RMSE was used to select the optimal model using  the one SE rule.
The final value used for the model was ncomp = 2.

> plsTune$resample
   ncomp RMSE     Rsquared  Resample
1  2 13.61569 0.6349700 Fold06.Rep4
2  2 16.02091 0.5808985 Fold05.Rep1
3  2 12.59985 0.6008357 Fold03.Rep5
4  2 13.20069 0.6296245 Fold02.Rep3
5  2 12.43419 0.6560434 Fold04.Rep2
6  2 15.36510 0.5954177 Fold04.Rep5
7  2 12.70028 0.6894489 Fold03.Rep2
8  2 13.34882 0.6468300 Fold09.Rep3
9  2 14.80217 0.5575010 Fold08.Rep3
10 2 19.03705 0.4907630 Fold05.Rep4
11 2 14.26704 0.6579390 Fold10.Rep2
12 2 13.79060 0.5806663 Fold05.Rep3
13 2 14.83641 0.5918039 Fold05.Rep2
14 2 12.48721 0.7011439 Fold01.Rep3
15 2 14.98765 0.5866102 Fold07.Rep4
16 2 10.88100 0.7597167 Fold06.Rep1
17 2 13.60705 0.6321377 Fold08.Rep5
18 2 13.42618 0.6136031 Fold08.Rep4
19 2 13.26066 0.6784586 Fold07.Rep1
20 2 13.20623 0.6812341 Fold03.Rep3
21 2 18.54275 0.4404729 Fold08.Rep2
22 2 11.80312 0.7177681 Fold05.Rep5
23 2 18.56271 0.4661072 Fold03.Rep1
24 2 13.54879 0.5850439 Fold10.Rep3
25 2 14.10859 0.5994811 Fold06.Rep5
26 2 13.68329 0.6701091 Fold01.Rep5
27 2 16.12123 0.5401200 Fold10.Rep1
28 2 12.92250 0.6917220 Fold06.Rep3
29 2 12.94366 0.6400066 Fold06.Rep2
30 2 12.39889 0.6790578 Fold01.Rep2
31 2 13.48499 0.6759649 Fold01.Rep1
32 2 12.52938 0.6728476 Fold03.Rep4
33 2 16.43352 0.5795160 Fold09.Rep5
34 2 12.53991 0.6550694 Fold09.Rep4
35 2 12.78708 0.6304606 Fold08.Rep1
36 2 13.97559 0.6655688 Fold04.Rep3
37 2 15.31642 0.5124997 Fold09.Rep2
38 2 15.24194 0.5324943 Fold09.Rep1
39 2 12.90107 0.6318960 Fold04.Rep1
40 2 13.59574 0.6277869 Fold01.Rep4
41 2 19.73633 0.4154821 Fold07.Rep5
42 2 12.03759 0.6537381 Fold02.Rep5
43 2 15.47139 0.5597097 Fold02.Rep4
44 2 22.55060 0.3816672 Fold07.Rep3
45 2 14.57875 0.6269560 Fold07.Rep2
46 2 13.02385 0.6395148 Fold02.Rep2
47 2 13.81020 0.6116137 Fold02.Rep1
48 2 13.46100 0.6200828 Fold04.Rep4
49 2 13.95487 0.6709253 Fold10.Rep5
50 2 12.65981 0.6606435 Fold10.Rep4

Best,
Zhenjiang
