Re: [R] Custom caret metric based on prob-predictions/rankings

2012-02-10 Thread Yang Zhang
Actually, is there any way to get at additional information beyond the
classProbs?  In particular, is there any way to find out the
associated weights, or otherwise the row indices into the original
model matrix corresponding to the tested instances?

On Thu, Feb 9, 2012 at 4:37 PM, Yang Zhang yanghates...@gmail.com wrote:
> Oops, found trainControl's classProbs right after I sent!
>
> On Thu, Feb 9, 2012 at 4:30 PM, Yang Zhang yanghates...@gmail.com wrote:
>> I'm dealing with classification problems, and I'm trying to specify a
>> custom scoring metric (recall@p, ROC, etc.) that depends on not just
>> the class output but the probability estimates, so that caret::train
>> can choose the optimal tuning parameters based on this metric.
>>
>> However, when I supply a trainControl summaryFunction, the data given
>> to it contains only class predictions, so the only metrics possible
>> are things like accuracy, kappa, etc.
>>
>> Is there any way to do what I'm looking for?  If not, could I put
>> this in as a feature request?  Thanks!

-- 
Yang Zhang
http://yz.mit.edu/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Custom caret metric based on prob-predictions/rankings

2012-02-10 Thread Max Kuhn
I think you need to read the man pages and the four vignettes. A lot
of your questions have answers there.

If you don't specify the resampling indices, the ones generated for
you are saved in the train object:

> data(iris)
> TrainData <- iris[,1:4]
> TrainClasses <- iris[,5]
>
> knnFit1 <- train(TrainData, TrainClasses,
+                  method = "knn",
+                  preProcess = c("center", "scale"),
+                  tuneLength = 10,
+                  trControl = trainControl(method = "cv"))
Loading required package: class

Attaching package: ‘class’

The following object(s) are masked from ‘package:reshape’:

    condense

Warning message:
executing %dopar% sequentially: no parallel backend registered
> str(knnFit1$control$index)
List of 10
 $ Fold01: int [1:135] 1 2 3 4 5 6 7 9 10 11 ...
 $ Fold02: int [1:135] 1 2 3 4 5 6 8 9 10 12 ...
 $ Fold03: int [1:135] 1 3 4 5 6 7 8 9 10 11 ...
 $ Fold04: int [1:135] 1 2 3 5 6 7 8 9 10 11 ...
 $ Fold05: int [1:135] 1 2 3 4 6 7 8 9 11 12 ...
 $ Fold06: int [1:135] 1 2 3 4 5 6 7 8 9 10 ...
 $ Fold07: int [1:135] 1 2 3 4 5 7 8 9 10 11 ...
 $ Fold08: int [1:135] 2 3 4 5 6 7 8 9 10 11 ...
 $ Fold09: int [1:135] 1 2 3 4 5 6 7 8 9 10 ...
 $ Fold10: int [1:135] 1 2 4 5 6 7 8 10 11 12 ...

There is also a savePredictions argument that gives you the hold-out results.

I'm not sure which weights you are referring to.
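
Since control$index holds the rows used for fitting in each resample (135 of the 150 iris rows per fold above), the held-out rows of a fold are simply the complement. A minimal base-R sketch of that step follows. It uses a hand-built list shaped like knnFit1$control$index rather than a real train object, and the fold assignment and variable names are made up for illustration:

```r
# Stand-in for knnFit1$control$index: each element holds the TRAINING rows
# of one fold. The fold assignment below is synthetic, not caret's.
set.seed(1)
n <- 150
fold_id <- sample(rep(1:10, length.out = n))   # put each row in one of 10 folds
index <- lapply(1:10, function(f) which(fold_id != f))
names(index) <- sprintf("Fold%02d", 1:10)

# Hold-out rows are the complement of the training rows within 1..n
holdout <- lapply(index, function(train_rows) setdiff(seq_len(n), train_rows))

str(holdout[1:2])
```

With a real fit, the same lapply() call should work on knnFit1$control$index with n set to nrow(TrainData).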

-- 

Max



Re: [R] Custom caret metric based on prob-predictions/rankings

2012-02-10 Thread Yang Zhang
Sorry for not being clearer: I'm interested in accessing these
indices from within the trainControl summaryFunction, not afterward
(from the train object).

As for the weights, I'm referring to the weights argument passed into
train.
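
For what it is worth, my understanding is that later caret releases add a weights column to the data frame given to summaryFunction when case weights are passed to train, and possibly a rowIndex column as well. That is an assumption about versions newer than the one discussed in this thread, so the sketch below checks for both columns instead of relying on them. makeIndexedSummary, side_info, and the metric names are hypothetical, and the hold-out data frame is hand-built so the function can be tried without caret:

```r
# Factory for a summary function with the (data, lev, model) signature.
# `side_info` is any per-row vector aligned with the original training data.
makeIndexedSummary <- function(side_info) {
  function(data, lev = NULL, model = NULL) {
    # Use case weights if the column is present, otherwise equal weights
    w <- if ("weights" %in% names(data)) data$weights else rep(1, nrow(data))
    correct <- data$pred == data$obs
    out <- c(WeightedAccuracy = sum(w * correct) / sum(w))

    # If row indices are present, look up side information per held-out row
    if ("rowIndex" %in% names(data)) {
      cost <- side_info[data$rowIndex]
      wrong <- !correct
      out <- c(out, MeanErrorCost = mean(cost * wrong))
    }
    out
  }
}

# Hand-built stand-in for one set of hold-out predictions
held <- data.frame(
  obs      = factor(c("yes", "yes", "no", "no"), levels = c("yes", "no")),
  pred     = factor(c("yes", "no",  "no", "no"), levels = c("yes", "no")),
  weights  = c(2, 1, 1, 1),
  rowIndex = c(3, 7, 8, 10)
)
summary_fn <- makeIndexedSummary(side_info = as.numeric(1:10))
res <- summary_fn(held, lev = levels(held$obs))
print(res)
```

The closure keeps the side information outside the data frame, so the same pattern applies to anything indexed by original row, provided rowIndex is actually supplied.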

-- 
Yang Zhang
http://yz.mit.edu/



Re: [R] Custom caret metric based on prob-predictions/rankings

2012-02-10 Thread Yang Zhang
(I couldn't find answers to this question in the documentation)

-- 
Yang Zhang
http://yz.mit.edu/



Re: [R] Custom caret metric based on prob-predictions/rankings

2012-02-09 Thread Yang Zhang
Oops, found trainControl's classProbs right after I sent!
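
To make the classProbs route concrete: with classProbs = TRUE in trainControl, the data frame handed to summaryFunction should carry one probability column per class, named after the class levels, alongside obs and pred. Below is a rough sketch of a probability-based summary function. probSummary, top_frac, and the metric names are illustrative rather than part of caret, and the first factor level is assumed to be the class of interest. The hold-out data frame is hand-built so the function can be checked on its own:

```r
# Summary function with the (data, lev, model) signature; top_frac is an
# extra illustrative argument that falls back to its default.
probSummary <- function(data, lev = NULL, model = NULL, top_frac = 0.1) {
  if (is.null(lev)) lev <- levels(data$obs)
  positive <- lev[1]                 # assumption: first level is the event
  score <- data[[positive]]          # probability column named after the level
  is_pos <- data$obs == positive
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)

  # AUC via the rank-sum (Mann-Whitney) identity; ties get average ranks
  auc <- (sum(rank(score)[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

  # Recall among the top `top_frac` share of rows, ranked by score
  k <- max(1, ceiling(top_frac * nrow(data)))
  top <- order(score, decreasing = TRUE)[seq_len(k)]
  recall_at_p <- sum(is_pos[top]) / n_pos

  c(AUC = auc, RecallAtP = recall_at_p)
}

# Hand-built stand-in for one set of hold-out predictions
holdout <- data.frame(
  obs  = factor(c("yes", "yes", "no", "no", "no"), levels = c("yes", "no")),
  pred = factor(c("yes", "no", "no", "yes", "no"), levels = c("yes", "no")),
  yes  = c(0.9, 0.4, 0.3, 0.6, 0.1)
)
holdout$no <- 1 - holdout$yes
result <- probSummary(holdout, lev = levels(holdout$obs))
print(result)
```

To tune on it, one would pass trainControl(classProbs = TRUE, summaryFunction = probSummary) as trControl and set metric = "AUC" in train; the class levels need to be valid R names for the probability columns to be created.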

-- 
Yang Zhang
http://yz.mit.edu/
