
Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging

2009-09-24 Thread Tim Howard

Yes, that's exactly what I am after. Thank you for clarifying my problem for me!

I'll try to dive into the plot.performance function.

Best, 
Tim

>>> Tobias Sing  9/24/2009 9:57 AM >>>
Tim,

if I understand correctly, you are trying to get the numerical values
of averaged cross-validation curves.
Unfortunately the plot function of ROCR does not return anything in
the current version (it's a good suggestion to change this).

If you want a quick fix, you could change the plot.performance
function of ROCR so that it returns the values you want.
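
The general pattern, as a minimal sketch: a plotting function can draw and then hand back what it drew, which is what hist() and boxplot() do. The helper name plot_and_return and the toy coordinates are made up for illustration; this is not the actual plot.performance code.

```r
# Hypothetical helper, not part of ROCR: draw a curve and return
# the plotted coordinates to the caller.
plot_and_return <- function(x, y, ...) {
  plot(x, y, type = "l", ...)
  # invisible(): the value is returned but not auto-printed at top level
  invisible(data.frame(x = x, y = y))
}

pdf(NULL)  # throwaway graphics device, so the sketch also runs non-interactively
curve_dat <- plot_and_return(c(0, 0.2, 1), c(0, 0.7, 1),
                             xlab = "fpr", ylab = "tpr")
invisible(dev.off())
curve_dat  # the coordinates that were plotted
```

A patched plot method would follow the same idea: compute the averaged curve, plot it, and end with invisible() around the averaged values.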

Kind regards,
  Tobias

On Thu, Sep 24, 2009 at 3:09 PM, Tim Howard  wrote:
> All,
>  I'm trying again with a slightly more generic version of my first question. 
> I can extract the
> plotted values from hist(), boxplot(), and even plot.randomForest(). Observe:
>
>  # get some data
> dat <- rnorm(100)
>  # grab histogram data
> hdat <- hist(dat)
> hdat #provides details of the hist output
>
>  #grab boxplot data
> bdat <- boxplot(dat)
> bdat #provides details of the boxplot output
>
>  # the same works for randomForest
> library(randomForest)
> data(mtcars)
> RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE, ntree=100), 
> log="y")
> RFdat
>
>
> ##But, I can't use this method in ROCR
> library(ROCR)
> data(ROCR.xval)
> RCdat <- plot(perf, avg="threshold")
>
> RCdat
> ## output:  NULL
>
> Does anyone have any tricks for piping or extracting these data?
> Or, perhaps for steering me in another direction?
>
> Thanks,
> Tim
>
>
> From: "Tim Howard" 
> Subject: [R] ROCR.plot methods, cross validation averaging
> To: , ,
>
> Message-ID: <4aba1079.6d16.00d...@gw.dec.state.ny.us>
> Content-Type: text/plain; charset=US-ASCII
>
> Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) -
>
> I think my first question is generic and could apply to many methods,
> which is why I'm directing this initially to R-help as well as Tobias and 
> Oliver.
>
> Question 1. The plot function in ROCR will average your cross validation
> data if asked. I'd like to use that averaged data to find a "best" cutoff
> but I can't figure out how to grab the actual data that get plotted.
> A simple redirect of the plot (such as test <- plot(mydata)) doesn't do it.
>
> Question 2. I am asking ROCR to average lists with varying lengths for
> each list entry. See my example below. None of the ROCR examples have data
> structured in this manner. Can anyone speak to whether the averaging
> methods in ROCR allow for this? If I can't easily grab the data as desired
> from Question 1, can someone help me figure out how to average the lists,
> by threshold, similarly?
>
> Question 3. If my cross validation data happen to have a list entry whose
> length = 2, ROCR errors out. Please see the second part of my example.
> Any suggestions?
>
> #reproducible examples exemplifying my questions
> ##part one##
> library(ROCR)
> data(ROCR.xval)
>  # set up data so it looks more like my real data
> sampSize <- c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25)
> testSet <- ROCR.xval
>  # do the extraction
> for (i in 1:length(ROCR.xval[[1]])){
>  y <- sample(c(1:350),sampSize[i])
>  testSet$predictions[[i]] <- ROCR.xval$predictions[[i]][y]
>  testSet$labels[[i]] <- ROCR.xval$labels[[i]][y]
>  }
>  # now massage the data using ROCR, set up for a ROC plot
>  # if it errors out here, run the above sample again.
> pred <- prediction(testSet$predictions, testSet$labels)
> perf <- performance(pred,"tpr","fpr")
>  # create the ROC plot, averaging by cutoff value
> plot(perf, avg="threshold")
>  # check out the structure of the data
> str(perf)
>  # note the ragged edges of the list and that I assume averaging
>  # whether it be vertical, horizontal, or threshold, somehow
>  # accounts for this?
>
> ## part two ##
> # add a list entry with only two values
> perf@x.values[[1]] <- c(0,1)
> perf@y.values[[1]] <- c(0,1)
> perf@alpha.values[[1]] <- c(Inf,0)
>
> plot(perf, avg="threshold")
>
> ##output results in an error with this message
> # Error in if (from == to) rep.int(from, length.out) else as.vector(c(from,  :
> # missing value where TRUE/FALSE needed
>
>
> Thanks in advance for your help
> Tim Howard
> New York Natural Heritage Program
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help 
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging

2009-09-24 Thread Tim Howard

David, 
Thank you for your reply. Yes, I can access the y-values slot with
perf@y.values, but note that in the cross-validation example (ROCR.xval)
the plot function averages across the list of ten vectors in that slot.

I might be able to write a function to average across these ten vectors,
but since the plot function already does it for me, I thought it most
efficient to get the values from the function. The complicating factor is
that the averaging needs to incorporate some kind of complex (to me at
least) equalization based on the third slot (alpha.values).

I don't know how to average vectors (especially uneven-length vectors) so
that they align on the alpha.values (suggestions here welcome!). Again, the
plot function does this for me... if I could just get those values.
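
One way to do this kind of averaging by hand, as a rough sketch: interpolate every run onto a shared grid of cutoffs, then take row means. The function name avg_by_threshold and the toy numbers are made up, and this is not ROCR's internal code. It assumes that non-finite cutoffs (the leading Inf) can simply be dropped and that each run keeps at least two finite cutoffs.

```r
# Hypothetical helper, not ROCR code: average several curves at common cutoffs.
# x, y, alpha are lists of numeric vectors, one element per cross-validation
# run (for a performance object: perf@x.values, perf@y.values,
# perf@alpha.values).
avg_by_threshold <- function(x, y, alpha, n = 100) {
  # Assumption: drop non-finite cutoffs before interpolating.
  keep <- lapply(alpha, is.finite)
  # One shared grid spanning all finite cutoffs, from high to low.
  rng <- range(unlist(Map(function(a, k) a[k], alpha, keep)))
  grid <- seq(rng[2], rng[1], length.out = n)
  interp <- function(v) {
    sapply(seq_along(alpha), function(i) {
      k <- keep[[i]]
      # rule = 2 holds the end values constant outside a run's own cutoff range
      approx(alpha[[i]][k], v[[i]][k], xout = grid, rule = 2, ties = mean)$y
    })
  }
  data.frame(cutoff = grid, x = rowMeans(interp(x)), y = rowMeans(interp(y)))
}

# Toy data: two runs of different length (made-up numbers)
a <- list(c(Inf, 0.8, 0.4, 0.2), c(Inf, 0.6, 0.2))
fx <- list(c(0, 0, 0.5, 1), c(0, 0.2, 1))
fy <- list(c(0, 0.5, 1, 1), c(0, 0.6, 1))
avg <- avg_by_threshold(fx, fy, a, n = 4)
avg
```

Because every run is evaluated on the same grid, uneven vector lengths stop mattering. The number of grid points n and the handling of Inf are design choices that may differ from what the plot method does.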


Tobias, 
Your suggestion to change the plot.performance function is a good one.
I'll see if I can get in there and tweak it.


Thanks to both of you for the help.
Tim


>>> David Winsemius  9/24/2009 9:43 AM >>>

On Sep 24, 2009, at 9:09 AM, Tim Howard wrote:

> All,
> I'm trying again with a slightly more generic version of my first  
> question. I can extract the
> plotted values from hist(), boxplot(), and even plot.randomForest().  
> Observe:
>
> # get some data
> dat <- rnorm(100)
> # grab histogram data
> hdat <- hist(dat)
> hdat #provides details of the hist output
>
> #grab boxplot data
> bdat <- boxplot(dat)
> bdat #provides details of the boxplot output
>
> # the same works for randomForest
> library(randomForest)
> data(mtcars)
> RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE,  
> ntree=100), log="y")
> RFdat
>
>
> ##But, I can't use this method in ROCR
> library(ROCR)
> data(ROCR.xval)
> RCdat <- plot(perf, avg="threshold")
>
> RCdat
> ## output:  NULL
>
> Does anyone have any tricks for piping or extracting these data?
> Or, perhaps for steering me in another direction?

After looking at the examples in ROCR, my guess is that you really  
ought to examine the perf object itself. It's an S4 object, so access  
to its internals is a bit different. In the example performance object  
I just created, the y-values slot would be obtainable with:

perf@y.values

There is also help from:
?"plot-methods"
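
A small illustration of S4 slot access, using a toy class so it runs without ROCR installed. The class name toyPerf is made up; its slot names only mirror the ones mentioned in this thread.

```r
# Toy S4 class standing in for a performance object (hypothetical; the
# slot names are copied from the thread, the class itself is not ROCR's).
setClass("toyPerf", representation(x.values = "list",
                                   y.values = "list",
                                   alpha.values = "list"))
tp <- new("toyPerf",
          x.values = list(c(0, 0.5, 1), c(0, 1)),
          y.values = list(c(0, 0.8, 1), c(0, 1)),
          alpha.values = list(c(Inf, 0.6, 0.1), c(Inf, 0.3)))

slotNames(tp)                # which slots exist
tp@y.values[[1]]             # y-values of the first run
slot(tp, "alpha.values")     # same kind of access, slot name as a string
sapply(tp@y.values, length)  # runs may differ in length
```

On a real performance object the same calls apply, e.g. slotNames(perf) and perf@y.values[[1]].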

-- 
David
>
> Thanks,
> Tim
>
>
> From: "Tim Howard" 
> Subject: [R] ROCR.plot methods, cross validation averaging
> To: , ,
>   
> Message-ID: <4aba1079.6d16.00d...@gw.dec.state.ny.us>
> Content-Type: text/plain; charset=US-ASCII
>
> Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) -
>
> I think my first question is generic and could apply to many methods,
> which is why I'm directing this initially to R-help as well as  
> Tobias and Oliver.
>
> Question 1. The plot function in ROCR will average your cross  
> validation
> data if asked. I'd like to use that averaged data to find a "best"  
> cutoff
> but I can't figure out how to grab the actual data that get plotted.
> A simple redirect of the plot (such as test <- plot(mydata)) doesn't  
> do it.
>
> Question 2. I am asking ROCR to average lists with varying lengths for
> each list entry. See my example below. None of the ROCR examples  
> have data
> structured in this manner. Can anyone speak to whether the averaging
> methods in ROCR allow for this? If I can't easily grab the data as  
> desired
> from Question 1, can someone help me figure out how to average the  
> lists,
> by threshold, similarly?
>
> Question 3. If my cross validation data happen to have a list entry  
> whose
> length = 2, ROCR errors out. Please see the second part of my example.
> Any suggestions?
>
> #reproducible examples exemplifying my questions
> ##part one##
> library(ROCR)
> data(ROCR.xval)
> # set up data so it looks more like my real data
> sampSize <- c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25)
> testSet <- ROCR.xval
> # do the extraction
> for (i in 1:length(ROCR.xval[[1]])){
>  y <- sample(c(1:350),sampSize[i])
>  testSet$predictions[[i]] <- ROCR.xval$predictions[[i]][y]
>  testSet$labels[[i]] <- ROCR.xval$labels[[i]][y]
>  }
> # now massage the data using ROCR, set up for a ROC plot
> # if it errors out here, run the above sample again.
> pred <- prediction(testSet$predictions, testSet$labels)
> perf <- performance(pred,"tpr","fpr")
> # create the ROC plot, averaging by cutoff value
> plot(perf, avg="threshold")
> # check out the structure of the data
> str(perf)
> # note the ragged edges of the list and that I assume averaging
> # whether it be vertical, horizontal, or threshold, somehow
> # accounts for this?
>
> ## part two ##
> # add a list entry with only two values
> perf@x.values[[1]] <- c(0,1)
> perf@y.values[[1]] <- c(0,1)
> perf@alpha.values[[1]] <- c(Inf,0)
>
> plot(perf, avg="threshold")
>
> ##output results in an error with this message
> # Error in if (from == to) rep.int(from, length.out) else  
> as.vector(c(from,  :
> # missing value where TRUE/FALSE needed
>
>
> Thanks in advance for your

Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging

2009-09-24 Thread Tobias Sing

Tim,

if I understand correctly, you are trying to get the numerical values
of averaged cross-validation curves.
Unfortunately the plot function of ROCR does not return anything in
the current version (it's a good suggestion to change this).

If you want a quick fix, you could change the plot.performance
function of ROCR so that it returns the values you want.

Kind regards,
  Tobias

On Thu, Sep 24, 2009 at 3:09 PM, Tim Howard  wrote:
> All,
>  I'm trying again with a slightly more generic version of my first question. 
> I can extract the
> plotted values from hist(), boxplot(), and even plot.randomForest(). Observe:
>
>  # get some data
> dat <- rnorm(100)
>  # grab histogram data
> hdat <- hist(dat)
> hdat     #provides details of the hist output
>
>  #grab boxplot data
> bdat <- boxplot(dat)
> bdat     #provides details of the boxplot output
>
>  # the same works for randomForest
> library(randomForest)
> data(mtcars)
> RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE, ntree=100), 
> log="y")
> RFdat
>
>
> ##But, I can't use this method in ROCR
> library(ROCR)
> data(ROCR.xval)
> RCdat <- plot(perf, avg="threshold")
>
> RCdat
> ## output:  NULL
>
> Does anyone have any tricks for piping or extracting these data?
> Or, perhaps for steering me in another direction?
>
> Thanks,
> Tim
>
>
> From: "Tim Howard" 
> Subject: [R] ROCR.plot methods, cross validation averaging
> To: , ,
>        
> Message-ID: <4aba1079.6d16.00d...@gw.dec.state.ny.us>
> Content-Type: text/plain; charset=US-ASCII
>
> Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) -
>
> I think my first question is generic and could apply to many methods,
> which is why I'm directing this initially to R-help as well as Tobias and 
> Oliver.
>
> Question 1. The plot function in ROCR will average your cross validation
> data if asked. I'd like to use that averaged data to find a "best" cutoff
> but I can't figure out how to grab the actual data that get plotted.
> A simple redirect of the plot (such as test <- plot(mydata)) doesn't do it.
>
> Question 2. I am asking ROCR to average lists with varying lengths for
> each list entry. See my example below. None of the ROCR examples have data
> structured in this manner. Can anyone speak to whether the averaging
> methods in ROCR allow for this? If I can't easily grab the data as desired
> from Question 1, can someone help me figure out how to average the lists,
> by threshold, similarly?
>
> Question 3. If my cross validation data happen to have a list entry whose
> length = 2, ROCR errors out. Please see the second part of my example.
> Any suggestions?
>
> #reproducible examples exemplifying my questions
> ##part one##
> library(ROCR)
> data(ROCR.xval)
>  # set up data so it looks more like my real data
> sampSize <- c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25)
> testSet <- ROCR.xval
>  # do the extraction
> for (i in 1:length(ROCR.xval[[1]])){
>  y <- sample(c(1:350),sampSize[i])
>  testSet$predictions[[i]] <- ROCR.xval$predictions[[i]][y]
>  testSet$labels[[i]] <- ROCR.xval$labels[[i]][y]
>  }
>  # now massage the data using ROCR, set up for a ROC plot
>  # if it errors out here, run the above sample again.
> pred <- prediction(testSet$predictions, testSet$labels)
> perf <- performance(pred,"tpr","fpr")
>  # create the ROC plot, averaging by cutoff value
> plot(perf, avg="threshold")
>  # check out the structure of the data
> str(perf)
>  # note the ragged edges of the list and that I assume averaging
>  # whether it be vertical, horizontal, or threshold, somehow
>  # accounts for this?
>
> ## part two ##
> # add a list entry with only two values
> perf@x.values[[1]] <- c(0,1)
> perf@y.values[[1]] <- c(0,1)
> perf@alpha.values[[1]] <- c(Inf,0)
>
> plot(perf, avg="threshold")
>
> ##output results in an error with this message
> # Error in if (from == to) rep.int(from, length.out) else as.vector(c(from,  :
> # missing value where TRUE/FALSE needed
>
>
> Thanks in advance for your help
> Tim Howard
> New York Natural Heritage Program
>

Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging

2009-09-24 Thread David Winsemius



On Sep 24, 2009, at 9:09 AM, Tim Howard wrote:


All,
I'm trying again with a slightly more generic version of my first  
question. I can extract the
plotted values from hist(), boxplot(), and even plot.randomForest().  
Observe:


# get some data
dat <- rnorm(100)
# grab histogram data
hdat <- hist(dat)
hdat #provides details of the hist output

#grab boxplot data
bdat <- boxplot(dat)
bdat #provides details of the boxplot output

# the same works for randomForest
library(randomForest)
data(mtcars)
RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE,  
ntree=100), log="y")

RFdat


##But, I can't use this method in ROCR
library(ROCR)
data(ROCR.xval)
RCdat <- plot(perf, avg="threshold")

RCdat
## output:  NULL

Does anyone have any tricks for piping or extracting these data?
Or, perhaps for steering me in another direction?


After looking at the examples in ROCR, my guess is that you really  
ought to examine the perf object itself. It's an S4 object, so access  
to its internals is a bit different. In the example performance object  
I just created, the y-values slot would be obtainable with:

perf@y.values

There is also help from:
?"plot-methods"

--
David


Thanks,
Tim


From: "Tim Howard" 
Subject: [R] ROCR.plot methods, cross validation averaging
To: , ,

Message-ID: <4aba1079.6d16.00d...@gw.dec.state.ny.us>
Content-Type: text/plain; charset=US-ASCII

Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) -

I think my first question is generic and could apply to many methods,
which is why I'm directing this initially to R-help as well as  
Tobias and Oliver.


Question 1. The plot function in ROCR will average your cross  
validation
data if asked. I'd like to use that averaged data to find a "best"  
cutoff

but I can't figure out how to grab the actual data that get plotted.
A simple redirect of the plot (such as test <- plot(mydata)) doesn't  
do it.


Question 2. I am asking ROCR to average lists with varying lengths for
each list entry. See my example below. None of the ROCR examples  
have data

structured in this manner. Can anyone speak to whether the averaging
methods in ROCR allow for this? If I can't easily grab the data as  
desired
from Question 1, can someone help me figure out how to average the  
lists,

by threshold, similarly?

Question 3. If my cross validation data happen to have a list entry  
whose

length = 2, ROCR errors out. Please see the second part of my example.
Any suggestions?

#reproducible examples exemplifying my questions
##part one##
library(ROCR)
data(ROCR.xval)
# set up data so it looks more like my real data
sampSize <- c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25)
testSet <- ROCR.xval
# do the extraction
for (i in 1:length(ROCR.xval[[1]])){
 y <- sample(c(1:350),sampSize[i])
 testSet$predictions[[i]] <- ROCR.xval$predictions[[i]][y]
 testSet$labels[[i]] <- ROCR.xval$labels[[i]][y]
 }
# now massage the data using ROCR, set up for a ROC plot
# if it errors out here, run the above sample again.
pred <- prediction(testSet$predictions, testSet$labels)
perf <- performance(pred,"tpr","fpr")
# create the ROC plot, averaging by cutoff value
plot(perf, avg="threshold")
# check out the structure of the data
str(perf)
# note the ragged edges of the list and that I assume averaging
# whether it be vertical, horizontal, or threshold, somehow
# accounts for this?

## part two ##
# add a list entry with only two values
perf@x.values[[1]] <- c(0,1)
perf@y.values[[1]] <- c(0,1)
perf@alpha.values[[1]] <- c(Inf,0)

plot(perf, avg="threshold")

##output results in an error with this message
# Error in if (from == to) rep.int(from, length.out) else as.vector(c(from,  :
# missing value where TRUE/FALSE needed


Thanks in advance for your help
Tim Howard
New York Natural Heritage Program



David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging

2009-09-24 Thread Tim Howard

Whoops, sorry. Here is the full set with the missing lines:

library(ROCR)
data(ROCR.xval)
pred <- prediction(ROCR.xval$predictions, ROCR.xval$labels)
perf <- performance(pred,"tpr","fpr")
RCdat <- plot(perf, avg="threshold")
RCdat

Thanks.
Tim
>>> David Winsemius  9/24/2009 9:25 AM >>>

On Sep 24, 2009, at 9:09 AM, Tim Howard wrote:

> All,
> I'm trying again with a slightly more generic version of my first  
> question. I can extract the
> plotted values from hist(), boxplot(), and even plot.randomForest().  
> Observe:
>
> # get some data
> dat <- rnorm(100)
> # grab histogram data
> hdat <- hist(dat)
> hdat #provides details of the hist output
>
> #grab boxplot data
> bdat <- boxplot(dat)
> bdat #provides details of the boxplot output
>
> # the same works for randomForest
> library(randomForest)
> data(mtcars)
> RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE,  
> ntree=100), log="y")
> RFdat
>
>
> ##But, I can't use this method in ROCR
> library(ROCR)
> data(ROCR.xval)
> RCdat <- plot(perf, avg="threshold")

That code throws an object not found error. Perhaps you defined perf  
earlier?

David


>
> RCdat
> ## output:  NULL
>
> Does anyone have any tricks for piping or extracting these data?
> Or, perhaps for steering me in another direction?
>
> Thanks,
> Tim
>
>
> From: "Tim Howard" 
> Subject: [R] ROCR.plot methods, cross validation averaging
> To: , ,
>   
> Message-ID: <4aba1079.6d16.00d...@gw.dec.state.ny.us>
> Content-Type: text/plain; charset=US-ASCII
>
> Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) -
>
> I think my first question is generic and could apply to many methods,
> which is why I'm directing this initially to R-help as well as  
> Tobias and Oliver.
>
> Question 1. The plot function in ROCR will average your cross  
> validation
> data if asked. I'd like to use that averaged data to find a "best"  
> cutoff
> but I can't figure out how to grab the actual data that get plotted.
> A simple redirect of the plot (such as test <- plot(mydata)) doesn't  
> do it.
>
> Question 2. I am asking ROCR to average lists with varying lengths for
> each list entry. See my example below. None of the ROCR examples  
> have data
> structured in this manner. Can anyone speak to whether the averaging
> methods in ROCR allow for this? If I can't easily grab the data as  
> desired
> from Question 1, can someone help me figure out how to average the  
> lists,
> by threshold, similarly?
>
> Question 3. If my cross validation data happen to have a list entry  
> whose
> length = 2, ROCR errors out. Please see the second part of my example.
> Any suggestions?
>
> #reproducible examples exemplifying my questions
> ##part one##
> library(ROCR)
> data(ROCR.xval)
> # set up data so it looks more like my real data
> sampSize <- c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25)
> testSet <- ROCR.xval
> # do the extraction
> for (i in 1:length(ROCR.xval[[1]])){
>  y <- sample(c(1:350),sampSize[i])
>  testSet$predictions[[i]] <- ROCR.xval$predictions[[i]][y]
>  testSet$labels[[i]] <- ROCR.xval$labels[[i]][y]
>  }
> # now massage the data using ROCR, set up for a ROC plot
> # if it errors out here, run the above sample again.
> pred <- prediction(testSet$predictions, testSet$labels)
> perf <- performance(pred,"tpr","fpr")
> # create the ROC plot, averaging by cutoff value
> plot(perf, avg="threshold")
> # check out the structure of the data
> str(perf)
> # note the ragged edges of the list and that I assume averaging
> # whether it be vertical, horizontal, or threshold, somehow
> # accounts for this?
>
> ## part two ##
> # add a list entry with only two values
> perf@x.values[[1]] <- c(0,1)
> perf@y.values[[1]] <- c(0,1)
> perf@alpha.values[[1]] <- c(Inf,0)
>
> plot(perf, avg="threshold")
>
> ##output results in an error with this message
> # Error in if (from == to) rep.int(from, length.out) else  
> as.vector(c(from,  :
> # missing value where TRUE/FALSE needed
>
>
> Thanks in advance for your help
> Tim Howard
> New York Natural Heritage Program
>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging

2009-09-24 Thread David Winsemius



On Sep 24, 2009, at 9:09 AM, Tim Howard wrote:


All,
I'm trying again with a slightly more generic version of my first  
question. I can extract the
plotted values from hist(), boxplot(), and even plot.randomForest().  
Observe:


# get some data
dat <- rnorm(100)
# grab histogram data
hdat <- hist(dat)
hdat #provides details of the hist output

#grab boxplot data
bdat <- boxplot(dat)
bdat #provides details of the boxplot output

# the same works for randomForest
library(randomForest)
data(mtcars)
RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE,  
ntree=100), log="y")

RFdat


##But, I can't use this method in ROCR
library(ROCR)
data(ROCR.xval)
RCdat <- plot(perf, avg="threshold")


That code throws an object not found error. Perhaps you defined perf  
earlier?


David




RCdat
## output:  NULL

Does anyone have any tricks for piping or extracting these data?
Or, perhaps for steering me in another direction?

Thanks,
Tim


From: "Tim Howard" 
Subject: [R] ROCR.plot methods, cross validation averaging
To: , ,

Message-ID: <4aba1079.6d16.00d...@gw.dec.state.ny.us>
Content-Type: text/plain; charset=US-ASCII

Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) -

I think my first question is generic and could apply to many methods,
which is why I'm directing this initially to R-help as well as  
Tobias and Oliver.


Question 1. The plot function in ROCR will average your cross  
validation
data if asked. I'd like to use that averaged data to find a "best"  
cutoff

but I can't figure out how to grab the actual data that get plotted.
A simple redirect of the plot (such as test <- plot(mydata)) doesn't  
do it.


Question 2. I am asking ROCR to average lists with varying lengths for
each list entry. See my example below. None of the ROCR examples  
have data

structured in this manner. Can anyone speak to whether the averaging
methods in ROCR allow for this? If I can't easily grab the data as  
desired
from Question 1, can someone help me figure out how to average the  
lists,

by threshold, similarly?

Question 3. If my cross validation data happen to have a list entry  
whose

length = 2, ROCR errors out. Please see the second part of my example.
Any suggestions?

#reproducible examples exemplifying my questions
##part one##
library(ROCR)
data(ROCR.xval)
# set up data so it looks more like my real data
sampSize <- c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25)
testSet <- ROCR.xval
# do the extraction
for (i in 1:length(ROCR.xval[[1]])){
 y <- sample(c(1:350),sampSize[i])
 testSet$predictions[[i]] <- ROCR.xval$predictions[[i]][y]
 testSet$labels[[i]] <- ROCR.xval$labels[[i]][y]
 }
# now massage the data using ROCR, set up for a ROC plot
# if it errors out here, run the above sample again.
pred <- prediction(testSet$predictions, testSet$labels)
perf <- performance(pred,"tpr","fpr")
# create the ROC plot, averaging by cutoff value
plot(perf, avg="threshold")
# check out the structure of the data
str(perf)
# note the ragged edges of the list and that I assume averaging
# whether it be vertical, horizontal, or threshold, somehow
# accounts for this?

## part two ##
# add a list entry with only two values
perf@x.values[[1]] <- c(0,1)
perf@y.values[[1]] <- c(0,1)
perf@alpha.values[[1]] <- c(Inf,0)

plot(perf, avg="threshold")

##output results in an error with this message
# Error in if (from == to) rep.int(from, length.out) else as.vector(c(from,  :
# missing value where TRUE/FALSE needed


Thanks in advance for your help
Tim Howard
New York Natural Heritage Program



David Winsemius, MD
Heritage Laboratories
West Hartford, CT


[R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging

2009-09-24 Thread Tim Howard

All,
 I'm trying again with a slightly more generic version of my first question. I 
can extract the
plotted values from hist(), boxplot(), and even plot.randomForest(). Observe:

 # get some data
dat <- rnorm(100)
 # grab histogram data
hdat <- hist(dat)
hdat #provides details of the hist output

 #grab boxplot data
bdat <- boxplot(dat)
bdat #provides details of the boxplot output

 # the same works for randomForest
library(randomForest)
data(mtcars)
RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE, ntree=100), 
log="y")
RFdat


##But, I can't use this method in ROCR
library(ROCR)
data(ROCR.xval)
RCdat <- plot(perf, avg="threshold")

RCdat
## output:  NULL

Does anyone have any tricks for piping or extracting these data?  
Or, perhaps for steering me in another direction?
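
For the base graphics cases above, the values can also be obtained without drawing anything, because hist() and boxplot() both accept plot = FALSE. A short sketch (the seed is arbitrary):

```r
set.seed(1)                         # arbitrary seed, for reproducibility only
dat <- rnorm(100)

hdat <- hist(dat, plot = FALSE)     # object of class "histogram"
bdat <- boxplot(dat, plot = FALSE)  # plain list of boxplot statistics

names(hdat)   # includes breaks, counts, density, mids
hdat$counts   # bin counts that would have been drawn
bdat$stats    # whisker ends, hinges and median, one column per group
```

This only works because these functions return their computed values. A plot method that returns NULL gives nothing to capture, whichever way it is called.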

Thanks,
Tim


From: "Tim Howard" 
Subject: [R] ROCR.plot methods, cross validation averaging
To: , ,

Message-ID: <4aba1079.6d16.00d...@gw.dec.state.ny.us>
Content-Type: text/plain; charset=US-ASCII

Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) - 

I think my first question is generic and could apply to many methods, 
which is why I'm directing this initially to R-help as well as Tobias and 
Oliver.

Question 1. The plot function in ROCR will average your cross validation
data if asked. I'd like to use that averaged data to find a "best" cutoff
but I can't figure out how to grab the actual data that get plotted.
A simple redirect of the plot (such as test <- plot(mydata)) doesn't do it.

Question 2. I am asking ROCR to average lists with varying lengths for
each list entry. See my example below. None of the ROCR examples have data
structured in this manner. Can anyone speak to whether the averaging
methods in ROCR allow for this? If I can't easily grab the data as desired
from Question 1, can someone help me figure out how to average the lists,
by threshold, similarly?

Question 3. If my cross validation data happen to have a list entry whose
length = 2, ROCR errors out. Please see the second part of my example.
Any suggestions?
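
To make the "best" cutoff in Question 1 concrete: once averaged values are available in any form, one common choice is the cutoff that maximizes tpr - fpr (Youden's J). The data frame below is made up for illustration, and other criteria (costs, a fixed fpr) may suit a given problem better.

```r
# Made-up averaged curve: one row per cutoff
avg <- data.frame(cutoff = c(0.8, 0.6, 0.4, 0.2),
                  fpr    = c(0.10, 0.20, 0.55, 1.00),
                  tpr    = c(0.50, 0.75, 0.90, 1.00))

youden <- avg$tpr - avg$fpr       # Youden's J at each cutoff
best <- avg[which.max(youden), ]  # row with the largest J
best
```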

#reproducible examples exemplifying my questions
##part one##
library(ROCR)
data(ROCR.xval)
 # set up data so it looks more like my real data
sampSize <- c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25)
testSet <- ROCR.xval
 # do the extraction
for (i in 1:length(ROCR.xval[[1]])){
  y <- sample(c(1:350),sampSize[i])
  testSet$predictions[[i]] <- ROCR.xval$predictions[[i]][y]
  testSet$labels[[i]] <- ROCR.xval$labels[[i]][y]
  }
 # now massage the data using ROCR, set up for a ROC plot
 # if it errors out here, run the above sample again.
pred <- prediction(testSet$predictions, testSet$labels)
perf <- performance(pred,"tpr","fpr")
 # create the ROC plot, averaging by cutoff value
plot(perf, avg="threshold")
 # check out the structure of the data
str(perf)
 # note the ragged edges of the list and that I assume averaging
 # whether it be vertical, horizontal, or threshold, somehow 
 # accounts for this?

## part two ##
# add a list entry with only two values
perf@x.values[[1]] <- c(0,1)
perf@y.values[[1]] <- c(0,1)
perf@alpha.values[[1]] <- c(Inf,0)

plot(perf, avg="threshold")

##output results in an error with this message
# Error in if (from == to) rep.int(from, length.out) else as.vector(c(from,  :
# missing value where TRUE/FALSE needed


Thanks in advance for your help
Tim Howard
New York Natural Heritage Program

