Re: [Bioc-devel] Changes in AnnotationDbi

Marc Carlson Thu, 04 Jun 2015 15:07:32 -0700

Hi Jim,

I do agree that the warning was protective for that (this is why I put it there).

But it was also annoying for many and a source of some confusion, because when people see a warning() they think that something has gone wrong with the code that was just run. And in this case the select method was actually doing exactly what it was supposed to be doing. What it was actually warning you about was what you did separately in that assignment to fit2... Which is the step right after the select method already did its work. And I can understand why that seems a little bit confusing since you are basically telling someone to be careful with the data you just gave them.

Now I could replace it with a message() I guess, but in cases like this where the warning is about something that happens outside of the function you are calling, shouldn't that probably be handled by documentation? Or at least, that is the argument that finally persuaded me to remove it. That and the fact that almost every call to select() ended up accompanied by the warning you mentioned, because it turns out that perfect 1:1 relationships are pretty rare for annotation data. Very often, you are going to get back multiple results.

But I didn't just remove the warning, I also supplied an alternative for people who have a real need for consistent 1:1 mapping.

The mapIds() method takes most of the same arguments as select(), except that unlike select(), it only looks up one column and it always returns a vector that is the same size as the vector that came in.
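
To make that difference concrete, here is a small base-R sketch. It does not call AnnotationDbi at all: the toy_map table and its probe and Entrez IDs are made up for illustration, a plain row subset stands in for a select()-style result, and match() stands in for a mapIds()-style result.

```r
# Hypothetical probe-to-gene table (not real annotation data);
# probe "p2" maps to two genes.
toy_map <- data.frame(
  PROBEID  = c("p1", "p2", "p2", "p3"),
  ENTREZID = c("10", "20", "21", "30"),
  stringsAsFactors = FALSE
)
keys <- c("p1", "p2", "p3")

# select()-like lookup: every matching row comes back,
# so 3 keys give 4 rows.
select_like <- toy_map[toy_map$PROBEID %in% keys, ]

# mapIds()-like lookup: match() keeps the first hit per key, so the
# result is a named vector with exactly one element per key.
mapids_like <- setNames(toy_map$ENTREZID[match(keys, toy_map$PROBEID)], keys)

nrow(select_like)    # 4
length(mapids_like)  # 3
```

The second result lines up with the input keys one for one, which is the property you need before attaching annotation to something like fit2$genes.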


So for your example, you could do something like this pseudocode here:

mapIds(<chippackage>, featureNames(eset), column="ENTREZID", keytype="PROBEID")

And mapIds will follow a rule specified by the default value for the multiVals argument so that you can get back your results in a 1:1 manner. And if you don't like any of the options available for the multiVals argument, you can make your own function and pass it in.
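
As a rough illustration of what "a rule" means here, the base-R sketch below applies a user-supplied function to the values found for each key. Everything in it is an assumption for illustration: collapse_by_rule() is a hypothetical helper and not part of AnnotationDbi, and toy_map is made-up data. The real contract for multiVals is described in ?mapIds.

```r
# Hypothetical probe-to-gene table (not real annotation data).
toy_map <- data.frame(
  PROBEID  = c("p1", "p2", "p2", "p3"),
  ENTREZID = c("10", "20", "21", "30"),
  stringsAsFactors = FALSE
)
keys <- c("p1", "p2", "p3")

# Hypothetical helper: collect all values for each key, then reduce
# them to a single value with `rule`. Keys with no hit become NA.
collapse_by_rule <- function(map, keys, rule) {
  hits <- lapply(keys, function(k) map$ENTREZID[map$PROBEID == k])
  names(hits) <- keys
  vapply(
    hits,
    function(v) if (length(v) == 0L) NA_character_ else rule(v),
    character(1)
  )
}

# Rule 1: keep the first value for each key.
first_hit <- collapse_by_rule(toy_map, keys, function(v) v[1])

# Rule 2: a custom rule that pastes all values together.
joined <- collapse_by_rule(toy_map, keys,
                           function(v) paste(v, collapse = ";"))
```

Either way the result has one element per key; only the choice of which value survives for "p2" changes with the rule.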



Anyhow, please continue to let us know what you think.


 Marc






On 06/04/2015 10:50 AM, James W. MacDonald wrote:

In the last release, the warning message from select() telling people that
their results include one-to-many mappings was removed. While some may find
this warning annoying, I think silently returning something unexpected to
our users is dangerous.

In other words, for me it is a common practice to do something like this:

fit <- lmFit(eset, design)
fit2 <- eBayes(fit)
gns <- select(<chippackage>, featureNames(eset), c("ENTREZID","SYMBOL"))
gns <- gns[!duplicated(gns[,1]),]
fit2$genes <- gns

I add in the step where dups are removed because I already know they are
there. But a naive user might instead do

fit2$genes <- select(<chippackage>, featureNames(eset),
c("ENTREZID","SYMBOL"))

Which will work just fine, but then all the annotation (except for the
first few lines) will now be completely incorrect, and there wasn't a
warning to let the end user know that they may have made a mistake.

lmFit() will parse the featureData slot of an ExpressionSet and use those
data for annotation, so that gives some hypothetical protections, for those
who first put their annotation data into their ExpressionSet. However,
?eSet says:

  ‘featureData’: Contains variables describing features (i.e., rows
           in ‘assayData’) unique to this experiment. Use the
           ‘annotation’ slot to efficiently reference feature data
           common to the annotation package used in the experiment.
           Class: ‘AnnotatedDataFrame-class’

Which to me indicates that the featureData slot isn't really intended to
contain annotation data, but instead some unique information that pertains
to a given experiment. But maybe I misunderstand.

Is the featureData slot actually intended for annotation data? If not, what
is the intended pipeline for annotating data in an ExpressionSet? Am I
alone in being concerned about this?

Best,

Jim


_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
