Re: [Bioc-devel] mapping vector of gene ids to gene symbols

Michael Lawrence Wed, 18 Jun 2014 06:45:15 -0700

Wow, VariantFiltering is awesome. It's impressive how it integrates all of
the annotation resources so seamlessly. And the shiny app looks very
useful. This should be a model package for where we want to bring
Bioconductor.


You should look into using VRanges instead of GRanges for the return value
of filteredVariants(). It can record the provenance of the filters if you
are using the FilterRules framework. And ReportingTools may be useful for
generating semi-interactive HTML reports.

I'm going to use this in the course next week at Brixen. My tutorial is
already based on the CEU trio, so the vignette integrates perfectly.

Nice work,
Michael


On Wed, Jun 18, 2014 at 6:20 AM, Robert Castelo <robert.cast...@upf.edu>
wrote:

> hi Michael,
>
> this sounds as if you had a list of variants annotated with their Entrez
> Gene IDs, where some IDs are NA because those variants do not overlap any
> gene, and some Entrez Gene IDs are repeated because two or more of those
> variants overlap the same gene :)
>
> at least that is the situation i had when programming the VariantFiltering
> package. i also could not find a one-liner solution, but you might want to
> look at what i ended up doing there, in case it is also useful for
> you.
>
> you'll find it in the method "annotateVariants" that dispatches on "OrgDb"
> objects (i.e., gene-centric annotation packages), within
> VariantFiltering/R/annotationEngine.R
>
> if you take a look at it, do not hesitate to comment if you have any
> suggestions to improve it. i also look forward to the annotation gurus'
> feedback on this question :)
>
> cheers,
>
> robert.
>
>
> On 06/18/2014 03:03 PM, Michael Lawrence wrote:
>
>> Let's say I have a vector of gene IDs where some are NA and some are
>> repeated, and I want to map them to gene symbols, where I get NAs for the
>> NA IDs or IDs without a symbol. What is the best way to do this?
>>
>> I tried select() but it gave me a table with unique entries; not very
>> convenient. It also does not handle NAs. And it totally breaks with
>> duplicates using the GENEID key type (kind of works with ENTREZID):
>>
>> select(Homo.sapiens, GENEID, "SYMBOL", "GENEID")
>> Error in `[[<-`(`*tmp*`, name, value = list(GENEID = c("245938", "245939", :
>>    269 elements in value to replace 1312 elements
>>
>> Also tried the venerable mget(GENEID, org.Hs.egSYMBOL, ifnotfound=NA), but
>> this returns a list and fails with NAs.
>>
>> What would be nice is something like:
>>
>> map(Homo.sapiens, GENEID, "SYMBOL", "GENEID", OneToOneOrNone)
>>
>> where OneToOneOrNone is an assertion that I expect the mappings to be
>> one-to-one, so it will unlist() or whatever and throw an error if the
>> assertion fails. It should return NA for anything not found, and for any
>> NA GENEID. Does something like this already exist?
>>
>> Michael
>>
>>         [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioc-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
> --
> Robert Castelo, PhD
> Associate Professor
> Dept. of Experimental and Health Sciences
> Universitat Pompeu Fabra (UPF)
> Barcelona Biomedical Research Park (PRBB)
> Dr Aiguader 88
> E-08003 Barcelona, Spain
> telf: +34.933.160.514
> fax: +34.933.160.550
>
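
The behaviour asked for in the original question (keep the input order, keep
repeats, return NA for NA or unknown IDs, and fail if the mapping is not
one-to-one) can be sketched in base R roughly as follows. This is only an
illustration under stated assumptions: map_one_to_one() is a hypothetical
helper, not a function from AnnotationDbi or VariantFiltering, and the small
lookup table is a toy stand-in for whatever two-column table an annotation
query would return.

```r
# Hypothetical helper (not part of any Bioconductor package): map a vector
# of IDs to symbols through parallel key/value vectors, preserving the
# order, NAs and repeats of the input.
map_one_to_one <- function(ids, keys, values) {
  stopifnot(length(keys) == length(values))
  # Collapse exact duplicate (key, value) pairs and ignore NA keys.
  tab <- unique(data.frame(key = as.character(keys),
                           value = as.character(values),
                           stringsAsFactors = FALSE))
  tab <- tab[!is.na(tab$key), , drop = FALSE]
  # The "OneToOneOrNone" assertion: a key may not map to several values.
  if (anyDuplicated(tab$key) > 0L) {
    stop("mapping is not one-to-one for key(s): ",
         paste(unique(tab$key[duplicated(tab$key)]), collapse = ", "))
  }
  # match() yields NA for NA or unknown ids, so those positions become NA.
  tab$value[match(as.character(ids), tab$key)]
}

# Toy lookup table (assumption: stands in for an annotation query result).
lookup <- data.frame(GENEID = c("1", "2", "9"),
                     SYMBOL = c("A1BG", "A2M", "NAT1"),
                     stringsAsFactors = FALSE)

ids <- c("2", NA, "1", "2", "999")
symbols <- map_one_to_one(ids, lookup$GENEID, lookup$SYMBOL)
```

Here symbols has the same length as ids, with NA in the positions of the NA
and the unmatched ID. Relying on match() rather than a merge is what keeps
the result aligned with the input vector.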
