hi Michael,

this sounds like the situation where you have a list of variants annotated with their Entrez Gene IDs, which are sometimes NA because those variants do not overlap any gene, and sometimes repeated because two or more of those variants overlap the same gene :)

at least that is the situation i had when programming the VariantFiltering package. i could not find a one-liner solution either, but you might want to look at what i ended up doing there, in case it is also useful for you.

you'll find it in the method "annotateVariants" that dispatches on "OrgDb" objects (i.e., gene-centric annotation packages), within VariantFiltering/R/annotationEngine.R

if you take a look at it, do not hesitate to comment if you have any suggestions to improve it. i also look forward to the annotation gurus' feedback on this question :)
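
for illustration, here is a minimal base-R sketch of the general idea. it is not the code in VariantFiltering: it just builds a table with one row per key and uses match() to expand it back to the original vector, so that NAs and repeated IDs are preserved. the function name mapOneToOneOrNone and the toy lookup table are made up for this example; in practice the table might come from something like select(org.Hs.eg.db, keys=unique(na.omit(geneid)), columns="SYMBOL", keytype="ENTREZID").

```r
# Hypothetical helper (not part of any package): map keys to values through
# a lookup table, returning a vector parallel to `keys`, with NA for NA keys
# and for keys without a value.
mapOneToOneOrNone <- function(keys, lookup, keycol, valcol) {
  keys <- as.character(keys)
  tab <- data.frame(key = as.character(lookup[[keycol]]),
                    val = as.character(lookup[[valcol]]),
                    stringsAsFactors = FALSE)
  # Drop rows with a missing key or value, then exact duplicate rows.
  tab <- unique(tab[!is.na(tab$key) & !is.na(tab$val), ])
  # Assert that every key maps to at most one value.
  if (anyDuplicated(tab$key) > 0L)
    stop("mapping is not one-to-one-or-none")
  # match() yields NA for NA keys and for keys absent from the table.
  tab$val[match(keys, tab$key)]
}

# Toy lookup table standing in for the output of select().
lookup <- data.frame(GENEID = c("1", "2", "3"),
                     SYMBOL = c("A1BG", "A2M", NA),
                     stringsAsFactors = FALSE)

# Keys with an NA, a repeated ID, an ID without symbol and an unknown ID.
geneid <- c("2", NA, "1", "2", "3", "999")
symbols <- mapOneToOneOrNone(geneid, lookup, "GENEID", "SYMBOL")
```

the result has the same length and order as the input, and an error is raised if some key maps to more than one value, which is roughly the kind of assertion asked for in the original question.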

cheers,

robert.

On 06/18/2014 03:03 PM, Michael Lawrence wrote:
Let's say I have a vector of gene IDs where some are NA, and some are
repeated, and I want to map them to gene symbols, where I get NAs for the
NA IDs or IDs without a symbol. What is the best way to do this?

I tried select() but it gave me a table with unique entries; not very
convenient. It also does not handle NAs. And totally breaks with duplicates
using the GENEID key type (kind of works with ENTREZID):

select(Homo.sapiens, GENEID, "SYMBOL", "GENEID")
Error in `[[<-`(`*tmp*`, name, value = list(GENEID = c("245938", "245939",  :
   269 elements in value to replace 1312 elements

Also tried the venerable mget(GENEID, org.Hs.egSYMBOL, ifnotfound=NA), but
this returns a list and fails with NAs.

What would be nice is something like:

map(Homo.sapiens, GENEID, "SYMBOL", "GENEID", OneToOneOrNone)

where OneToOneOrNone is an assertion that I expect the mappings to be
one-to-one, so it will unlist() or whatever and throw an error if the
assertion fails. It should return NA for anything not found, and for any NA
GENEID. Does something like this already exist?

Michael


_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


--
Robert Castelo, PhD
Associate Professor
Dept. of Experimental and Health Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
fax: +34.933.160.550
