Hi Thomas,
Could you please send us the exact query, including the filter IDs, so we
can debug it over here? Another option would be to break your set of 6500
IDs into smaller subsets and run multiple queries.
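If you go the multiple-queries route, the batching could look roughly like the sketch below. It is written in Python purely for illustration. `run_query` is a hypothetical placeholder for whatever submits one filtered query and returns its rows, and the batch size of 500 is an arbitrary assumption, not a known BioMart limit.

```python
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple

Row = Tuple[str, str]  # assumed shape of one result row: (accession, entrezgene)


def chunked(ids: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids`, each holding at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def query_in_batches(
    ids: Sequence[str],
    run_query: Callable[[List[str]], Iterable[Row]],  # hypothetical query function
    size: int = 500,  # assumed batch size, tune as needed
) -> List[Row]:
    """Call `run_query` once per batch and return the unique rows, sorted."""
    rows = set()
    for batch in chunked(ids, size):
        rows.update(run_query(batch))
    return sorted(rows)
```

Collecting the rows in a set keeps the combined result unique across batches, which mirrors what uniqueRowsOnly(1) does within a single query.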
Thanks,
Syed
Thomas Burkard wrote:
Hi,
I am getting incomplete result sets. When I map a large set of IDs (6500),
for example SwissProt AC to EntrezGene ID via Ensembl Human, I do not get
the complete list back. If I instead query for all known ID mappings and
perform the merge in R afterwards, I get a much longer list back (both
lists contain unique IDs only).
1. PROCEDURE (via web interface):
$query->setDataset("hsapiens_gene_ensembl");
$query->addFilter("uniprot_swissprot_accession",
[sample-SwissProtAC.txt]);
$query->addAttribute("entrezgene");
$query->formatter("TSV");
$query_runner->uniqueRowsOnly(1);
-> sample-RetrievedEG.txt (3420 IDs)
2. PROCEDURE (via web interface):
$query->setDataset("hsapiens_gene_ensembl");
$query->addAttribute("entrezgene");
$query->addAttribute("uniprot_swissprot_accession");
$query->formatter("TSV");
$query_runner->uniqueRowsOnly(1);
-> save as sp2eg
-> merge via R (sample-sp2eg.R)
-> sample-MapViaALL.txt (4382 IDs)
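For clarity, the merge in procedure 2 and the comparison against procedure 1 amount to roughly the following. This is a Python sketch rather than the R script itself (sample-sp2eg.R). The function names are made up for illustration, and it assumes the sp2eg rows come back as (entrezgene, accession) pairs in the order the attributes were added, with an empty field where no mapping exists.

```python
from typing import Iterable, List, Set, Tuple


def map_via_full_table(
    wanted: Iterable[str],
    full_mapping: Iterable[Tuple[str, str]],
) -> Set[Tuple[str, str]]:
    """Filter the full table down to the wanted accessions.

    `full_mapping` holds (entrezgene, accession) rows (assumed column order).
    Rows with an empty Entrez Gene ID are skipped.
    Returns unique (accession, entrezgene) pairs.
    """
    wanted_set = set(wanted)
    return {
        (acc, eg)
        for eg, acc in full_mapping
        if eg and acc in wanted_set
    }


def missing_from_filtered(
    filtered_egs: Iterable[str],
    merged_rows: Iterable[Tuple[str, str]],
) -> List[str]:
    """Entrez Gene IDs found via the full table but absent from the filtered query."""
    have = set(filtered_egs)
    return sorted({eg for _, eg in merged_rows} - have)
```

Running missing_from_filtered on the IDs from sample-RetrievedEG.txt and the merged rows would list exactly those Entrez Gene IDs that only procedure 2 returns.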
Unfortunately, my earlier e-mail with attachments (9.10.) did not come
through. Here is another try without them.
Thanks & best regards,
Thomas
P.S.: I experienced a similar problem with a mart of my own before.
--
Thomas Burkard
CeMM - Research Centre for Molecular Medicine of the Austrian Academy
of Science
Lazarettgasse 19/3. floor, A-1090 Vienna, Austria
Tel.: +43/1/40160 70 021
Mobile: +43/699/126 05 000
Fax.: +43/1/40160 970 030
Email: [EMAIL PROTECTED]
URL: http://www.cemm.at