On 18 Dec 2007, at 04:42, David M. Goodstein wrote:
I think that would work quite well, as long as the regions could also
specify strand. Is that possible?
Yes, strand is certainly possible. Could you also give us an idea of how
many features (regions) your typical 'intersection use case' consists
of (10s, 100s, 1000s, 10000s, 100000s, etc.)?
I must warn you that this kind of intersection scales quite poorly for
large numbers of regions, so it would be good if we could assess the
practicality of such a solution beforehand. In the long run it will be
possible to optimize it properly, but in the short term you would have
to cope with the performance as it stands at the moment.
cheers,
a.
--David
Syed Haider wrote:
Hi David,
For both the Perl and web service APIs, it will look like a normal filter
representing a genomic region (chr,start,end). If you look at BioMart
MartView, Ensembl Gene -> human, under filters you can see 'Encode
region', which has preset values. You should be able to assign your own
set of values to the new genomic region filter in just the same way. You
would be able to upload as many segments as you want.
For instance, in a web service call:
<Filter name = "genomic_region" value = "7:115597757:117475182"/>
or
<Filter name = "genomic_region" value =
"7:115597757:117475182,1:100:100000,12:1000000:4000000"/>
The equivalent Perl API call would be
$query->addFilter("genomic_region", ["7:115597757:117475182"]);
or
$query->addFilter("genomic_region",
["7:115597757:117475182,1:100:100000,12:1000000:4000000"]);
Hope that will enable you to feed the BioMart system with multiple region
values.
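As a rough illustration of the value format described above, here is a minimal Python sketch that joins several regions into one comma-separated chr:start:end string and wraps it in a Filter element. The helper names region_value and region_filter_xml are hypothetical and not part of any BioMart API, and the filter name "genomic_region" is taken from the proposal in this thread, so it may differ in the released configuration.

```python
import xml.etree.ElementTree as ET


def region_value(regions):
    """Join (chrom, start, end) tuples into 'chr:start:end,chr:start:end'.

    Hypothetical helper: it only formats the string proposed in this
    thread and does not talk to BioMart.
    """
    parts = []
    for chrom, start, end in regions:
        if start > end:
            raise ValueError(f"start {start} is after end {end} on {chrom}")
        parts.append(f"{chrom}:{start}:{end}")
    return ",".join(parts)


def region_filter_xml(regions, name="genomic_region"):
    """Return a <Filter> element as text for the given regions.

    The default filter name is an assumption based on the example above.
    """
    elem = ET.Element("Filter", {"name": name, "value": region_value(regions)})
    return ET.tostring(elem, encoding="unicode")


if __name__ == "__main__":
    spans = [("7", 115597757, 117475182), ("1", 100, 100000)]
    print(region_filter_xml(spans))
```

The resulting element would still need to be placed inside a complete query document before being sent to the web service.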
regards
syed
On Sat, 2007-11-17 at 19:10 +0000, Arek Kasprzyk wrote:
On 14 Nov 2007, at 07:32, David M. Goodstein wrote:
I was wondering if there are any shortcuts that enable UCSC table
browser-style intersection queries in BioMart. The typical
application would be to grab all the genes that overlap a given set
of sequences (e.g., ESTs) aligned to the reference genome. Or
does one need to retrieve all the spans for the alignments in
question and then directly query BioMart for overlap with each
span?
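To make the intended "overlap" concrete, here is a minimal Python sketch of the intersection being asked about, assuming closed (inclusive) coordinates and plain tuples for genes and alignment spans. It is not how BioMart or the UCSC table browser implements the query, and the function names and sample data are hypothetical.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two closed intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def genes_overlapping(genes, spans):
    """Return names of genes that overlap any span on the same chromosome.

    genes: iterable of (name, chrom, start, end)
    spans: iterable of (chrom, start, end)
    Every gene is compared with every span, so the cost grows with
    len(genes) * len(spans).
    """
    spans = list(spans)
    hits = []
    for name, chrom, start, end in genes:
        if any(
            chrom == s_chrom and overlaps(start, end, s_start, s_end)
            for s_chrom, s_start, s_end in spans
        ):
            hits.append(name)
    return hits


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    genes = [("geneA", "7", 100, 500), ("geneB", "7", 900, 1200), ("geneC", "1", 100, 500)]
    est_spans = [("7", 450, 600), ("1", 600, 700)]
    print(genes_overlapping(genes, est_spans))
```

Because each gene is checked against each span, a large set of alignments makes this pairwise approach expensive.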
Hi David,
Syed (cc'ed on this email) looked into the fix in more detail and it
appears that we would be able to implement it with not too much
trouble.
This means that we could add it to the existing Ensembl mart config
so that it is available to you as early as Ensembl 48 (scheduled for
early December).
Would that work for you?
Syed, could you give David a code snippet for the api and web
service of how this would work in theory so we could make sure that
this is what he wants before implementing anything? :)
cheers,
a.
regards,
-David
David M. Goodstein
Computational Genomics Group
Joint Genome Institute
Lawrence Berkeley National Lab
------------------------------------------------------------------------
Arek Kasprzyk
EMBL-European Bioinformatics Institute.
Wellcome Trust Genome Campus, Hinxton,
Cambridge CB10 1SD, UK.
Tel: +44-(0)1223-494606
Fax: +44-(0)1223-494468
------------------------------------------------------------------------