I'm struggling with some odd behaviour in martview from BioMart 0.7 (as
deployed at plants.ensembl.org/biomart/martview). The problem is that
retrieving sequences for variations through the interface takes a long
time. If I select plants variation, choose Arabidopsis, pick a strain
as a filter, and select sequence plus flanking regions, martview can
take several minutes to return the first 10 rows (at least the first
time...). I've tried various things, such as optimising the tables and
adding extra indices, but the uncached query remains very slow.
As far as I can tell, there are two things at play. First, martview
retrieves and processes 200 rows when only the first 10 are needed.
Second, and this seems to be the real killer, the query has an ORDER BY
clause which forces a filesort (if you remove the ORDER BY, the query
is very fast).
I presume the ORDER BY is needed for the LIMIT-based chunked retrieval
mechanism used for downloading large sets, but I'm wondering why
martview does it at all when previewing 10 rows (or even for a chunk
of 200). Is there any way to change the behaviour here? Or am I missing
something obvious?
Thanks,
Dan.
--
Dan Staines, PhD Ensembl Genomes Technical Coordinator
EMBL-EBI Tel: +44-(0)1223-492507
Wellcome Trust Genome Campus Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK http://www.ensemblgenomes.org/