Hi Kolja,
DISTINCT would not work for all queries, because the library does not
always execute just one SQL statement to retrieve the results.
Sometimes a query is spread across multiple datasets at different
locations, which may well be running on different databases too. In
that case the duplicates have to be removed in the library rather than
at the SQL level. The cost of that merge really comes down to the
spread of attribute values in the database.
However, for single table queries we can improve the speed as you
suggested. We will add this optimization in the next release.
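
To illustrate the multi-dataset case, here is a minimal sketch in Python
with sqlite3. It is not BioMart code: the table name "main", the column
"attr" and both helper functions are made up for the example, and two
in-memory databases stand in for datasets at different locations. Each
database can apply DISTINCT to its own rows, but a value that occurs in
both still has to be merged on the library side.

```python
import sqlite3

def make_dataset(values):
    """Build a hypothetical in-memory dataset with a single 'attr' column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE main (attr TEXT)")
    conn.executemany("INSERT INTO main (attr) VALUES (?)", [(v,) for v in values])
    return conn

def unique_values(datasets):
    """Run DISTINCT on each database, then merge the results in the caller."""
    seen = set()
    for conn in datasets:
        # Each database only removes its own duplicates.
        for (value,) in conn.execute("SELECT DISTINCT attr FROM main"):
            # Values shared between databases are collapsed here.
            seen.add(value)
    return sorted(seen)

site_a = make_dataset(["x", "y", "x", "x"])
site_b = make_dataset(["y", "z", "z"])
result = unique_values([site_a, site_b])
print(result)  # ['x', 'y', 'z']
```

With a single dataset the loop runs once and the set adds nothing, which
is why the single table case can be left to the database.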
Thanks
Syed
Kolja Henckel wrote:
Hello there!
I use the BioMart Perl API and have just run into a "problem":
When using the option
uniqueRowsOnly(1)
the API fetches all rows matching the query and afterwards discards the
duplicates (so that only unique rows are returned or printed).
My problem is that I have about 1 million data rows and only 18 different
values in the desired attribute.
This means that the query takes about 10 minutes to return 18
values.
Is it possible (or planned, or already implemented somewhere, somehow?)
to implement the uniqueRowsOnly-option using the SELECT DISTINCT option
of SQL?
In this case the query should complete within seconds...
Cheers, Kolja
PS: thanks for the great mart, anyway :)