Hi Denny

Just flip the dataset order and it will filter the second dataset
first.

DS_1
DS_2 (with filter)

This should speed up the process :)
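
To see why the order matters, here is a rough Python sketch of the batched
join described in the quoted message below. It is only an illustration, not
BioMart's actual code: the function name batched_join and the sample keys are
made up, and the only figures taken from the thread are the batch size of 200
and the 500,000 records. It assumes each batch from the driving dataset costs
one lookup query against the other dataset.

```python
BATCH_SIZE = 200  # batch size reported in the log file


def batched_join(driving_keys, other_keys, batch_size=BATCH_SIZE):
    """Join two key lists by walking the driving dataset in batches.

    Hypothetical model: every batch taken from the driving dataset
    counts as one lookup query against the other dataset.
    """
    other = set(other_keys)
    matches = []
    queries = 0
    for start in range(0, len(driving_keys), batch_size):
        batch = driving_keys[start:start + batch_size]
        queries += 1  # one lookup per batch
        matches.extend(key for key in batch if key in other)
    return matches, queries


if __name__ == "__main__":
    unfiltered = list(range(500_000))  # large dataset, no filter
    filtered = [10, 250_000]           # filter leaves a couple of records

    # Unfiltered dataset drives the join: every batch must be visited.
    print(batched_join(unfiltered, filtered)[1])  # 2500 lookups

    # Filtered dataset drives the join: a single batch is enough.
    print(batched_join(filtered, unfiltered)[1])  # 1 lookup
```

Both orders return the same matching records; only the number of lookups
changes, which is the effect the flip is meant to achieve.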

cheers
syed



On Thu, 2008-05-15 at 11:57 -0400, Chan, Denny (NIH/NCI) [C] wrote:
> I have a problem joining two large datasets.
> 
> When there is no filter, the results came back in reasonable time.
> 
> When I specified a filter in the first dataset to limit the result, it
> took a long time to execute.
> 
> The log file indicated that Biomart-perl got the first 200 records
> from the second dataset and used those 200 records to filter the first
> dataset in a separate SQL statement.  Biomart-perl then took another
> 200 records from the second dataset and filtered the first dataset
> again.  The system looped through 200 records at a time until it had
> exhausted all the records in the second dataset.  It took a while to
> loop through 500,000 records in the second dataset.
> 
> Even though adding a filter to the first dataset gave back only a couple
> of records, it would still take a long time to loop through the second
> dataset.
> 
> Is there a way to speed up the dataset join or change the way that
> BioMart handles dataset joins with a filter?
> 
> Thanks,
> Denny 
-- 
======================================
Syed Haider.
EMBL-European Bioinformatics Institute
Wellcome Trust Genome Campus, Hinxton,
Cambridge CB10 1SD, UK.
======================================
