Re: [sqlite] How to optimize a multi-condition query
Hi, thanks to everybody that answered. I tried your suggestions but there was no measurable improvement. Possibly this is the best what I can get out of Sqlite. However, I tried a similar query on a larger table using both SQlite and Postgresql (same machine, same table structure + indices, same data). Postrgresql took 25secs for this query while it took more than 1 hour with SQlite. Obviously there is room for improvement. If I manage to find out what makes the difference I will post it here. Thanks again and best regards, Hilmar El 10/15/2010 4:44 PM, Igor Tandetnik escribió: > Hilmar Berger wrote: >> Indexes has been created for all fields in A and B (e.g. create index >> name on A(chr, start, stop, strand)); >> >> The query is as follows: >> >> select * from >> a, b >> where a.chr = b.chr and a.strand = b.strand and a.start<= b.start >> and a.stop>= b.stop and b.start<= a.stop and a.start<= b.stop ; > Drop those indexes you created, change the query to > > select * from a, b > where a.chr = b.chr and a.strand = b.strand > and b.start between a.start and a.stop > and b.stop<= a.stop; > > and create a single index on b(chr, strand, start) or b(chr, strand, start, > stop), listing fields in this exact order (adding stop to the index may or > may not result in marginal performance improvement - experiment). ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How to optimize a multi-condition query
Thanks ! However, I tried what you suggested and there was no change. I used EXPLAIN QUERY PLAN on the query and it looks like it does not use any index on b at all, only if I use hardcoded conditions like b > 0. It appears that the real problem is that SQlite does not use indices for both tables, e.g.: CREATE INDEX a_chr on a(chr) CREATE INDEX b_chr on b(chr) explain query plan select a.chr, b.chr from b, a where b.chr = a.chr; Output: 00TABLE b 11TABLE a WITH INDEX a_chr There is an example on how to use multiple indices on the same table here: http://www.sqlite.org/cvstrac/wiki?p=PerformanceTuning However, I'm not sure how to extend this to joined tables. Thanks ! Best regards, Hilmar El 10/14/2010 7:12 PM, Simon Slavin escribió: > On 14 Oct 2010, at 5:46pm, Hilmar Berger wrote: > >> The query is as follows: >> >> select * from >> a, b >> where a.chr = b.chr and a.strand = b.strand and a.start<= b.start >> and a.stop>= b.stop and b.start<= a.stop and a.start<= b.stop ; >> Indexes has been created for all fields in A and B (e.g. >> create index name on A(chr, start, stop, strand)); > I not certain from your phrasing, but this may not do what you think. It is > different matter to do > > create index Achr on A (chr) > create index Astart on A (start) > create index Astop on A (stop) > ... > > to what you did above. One creates one index on the sequence of four > variables, the other creates four indexes each on one variable. > > However, to make this SELECT go fast, > CREAT INDEX Bkey ON B (chr,strand) > CREAT INDEX Bstart ON B (start) > CREAT INDEX Bstop ON B (stop) > > May help. > > Also I recommend doing this: > > http://www.sqlite.org/lang_analyze.html > > Simon. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] How to optimize a multi-condition query
Hi all, I have a query that does not perform as fast as expected and would like to know if anyone here has an idea on how to optimize it. There are two tables, A and B. Both have fields chr, start, stop and strand. A has about 50k entries, B about 12k. Both contain intervals defined by start...stop. The intervals in B have length 1, those in A any possible length. I want to identify all intervals in B that fall into one of those in A and get the corresponding interval data from A for each match of B. Indexes has been created for all fields in A and B (e.g. create index name on A(chr, start, stop, strand)); The query is as follows: select * from a, b where a.chr = b.chr and a.strand = b.strand and a.start <= b.start and a.stop >= b.stop and b.start <= a.stop and a.start <= b.stop ; This query takes about 130 seconds, but if when I remove the two last conditions it drops to 0.5 seconds. Replacing both last conditions with conditions like b.start > 1000 or b.start <> 0 the execution time will stay at 0.5 seconds. Any help appreciated. Thanks ! Hilmar ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users