Hi, I am experiencing a massive performance issue on a query with a left join in 3.24, now taking 40 seconds, whereas in 3.22 it was just a few milliseconds. The problematic query looks like
select d.key_field, count(*) nb from low_volume_table b join mid_volume_table c on c.key_b = b.id left join high_volume_table d on d.key_c = c.id where b.id >= $1 and b.filter1 = 0 and c.filter2 > 0 and d.filter3 > 0 and d.filter4 = 1 group by d.key_field order by nb desc The filter fields on it are relatively non-discriminant (and non_indexed), however the key_field is indexed. The most discriminating conditions in this query are those on the low_volume and mid_volume tables, but the optimizer selects as first action: SCAN TABLE high_volume_table USING INDEX key_field_idx which leads to a huge number of iterations. If on the other hand, just one of the d filter conditions is removed, then the optimizer goes (like 3.22) first for SEARCH TABLE low_volume_table AS b USING COVERING INDEX low_volume_table_id_idx (b.filter1=? AND rowid>?) This happens after running ANALYZE, the sqlite1_stat for the high_volume table and key_field_idx is 5855234 6 while for the low_volume_table_filter1_idx it is 1976628 988314 While the low_volume_table_filter1_idx does not look very selective, as it is coupled with rowid filtering, it is actually very effective as it combines rowid & filter1, so there are just thousandths of rows being considered in the "group by", while when starting from a key_field_idx, there are millions of rows being considered, the overwhelming majority not fulfilling the conditions. The above table names and fields have been anonymized, if someone from the SQLite team want to have a look at the actual data, I can provide a database download (it's about 1.7 GB) Thanks! _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users