Also ran a few index to "force" the query plan, but with limited success:
- the "indexed by" clause does not result in the optimizer using the index first, it just uses the indexes in the later steps of the query plan. - using "not indexed" still results in the same table scan of high_volume_table first, just without the index. - using the unary "+" on the d table filters has no effect on the query plan (as these are not indexed in the first place I guess) Using unlikely() on the d table filters seems to be the only option that works. On Tue, Jun 26, 2018 at 10:02 AM, Eric Grange <zar...@gmail.com> wrote: > Hi, > > I am experiencing a massive performance issue on a query with a left join > in 3.24, now taking 40 seconds, whereas in 3.22 it was just a few > milliseconds. > The problematic query looks like > > select d.key_field, count(*) nb > from low_volume_table b > join mid_volume_table c on c.key_b = b.id > left join high_volume_table d on d.key_c = c.id > where b.id >= $1 > and b.filter1 = 0 > and c.filter2 > 0 > and d.filter3 > 0 > and d.filter4 = 1 > group by d.key_field > order by nb desc > > The filter fields on it are relatively non-discriminant (and > non_indexed), however the key_field is indexed. > > The most discriminating conditions in this query are those on the > low_volume and mid_volume tables, but the optimizer > selects as first action: > > SCAN TABLE high_volume_table USING INDEX key_field_idx > > which leads to a huge number of iterations. > > If on the other hand, just one of the d filter conditions is removed, then > the optimizer goes (like 3.22) first for > > SEARCH TABLE low_volume_table AS b USING COVERING INDEX > low_volume_table_id_idx (b.filter1=? AND rowid>?) > > This happens after running ANALYZE, the sqlite1_stat for the high_volume > table and key_field_idx is > > 5855234 6 > > while for the low_volume_table_filter1_idx it is > > 1976628 988314 > > While the low_volume_table_filter1_idx does not look very selective, as > it is coupled with rowid filtering, it is actually very effective > as it combines rowid & filter1, so there are just thousandths of rows > being considered in the "group by", while when starting from > a key_field_idx, there are millions of rows being considered, the > overwhelming majority not fulfilling the conditions. > > The above table names and fields have been anonymized, if someone from the > SQLite team want to have a look at the actual data, > I can provide a database download (it's about 1.7 GB) > > Thanks! > > > _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users