Add "cross" before the first "join" to force the first table into the outermost loop
-----Original Message----- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On behalf of Eric Grange Sent: Tuesday, 26 June 2018 10:13 To: General Discussion of SQLite Database <sqlite-users@mailinglists.sqlite.org> Subject: [EXTERNAL] Re: [sqlite] Performance issue with left join in 3.24 compared with 3.22

Also tried a few index hints to "force" the query plan, but with limited success:
- the "indexed by" clause does not make the optimizer use the index first; it just uses the index in later steps of the query plan.
- using "not indexed" still results in the same table scan of high_volume_table first, just without the index.
- using the unary "+" on the d table filters has no effect on the query plan (as these columns are not indexed in the first place, I guess).

Using unlikely() on the d table filters seems to be the only option that works.

On Tue, Jun 26, 2018 at 10:02 AM, Eric Grange <zar...@gmail.com> wrote:
> Hi,
>
> I am experiencing a massive performance issue on a query with a left
> join in 3.24, now taking 40 seconds, whereas in 3.22 it took just a few
> milliseconds.
> The problematic query looks like:
>
>   select d.key_field, count(*) nb
>     from low_volume_table b
>     join mid_volume_table c on c.key_b = b.id
>     left join high_volume_table d on d.key_c = c.id
>    where b.id >= $1
>      and b.filter1 = 0
>      and c.filter2 > 0
>      and d.filter3 > 0
>      and d.filter4 = 1
>    group by d.key_field
>    order by nb desc
>
> The filter fields on it are relatively non-discriminant (and
> non-indexed); the key_field, however, is indexed.
>
> The most discriminating conditions in this query are those on the
> low_volume and mid_volume tables, but the optimizer selects as its
> first action:
>
>   SCAN TABLE high_volume_table USING INDEX key_field_idx
>
> which leads to a huge number of iterations.
>
> If, on the other hand, just one of the d filter conditions is removed,
> the optimizer goes (like 3.22) first for:
>
>   SEARCH TABLE low_volume_table AS b USING COVERING INDEX
>   low_volume_table_id_idx (b.filter1=? AND rowid>?)
>
> This happens after running ANALYZE; the sqlite_stat1 entry for the
> high_volume table and key_field_idx is
>
>   5855234 6
>
> while for low_volume_table_filter1_idx it is
>
>   1976628 988314
>
> While low_volume_table_filter1_idx does not look very selective on its
> own, it is actually very effective when coupled with the rowid
> filtering, as it combines rowid & filter1: only thousands of rows end
> up being considered in the "group by". When starting from
> key_field_idx, millions of rows are considered, the overwhelming
> majority of which do not fulfil the conditions.
>
> The above table names and fields have been anonymized; if someone from
> the SQLite team wants to have a look at the actual data, I can provide
> a database download (it's about 1.7 GB).
>
> Thanks!

___________________________________________
Gunter Hick | Software Engineer | Scientific Games International GmbH | Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 1 80100 - 0
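For reference, a minimal sketch of the unlikely() workaround Eric describes above, again using the anonymized placeholder names:

  -- unlikely(X) returns X unchanged but hints to the query planner that
  -- the term is usually false, lowering its estimated selectivity so the
  -- planner no longer favours scanning high_volume_table first.
  select d.key_field, count(*) nb
    from low_volume_table b
    join mid_volume_table c on c.key_b = b.id
    left join high_volume_table d on d.key_c = c.id
   where b.id >= $1
     and b.filter1 = 0
     and c.filter2 > 0
     and unlikely(d.filter3 > 0)
     and unlikely(d.filter4 = 1)
   group by d.key_field
   order by nb desc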