Add "cross" before the first "join" to force the first table into the outermost 
loop

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von Eric Grange
Gesendet: Dienstag, 26. Juni 2018 10:13
An: General Discussion of SQLite Database <sqlite-users@mailinglists.sqlite.org>
Betreff: [EXTERNAL] Re: [sqlite] Performance issue with left join in 3.24 
compared with 3.22

Also ran a few index to "force" the query plan, but with limited success:

- the "indexed by" clause does not result in the optimizer using the index 
first, it just uses the indexes in the later steps of the query plan.
- using "not indexed" still results in the same table scan of high_volume_table 
first, just without the index.
- using the unary "+" on the d table filters has no effect on the query plan 
(as these are not indexed in the first place I guess)

Using unlikely() on the d table filters seems to be the only option that works.


On Tue, Jun 26, 2018 at 10:02 AM, Eric Grange <zar...@gmail.com> wrote:

> Hi,
>
> I am experiencing a massive performance issue on a query with a left
> join in 3.24, now taking 40 seconds, whereas in 3.22 it was just a few
> milliseconds.
> The problematic query looks like
>
>       select d.key_field, count(*) nb
>       from low_volume_table b
>       join mid_volume_table c on c.key_b = b.id
>       left join high_volume_table d on d.key_c = c.id
>       where b.id >= $1
>       and b.filter1 = 0
>       and c.filter2 > 0
>       and d.filter3 > 0
>       and d.filter4 = 1
>       group by d.key_field
>       order by nb desc
>
> The filter fields on it are relatively non-discriminant (and
> non_indexed), however the key_field is indexed.
>
> The most discriminating conditions in this query are those on the
> low_volume and mid_volume tables, but the optimizer selects as first
> action:
>
>     SCAN TABLE high_volume_table USING INDEX key_field_idx
>
> which leads to a huge number of iterations.
>
> If on the other hand, just one of the d filter conditions is removed,
> then the optimizer goes (like 3.22) first for
>
>    SEARCH TABLE low_volume_table AS b USING COVERING INDEX
> low_volume_table_id_idx (b.filter1=? AND rowid>?)
>
> This happens after running ANALYZE, the sqlite1_stat for the
> high_volume table and key_field_idx is
>
>      5855234 6
>
> while for the  low_volume_table_filter1_idx it is
>
>      1976628 988314
>
> While the  low_volume_table_filter1_idx does not look very selective,
> as it is coupled with rowid filtering, it is actually very effective
> as it combines rowid & filter1, so there are just thousandths of rows
> being considered in the "group by", while when starting from a
> key_field_idx, there are millions of rows being considered, the
> overwhelming majority not fulfilling the conditions.
>
> The above table names and fields have been anonymized, if someone from
> the SQLite team want to have a look at the actual data, I can provide
> a database download (it's about 1.7 GB)
>
> Thanks!
>
>
>
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick | Software Engineer | Scientific Games International GmbH | 
Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 
1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to