Re: [sqlite] Performance issue with left join in 3.24 compared with 3.22

2018-06-26 Thread Eric Grange
Thanks Richard!

Changing the inner join to a cross join works as well in that case, though
is it enough to always disable the left join optimization ?

I have other variants of the query with different/more left joined
tables/subqueries, and varying filtering conditions, as the query
is dynamically generated from user options and filters (which can indeed
lead to SQL that is not really "optimal").

Is having a cross join somewhere among the joins enough to "disable" the
left join strength reduction for other joins?

On Tue, Jun 26, 2018 at 5:58 PM, Richard Hipp  wrote:

> On 6/26/18, Eric Grange  wrote:
> > I am experiencing a massive performance issue on a query with a left join
> > in 3.24, now taking 40 seconds, whereas in 3.22 it was just a few
> > milliseconds.
> > The problematic query looks like
> >
> >   select d.key_field, count(*) nb
> >   from low_volume_table b
> >   join mid_volume_table c on c.key_b = b.id
> >   left join high_volume_table d on d.key_c = c.id
> >   where b.id >= $1
> >   and b.filter1 = 0
> >   and c.filter2 > 0
> >   and d.filter3 > 0
> >   and d.filter4 = 1
> >   group by d.key_field
> >   order by nb desc
>
> Dan bisected and found the speed reduction coincides with the
> introduction of the LEFT JOIN strength reduction optimization
> (https://sqlite.org/optoverview.html#leftjoinreduction) in version
> 3.23.0.  (Dan bisected to the specific check-in
> https://sqlite.org/src/info/dd568c27).
>
> Your work-around is to change the LEFT JOIN into CROSS JOIN, thus
> forcing the query planner to preserve the same join order as it did
> before the string reduction optimization.
>
> We (the SQLite devs) will take an action to try to improve the query
> planner so that it picks a better plan for your case.
>
>
> --
> D. Richard Hipp
> d...@sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Performance issue with left join in 3.24 compared with 3.22

2018-06-26 Thread Richard Hipp
On 6/26/18, Eric Grange  wrote:
> I am experiencing a massive performance issue on a query with a left join
> in 3.24, now taking 40 seconds, whereas in 3.22 it was just a few
> milliseconds.
> The problematic query looks like
>
>   select d.key_field, count(*) nb
>   from low_volume_table b
>   join mid_volume_table c on c.key_b = b.id
>   left join high_volume_table d on d.key_c = c.id
>   where b.id >= $1
>   and b.filter1 = 0
>   and c.filter2 > 0
>   and d.filter3 > 0
>   and d.filter4 = 1
>   group by d.key_field
>   order by nb desc

Dan bisected and found the speed reduction coincides with the
introduction of the LEFT JOIN strength reduction optimization
(https://sqlite.org/optoverview.html#leftjoinreduction) in version
3.23.0.  (Dan bisected to the specific check-in
https://sqlite.org/src/info/dd568c27).

Your work-around is to change the LEFT JOIN into CROSS JOIN, thus
forcing the query planner to preserve the same join order as it did
before the string reduction optimization.

We (the SQLite devs) will take an action to try to improve the query
planner so that it picks a better plan for your case.


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Performance issue with left join in 3.24 compared with 3.22

2018-06-26 Thread Eric Grange
Also ran a few index to "force" the query plan, but with limited success:

- the "indexed by" clause does not result in the optimizer using the index
first, it just uses the indexes in the later steps of the query plan.
- using "not indexed" still results in the same table scan of
high_volume_table first, just without the index.
- using the unary "+" on the d table filters has no effect on the query
plan (as these are not indexed in the first place I guess)

Using unlikely() on the d table filters seems to be the only option that
works.


On Tue, Jun 26, 2018 at 10:02 AM, Eric Grange  wrote:

> Hi,
>
> I am experiencing a massive performance issue on a query with a left join
> in 3.24, now taking 40 seconds, whereas in 3.22 it was just a few
> milliseconds.
> The problematic query looks like
>
>   select d.key_field, count(*) nb
>   from low_volume_table b
>   join mid_volume_table c on c.key_b = b.id
>   left join high_volume_table d on d.key_c = c.id
>   where b.id >= $1
>   and b.filter1 = 0
>   and c.filter2 > 0
>   and d.filter3 > 0
>   and d.filter4 = 1
>   group by d.key_field
>   order by nb desc
>
> The filter fields on it are relatively non-discriminant (and
> non_indexed), however the key_field is indexed.
>
> The most discriminating conditions in this query are those on the
> low_volume and mid_volume tables, but the optimizer
> selects as first action:
>
> SCAN TABLE high_volume_table USING INDEX key_field_idx
>
> which leads to a huge number of iterations.
>
> If on the other hand, just one of the d filter conditions is removed, then
> the optimizer goes (like 3.22) first for
>
>SEARCH TABLE low_volume_table AS b USING COVERING INDEX
> low_volume_table_id_idx (b.filter1=? AND rowid>?)
>
> This happens after running ANALYZE, the sqlite1_stat for the high_volume
> table and key_field_idx is
>
>  5855234 6
>
> while for the  low_volume_table_filter1_idx it is
>
>  1976628 988314
>
> While the  low_volume_table_filter1_idx does not look very selective, as
> it is coupled with rowid filtering, it is actually very effective
> as it combines rowid & filter1, so there are just thousandths of rows
> being considered in the "group by", while when starting from
> a key_field_idx, there are millions of rows being considered, the
> overwhelming majority not fulfilling the conditions.
>
> The above table names and fields have been anonymized, if someone from the
> SQLite team want to have a look at the actual data,
> I can provide a database download (it's about 1.7 GB)
>
> Thanks!
>
>
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users