I patched my SQlite 3.24 code to include the fix from the ticket

<   if( rc==SQLITE_OK && (mBest = (pNew->prereq & ~mPrereq))!=0 ){
---
> // from SQLite bugfix
>  if( rc==SQLITE_OK && ((mBest = (pNew->prereq & ~mPrereq))!=0 || bIn) ){

and changed the xBestIndex return value to be lower if the equality constraint 
from IN is not usable

The generated code as reported is invalid (instruction 16 with the init of R6 
is not shown)

explain select lsn from atx_txlog where period_no between 7300 and 7313 and 
event_Type in (140001,180001);

addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     16    0                    00  Start at 16
1     VOpen          0     0     0     vtab:B90B50    00
2     Explain        2     0     0     SCAN TABLE atx_txlog VIRTUAL TABLE INDEX 
1:
  00
3     Integer        7300  3     0                    00  r[3]=7300
4     Integer        7313  4     0                    00  r[4]=7313
5     Integer        1     1     0                    00  r[1]=1
6     Integer        2     2     0                    00  r[2]=2
7     VFilter        0     15    1
              00  iplan=r[1] zplan='
'
8       Noop           0     0     0                    00  begin IN expr
9       VColumn        0     15    5                    00  r[5]=vcolumn(15); 
atx_txlog.event_type
10      Eq             5     12    6     (BINARY)       43  if r[6]==r[5] goto 
12
11      Ne             5     14    7     (BINARY)       53  if r[7]!=r[5] goto 
14; end IN expr
12      VColumn        0     21    8                    00  r[8]=vcolumn(21); 
atx_txlog.lsn
13      ResultRow      8     1     0                    00  output=r[8]
14    VNext          0     8     0                    00
15    Halt           0     0     0                    00
17    Integer        180001  7     0                    00  r[7]=180001
18    Goto           0     1     0                    00

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von Dan Kennedy
Gesendet: Freitag, 29. März 2019 14:30
An: sqlite-users@mailinglists.sqlite.org
Betreff: [EXTERNAL] Re: [sqlite] Query Regression IN and Virtual Tables


On 29/3/62 14:32, Hick Gunter wrote:
> When upgrading from 3.7.14.1 to 3.24 I noticed the following problem
>
> Given a virtual table like
>
> CREATE VIRTUAL TABLE vt ( key1 INTEGER, key2 INTEGER, key3 INTEGER,
> attr1 INTEGER,...);
>
> whose xBestIndex function simulates (in unsupported syntax)
>
> CREATE VIRTUAL INDEX vt_key ON vt (key1, key2, key3);
>
> but also handles simple comparisons internally, the query
>
> SELECT * FROM vt WHERE key1 = <val1> AND key2 BETWEEN <val2> AND
> <val3> AND attr1 IN (<list>);


Thanks for reporting this.

What is supposed to happen in this scenario is that xBestIndex() is invoked 
once with all 4 constraints marked as usable. The IN(...) is represented as an 
SQLITE_INDEX_CONSTRAINT_EQ constraint. If the
xBestIndex() implementation elects to use the IN(...) operator, then
xBestIndex() is invoked a second time, this time with the IN(...) marked as not 
usable. SQLite evaluates both plans, considering the cost estimates provided by 
the virtual table implementation and its own estimate of the cardinality of the 
IN(...) operator. And chooses the most efficient plan overall.

There was a bug preventing the second call to xBestIndex() from being made in 
some circumstances - including for your query. Now fixed here:

   https://sqlite.org/src/info/f5752517f590b37b

So if you upgrade to trunk, or else apply the patch linked above to 3.27.2, and 
the virtual table implementation provides relatively accurate cost estimates, 
SQLite should make an intelligent decision about which plan to use.

Dan.


>
> SQLite 3.7.14.1 asks for (key1,'='), (key2,'>='), (key2,'<=') and
> xBestIndex accepts all 3 constraints yielding query plan
>
> - materialize IN <list> as anonymous ephemeral table
> - search vt using index 1 (key1 = ? AND key2 >= ? AND key2 <= ?)
>    - retrieve column attr1
>    - search anonymous ephemeral table
>
> i.e. perform a single partial table scan on vt and check attr1
>
>
> SQLite3.24 asks for ...,(attr1,'=') and xBestIndex accepts all 4
> constraints yielding
>
> - materialize IN (<list>) as anonymous ephemeral table
> - scan anonymous ephemeral table
>    - search vt using index 1 (key1 = ? AND key2 >= ? AND key2 <= ? AND
> attr1 = ?)
>
> i.e. perform a partial table scan of vt FOR EACH attr1, which is
> slower by the cardinality of the IN list
>
> Fortunately, CTEs come to the rescue:
>
> WITH attrs (attr1) AS (VALUES <(list)>) SELECT * FROM vt CROSS JOIN
> attrs a ON (a.attr1 = vt.attr1) WHERE key1 = <val1> AND key2 BETWEEN
> <val2> AND <val3>
>
> This prevents SQLite 3.24 from adding the last constraint, yielding
>
> - materialize IN (<(list)>) as epehemeral table attrs
> - search vt using index 1 (key1 = ? AND key2 >= ? AND key2 <= ?)
>    - retrieve column attr1
>    - scan ephemeral table attrs
>
> The only issue is that the previously generated ephemeral table was
> implemented as a covering index (allowing the sequence IfNull,
> Affinity, NotFound) whereas the named ephemeral table is implemented
> as a table (requiring a full table scan of the ephemeral table, even
> though at most 1 row can match)
>
> Optimisation opportunity:
>
> 32      Rewind         1     40    0                    00
> 33        Column         1     0     10                   00  
> r[10]=events.event_type
> 34        VColumn        0     15    11                   00  
> r[11]=vcolumn(15); atx_txlog.event_type
> 35        Ne             11    38    10    (BINARY)       53  if r[10]!=r[11] 
> goto 38
> 36        VColumn        0     6     12                   00  
> r[12]=vcolumn(6); atx_txlog.sync_offset
> 37        ResultRow      12    1     0                    00  output=r[12]
> 38      Next           1     33    0                    01
>
> Could IMHO be rewritten as
>
> 32      VColumn        0     15    11                   00  
> r[11]=vcolumn(15); atx_txlog.event_type
> 33      Rewind         1     40    0                    00
> 34        Column         1     0     10                   00  
> r[10]=events.event_type
> 35        Ne             11    38    10    (BINARY)       53  if r[10]!=r[11] 
> goto 38
> 36        VColumn        0     6     12                   00  
> r[12]=vcolumn(6); atx_txlog.sync_offset
> 37        ResultRow      12    1     0                    00  output=r[12]
> 38      Next           1     33    0                    01
>
>
> ___________________________________________
>   Gunter Hick | Software Engineer | Scientific Games International
> GmbH | Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR:
> 0430013 | (O) +43 1 80100 - 0
>
> May be privileged. May be confidential. Please delete if not the addressee.
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick | Software Engineer | Scientific Games International GmbH | 
Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 
1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to