Re: [sqlite] Weird query plan when sorting by a joined column
Hi Richard, On 16.05.2018 12:32, Richard Hipp wrote: >> Actually it is: SQLite will generate the same query plan for both >> queries as long as no statistics data is available. Adding the >> statistics information will lead to the more complex and slow query plan >> which is why I kept the statistics in the example SQL. > > Can you please send the output of ".fullschema" so that I can try to > reproduce your problem? I sent the steps to reproduce this problem to the list last week. As there was no reaction from you so far I assume that this was enough information? Anyway, should you need more for reproducing this (assuming that you are interested at all) please contact me directly as I do not have the time to regularly read the sqlite-users list. Our application now uses a query that leads to the better query plan so there is no hurry to fix this, I am just curious about what might cause it. Greetings, Torsten -- $---+1+2+3+4+5+----6+ SCALE GmbH Niederlassung Dresden Torsten Landschoff Pohlandstraße 19 01309 Dresden Tel: +49-351-312002-10 Fax: +49-351-312002-29 SCALE GmbH Registergericht und Sitz: Ingolstadt, HRB 6384 Geschäftsführer: Dr.-Ing. Heiner Müllerschön, Dipl.-Math. Ulrich Franz ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Weird query plan when sorting by a joined column
On 16.05.2018 12:32, Richard Hipp wrote: > On 5/16/18, Torsten Landschoff wrote: >> >> Actually it is: SQLite will generate the same query plan for both >> queries as long as no statistics data is available. Adding the >> statistics information will lead to the more complex and slow query plan >> which is why I kept the statistics in the example SQL. > > Can you please send the output of ".fullschema" so that I can try to > reproduce your problem? > Here you go: --- torsten.landschoff@horatio:~$ /opt/sqlite3/bin/sqlite3 demo.db SQLite version 3.23.2 2018-05-09 02:23:29 Enter ".help" for usage hints. sqlite> .fullschema CREATE TABLE base ( id INTEGER NOT NULL PRIMARY KEY, modification_time DATETIME NOT NULL ); CREATE UNIQUE INDEX ix_base_mtime ON base (modification_time); CREATE TABLE derived (id INTEGER NOT NULL PRIMARY KEY); ANALYZE sqlite_master; ANALYZE sqlite_master; ANALYZE sqlite_master; --- Please note that I included a full example to reproduce the problem just by running --- sqlite3 < demo.sql --- It's in my own reply to the original email. Thank you for looking at this! Greetings, Torsten -- $---+1+2+3----+4----+5+6+ SCALE GmbH Niederlassung Dresden Torsten Landschoff Pohlandstraße 19 01309 Dresden Tel: +49-351-312002-10 Fax: +49-351-312002-29 SCALE GmbH Registergericht und Sitz: Ingolstadt, HRB 6384 Geschäftsführer: Dr.-Ing. Heiner Müllerschön, Dipl.-Math. Ulrich Franz ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Weird query plan when sorting by a joined column
On 16.05.2018 01:48, Keith Medcalf wrote: > > There is no LEFT join visible to anyone except someone who is in love > with LEFT joins without knowing what one is. Sorry, I corrupted my text while trying to simplify it. I originally wrote "since this is not a left outer join", but tried to remove the negation. Actually this is an inner join. Sorry. > Your query is misformed. You proper query ought to be: > > select * from base b join derived d using (id) order by id; If I were in the position to just write this query than this is what I would write. But sitting behind an ORM this is taken out of my hands. Or rather it isn't - I just replaced the code with hand-crafted SQL which is fine and dandy but means that I have code to update if the schema ever changes. > There is no b.id nor d.id in the output row. It is a figment of your > imagination. The output row contains an "id". Of course the problem > arises because of all the other people who also like to write broken > SQL and expect it it work. Yeah. The code above is incredibly broken by having two identical id columns in the result. Still it is a mystery to me how this leads the optimizer to come up with a much worse query plan after the data is analyzed. > Why do you expect that your brokenness > should win out over someone else's brokenness? I don't. I only wanted to express my surprise how an otherwise really stable database engine like SQLite is caught off guard when confronted with less than optimal SQL. Greetings, Torsten -- $---+1+2+3+----4+----5+6+ SCALE GmbH Niederlassung Dresden Torsten Landschoff Pohlandstraße 19 01309 Dresden Tel: +49-351-312002-10 Fax: +49-351-312002-29 SCALE GmbH Registergericht und Sitz: Ingolstadt, HRB 6384 Geschäftsführer: Dr.-Ing. Heiner Müllerschön, Dipl.-Math. Ulrich Franz ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Weird query plan when sorting by a joined column
On 16.05.2018 01:24, Simon Slavin wrote: >> I would have expected that b.id is basically an alias for d.id as > > Obvious to a good programmer. Not obvious to SQLite. Your SELECT is > selecting from the "base" table, but trying to order by a table > JOINed to it. The fact that d.id is always equal to b.id isn't > spotted by SQLite. Actually it is: SQLite will generate the same query plan for both queries as long as no statistics data is available. Adding the statistics information will lead to the more complex and slow query plan which is why I kept the statistics in the example SQL. > It's a possible enhancement opportunity for SQLite, but the way the > query is phrased is counter-intuitive and I doubt many people do it. I wouldn't do it myself actually, it happens to be the way that SQLAlchemy creates queries for joined-table inheritance. I assume Hibernate would do the same. > The processing to spot the opportunity might slow down every JOIN > operation SQLite performs, so it might not be the right thing for the > average user. Again, SQLite does already do this but somehow lets the optimizer spoil the otherwise perfect original query. I should have made that more explicit in my original mail but it was late and I spent the 2 hours before with stripping down our schema and database to the small reproducing example. Greetings, Torsten -- $---+1+2+3+4+----5+----6+ SCALE GmbH Niederlassung Dresden Torsten Landschoff Pohlandstraße 19 01309 Dresden Tel: +49-351-312002-10 Fax: +49-351-312002-29 SCALE GmbH Registergericht und Sitz: Ingolstadt, HRB 6384 Geschäftsführer: Dr.-Ing. Heiner Müllerschön, Dipl.-Math. Ulrich Franz ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Weird query plan when sorting by a joined column
On 16.05.2018 00:12, Torsten Landschoff wrote: > In case anybody would be so kind and take a look at this I included a > small SQL Script to reproduce. Results here are: > > -- >> $ /opt/sqlite3/bin/sqlite3 -version >> 3.23.2 2018-05-09 02:23:29 >> f139f6f07df094a0a62e0a55ae7e91dc68006d55c9db7b244a945cc8216f55ff >> $ /opt/sqlite3/bin/sqlite3 < weird_plan.sql Great, seems like the list strips attachments. Therefore I'll put the code here - this is the content of weird_plan.sql: - CREATE TABLE base ( id INTEGER NOT NULL PRIMARY KEY, modification_time DATETIME NOT NULL ); CREATE UNIQUE INDEX ix_base_mtime ON base (modification_time); CREATE TABLE derived (id INTEGER NOT NULL PRIMARY KEY); analyze sqlite_master; INSERT INTO sqlite_stat1 VALUES('base','ix_base_mtime','5465697 1'); analyze sqlite_master; select x'0a' || 'This results in a blazingly fast execution:' || x'0a'; explain SELECT base.id AS base_id, base.modification_time AS base_modification_time FROM base JOIN derived USING (id) WHERE base.id < 5463052 order by derived.id desc limit 100 offset 0; select x'0a' || 'This takes seconds with real data:' || x'0a'; explain SELECT base.id AS base_id, base.modification_time AS base_modification_time FROM base JOIN derived USING (id) WHERE base.id < 5463052 order by base.id desc limit 100 offset 0; - Greetings, Torsten -- $---+1+2+3+4+5+6+ SCALE GmbH Niederlassung Dresden Torsten Landschoff Pohlandstraße 19 01309 Dresden Tel: +49-351-312002-10 Fax: +49-351-312002-29 SCALE GmbH Registergericht und Sitz: Ingolstadt, HRB 6384 Geschäftsführer: Dr.-Ing. Heiner Müllerschön, Dipl.-Math. Ulrich Franz ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Weird query plan when sorting by a joined column
then > r[3]--, goto 22 > 20 Last 2 0 000 > 21 Delete 2 0 000 > 22Next 1 10000 > 23Sort 2 29000 > 24 IfPos 2 28100 if r[2]>0 then > r[2]-=1, goto 28; OFFSET > 25 Column 2 2 11 00 > r[11]=base_modification_time > 26 Column 2 0 10 00 r[10]=base_id > 27 ResultRow 102 000 output=r[10..11] > 28Next 2 24000 > 29Halt 0 0 000 > 30Transaction0 0 4 0 01 usesStmtJournal=0 > 31Integer 5463052 7 000 r[7]=5463052 > 32Goto 0 1 000 -- TIA for your support! Greetings, Torsten -- $---+1+2+3+4+5+6+ SCALE GmbH Niederlassung Dresden Torsten Landschoff Pohlandstraße 19 01309 Dresden Tel: +49-351-312002-10 Fax: +49-351-312002-29 SCALE GmbH Registergericht und Sitz: Ingolstadt, HRB 6384 Geschäftsführer: Dr.-Ing. Heiner Müllerschön, Dipl.-Math. Ulrich Franz ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] SQLite behaviour when modifying sqlite_master (was: Re: pragma integrity_check doesn't check constraints without indices)
Hi Ben, On 10/17/2016 10:40 PM, Ben Newberg wrote: > First off, my apologies for hijacking this thread. :-( I just wanted to report a bug... > But I've seen some strange things when messing with pragma writable_schema. > It appears all bets are off? > > example: create a table of columns (x, y, z), and fill it with values. > then, modify sqlite_master to take out column z. > let's say later on down the line, you either add a column using the > pragma writable_schema, or the ALTER TABLE. > What happens is the new column is filled with the old column's values: Which should be quite obvious: Changing sqlite_master just changes what SQLite thinks is in those tables or rather B-Trees. This is a sharp knife and you must know not only what you are doing but also what SQLite is doing internally with that information. For example I tried to get rid of a surrogate key on a table like create table items (id integer primary key, parent, child, order); by dropping the id column from sqlite master, assuming that it was not in the internal representation anyway as the id will be the row number. This was a false assumption. Actually the row number is not written but a NULL value is prepended to every row: https://www.sqlite.org/fileformat2.html#representation_of_sql_tables So to make this modification work I would have to write code to move the data one column to the left. SQLite does not directly support this. And, BTW: my modification caused segfaults when accessing the DB via SQLite. So I ended up renaming the table and recreating it from the old data, which takes up to 2 hours for each client instance which is quite unfortunate. > -- we are still seeing column z, until we vaccum: > sqlite> vacuum; > Error: table vacuum_db.t has 2 columns but 3 values were supplied Most likely this is only because sqlite_master was not reread. The 3rd column is still there despite the vacuum, it just ended up rereading the schema. Probably because vacuum internally copies the whole database and probably is akin to reopening the DB. Greetings, Torsten -- $---+1+2+3+4+----5----+6+ SCALE GmbH Niederlassung Dresden Torsten Landschoff Pohlandstraße 19 01309 Dresden Tel: +49-351-312002-10 Fax: +49-351-312002-29 SCALE GmbH Registergericht und Sitz: Ingolstadt, HRB 6384 Geschäftsführer: Dr.-Ing. Heiner Müllerschön, Dipl.-Math. Ulrich Franz ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] pragma integrity_check doesn't check constraints without indices
/ for(i=0; inDb; i++){ // ... /* Make sure all the indices are constructed correctly. */ for(x=sqliteHashFirst(pTbls); x && !isQuick; x=sqliteHashNext(x)){ // ... if( pTab->pIndex==0 ) continue; // ... /* Verify that all NOT NULL columns really are NOT NULL */ for(j=0; jnCol; j++){ // ... It appears to me that the row based primary key index is not counted as an index by pTab->pIndex so the NOT NULL checks are skipped without index. Greetings, Torsten -- $---+1+2+3+4+5+6+ SCALE GmbH Niederlassung Dresden Torsten Landschoff Pohlandstraße 19 01309 Dresden Tel: +49-351-312002-10 Fax: +49-351-312002-29 SCALE GmbH Registergericht und Sitz: Ingolstadt, HRB 6384 Geschäftsführer: Dr.-Ing. Heiner Müllerschön, Dipl.-Math. Ulrich Franz ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] pragma integrity_check doesn't check constraints without indices
Hi *, while working on a database upgrade on our application I found a bug in SQLite. I was adding a not null constraint by updating the sqlite_master table and to see the check fail used pragma integrity_check to find the offending rows that I know are there. However, the integrity check just passes. This in contrast to the documentation: > The integrity_check pragma looks for out-of-order records, missing > pages, malformed records, missing index entries, and UNIQUE and NOT > NULL constraint errors. Source: https://www.sqlite.org/pragma.html#pragma_integrity_check Here is a trivial demonstration to illustrate this problem: * Create a table with NOT NULL constraints > torsten.landschoff@horatio:~/report$ /opt/sqlite-dev/bin/sqlite3 demo.db > SQLite version 3.14.2 2016-09-12 18:50:49 > Enter ".help" for usage hints. > sqlite> create table example (name varchar not null, address varchar not > null); > sqlite> * Observe: Pragma integrity check does not report NULL errors > torsten.landschoff@horatio:~/report$ (echo ".explain"; echo "explain pragma > integrity_check;" ) | /opt/sqlite-dev/bin/sqlite3 demo.db | grep NULL > 5 IsNull 2 10000 if r[2]==NULL goto > 10 > 13IsNull 2 18000 if r[2]==NULL goto > 18 Interestingly it worked for other tables. Looking at the SQLite source code and the DDL of the other tables, it seems like checks are skipped for tables without indices. * Create an index > torsten.landschoff@horatio:~/report$ /opt/sqlite-dev/bin/sqlite3 demo.db > SQLite version 3.14.2 2016-09-12 18:50:49 > Enter ".help" for usage hints. > sqlite> create index ix_demo on example(name); > sqlite> * Interestingly, not null constraints are checked now > torsten.landschoff@horatio:~/report$ (echo ".explain"; echo "explain pragma > integrity_check;" ) | /opt/sqlite-dev/bin/sqlite3 demo.db | grep NULL > 5 IsNull 2 10000 if r[2]==NULL goto > 10 > 19 NotNull3 25000 if r[3]!=NULL > goto 25 > 21 String80 3 0 NULL value in example.name 00 > r[3]='NULL value in example.name' > 26 NotNull3 32000 if r[3]!=NULL > goto 32 > 28 String80 3 0 NULL value in example.address 00 > r[3]='NULL value in example.address' > 59IsNull 2 640 00 if r[2]==NULL goto > 64 Greetings, Torsten -- $---+1+2+3+4+5+6+ SCALE GmbH Niederlassung Dresden Torsten Landschoff Pohlandstraße 19 01309 Dresden Tel: +49-351-312002-10 Fax: +49-351-312002-29 SCALE GmbH Registergericht und Sitz: Ingolstadt, HRB 6384 Geschäftsführer: Dr.-Ing. Heiner Müllerschön, Dipl.-Math. Ulrich Franz ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users