Hi there, I noticed some - at least for me - unexpected behavior of the REINDEX command:
I have a DB with 7 tables, 11 indexes and a size of about 140MiB. The DB is generated by a python script parsing an XML file. Entries are added to the tables in arbitrary order. After the tables have been created and filled, some UPDATE queries are executed. One of those queries took ~36m. I tried to reduce the runtime by using the REINDEX command, but with no success. However, if I delete and recreate one of the indexes, using the DROP INDEX and CREATE INDEX commands, the UPDATE query takes only ~2m. Unfortunately I can not share the DB with you, due to legal reasons and was also not able to create a test script to trigger this behavior. The two tables, affected by the UPDATE query: > CREATE TABLE `tbl1` ( > `id` INTEGER PRIMARY KEY AUTOINCREMENT, > `str` TEXT NOT NULL UNIQUE COLLATE NOCASE, > `int` INTEGER NOT NULL, > `tbl2_id` INTEGER, > `other_str` TEXT DEFAULT NULL, > `other_int1` INTEGER DEFAULT NULL, > `other_int2` INTEGER NOT NULL DEFAULT 0, > FOREIGN KEY(`tbl2_id`) REFERENCES `tbl2`(`id`) > ); > > CREATE TABLE `tbl2` ( > `id` INTEGER PRIMARY KEY AUTOINCREMENT, > `str` TEXT NOT NULL UNIQUE COLLATE NOCASE, > `int` INTEGER NOT NULL > ); Their corresponding indexes: > CREATE INDEX `idx1` ON `tbl1` ( > `int` > ); > CREATE INDEX `idx2` ON `tbl1` ( > `tbl2_id` > ); > CREATE INDEX `idx3` ON `tbl2` ( > `int` > ); ... filling the tables ... Recreating index idx1 (To archive the speedup, recreation of the other indexes is not necessary): > DROP INDEX `idx1`; > CREATE INDEX `idx1` ON `tbl1` ( > `int` > ); And the update query: > UPDATE `tbl1` > SET `tbl2_id` = ( > SELECT `tbl2`.`id` > FROM `tbl2` > WHERE `tbl1`.`int` >= `tbl2`.`int` > AND `tbl1`.`str` LIKE '%' || `tbl2`.`str` > ORDER BY `tbl2`.`int` DESC > LIMIT 1 > ); And it's corresponding query plan: > id parent notused detail> 3 0 0 SCAN TABLE tbl1 > 23 0 0 CORRELATED SCALAR SUBQUERY 1 > 29 23 0 SEARCH TABLE tbl2 USING INDEX idx3 (int<?) tbl1 and tbl2 contain about 600k and 20k entries, respectively. Does one have an explanation for this behavior? Is it expected that the REINDEX command produces other results than "manually" recreation of an index? If yes, under which circumstances does this happen? And is there some way to measure the "quality" of an index / if it has been scattered? PS: I also tried the VACUUM (combined w/ and w/o REINDEX) command to no success. -- Best regards dirdi _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users