Hello, On May 27, 2008, at 9:07 PM, Stephen Oberholtzer wrote:
> Well, the first thing you should bring away from this experience is > that the > number of VM instructions isn't really an indicator of how efficient > the > query is :) Good point :) > Now, I'm not sure exactly why one is faster than the other, > especially since > you didn't post your exact schema and indices, The DDL is rather straightforward: create table if not exists token ( id integer primary key not null, name text not null ) create unique index if not exists token_name on token( name ) http://dev.alt.textdrive.com/browser/HTTP/Finder.ddl#L60 > and I have no idea how many rows there are in either table. The incoming data set size varies from 108 to 3345 rows, with a average size of around 930 rows. The target table size is about 75,148 rows. Over time, most of the incoming rows will already exist in the target table. > But if I had to guess, it's because of the ORDER BY clause. In > general, an > ORDER BY means that SQLite needs to generate a temporary table with > all the > rows to be selected/inserted, > then sort that temporary table. The INSERT OR IGNORE version has to > unconditionally sort the entire 'stage' table; your second query > only has to > sort those rows in 'stage' that don't already exist in 'table'. If > each > table fits comfortably in your computer's disk cache, the extra pass > won't > matter so much. Ah... yes... the order by clause... good point... indeed removing the 'order by' from the 'insert or ignore' statement brings down its execution time a whisker away from the 'self join' version :) Thanks for the explaination! Cheers, -- PA. http://alt.textdrive.com/nanoki/ _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users