I'm having a performance problem with sqlite, where a particular select operation (involving a join) seems to be taking much longer than I'd expect (by an order of magnitude). I've been through several iterations of searching for information on the web, adjusting the table structures, and modifying the relevant query, but to no avail.
Here's the basic sequence of events: - I populate a collection of tables from a large file (inside a transaction which is then committed) - I then need to add further rows to the tables (from another large file), but for each insert, first locate an existing row and store it's rowid in the new row. It's locating the existing row that is causing a huge problem - altering the runtime of the second phase from 2 seconds to 70 or 80 seconds. (If I skip the select, and insert a dummy value, everything is blazingly fast) Here's the slow select: SELECT id, heading, lon, lat from positioned, ways on ways.id = positioned.rowid WHERE ident=? AND type=4 AND airport=?; Relevant table definitions: create table positioned (type INT, ident char(8), name char(32), lon float, lat float, elev float, bucket int); create table ways (id INTEGER primary key, heading float, length float, width float, surface int, airport int); Other relevant things: - this is sqlite 3.4.0, as shipped with Leopard. - I'm using rowids to implement support for inheritance - 'positioned' is the base class, and 'ways' is a derived class. All the tables corresponding to my derived classes have an integer primary key which is defined to match the rowid of the base row in 'positioned'. Hence I'm very frequently doing joins on positioned.rowid = sometable.id. Perhaps there's a more efficient or standard idiom to express this? - There's indexes defined on ways (airport) and positioned (type) and (ident) - and some other columns. I have a slight worry that querying an indexed table which is being continually INSERT-ed into might be a potential problem, but I experimented with disabling certain indexes with no change in performance (well, sometimes it got worse) 'ident' is not unique, but matching on it should reduce the potential result set down to fifty or sixty rows at most (before any join needs to take place). A given combination of (ident, airport) *should* be globally unique - there should be exactly one result row unless the input data is malformed (which is unlikely, but possible). I've deliberately made the ident term the first one, after reading in various places that this helps the SQLite query planner. - both positioned and ways have tens of thousands of rows; the entire DB will easily fit in memory. (And, at the time I'm doing these operations, the tables have just been populated, so I'd expect all the DB pages to be in memory ... but perhaps I'm wrong about that) - in the slow phase, I'm inside a single transaction; the sequence of operations is the problematic select, followed by an insert; repeated several thousand times. Hopefully that's all clear - if not, I can supply an example database file, or an EXPLAIN output of the select operation. Regards, James _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users