I'm having a performance problem with sqlite, where a particular  
select operation (involving a join) seems to be taking much longer  
than I'd expect (by an order of magnitude). I've been through several  
iterations of searching for information on the web, adjusting the  
table structures, and modifying the relevant query, but to no avail.

Here's the basic sequence of events:
        - I populate a collection of tables from a large file (inside a  
transaction which is then committed)
        - I then need to add further rows to the tables (from another large  
file), but for each insert, first locate an existing row and store  
it's rowid in the new row.

It's locating the existing row that is causing a huge problem -  
altering the runtime of the second phase from 2 seconds to 70 or 80  
seconds. (If I skip the select, and insert a dummy value, everything  
is blazingly fast)

Here's the slow select:

        SELECT id, heading, lon, lat from  positioned, ways on ways.id =  
positioned.rowid WHERE ident=? AND type=4 AND airport=?;

Relevant table definitions:

        create table positioned (type INT, ident char(8), name char(32), lon  
float, lat float, elev float, bucket int);
        create table ways (id INTEGER primary key, heading float, length  
float, width float, surface int, airport int);

Other relevant things:

        - this is sqlite 3.4.0, as shipped with Leopard.

         - I'm using rowids to implement support for inheritance -  
'positioned' is the base class, and 'ways' is a derived class. All the  
tables corresponding to my derived classes have an integer primary key  
which is defined to match the rowid of the base row in 'positioned'.  
Hence I'm very frequently doing joins on positioned.rowid =  
sometable.id. Perhaps there's a more efficient or standard idiom to  
express this?

         - There's indexes defined on ways (airport) and positioned (type)  
and (ident)  - and some other columns. I have a slight worry that  
querying an indexed table which is being continually INSERT-ed into  
might be a potential problem, but I experimented with disabling  
certain indexes with no change in performance (well, sometimes it got  
worse)

        'ident' is not unique, but matching on it should reduce the potential  
result set down to fifty or sixty rows at most (before any join needs  
to take place). A given combination of (ident, airport) *should* be  
globally unique - there should be exactly one result row unless the  
input data is malformed (which is unlikely, but possible). I've  
deliberately made the ident term the first one, after reading in  
various places that this helps the SQLite query planner.

        - both positioned and ways have tens of thousands of rows; the entire  
DB will easily fit in memory. (And, at the time I'm doing these  
operations, the tables have just been populated, so I'd expect all the  
DB pages to be in memory ... but perhaps I'm wrong about that)

        - in the slow phase, I'm inside a single transaction; the sequence of  
operations is the problematic select, followed by an insert; repeated  
several thousand times.

Hopefully that's all clear - if not, I can supply an example database  
file, or an EXPLAIN output of the select operation.

Regards,
James

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to