On 22/04/2009 10:48 AM, sorka wrote: > Wow. Anybody? I figured this would be a simple question for the gurus on this > board. Seriously, nobody knows a better way to do this?
Seriously, consider alternative possibilities > > > sorka wrote: >> This should be simple but apparently it isn't. >> >> I have two tables: >> "CREATE TABLE showing ( " >> " showingId INTEGER PRIMARY KEY, " >> " stationId INTEGER, " >> " startTime INTEGER, " >> ") " >> CREATE INDEX showing_startTime on showing(startTime); >> >> >> AND >> "CREATE TABLE channel ( " >> " channelId INTEGER PRIMARY KEY, " >> " ChannelNumber INTEGER, " >> " stationId INTEGER, " >> " ) " >> CREATE INDEX channel_ChannelNumber on channel(ChannelNumber); >> CREATE INDEX channel_stationId on channel(stationId); >> >> When I do this select: >> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= >> 1239230000 AND showing.stationId = channel.stationId ORDER BY >> showing.startTime LIMIT 8; SQLite is smart enough to recognise that the showing_startTime index satisfies the ORDER BY clause ... so it's just traversing that index, filtering out cases of bogus stationID [I presume that's the reason for having the JOIN at all], and stopping when it's found 8 results. What is the result of: SELECT COUNT(*) FROM showing JOIN channel ON showing.startTime >= 1239230000 AND showing.stationId = channel.stationId; ? Is that 8 or 100000 or somewhere in between? >> >> I get back the correct 8 results in about 3 milliseconds. >> >> If I throw a secondary order term in there say ChannelNumber: >> >> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= >> 1239230000 AND showing.stationId = channel.stationId ORDER BY >> showing.startTime, channel.ChannelMajorNumber LIMIT 8; What is channel.ChannelMajorNumber? It's not in the schema that you posted. People who might be interested in helping you might want to try running your queries prefaced with EXPLAIN and EXPLAIN QUERY PLAN (which doesn't need any data, just the CREATE statements). Perhaps they then give up. Your tables contain only presumingly meaningless integers. [You have given us the full schema for each table, haven't you? There aren't any 50 Kb BLOBs in there, are there?] So you wouldn't have a problem with making a zipped db with the two tables (no indexes) available on the web somewhere for people who wanted to experiment? >> >> It now takes over 120 seconds!!!!!!! >> >> I've tried various multi-column indices including one on >> channel(stationId, ChannelNumber). And was that index used by the query? Did you have ChannelNumber in the ORDER BY clause at the time, or something else? >> >> No difference. >> >> As far as I can tell, when ordering on columns that cross tables, sqlite >> will bring in all the records that match the equality or inequality search >> term on the first column before it does a secondary sort. What any SQL query engine has to do is produce results that are the same as those obtained by this method: 1. Produce temporary result set ignoring the ORDER BY and the LIMIT/OFFSET. This will include all columns from the SELECT plus any extra that appear only in the ORDER BY. 2. Sort the results according to the ORDER BY. 3. Read the sort output, discarding columns that appear only in the ORDER BY, and filtering according to the LIMIT/OFFSET. Is that what you mean? >> >> I have over 100,000 records in the showing table and about 100 records in >> the channel table. >> >> Sqlite should be smart enough to do the secondary sort on ChannelNumber as >> soon as it sees that the records coming back have a later startTime than >> the previous one. i.e. Say the first 5 records have the same startTime and >> different channel numbers. Then the next 5 records have a later start time >> than the first. Sqlite should be smart enough to see this as the results >> come back and do a secondary sort on ChannelNumber on the first 5 results >> and then rinse and repeat. What is "the secondary sort on ChannelNumber" that you imply is already being done, just not soon enough? >> >> What appears to be happening is that even though startTime is indexed, >> sqlite is bringing all 100K records into memory sorted by time before it >> starts to sort by channel number. Apart from that being one possible explanation of what does seem a long time, what grounds do you have for the above statement? >> >> Is there a way to get sqlite to do the right thing? If there was only a >> way to have a multi-column index that included columns from different >> tables. Oh wait, there is, it's called an intermediate table. However the >> cost of doing this is pretty high for reasons I can't go into here. >> >> Any ideas? Maybe I'm just doing something wrong and this should be simple. _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users