Re: [sqlite] Multi column ORDER BY across table peformance problem....
For this particular query, the only reason for the join on the channel table is simply to order by channel number and that's it. Currently, what I'm doing is leaving off the order by channel number. Since this is done through the C++ interface, I'm stepping through results until a showing with a new and greater time comes along. At that point, I take all the showings I got with the previous start time and sort them all manually with qsort by channel number. As soon as I have enough results for what I'm looking for, I stop stepping through rows and finish the transaction. It works well enough but it really seems that sql should have the ability to be smarter about ordering. Jeremy Hinegardner wrote: > > Ah, i see, so you are going to be calling this many times, and slowly > incrementing the value of startTime. Yes, in this query, if you have more > than > 8 shows at a particular start time, then when you increment the time to go > to > the next 'batch' of shows you will miss all the ones from then end of the > previous time. > > If that is the case, how about just making the inner query 'WHERE > showing.startTme = ' and leave off the limit, and call the > whole > thing repeatedly? I doubt that would do what you want, since you querying > for > startTime which is what your where clause, so you'll either get back rows > that > are all the value of your where clause or no rows. > > I guess more information about the whole problem is going to be in > necessary to > help further. For instance, it appears you have a table of 'showings' and > a > table of 'channels' and the goal is, find the next N items from the > showings > table for channels in the channel table. > > enjoy, > > -jeremy > > > > On Tue, Apr 21, 2009 at 10:11:41PM -0700, sorka wrote: >> >> Um, well, except that I'd have huge gaping holes. i.e. say there are 100 >> shows at or after the specified time all on different channels. The above >> query would limit to the first 8 showings on or after that time *before* >> sorting by channel number. >> >> >> >> Jeremy Hinegardner wrote: >> > >> > How about trying: >> > >> > SELECT s.startTime >> > FROM ( SELECT showing.startTime AS startTime >> > ,showing.stationId AS stationId >> > FROM showing >> > WHERE showing.startTime >= 123923 >> > ORDER BY showing.startTime >> >LIMIT 8 ) AS s >> > JOIN channel AS c >> > ON s.stationId = c.stationId >> > ORDER BY s.startTime, c.ChannelMajorNumber >> > >> > I believe this should limit your table of 100,000 rows down to just the >> 8 >> > you >> > want on the inner query, and then join that against the 100 rows in the >> > channel >> > table. >> > >> > Although, to tell you the truth, I don't see the purpose in your >> original >> > query: >> > >> > SELECT showing.startTime >> > FROM showing >> > JOIN channel >> > ON showing.startTime >= 123923 >> >AND showing.stationId = channel.stationId >> > ORDER BY showing.startTime, channel.ChannelMajorNumber >> > LIMIT 8; >> > >> > You are not doing anything with the channel table data other than >> joining >> > it >> > agains the showing tables. To me that means you are joining against >> the >> > channel >> > table in order to filter out rows in the showing table that have >> > stationId's >> > that do not exist in the channel table. >> > >> > Is that correct? If that is correct, then the query I gave will not do >> > what you >> > want. >> > >> > Can the goal you are attempting be stated as? >> > >> > Find next or earliest showings from the showings table for channels >> in >> > the >> > channel table? >> > >> > If this is the case, maybe this query will work: >> > >> > SELECT showing.startTime >> > FROM showing >> >WHERE showing.startTime >= 123923000 >> > AND showing.stationId IN ( SELECT DISTINCT channel.stationId >> FROM >> > channel ) >> >LIMIT 8; >> > >> > enjoy, >> > >> > -jeremy >> > >> > >> > On Tue, Apr 21, 2009 at 08:58:56PM -0700, sorka wrote: >> >> >> >> Writing the query as you said you would returns results in nearly >> >> instantly, >> >> within a few ms. >> >> >> >> The problem is when you add a secondary ordering field. The intended >> >> indices >> >> are being used. >> >> >> >> The problem, as I've explained several times already is that there is >> no >> >> way >> >> to create a multicolumn index across tables. Sqlite is using the >> >> stationId >> >> index on the channel table to join showings via the stationId. The >> query >> >> plan shows it clearly. When ordering by startTime or even startTime >> and >> >> stationId, the results are returned in a few ms. But when adding >> channel >> >> number(i.e startTime and then channel number), it jumps to two >> minutes. >> >> This >> >> is because sqlite is bringing in nearly 100K
Re: [sqlite] Multi column ORDER BY across table peformance problem....
Ah, i see, so you are going to be calling this many times, and slowly incrementing the value of startTime. Yes, in this query, if you have more than 8 shows at a particular start time, then when you increment the time to go to the next 'batch' of shows you will miss all the ones from then end of the previous time. If that is the case, how about just making the inner query 'WHERE showing.startTme = ' and leave off the limit, and call the whole thing repeatedly? I doubt that would do what you want, since you querying for startTime which is what your where clause, so you'll either get back rows that are all the value of your where clause or no rows. I guess more information about the whole problem is going to be in necessary to help further. For instance, it appears you have a table of 'showings' and a table of 'channels' and the goal is, find the next N items from the showings table for channels in the channel table. enjoy, -jeremy On Tue, Apr 21, 2009 at 10:11:41PM -0700, sorka wrote: > > Um, well, except that I'd have huge gaping holes. i.e. say there are 100 > shows at or after the specified time all on different channels. The above > query would limit to the first 8 showings on or after that time *before* > sorting by channel number. > > > > Jeremy Hinegardner wrote: > > > > How about trying: > > > > SELECT s.startTime > > FROM ( SELECT showing.startTime AS startTime > > ,showing.stationId AS stationId > > FROM showing > > WHERE showing.startTime >= 123923 > > ORDER BY showing.startTime > >LIMIT 8 ) AS s > > JOIN channel AS c > > ON s.stationId = c.stationId > > ORDER BY s.startTime, c.ChannelMajorNumber > > > > I believe this should limit your table of 100,000 rows down to just the 8 > > you > > want on the inner query, and then join that against the 100 rows in the > > channel > > table. > > > > Although, to tell you the truth, I don't see the purpose in your original > > query: > > > > SELECT showing.startTime > > FROM showing > > JOIN channel > > ON showing.startTime >= 123923 > >AND showing.stationId = channel.stationId > > ORDER BY showing.startTime, channel.ChannelMajorNumber > > LIMIT 8; > > > > You are not doing anything with the channel table data other than joining > > it > > agains the showing tables. To me that means you are joining against the > > channel > > table in order to filter out rows in the showing table that have > > stationId's > > that do not exist in the channel table. > > > > Is that correct? If that is correct, then the query I gave will not do > > what you > > want. > > > > Can the goal you are attempting be stated as? > > > > Find next or earliest showings from the showings table for channels in > > the > > channel table? > > > > If this is the case, maybe this query will work: > > > > SELECT showing.startTime > > FROM showing > >WHERE showing.startTime >= 123923000 > > AND showing.stationId IN ( SELECT DISTINCT channel.stationId FROM > > channel ) > >LIMIT 8; > > > > enjoy, > > > > -jeremy > > > > > > On Tue, Apr 21, 2009 at 08:58:56PM -0700, sorka wrote: > >> > >> Writing the query as you said you would returns results in nearly > >> instantly, > >> within a few ms. > >> > >> The problem is when you add a secondary ordering field. The intended > >> indices > >> are being used. > >> > >> The problem, as I've explained several times already is that there is no > >> way > >> to create a multicolumn index across tables. Sqlite is using the > >> stationId > >> index on the channel table to join showings via the stationId. The query > >> plan shows it clearly. When ordering by startTime or even startTime and > >> stationId, the results are returned in a few ms. But when adding channel > >> number(i.e startTime and then channel number), it jumps to two minutes. > >> This > >> is because sqlite is bringing in nearly 100K records from the showing > >> table, > >> sorted by time, because of the startTime index, before it starts sorting > >> by > >> channel number. It would sure be nice to be able to get sqlite to sort on > >> the channel number for each time group. i.e. once all the 10:00 PM > >> showings > >> have been returned and the first 10:30 PM showing is about to be > >> returned, > >> sort all of the 10:00 PM showings by channel number. Sqlite is NOT doing > >> this but should. > >> > >> At this point, I'm going to assume that this is just a limitation in the > >> sqlite engine and I'll have to proceed in other ways to solve this issue. > >> > >> > >> > >> Ian Walters wrote: > >> > > >> >> SELECT showing.startTime FROM showing JOIN channel ON > >> >> showing.startTime >= > >> >> 123923 AND showing.stationId = channel.stationId ORDER BY > >> >> showing.startTime LIMIT 8; > >> > > >> > I don't
Re: [sqlite] Multi column ORDER BY across table peformance problem....
how about select Starttime from channel C inner join (select showing.startTime AS startTime, showing.stationId AS stationId FROM showing WHERE showing.startTime >= 123923 ORDER BY showing.startTime LIMIT 8 ) AS s on S.StationId = C.stationId ORDER BY s.startTime, c.ChannelMajorNumber limit 8 --- On Wed, 4/22/09, sorka <sorka95...@gmail.com> wrote: From: sorka <sorka95...@gmail.com> Subject: Re: [sqlite] Multi column ORDER BY across table peformance problem To: sqlite-users@sqlite.org Date: Wednesday, April 22, 2009, 1:11 AM Um, well, except that I'd have huge gaping holes. i.e. say there are 100 shows at or after the specified time all on different channels. The above query would limit to the first 8 showings on or after that time *before* sorting by channel number. Jeremy Hinegardner wrote: > > How about trying: > > SELECT s.startTime > FROM ( SELECT showing.startTime AS startTime > ,showing.stationId AS stationId > FROM showing > WHERE showing.startTime >= 123923 > ORDER BY showing.startTime >LIMIT 8 ) AS s > JOIN channel AS c > ON s.stationId = c.stationId > ORDER BY s.startTime, c.ChannelMajorNumber > > I believe this should limit your table of 100,000 rows down to just the 8 > you > want on the inner query, and then join that against the 100 rows in the > channel > table. > > Although, to tell you the truth, I don't see the purpose in your original > query: > > SELECT showing.startTime > FROM showing > JOIN channel > ON showing.startTime >= 123923 >AND showing.stationId = channel.stationId > ORDER BY showing.startTime, channel.ChannelMajorNumber > LIMIT 8; > > You are not doing anything with the channel table data other than joining > it > agains the showing tables. To me that means you are joining against the > channel > table in order to filter out rows in the showing table that have > stationId's > that do not exist in the channel table. > > Is that correct? If that is correct, then the query I gave will not do > what you > want. > > Can the goal you are attempting be stated as? > > Find next or earliest showings from the showings table for channels in > the > channel table? > > If this is the case, maybe this query will work: > > SELECT showing.startTime > FROM showing >WHERE showing.startTime >= 123923000 > AND showing.stationId IN ( SELECT DISTINCT channel.stationId FROM > channel ) >LIMIT 8; > > enjoy, > > -jeremy > > > On Tue, Apr 21, 2009 at 08:58:56PM -0700, sorka wrote: >> >> Writing the query as you said you would returns results in nearly >> instantly, >> within a few ms. >> >> The problem is when you add a secondary ordering field. The intended >> indices >> are being used. >> >> The problem, as I've explained several times already is that there is no >> way >> to create a multicolumn index across tables. Sqlite is using the >> stationId >> index on the channel table to join showings via the stationId. The query >> plan shows it clearly. When ordering by startTime or even startTime and >> stationId, the results are returned in a few ms. But when adding channel >> number(i.e startTime and then channel number), it jumps to two minutes. >> This >> is because sqlite is bringing in nearly 100K records from the showing >> table, >> sorted by time, because of the startTime index, before it starts sorting >> by >> channel number. It would sure be nice to be able to get sqlite to sort on >> the channel number for each time group. i.e. once all the 10:00 PM >> showings >> have been returned and the first 10:30 PM showing is about to be >> returned, >> sort all of the 10:00 PM showings by channel number. Sqlite is NOT doing >> this but should. >> >> At this point, I'm going to assume that this is just a limitation in the >> sqlite engine and I'll have to proceed in other ways to solve this issue. >> >> >> >> Ian Walters wrote: >> > >> >> SELECT showing.startTime FROM showing JOIN channel ON >> >> showing.startTime >= >> >> 123923 AND showing.stationId = channel.stationId ORDER BY >> >> showing.startTime LIMIT 8; >> > >> > I don't know if it would be faster... but I would
Re: [sqlite] Multi column ORDER BY across table peformance problem....
Um, well, except that I'd have huge gaping holes. i.e. say there are 100 shows at or after the specified time all on different channels. The above query would limit to the first 8 showings on or after that time *before* sorting by channel number. Jeremy Hinegardner wrote: > > How about trying: > > SELECT s.startTime > FROM ( SELECT showing.startTime AS startTime > ,showing.stationId AS stationId > FROM showing > WHERE showing.startTime >= 123923 > ORDER BY showing.startTime >LIMIT 8 ) AS s > JOIN channel AS c > ON s.stationId = c.stationId > ORDER BY s.startTime, c.ChannelMajorNumber > > I believe this should limit your table of 100,000 rows down to just the 8 > you > want on the inner query, and then join that against the 100 rows in the > channel > table. > > Although, to tell you the truth, I don't see the purpose in your original > query: > > SELECT showing.startTime > FROM showing > JOIN channel > ON showing.startTime >= 123923 >AND showing.stationId = channel.stationId > ORDER BY showing.startTime, channel.ChannelMajorNumber > LIMIT 8; > > You are not doing anything with the channel table data other than joining > it > agains the showing tables. To me that means you are joining against the > channel > table in order to filter out rows in the showing table that have > stationId's > that do not exist in the channel table. > > Is that correct? If that is correct, then the query I gave will not do > what you > want. > > Can the goal you are attempting be stated as? > > Find next or earliest showings from the showings table for channels in > the > channel table? > > If this is the case, maybe this query will work: > > SELECT showing.startTime > FROM showing >WHERE showing.startTime >= 123923000 > AND showing.stationId IN ( SELECT DISTINCT channel.stationId FROM > channel ) >LIMIT 8; > > enjoy, > > -jeremy > > > On Tue, Apr 21, 2009 at 08:58:56PM -0700, sorka wrote: >> >> Writing the query as you said you would returns results in nearly >> instantly, >> within a few ms. >> >> The problem is when you add a secondary ordering field. The intended >> indices >> are being used. >> >> The problem, as I've explained several times already is that there is no >> way >> to create a multicolumn index across tables. Sqlite is using the >> stationId >> index on the channel table to join showings via the stationId. The query >> plan shows it clearly. When ordering by startTime or even startTime and >> stationId, the results are returned in a few ms. But when adding channel >> number(i.e startTime and then channel number), it jumps to two minutes. >> This >> is because sqlite is bringing in nearly 100K records from the showing >> table, >> sorted by time, because of the startTime index, before it starts sorting >> by >> channel number. It would sure be nice to be able to get sqlite to sort on >> the channel number for each time group. i.e. once all the 10:00 PM >> showings >> have been returned and the first 10:30 PM showing is about to be >> returned, >> sort all of the 10:00 PM showings by channel number. Sqlite is NOT doing >> this but should. >> >> At this point, I'm going to assume that this is just a limitation in the >> sqlite engine and I'll have to proceed in other ways to solve this issue. >> >> >> >> Ian Walters wrote: >> > >> >> SELECT showing.startTime FROM showing JOIN channel ON >> >> showing.startTime >= >> >> 123923 AND showing.stationId = channel.stationId ORDER BY >> >> showing.startTime LIMIT 8; >> > >> > I don't know if it would be faster... but I would have written that. >> > >> > SELECT showing.startTime FROM showing JOIN channel USING (stationId) >> > WHERE startTime < 123923 ORDER BY startTime LIMIT 8. >> > >> > Also I know in the latest version of SQLite its possible to 'hint' >> > what indexes should be used, which might be helpful. There is also >> > something on the contrib page that lets you check a query to see what >> > indexes it does use. >> > >> > Sorry if the above lacks detail, its kinda a busy day. >> > >> > -- >> > Ian >> > ___ >> > sqlite-users mailing list >> > sqlite-users@sqlite.org >> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > >> > >> >> -- >> View this message in context: >> http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23169251.html >> Sent from the SQLite mailing list archive at Nabble.com. >> >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > -- > > Jeremy Hinegardner
Re: [sqlite] Multi column ORDER BY across table peformance problem....
That last query should be: SELECT showing.startTime FROM showing WHERE showing.startTime >= 123923000 AND showing.stationId IN ( SELECT DISTINCT channel.stationId FROM channel ) ORDER BY showing.startTime LIMIT 8; I forgot the order by. enjoy, -jeremy On Tue, Apr 21, 2009 at 11:07:15PM -0600, Jeremy Hinegardner wrote: > How about trying: > > SELECT s.startTime > FROM ( SELECT showing.startTime AS startTime > ,showing.stationId AS stationId > FROM showing > WHERE showing.startTime >= 123923 > ORDER BY showing.startTime >LIMIT 8 ) AS s > JOIN channel AS c > ON s.stationId = c.stationId > ORDER BY s.startTime, c.ChannelMajorNumber > > I believe this should limit your table of 100,000 rows down to just the 8 you > want on the inner query, and then join that against the 100 rows in the > channel > table. > > Although, to tell you the truth, I don't see the purpose in your original > query: > > SELECT showing.startTime > FROM showing > JOIN channel > ON showing.startTime >= 123923 >AND showing.stationId = channel.stationId > ORDER BY showing.startTime, channel.ChannelMajorNumber > LIMIT 8; > > You are not doing anything with the channel table data other than joining it > agains the showing tables. To me that means you are joining against the > channel > table in order to filter out rows in the showing table that have stationId's > that do not exist in the channel table. > > Is that correct? If that is correct, then the query I gave will not do what > you > want. > > Can the goal you are attempting be stated as? > > Find next or earliest showings from the showings table for channels in the > channel table? > > If this is the case, maybe this query will work: > > SELECT showing.startTime > FROM showing >WHERE showing.startTime >= 123923000 > AND showing.stationId IN ( SELECT DISTINCT channel.stationId FROM > channel ) >LIMIT 8; > > enjoy, > > -jeremy > > > On Tue, Apr 21, 2009 at 08:58:56PM -0700, sorka wrote: > > > > Writing the query as you said you would returns results in nearly instantly, > > within a few ms. > > > > The problem is when you add a secondary ordering field. The intended indices > > are being used. > > > > The problem, as I've explained several times already is that there is no way > > to create a multicolumn index across tables. Sqlite is using the stationId > > index on the channel table to join showings via the stationId. The query > > plan shows it clearly. When ordering by startTime or even startTime and > > stationId, the results are returned in a few ms. But when adding channel > > number(i.e startTime and then channel number), it jumps to two minutes. This > > is because sqlite is bringing in nearly 100K records from the showing table, > > sorted by time, because of the startTime index, before it starts sorting by > > channel number. It would sure be nice to be able to get sqlite to sort on > > the channel number for each time group. i.e. once all the 10:00 PM showings > > have been returned and the first 10:30 PM showing is about to be returned, > > sort all of the 10:00 PM showings by channel number. Sqlite is NOT doing > > this but should. > > > > At this point, I'm going to assume that this is just a limitation in the > > sqlite engine and I'll have to proceed in other ways to solve this issue. > > > > > > > > Ian Walters wrote: > > > > > >> SELECT showing.startTime FROM showing JOIN channel ON > > >> showing.startTime >= > > >> 123923 AND showing.stationId = channel.stationId ORDER BY > > >> showing.startTime LIMIT 8; > > > > > > I don't know if it would be faster... but I would have written that. > > > > > > SELECT showing.startTime FROM showing JOIN channel USING (stationId) > > > WHERE startTime < 123923 ORDER BY startTime LIMIT 8. > > > > > > Also I know in the latest version of SQLite its possible to 'hint' > > > what indexes should be used, which might be helpful. There is also > > > something on the contrib page that lets you check a query to see what > > > indexes it does use. > > > > > > Sorry if the above lacks detail, its kinda a busy day. > > > > > > -- > > > Ian > > > ___ > > > sqlite-users mailing list > > > sqlite-users@sqlite.org > > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > > > > > > > > -- > > View this message in context: > > http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23169251.html > > Sent from the SQLite mailing list archive at Nabble.com. > > > > ___ > > sqlite-users mailing list > > sqlite-users@sqlite.org > >
Re: [sqlite] Multi column ORDER BY across table peformance problem....
How about trying: SELECT s.startTime FROM ( SELECT showing.startTime AS startTime ,showing.stationId AS stationId FROM showing WHERE showing.startTime >= 123923 ORDER BY showing.startTime LIMIT 8 ) AS s JOIN channel AS c ON s.stationId = c.stationId ORDER BY s.startTime, c.ChannelMajorNumber I believe this should limit your table of 100,000 rows down to just the 8 you want on the inner query, and then join that against the 100 rows in the channel table. Although, to tell you the truth, I don't see the purpose in your original query: SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= 123923 AND showing.stationId = channel.stationId ORDER BY showing.startTime, channel.ChannelMajorNumber LIMIT 8; You are not doing anything with the channel table data other than joining it agains the showing tables. To me that means you are joining against the channel table in order to filter out rows in the showing table that have stationId's that do not exist in the channel table. Is that correct? If that is correct, then the query I gave will not do what you want. Can the goal you are attempting be stated as? Find next or earliest showings from the showings table for channels in the channel table? If this is the case, maybe this query will work: SELECT showing.startTime FROM showing WHERE showing.startTime >= 123923000 AND showing.stationId IN ( SELECT DISTINCT channel.stationId FROM channel ) LIMIT 8; enjoy, -jeremy On Tue, Apr 21, 2009 at 08:58:56PM -0700, sorka wrote: > > Writing the query as you said you would returns results in nearly instantly, > within a few ms. > > The problem is when you add a secondary ordering field. The intended indices > are being used. > > The problem, as I've explained several times already is that there is no way > to create a multicolumn index across tables. Sqlite is using the stationId > index on the channel table to join showings via the stationId. The query > plan shows it clearly. When ordering by startTime or even startTime and > stationId, the results are returned in a few ms. But when adding channel > number(i.e startTime and then channel number), it jumps to two minutes. This > is because sqlite is bringing in nearly 100K records from the showing table, > sorted by time, because of the startTime index, before it starts sorting by > channel number. It would sure be nice to be able to get sqlite to sort on > the channel number for each time group. i.e. once all the 10:00 PM showings > have been returned and the first 10:30 PM showing is about to be returned, > sort all of the 10:00 PM showings by channel number. Sqlite is NOT doing > this but should. > > At this point, I'm going to assume that this is just a limitation in the > sqlite engine and I'll have to proceed in other ways to solve this issue. > > > > Ian Walters wrote: > > > >> SELECT showing.startTime FROM showing JOIN channel ON > >> showing.startTime >= > >> 123923 AND showing.stationId = channel.stationId ORDER BY > >> showing.startTime LIMIT 8; > > > > I don't know if it would be faster... but I would have written that. > > > > SELECT showing.startTime FROM showing JOIN channel USING (stationId) > > WHERE startTime < 123923 ORDER BY startTime LIMIT 8. > > > > Also I know in the latest version of SQLite its possible to 'hint' > > what indexes should be used, which might be helpful. There is also > > something on the contrib page that lets you check a query to see what > > indexes it does use. > > > > Sorry if the above lacks detail, its kinda a busy day. > > > > -- > > Ian > > ___ > > sqlite-users mailing list > > sqlite-users@sqlite.org > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > > > > -- > View this message in context: > http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23169251.html > Sent from the SQLite mailing list archive at Nabble.com. > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- Jeremy Hinegardner jer...@hinegardner.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi column ORDER BY across table peformance problem....
Writing the query as you said you would returns results in nearly instantly, within a few ms. The problem is when you add a secondary ordering field. The intended indices are being used. The problem, as I've explained several times already is that there is no way to create a multicolumn index across tables. Sqlite is using the stationId index on the channel table to join showings via the stationId. The query plan shows it clearly. When ordering by startTime or even startTime and stationId, the results are returned in a few ms. But when adding channel number(i.e startTime and then channel number), it jumps to two minutes. This is because sqlite is bringing in nearly 100K records from the showing table, sorted by time, because of the startTime index, before it starts sorting by channel number. It would sure be nice to be able to get sqlite to sort on the channel number for each time group. i.e. once all the 10:00 PM showings have been returned and the first 10:30 PM showing is about to be returned, sort all of the 10:00 PM showings by channel number. Sqlite is NOT doing this but should. At this point, I'm going to assume that this is just a limitation in the sqlite engine and I'll have to proceed in other ways to solve this issue. Ian Walters wrote: > >> SELECT showing.startTime FROM showing JOIN channel ON >> showing.startTime >= >> 123923 AND showing.stationId = channel.stationId ORDER BY >> showing.startTime LIMIT 8; > > I don't know if it would be faster... but I would have written that. > > SELECT showing.startTime FROM showing JOIN channel USING (stationId) > WHERE startTime < 123923 ORDER BY startTime LIMIT 8. > > Also I know in the latest version of SQLite its possible to 'hint' > what indexes should be used, which might be helpful. There is also > something on the contrib page that lets you check a query to see what > indexes it does use. > > Sorry if the above lacks detail, its kinda a busy day. > > -- > Ian > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > -- View this message in context: http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23169251.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi column ORDER BY across table peformance problem....
> SELECT showing.startTime FROM showing JOIN channel ON > showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime LIMIT 8; I don't know if it would be faster... but I would have written that. SELECT showing.startTime FROM showing JOIN channel USING (stationId) WHERE startTime < 123923 ORDER BY startTime LIMIT 8. Also I know in the latest version of SQLite its possible to 'hint' what indexes should be used, which might be helpful. There is also something on the contrib page that lets you check a query to see what indexes it does use. Sorry if the above lacks detail, its kinda a busy day. -- Ian ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi column ORDER BY across table peformance problem....
That's completely untrue. The only reason for the difference in the name is because I changed it recently. MajoreChannelNumber is ChannelNumber, the name just changed. I've included the full schema and all the indices and indicated about how many records are in each table. The schema example is super simple and the question is one that anyone reasonably familiar with sqlite should be able to answer. sorka wrote: > > This should be simple but apparently it isn't. > > I have two tables: > "CREATE TABLE showing ( " > "showingIdINTEGER PRIMARY KEY, " > "stationId INTEGER, " > "startTime INTEGER, " > ") " > CREATE INDEX showing_startTime on showing(startTime); > > > AND > "CREATE TABLE channel ( " > " channelIdINTEGER PRIMARY KEY, " > " ChannelNumber INTEGER, " > " stationIdINTEGER, " > " ) " > CREATE INDEX channel_ChannelNumber on channel(ChannelNumber); > CREATE INDEX channel_stationId on channel(stationId); > > When I do this select: > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime LIMIT 8; > > I get back the correct 8 results in about 3 milliseconds. > > If I throw a secondary order term in there say ChannelNumber: > > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime, channel.ChannelMajorNumber LIMIT 8; > > It now takes over 120 seconds!!! > > I've tried various multi-column indices including one on > channel(stationId, ChannelNumber). > > No difference. > > As far as I can tell, when ordering on columns that cross tables, sqlite > will bring in all the records that match the equality or inequality search > term on the first column before it does a secondary sort. > > I have over 100,000 records in the showing table and about 100 records in > the channel table. > > Sqlite should be smart enough to do the secondary sort on ChannelNumber as > soon as it sees that the records coming back have a later startTime than > the previous one. i.e. Say the first 5 records have the same startTime and > different channel numbers. Then the next 5 records have a later start time > than the first. Sqlite should be smart enough to see this as the results > come back and do a secondary sort on ChannelNumber on the first 5 results > and then rinse and repeat. > > What appears to be happening is that even though startTime is indexed, > sqlite is bringing all 100K records into memory sorted by time before it > starts to sort by channel number. > > Is there a way to get sqlite to do the right thing? If there was only a > way to have a multi-column index that included columns from different > tables. Oh wait, there is, it's called an intermediate table. However the > cost of doing this is pretty high for reasons I can't go into here. > > Any ideas? Maybe I'm just doing something wrong and this should be simple. > > Thanks. > > > > -- View this message in context: http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23168893.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi column ORDER BY across table peformance problem....
sorry. Ive had good results with subqueries in resolving similar performance issues. --- On Tue, 4/21/09, sorka <sorka95...@gmail.com> wrote: From: sorka <sorka95...@gmail.com> Subject: Re: [sqlite] Multi column ORDER BY across table peformance problem.... To: sqlite-users@sqlite.org Date: Tuesday, April 21, 2009, 10:22 PM I can't think of any reason why this would result in a faster query. But I went ahead and tried it on the off chance that maybe specifying the channel.ChannelMajorNumber inside an inner select might somehow trick sqlite into doing the right thing. Unfortunately, the result was the same. A few ms without the channel number and over 2 minutes with it. Again, sqlite is bringing in all results, not just the limit before it sorts on the channel number. I tried your query but had to alias the inner columns so they could be selected in the outer query. SELECT startTime FROM (SELECT showing.startTime as startTime, channel.ChannelMajorNumber as ChannelMajorNumber FROM showing JOIN channel ON showing.startTime >= 1240362000 AND showing.stationId = channel.stationId ) ORDER BY startTime, ChannelMajorNumber LIMIT 8; sorka wrote: > > This should be simple but apparently it isn't. > > I have two tables: > "CREATE TABLE showing ( " > "showingIdINTEGER PRIMARY KEY, " > "stationId INTEGER, " > "startTime INTEGER, " > ") " > CREATE INDEX showing_startTime on showing(startTime); > > > AND > "CREATE TABLE channel ( " > " channelIdINTEGER PRIMARY KEY, " > " ChannelNumber INTEGER, " > " stationIdINTEGER, " > " ) " > CREATE INDEX channel_ChannelNumber on channel(ChannelNumber); > CREATE INDEX channel_stationId on channel(stationId); > > When I do this select: > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime LIMIT 8; > > I get back the correct 8 results in about 3 milliseconds. > > If I throw a secondary order term in there say ChannelNumber: > > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime, channel.ChannelMajorNumber LIMIT 8; > > It now takes over 120 seconds!!! > > I've tried various multi-column indices including one on > channel(stationId, ChannelNumber). > > No difference. > > As far as I can tell, when ordering on columns that cross tables, sqlite > will bring in all the records that match the equality or inequality search > term on the first column before it does a secondary sort. > > I have over 100,000 records in the showing table and about 100 records in > the channel table. > > Sqlite should be smart enough to do the secondary sort on ChannelNumber as > soon as it sees that the records coming back have a later startTime than > the previous one. i.e. Say the first 5 records have the same startTime and > different channel numbers. Then the next 5 records have a later start time > than the first. Sqlite should be smart enough to see this as the results > come back and do a secondary sort on ChannelNumber on the first 5 results > and then rinse and repeat. > > What appears to be happening is that even though startTime is indexed, > sqlite is bringing all 100K records into memory sorted by time before it > starts to sort by channel number. > > Is there a way to get sqlite to do the right thing? If there was only a > way to have a multi-column index that included columns from different > tables. Oh wait, there is, it's called an intermediate table. However the > cost of doing this is pretty high for reasons I can't go into here. > > Any ideas? Maybe I'm just doing something wrong and this should be simple. > > Thanks. > > > > -- View this message in context: http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23168567.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi column ORDER BY across table peformance problem....
On 22/04/2009 10:48 AM, sorka wrote: > Wow. Anybody? I figured this would be a simple question for the gurus on this > board. Seriously, nobody knows a better way to do this? Seriously, consider alternative possibilities > > > sorka wrote: >> This should be simple but apparently it isn't. >> >> I have two tables: >> "CREATE TABLE showing ( " >> "showingIdINTEGER PRIMARY KEY, " >> "stationId INTEGER, " >> "startTime INTEGER, " >> ") " >> CREATE INDEX showing_startTime on showing(startTime); >> >> >> AND >> "CREATE TABLE channel ( " >> " channelIdINTEGER PRIMARY KEY, " >> " ChannelNumber INTEGER, " >> " stationIdINTEGER, " >> " ) " >> CREATE INDEX channel_ChannelNumber on channel(ChannelNumber); >> CREATE INDEX channel_stationId on channel(stationId); >> >> When I do this select: >> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= >> 123923 AND showing.stationId = channel.stationId ORDER BY >> showing.startTime LIMIT 8; SQLite is smart enough to recognise that the showing_startTime index satisfies the ORDER BY clause ... so it's just traversing that index, filtering out cases of bogus stationID [I presume that's the reason for having the JOIN at all], and stopping when it's found 8 results. What is the result of: SELECT COUNT(*) FROM showing JOIN channel ON showing.startTime >= 123923 AND showing.stationId = channel.stationId; ? Is that 8 or 10 or somewhere in between? >> >> I get back the correct 8 results in about 3 milliseconds. >> >> If I throw a secondary order term in there say ChannelNumber: >> >> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= >> 123923 AND showing.stationId = channel.stationId ORDER BY >> showing.startTime, channel.ChannelMajorNumber LIMIT 8; What is channel.ChannelMajorNumber? It's not in the schema that you posted. People who might be interested in helping you might want to try running your queries prefaced with EXPLAIN and EXPLAIN QUERY PLAN (which doesn't need any data, just the CREATE statements). Perhaps they then give up. Your tables contain only presumingly meaningless integers. [You have given us the full schema for each table, haven't you? There aren't any 50 Kb BLOBs in there, are there?] So you wouldn't have a problem with making a zipped db with the two tables (no indexes) available on the web somewhere for people who wanted to experiment? >> >> It now takes over 120 seconds!!! >> >> I've tried various multi-column indices including one on >> channel(stationId, ChannelNumber). And was that index used by the query? Did you have ChannelNumber in the ORDER BY clause at the time, or something else? >> >> No difference. >> >> As far as I can tell, when ordering on columns that cross tables, sqlite >> will bring in all the records that match the equality or inequality search >> term on the first column before it does a secondary sort. What any SQL query engine has to do is produce results that are the same as those obtained by this method: 1. Produce temporary result set ignoring the ORDER BY and the LIMIT/OFFSET. This will include all columns from the SELECT plus any extra that appear only in the ORDER BY. 2. Sort the results according to the ORDER BY. 3. Read the sort output, discarding columns that appear only in the ORDER BY, and filtering according to the LIMIT/OFFSET. Is that what you mean? >> >> I have over 100,000 records in the showing table and about 100 records in >> the channel table. >> >> Sqlite should be smart enough to do the secondary sort on ChannelNumber as >> soon as it sees that the records coming back have a later startTime than >> the previous one. i.e. Say the first 5 records have the same startTime and >> different channel numbers. Then the next 5 records have a later start time >> than the first. Sqlite should be smart enough to see this as the results >> come back and do a secondary sort on ChannelNumber on the first 5 results >> and then rinse and repeat. What is "the secondary sort on ChannelNumber" that you imply is already being done, just not soon enough? >> >> What appears to be happening is that even though startTime is indexed, >> sqlite is bringing all 100K records into memory sorted by time before it >> starts to sort by channel number. Apart from that being one possible explanation of what does seem a long time, what grounds do you have for the above statement? >> >> Is there a way to get sqlite to do the right thing? If there was only a >> way to have a multi-column index that included columns from different >> tables. Oh wait, there is, it's called an intermediate table. However the >> cost of doing this is pretty high for reasons I can't go into here. >> >> Any ideas? Maybe I'm just
Re: [sqlite] Multi column ORDER BY across table peformance problem....
I can't think of any reason why this would result in a faster query. But I went ahead and tried it on the off chance that maybe specifying the channel.ChannelMajorNumber inside an inner select might somehow trick sqlite into doing the right thing. Unfortunately, the result was the same. A few ms without the channel number and over 2 minutes with it. Again, sqlite is bringing in all results, not just the limit before it sorts on the channel number. I tried your query but had to alias the inner columns so they could be selected in the outer query. SELECT startTime FROM (SELECT showing.startTime as startTime, channel.ChannelMajorNumber as ChannelMajorNumber FROM showing JOIN channel ON showing.startTime >= 1240362000 AND showing.stationId = channel.stationId ) ORDER BY startTime, ChannelMajorNumber LIMIT 8; sorka wrote: > > This should be simple but apparently it isn't. > > I have two tables: > "CREATE TABLE showing ( " > "showingIdINTEGER PRIMARY KEY, " > "stationId INTEGER, " > "startTime INTEGER, " > ") " > CREATE INDEX showing_startTime on showing(startTime); > > > AND > "CREATE TABLE channel ( " > " channelIdINTEGER PRIMARY KEY, " > " ChannelNumber INTEGER, " > " stationIdINTEGER, " > " ) " > CREATE INDEX channel_ChannelNumber on channel(ChannelNumber); > CREATE INDEX channel_stationId on channel(stationId); > > When I do this select: > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime LIMIT 8; > > I get back the correct 8 results in about 3 milliseconds. > > If I throw a secondary order term in there say ChannelNumber: > > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime, channel.ChannelMajorNumber LIMIT 8; > > It now takes over 120 seconds!!! > > I've tried various multi-column indices including one on > channel(stationId, ChannelNumber). > > No difference. > > As far as I can tell, when ordering on columns that cross tables, sqlite > will bring in all the records that match the equality or inequality search > term on the first column before it does a secondary sort. > > I have over 100,000 records in the showing table and about 100 records in > the channel table. > > Sqlite should be smart enough to do the secondary sort on ChannelNumber as > soon as it sees that the records coming back have a later startTime than > the previous one. i.e. Say the first 5 records have the same startTime and > different channel numbers. Then the next 5 records have a later start time > than the first. Sqlite should be smart enough to see this as the results > come back and do a secondary sort on ChannelNumber on the first 5 results > and then rinse and repeat. > > What appears to be happening is that even though startTime is indexed, > sqlite is bringing all 100K records into memory sorted by time before it > starts to sort by channel number. > > Is there a way to get sqlite to do the right thing? If there was only a > way to have a multi-column index that included columns from different > tables. Oh wait, there is, it's called an intermediate table. However the > cost of doing this is pretty high for reasons I can't go into here. > > Any ideas? Maybe I'm just doing something wrong and this should be simple. > > Thanks. > > > > -- View this message in context: http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23168567.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi column ORDER BY across table peformance problem....
try this select showing.startTime, from ( SELECT showing.startTime, channel.ChannelMajorNumber FROM showing JOIN channel on showing.stationId = channel.stationId Where showing.startTime >= 123923) ORDER BY showing.startTime, channel.ChannelMajorNumber LIMIT 8; --- On Tue, 4/21/09, sorka <sorka95...@gmail.com> wrote: From: sorka <sorka95...@gmail.com> Subject: Re: [sqlite] Multi column ORDER BY across table peformance problem To: sqlite-users@sqlite.org Date: Tuesday, April 21, 2009, 9:46 PM It's in my original post above. sorka wrote: > > This should be simple but apparently it isn't. > > I have two tables: > "CREATE TABLE showing ( " > "showingIdINTEGER PRIMARY KEY, " > "stationId INTEGER, " > "startTime INTEGER, " > ") " > CREATE INDEX showing_startTime on showing(startTime); > > > AND > "CREATE TABLE channel ( " > " channelIdINTEGER PRIMARY KEY, " > " ChannelNumber INTEGER, " > " stationIdINTEGER, " > " ) " > CREATE INDEX channel_ChannelNumber on channel(ChannelNumber); > CREATE INDEX channel_stationId on channel(stationId); > > When I do this select: > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime LIMIT 8; > > I get back the correct 8 results in about 3 milliseconds. > > If I throw a secondary order term in there say ChannelNumber: > > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime, channel.ChannelMajorNumber LIMIT 8; > > It now takes over 120 seconds!!! > > I've tried various multi-column indices including one on > channel(stationId, ChannelNumber). > > No difference. > > As far as I can tell, when ordering on columns that cross tables, sqlite > will bring in all the records that match the equality or inequality search > term on the first column before it does a secondary sort. > > I have over 100,000 records in the showing table and about 100 records in > the channel table. > > Sqlite should be smart enough to do the secondary sort on ChannelNumber as > soon as it sees that the records coming back have a later startTime than > the previous one. i.e. Say the first 5 records have the same startTime and > different channel numbers. Then the next 5 records have a later start time > than the first. Sqlite should be smart enough to see this as the results > come back and do a secondary sort on ChannelNumber on the first 5 results > and then rinse and repeat. > > What appears to be happening is that even though startTime is indexed, > sqlite is bringing all 100K records into memory sorted by time before it > starts to sort by channel number. > > Is there a way to get sqlite to do the right thing? If there was only a > way to have a multi-column index that included columns from different > tables. Oh wait, there is, it's called an intermediate table. However the > cost of doing this is pretty high for reasons I can't go into here. > > Any ideas? Maybe I'm just doing something wrong and this should be simple. > > Thanks. > > > > -- View this message in context: http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23168281.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi column ORDER BY across table peformance problem....
It's in my original post above. sorka wrote: > > This should be simple but apparently it isn't. > > I have two tables: > "CREATE TABLE showing ( " > "showingIdINTEGER PRIMARY KEY, " > "stationId INTEGER, " > "startTime INTEGER, " > ") " > CREATE INDEX showing_startTime on showing(startTime); > > > AND > "CREATE TABLE channel ( " > " channelIdINTEGER PRIMARY KEY, " > " ChannelNumber INTEGER, " > " stationIdINTEGER, " > " ) " > CREATE INDEX channel_ChannelNumber on channel(ChannelNumber); > CREATE INDEX channel_stationId on channel(stationId); > > When I do this select: > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime LIMIT 8; > > I get back the correct 8 results in about 3 milliseconds. > > If I throw a secondary order term in there say ChannelNumber: > > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime, channel.ChannelMajorNumber LIMIT 8; > > It now takes over 120 seconds!!! > > I've tried various multi-column indices including one on > channel(stationId, ChannelNumber). > > No difference. > > As far as I can tell, when ordering on columns that cross tables, sqlite > will bring in all the records that match the equality or inequality search > term on the first column before it does a secondary sort. > > I have over 100,000 records in the showing table and about 100 records in > the channel table. > > Sqlite should be smart enough to do the secondary sort on ChannelNumber as > soon as it sees that the records coming back have a later startTime than > the previous one. i.e. Say the first 5 records have the same startTime and > different channel numbers. Then the next 5 records have a later start time > than the first. Sqlite should be smart enough to see this as the results > come back and do a secondary sort on ChannelNumber on the first 5 results > and then rinse and repeat. > > What appears to be happening is that even though startTime is indexed, > sqlite is bringing all 100K records into memory sorted by time before it > starts to sort by channel number. > > Is there a way to get sqlite to do the right thing? If there was only a > way to have a multi-column index that included columns from different > tables. Oh wait, there is, it's called an intermediate table. However the > cost of doing this is pretty high for reasons I can't go into here. > > Any ideas? Maybe I'm just doing something wrong and this should be simple. > > Thanks. > > > > -- View this message in context: http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23168281.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi column ORDER BY across table peformance problem....
send me your original query please. thanks Woody --- On Tue, 4/21/09, sorka <sorka95...@gmail.com> wrote: From: sorka <sorka95...@gmail.com> Subject: Re: [sqlite] Multi column ORDER BY across table peformance problem.... To: sqlite-users@sqlite.org Date: Tuesday, April 21, 2009, 9:29 PM No. This can't be broken down into a query within a query. Perhaps maybe if you spell out an example of what you're thinking? Thanks. sorka wrote: > > This should be simple but apparently it isn't. > > I have two tables: > "CREATE TABLE showing ( " > "showingIdINTEGER PRIMARY KEY, " > "stationId INTEGER, " > "startTime INTEGER, " > ") " > CREATE INDEX showing_startTime on showing(startTime); > > > AND > "CREATE TABLE channel ( " > " channelIdINTEGER PRIMARY KEY, " > " ChannelNumber INTEGER, " > " stationIdINTEGER, " > " ) " > CREATE INDEX channel_ChannelNumber on channel(ChannelNumber); > CREATE INDEX channel_stationId on channel(stationId); > > When I do this select: > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime LIMIT 8; > > I get back the correct 8 results in about 3 milliseconds. > > If I throw a secondary order term in there say ChannelNumber: > > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime, channel.ChannelMajorNumber LIMIT 8; > > It now takes over 120 seconds!!! > > I've tried various multi-column indices including one on > channel(stationId, ChannelNumber). > > No difference. > > As far as I can tell, when ordering on columns that cross tables, sqlite > will bring in all the records that match the equality or inequality search > term on the first column before it does a secondary sort. > > I have over 100,000 records in the showing table and about 100 records in > the channel table. > > Sqlite should be smart enough to do the secondary sort on ChannelNumber as > soon as it sees that the records coming back have a later startTime than > the previous one. i.e. Say the first 5 records have the same startTime and > different channel numbers. Then the next 5 records have a later start time > than the first. Sqlite should be smart enough to see this as the results > come back and do a secondary sort on ChannelNumber on the first 5 results > and then rinse and repeat. > > What appears to be happening is that even though startTime is indexed, > sqlite is bringing all 100K records into memory sorted by time before it > starts to sort by channel number. > > Is there a way to get sqlite to do the right thing? If there was only a > way to have a multi-column index that included columns from different > tables. Oh wait, there is, it's called an intermediate table. However the > cost of doing this is pretty high for reasons I can't go into here. > > Any ideas? Maybe I'm just doing something wrong and this should be simple. > > Thanks. > > > > -- View this message in context: http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23168105.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi column ORDER BY across table peformance problem....
No. This can't be broken down into a query within a query. Perhaps maybe if you spell out an example of what you're thinking? Thanks. sorka wrote: > > This should be simple but apparently it isn't. > > I have two tables: > "CREATE TABLE showing ( " > "showingIdINTEGER PRIMARY KEY, " > "stationId INTEGER, " > "startTime INTEGER, " > ") " > CREATE INDEX showing_startTime on showing(startTime); > > > AND > "CREATE TABLE channel ( " > " channelIdINTEGER PRIMARY KEY, " > " ChannelNumber INTEGER, " > " stationIdINTEGER, " > " ) " > CREATE INDEX channel_ChannelNumber on channel(ChannelNumber); > CREATE INDEX channel_stationId on channel(stationId); > > When I do this select: > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime LIMIT 8; > > I get back the correct 8 results in about 3 milliseconds. > > If I throw a secondary order term in there say ChannelNumber: > > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime, channel.ChannelMajorNumber LIMIT 8; > > It now takes over 120 seconds!!! > > I've tried various multi-column indices including one on > channel(stationId, ChannelNumber). > > No difference. > > As far as I can tell, when ordering on columns that cross tables, sqlite > will bring in all the records that match the equality or inequality search > term on the first column before it does a secondary sort. > > I have over 100,000 records in the showing table and about 100 records in > the channel table. > > Sqlite should be smart enough to do the secondary sort on ChannelNumber as > soon as it sees that the records coming back have a later startTime than > the previous one. i.e. Say the first 5 records have the same startTime and > different channel numbers. Then the next 5 records have a later start time > than the first. Sqlite should be smart enough to see this as the results > come back and do a secondary sort on ChannelNumber on the first 5 results > and then rinse and repeat. > > What appears to be happening is that even though startTime is indexed, > sqlite is bringing all 100K records into memory sorted by time before it > starts to sort by channel number. > > Is there a way to get sqlite to do the right thing? If there was only a > way to have a multi-column index that included columns from different > tables. Oh wait, there is, it's called an intermediate table. However the > cost of doing this is pretty high for reasons I can't go into here. > > Any ideas? Maybe I'm just doing something wrong and this should be simple. > > Thanks. > > > > -- View this message in context: http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23168105.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi column ORDER BY across table peformance problem....
did you try the subquery method i suggested? Woody --- On Tue, 4/21/09, sorka <sorka95...@gmail.com> wrote: From: sorka <sorka95...@gmail.com> Subject: Re: [sqlite] Multi column ORDER BY across table peformance problem.... To: sqlite-users@sqlite.org Date: Tuesday, April 21, 2009, 8:48 PM Wow. Anybody? I figured this would be a simple question for the gurus on this board. Seriously, nobody knows a better way to do this? sorka wrote: > > This should be simple but apparently it isn't. > > I have two tables: > "CREATE TABLE showing ( " > "showingIdINTEGER PRIMARY KEY, " > "stationId INTEGER, " > "startTime INTEGER, " > ") " > CREATE INDEX showing_startTime on showing(startTime); > > > AND > "CREATE TABLE channel ( " > " channelIdINTEGER PRIMARY KEY, " > " ChannelNumber INTEGER, " > " stationIdINTEGER, " > " ) " > CREATE INDEX channel_ChannelNumber on channel(ChannelNumber); > CREATE INDEX channel_stationId on channel(stationId); > > When I do this select: > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime LIMIT 8; > > I get back the correct 8 results in about 3 milliseconds. > > If I throw a secondary order term in there say ChannelNumber: > > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime, channel.ChannelMajorNumber LIMIT 8; > > It now takes over 120 seconds!!! > > I've tried various multi-column indices including one on > channel(stationId, ChannelNumber). > > No difference. > > As far as I can tell, when ordering on columns that cross tables, sqlite > will bring in all the records that match the equality or inequality search > term on the first column before it does a secondary sort. > > I have over 100,000 records in the showing table and about 100 records in > the channel table. > > Sqlite should be smart enough to do the secondary sort on ChannelNumber as > soon as it sees that the records coming back have a later startTime than > the previous one. i.e. Say the first 5 records have the same startTime and > different channel numbers. Then the next 5 records have a later start time > than the first. Sqlite should be smart enough to see this as the results > come back and do a secondary sort on ChannelNumber on the first 5 results > and then rinse and repeat. > > What appears to be happening is that even though startTime is indexed, > sqlite is bringing all 100K records into memory sorted by time before it > starts to sort by channel number. > > Is there a way to get sqlite to do the right thing? If there was only a > way to have a multi-column index that included columns from different > tables. Oh wait, there is, it's called an intermediate table. However the > cost of doing this is pretty high for reasons I can't go into here. > > Any ideas? Maybe I'm just doing something wrong and this should be simple. > > Thanks. > > > > -- View this message in context: http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23166621.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi column ORDER BY across table peformance problem....
Wow. Anybody? I figured this would be a simple question for the gurus on this board. Seriously, nobody knows a better way to do this? sorka wrote: > > This should be simple but apparently it isn't. > > I have two tables: > "CREATE TABLE showing ( " > "showingIdINTEGER PRIMARY KEY, " > "stationId INTEGER, " > "startTime INTEGER, " > ") " > CREATE INDEX showing_startTime on showing(startTime); > > > AND > "CREATE TABLE channel ( " > " channelIdINTEGER PRIMARY KEY, " > " ChannelNumber INTEGER, " > " stationIdINTEGER, " > " ) " > CREATE INDEX channel_ChannelNumber on channel(ChannelNumber); > CREATE INDEX channel_stationId on channel(stationId); > > When I do this select: > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime LIMIT 8; > > I get back the correct 8 results in about 3 milliseconds. > > If I throw a secondary order term in there say ChannelNumber: > > SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= > 123923 AND showing.stationId = channel.stationId ORDER BY > showing.startTime, channel.ChannelMajorNumber LIMIT 8; > > It now takes over 120 seconds!!! > > I've tried various multi-column indices including one on > channel(stationId, ChannelNumber). > > No difference. > > As far as I can tell, when ordering on columns that cross tables, sqlite > will bring in all the records that match the equality or inequality search > term on the first column before it does a secondary sort. > > I have over 100,000 records in the showing table and about 100 records in > the channel table. > > Sqlite should be smart enough to do the secondary sort on ChannelNumber as > soon as it sees that the records coming back have a later startTime than > the previous one. i.e. Say the first 5 records have the same startTime and > different channel numbers. Then the next 5 records have a later start time > than the first. Sqlite should be smart enough to see this as the results > come back and do a secondary sort on ChannelNumber on the first 5 results > and then rinse and repeat. > > What appears to be happening is that even though startTime is indexed, > sqlite is bringing all 100K records into memory sorted by time before it > starts to sort by channel number. > > Is there a way to get sqlite to do the right thing? If there was only a > way to have a multi-column index that included columns from different > tables. Oh wait, there is, it's called an intermediate table. However the > cost of doing this is pretty high for reasons I can't go into here. > > Any ideas? Maybe I'm just doing something wrong and this should be simple. > > Thanks. > > > > -- View this message in context: http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23166621.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi column ORDER BY across table peformance problem....
just curious; have you tried doing the select in a subquery, and then the order by in the outer query? woody --- On Fri, 4/17/09, sorka <sorka95...@gmail.com> wrote: From: sorka <sorka95...@gmail.com> Subject: [sqlite] Multi column ORDER BY across table peformance problem.... To: sqlite-users@sqlite.org Date: Friday, April 17, 2009, 11:12 PM This should be simple but apparently it isn't. I have two tables: "CREATE TABLE showing ( " "showingIdINTEGER PRIMARY KEY, " "stationId INTEGER, " "startTime INTEGER, " ") " CREATE INDEX showing_startTime on showing(startTime); AND "CREATE TABLE channel ( " " channelIdINTEGER PRIMARY KEY, " " ChannelNumber INTEGER, " " stationIdINTEGER, " " ) " CREATE INDEX channel_ChannelNumber on channel(ChannelNumber); CREATE INDEX channel_stationId on channel(stationId); When I do this select: SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= 123923 AND showing.stationId = channel.stationId ORDER BY showing.startTime LIMIT 8; I get back the correct 8 results in about 3 milliseconds. If I throw a secondary order term in there say ChannelNumber: SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= 123923 AND showing.stationId = channel.stationId ORDER BY showing.startTime, channel.ChannelMajorNumber LIMIT 8; It now takes over 120 seconds!!! I've tried various multi-column indices including one on channel(stationId, ChannelNumber). No difference. As far as I can tell, when ordering on columns that cross tables, sqlite will bring in all the records that match the equality or inequality search term on the first column before it does a secondary sort. I have over 100,000 records in the showing table and about 100 records in the channel table. Sqlite should be smart enough to do the secondary sort on ChannelNumber as soon as it sees that the records coming back have a later startTime than the previous one. i.e. Say the first 5 records have the same startTime and different channel numbers. Then the next 5 records have a later start time than the first. Sqlite should be smart enough to see this as the results come back and do a secondary sort on ChannelNumber on the first 5 results and then rinse and repeat. What appears to be happening is that even though startTime is indexed, sqlite is bringing all 100K records into memory sorted by time before it starts to sort by channel number. Is there a way to get sqlite to do the right thing? If there was only a way to have a multi-column index that included columns from different tables. Oh wait, there is, it's called an intermediate table. However the cost of doing this is pretty high for reasons I can't go into here. Any ideas? Maybe I'm just doing something wrong and this should be simple. Thanks. -- View this message in context: http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23109024.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Multi column ORDER BY across table peformance problem....
This should be simple but apparently it isn't. I have two tables: "CREATE TABLE showing ( " "showingIdINTEGER PRIMARY KEY, " "stationId INTEGER, " "startTime INTEGER, " ") " CREATE INDEX showing_startTime on showing(startTime); AND "CREATE TABLE channel ( " " channelIdINTEGER PRIMARY KEY, " " ChannelNumber INTEGER, " " stationIdINTEGER, " " ) " CREATE INDEX channel_ChannelNumber on channel(ChannelNumber); CREATE INDEX channel_stationId on channel(stationId); When I do this select: SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= 123923 AND showing.stationId = channel.stationId ORDER BY showing.startTime LIMIT 8; I get back the correct 8 results in about 3 milliseconds. If I throw a secondary order term in there say ChannelNumber: SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >= 123923 AND showing.stationId = channel.stationId ORDER BY showing.startTime, channel.ChannelMajorNumber LIMIT 8; It now takes over 120 seconds!!! I've tried various multi-column indices including one on channel(stationId, ChannelNumber). No difference. As far as I can tell, when ordering on columns that cross tables, sqlite will bring in all the records that match the equality or inequality search term on the first column before it does a secondary sort. I have over 100,000 records in the showing table and about 100 records in the channel table. Sqlite should be smart enough to do the secondary sort on ChannelNumber as soon as it sees that the records coming back have a later startTime than the previous one. i.e. Say the first 5 records have the same startTime and different channel numbers. Then the next 5 records have a later start time than the first. Sqlite should be smart enough to see this as the results come back and do a secondary sort on ChannelNumber on the first 5 results and then rinse and repeat. What appears to be happening is that even though startTime is indexed, sqlite is bringing all 100K records into memory sorted by time before it starts to sort by channel number. Is there a way to get sqlite to do the right thing? If there was only a way to have a multi-column index that included columns from different tables. Oh wait, there is, it's called an intermediate table. However the cost of doing this is pretty high for reasons I can't go into here. Any ideas? Maybe I'm just doing something wrong and this should be simple. Thanks. -- View this message in context: http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23109024.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users