Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-23 Thread sorka

For this particular query, the only reason for the join on the channel table
is simply to order by channel number and that's it.

Currently, what I'm doing is leaving off the order by channel number. Since
this is done through the C++ interface, I'm stepping through results until a
showing with a new and greater time comes along. At that point, I take all
the showings I got with the previous start time and sort them all manually
with qsort by channel number. As soon as I have enough results for what I'm
looking for, I stop stepping through rows and finish the transaction.

It works well enough but it really seems that sql should have the ability to
be smarter about ordering.


Jeremy Hinegardner wrote:
> 
> Ah, i see, so you are going to be calling this many times, and slowly
> incrementing the value of startTime.  Yes, in this query, if you have more
> than
> 8 shows at a particular start time, then when you increment the time to go
> to
> the next 'batch' of shows you will miss all the ones from then end of the
> previous time.
> 
> If that is the case, how about just making the inner query 'WHERE
> showing.startTme = ' and leave off the limit, and call the
> whole
> thing repeatedly?  I doubt that would do what you want, since you querying
> for
> startTime which is what your where clause, so you'll either get back rows
> that
> are all the value of your where clause or no rows.
> 
> I guess more information about the whole problem is going to be in
> necessary to
> help further.  For instance, it appears you have a table of 'showings' and
> a
> table of 'channels' and the goal is, find the next N items from the
> showings
> table for channels in the channel table. 
> 
> enjoy,
> 
> -jeremy
> 
> 
> 
> On Tue, Apr 21, 2009 at 10:11:41PM -0700, sorka wrote:
>> 
>> Um, well, except that I'd have huge gaping holes. i.e. say there are 100
>> shows at or after the specified time all on different channels. The above
>> query would limit to the first 8 showings on or after that time *before*
>> sorting by channel number. 
>> 
>> 
>> 
>> Jeremy Hinegardner wrote:
>> > 
>> > How about trying:
>> > 
>> >   SELECT s.startTime
>> > FROM ( SELECT  showing.startTime AS startTime
>> >   ,showing.stationId AS stationId
>> >   FROM showing
>> >  WHERE showing.startTime >= 123923
>> >   ORDER BY showing.startTime
>> >LIMIT 8 ) AS s
>> > JOIN channel AS c
>> >   ON s.stationId = c.stationId
>> > ORDER BY s.startTime, c.ChannelMajorNumber
>> > 
>> > I believe this should limit your table of 100,000 rows down to just the
>> 8
>> > you
>> > want on the inner query, and then join that against the 100 rows in the
>> > channel
>> > table.
>> > 
>> > Although, to tell you the truth, I don't see the purpose in your
>> original
>> > query:
>> > 
>> > SELECT showing.startTime 
>> >   FROM showing 
>> >   JOIN channel 
>> > ON showing.startTime >= 123923 
>> >AND showing.stationId = channel.stationId
>> >   ORDER BY showing.startTime, channel.ChannelMajorNumber 
>> >  LIMIT 8;
>> > 
>> > You are not doing anything with the channel table data other than
>> joining
>> > it
>> > agains the showing tables.  To me that means you are joining against
>> the
>> > channel
>> > table in order to filter out rows in the showing table that have
>> > stationId's
>> > that do not exist in the channel table.  
>> > 
>> > Is that correct?  If that is correct, then the query I gave will not do
>> > what you
>> > want.  
>> > 
>> > Can the goal you are attempting be stated as?
>> > 
>> > Find next or earliest showings from the showings table for channels
>> in
>> > the
>> > channel table?
>> > 
>> > If this is the case, maybe this query will work:
>> > 
>> >   SELECT showing.startTime
>> > FROM showing
>> >WHERE showing.startTime >= 123923000
>> >  AND showing.stationId IN ( SELECT DISTINCT channel.stationId
>> FROM
>> > channel )
>> >LIMIT 8;
>> > 
>> > enjoy,
>> > 
>> > -jeremy
>> > 
>> >  
>> > On Tue, Apr 21, 2009 at 08:58:56PM -0700, sorka wrote:
>> >> 
>> >> Writing the query as you said you would returns results in nearly
>> >> instantly,
>> >> within a few ms. 
>> >> 
>> >> The problem is when you add a secondary ordering field. The intended
>> >> indices
>> >> are being used.
>> >> 
>> >> The problem, as I've explained several times already is that there is
>> no
>> >> way
>> >> to create a multicolumn index across tables. Sqlite is using the
>> >> stationId
>> >> index on the channel table to join showings via the stationId. The
>> query
>> >> plan shows it clearly. When ordering by startTime or even startTime
>> and
>> >> stationId, the results are returned in a few ms. But when adding
>> channel
>> >> number(i.e startTime and then channel number), it jumps to two
>> minutes.
>> >> This
>> >> is because sqlite is bringing in nearly 100K 

Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-23 Thread Jeremy Hinegardner
Ah, i see, so you are going to be calling this many times, and slowly
incrementing the value of startTime.  Yes, in this query, if you have more than
8 shows at a particular start time, then when you increment the time to go to
the next 'batch' of shows you will miss all the ones from then end of the
previous time.

If that is the case, how about just making the inner query 'WHERE
showing.startTme = ' and leave off the limit, and call the whole
thing repeatedly?  I doubt that would do what you want, since you querying for
startTime which is what your where clause, so you'll either get back rows that
are all the value of your where clause or no rows.

I guess more information about the whole problem is going to be in necessary to
help further.  For instance, it appears you have a table of 'showings' and a
table of 'channels' and the goal is, find the next N items from the showings
table for channels in the channel table. 

enjoy,

-jeremy



On Tue, Apr 21, 2009 at 10:11:41PM -0700, sorka wrote:
> 
> Um, well, except that I'd have huge gaping holes. i.e. say there are 100
> shows at or after the specified time all on different channels. The above
> query would limit to the first 8 showings on or after that time *before*
> sorting by channel number. 
> 
> 
> 
> Jeremy Hinegardner wrote:
> > 
> > How about trying:
> > 
> >   SELECT s.startTime
> > FROM ( SELECT  showing.startTime AS startTime
> >   ,showing.stationId AS stationId
> >   FROM showing
> >  WHERE showing.startTime >= 123923
> >   ORDER BY showing.startTime
> >LIMIT 8 ) AS s
> > JOIN channel AS c
> >   ON s.stationId = c.stationId
> > ORDER BY s.startTime, c.ChannelMajorNumber
> > 
> > I believe this should limit your table of 100,000 rows down to just the 8
> > you
> > want on the inner query, and then join that against the 100 rows in the
> > channel
> > table.
> > 
> > Although, to tell you the truth, I don't see the purpose in your original
> > query:
> > 
> > SELECT showing.startTime 
> >   FROM showing 
> >   JOIN channel 
> > ON showing.startTime >= 123923 
> >AND showing.stationId = channel.stationId
> >   ORDER BY showing.startTime, channel.ChannelMajorNumber 
> >  LIMIT 8;
> > 
> > You are not doing anything with the channel table data other than joining
> > it
> > agains the showing tables.  To me that means you are joining against the
> > channel
> > table in order to filter out rows in the showing table that have
> > stationId's
> > that do not exist in the channel table.  
> > 
> > Is that correct?  If that is correct, then the query I gave will not do
> > what you
> > want.  
> > 
> > Can the goal you are attempting be stated as?
> > 
> > Find next or earliest showings from the showings table for channels in
> > the
> > channel table?
> > 
> > If this is the case, maybe this query will work:
> > 
> >   SELECT showing.startTime
> > FROM showing
> >WHERE showing.startTime >= 123923000
> >  AND showing.stationId IN ( SELECT DISTINCT channel.stationId FROM
> > channel )
> >LIMIT 8;
> > 
> > enjoy,
> > 
> > -jeremy
> > 
> >  
> > On Tue, Apr 21, 2009 at 08:58:56PM -0700, sorka wrote:
> >> 
> >> Writing the query as you said you would returns results in nearly
> >> instantly,
> >> within a few ms. 
> >> 
> >> The problem is when you add a secondary ordering field. The intended
> >> indices
> >> are being used.
> >> 
> >> The problem, as I've explained several times already is that there is no
> >> way
> >> to create a multicolumn index across tables. Sqlite is using the
> >> stationId
> >> index on the channel table to join showings via the stationId. The query
> >> plan shows it clearly. When ordering by startTime or even startTime and
> >> stationId, the results are returned in a few ms. But when adding channel
> >> number(i.e startTime and then channel number), it jumps to two minutes.
> >> This
> >> is because sqlite is bringing in nearly 100K records from the showing
> >> table,
> >> sorted by time, because of the startTime index, before it starts sorting
> >> by
> >> channel number. It would sure be nice to be able to get sqlite to sort on
> >> the channel number for each time group. i.e. once all the 10:00 PM
> >> showings
> >> have been returned and the first 10:30 PM showing is about to be
> >> returned,
> >> sort all of the 10:00 PM showings by channel number. Sqlite is NOT doing
> >> this but should. 
> >> 
> >> At this point, I'm going to assume that this is just a limitation in the
> >> sqlite engine and I'll have to proceed in other ways to solve this issue.
> >> 
> >> 
> >> 
> >> Ian Walters wrote:
> >> > 
> >> >> SELECT showing.startTime FROM showing JOIN channel ON  
> >> >> showing.startTime >=
> >> >> 123923 AND showing.stationId = channel.stationId ORDER BY
> >> >> showing.startTime LIMIT 8;
> >> > 
> >> > I don't 

Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-21 Thread Harold Wood
how about
 

select Starttime
 from channel C
 inner join (select showing.startTime AS startTime,
    showing.stationId AS stationId
    FROM showing
   WHERE showing.startTime >= 123923
     ORDER BY showing.startTime
     LIMIT 8 ) AS s
   on S.StationId = C.stationId
 ORDER BY s.startTime, c.ChannelMajorNumber
 limit 8
 


--- On Wed, 4/22/09, sorka <sorka95...@gmail.com> wrote:

From: sorka <sorka95...@gmail.com>
Subject: Re: [sqlite] Multi column ORDER BY across table peformance problem
To: sqlite-users@sqlite.org
Date: Wednesday, April 22, 2009, 1:11 AM

Um, well, except that I'd have huge gaping holes. i.e. say there are 100
shows at or after the specified time all on different channels. The above
query would limit to the first 8 showings on or after that time *before*
sorting by channel number. 



Jeremy Hinegardner wrote:
> 
> How about trying:
> 
>   SELECT s.startTime
> FROM ( SELECT  showing.startTime AS startTime
>   ,showing.stationId AS stationId
>   FROM showing
>  WHERE showing.startTime >= 123923
>   ORDER BY showing.startTime
>LIMIT 8 ) AS s
> JOIN channel AS c
>   ON s.stationId = c.stationId
> ORDER BY s.startTime, c.ChannelMajorNumber
> 
> I believe this should limit your table of 100,000 rows down to just the 8
> you
> want on the inner query, and then join that against the 100 rows in the
> channel
> table.
> 
> Although, to tell you the truth, I don't see the purpose in your
original
> query:
> 
> SELECT showing.startTime 
>   FROM showing 
>   JOIN channel 
> ON showing.startTime >= 123923 
>AND showing.stationId = channel.stationId
>   ORDER BY showing.startTime, channel.ChannelMajorNumber 
>  LIMIT 8;
> 
> You are not doing anything with the channel table data other than joining
> it
> agains the showing tables.  To me that means you are joining against the
> channel
> table in order to filter out rows in the showing table that have
> stationId's
> that do not exist in the channel table.  
> 
> Is that correct?  If that is correct, then the query I gave will not do
> what you
> want.  
> 
> Can the goal you are attempting be stated as?
> 
> Find next or earliest showings from the showings table for channels in
> the
> channel table?
> 
> If this is the case, maybe this query will work:
> 
>   SELECT showing.startTime
> FROM showing
>WHERE showing.startTime >= 123923000
>  AND showing.stationId IN ( SELECT DISTINCT channel.stationId FROM
> channel )
>LIMIT 8;
> 
> enjoy,
> 
> -jeremy
> 
>  
> On Tue, Apr 21, 2009 at 08:58:56PM -0700, sorka wrote:
>> 
>> Writing the query as you said you would returns results in nearly
>> instantly,
>> within a few ms. 
>> 
>> The problem is when you add a secondary ordering field. The intended
>> indices
>> are being used.
>> 
>> The problem, as I've explained several times already is that there
is no
>> way
>> to create a multicolumn index across tables. Sqlite is using the
>> stationId
>> index on the channel table to join showings via the stationId. The
query
>> plan shows it clearly. When ordering by startTime or even startTime
and
>> stationId, the results are returned in a few ms. But when adding
channel
>> number(i.e startTime and then channel number), it jumps to two
minutes.
>> This
>> is because sqlite is bringing in nearly 100K records from the showing
>> table,
>> sorted by time, because of the startTime index, before it starts
sorting
>> by
>> channel number. It would sure be nice to be able to get sqlite to sort
on
>> the channel number for each time group. i.e. once all the 10:00 PM
>> showings
>> have been returned and the first 10:30 PM showing is about to be
>> returned,
>> sort all of the 10:00 PM showings by channel number. Sqlite is NOT
doing
>> this but should. 
>> 
>> At this point, I'm going to assume that this is just a limitation
in the
>> sqlite engine and I'll have to proceed in other ways to solve this
issue.
>> 
>> 
>> 
>> Ian Walters wrote:
>> > 
>> >> SELECT showing.startTime FROM showing JOIN channel ON  
>> >> showing.startTime >=
>> >> 123923 AND showing.stationId = channel.stationId ORDER BY
>> >> showing.startTime LIMIT 8;
>> > 
>> > I don't know if it would be faster... but I would

Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-21 Thread sorka

Um, well, except that I'd have huge gaping holes. i.e. say there are 100
shows at or after the specified time all on different channels. The above
query would limit to the first 8 showings on or after that time *before*
sorting by channel number. 



Jeremy Hinegardner wrote:
> 
> How about trying:
> 
>   SELECT s.startTime
> FROM ( SELECT  showing.startTime AS startTime
>   ,showing.stationId AS stationId
>   FROM showing
>  WHERE showing.startTime >= 123923
>   ORDER BY showing.startTime
>LIMIT 8 ) AS s
> JOIN channel AS c
>   ON s.stationId = c.stationId
> ORDER BY s.startTime, c.ChannelMajorNumber
> 
> I believe this should limit your table of 100,000 rows down to just the 8
> you
> want on the inner query, and then join that against the 100 rows in the
> channel
> table.
> 
> Although, to tell you the truth, I don't see the purpose in your original
> query:
> 
> SELECT showing.startTime 
>   FROM showing 
>   JOIN channel 
> ON showing.startTime >= 123923 
>AND showing.stationId = channel.stationId
>   ORDER BY showing.startTime, channel.ChannelMajorNumber 
>  LIMIT 8;
> 
> You are not doing anything with the channel table data other than joining
> it
> agains the showing tables.  To me that means you are joining against the
> channel
> table in order to filter out rows in the showing table that have
> stationId's
> that do not exist in the channel table.  
> 
> Is that correct?  If that is correct, then the query I gave will not do
> what you
> want.  
> 
> Can the goal you are attempting be stated as?
> 
> Find next or earliest showings from the showings table for channels in
> the
> channel table?
> 
> If this is the case, maybe this query will work:
> 
>   SELECT showing.startTime
> FROM showing
>WHERE showing.startTime >= 123923000
>  AND showing.stationId IN ( SELECT DISTINCT channel.stationId FROM
> channel )
>LIMIT 8;
> 
> enjoy,
> 
> -jeremy
> 
>  
> On Tue, Apr 21, 2009 at 08:58:56PM -0700, sorka wrote:
>> 
>> Writing the query as you said you would returns results in nearly
>> instantly,
>> within a few ms. 
>> 
>> The problem is when you add a secondary ordering field. The intended
>> indices
>> are being used.
>> 
>> The problem, as I've explained several times already is that there is no
>> way
>> to create a multicolumn index across tables. Sqlite is using the
>> stationId
>> index on the channel table to join showings via the stationId. The query
>> plan shows it clearly. When ordering by startTime or even startTime and
>> stationId, the results are returned in a few ms. But when adding channel
>> number(i.e startTime and then channel number), it jumps to two minutes.
>> This
>> is because sqlite is bringing in nearly 100K records from the showing
>> table,
>> sorted by time, because of the startTime index, before it starts sorting
>> by
>> channel number. It would sure be nice to be able to get sqlite to sort on
>> the channel number for each time group. i.e. once all the 10:00 PM
>> showings
>> have been returned and the first 10:30 PM showing is about to be
>> returned,
>> sort all of the 10:00 PM showings by channel number. Sqlite is NOT doing
>> this but should. 
>> 
>> At this point, I'm going to assume that this is just a limitation in the
>> sqlite engine and I'll have to proceed in other ways to solve this issue.
>> 
>> 
>> 
>> Ian Walters wrote:
>> > 
>> >> SELECT showing.startTime FROM showing JOIN channel ON  
>> >> showing.startTime >=
>> >> 123923 AND showing.stationId = channel.stationId ORDER BY
>> >> showing.startTime LIMIT 8;
>> > 
>> > I don't know if it would be faster... but I would have written that.
>> > 
>> > SELECT showing.startTime FROM showing JOIN channel USING (stationId)  
>> > WHERE startTime < 123923 ORDER BY startTime LIMIT 8.
>> > 
>> > Also I know in the latest version of SQLite its possible to 'hint'  
>> > what indexes should be used, which might be helpful.  There is also  
>> > something on the contrib page that lets you check a query to see what  
>> > indexes it does use.
>> > 
>> > Sorry if the above lacks detail, its kinda a busy day.
>> > 
>> > --
>> > Ian
>> > ___
>> > sqlite-users mailing list
>> > sqlite-users@sqlite.org
>> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>> > 
>> > 
>> 
>> -- 
>> View this message in context:
>> http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23169251.html
>> Sent from the SQLite mailing list archive at Nabble.com.
>> 
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 
> -- 
> 
>  Jeremy Hinegardner   

Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-21 Thread Jeremy Hinegardner
That last query should be:

   SELECT showing.startTime
 FROM showing
WHERE showing.startTime >= 123923000
  AND showing.stationId IN ( SELECT DISTINCT channel.stationId FROM 
channel )
 ORDER BY showing.startTime
LIMIT 8;

I forgot the order by.

enjoy,

-jeremy


On Tue, Apr 21, 2009 at 11:07:15PM -0600, Jeremy Hinegardner wrote:
> How about trying:
> 
>   SELECT s.startTime
> FROM ( SELECT  showing.startTime AS startTime
>   ,showing.stationId AS stationId
>   FROM showing
>  WHERE showing.startTime >= 123923
>   ORDER BY showing.startTime
>LIMIT 8 ) AS s
> JOIN channel AS c
>   ON s.stationId = c.stationId
> ORDER BY s.startTime, c.ChannelMajorNumber
> 
> I believe this should limit your table of 100,000 rows down to just the 8 you
> want on the inner query, and then join that against the 100 rows in the 
> channel
> table.
> 
> Although, to tell you the truth, I don't see the purpose in your original 
> query:
> 
> SELECT showing.startTime 
>   FROM showing 
>   JOIN channel 
> ON showing.startTime >= 123923 
>AND showing.stationId = channel.stationId
>   ORDER BY showing.startTime, channel.ChannelMajorNumber 
>  LIMIT 8;
> 
> You are not doing anything with the channel table data other than joining it
> agains the showing tables.  To me that means you are joining against the 
> channel
> table in order to filter out rows in the showing table that have stationId's
> that do not exist in the channel table.  
> 
> Is that correct?  If that is correct, then the query I gave will not do what 
> you
> want.  
> 
> Can the goal you are attempting be stated as?
> 
> Find next or earliest showings from the showings table for channels in the
> channel table?
> 
> If this is the case, maybe this query will work:
> 
>   SELECT showing.startTime
> FROM showing
>WHERE showing.startTime >= 123923000
>  AND showing.stationId IN ( SELECT DISTINCT channel.stationId FROM 
> channel )
>LIMIT 8;
> 
> enjoy,
> 
> -jeremy
> 
>  
> On Tue, Apr 21, 2009 at 08:58:56PM -0700, sorka wrote:
> > 
> > Writing the query as you said you would returns results in nearly instantly,
> > within a few ms. 
> > 
> > The problem is when you add a secondary ordering field. The intended indices
> > are being used.
> > 
> > The problem, as I've explained several times already is that there is no way
> > to create a multicolumn index across tables. Sqlite is using the stationId
> > index on the channel table to join showings via the stationId. The query
> > plan shows it clearly. When ordering by startTime or even startTime and
> > stationId, the results are returned in a few ms. But when adding channel
> > number(i.e startTime and then channel number), it jumps to two minutes. This
> > is because sqlite is bringing in nearly 100K records from the showing table,
> > sorted by time, because of the startTime index, before it starts sorting by
> > channel number. It would sure be nice to be able to get sqlite to sort on
> > the channel number for each time group. i.e. once all the 10:00 PM showings
> > have been returned and the first 10:30 PM showing is about to be returned,
> > sort all of the 10:00 PM showings by channel number. Sqlite is NOT doing
> > this but should. 
> > 
> > At this point, I'm going to assume that this is just a limitation in the
> > sqlite engine and I'll have to proceed in other ways to solve this issue.
> > 
> > 
> > 
> > Ian Walters wrote:
> > > 
> > >> SELECT showing.startTime FROM showing JOIN channel ON  
> > >> showing.startTime >=
> > >> 123923 AND showing.stationId = channel.stationId ORDER BY
> > >> showing.startTime LIMIT 8;
> > > 
> > > I don't know if it would be faster... but I would have written that.
> > > 
> > > SELECT showing.startTime FROM showing JOIN channel USING (stationId)  
> > > WHERE startTime < 123923 ORDER BY startTime LIMIT 8.
> > > 
> > > Also I know in the latest version of SQLite its possible to 'hint'  
> > > what indexes should be used, which might be helpful.  There is also  
> > > something on the contrib page that lets you check a query to see what  
> > > indexes it does use.
> > > 
> > > Sorry if the above lacks detail, its kinda a busy day.
> > > 
> > > --
> > > Ian
> > > ___
> > > sqlite-users mailing list
> > > sqlite-users@sqlite.org
> > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> > > 
> > > 
> > 
> > -- 
> > View this message in context: 
> > http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23169251.html
> > Sent from the SQLite mailing list archive at Nabble.com.
> > 
> > ___
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > 

Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-21 Thread Jeremy Hinegardner
How about trying:

  SELECT s.startTime
FROM ( SELECT  showing.startTime AS startTime
  ,showing.stationId AS stationId
  FROM showing
 WHERE showing.startTime >= 123923
  ORDER BY showing.startTime
   LIMIT 8 ) AS s
JOIN channel AS c
  ON s.stationId = c.stationId
ORDER BY s.startTime, c.ChannelMajorNumber

I believe this should limit your table of 100,000 rows down to just the 8 you
want on the inner query, and then join that against the 100 rows in the channel
table.

Although, to tell you the truth, I don't see the purpose in your original query:

SELECT showing.startTime 
  FROM showing 
  JOIN channel 
ON showing.startTime >= 123923 
   AND showing.stationId = channel.stationId
  ORDER BY showing.startTime, channel.ChannelMajorNumber 
 LIMIT 8;

You are not doing anything with the channel table data other than joining it
agains the showing tables.  To me that means you are joining against the channel
table in order to filter out rows in the showing table that have stationId's
that do not exist in the channel table.  

Is that correct?  If that is correct, then the query I gave will not do what you
want.  

Can the goal you are attempting be stated as?

Find next or earliest showings from the showings table for channels in the
channel table?

If this is the case, maybe this query will work:

  SELECT showing.startTime
FROM showing
   WHERE showing.startTime >= 123923000
 AND showing.stationId IN ( SELECT DISTINCT channel.stationId FROM 
channel )
   LIMIT 8;

enjoy,

-jeremy

 
On Tue, Apr 21, 2009 at 08:58:56PM -0700, sorka wrote:
> 
> Writing the query as you said you would returns results in nearly instantly,
> within a few ms. 
> 
> The problem is when you add a secondary ordering field. The intended indices
> are being used.
> 
> The problem, as I've explained several times already is that there is no way
> to create a multicolumn index across tables. Sqlite is using the stationId
> index on the channel table to join showings via the stationId. The query
> plan shows it clearly. When ordering by startTime or even startTime and
> stationId, the results are returned in a few ms. But when adding channel
> number(i.e startTime and then channel number), it jumps to two minutes. This
> is because sqlite is bringing in nearly 100K records from the showing table,
> sorted by time, because of the startTime index, before it starts sorting by
> channel number. It would sure be nice to be able to get sqlite to sort on
> the channel number for each time group. i.e. once all the 10:00 PM showings
> have been returned and the first 10:30 PM showing is about to be returned,
> sort all of the 10:00 PM showings by channel number. Sqlite is NOT doing
> this but should. 
> 
> At this point, I'm going to assume that this is just a limitation in the
> sqlite engine and I'll have to proceed in other ways to solve this issue.
> 
> 
> 
> Ian Walters wrote:
> > 
> >> SELECT showing.startTime FROM showing JOIN channel ON  
> >> showing.startTime >=
> >> 123923 AND showing.stationId = channel.stationId ORDER BY
> >> showing.startTime LIMIT 8;
> > 
> > I don't know if it would be faster... but I would have written that.
> > 
> > SELECT showing.startTime FROM showing JOIN channel USING (stationId)  
> > WHERE startTime < 123923 ORDER BY startTime LIMIT 8.
> > 
> > Also I know in the latest version of SQLite its possible to 'hint'  
> > what indexes should be used, which might be helpful.  There is also  
> > something on the contrib page that lets you check a query to see what  
> > indexes it does use.
> > 
> > Sorry if the above lacks detail, its kinda a busy day.
> > 
> > --
> > Ian
> > ___
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> > 
> > 
> 
> -- 
> View this message in context: 
> http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23169251.html
> Sent from the SQLite mailing list archive at Nabble.com.
> 
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

-- 

 Jeremy Hinegardner  jer...@hinegardner.org 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-21 Thread sorka

Writing the query as you said you would returns results in nearly instantly,
within a few ms. 

The problem is when you add a secondary ordering field. The intended indices
are being used.

The problem, as I've explained several times already is that there is no way
to create a multicolumn index across tables. Sqlite is using the stationId
index on the channel table to join showings via the stationId. The query
plan shows it clearly. When ordering by startTime or even startTime and
stationId, the results are returned in a few ms. But when adding channel
number(i.e startTime and then channel number), it jumps to two minutes. This
is because sqlite is bringing in nearly 100K records from the showing table,
sorted by time, because of the startTime index, before it starts sorting by
channel number. It would sure be nice to be able to get sqlite to sort on
the channel number for each time group. i.e. once all the 10:00 PM showings
have been returned and the first 10:30 PM showing is about to be returned,
sort all of the 10:00 PM showings by channel number. Sqlite is NOT doing
this but should. 

At this point, I'm going to assume that this is just a limitation in the
sqlite engine and I'll have to proceed in other ways to solve this issue.



Ian Walters wrote:
> 
>> SELECT showing.startTime FROM showing JOIN channel ON  
>> showing.startTime >=
>> 123923 AND showing.stationId = channel.stationId ORDER BY
>> showing.startTime LIMIT 8;
> 
> I don't know if it would be faster... but I would have written that.
> 
> SELECT showing.startTime FROM showing JOIN channel USING (stationId)  
> WHERE startTime < 123923 ORDER BY startTime LIMIT 8.
> 
> Also I know in the latest version of SQLite its possible to 'hint'  
> what indexes should be used, which might be helpful.  There is also  
> something on the contrib page that lets you check a query to see what  
> indexes it does use.
> 
> Sorry if the above lacks detail, its kinda a busy day.
> 
> --
> Ian
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23169251.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-21 Thread Ian Walters
> SELECT showing.startTime FROM showing JOIN channel ON  
> showing.startTime >=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime LIMIT 8;

I don't know if it would be faster... but I would have written that.

SELECT showing.startTime FROM showing JOIN channel USING (stationId)  
WHERE startTime < 123923 ORDER BY startTime LIMIT 8.

Also I know in the latest version of SQLite its possible to 'hint'  
what indexes should be used, which might be helpful.  There is also  
something on the contrib page that lets you check a query to see what  
indexes it does use.

Sorry if the above lacks detail, its kinda a busy day.

--
Ian
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-21 Thread sorka

That's completely untrue. The only reason for the difference in the name is
because I changed it recently. MajoreChannelNumber is ChannelNumber, the
name just changed.

I've included the full schema and all the indices and indicated about how
many records are in each table.

The schema example is super simple and the question is one that anyone
reasonably familiar with sqlite should be able to answer.



sorka wrote:
> 
> This should be simple but apparently it isn't.
> 
> I have two tables:
> "CREATE TABLE showing ( "
> "showingIdINTEGER PRIMARY KEY, "
> "stationId  INTEGER, "
> "startTime  INTEGER, "
> ") "
> CREATE INDEX showing_startTime on showing(startTime);
> 
> 
> AND
> "CREATE TABLE channel ( "
> "  channelIdINTEGER PRIMARY KEY, "
> "  ChannelNumber   INTEGER, "
> "  stationIdINTEGER, "
> "  ) "
> CREATE INDEX channel_ChannelNumber on channel(ChannelNumber);
> CREATE INDEX channel_stationId on channel(stationId);
> 
> When I do this select:
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime LIMIT 8;
> 
> I get back the correct 8 results in about 3 milliseconds. 
> 
> If I throw a secondary order term in there say ChannelNumber:
> 
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime, channel.ChannelMajorNumber LIMIT 8;
> 
> It now takes over 120 seconds!!!
> 
> I've tried various multi-column indices including one on
> channel(stationId, ChannelNumber).
> 
> No difference.
> 
> As far as I can tell, when ordering on columns that cross tables, sqlite
> will bring in all the records that match the equality or inequality search
> term on the first column before it does a secondary sort.
> 
> I have over 100,000 records in the showing table and about 100 records in
> the channel table.
> 
> Sqlite should be smart enough to do the secondary sort on ChannelNumber as
> soon as it sees that the records coming back have a later startTime than
> the previous one. i.e. Say the first 5 records have the same startTime and
> different channel numbers. Then the next 5 records have a later start time
> than the first. Sqlite should be smart enough to see this as the results
> come back and do a secondary sort on ChannelNumber on the first 5 results
> and then rinse and repeat.
> 
> What appears to be happening is that even though startTime is indexed,
> sqlite is bringing all 100K records into memory sorted by time before it
> starts to sort by channel number. 
> 
> Is there a way to get sqlite to do the right thing? If there was only a
> way to have a multi-column index that included columns from different
> tables. Oh wait, there is, it's called an intermediate table. However the
> cost of doing this is pretty high for reasons I can't go into here. 
> 
> Any ideas? Maybe I'm just doing something wrong and this should be simple.
> 
> Thanks.
> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23168893.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-21 Thread Harold Wood
sorry.  Ive had good results with subqueries in resolving similar performance 
issues.

--- On Tue, 4/21/09, sorka <sorka95...@gmail.com> wrote:

From: sorka <sorka95...@gmail.com>
Subject: Re: [sqlite] Multi column ORDER BY across table peformance problem....
To: sqlite-users@sqlite.org
Date: Tuesday, April 21, 2009, 10:22 PM

I can't think of any reason why this would result in a faster query. But I
went ahead and tried it on the off chance that maybe specifying the
channel.ChannelMajorNumber inside an inner select might somehow trick sqlite
into doing the right thing. Unfortunately, the result was the same. A few ms
without the channel number and over 2 minutes with it. Again, sqlite is
bringing in all results, not just the limit before it sorts on the channel
number. 

I tried your query but had to alias the inner columns so they could be
selected in the outer query.

SELECT startTime FROM (SELECT showing.startTime as startTime,
channel.ChannelMajorNumber as ChannelMajorNumber FROM showing JOIN channel
ON showing.startTime >= 1240362000 AND showing.stationId = channel.stationId
) ORDER BY startTime, ChannelMajorNumber LIMIT 8;



sorka wrote:
> 
> This should be simple but apparently it isn't.
> 
> I have two tables:
> "CREATE TABLE showing ( "
> "showingIdINTEGER PRIMARY KEY, "
> "stationId  INTEGER, "
> "startTime  INTEGER, "
> ") "
> CREATE INDEX showing_startTime on showing(startTime);
> 
> 
> AND
> "CREATE TABLE channel ( "
> "  channelIdINTEGER PRIMARY KEY, "
> "  ChannelNumber   INTEGER, "
> "  stationIdINTEGER, "
> "  ) "
> CREATE INDEX channel_ChannelNumber on channel(ChannelNumber);
> CREATE INDEX channel_stationId on channel(stationId);
> 
> When I do this select:
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime
>=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime LIMIT 8;
> 
> I get back the correct 8 results in about 3 milliseconds. 
> 
> If I throw a secondary order term in there say ChannelNumber:
> 
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime
>=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime, channel.ChannelMajorNumber LIMIT 8;
> 
> It now takes over 120 seconds!!!
> 
> I've tried various multi-column indices including one on
> channel(stationId, ChannelNumber).
> 
> No difference.
> 
> As far as I can tell, when ordering on columns that cross tables, sqlite
> will bring in all the records that match the equality or inequality search
> term on the first column before it does a secondary sort.
> 
> I have over 100,000 records in the showing table and about 100 records in
> the channel table.
> 
> Sqlite should be smart enough to do the secondary sort on ChannelNumber as
> soon as it sees that the records coming back have a later startTime than
> the previous one. i.e. Say the first 5 records have the same startTime and
> different channel numbers. Then the next 5 records have a later start time
> than the first. Sqlite should be smart enough to see this as the results
> come back and do a secondary sort on ChannelNumber on the first 5 results
> and then rinse and repeat.
> 
> What appears to be happening is that even though startTime is indexed,
> sqlite is bringing all 100K records into memory sorted by time before it
> starts to sort by channel number. 
> 
> Is there a way to get sqlite to do the right thing? If there was only a
> way to have a multi-column index that included columns from different
> tables. Oh wait, there is, it's called an intermediate table. However
the
> cost of doing this is pretty high for reasons I can't go into here. 
> 
> Any ideas? Maybe I'm just doing something wrong and this should be
simple.
> 
> Thanks.
> 
> 
> 
> 

-- 
View this message in context:
http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23168567.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-21 Thread John Machin
On 22/04/2009 10:48 AM, sorka wrote:
> Wow. Anybody? I figured this would be a simple question for the gurus on this
> board. Seriously, nobody knows a better way to do this?

Seriously, consider alternative possibilities

> 
> 
> sorka wrote:
>> This should be simple but apparently it isn't.
>>
>> I have two tables:
>> "CREATE TABLE showing ( "
>> "showingIdINTEGER PRIMARY KEY, "
>> "stationId  INTEGER, "
>> "startTime  INTEGER, "
>> ") "
>> CREATE INDEX showing_startTime on showing(startTime);
>>
>>
>> AND
>> "CREATE TABLE channel ( "
>> "  channelIdINTEGER PRIMARY KEY, "
>> "  ChannelNumber   INTEGER, "
>> "  stationIdINTEGER, "
>> "  ) "
>> CREATE INDEX channel_ChannelNumber on channel(ChannelNumber);
>> CREATE INDEX channel_stationId on channel(stationId);
>>
>> When I do this select:
>> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >=
>> 123923 AND showing.stationId = channel.stationId ORDER BY
>> showing.startTime LIMIT 8;

SQLite is smart enough to recognise that the showing_startTime index 
satisfies the ORDER BY clause ... so it's just traversing that index, 
filtering out cases of bogus stationID [I presume that's the reason for 
having the JOIN at all], and stopping when it's found 8 results.

What is the result of:
SELECT COUNT(*) FROM showing JOIN channel ON showing.startTime >=
123923 AND showing.stationId = channel.stationId;
?
Is that 8 or 10 or somewhere in between?


>>
>> I get back the correct 8 results in about 3 milliseconds. 
>>
>> If I throw a secondary order term in there say ChannelNumber:
>>
>> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >=
>> 123923 AND showing.stationId = channel.stationId ORDER BY
>> showing.startTime, channel.ChannelMajorNumber LIMIT 8;

What is channel.ChannelMajorNumber? It's not in the schema that you 
posted. People who might be interested in helping you might want to try 
running your queries prefaced with EXPLAIN and EXPLAIN QUERY PLAN (which 
doesn't need any data, just the CREATE statements). Perhaps they then 
give up.

Your tables contain only presumingly meaningless integers. [You have 
given us the full schema for each table, haven't you? There aren't any 
50 Kb BLOBs in there, are there?] So you wouldn't have a problem with 
making a zipped db with the two tables (no indexes) available on the web 
somewhere for people who wanted to experiment?

>>
>> It now takes over 120 seconds!!!
>>
>> I've tried various multi-column indices including one on
>> channel(stationId, ChannelNumber).

And was that index used by the query? Did you have ChannelNumber in the 
ORDER BY clause at the time, or something else?

>>
>> No difference.
>>
>> As far as I can tell, when ordering on columns that cross tables, sqlite
>> will bring in all the records that match the equality or inequality search
>> term on the first column before it does a secondary sort.


What any SQL query engine has to do is produce results that are the same 
as those obtained by this method:
1. Produce temporary result set ignoring the ORDER BY and the 
LIMIT/OFFSET. This will include all columns from the SELECT plus any 
extra that appear only in the ORDER BY.
2. Sort the results according to the ORDER BY.
3. Read the sort output, discarding columns that appear only in the 
ORDER BY, and filtering according to the LIMIT/OFFSET.

Is that what you mean?

>>
>> I have over 100,000 records in the showing table and about 100 records in
>> the channel table.
>>
>> Sqlite should be smart enough to do the secondary sort on ChannelNumber as
>> soon as it sees that the records coming back have a later startTime than
>> the previous one. i.e. Say the first 5 records have the same startTime and
>> different channel numbers. Then the next 5 records have a later start time
>> than the first. Sqlite should be smart enough to see this as the results
>> come back and do a secondary sort on ChannelNumber on the first 5 results
>> and then rinse and repeat.

What is "the secondary sort on ChannelNumber" that you imply is already 
being done, just not soon enough?

>>
>> What appears to be happening is that even though startTime is indexed,
>> sqlite is bringing all 100K records into memory sorted by time before it
>> starts to sort by channel number. 

Apart from that being one possible explanation of what does seem a long 
time, what grounds do you have for the above statement?


>>
>> Is there a way to get sqlite to do the right thing? If there was only a
>> way to have a multi-column index that included columns from different
>> tables. Oh wait, there is, it's called an intermediate table. However the
>> cost of doing this is pretty high for reasons I can't go into here. 
>>
>> Any ideas? Maybe I'm just 

Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-21 Thread sorka

I can't think of any reason why this would result in a faster query. But I
went ahead and tried it on the off chance that maybe specifying the
channel.ChannelMajorNumber inside an inner select might somehow trick sqlite
into doing the right thing. Unfortunately, the result was the same. A few ms
without the channel number and over 2 minutes with it. Again, sqlite is
bringing in all results, not just the limit before it sorts on the channel
number. 

I tried your query but had to alias the inner columns so they could be
selected in the outer query.

SELECT startTime FROM (SELECT showing.startTime as startTime,
channel.ChannelMajorNumber as ChannelMajorNumber FROM showing JOIN channel
ON showing.startTime >= 1240362000 AND showing.stationId = channel.stationId
) ORDER BY startTime, ChannelMajorNumber LIMIT 8;



sorka wrote:
> 
> This should be simple but apparently it isn't.
> 
> I have two tables:
> "CREATE TABLE showing ( "
> "showingIdINTEGER PRIMARY KEY, "
> "stationId  INTEGER, "
> "startTime  INTEGER, "
> ") "
> CREATE INDEX showing_startTime on showing(startTime);
> 
> 
> AND
> "CREATE TABLE channel ( "
> "  channelIdINTEGER PRIMARY KEY, "
> "  ChannelNumber   INTEGER, "
> "  stationIdINTEGER, "
> "  ) "
> CREATE INDEX channel_ChannelNumber on channel(ChannelNumber);
> CREATE INDEX channel_stationId on channel(stationId);
> 
> When I do this select:
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime LIMIT 8;
> 
> I get back the correct 8 results in about 3 milliseconds. 
> 
> If I throw a secondary order term in there say ChannelNumber:
> 
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime, channel.ChannelMajorNumber LIMIT 8;
> 
> It now takes over 120 seconds!!!
> 
> I've tried various multi-column indices including one on
> channel(stationId, ChannelNumber).
> 
> No difference.
> 
> As far as I can tell, when ordering on columns that cross tables, sqlite
> will bring in all the records that match the equality or inequality search
> term on the first column before it does a secondary sort.
> 
> I have over 100,000 records in the showing table and about 100 records in
> the channel table.
> 
> Sqlite should be smart enough to do the secondary sort on ChannelNumber as
> soon as it sees that the records coming back have a later startTime than
> the previous one. i.e. Say the first 5 records have the same startTime and
> different channel numbers. Then the next 5 records have a later start time
> than the first. Sqlite should be smart enough to see this as the results
> come back and do a secondary sort on ChannelNumber on the first 5 results
> and then rinse and repeat.
> 
> What appears to be happening is that even though startTime is indexed,
> sqlite is bringing all 100K records into memory sorted by time before it
> starts to sort by channel number. 
> 
> Is there a way to get sqlite to do the right thing? If there was only a
> way to have a multi-column index that included columns from different
> tables. Oh wait, there is, it's called an intermediate table. However the
> cost of doing this is pretty high for reasons I can't go into here. 
> 
> Any ideas? Maybe I'm just doing something wrong and this should be simple.
> 
> Thanks.
> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23168567.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-21 Thread Harold Wood
try this
 

select showing.startTime, from (
 SELECT showing.startTime,
   channel.ChannelMajorNumber
  FROM showing 
  JOIN channel 
   on showing.stationId = channel.stationId
  Where showing.startTime >=  123923) 
ORDER BY showing.startTime, channel.ChannelMajorNumber 
LIMIT 8;
 
 
 
--- On Tue, 4/21/09, sorka <sorka95...@gmail.com> wrote:

From: sorka <sorka95...@gmail.com>
Subject: Re: [sqlite] Multi column ORDER BY across table peformance problem
To: sqlite-users@sqlite.org
Date: Tuesday, April 21, 2009, 9:46 PM

It's in my original post above.



sorka wrote:
> 
> This should be simple but apparently it isn't.
> 
> I have two tables:
> "CREATE TABLE showing ( "
> "showingIdINTEGER PRIMARY KEY, "
> "stationId  INTEGER, "
> "startTime  INTEGER, "
> ") "
> CREATE INDEX showing_startTime on showing(startTime);
> 
> 
> AND
> "CREATE TABLE channel ( "
> "  channelIdINTEGER PRIMARY KEY, "
> "  ChannelNumber   INTEGER, "
> "  stationIdINTEGER, "
> "  ) "
> CREATE INDEX channel_ChannelNumber on channel(ChannelNumber);
> CREATE INDEX channel_stationId on channel(stationId);
> 
> When I do this select:
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime
>=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime LIMIT 8;
> 
> I get back the correct 8 results in about 3 milliseconds. 
> 
> If I throw a secondary order term in there say ChannelNumber:
> 
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime
>=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime, channel.ChannelMajorNumber LIMIT 8;
> 
> It now takes over 120 seconds!!!
> 
> I've tried various multi-column indices including one on
> channel(stationId, ChannelNumber).
> 
> No difference.
> 
> As far as I can tell, when ordering on columns that cross tables, sqlite
> will bring in all the records that match the equality or inequality search
> term on the first column before it does a secondary sort.
> 
> I have over 100,000 records in the showing table and about 100 records in
> the channel table.
> 
> Sqlite should be smart enough to do the secondary sort on ChannelNumber as
> soon as it sees that the records coming back have a later startTime than
> the previous one. i.e. Say the first 5 records have the same startTime and
> different channel numbers. Then the next 5 records have a later start time
> than the first. Sqlite should be smart enough to see this as the results
> come back and do a secondary sort on ChannelNumber on the first 5 results
> and then rinse and repeat.
> 
> What appears to be happening is that even though startTime is indexed,
> sqlite is bringing all 100K records into memory sorted by time before it
> starts to sort by channel number. 
> 
> Is there a way to get sqlite to do the right thing? If there was only a
> way to have a multi-column index that included columns from different
> tables. Oh wait, there is, it's called an intermediate table. However
the
> cost of doing this is pretty high for reasons I can't go into here. 
> 
> Any ideas? Maybe I'm just doing something wrong and this should be
simple.
> 
> Thanks.
> 
> 
> 
> 

-- 
View this message in context:
http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23168281.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-21 Thread sorka

It's in my original post above.



sorka wrote:
> 
> This should be simple but apparently it isn't.
> 
> I have two tables:
> "CREATE TABLE showing ( "
> "showingIdINTEGER PRIMARY KEY, "
> "stationId  INTEGER, "
> "startTime  INTEGER, "
> ") "
> CREATE INDEX showing_startTime on showing(startTime);
> 
> 
> AND
> "CREATE TABLE channel ( "
> "  channelIdINTEGER PRIMARY KEY, "
> "  ChannelNumber   INTEGER, "
> "  stationIdINTEGER, "
> "  ) "
> CREATE INDEX channel_ChannelNumber on channel(ChannelNumber);
> CREATE INDEX channel_stationId on channel(stationId);
> 
> When I do this select:
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime LIMIT 8;
> 
> I get back the correct 8 results in about 3 milliseconds. 
> 
> If I throw a secondary order term in there say ChannelNumber:
> 
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime, channel.ChannelMajorNumber LIMIT 8;
> 
> It now takes over 120 seconds!!!
> 
> I've tried various multi-column indices including one on
> channel(stationId, ChannelNumber).
> 
> No difference.
> 
> As far as I can tell, when ordering on columns that cross tables, sqlite
> will bring in all the records that match the equality or inequality search
> term on the first column before it does a secondary sort.
> 
> I have over 100,000 records in the showing table and about 100 records in
> the channel table.
> 
> Sqlite should be smart enough to do the secondary sort on ChannelNumber as
> soon as it sees that the records coming back have a later startTime than
> the previous one. i.e. Say the first 5 records have the same startTime and
> different channel numbers. Then the next 5 records have a later start time
> than the first. Sqlite should be smart enough to see this as the results
> come back and do a secondary sort on ChannelNumber on the first 5 results
> and then rinse and repeat.
> 
> What appears to be happening is that even though startTime is indexed,
> sqlite is bringing all 100K records into memory sorted by time before it
> starts to sort by channel number. 
> 
> Is there a way to get sqlite to do the right thing? If there was only a
> way to have a multi-column index that included columns from different
> tables. Oh wait, there is, it's called an intermediate table. However the
> cost of doing this is pretty high for reasons I can't go into here. 
> 
> Any ideas? Maybe I'm just doing something wrong and this should be simple.
> 
> Thanks.
> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23168281.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-21 Thread Harold Wood
send me your original query please.
 
thanks
 
Woody

--- On Tue, 4/21/09, sorka <sorka95...@gmail.com> wrote:

From: sorka <sorka95...@gmail.com>
Subject: Re: [sqlite] Multi column ORDER BY across table peformance problem....
To: sqlite-users@sqlite.org
Date: Tuesday, April 21, 2009, 9:29 PM

No. This can't be broken down into a query within a query. Perhaps maybe if
you spell out an example of what you're thinking?

Thanks.



sorka wrote:
> 
> This should be simple but apparently it isn't.
> 
> I have two tables:
> "CREATE TABLE showing ( "
> "showingIdINTEGER PRIMARY KEY, "
> "stationId  INTEGER, "
> "startTime  INTEGER, "
> ") "
> CREATE INDEX showing_startTime on showing(startTime);
> 
> 
> AND
> "CREATE TABLE channel ( "
> "  channelIdINTEGER PRIMARY KEY, "
> "  ChannelNumber   INTEGER, "
> "  stationIdINTEGER, "
> "  ) "
> CREATE INDEX channel_ChannelNumber on channel(ChannelNumber);
> CREATE INDEX channel_stationId on channel(stationId);
> 
> When I do this select:
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime
>=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime LIMIT 8;
> 
> I get back the correct 8 results in about 3 milliseconds. 
> 
> If I throw a secondary order term in there say ChannelNumber:
> 
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime
>=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime, channel.ChannelMajorNumber LIMIT 8;
> 
> It now takes over 120 seconds!!!
> 
> I've tried various multi-column indices including one on
> channel(stationId, ChannelNumber).
> 
> No difference.
> 
> As far as I can tell, when ordering on columns that cross tables, sqlite
> will bring in all the records that match the equality or inequality search
> term on the first column before it does a secondary sort.
> 
> I have over 100,000 records in the showing table and about 100 records in
> the channel table.
> 
> Sqlite should be smart enough to do the secondary sort on ChannelNumber as
> soon as it sees that the records coming back have a later startTime than
> the previous one. i.e. Say the first 5 records have the same startTime and
> different channel numbers. Then the next 5 records have a later start time
> than the first. Sqlite should be smart enough to see this as the results
> come back and do a secondary sort on ChannelNumber on the first 5 results
> and then rinse and repeat.
> 
> What appears to be happening is that even though startTime is indexed,
> sqlite is bringing all 100K records into memory sorted by time before it
> starts to sort by channel number. 
> 
> Is there a way to get sqlite to do the right thing? If there was only a
> way to have a multi-column index that included columns from different
> tables. Oh wait, there is, it's called an intermediate table. However
the
> cost of doing this is pretty high for reasons I can't go into here. 
> 
> Any ideas? Maybe I'm just doing something wrong and this should be
simple.
> 
> Thanks.
> 
> 
> 
> 

-- 
View this message in context:
http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23168105.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-21 Thread sorka

No. This can't be broken down into a query within a query. Perhaps maybe if
you spell out an example of what you're thinking?

Thanks.



sorka wrote:
> 
> This should be simple but apparently it isn't.
> 
> I have two tables:
> "CREATE TABLE showing ( "
> "showingIdINTEGER PRIMARY KEY, "
> "stationId  INTEGER, "
> "startTime  INTEGER, "
> ") "
> CREATE INDEX showing_startTime on showing(startTime);
> 
> 
> AND
> "CREATE TABLE channel ( "
> "  channelIdINTEGER PRIMARY KEY, "
> "  ChannelNumber   INTEGER, "
> "  stationIdINTEGER, "
> "  ) "
> CREATE INDEX channel_ChannelNumber on channel(ChannelNumber);
> CREATE INDEX channel_stationId on channel(stationId);
> 
> When I do this select:
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime LIMIT 8;
> 
> I get back the correct 8 results in about 3 milliseconds. 
> 
> If I throw a secondary order term in there say ChannelNumber:
> 
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime, channel.ChannelMajorNumber LIMIT 8;
> 
> It now takes over 120 seconds!!!
> 
> I've tried various multi-column indices including one on
> channel(stationId, ChannelNumber).
> 
> No difference.
> 
> As far as I can tell, when ordering on columns that cross tables, sqlite
> will bring in all the records that match the equality or inequality search
> term on the first column before it does a secondary sort.
> 
> I have over 100,000 records in the showing table and about 100 records in
> the channel table.
> 
> Sqlite should be smart enough to do the secondary sort on ChannelNumber as
> soon as it sees that the records coming back have a later startTime than
> the previous one. i.e. Say the first 5 records have the same startTime and
> different channel numbers. Then the next 5 records have a later start time
> than the first. Sqlite should be smart enough to see this as the results
> come back and do a secondary sort on ChannelNumber on the first 5 results
> and then rinse and repeat.
> 
> What appears to be happening is that even though startTime is indexed,
> sqlite is bringing all 100K records into memory sorted by time before it
> starts to sort by channel number. 
> 
> Is there a way to get sqlite to do the right thing? If there was only a
> way to have a multi-column index that included columns from different
> tables. Oh wait, there is, it's called an intermediate table. However the
> cost of doing this is pretty high for reasons I can't go into here. 
> 
> Any ideas? Maybe I'm just doing something wrong and this should be simple.
> 
> Thanks.
> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23168105.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-21 Thread Harold Wood
did you try the subquery method i suggested?
 
Woody

--- On Tue, 4/21/09, sorka <sorka95...@gmail.com> wrote:

From: sorka <sorka95...@gmail.com>
Subject: Re: [sqlite] Multi column ORDER BY across table peformance problem....
To: sqlite-users@sqlite.org
Date: Tuesday, April 21, 2009, 8:48 PM

Wow. Anybody? I figured this would be a simple question for the gurus on this
board. Seriously, nobody knows a better way to do this?




sorka wrote:
> 
> This should be simple but apparently it isn't.
> 
> I have two tables:
> "CREATE TABLE showing ( "
> "showingIdINTEGER PRIMARY KEY, "
> "stationId  INTEGER, "
> "startTime  INTEGER, "
> ") "
> CREATE INDEX showing_startTime on showing(startTime);
> 
> 
> AND
> "CREATE TABLE channel ( "
> "  channelIdINTEGER PRIMARY KEY, "
> "  ChannelNumber   INTEGER, "
> "  stationIdINTEGER, "
> "  ) "
> CREATE INDEX channel_ChannelNumber on channel(ChannelNumber);
> CREATE INDEX channel_stationId on channel(stationId);
> 
> When I do this select:
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime
>=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime LIMIT 8;
> 
> I get back the correct 8 results in about 3 milliseconds. 
> 
> If I throw a secondary order term in there say ChannelNumber:
> 
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime
>=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime, channel.ChannelMajorNumber LIMIT 8;
> 
> It now takes over 120 seconds!!!
> 
> I've tried various multi-column indices including one on
> channel(stationId, ChannelNumber).
> 
> No difference.
> 
> As far as I can tell, when ordering on columns that cross tables, sqlite
> will bring in all the records that match the equality or inequality search
> term on the first column before it does a secondary sort.
> 
> I have over 100,000 records in the showing table and about 100 records in
> the channel table.
> 
> Sqlite should be smart enough to do the secondary sort on ChannelNumber as
> soon as it sees that the records coming back have a later startTime than
> the previous one. i.e. Say the first 5 records have the same startTime and
> different channel numbers. Then the next 5 records have a later start time
> than the first. Sqlite should be smart enough to see this as the results
> come back and do a secondary sort on ChannelNumber on the first 5 results
> and then rinse and repeat.
> 
> What appears to be happening is that even though startTime is indexed,
> sqlite is bringing all 100K records into memory sorted by time before it
> starts to sort by channel number. 
> 
> Is there a way to get sqlite to do the right thing? If there was only a
> way to have a multi-column index that included columns from different
> tables. Oh wait, there is, it's called an intermediate table. However
the
> cost of doing this is pretty high for reasons I can't go into here. 
> 
> Any ideas? Maybe I'm just doing something wrong and this should be
simple.
> 
> Thanks.
> 
> 
> 
> 

-- 
View this message in context:
http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23166621.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-21 Thread sorka

Wow. Anybody? I figured this would be a simple question for the gurus on this
board. Seriously, nobody knows a better way to do this?




sorka wrote:
> 
> This should be simple but apparently it isn't.
> 
> I have two tables:
> "CREATE TABLE showing ( "
> "showingIdINTEGER PRIMARY KEY, "
> "stationId  INTEGER, "
> "startTime  INTEGER, "
> ") "
> CREATE INDEX showing_startTime on showing(startTime);
> 
> 
> AND
> "CREATE TABLE channel ( "
> "  channelIdINTEGER PRIMARY KEY, "
> "  ChannelNumber   INTEGER, "
> "  stationIdINTEGER, "
> "  ) "
> CREATE INDEX channel_ChannelNumber on channel(ChannelNumber);
> CREATE INDEX channel_stationId on channel(stationId);
> 
> When I do this select:
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime LIMIT 8;
> 
> I get back the correct 8 results in about 3 milliseconds. 
> 
> If I throw a secondary order term in there say ChannelNumber:
> 
> SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >=
> 123923 AND showing.stationId = channel.stationId ORDER BY
> showing.startTime, channel.ChannelMajorNumber LIMIT 8;
> 
> It now takes over 120 seconds!!!
> 
> I've tried various multi-column indices including one on
> channel(stationId, ChannelNumber).
> 
> No difference.
> 
> As far as I can tell, when ordering on columns that cross tables, sqlite
> will bring in all the records that match the equality or inequality search
> term on the first column before it does a secondary sort.
> 
> I have over 100,000 records in the showing table and about 100 records in
> the channel table.
> 
> Sqlite should be smart enough to do the secondary sort on ChannelNumber as
> soon as it sees that the records coming back have a later startTime than
> the previous one. i.e. Say the first 5 records have the same startTime and
> different channel numbers. Then the next 5 records have a later start time
> than the first. Sqlite should be smart enough to see this as the results
> come back and do a secondary sort on ChannelNumber on the first 5 results
> and then rinse and repeat.
> 
> What appears to be happening is that even though startTime is indexed,
> sqlite is bringing all 100K records into memory sorted by time before it
> starts to sort by channel number. 
> 
> Is there a way to get sqlite to do the right thing? If there was only a
> way to have a multi-column index that included columns from different
> tables. Oh wait, there is, it's called an intermediate table. However the
> cost of doing this is pretty high for reasons I can't go into here. 
> 
> Any ideas? Maybe I'm just doing something wrong and this should be simple.
> 
> Thanks.
> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23166621.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi column ORDER BY across table peformance problem....

2009-04-17 Thread Harold Wood
just curious; have you tried doing the select in a subquery, and then the order 
by in the outer query?

 
woody
--- On Fri, 4/17/09, sorka <sorka95...@gmail.com> wrote:

From: sorka <sorka95...@gmail.com>
Subject: [sqlite] Multi column ORDER BY across table peformance problem....
To: sqlite-users@sqlite.org
Date: Friday, April 17, 2009, 11:12 PM

This should be simple but apparently it isn't.

I have two tables:
"CREATE TABLE showing ( "
"showingIdINTEGER PRIMARY KEY, "
"stationId  INTEGER, "
"startTime  INTEGER, "
") "
CREATE INDEX showing_startTime on showing(startTime);


AND
"CREATE TABLE channel ( "
"  channelIdINTEGER PRIMARY KEY, "
"  ChannelNumber   INTEGER, "
"  stationIdINTEGER, "
"  ) "
CREATE INDEX channel_ChannelNumber on channel(ChannelNumber);
CREATE INDEX channel_stationId on channel(stationId);

When I do this select:
SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >=
123923 AND showing.stationId = channel.stationId ORDER BY
showing.startTime LIMIT 8;

I get back the correct 8 results in about 3 milliseconds. 

If I throw a secondary order term in there say ChannelNumber:

SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >=
123923 AND showing.stationId = channel.stationId ORDER BY
showing.startTime, channel.ChannelMajorNumber LIMIT 8;

It now takes over 120 seconds!!!

I've tried various multi-column indices including one on channel(stationId,
ChannelNumber).

No difference.

As far as I can tell, when ordering on columns that cross tables, sqlite
will bring in all the records that match the equality or inequality search
term on the first column before it does a secondary sort.

I have over 100,000 records in the showing table and about 100 records in
the channel table.

Sqlite should be smart enough to do the secondary sort on ChannelNumber as
soon as it sees that the records coming back have a later startTime than the
previous one. i.e. Say the first 5 records have the same startTime and
different channel numbers. Then the next 5 records have a later start time
than the first. Sqlite should be smart enough to see this as the results
come back and do a secondary sort on ChannelNumber on the first 5 results
and then rinse and repeat.

What appears to be happening is that even though startTime is indexed,
sqlite is bringing all 100K records into memory sorted by time before it
starts to sort by channel number. 

Is there a way to get sqlite to do the right thing? If there was only a way
to have a multi-column index that included columns from different tables. Oh
wait, there is, it's called an intermediate table. However the cost of
doing
this is pretty high for reasons I can't go into here. 

Any ideas? Maybe I'm just doing something wrong and this should be simple.

Thanks.



-- 
View this message in context:
http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23109024.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Multi column ORDER BY across table peformance problem....

2009-04-17 Thread sorka

This should be simple but apparently it isn't.

I have two tables:
"CREATE TABLE showing ( "
"showingIdINTEGER PRIMARY KEY, "
"stationId  INTEGER, "
"startTime  INTEGER, "
") "
CREATE INDEX showing_startTime on showing(startTime);


AND
"CREATE TABLE channel ( "
"  channelIdINTEGER PRIMARY KEY, "
"  ChannelNumber   INTEGER, "
"  stationIdINTEGER, "
"  ) "
CREATE INDEX channel_ChannelNumber on channel(ChannelNumber);
CREATE INDEX channel_stationId on channel(stationId);

When I do this select:
SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >=
123923 AND showing.stationId = channel.stationId ORDER BY
showing.startTime LIMIT 8;

I get back the correct 8 results in about 3 milliseconds. 

If I throw a secondary order term in there say ChannelNumber:

SELECT showing.startTime FROM showing JOIN channel ON showing.startTime >=
123923 AND showing.stationId = channel.stationId ORDER BY
showing.startTime, channel.ChannelMajorNumber LIMIT 8;

It now takes over 120 seconds!!!

I've tried various multi-column indices including one on channel(stationId,
ChannelNumber).

No difference.

As far as I can tell, when ordering on columns that cross tables, sqlite
will bring in all the records that match the equality or inequality search
term on the first column before it does a secondary sort.

I have over 100,000 records in the showing table and about 100 records in
the channel table.

Sqlite should be smart enough to do the secondary sort on ChannelNumber as
soon as it sees that the records coming back have a later startTime than the
previous one. i.e. Say the first 5 records have the same startTime and
different channel numbers. Then the next 5 records have a later start time
than the first. Sqlite should be smart enough to see this as the results
come back and do a secondary sort on ChannelNumber on the first 5 results
and then rinse and repeat.

What appears to be happening is that even though startTime is indexed,
sqlite is bringing all 100K records into memory sorted by time before it
starts to sort by channel number. 

Is there a way to get sqlite to do the right thing? If there was only a way
to have a multi-column index that included columns from different tables. Oh
wait, there is, it's called an intermediate table. However the cost of doing
this is pretty high for reasons I can't go into here. 

Any ideas? Maybe I'm just doing something wrong and this should be simple.

Thanks.



-- 
View this message in context: 
http://www.nabble.com/Multi-column-ORDER-BY-across-table-peformance-problem-tp23109024p23109024.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users