subject:"\[sqlite\] Performance\/bug in multikey 'group by' in 3.6.2"

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

2008-09-21 Thread Igor Tandetnik

"D. Richard Hipp" <[EMAIL PROTECTED]> wrote
in message news:[EMAIL PROTECTED]
> On Sep 21, 2008, at 8:51 AM, Russ Leighton wrote:
>
>> I am interested in ... a way
>> to constraint/control index selection on queries.
>>
>
> What other SQL database engines have this capability and what syntax
> do they use?

MS SQL Server:

http://msdn.microsoft.com/en-us/library/ms187713.aspx

Igor Tandetnik



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

2008-09-21 Thread Russ Leighton


Personally, I like the Mysql syntax best of the below links. Seems  
natural and unobtrusive (unlike 'hints').

On Sep 21, 2008, at 12:12 PM, Alex Scotti wrote:

> not surprisingly, the db2 approach is the only one that seems to
> follow the "ideal" of the relational dbms.  looks like you get to
> provide to the engine information about your data, rather than
> describing to the engine what steps it's supposed to take.  hints
> that directly influence query plans seem like a huge step backwards.
> if you want tight control over what the engine does you use a pure
> rules based (rather than cost based) optimizer.  but nobody
> (including sqlite) does that anymore for good reasons.
>
>
> On Sep 21, 2008, at 12:59 PM, Stephen Woodbridge wrote:
>
>> D. Richard Hipp wrote:
>>> On Sep 21, 2008, at 8:51 AM, Russ Leighton wrote:
>>>
 I am interested in ... a way
 to constraint/control index selection on queries.

>>>
>>> What other SQL database engines have this capability and what syntax
>>> do they use?
>>
>> Richard,
>>
>> Hope this are useful:
>>
>> Oracle:
>> http://www.adp-gmbh.ch/ora/sql/hints/index.html
>> http://download.oracle.com/docs/cd/E12096_01/books/admintool/
>> admintool_PhysicalSetup32.html
>>
>> Postgresql:
>> Does not support it unless it is a very recent addition, but its
>> planner
>> occasionally/frequently? makes a poor choice and adding support for
>> hints ala Oracle and/or discussions like "Why didn't Postgresql use  
>> my
>> index?" regularly occur on the user list.
>>
>> DB2:
>> http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/
>> com.ibm.db2.udb.doc/admin/t0005308.htm
>>
>> MySQL:
>> http://dev.mysql.com/doc/refman/5.0/en/index-hints.html
>>
>> -Steve
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

2008-09-21 Thread Alex Scotti

not surprisingly, the db2 approach is the only one that seems to  
follow the "ideal" of the relational dbms.  looks like you get to  
provide to the engine information about your data, rather than  
describing to the engine what steps it's supposed to take.  hints  
that directly influence query plans seem like a huge step backwards.   
if you want tight control over what the engine does you use a pure  
rules based (rather than cost based) optimizer.  but nobody  
(including sqlite) does that anymore for good reasons.

On Sep 21, 2008, at 12:59 PM, Stephen Woodbridge wrote:

> D. Richard Hipp wrote:
>> On Sep 21, 2008, at 8:51 AM, Russ Leighton wrote:
>>
>>> I am interested in ... a way
>>> to constraint/control index selection on queries.
>>>
>>
>> What other SQL database engines have this capability and what syntax
>> do they use?
>
> Richard,
>
> Hope this are useful:
>
> Oracle:
> http://www.adp-gmbh.ch/ora/sql/hints/index.html
> http://download.oracle.com/docs/cd/E12096_01/books/admintool/ 
> admintool_PhysicalSetup32.html
>
> Postgresql:
> Does not support it unless it is a very recent addition, but its  
> planner
> occasionally/frequently? makes a poor choice and adding support for
> hints ala Oracle and/or discussions like "Why didn't Postgresql use my
> index?" regularly occur on the user list.
>
> DB2:
> http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/ 
> com.ibm.db2.udb.doc/admin/t0005308.htm
>
> MySQL:
> http://dev.mysql.com/doc/refman/5.0/en/index-hints.html
>
> -Steve
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

2008-09-21 Thread Russell Leighton

A reason I think such functionality would be ideal for sqlite is that  
it avoids the need for a fancy query plan optimizer. The user would  
have a way to direct the query plan if the simple and obvious plan is  
suboptimal.



On Sep 21, 2008, at 11:36 AM, "D. Richard Hipp" <[EMAIL PROTECTED]> wrote:

>
> On Sep 21, 2008, at 8:51 AM, Russ Leighton wrote:
>
>> I am interested in ... a way
>> to constraint/control index selection on queries.
>>
>
> What other SQL database engines have this capability and what syntax
> do they use?
>
>
> D. Richard Hipp
> [EMAIL PROTECTED]
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

2008-09-21 Thread Stephen Woodbridge

D. Richard Hipp wrote:
> On Sep 21, 2008, at 8:51 AM, Russ Leighton wrote:
> 
>> I am interested in ... a way
>> to constraint/control index selection on queries.
>>
> 
> What other SQL database engines have this capability and what syntax  
> do they use?

Richard,

Hope this are useful:

Oracle:
http://www.adp-gmbh.ch/ora/sql/hints/index.html
http://download.oracle.com/docs/cd/E12096_01/books/admintool/admintool_PhysicalSetup32.html

Postgresql:
Does not support it unless it is a very recent addition, but its planner 
occasionally/frequently? makes a poor choice and adding support for 
hints ala Oracle and/or discussions like "Why didn't Postgresql use my 
index?" regularly occur on the user list.

DB2:
http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/t0005308.htm

MySQL:
http://dev.mysql.com/doc/refman/5.0/en/index-hints.html

-Steve
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

2008-09-21 Thread Russell Leighton

Oracle has 'hints' which live in the comments emdedded in the select.  
Google 'oracle hint use index'. The 3rd hit down my result list has a  
nice overview.(I'd send the link but this stupid iPhone has no cut-n- 
paste). I think that hints are really ugly. Not sure about the other  
big dmbs.

On Sep 21, 2008, at 11:36 AM, "D. Richard Hipp" <[EMAIL PROTECTED]> wrote:

>
> On Sep 21, 2008, at 8:51 AM, Russ Leighton wrote:
>
>> I am interested in ... a way
>> to constraint/control index selection on queries.
>>
>
> What other SQL database engines have this capability and what syntax
> do they use?
>
>
> D. Richard Hipp
> [EMAIL PROTECTED]
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

2008-09-21 Thread D. Richard Hipp

On Sep 21, 2008, at 8:51 AM, Russ Leighton wrote:

> I am interested in ... a way
> to constraint/control index selection on queries.
>

What other SQL database engines have this capability and what syntax  
do they use?

D. Richard Hipp
[EMAIL PROTECTED]

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

2008-09-21 Thread Russ Leighton

On Sep 19, 2008, at 6:30 PM, D. Richard Hipp wrote:

>
> On Sep 19, 2008, at 5:47 PM, Russ Leighton wrote:
>
>>
>> Observation 1:  Group by (in my case) is faster w/out using the index
>> than with using the index by 10X
>>
>> In my app I have a table-
>>
>> create table foo(k1integer ,k2 integer,k3 integer,...);
>> create index foo_idx on foo(k1,k2,k3);
>>
>> when I do-
>>
>> select k1,k2,sum() as s,count(1) as c from foo group by k1,k2;
>>
>> The query is 10X slower with the above index created as when I drop
>> the index and don't use it. Bug or
>> just happens to be the case for some data distribution this is just
>> the way it is? Yes, I know I  have
>> a 3 column index and only using 2 for this query.
>
> I do not see how it is possible for what you say to be true - unless
> you have omitted important details of your query, such as a WHERE
> clause.
>

Okay, this is a bit embarrassing but I can't seem to reproduce the  
performance
difference. I spent hours tracking this issue down previously and  
now , going back to reproduce,
I don't see it. Index'd group by's are always faster. Must have been  
some strange state of the equipment.

Sorry for the false alarm.

That said, the _real_ thing I am interested in as a requested  
enhancement is a way
to constraint/control index selection on queries. This would give me  
an out should
the situation above repeat itself in a more consistent manner.

Thanks for sqlite. It is an amazing tool.

BTW, in my mind lite != small, rather  lite == simple+powerful+fast .  
We are using it over very large
quantities of data on very powerful servers.

-Russ

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

2008-09-20 Thread Russell Leighton

It would be very nice to have a way to explicitly control index use.

I'm going to test my theory this weekend but I think if the index is  
not cached and the data large then the group by is faster without the  
index. If this is the case I have a real issue. I need the index for  
other queries and can't afford to drop it for the aggregations. I just  
want to tell the query planner to not use any index.

If there is no way to do this with the current parser could we have an  
extension to select? Maybe something like:
select with no index 
select with index index1,index2 

Thoughts?

On Sep 20, 2008, at 10:33 AM, "Jay A. Kreibich" <[EMAIL PROTECTED]> wrote:

> On Fri, Sep 19, 2008 at 10:47:33PM -0400, Russ Leighton scratched on  
> the wall:
>
>> What about the null values for the aggregation keys when I put a '+'
>> to disable the index? Is that 'as designed'?
>
>  The "+" operator gets rid of type-affinities, and that can lead to
>  unexpected results.  I'm not sure if one would consider these
>  specific results "as designed" or not, but it is known that "+" is
>  not totally without side effects:
>
>  http://www.sqlite.org/cvstrac/tktview?tn=3279
>
>
>   -j
>
>
> -- 
> Jay A. Kreibich < J A Y  @  K R E I B I.C H >
>
> "Our opponent is an alien starship packed with atomic bombs.  We have
> a protractor."   "I'll go home and see if I can scrounge up a ruler
> and a piece of string."  --from Anathem by Neal Stephenson
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

2008-09-20 Thread Jay A. Kreibich

On Fri, Sep 19, 2008 at 10:47:33PM -0400, Russ Leighton scratched on the wall:

> What about the null values for the aggregation keys when I put a '+'  
> to disable the index? Is that 'as designed'?

  The "+" operator gets rid of type-affinities, and that can lead to
  unexpected results.  I'm not sure if one would consider these
  specific results "as designed" or not, but it is known that "+" is
  not totally without side effects:

  http://www.sqlite.org/cvstrac/tktview?tn=3279


   -j


-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Our opponent is an alien starship packed with atomic bombs.  We have
 a protractor."   "I'll go home and see if I can scrounge up a ruler
 and a piece of string."  --from Anathem by Neal Stephenson
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

2008-09-19 Thread D. Richard Hipp


On Sep 19, 2008, at 5:47 PM, Russ Leighton wrote:

>
> Observation 1:  Group by (in my case) is faster w/out using the index
> than with using the index by 10X
>
> In my app I have a table-
>
>  create table foo(k1integer ,k2 integer,k3 integer,...);
>  create index foo_idx on foo(k1,k2,k3);
>
> when I do-
>
>  select k1,k2,sum() as s,count(1) as c from foo group by k1,k2;
>
> The query is 10X slower with the above index created as when I drop
> the index and don't use it. Bug or
> just happens to be the case for some data distribution this is just
> the way it is? Yes, I know I  have
> a 3 column index and only using 2 for this query.

I do not see how it is possible for what you say to be true - unless  
you have omitted important details of your query, such as a WHERE  
clause.

What is the argument to sum(), btw?  You should get an error, I think.

D. Richard Hipp
[EMAIL PROTECTED]



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Performance/bug in multikey 'group by' in 3.6.2

2008-09-19 Thread Russ Leighton


Observation 1:  Group by (in my case) is faster w/out using the index  
than with using the index by 10X

In my app I have a table-

  create table foo(k1integer ,k2 integer,k3 integer,...);
  create index foo_idx on foo(k1,k2,k3);

when I do-

  select k1,k2,sum() as s,count(1) as c from foo group by k1,k2;

The query is 10X slower with the above index created as when I drop  
the index and don't use it. Bug or
just happens to be the case for some data distribution this is just  
the way it is? Yes, I know I  have
a 3 column index and only using 2 for this query.


Observation 2: I can force the index to not be used with '+' but the  
result has nulls for key values!

If I do -

 select k1,k2,sum() as s,count(1) as c from foo group by +k1,+k2;

then the index is not used, it is fast BUT the output has null values  
for k1,k2! Seems like a bug

Thoughts?

-Russ


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

[sqlite] Performance/bug in multikey 'group by' in 3.6.2

12 matches

Site Navigation

Mail list logo

Footer information