Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2
"D. Richard Hipp" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > On Sep 21, 2008, at 8:51 AM, Russ Leighton wrote: > >> I am interested in ... a way >> to constraint/control index selection on queries. >> > > What other SQL database engines have this capability and what syntax > do they use? MS SQL Server: http://msdn.microsoft.com/en-us/library/ms187713.aspx Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2
Personally, I like the Mysql syntax best of the below links. Seems natural and unobtrusive (unlike 'hints'). On Sep 21, 2008, at 12:12 PM, Alex Scotti wrote: > not surprisingly, the db2 approach is the only one that seems to > follow the "ideal" of the relational dbms. looks like you get to > provide to the engine information about your data, rather than > describing to the engine what steps it's supposed to take. hints > that directly influence query plans seem like a huge step backwards. > if you want tight control over what the engine does you use a pure > rules based (rather than cost based) optimizer. but nobody > (including sqlite) does that anymore for good reasons. > > > On Sep 21, 2008, at 12:59 PM, Stephen Woodbridge wrote: > >> D. Richard Hipp wrote: >>> On Sep 21, 2008, at 8:51 AM, Russ Leighton wrote: >>> I am interested in ... a way to constraint/control index selection on queries. >>> >>> What other SQL database engines have this capability and what syntax >>> do they use? >> >> Richard, >> >> Hope this are useful: >> >> Oracle: >> http://www.adp-gmbh.ch/ora/sql/hints/index.html >> http://download.oracle.com/docs/cd/E12096_01/books/admintool/ >> admintool_PhysicalSetup32.html >> >> Postgresql: >> Does not support it unless it is a very recent addition, but its >> planner >> occasionally/frequently? makes a poor choice and adding support for >> hints ala Oracle and/or discussions like "Why didn't Postgresql use >> my >> index?" regularly occur on the user list. >> >> DB2: >> http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/ >> com.ibm.db2.udb.doc/admin/t0005308.htm >> >> MySQL: >> http://dev.mysql.com/doc/refman/5.0/en/index-hints.html >> >> -Steve >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2
not surprisingly, the db2 approach is the only one that seems to follow the "ideal" of the relational dbms. looks like you get to provide to the engine information about your data, rather than describing to the engine what steps it's supposed to take. hints that directly influence query plans seem like a huge step backwards. if you want tight control over what the engine does you use a pure rules based (rather than cost based) optimizer. but nobody (including sqlite) does that anymore for good reasons. On Sep 21, 2008, at 12:59 PM, Stephen Woodbridge wrote: > D. Richard Hipp wrote: >> On Sep 21, 2008, at 8:51 AM, Russ Leighton wrote: >> >>> I am interested in ... a way >>> to constraint/control index selection on queries. >>> >> >> What other SQL database engines have this capability and what syntax >> do they use? > > Richard, > > Hope this are useful: > > Oracle: > http://www.adp-gmbh.ch/ora/sql/hints/index.html > http://download.oracle.com/docs/cd/E12096_01/books/admintool/ > admintool_PhysicalSetup32.html > > Postgresql: > Does not support it unless it is a very recent addition, but its > planner > occasionally/frequently? makes a poor choice and adding support for > hints ala Oracle and/or discussions like "Why didn't Postgresql use my > index?" regularly occur on the user list. > > DB2: > http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/ > com.ibm.db2.udb.doc/admin/t0005308.htm > > MySQL: > http://dev.mysql.com/doc/refman/5.0/en/index-hints.html > > -Steve > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2
A reason I think such functionality would be ideal for sqlite is that it avoids the need for a fancy query plan optimizer. The user would have a way to direct the query plan if the simple and obvious plan is suboptimal. On Sep 21, 2008, at 11:36 AM, "D. Richard Hipp" <[EMAIL PROTECTED]> wrote: > > On Sep 21, 2008, at 8:51 AM, Russ Leighton wrote: > >> I am interested in ... a way >> to constraint/control index selection on queries. >> > > What other SQL database engines have this capability and what syntax > do they use? > > > D. Richard Hipp > [EMAIL PROTECTED] > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2
D. Richard Hipp wrote: > On Sep 21, 2008, at 8:51 AM, Russ Leighton wrote: > >> I am interested in ... a way >> to constraint/control index selection on queries. >> > > What other SQL database engines have this capability and what syntax > do they use? Richard, Hope this are useful: Oracle: http://www.adp-gmbh.ch/ora/sql/hints/index.html http://download.oracle.com/docs/cd/E12096_01/books/admintool/admintool_PhysicalSetup32.html Postgresql: Does not support it unless it is a very recent addition, but its planner occasionally/frequently? makes a poor choice and adding support for hints ala Oracle and/or discussions like "Why didn't Postgresql use my index?" regularly occur on the user list. DB2: http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/t0005308.htm MySQL: http://dev.mysql.com/doc/refman/5.0/en/index-hints.html -Steve ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2
Oracle has 'hints' which live in the comments emdedded in the select. Google 'oracle hint use index'. The 3rd hit down my result list has a nice overview.(I'd send the link but this stupid iPhone has no cut-n- paste). I think that hints are really ugly. Not sure about the other big dmbs. On Sep 21, 2008, at 11:36 AM, "D. Richard Hipp" <[EMAIL PROTECTED]> wrote: > > On Sep 21, 2008, at 8:51 AM, Russ Leighton wrote: > >> I am interested in ... a way >> to constraint/control index selection on queries. >> > > What other SQL database engines have this capability and what syntax > do they use? > > > D. Richard Hipp > [EMAIL PROTECTED] > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2
On Sep 21, 2008, at 8:51 AM, Russ Leighton wrote: > I am interested in ... a way > to constraint/control index selection on queries. > What other SQL database engines have this capability and what syntax do they use? D. Richard Hipp [EMAIL PROTECTED] ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2
On Sep 19, 2008, at 6:30 PM, D. Richard Hipp wrote: > > On Sep 19, 2008, at 5:47 PM, Russ Leighton wrote: > >> >> Observation 1: Group by (in my case) is faster w/out using the index >> than with using the index by 10X >> >> In my app I have a table- >> >> create table foo(k1integer ,k2 integer,k3 integer,...); >> create index foo_idx on foo(k1,k2,k3); >> >> when I do- >> >> select k1,k2,sum() as s,count(1) as c from foo group by k1,k2; >> >> The query is 10X slower with the above index created as when I drop >> the index and don't use it. Bug or >> just happens to be the case for some data distribution this is just >> the way it is? Yes, I know I have >> a 3 column index and only using 2 for this query. > > I do not see how it is possible for what you say to be true - unless > you have omitted important details of your query, such as a WHERE > clause. > Okay, this is a bit embarrassing but I can't seem to reproduce the performance difference. I spent hours tracking this issue down previously and now , going back to reproduce, I don't see it. Index'd group by's are always faster. Must have been some strange state of the equipment. Sorry for the false alarm. That said, the _real_ thing I am interested in as a requested enhancement is a way to constraint/control index selection on queries. This would give me an out should the situation above repeat itself in a more consistent manner. Thanks for sqlite. It is an amazing tool. BTW, in my mind lite != small, rather lite == simple+powerful+fast . We are using it over very large quantities of data on very powerful servers. -Russ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2
It would be very nice to have a way to explicitly control index use. I'm going to test my theory this weekend but I think if the index is not cached and the data large then the group by is faster without the index. If this is the case I have a real issue. I need the index for other queries and can't afford to drop it for the aggregations. I just want to tell the query planner to not use any index. If there is no way to do this with the current parser could we have an extension to select? Maybe something like: select with no index select with index index1,index2 Thoughts? On Sep 20, 2008, at 10:33 AM, "Jay A. Kreibich" <[EMAIL PROTECTED]> wrote: > On Fri, Sep 19, 2008 at 10:47:33PM -0400, Russ Leighton scratched on > the wall: > >> What about the null values for the aggregation keys when I put a '+' >> to disable the index? Is that 'as designed'? > > The "+" operator gets rid of type-affinities, and that can lead to > unexpected results. I'm not sure if one would consider these > specific results "as designed" or not, but it is known that "+" is > not totally without side effects: > > http://www.sqlite.org/cvstrac/tktview?tn=3279 > > > -j > > > -- > Jay A. Kreibich < J A Y @ K R E I B I.C H > > > "Our opponent is an alien starship packed with atomic bombs. We have > a protractor." "I'll go home and see if I can scrounge up a ruler > and a piece of string." --from Anathem by Neal Stephenson > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2
On Fri, Sep 19, 2008 at 10:47:33PM -0400, Russ Leighton scratched on the wall: > What about the null values for the aggregation keys when I put a '+' > to disable the index? Is that 'as designed'? The "+" operator gets rid of type-affinities, and that can lead to unexpected results. I'm not sure if one would consider these specific results "as designed" or not, but it is known that "+" is not totally without side effects: http://www.sqlite.org/cvstrac/tktview?tn=3279 -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Our opponent is an alien starship packed with atomic bombs. We have a protractor." "I'll go home and see if I can scrounge up a ruler and a piece of string." --from Anathem by Neal Stephenson ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2
On Sep 19, 2008, at 5:47 PM, Russ Leighton wrote: > > Observation 1: Group by (in my case) is faster w/out using the index > than with using the index by 10X > > In my app I have a table- > > create table foo(k1integer ,k2 integer,k3 integer,...); > create index foo_idx on foo(k1,k2,k3); > > when I do- > > select k1,k2,sum() as s,count(1) as c from foo group by k1,k2; > > The query is 10X slower with the above index created as when I drop > the index and don't use it. Bug or > just happens to be the case for some data distribution this is just > the way it is? Yes, I know I have > a 3 column index and only using 2 for this query. I do not see how it is possible for what you say to be true - unless you have omitted important details of your query, such as a WHERE clause. What is the argument to sum(), btw? You should get an error, I think. D. Richard Hipp [EMAIL PROTECTED] ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Performance/bug in multikey 'group by' in 3.6.2
Observation 1: Group by (in my case) is faster w/out using the index than with using the index by 10X In my app I have a table- create table foo(k1integer ,k2 integer,k3 integer,...); create index foo_idx on foo(k1,k2,k3); when I do- select k1,k2,sum() as s,count(1) as c from foo group by k1,k2; The query is 10X slower with the above index created as when I drop the index and don't use it. Bug or just happens to be the case for some data distribution this is just the way it is? Yes, I know I have a 3 column index and only using 2 for this query. Observation 2: I can force the index to not be used with '+' but the result has nulls for key values! If I do - select k1,k2,sum() as s,count(1) as c from foo group by +k1,+k2; then the index is not used, it is fast BUT the output has null values for k1,k2! Seems like a bug Thoughts? -Russ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users