from:"Russ Leighton"

Re: [sqlite] Specifing which index to use. Was: Performance/bug in multikey 'group by' in 3.6.2

2008-09-26 Thread Russ Leighton


On Sep 26, 2008, at 2:29 PM, Kees Nuyt wrote:

> On Fri, 26 Sep 2008 12:54:36 -0400, Russell wrote:
>
>> I need a 2 key index for some queries and also want to aggregate on
>> these 2 columns. I need this index BUT I have many large sqlite dbs I
>> iterate over and they won't fit in the filesystem cache. Run time  
>> when
>> the index is present is 105min. Run time with out the index is 3min.
>
> Did you populate the sqlite_stat1 index statistics table
> with ANALYZE ? It might influence the query plan.

Yup. 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Final Resolution for Index Control?

2008-09-25 Thread Russ Leighton


DRH,

I think most people like the nature of your proposal to allow control  
of indexes in queries. There were a number
of suggestions on variations of the syntax. Have you made a final  
decision on what the syntax will be?

Thx

Russ
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

2008-09-21 Thread Russ Leighton


Personally, I like the Mysql syntax best of the below links. Seems  
natural and unobtrusive (unlike 'hints').

On Sep 21, 2008, at 12:12 PM, Alex Scotti wrote:

> not surprisingly, the db2 approach is the only one that seems to
> follow the "ideal" of the relational dbms.  looks like you get to
> provide to the engine information about your data, rather than
> describing to the engine what steps it's supposed to take.  hints
> that directly influence query plans seem like a huge step backwards.
> if you want tight control over what the engine does you use a pure
> rules based (rather than cost based) optimizer.  but nobody
> (including sqlite) does that anymore for good reasons.
>
>
> On Sep 21, 2008, at 12:59 PM, Stephen Woodbridge wrote:
>
>> D. Richard Hipp wrote:
>>> On Sep 21, 2008, at 8:51 AM, Russ Leighton wrote:
>>>
>>>> I am interested in ... a way
>>>> to constraint/control index selection on queries.
>>>>
>>>
>>> What other SQL database engines have this capability and what syntax
>>> do they use?
>>
>> Richard,
>>
>> Hope this are useful:
>>
>> Oracle:
>> http://www.adp-gmbh.ch/ora/sql/hints/index.html
>> http://download.oracle.com/docs/cd/E12096_01/books/admintool/
>> admintool_PhysicalSetup32.html
>>
>> Postgresql:
>> Does not support it unless it is a very recent addition, but its
>> planner
>> occasionally/frequently? makes a poor choice and adding support for
>> hints ala Oracle and/or discussions like "Why didn't Postgresql use  
>> my
>> index?" regularly occur on the user list.
>>
>> DB2:
>> http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/
>> com.ibm.db2.udb.doc/admin/t0005308.htm
>>
>> MySQL:
>> http://dev.mysql.com/doc/refman/5.0/en/index-hints.html
>>
>> -Steve
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

2008-09-21 Thread Russ Leighton

On Sep 19, 2008, at 6:30 PM, D. Richard Hipp wrote:

>
> On Sep 19, 2008, at 5:47 PM, Russ Leighton wrote:
>
>>
>> Observation 1:  Group by (in my case) is faster w/out using the index
>> than with using the index by 10X
>>
>> In my app I have a table-
>>
>> create table foo(k1integer ,k2 integer,k3 integer,...);
>> create index foo_idx on foo(k1,k2,k3);
>>
>> when I do-
>>
>> select k1,k2,sum() as s,count(1) as c from foo group by k1,k2;
>>
>> The query is 10X slower with the above index created as when I drop
>> the index and don't use it. Bug or
>> just happens to be the case for some data distribution this is just
>> the way it is? Yes, I know I  have
>> a 3 column index and only using 2 for this query.
>
> I do not see how it is possible for what you say to be true - unless
> you have omitted important details of your query, such as a WHERE
> clause.
>

Okay, this is a bit embarrassing but I can't seem to reproduce the  
performance
difference. I spent hours tracking this issue down previously and  
now , going back to reproduce,
I don't see it. Index'd group by's are always faster. Must have been  
some strange state of the equipment.

Sorry for the false alarm.

That said, the _real_ thing I am interested in as a requested  
enhancement is a way
to constraint/control index selection on queries. This would give me  
an out should
the situation above repeat itself in a more consistent manner.

Thanks for sqlite. It is an amazing tool.

BTW, in my mind lite != small, rather  lite == simple+powerful+fast .  
We are using it over very large
quantities of data on very powerful servers.

-Russ

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Performance/bug in multikey 'group by' in 3.6.2

2008-09-19 Thread Russ Leighton


Observation 1:  Group by (in my case) is faster w/out using the index  
than with using the index by 10X

In my app I have a table-

  create table foo(k1integer ,k2 integer,k3 integer,...);
  create index foo_idx on foo(k1,k2,k3);

when I do-

  select k1,k2,sum() as s,count(1) as c from foo group by k1,k2;

The query is 10X slower with the above index created as when I drop  
the index and don't use it. Bug or
just happens to be the case for some data distribution this is just  
the way it is? Yes, I know I  have
a 3 column index and only using 2 for this query.


Observation 2: I can force the index to not be used with '+' but the  
result has nulls for key values!

If I do -

 select k1,k2,sum() as s,count(1) as c from foo group by +k1,+k2;

then the index is not used, it is fast BUT the output has null values  
for k1,k2! Seems like a bug

Thoughts?

-Russ


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Specifing which index to use. Was: Performance/bug in multikey 'group by' in 3.6.2

[sqlite] Final Resolution for Index Control?

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

Re: [sqlite] Performance/bug in multikey 'group by' in 3.6.2

[sqlite] Performance/bug in multikey 'group by' in 3.6.2

5 matches

Site Navigation

Mail list logo

Footer information