Re: [sqlite] Full Table Scan after Analyze

2011-03-25 Thread Simon Slavin

On 25 Mar 2011, at 11:11pm, Jay A. Kreibich wrote:

> On Fri, Mar 25, 2011 at 10:30:59PM +, Simon Slavin scratched on the wall:
> 
>> Actually I'm surprised and not terribly impressed that SQLite ever
>> does a scan when there's an ideal index available.
> 
>  Why?  Do you want it to run slower?
> 
>  Indexes are not magic bullets.  Using an index to retrieve a row is
>  typically 5x to 20x more expensive than scanning a row.  There are
>  plenty of instances when a scan will be faster than an index use, and
>  not just in small tables.  Just as SQLite tries to use any index it
>  can to speed up a query, it also tries to avoid using indexes that
>  will slow it down-- and there are plenty of ways this can happen.

You know, I'd never thought of that.  All an index does is let you search a 
B-tree rather than a list.  Thank you.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full Table Scan after Analyze

2011-03-25 Thread Jay A. Kreibich
On Fri, Mar 25, 2011 at 10:30:59PM +, Simon Slavin scratched on the wall:

> Actually I'm surprised and not terribly impressed that SQLite ever
> does a scan when there's an ideal index available.

  Why?  Do you want it to run slower?

  Indexes are not magic bullets.  Using an index to retrieve a row is
  typically 5x to 20x more expensive than scanning a row.  There are
  plenty of instances when a scan will be faster than an index use, and
  not just in small tables.  Just as SQLite tries to use any index it
  can to speed up a query, it also tries to avoid using indexes that
  will slow it down-- and there are plenty of ways this can happen.

  The whole idea of an "ideal index" is something of a falsehood.  It
  all depends on context.  You cannot tell from the schema and query
  alone if an index will help.  You can make good guesses, but you can't
  *know* without understanding the size, shape, and distribution of the
  actual data in the tables.  That's the whole point of ANALYZE... to
  provide that context data to the query optimizer.

  If SQLite *correctly* gathered stats that say a full scan would be
  faster, why would you want it to do things the slow way?

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable." -- Angela Johnson
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full Table Scan after Analyze

2011-03-25 Thread Eric Minbiole
> Suggestion:  After you run ANALYZE, go back and modify values in
> sqlite_stat1 (and sqlite_stat2 if you are using SQLITE_ENABLE_STAT2) with
> "typical" values that you would expect to see in most databases.  Yes, you
> can read and write the sqlite_stat1 and sqlite_stat2 tables.  Maybe you can
> come up with some prepackaged default values for sqlite_stat1 and
> sqlite_stat2 based on experiments in your development lab, and then just
> put
> your prepackaged defaults into the real sqlite_stat1 and sqlite_stat2 after
> running ANALYZE.
>
>
Thank you for the helpful suggestion-- I do recall reading previously that
you could modify the _stat tables, but haven't ever pursued that approach.
(Mostly, I was avoiding the extra work of having to maintain these "magic"
stat values in our code; partially lazy, partially in case my test data
wasn't as "typical" as I thought it would be.)  Either way, it seems a
straightforward enough solution that I can try.

One question: After I modify this stat data, do I need to close/reopen the
connection for the changes to take effect?  Or will the results be used
automatically the next time I create the prepared statement?

Thank you again.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full Table Scan after Analyze

2011-03-25 Thread Richard Hipp
On Fri, Mar 25, 2011 at 6:30 PM, Simon Slavin  wrote:

>
> On 25 Mar 2011, at 10:18pm, Richard Hipp wrote:
>
> > Suggestion:  After you run ANALYZE, go back and modify values in
> > sqlite_stat1 (and sqlite_stat2 if you are using SQLITE_ENABLE_STAT2) with
> > "typical" values that you would expect to see in most databases.  Yes,
> you
> > can read and write the sqlite_stat1 and sqlite_stat2 tables.
>
> Can you DROP them ?
>

No.  But you can do "DELETE FROM sqlite_stat1;" which accomplishes the same
thing.


>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full Table Scan after Analyze

2011-03-25 Thread Simon Slavin

On 25 Mar 2011, at 10:18pm, Richard Hipp wrote:

> Suggestion:  After you run ANALYZE, go back and modify values in
> sqlite_stat1 (and sqlite_stat2 if you are using SQLITE_ENABLE_STAT2) with
> "typical" values that you would expect to see in most databases.  Yes, you
> can read and write the sqlite_stat1 and sqlite_stat2 tables.

Can you DROP them ?  Because this might be simpler and would be more 
future-proof in case the content of those tables changes in the future.

Actually I'm surprised and not terribly impressed that SQLite ever does a scan 
when there's an ideal index available.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full Table Scan after Analyze

2011-03-25 Thread Richard Hipp
On Fri, Mar 25, 2011 at 5:24 PM, Eric Minbiole  wrote:

> We have an application that has been using SQLite successfully for several
> years.  We use SQLite for various purposes within our product, with a few
> dozen tables overall.
>
> Recently, I started to upgrade from version 3.6.3 to 3.7.5.  During that
> time, I noticed that several previously fast indexed queries turned into
> slow full table scans.  For example, on a simple (hypothetical) example:
>
> CREATE TABLE tbl (
>  id INTEGER,
>  value INTEGER
> );
> CREATE INDEX idx_id ON tbl (id);
>
> The query "SELECT * FROM tbl WHERE id = 1" previously made use of idx_id.
> However, in 3.7.5, we noticed that it was doing a full table scan, causing
> significant performance issues.
>
> After some debugging, found that the issue is related to our (mis-)use of
> the "analyze" command:  Historically, our software runs analyze any time
> the
> db schema changes.  However, if our customer had not yet made use of a
> particular feature, the corresponding table(s) might be _empty_ when
> analyze
> is run.  On previous versions, this did not cause any obvious problems.  In
> the new version, the query planner reasonably assumes that a full table
> scan
> is faster than an index on a small/empty table.  However, when the customer
> later makes use of those features (populating the tables), the queries
> become quite slow, as it still does a full table scan.
>
> I'm trying to determine the best way to resolve this issue:
>

Suggestion:  After you run ANALYZE, go back and modify values in
sqlite_stat1 (and sqlite_stat2 if you are using SQLITE_ENABLE_STAT2) with
"typical" values that you would expect to see in most databases.  Yes, you
can read and write the sqlite_stat1 and sqlite_stat2 tables.  Maybe you can
come up with some prepackaged default values for sqlite_stat1 and
sqlite_stat2 based on experiments in your development lab, and then just put
your prepackaged defaults into the real sqlite_stat1 and sqlite_stat2 after
running ANALYZE.



>
> - Re-running analyze after adding data to the table is an obvious
> suggestion.  However, as we have lots of tables in use for various
> purposes,
> I'd need to sprinkle lots of "if (first time data added) analyze()" code
> around.
> - I could add a user triggered "maintenance" feature to manually
> re-analyze,
> but that takes away some of the "zero configuration" benefits of our
> product.
> - Even if I stop calling analyze at all going forward, there may be some
> existing (empty) tables that may have already been analyzed by previous
> code.
> - I could make use of the "INDEXED BY" clause, but this goes against the
> documented intent of this feature ("*not* intended for use in tuning the
> performance of a query"), and requires re-writing of many queries.
>
> I'm sure there are other good ideas-- I'm flexible, and open to suggestion.
> I'd appreciate any suggestions the group might have.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Full Table Scan after Analyze

2011-03-25 Thread Eric Minbiole
We have an application that has been using SQLite successfully for several
years.  We use SQLite for various purposes within our product, with a few
dozen tables overall.

Recently, I started to upgrade from version 3.6.3 to 3.7.5.  During that
time, I noticed that several previously fast indexed queries turned into
slow full table scans.  For example, on a simple (hypothetical) example:

CREATE TABLE tbl (
  id INTEGER,
  value INTEGER
);
CREATE INDEX idx_id ON tbl (id);

The query "SELECT * FROM tbl WHERE id = 1" previously made use of idx_id.
However, in 3.7.5, we noticed that it was doing a full table scan, causing
significant performance issues.

After some debugging, found that the issue is related to our (mis-)use of
the "analyze" command:  Historically, our software runs analyze any time the
db schema changes.  However, if our customer had not yet made use of a
particular feature, the corresponding table(s) might be _empty_ when analyze
is run.  On previous versions, this did not cause any obvious problems.  In
the new version, the query planner reasonably assumes that a full table scan
is faster than an index on a small/empty table.  However, when the customer
later makes use of those features (populating the tables), the queries
become quite slow, as it still does a full table scan.

I'm trying to determine the best way to resolve this issue:

- Re-running analyze after adding data to the table is an obvious
suggestion.  However, as we have lots of tables in use for various purposes,
I'd need to sprinkle lots of "if (first time data added) analyze()" code
around.
- I could add a user triggered "maintenance" feature to manually re-analyze,
but that takes away some of the "zero configuration" benefits of our
product.
- Even if I stop calling analyze at all going forward, there may be some
existing (empty) tables that may have already been analyzed by previous
code.
- I could make use of the "INDEXED BY" clause, but this goes against the
documented intent of this feature ("*not* intended for use in tuning the
performance of a query"), and requires re-writing of many queries.

I'm sure there are other good ideas-- I'm flexible, and open to suggestion.
I'd appreciate any suggestions the group might have.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Capitalisation

2011-03-25 Thread Igor Tandetnik
J Trahair  wrote:
> Hi. I've noticed that SELECT statements are specific to the capitalisation of 
> the WHERE section.
> 
> Suppose I have a Customers table with a CustomerCode field, and a value in 
> one of the records of TRA001, eg:
> 
> CustomerCode   CustomerName
> TRA001 Trahair
> SMI001  Smith
> 
> If the SELECT statement is SELECT * FROM Customers WHERE CustomerCode = 
> 'tra001' that is, with tra001 in lower case, SQLite *does
> not find the record*. 
> 
>Is there a setting I can change so that SQLite SELECT statements are not case 
>specific? 

create table Customers (CustomerCode text collate NOCASE, CustomerName text);

Now all comparisons on CustomerCode will be case-insensitive by default. If you 
want that in specific SELECT statement instead, you can write

SELECT * FROM Customers WHERE CustomerCode = 'tra001' collate NOCASE;

-- 
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Capitalisation

2011-03-25 Thread Mihai Militaru
On Fri, 25 Mar 2011 10:35:37 +0100
"J Trahair"  wrote:

> If the SELECT statement is SELECT * FROM Customers WHERE CustomerCode = 
> 'tra001' that is, with tra001 in lower case, SQLite *does not find the
> record*.
SELECT * FROM Customers WHERE CustomerCode LIKE 'tra001'
would do. There are exceptions, for non-ASCII characters in unicode - 
http://www.sqlite.org/lang_expr.html#like - however, the like function can be
redefined to match your needs: 
http://www.sqlite.org/pragma.html#pragma_case_sensitive_like

> SQL Server Express and MySQL - and even Access - do not have this 
> disadvantage.
My opinion is that matching the exact string you want is not a "disadvantage", 
'tra001' and 'TRA001' are different things, why would the equal sign
match them?...

-- 
Mihai Militaru 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Capitalisation

2011-03-25 Thread Oliver Peters
J Trahair  writes:

[...]

> SELECT * FROM Customers WHERE CustomerCode = 'tra001'
> that is, with tra001 in
> lower case, SQLite *does not find the record*.

[...]

use LIKE (http://www.sqlite.org/lang_expr.html#like)

SELECT * FROM Customers WHERE CustomerCode LIKE 'tra001'


Oliver

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Capitalisation

2011-03-25 Thread Cory Nelson
On Fri, Mar 25, 2011 at 2:35 AM, J Trahair
 wrote:
> Hi. I've noticed that SELECT statements are specific to the capitalisation of 
> the WHERE section.
>
> Suppose I have a Customers table with a CustomerCode field, and a value in 
> one of the records of TRA001, eg:
>
> CustomerCode   CustomerName
> TRA001             Trahair
> SMI001              Smith
>
> If the SELECT statement is SELECT * FROM Customers WHERE CustomerCode = 
> 'tra001' that is, with tra001 in lower case, SQLite *does not find the 
> record*.
>
> SQL Server Express and MySQL - and even Access - do not have this 
> disadvantage. Is there a setting I can change so that SQLite SELECT 
> statements are not case specific?

SQL Server Express and MySQL do not have this behavior by default.
I'm not sure where you got that idea.

You can enable ASCII case-insensitivity using COLLATE NOCASE in your
SELECT statement, or as a property of the column in your CREATE TABLE
statement.

-- 
Cory Nelson
http://int64.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Capitalisation

2011-03-25 Thread J Trahair
Hi. I've noticed that SELECT statements are specific to the capitalisation of 
the WHERE section.

Suppose I have a Customers table with a CustomerCode field, and a value in one 
of the records of TRA001, eg:

CustomerCode   CustomerName
TRA001 Trahair
SMI001  Smith

If the SELECT statement is SELECT * FROM Customers WHERE CustomerCode = 
'tra001' that is, with tra001 in lower case, SQLite *does not find the record*.

SQL Server Express and MySQL - and even Access - do not have this disadvantage. 
Is there a setting I can change so that SQLite SELECT statements are not case 
specific?

Thanks

J Trahair
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users