Re: [sqlite] COUNT() extremely slow first time!

2011-02-24 Thread Sven L

Not sure. Will get back on this as soon as I have played a bit with the 
analyzer utility!
 
What I meant was "my own estimated x bytes/row" :)
x + ID + ~20 characters should make some ~32 bytes :)
 
 Date: Wed, 23 Feb 2011 10:58:00 -0800
 From: fle...@fletchowns.net
 To: sqlite-users@sqlite.org
 Subject: Re: [sqlite] COUNT() extremely slow first time!
 
 Is the record size you refer to here the same as the "Average payload per
 entry" that sqlite3_analyzer determines for me?
 
 On Wed, Feb 23, 2011 at 5:09 AM, Max Vlasov max.vla...@gmail.com wrote:
 
  Greg, you should also take the record size into account. My hypothesis is
  that if your record is comparatively small (several rows fit into a 1024-byte
  page) the speed of SELECT COUNT will be the same for any page size (my quick
  tests confirm this). It would be interesting to know the average size of your
  record, to understand why the numbers are so different.
 
  Max
  ___
  sqlite-users mailing list
  sqlite-users@sqlite.org
  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
 


Re: [sqlite] COUNT() extremely slow first time!

2011-02-23 Thread Sven L

Interesting!
I've been using PRAGMA page_size = 4096; in my software. Perhaps I should 
increase it and see if I can get a performance gain.
 
Does it affect INSERTs too?
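One thing to keep in mind when experimenting: page_size only takes effect while the database is still empty (or at the next VACUUM), so it has to be set before the first table is created. A minimal sketch with Python's standard sqlite3 module (function name is mine):

```python
import sqlite3

def create_db_with_page_size(path, page_size):
    """page_size only takes effect while the database is still empty
    (or at the next VACUUM), so set it before creating any tables.
    Function name is mine."""
    conn = sqlite3.connect(path)
    conn.execute(f"PRAGMA page_size = {page_size}")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, txt TEXT)")
    conn.commit()
    actual = conn.execute("PRAGMA page_size").fetchone()[0]
    conn.close()
    return actual
```

Changing the pragma on an existing, populated database does nothing until a VACUUM rebuilds the file.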
 
 Date: Tue, 22 Feb 2011 10:59:29 -0800
 From: fle...@fletchowns.net
 To: sqlite-users@sqlite.org
 Subject: Re: [sqlite] COUNT() extremely slow first time!
 
 I'm currently dealing with a similar issue. I've found that the page_size
 PRAGMA setting can have a dramatic effect on how long it takes to warm up
 the table. On Windows 7, with page_size=1024, a SELECT COUNT(last_column)
 takes 40.2 seconds. page_size=4096 takes 15.5 seconds. page_size=8192 takes
 8.5 seconds. This was done with a reboot between each test.
 
 This page recommends a page_size of 4096:
 http://www.sqlite.org/cvstrac/wiki?p=PerformanceTuningWindows
 
 If I see better performance with the larger page sizes (going to test 16384
 and beyond after this) is there any reason not to use them?
 
 Greg
 
 On Mon, Feb 21, 2011 at 4:37 PM, Stephen Oberholtzer 
 oliverkloz...@gmail.com wrote:
 
  On Mon, Feb 21, 2011 at 11:35 AM, Jim Wilcoxson pri...@gmail.com wrote:
   On Mon, Feb 21, 2011 at 11:05 AM, Sven L larvpo...@hotmail.se wrote:
  
   Thank you for your detailed explanation!
   First, can you please tell me how to purge the cache in Windows 7? This
  could be very useful for my tests!
  
   Sorry, dunno for Windows. On Mac OSX it is the purge command, in the
   development tools. On Linux, you do: echo 3 > /proc/sys/vm/drop_caches
 
  Just make sure you either (a) quote the 3 (echo '3' > /proc/sys/vm/drop_caches)
  or (b) put a space between the 3 and the >. If you don't quote it, and you
  don't put the space in (echo 3>/proc/sys/vm/drop_caches), it doesn't do
  anything, for reasons I won't go into.
 
  --
  -- Stevie-O
  Real programmers use COPY CON PROGRAM.EXE


Re: [sqlite] COUNT() extremely slow first time!

2011-02-23 Thread Sven L

Does this trick work on the primary key? If not, why?
 
 From: max.vla...@gmail.com
 Date: Wed, 23 Feb 2011 16:09:04 +0300
 To: sqlite-users@sqlite.org
 Subject: Re: [sqlite] COUNT() extremely slow first time!
 
 On Tue, Feb 22, 2011 at 9:59 PM, Greg Barker fle...@fletchowns.net wrote:
 
  I'm currently dealing with a similar issue. I've found that the page_size
  PRAGMA setting can have a dramatic effect on how long it takes to warm up
  the table. On Windows 7, with page_size=1024, a SELECT COUNT(last_column)
  takes 40.2 seconds. page_size=4096 takes 15.5 seconds. page_size=8192 takes
  8.5 seconds. This was done with a reboot between each test.
 
  This page recommends a page_size of 4096:
  http://www.sqlite.org/cvstrac/wiki?p=PerformanceTuningWindows
 
  If I see better performance with the larger page sizes (going to test 16384
  and beyond after this) is there any reason not to use them?
 
 
 Greg, you should also take the record size into account. My hypothesis is
 that if your record is comparatively small (several rows fit into a 1024-byte
 page) the speed of SELECT COUNT will be the same for any page size (my quick
 tests confirm this). It would be interesting to know the average size of your
 record, to understand why the numbers are so different.
 
 Returning to the original topic, for performance reasons I sometimes
 recommend using an index created on the id/rowid. It's a strange construct
 that makes no sense, but actually it sometimes gives a speed improvement.
 This is because any index contains only the data used in it and if the query
 doesn't require getting additional data from the table it was created for,
 sqlite only reads this index and nothing else.
 
 So to get the fastest count result one can create the following index
 (assuming id is the alias for rowid)
 
 CREATE INDEX [idx_MyTableId] ON [MyTable] ([ID] )
 
 And use the following query
 
 SELECT COUNT(id) from (SELECT id FROM MyTable ORDER By Id)
 
 The ORDER BY here forces use of this index, and I used the outer select since
 COUNT(id) directly inside the main select, for unknown reasons, triggers a
 table scan.
 
 For any query in my tests that usually takes 5-50 seconds, this one is
 always less than a second. But it costs a little in terms of size (the
 index takes space) and insert speed. If this is a small price to pay,
 then this may be an answer.
 
 Max
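Max's index-on-the-rowid-alias trick can be reproduced end to end with Python's standard sqlite3 module (table and index names follow his example; the table shape and row count are mine):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO MyTable (payload) VALUES (?)",
                 [("row %d" % i,) for i in range(1000)])
conn.commit()

# The "strange" index on the rowid alias...
conn.execute("CREATE INDEX idx_MyTableId ON MyTable (ID)")

# ...and the count whose subquery only ever touches that index.
fast_count = conn.execute(
    "SELECT COUNT(ID) FROM (SELECT ID FROM MyTable ORDER BY ID)"
).fetchone()[0]
```

The speed difference only shows up on tables far larger than this toy example, of course.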


Re: [sqlite] COUNT() extremely slow first time!

2011-02-23 Thread Sven L

Thanks for pointing this out!
 
In my case I have spent much time on normalizing my tables, so the row size 
should be constant in most cases. I do wonder though: what if the row size is 
less than 32 bytes? Or is there a minimum?
 
For instance, I have many lookup tables with ID+text (usually around 20 
characters):
MyID|MyText
 
With a page size of 4096, will SQLite put ~200 rows in one page?
 

 
 Date: Wed, 23 Feb 2011 10:47:03 -0500
 From: pri...@gmail.com
 To: t...@djii.com; sqlite-users@sqlite.org
 Subject: Re: [sqlite] COUNT() extremely slow first time!
 
 The SQLite cache size is in pages (2000 by default), so by increasing
 the page size 8x, you're also increasing the cache size and memory
 requirements by 8x. Not saying it's a bad thing, just something to be
 aware of.
 
 If you want to compare 1K and 8K page size and only compare the effect
 page size has, you should either increase the cache size to 16000 for
 1K pages or decrease the cache to 250 for 8K pages.
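Since PRAGMA cache_size is counted in pages, holding the cache at a constant byte budget means scaling the page count, which can be sketched like this (helper name is mine):

```python
import sqlite3

def set_cache_bytes(conn, target_bytes):
    """Set PRAGMA cache_size (counted in pages) so the page cache
    covers roughly target_bytes regardless of the page size.
    Helper name is mine."""
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    pages = max(1, target_bytes // page_size)
    conn.execute(f"PRAGMA cache_size = {pages}")
    return conn.execute("PRAGMA cache_size").fetchone()[0]

conn = sqlite3.connect(":memory:")
page_size = conn.execute("PRAGMA page_size").fetchone()[0]
cache_pages = set_cache_bytes(conn, 2_000_000)
```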
 
 The other thing to be aware of is that SQLite will not allow a row to
 cross 2 pages. (It does allow a row to be larger than a page, using
 an overflow page.) So for example, if your page size is 1024 and row
 size is 512 bytes, you can fit 2 rows on a page. I'm simplifying this
 somewhat and ignoring internal SQLite data, but you get the idea. If
 your row size is 513 bytes, you will have 511 bytes of waste on each
 page, so 50% of your database will be air. As your row size heads
 toward 1024 there will be less waste. At 1025 bytes, SQLite will
 start splitting rows into overflow pages, putting 1024 bytes into the
 overflow page and 1 byte in the btree page. These numbers aren't
 right, but illustrate the point.
 
 So to find a good page size, experiment and measure.
 
 Jim
 --
 HashBackup: easy onsite and offsite Unix backup
 http://www.hashbackup.com
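One deterministic way to "experiment and measure" without a stopwatch is to load identical data at several page sizes and compare PRAGMA page_count — a rough proxy for the wasted-space effect Jim describes (a sketch; table shape and row size are arbitrary):

```python
import sqlite3

def pages_used(page_size, nrows=2000, rowbytes=100):
    """Load identical data at a given page size and report how many
    pages SQLite used. Fewer, larger pages mean less per-page waste
    for rows of this size."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"PRAGMA page_size = {page_size}")  # before any tables exist
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany("INSERT INTO t (payload) VALUES (?)",
                     [("x" * rowbytes,) for _ in range(nrows)])
    conn.commit()
    return conn.execute("PRAGMA page_count").fetchone()[0]
```

Multiply page_count by the page size to compare total file footprint across settings.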
 
 
 
 On Wed, Feb 23, 2011 at 10:20 AM, Teg t...@djii.com wrote:
  Hello Greg,
 
  I found this to be the case too. The difference between 1K and 8K is
  staggering. I default all my windows DB's to 8K now.
 
 
  Tuesday, February 22, 2011, 1:59:29 PM, you wrote:
 
  GB I'm currently dealing with a similar issue. I've found that the 
  page_size
  GB PRAGMA setting can have a dramatic effect on how long it takes to warm 
  up
  GB the table. On Windows 7, with page_size=1024, a SELECT 
  COUNT(last_column)
  GB takes 40.2 seconds. page_size=4096 takes 15.5 seconds. page_size=8192 
  takes
  GB 8.5 seconds. This was done with a reboot between each test.
 
 
 
 
  --
  Best regards,
   Teg  mailto:t...@djii.com
 


Re: [sqlite] Problem with VACUUM feature

2011-02-23 Thread Sven L

Make sure your antivirus is turned off when you run your vacuum test.
It's a long-shot, but I've seen some AVs lock files etc...

 
 Date: Wed, 23 Feb 2011 10:14:15 -0600
 From: j...@kreibi.ch
 To: sqlite-users@sqlite.org
 Subject: Re: [sqlite] Problem with VACUUM feature
 
 On Wed, Feb 23, 2011 at 11:50:05AM +0530, Sudha Venkatareddy scratched on the 
 wall:
  Hi All,
  
  Input:MyDb.db with size 23KB (has lot of empty pages caused due to delete
  operation)
  
  *Expected OutPut: after applying Vacuum command, should be MyDb.db with
  reduced file size of 13KB.*
  
  *Actual output: MyDb.db remains size 23KB(size not changes from original)
  and creates temporary file etilqs_Hm4RUi6JPXcMZ17 whose data is same as
  MyDb.db but the size is reduced to 13KB*
 
 VACUUM is a two step process. First, the data is copied from the
 original file to a temp file. This is a high-level copy, where the
 data is compacted and reordered, and free pages are eliminated.
 
 The second step copies the data from the temp file back to the
 primary file. This is done as a low-level page-by-page copy. It is
 *not* an OS file copy. By using the page update system already built
 into SQLite, the copy-back will create a rollback journal and remain
 transaction-safe for the whole VACUUM process.
 
 From the sound of things, the first step is working, but the second
 step is failing for some reason. My first guess would be that there
 are permissions issues with creating the rollback file, so the second
 copy process fails. That's just a guess, however, as there could be
 a number of other issues. If you can figure out if a rollback file
 is ever being created, that would help determine if the copy-back is
 starting, but fails for some reason, or if the copy-back step is
 failing right from the start. Given the small database size, it
 might be somewhat hard to figure that out, however-- any rollback is
 going to be there and gone (or not there at all) very quickly.
 
 -j
 
 -- 
 Jay A. Kreibich  J A Y @ K R E I B I.C H 
 
 Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable. -- Angela Johnson
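When VACUUM does succeed, the shrink Jay describes is easy to observe: deleting rows leaves the file size (page count) unchanged, and only the VACUUM rebuild reclaims the free pages. A small sketch with Python's sqlite3 module (sizes are arbitrary):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO t (payload) VALUES (?)",
                 [("x" * 500,) for _ in range(2000)])
conn.commit()

conn.execute("DELETE FROM t")   # rows gone, but the free pages remain
conn.commit()
before = conn.execute("PRAGMA page_count").fetchone()[0]

conn.execute("VACUUM")          # copy out, compact, copy back
after = conn.execute("PRAGMA page_count").fetchone()[0]
```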


[sqlite] Auto-grow setting?

2011-02-22 Thread Sven L

Can't seem to find a setting to control how the database file grows when full. 
Is there such a setting?
It looks like the file increases by some ~100 kB when it is full. I want to 
change this to around 10 MB (or even more) to avoid file fragmentation.
 
Any ideas?
 
Thanks


Re: [sqlite] Auto-grow setting?

2011-02-22 Thread Sven L

What I meant is this:
Database size = 1 MB. When opening connection, set chunk to ~100 kB.
Database size = 100 MB. When opening connection, set chunk to ~10 MB.
Database size = 1 GB. When opening connection, set chunk to ~100 MB.
 
I guess SQLITE_FCNTL_CHUNK_SIZE should be a 2^n value, so this gives me this 
algorithm:
 
unsigned int v = databaseSizeBytes / 10;
// Round up to the next highest power of 2 (well-known bit trick ;)
--v;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
++v;
 
// A minimum of 32 kB is desirable?
chunkSize = max(32768, v);

 
Thoughts?
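The rounding step can be sanity-checked in Python (function and parameter names are mine; an extra >>32 shift is added so Python's unbounded ints also round sizes above 4 GB correctly):

```python
def suggest_chunk_size(database_size_bytes, minimum=32768):
    """Round (size / 10) up to the next power of two, with a 32 kB
    floor, mirroring the bit-trick rounding described above."""
    v = max(1, database_size_bytes // 10)
    v -= 1
    v |= v >> 1
    v |= v >> 2
    v |= v >> 4
    v |= v >> 8
    v |= v >> 16
    v |= v >> 32   # cover sizes > 4 GB; harmless for smaller values
    v += 1
    return max(minimum, v)
```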
 
 
 From: paiva...@gmail.com
 Date: Tue, 22 Feb 2011 10:01:03 -0500
 Subject: Re: [sqlite] Auto-grow setting?
 To: sqlite-users@sqlite.org
 CC: larvpo...@hotmail.se
 
 Please reply to the list, not to me only.
 
 It's impossible to set chunk size to percentage of the database size,
 you can only set a constant value.
 
 
 Pavel
 
 On Tue, Feb 22, 2011 at 9:13 AM, Sven L larvpo...@hotmail.se wrote:
  Thanks a lot! :D
 
  What do you think of setting the chunk size to approximately 10% of the
  database file size? Or is it better to use a constant?
 
  From: paiva...@gmail.com
  Date: Tue, 22 Feb 2011 08:30:54 -0500
  Subject: Re: [sqlite] Auto-grow setting?
  To: sqlite-users@sqlite.org
  CC: larvpo...@hotmail.se
 
  Is SQLITE_FCNTL_CHUNK_SIZE what you are looking for? See more
  information about it here:
  http://www.sqlite.org/c3ref/c_fcntl_chunk_size.html. Notice that this
  feature appeared only in recent version of SQLite, so if you have some
  earlier version you won't be able to control it and SQLite will
  grow/shrink database page-by-page (maximum page size is 32Kb).
 
 
  Pavel
 
  On Tue, Feb 22, 2011 at 7:28 AM, Sven L larvpo...@hotmail.se wrote:
  
   Can't seem to find a setting to control how the database file grows when
   full. Is there such a setting?
    It looks like the file increases by some ~100 kB when it is full. I
   want to change this to around 10 MB (or even more) to avoid file
   fragmentation.
  
   Any ideas?
  
   Thanks


[sqlite] COUNT() extremely slow first time!

2011-02-21 Thread Sven L

SQLite 3.7.5.
 
I have a table (13 columns with 1 primary key, 1 index on a date column, 1 
UNIQUE-constraint on two columns).
I insert ~1.3 million rows, 1000 at a time (transaction-wise).
I close the database, reopen it and immediately perform a COUNT-operation:
 
SELECT COUNT(ItemID) FROM Items;

This takes around 40 seconds the first time! WHY?!
I believe the value is somehow cached inside the database after the above call.
 
 
The table used:
 
CREATE TABLE IF NOT EXISTS Item
(
 ItemID INTEGER PRIMARY KEY,
 A INTEGER NOT NULL,
 B INTEGER NOT NULL,
 C INTEGER NOT NULL,
 D INTEGER NOT NULL,
 E INTEGER NOT NULL,
 F INTEGER NOT NULL,
 G INTEGER NOT NULL,
 H DATE NOT NULL,
 I CHAR(3) NOT NULL,
 J INTEGER NOT NULL,
 K INTEGER NOT NULL,
 L INTEGER NOT NULL,
 
 UNIQUE (B, A)
);
CREATE INDEX IF NOT EXISTS Item_H_Index ON Item (H);
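For reference, the reported setup can be sketched at a much smaller scale with Python's sqlite3 module (schema trimmed to a few columns, 5000 rows instead of millions, batch size as described):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE IF NOT EXISTS Item
(
 ItemID INTEGER PRIMARY KEY,
 A INTEGER NOT NULL,
 B INTEGER NOT NULL,
 H DATE NOT NULL,
 UNIQUE (B, A)
);
CREATE INDEX IF NOT EXISTS Item_H_Index ON Item (H);
""")

rows = [(i, i, "2011-02-21") for i in range(5000)]
for start in range(0, len(rows), 1000):
    with conn:  # one transaction per 1000-row batch
        conn.executemany("INSERT INTO Item (A, B, H) VALUES (?, ?, ?)",
                         rows[start:start + 1000])

n = conn.execute("SELECT COUNT(ItemID) FROM Item").fetchone()[0]
```

The cold-cache effect discussed in this thread only appears with an on-disk file far larger than RAM, not with a toy database like this.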
  


Re: [sqlite] COUNT() extremely slow first time!

2011-02-21 Thread Sven L

Same result :(
Note that I have compiled SQLite with the following switches:
SQLITE_ENABLE_STAT2
SQLITE_THREADSAFE=2
 
I've learnt that COUNT(*) is slower than COUNT(ID), since * means the engine 
has to traverse all columns and it might even return another value if there are 
NULL-values...
 
Also, this is quite interesting:
 
sqlite> EXPLAIN QUERY PLAN SELECT COUNT(ItemID) FROM Items;
0|0|0|SCAN TABLE Items (~100 rows)
sqlite> EXPLAIN QUERY PLAN SELECT COUNT(*) FROM Items;
sqlite>

I would expect an index scan on the first statement. The second statement tells 
me nada?!
 
Thanks for your help!

 
 From: slav...@bigfraud.org
 Date: Mon, 21 Feb 2011 14:24:50 +
 To: sqlite-users@sqlite.org
 Subject: Re: [sqlite] COUNT() extremely slow first time!
 
 
 On 21 Feb 2011, at 2:23pm, Sven L wrote:
 
  SELECT COUNT(ItemID) FROM Items;
  
  This takes around 40 seconds the first time! WHY?!
 
 Try again, doing everything identically except that instead of the above line 
 use
 
 SELECT COUNT(*) FROM Items;
 
 Simon.


Re: [sqlite] COUNT() extremely slow first time!

2011-02-21 Thread Sven L

Thank you for your detailed explanation!
First, can you please tell me how to purge the cache in Windows 7? This could 
be very useful for my tests!
 
I'm quite sure my database itself is not fragmented, since I have only inserted 
data. The file system is in good shape too; Windows reports 0% fragmentation. 
Perhaps there is some other bottleneck, like disk performance in general (this 
is a 5400 rpm drive)? Or antivirus-related CPU stalls? (I did turn real-time scanning 
off though.)
I have even turned Windows Search off (and got a 20% performance gain!). My 
32-bit application is running under Windows 7 (64-bit). Could WOW64 have 
something to do with this performance issue?
 
The size of the database is 1.98 GB, with 1.5M rows. Couldn't it be that the 
table scan simply is darn slow for huge tables?
 
In an ideal world the COUNT() would be performed on the primary key in RAM. 
That's not possible? :P
 
Thanks again!
 
 
 Date: Mon, 21 Feb 2011 10:17:03 -0500
 From: pri...@gmail.com
 To: sqlite-users@sqlite.org
 Subject: Re: [sqlite] COUNT() extremely slow first time!
 
 This is a common issue on the mailing list. The first time you do
 count(*), SQLite (actually your OS) has to load data into memory.
 Most OS's will keep the file in a buffer cache, so the 2nd count(*)
 doesn't have to read from disk.
 
 Here's a timing from my own system, after a purge command to clear the
 buffer cache:
 
 $ time sqlite3 hb.db 'select count(*) from logs'
 -- Loading resources from /Users/jim/.sqliterc
 count(*)
 --
 734909
 
 real 0m0.580s
 user 0m0.190s
 sys 0m0.034s
 
 Same command again, with the file cached:
 
 $ time sqlite3 hb.db 'select count(*) from logs'
 -- Loading resources from /Users/jim/.sqliterc
 count(*)
 --
 734909
 
 real 0m0.189s
 user 0m0.165s
 sys 0m0.019s
 
 This time is consistent no matter how many times I run it, because the
 file is still cached. Doing a purge command to clear the cache and
 re-running the query, we get:
 
 $ purge
 $ time sqlite3 hb.db 'select count(*) from logs'
 -- Loading resources from /Users/jim/.sqliterc
 count(*)
 --
 734909
 
 real 0m0.427s
 user 0m0.175s
 sys 0m0.024s
 
 On my system, there is not a huge difference, but it is consistent.
 Now, if you have a fragmented file system, you will see a much larger
 difference. There are many posts on the mailing list about both file
 system fragmentation and logical fragmentation within the SQLite file
 itself. Your first count(*) is subject to these fragmentation
 effects, while your 2nd usually is not, because the file is in memory.
 
 Some people on the list believe fragmentation is an unimportant detail
 you shouldn't worry about, because you have little control over it.
 That may be true, but it's useful to understand how it can affect
 performance. I think you are seeing this first hand.
 
 Jim
 --
 HashBackup: easy onsite and offsite Unix backup
 http://www.hashbackup.com
 
 
 
 On Mon, Feb 21, 2011 at 9:37 AM, Sven L larvpo...@hotmail.se wrote:
 
  Same result :(
  Note that I have compiled SQLite with the following switches:
  SQLITE_ENABLE_STAT2
  SQLITE_THREADSAFE=2
 
  I've learnt that COUNT(*) is slower than COUNT(ID), since * means the 
  engine has to traverse all columns and it might even return another value 
  if there are NULL-values...
 
  Also, this is quite interesting:
 
   sqlite> EXPLAIN QUERY PLAN SELECT COUNT(ItemID) FROM Items;
   0|0|0|SCAN TABLE Items (~100 rows)
   sqlite> EXPLAIN QUERY PLAN SELECT COUNT(*) FROM Items;
   sqlite>
 
  I would expect an index scan on the first statement. The second statement 
  tells me nada?!
 
  Thanks for your help!
 
 
  From: slav...@bigfraud.org
  Date: Mon, 21 Feb 2011 14:24:50 +
  To: sqlite-users@sqlite.org
  Subject: Re: [sqlite] COUNT() extremely slow first time!
 
 
  On 21 Feb 2011, at 2:23pm, Sven L wrote:
 
   SELECT COUNT(ItemID) FROM Items;
  
   This takes around 40 seconds the first time! WHY?!
 
  Try again, doing everything identically except that instead of the above 
  line use
 
  SELECT COUNT(*) FROM Items;
 
  Simon.


Re: [sqlite] COUNT() extremely slow first time!

2011-02-21 Thread Sven L

Thanks :)
 
This did the trick:
First make a copy of the database: copy HugeDatabase.db HugeDatabase_copy.db
Then for each run, replace the database with its copy. This is why I thought 
the COUNT operation was somehow written to the database after its first run... 
:P

 From: slav...@bigfraud.org
 Date: Mon, 21 Feb 2011 16:56:01 +
 To: sqlite-users@sqlite.org
 Subject: Re: [sqlite] COUNT() extremely slow first time!
 
 
 On 21 Feb 2011, at 4:35pm, Jim Wilcoxson wrote:
 
  On Mon, Feb 21, 2011 at 11:05 AM, Sven L larvpo...@hotmail.se wrote:
  
  Thank you for your detailed explanation!
  First, can you please tell me how to purge the cache in Windows 7? This 
  could be very useful for my tests!
  
  Sorry, dunno for Windows. On Mac OSX it is the purge command, in the
  development tools.
 
 In current versions it's just /usr/bin/purge, which should be in your path. 
 No idea why a non-programmer should need it, but there it is.
 
   On Linux, you do: echo 3 > /proc/sys/vm/drop_caches
 
 And in Windows it's ... almost impossible. You can sync to disk, using 
 fflush(), and there's no reason you can't do this from the command-line:
 
 http://technet.microsoft.com/en-us/sysinternals/bb897438.aspx
 
 But for a purge you have to invalidate the contents of cache, and there's no 
 way for an application to tell Windows to do this. For a start, each 
 application has its own user mode address cache, so if you run another 
 utility to flush cache, it just messes with its own space. Second, there's no 
 system call that allows you access to the cache because Microsoft considers 
 it private to the device level. I am not dissing Microsoft for this: there 
 are fair arguments that this is actually the Right Way to do it.
 
 So the way to do it is to overrun the cache by yourself. If you know your 
 cache is 2Gig, find a 2Gig disk file that has nothing to do with your test 
 suite and read it. Or generate 2Gig of gibberish and write it to disk, then 
 delete that file. Pah.
 
 Simon.
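Simon's "overrun the cache yourself" workaround can be sketched as follows (function name, chunk size, and the zero-fill data are mine; whether the OS actually evicts other cached pages is best-effort and not guaranteed — in practice you would pass a value around your RAM size):

```python
import os
import tempfile

def overrun_file_cache(nbytes, chunk=1 << 20):
    """Write nbytes of scratch data and read it back so it displaces
    other cached pages, then delete the scratch file. Best-effort
    only: the OS is free not to evict anything."""
    fd, path = tempfile.mkstemp()
    written = read = 0
    try:
        with os.fdopen(fd, "wb") as f:
            while written < nbytes:
                n = min(chunk, nbytes - written)
                f.write(b"\0" * n)
                written += n
        with open(path, "rb") as f:
            while True:
                block = f.read(chunk)
                if not block:
                    break
                read += len(block)
    finally:
        os.remove(path)
    return written, read
```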


[sqlite] Question regarding SQLITE_CANTOPEN

2011-02-17 Thread Sven L

sqlite3* db = NULL;
if (sqlite3_open_v2(path, &db, SQLITE_OPEN_READWRITE, NULL) == SQLITE_CANTOPEN)
{
// Ok, database does not exist. Still, db != NULL...?
ASSERT(db != NULL);
}
 
Database cannot be opened, but we get an open handle to it... Please explain!
 
Thanks.   
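For what it's worth, the sqlite3_open_v2() documentation says a connection handle is generally returned even when the open fails, precisely so the error message can be read with sqlite3_errmsg(); the handle must still be passed to sqlite3_close(). The same open-without-create check looks like this in Python's sqlite3 module (URI filenames; the missing-file name is a placeholder):

```python
import sqlite3

def open_existing(path):
    """Open a database read-write but fail if the file does not
    exist, mirroring sqlite3_open_v2 without SQLITE_OPEN_CREATE."""
    # mode=rw (no create) makes connect raise for a missing file.
    return sqlite3.connect(f"file:{path}?mode=rw", uri=True)

try:
    open_existing("definitely_missing_example.db")
    opened = True
except sqlite3.OperationalError:
    opened = False
```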


Re: [sqlite] SQLite version 3.7.6 testing

2011-02-13 Thread Sven L

Lol... we just had a long discussion about sqlite_stat2 and the ANALYZE command.
I removed these from my code, since the command should not be used except for 
rare cases :P
 
Should I re-add my ANALYZE-code and ask my original question again? Confused :P
 
 From: d...@sqlite.org
 Date: Sat, 12 Feb 2011 10:27:40 -0500
 To: sqlite-...@sqlite.org; sqlite-users@sqlite.org
 Subject: [sqlite] SQLite version 3.7.6 testing
 
 The scheduled release of SQLite version 3.7.6 is mid-April. We are still
 two months away. However, version 3.7.6 will contain a lot of pent-up
 changes and so your advance testing of this release will be appreciated.
 
 The latest pre-3.7.6 code can be found here:
 
 http://www.sqlite.org/draft/download.html
 
 And sketchy information about the changes added (so far) can be seen here:
 
 http://www.sqlite.org/draft/releaselog/3_7_6.html
 
 Of particular concern in 3.7.6 are some tweaks we've done to the query
 planner. The query planner is the part of SQLite that decides on the best
 and fastest algorithm to use to satisfy your queries. SQLite uses a
 cost-based query planner. In other words, it tries to estimate the CPU and
 disk I/O load used by various alternative algorithms and it selects the
 algorithm with the lowest estimated cost. The problem with this approach is
 that the cost really is an estimate based on incomplete information, and so
 it sometimes comes out wrong. The cost functions for 3.7.6 have been
 enhanced to be more accurate (we believe) particularly if SQLite is compiled
 using SQLITE_ENABLE_STAT2 and you run ANALYZE on your database files to
 gather statistics. But there could be instances where the cost estimates
 are not as good as before, resulting in performance regressions.
 
 So, if you have the ability to download an early snapshot of SQLite 3.7.6
 and compile it into your application for performance testing purposes, we
 would really appreciate it if you would do so. And do so sooner rather than
 later so that we can have plenty of time to fix any issues you discover prior
 to the 3.7.6 release. If the query planner in 3.7.6 improves the
 performance of your application, we'd like to hear about that too.
 
 I'll try to post snapshots of 3.7.6 on a regular basis so that you can
 download and compile a fairly recent version at any time. Please pester me
 if I forget.
 
 Thanks for your help.
 -- 
 D. Richard Hipp
 d...@sqlite.org


Re: [sqlite] Regarding Manual Control Of Query Plans

2011-02-10 Thread Sven L

I remember why I added the sqlite_stat2 flag together with the ANALYZE command. 
It makes certain ill-formed queries efficient.
 
CREATE TABLE IF NOT EXISTS TC -- 10 rows
(
C INTEGER PRIMARY KEY,
Y CHAR(255) NOT NULL UNIQUE
);

CREATE TABLE IF NOT EXISTS TB -- 100 rows
(
B INTEGER PRIMARY KEY, -- primary key!
X CHAR(255) NOT NULL UNIQUE
);
 
CREATE TABLE IF NOT EXISTS TA -- 10 rows
(
A INTEGER PRIMARY KEY,
C INTEGER NOT NULL,
B INTEGER NOT NULL,

UNIQUE (C, B) -- unique together with C!
);

Now, consider this query:
SELECT * FROM TB NATURAL JOIN TA; -- slow
vs
SELECT * FROM TA NATURAL JOIN TB; -- fast
 
That's why I needed the statistics. Otherwise, SQLite failed to select the best 
index. I understand it would choose the primary key in this case, which is of 
course wrong.
 
(I haven't tested the above for a while now, but I remember the problem I 
encountered had something to do with those tables.)
 
Regards

 
 From: slav...@bigfraud.org
 Date: Tue, 8 Feb 2011 16:53:52 +
 To: sqlite-users@sqlite.org
 Subject: Re: [sqlite] Regarding Manual Control Of Query Plans
 
 
 On 8 Feb 2011, at 4:22pm, Sven L wrote:
 
  Thank you very much for your detailed explanation!
 
 You're welcome. I've actually never had to think out this feature of SQLite 
 before, so it was interesting for me too. I hope Richard or the rest of the 
 team will correct me if I got anything wrong.
 
  I will comment out my calls to ANALYZE, and see how my software performs.
  
  The reason why I added it in the first place is that users are allowed to 
  create their own queries, and since not all of them are SQL experts, I 
  wanted the engine to be as tolerant as possible. Perhaps it's a better idea 
  to simply crave decent SQL!
 
 Allowing users to make up their own queries on the fly does make things a 
 little more difficult as you worked out: you can't pre-make good indexes. The 
 big server/client databases cache temporary indexes (and share them between 
 users), so they handle unexpected queries far better: if any query comes in 
 that doesn't suit any indexes it simply makes up a new temporary index and 
 keeps it in case it's needed later. SQLite can't do this because it's 
 designed for a tiny footprint and can't chew up lots of memory or disk space 
 without a good reason.
 
 But your initial questions did sound a little like premature optimisation and 
 I think you'll get decent results without worrying too much about it. I'm 
 sure the overwhelming number of SQLite users have never used ANALYZE even 
 once.
 
 Simon.


Re: [sqlite] Regarding Manual Control Of Query Plans

2011-02-08 Thread Sven L

Is it reasonable to guess that sqlite_stat* are used only for NATURAL JOINs and 
other not-so-obvious joins?
 
In my software, the database sometimes grows up to 1-2 GB immediately due to 
the nature of the application. Hence, the guidelines for the ANALYZE command do 
not suffice for me. "Do it once every x months", they say. The user chooses to 
import huge amounts of data on a regular basis. He is also able to run custom 
queries, which is why I found the sqlite_stat tables useful.
 
 From: slav...@bigfraud.org
 Date: Mon, 7 Feb 2011 22:43:13 +
 To: sqlite-users@sqlite.org
 Subject: Re: [sqlite] Regarding Manual Control Of Query Plans
 
 
 On 7 Feb 2011, at 9:37pm, Sven L wrote:
 
  Thanks for input. So you're saying that creating the tables manually is not 
  the same as issuing the ANALYZE command?
 
 You have no idea what the ANALYZE command does because it's not documented. 
 For all you know different versions of SQLite put different things in this 
 table. So don't try to simulate it, let SQLite do it. It's hardly difficult: 
 just issue that one instruction. As I wrote upthread, you might want to do it 
 about as often as you'd run the integrity check routine.
 
 By the way you may be overestimating the impact of doing this. It will matter 
 only in situations where the query optimizer has to guess how 
 best to do its searches. If you've created good indexes and specified your 
 WHERE and ORDER BY clauses well, use of this table will be very low. There is 
 definitely little point in doing this until the particular installation has 
 lots of data in its tables ... perhaps after months of having had data 
 entered.
 
 Simon.


Re: [sqlite] Regarding Manual Control Of Query Plans

2011-02-08 Thread Sven L

Thank you very much for your detailed explanation!
I will comment out my calls to ANALYZE, and see how my software performs.
 
The reason why I added it in the first place is that users are allowed to 
create their own queries, and since not all of them are SQL experts, I wanted 
the engine to be as tolerant as possible. Perhaps it's a better idea to simply 
demand decent SQL!

 
 From: slav...@bigfraud.org
 Date: Tue, 8 Feb 2011 15:16:58 +
 To: sqlite-users@sqlite.org
 Subject: Re: [sqlite] Regarding Manual Control Of Query Plans
 
 
 On 8 Feb 2011, at 2:39pm, Sven L wrote:
 
  Is it reasonable to guess that sqlite_stat* are used only for NATURAL JOINs 
  and other not-so-obvious joins?
 
 No. Consider this:
 
 SELECT * FROM myTable WHERE a=104 AND b=213
 
 Suppose there are two indexes on myTable: one indexes only column a, and the 
 other indexes only column b. The query optimizer has to choose between them. 
 Which index would it be best to use ?
 
 The answer depends on the chunkiness of each column. If only three different 
 values ever appear in column a, but column b can have a thousand different 
 values, then it will be more efficient to use the index on column b. This 
 will return fewer rows which have to be scanned one-by-one for values in a.
 
 But you're a good programmer. Knowing that you were going to have SELECT 
 commands like that one above you chose to create an index on both columns 
 (either order, it doesn't matter). Since you have done this, the results that 
 ANALYZE gathers don't matter at all ! The query optimizer finds the good 
 index and never has to choose between two bad indexes because it has the one 
 perfect index which will always be best.
 
 So generally speaking ANALYZE matters only if the query optimizer has two or 
 more equally bad options. If you do your job as a database designer well it 
 will never need to consider chunkiness. The above explanation is simplified 
 but gives you the general idea.
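Simon's two-index scenario is easy to check with EXPLAIN QUERY PLAN. Below is a minimal sketch using Python's built-in sqlite3 module; the table and index names are invented for illustration. Note that no ANALYZE is run at all, yet the planner still picks the composite index, because it alone satisfies both equality constraints:

```python
import sqlite3

# Hypothetical schema: two single-column indexes the optimizer would
# have to choose between, plus the "one perfect index" on (a, b).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE myTable (a INTEGER, b INTEGER, payload TEXT);
    CREATE INDEX idx_a  ON myTable(a);
    CREATE INDEX idx_b  ON myTable(b);
    CREATE INDEX idx_ab ON myTable(a, b);
""")
con.executemany("INSERT INTO myTable VALUES (?, ?, 'x')",
                [(i % 3, i % 1000) for i in range(5000)])

# No ANALYZE has been run, yet the composite index wins anyway,
# because it matches both equality constraints by itself.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM myTable WHERE a=104 AND b=213"
).fetchall()
print(plan[0][3])  # the "detail" column should name idx_ab
```

The exact wording of the plan detail varies between SQLite versions, but the chosen index is named in it either way.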
 
  In my software, the database sometimes grows up to 1-2 GB immediately due 
  to the nature of the application. Hence, the guidelines for the ANALYZE 
  command do not suffice for me: "do it once every x months," they say. User 
  chooses to import huge amounts of data on regular basis. He is also able to 
  run custom queries, which is why I found the sqlite_stat-tables useful.
 
 Doesn't matter. This new data won't change the relative chunkiness of the 
 values in the columns. Though the precise numbers change, the /relative/ 
 chunkiness probably won't, so the choice of indexes probably won't need to 
 change either. Even less likely that it'll change by enough to change which 
 index the query optimiser should pick.
 
 Doing another ANALYZE only really matters if the character of your data 
 changes, which is usually when a column that used to have very big chunks 
 suddenly has tiny chunks. The only time this happens is when a business 
 changes what it does in a significant way: going from having five product 
 lines to 500, or from selling in three countries to selling in thirty 
 countries. After they've been putting in new data reflecting this new usage 
 for a few months, /then/ it might be useful to run ANALYZE again.
 
 Even then, the only thing you're changing is what the query optimizer chooses 
 as the best index. It might make the wrong decision and take 12ms for a 
 SELECT instead of 4ms. Not really a huge problem: if a delay of 8ms is 
 mission-critical you're probably using hardware at the edge of its 
 capabilities.
 
 Simon.


[sqlite] Regarding Manual Control Of Query Plans

2011-02-07 Thread Sven L

I've discovered that the ANALYZE command is extremely slow on my large 
databases. I read the documentation and it says that the command should be 
issued when the schema is changed a lot. Hmm, when exactly could that be? 
Currently, I issue the command every time I have inserted ~1000 rows in my main 
table.
 
Now I'm thinking of using precomputed sqlite_stat1 and sqlite_stat2 tables, 
since the relationship between my tables is pretty much (say 95%) constant in 
all database instances. I'm wondering the same as John did here: 
http://osdir.com/ml/sqlite-users/2009-02/msg00298.html

1) Will this work?
2) Is it dangerous? e.g. could this completely confuse the query optimiser, 
or is it the case that as long as the hard coded values are realistic it 
doesn't matter that they don't reflect the reality of the table (which is what 
the docs imply)?
3) Has anyone experience of trying this?

Thanks!   
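For anyone curious what ANALYZE actually writes, here is a small sketch using Python's built-in sqlite3 module (the schema is hypothetical). sqlite_stat1 holds one row per index: the table name, the index name, and a stat string whose first number is the table's row count, followed by the average number of rows per distinct key prefix. The exact format is version-dependent, so treat it as opaque:

```python
import sqlite3

# Hypothetical table with one index; 1000 rows over 5 distinct values.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE main_tbl (id INTEGER PRIMARY KEY, grp INTEGER)")
con.execute("CREATE INDEX idx_grp ON main_tbl(grp)")
con.executemany("INSERT INTO main_tbl(grp) VALUES (?)",
                [(i % 5,) for i in range(1000)])
con.execute("ANALYZE")

# One row per index: 1000 rows total, 5 distinct grp values,
# so the average rows per value is 200.
rows = con.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()
for tbl, idx, stat in rows:
    print(tbl, idx, stat)  # e.g. main_tbl idx_grp 1000 200
```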


Re: [sqlite] Regarding Manual Control Of Query Plans

2011-02-07 Thread Sven L

Sounds reasonable!
So, with this in mind, it makes sense to precompute the sqlite_stat* tables. 
Right?
 
 From: slav...@bigfraud.org
 Date: Mon, 7 Feb 2011 15:33:50 +
 To: sqlite-users@sqlite.org
 Subject: Re: [sqlite] Regarding Manual Control Of Query Plans
 
 
 On 7 Feb 2011, at 3:12pm, Sven L wrote:
 
  I've discovered that the ANALYZE command is extremely slow on my large 
  databases. I read the documentation and it says that the command should be 
  issued when the schema is changed a lot. Hmm, when exactly could that be? 
  Currently, I issue the command every time I have inserted ~1000 rows in my 
  main table.
 
 You should never need the ANALYZE command in a routine your users run. If 
 your first 1000 rows are representative of the 'chunkiness' of your data, do 
 it once, then there's no need to issue that command ever again.
 
 The results are used only to help the query analyser choose which order to 
 consider columns in. In order to do this it's useful to know things like The 
 column called 'currency' is almost always the same value, whereas the column 
 called 'totalAmount' changes a lot.. That's what that command does. If you 
 radically changed your data by, for example, moving from a national to an 
 international service then it might be worth running ANALYZE again once you 
 had a bunch of invoices in other currencies in, but only then, and only once.
 
 Simon.


Re: [sqlite] Regarding Manual Control Of Query Plans

2011-02-07 Thread Sven L

Yes, this is what I have read on the net too. But, tests show that without the 
sqlite_stat* tables, my queries are ridiculously slow (probably unwanted table 
scans, etc.).
Real-life data... can't I simply take my real-life database and extract the 
data from sqlite_stat*?
 
Btw, this command produces nothing even though the table does contain several 
rows: .dump sqlite_stat2

Thanks
 
 From: slav...@bigfraud.org
 Date: Mon, 7 Feb 2011 16:44:00 +
 To: sqlite-users@sqlite.org
 Subject: Re: [sqlite] Regarding Manual Control Of Query Plans
 
 
 On 7 Feb 2011, at 4:38pm, Sven L wrote:
 
  So, with this in mind, it makes sense to precompute the sqlite_stat* 
  tables. Right?
 
 Which you do by running ANALYZE, but since it needs real-life data to work on 
 there's no point doing it until your customer has put some data in. I don't 
 write this type of application any more, but I might put it in a maintenance 
 routine -- some obscure menu option near the Config Preferences or something. 
 Run it as part of your yearly maintenance procedure, after you've run 'PRAGMA 
 integrity_check'.
 
 Simon.


Re: [sqlite] Regarding Manual Control Of Query Plans

2011-02-07 Thread Sven L

Thanks for input. So you're saying that creating the tables manually is not the 
same as issuing the ANALYZE command?
 
However, .dump sqlite_stat2 does NOT work. And I have been using the 
SQLITE_ENABLE_STAT2 switch for decades now :P
This is what this command produces:
 
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
COMMIT;

-
A bug?
 
 
 
 From: d...@sqlite.org
 Date: Mon, 7 Feb 2011 14:42:06 -0500
 To: sqlite-users@sqlite.org
 Subject: Re: [sqlite] Regarding Manual Control Of Query Plans
 
 On Mon, Feb 7, 2011 at 1:48 PM, Sven L larvpo...@hotmail.se wrote:
 
 
  Yes, this is what I have read on the net too. But, tests show that without
  the sqlite_stat* tables, my queries are ridiculously slow (probably unwanted
  table scans etc).
  Real-life data... can't I simply take my real-life database and extract the
  data in sqlite_stat*?
 
 
 You can take a sample database that is representative of what to expect for
 most of your users, run ANALYZE, then do ".dump sqlite_stat1". Save the
 output. Then when creating a new database for a user, after doing all your
 CREATE TABLE and CREATE INDEX statements, but before adding actual content,
 run ANALYZE there. The ANALYZE command will take a millisecond or so since
 there is no data.
 
 Then run the INSERT INTO sqlite_stat1 statements you saved in order to
 fill the stat1 table with content from your sample database.
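The same recipe, sketched with Python's built-in sqlite3 module (the schema here is invented for illustration). The only addition to the steps above is clearing sqlite_stat1 before the replay, in case the empty-database ANALYZE left rows behind:

```python
import sqlite3

# 1. A sample database representative of real deployments, ANALYZEd once.
sample = sqlite3.connect(":memory:")
sample.execute("CREATE TABLE t (a INTEGER, b TEXT)")
sample.execute("CREATE INDEX t_a ON t(a)")
sample.executemany("INSERT INTO t VALUES (?, 'x')",
                   [(i % 10,) for i in range(10000)])
sample.execute("ANALYZE")
saved = sample.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()

# 2. A fresh database: schema only, then an (instant) ANALYZE so that
#    SQLite creates the sqlite_stat1 table for us.
fresh = sqlite3.connect(":memory:")
fresh.execute("CREATE TABLE t (a INTEGER, b TEXT)")
fresh.execute("CREATE INDEX t_a ON t(a)")
fresh.execute("ANALYZE")

# 3. Replay the saved statistics before any real content arrives.
#    sqlite_stat1 is an ordinary writable table, so plain INSERTs work.
fresh.execute("DELETE FROM sqlite_stat1")
fresh.executemany("INSERT INTO sqlite_stat1 VALUES (?, ?, ?)", saved)
restored = fresh.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()
```

From this point the fresh database's query planner makes the same index choices as the sample one would.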
 
 
 
  Btw, this command produces nothing even though the table does contain
  several rows: .dump sqlite_stat2
 
 
 Compile with SQLITE_ENABLE_STAT2 in order to get the stat2 information.
 stat2 is a 10-sample histogram of the content of each index. This enables
 SQLite to make smarter choices about when to use index lookup and when to do
 table scans. For example:
 
 SELECT * FROM tab WHERE x=0;
 
 The above might do a table scan if a majority of the tab.x values are 0.
 But if only 1% of the tab.x values are 0, then it will use an index.
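Without a STAT2 build (or STAT4 in later SQLite versions), the planner sees only the per-index average stored in sqlite_stat1, which is exactly the blind spot the histogram covers. A sketch with Python's built-in sqlite3 module, whose bundled SQLite typically lacks the histogram:

```python
import sqlite3

# Skewed data like the example above: 99% of the rows have x=0,
# the remainder spread over 100 other values.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tab (x INTEGER, payload TEXT)")
con.execute("CREATE INDEX tab_x ON tab(x)")
con.executemany("INSERT INTO tab VALUES (?, 'p')",
                [(0,)] * 9900 + [(i,) for i in range(1, 101)])
con.execute("ANALYZE")

(stat,) = con.execute(
    "SELECT stat FROM sqlite_stat1 WHERE idx='tab_x'").fetchone()
print(stat)  # "<total rows> <average rows per distinct x>"
# The average (roughly 99) says nothing about x=0 matching 9900 rows;
# that per-value detail is what the stat2 histogram records.
```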
 
 
 
  Thanks
 
   From: slav...@bigfraud.org
   Date: Mon, 7 Feb 2011 16:44:00 +
   To: sqlite-users@sqlite.org
   Subject: Re: [sqlite] Regarding Manual Control Of Query Plans
  
  
   On 7 Feb 2011, at 4:38pm, Sven L wrote:
  
So, with this in mind, it makes sense to precompute the sqlite_stat*
  tables. Right?
  
   Which you do by running ANALYZE, but since it needs real-life data to
  work on there's no point doing it until your customer has put some data in.
  I don't write this type of application any more, but I might put it in a
  maintenance routine -- some obscure menu option near the Config Preferences
  or something. Run it as part of your yearly maintenance procedure, after
  you've run 'PRAGMA integrity_check'.
  
   Simon.
 
 
 
 
 -- 
 D. Richard Hipp
 d...@sqlite.org