Re: [sqlite] limiting database size

2005-07-11 Thread Sean Heber
I want to put a limit like 10mb, then later on be able to change this
database limit to 15mb if I want to. Is this possible?


You could use auto_vacuum (so the data file shrinks when data is  
deleted) and check the file size so you know when it is full.
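In sketch form (using Python's sqlite3 module; the path, table, and helper name are invented for illustration), that approach might look like:

```python
import os
import sqlite3
import tempfile

# Hypothetical location; in a real app this is your database file path.
DB_PATH = os.path.join(tempfile.mkdtemp(), "limited.db")
SIZE_LIMIT = 10 * 1024 * 1024      # 10 MB now; raise to 15 MB later if desired

conn = sqlite3.connect(DB_PATH)
# auto_vacuum must be set before the first table is created to take effect.
conn.execute("PRAGMA auto_vacuum = 1")
conn.execute("CREATE TABLE log (msg TEXT)")
conn.commit()

def insert_if_room(msg):
    """Insert only while the data file is still under the size limit."""
    if os.path.getsize(DB_PATH) >= SIZE_LIMIT:
        return False
    conn.execute("INSERT INTO log (msg) VALUES (?)", (msg,))
    conn.commit()
    return True
```

With auto_vacuum on, deletes shrink the file, so the size check reflects actual usage; raising SIZE_LIMIT later needs no change to the database itself.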


l8r
Sean




[sqlite] Write performance

2005-06-17 Thread Sean Heber
SQLite write (insert) performance doesn't seem very good to me, so I'm  
trying to find ways to improve it in my application.  I had this idea,  
so I figured I'd run it past you folks, who likely know better how the  
db works under the hood.


My thought is that perhaps the write performance suffers because the  
data is written into the data file in some particular sorted order.  
Thus SQLite sometimes needs to re-order what is already there to  
insert content into the middle of the file, rather than simply  
appending new content onto the end.


I've been inserting data into a table with a non-integer primary  
key.  The data does not arrive in sorted order.  My thought is that  
performance suffers because SQLite must rewrite huge chunks of the  
data file to slot each new row into its correct position, based on  
the sort order of the primary key.  Would I get a performance  
improvement if I gave my table an auto_increment integer primary key  
and instead added an index to the column I'm currently using as the  
primary key?  The thinking is that rows would then be stored in  
numerical order of the auto_increment column, so new data would  
usually be appended at or near the end of the data file, which should  
be a much faster operation than putting something into the middle of  
a large (and growing) file.
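For concreteness, the two layouts I'm comparing look like this (a sketch in Python's sqlite3 module; table and column names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Current layout: the non-integer key is the primary key.
conn.execute("CREATE TABLE current_t (key TEXT PRIMARY KEY, payload BLOB)")

# Proposed layout: an auto-assigned integer primary key (SQLite's rowid),
# with the old key demoted to an ordinary indexed column.
conn.execute(
    "CREATE TABLE proposed_t ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " key TEXT,"
    " payload BLOB)")
conn.execute("CREATE INDEX idx_key ON proposed_t (key)")

cur = conn.execute(
    "INSERT INTO proposed_t (key, payload) VALUES (?, ?)", ("abc", b"data"))
first_id = cur.lastrowid   # ids are assigned in increasing order
```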


Does that make any sense at all or am I reaching too far here?  :-)

l8r
Sean



Re: [sqlite] Write performance

2005-06-17 Thread Sean Heber

SQLite write (insert) performance doesn't seem very good to me so I'm
trying to find ways to improve it in my application.  I had this idea
so I figured I'd run it past you folks who likely better know how the
db works under the hood.



did you wrap your inserts in a transaction?


I don't think I can do that, because the inserts arrive at random  
intervals, generated by outside influences.  I'm not just doing an  
import of existing data; the data is created on the fly by factors  
somewhat outside of my control.  I tried making an in-memory  
temporary table, inserting into that, and periodically running a  
transaction that does an insert into real_table select from  
memory_table in bulk, but that didn't seem to make a significant  
difference.  Perhaps I've got something else wrong.
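In sketch form, the batching I'm describing looks something like this (Python's sqlite3 module; table and function names are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # stand-in for the real database file
conn.execute("CREATE TABLE events (key TEXT PRIMARY KEY, payload BLOB)")

pending = []   # rows arriving at unpredictable times are buffered here

def enqueue(key, payload):
    pending.append((key, payload))

def flush():
    # One transaction per batch amortizes the per-commit disk sync
    # over many rows instead of paying it on every single insert.
    with conn:   # emits BEGIN ... COMMIT around the whole batch
        conn.executemany(
            "INSERT INTO events (key, payload) VALUES (?, ?)", pending)
    pending.clear()

for i in range(100):
    enqueue("key-%d" % i, b"x" * 64)
flush()   # call from a timer, or whenever the buffer grows large
```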


Thanks for the suggestion.

l8r
Sean



Re: [sqlite] Write performance

2005-06-17 Thread Sean Heber

I've been inserting data into a table with a non-integer primary
key.  The data does not come in in an ordered fashion.  My thought is
that perhaps performance suffers because it needs to rewrite huge
chunks of the data file in order to slot the new data into its
correct position within the file based on the sorting of the primary
key.



No.  SQLite uses the B-Tree algorithm.  Inserting a new value in
the middle of a table involves rewriting 4 or 5 pages in the worst
case.  The usual case is to rewrite just the one page where the value
is being inserted.


That makes sense, okay.  Perhaps my problem is that my data is too  
big.  Is a page around 1k?  I often insert blob data in the 64k  
range.  Perhaps that is my problem.
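The rough arithmetic, sketched in Python (assuming 1 KB pages were the default of the era; current builds may report a larger page size):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Ask SQLite for the actual page size rather than assuming it.
page_size = conn.execute("PRAGMA page_size").fetchone()[0]

blob = b"\0" * (64 * 1024)   # a 64 KB record, as in the post
# A row larger than a page spills onto a chain of overflow pages,
# roughly len(blob) / page_size of them, all written on insert.
overflow_pages = len(blob) // page_size
```

At a 1 KB page size a single 64 KB blob touches on the order of 64 pages per insert, which dwarfs the 4-or-5-page worst case quoted for the B-tree itself.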


l8r
Sean



Re: [sqlite] Reads and Writes

2005-06-15 Thread Sean Heber

After SQLite obtains a read lock it will have to perform at least the
following:
- check the file header to check for corruption
- check for a hot transaction log, and roll back changes if necessary
- read the database schema from the sqlite_master table

This is the most likely cause of the repetitive behavior you're  
seeing.


Ok, that makes sense.
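One way to exploit that, sketched in Python (path invented): hold several reads inside a single transaction, so the lock-and-verify sequence runs once per batch rather than once per statement.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "app.db")  # hypothetical path
conn = sqlite3.connect(db_path)
conn.isolation_level = None          # manage transactions explicitly
conn.execute("CREATE TABLE IF NOT EXISTS t (id INTEGER PRIMARY KEY)")

# In autocommit mode every SELECT acquires and releases its own read
# lock, repeating the header / hot-journal checks each time.  Grouping
# reads inside one transaction pays that cost once for the batch.
conn.execute("BEGIN")
counts = [conn.execute("SELECT count(*) FROM t").fetchone()[0]
          for _ in range(3)]
conn.execute("COMMIT")
```

The trade-off is that a long-held read transaction blocks writers, so batches should stay short.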



That said: the OS will of course cache the database file as well, and
many of the 'read' calls will be very fast.


I think you're right about that.  After researching it a bit more,  
I've noticed that disk read usage is actually almost zero during all  
of this, which certainly supports your claim.  It is the disk write  
activity that might be slowing things down for me.  I'll have to see  
if I can figure out anything to speed up inserts.  I've tried setting  
synchronous to OFF, but after a time the inserts seem to build up and  
performance degrades to about the same point it is with synchronous  
set to FULL.


Thanks for the info.

l8r
Sean



[sqlite] Reads and Writes

2005-06-14 Thread Sean Heber
My program has a lot of simple select queries.  Most of them are of  
the select count(*) from... variety.  These are very simple queries  
where there is a single WHERE clause and the columns referenced are  
either a primary key column or another indexed column.


I would expect the database to keep the indexes in memory and thus  
have virtually no need to go to disk when doing these count(*)  
queries and such.  Yet when I do an strace on the process, I see  
significant read() and seek() calls on the database file.  What are  
the conditions under which it needs to read from the database file?
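As a sanity check that such queries really are index-driven, EXPLAIN QUERY PLAN can be asked directly (a sketch in Python's sqlite3 module; table and index names invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, tag TEXT)")
conn.execute("CREATE INDEX idx_tag ON items (tag)")

# EXPLAIN QUERY PLAN reports whether the WHERE clause is satisfied
# from an index (little I/O) or by scanning the whole table.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT count(*) FROM items WHERE tag = ?",
    ("widget",)).fetchall()
uses_index = any("idx_tag" in str(row) for row in plan)
```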


My database file is only around 4MB and I have set the  
default_cache_size to 50000.  From what I've read, that should  
translate to almost 50MB of cache size, which would be more than  
enough to keep the entire database in memory, I'd think.  Yet it  
doesn't seem to actually do that, since it is reading from the file  
so often.
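The cache arithmetic can be checked directly (a Python sketch; it uses the per-connection cache_size pragma as a stand-in for the persistent default_cache_size the post sets):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
page_size = conn.execute("PRAGMA page_size").fetchone()[0]

# cache_size is measured in pages, not bytes: with the 1 KB pages
# typical of 2005-era SQLite, 50000 pages is roughly 50 MB of cache.
conn.execute("PRAGMA cache_size = 50000")
cache_pages = conn.execute("PRAGMA cache_size").fetchone()[0]
cache_bytes = cache_pages * page_size
```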


I've seen this on OSX and Linux but this strace output is from Linux:
Snippet of strace output:

fcntl64(5, F_SETLK64, {type=F_RDLCK, whence=SEEK_SET, start=1073741824, len=1}, 0xbfffdb00) = 0
fcntl64(5, F_SETLK64, {type=F_RDLCK, whence=SEEK_SET, start=1073741826, len=510}, 0xbfffdb00) = 0
fcntl64(5, F_SETLK64, {type=F_UNLCK, whence=SEEK_SET, start=1073741824, len=1}, 0xbfffdb00) = 0

access("/db/data-journal", F_OK) = -1 ENOENT (No such file or directory)
fstat64(5, {st_mode=S_IFREG|0644, st_size=4402176, ...}) = 0
_llseek(5, 0, [0], SEEK_SET) = 0
read(5, "SQLite format 3\0\4\0\1\1\0@  \0\0\4\200\0\0\0\0"..., 1024) = 1024

_llseek(5, 3072, [3072], SEEK_SET) = 0
read(5, "\2\2]\0\22\1\214\6\0\0\16%\2\323\2\265\2z\2\361\3\245\3"..., 1024) = 1024

_llseek(5, 395264, [395264], SEEK_SET) = 0
read(5, "\n\0\0\0$\0\\\0\0\\\0v\0\217\0\251\0\303\0\335\0\367\1"..., 1024) = 1024
fcntl64(5, F_SETLK64, {type=F_UNLCK, whence=SEEK_SET, start=0, len=0}, 0xbfffdfd0) = 0
fcntl64(5, F_SETLK64, {type=F_RDLCK, whence=SEEK_SET, start=1073741824, len=1}, 0xbfffdb60) = 0
fcntl64(5, F_SETLK64, {type=F_RDLCK, whence=SEEK_SET, start=1073741826, len=510}, 0xbfffdb60) = 0
fcntl64(5, F_SETLK64, {type=F_UNLCK, whence=SEEK_SET, start=1073741824, len=1}, 0xbfffdb60) = 0

access("/db/data-journal", F_OK) = -1 ENOENT (No such file or directory)
fstat64(5, {st_mode=S_IFREG|0644, st_size=4402176, ...}) = 0
_llseek(5, 0, [0], SEEK_SET) = 0
read(5, "SQLite format 3\0\4\0\1\1\0@  \0\0\4\200\0\0\0\0"..., 1024) = 1024

_llseek(5, 2048, [2048], SEEK_SET) = 0
read(5, "\5\0\0\0\2\3\364\0\0\0\n\253\3\372\3\364\0\0\0\0\0\0\0"..., 1024) = 1024

_llseek(5, 5120, [5120], SEEK_SET) = 0
read(5, "\2\0\0\0\6\3\277\1\0\0\r\341\3\365\3\352\3\277\3\311\3"..., 1024) = 1024

_llseek(5, 1768448, [1768448], SEEK_SET) = 0
read(5, "\n\0\0\0q\0\370\0\0\370\0\377\1\6\1\f\1\22\1\30\1\36\1"..., 1024) = 1024

_llseek(5, 405504, [405504], SEEK_SET) = 0
read(5, "\5\0\0\0\207\1(\0\0\0\1\207\1(\1-\0012\0017\1\1A\1F\1"..., 1024) = 1024

_llseek(5, 137216, [137216], SEEK_SET) = 0
read(5, "\r\0\0\0\1\1\35\0\1\35\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1024) = 1024

_llseek(5, 293888, [293888], SEEK_SET) = 0
read(5, "\r\0\0\0\2\0\217\0\0\217\3H\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1024) = 1024

_llseek(5, 406528, [406528], SEEK_SET) = 0
read(5, "\5\0\0\0|\1\30\0\0\0\n]\1\30\1\36\1$\1*\0010\0016\1\1"..., 1024) = 1024

_llseek(5, 1178624, [1178624], SEEK_SET) = 0
read(5, "\r\0\0\0\2\0\353\0\0\353\3\221\0\0\0\0\0\0\0\0\0\0\0\0"..., 1024) = 1024

_llseek(5, 2127872, [2127872], SEEK_SET) = 0
read(5, "\r\0\0\0\3\0s\0\3\221\0s\0\342\0\0\0\0\0\0\0\0\0\0\0\0"..., 1024) = 1024


etc

There is more or less a pattern that is very similar to the snippet  
above that repeats over and over in the trace.  It sort of seems like  
it is reading a header or something in the database file over and  
over again.


Additional notes:
I set the following pragmas in this order when I create/open the  
database file (which only happens once when the app loads):

  PRAGMA auto_vacuum = 1
  PRAGMA temp_store = MEMORY
  PRAGMA synchronous = OFF
  PRAGMA default_cache_size = 50000

Thanks,
Sean