Re: [sqlite] Re: Threads

2007-11-14 Thread spaminos-sqlite
- Original Message 
> From: John Stanton <[EMAIL PROTECTED]>
> To: sqlite-users@sqlite.org
> Sent: Wednesday, November 14, 2007 11:43:48 AM
> Subject: Re: [sqlite] Re: Threads
> 
> If your machine has a single disk it fundamentally does not have parallel 
> I/O.  If you have a machine with multiple disk spindles and multiple 
> channels then you can have parallel access.  Multiple Sqlite databases 
> residing on the same disk are accessed sequentially because the access 
> depends upon the disk head positioning.

> If you have a multiple processor machine or a multiple core processor 
> then you have some parallel computing ability.


It is worth adding that while a disk can only perform one operation at a time, 
modern disks have NCQ, which lets them reduce seek times, for example by using 
an elevator algorithm.
The OS also performs similar optimizations when queuing up several I/O 
requests to the same device.
 
So yes, it's possible to increase throughput (and keep latency in check) by 
running several queries in parallel on the same db, even one that resides in a 
single file on a single disk, especially in a scenario where the large 
majority of the accesses are reads.
 
Also, as you mentioned, the CPU cost of executing a query is not negligible 
(complex queries in particular can use quite a lot of CPU), so if the host has 
multiple cores, running queries in parallel will also make things a little 
quicker.
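For illustration only, here is roughly the setup I have in mind, as a sketch 
(one connection per thread, all pointing at the same file, each doing 
read-only queries; the file, table and column names are made up):

/* Sketch: N threads, each with its own connection, reading in parallel. */
#include <pthread.h>
#include <stdio.h>
#include <sqlite3.h>

static void *reader(void *arg)
{
    sqlite3 *db;
    sqlite3_stmt *stmt;

    if (sqlite3_open((const char *)arg, &db) != SQLITE_OK)
        return NULL;
    if (sqlite3_prepare_v2(db, "SELECT uid FROM sn2uid WHERE sn = ?",
                           -1, &stmt, NULL) == SQLITE_OK) {
        sqlite3_bind_text(stmt, 1, "smith", -1, SQLITE_STATIC);
        while (sqlite3_step(stmt) == SQLITE_ROW)
            printf("uid=%d\n", sqlite3_column_int(stmt, 0));
        sqlite3_finalize(stmt);
    }
    sqlite3_close(db);
    return NULL;
}

int main(void)
{
    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, reader, (void *)"test.db");
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return 0;
}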
 
Nicolas

Re: [sqlite] 3.5.0 alpha TryEnterCriticalSection linker error

2007-09-04 Thread spaminos-sqlite
- Original Message 
> From: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> To: sqlite-users@sqlite.org
> Sent: Tuesday, September 4, 2007 3:32:38 PM
> Subject: Re: [sqlite] 3.5.0 alpha TryEnterCriticalSection linker error
> 
> 
> =?ISO-8859-1?Q?Daniel_=D6nnerby?= <[EMAIL PROTECTED]> wrote:
> > Hi all!
> > 
> > Tried upgrading to the 3.5.0 alpha from 3.4.2 using VS 2005 on XP. I 
> > downloaded the ZIP with preprocessed C code.
> > Compiling SQLite to a .lib was no problem, but when linking it to an 
> > .exe I got the following:
> > sqlite.lib(mutex_w32.obj) : error LNK2019: unresolved external symbol 
> > _TryEnterCriticalSection referenced in function _sqlite3_mutex_try
> > 
> > I made some attempt to fix it checking all kind of possible errors on my 
> > side (defines, compiler/linker settings etc) without any luck.
> > Anyone got the same error?
> > BTW. SQLite 3.4.2 works just fine.
> > 
> > This is not a big deal for me to solve, just thought I share with the 
> > development team.
> > 
> 
> http://www.sqlite.org/cvstrac/chngview?cn=4399
> 
> --
> D. Richard Hipp [EMAIL PROTECTED]
 
Isn't it time to drop the Win9X support from the default build?

I'm thinking that any optimization should be enabled for the majority of users. 
Or, if it's not really an optimization, why keep it in the code at all?

If some people still need to compile for legacy OSes, they can always grab the 
source and compile without those optimizations.

An alternative is to look the function up at runtime with "GetProcAddress" and 
only call it when available (the same approach would work for a lot of other 
modern calls that cannot be used right now).
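A rough sketch of that alternative (just the idea, not what SQLite actually 
does): resolve the symbol at runtime so the binary still loads on Win9x, and 
fall back to the blocking call when it is missing.

#include <windows.h>

typedef BOOL (WINAPI *TryEnterCS_t)(LPCRITICAL_SECTION);

static BOOL my_try_enter(LPCRITICAL_SECTION cs)
{
    static TryEnterCS_t pTryEnter;
    static int looked_up;

    if (!looked_up) {
        HMODULE k32 = GetModuleHandleA("kernel32.dll");
        if (k32)
            pTryEnter = (TryEnterCS_t)GetProcAddress(k32,
                                          "TryEnterCriticalSection");
        looked_up = 1;
    }
    if (pTryEnter)
        return pTryEnter(cs);      /* non-blocking attempt on NT/2000/XP */
    EnterCriticalSection(cs);      /* Win9x fallback: block until acquired */
    return TRUE;
}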
 
Just my 2c
 
Nicolas


Re: [sqlite] Problem with SQLite FastCGI module "malformed database schema"

2007-07-26 Thread spaminos-sqlite
> 
> - Original Message 
> From: Zbigniew Baniewski <[EMAIL PROTECTED]>
> To: sqlite-users@sqlite.org
> Sent: Thursday, July 26, 2007 4:12:00 PM
> Subject: Re: [sqlite] Problem with SQLite FastCGI module "malformed database 
> schema"
> 
> 
> On Thu, Jul 26, 2007 at 12:59:45PM -0700, Joe Wilson wrote:
> 
> > /var on my machine is drwxr-xr-x, and I don't have any issues with sqlite 
> > reading and writing temp files to /var/tmp/. Even if the permissions of /var
> > were d--x--x--x, it would also be fine. As long as /var/tmp is rwx for
> > the sqlite process, it will work.
> > 
> > Just make sure the first accessible temp dir has enough disk space
> > for the temporary files.
> 
> Of course. It's a little 5 GB HDD - but filled only to 1/3 of its capacity.
> There's just one big partition (beside swap).
> 
> The problem is that the error message doesn't actually give any clue. It just
> says that "there was a problem while trying to open a temporary database file"
> - but (considering the above) there shouldn't be any problem. /var/tmp has
> rwxrwxrwx privileges. Even worse: it works without any problems most of
> the time - and then, after e.g. a week, it refuses to work any further,
> without any particular reason. It opened those temporary files 100 times
> before - and it can't open them the 101st time.
> 

If it works and then later stops working, it's much more likely that you're 
having a problem with some file descriptor that is not closed properly (on the 
temp db most likely).
If it dies after, let's say, a week, you can check after a few days with lsof 
whether the process is keeping files open in /tmp.
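To illustrate what I mean (a made-up sketch, not your code): every 
sqlite3_prepare_v2 needs a matching sqlite3_finalize and every sqlite3_open a 
matching sqlite3_close, otherwise the descriptors for the main db and any temp 
databases pile up until the process hits its limit.

#include <sqlite3.h>

int run_query(sqlite3 *db, const char *sql)
{
    sqlite3_stmt *stmt = NULL;
    int rc = sqlite3_prepare_v2(db, sql, -1, &stmt, NULL);
    if (rc != SQLITE_OK)
        return rc;
    while ((rc = sqlite3_step(stmt)) == SQLITE_ROW) {
        /* consume the row here */
    }
    return sqlite3_finalize(stmt);   /* always finalize, even after an error */
}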

Good luck

Nicolas

Re: [sqlite] Capturing output from SQLlite with variables in a BASH script

2007-06-22 Thread spaminos-sqlite
- Original Message 
From: Martin Jenkins <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Friday, June 22, 2007 2:00:45 PM
Subject: Re: [sqlite] Capturing output from SQLlite with variables in a BASH 
script


litenoob wrote:
> Hi, I'm wondering how to write a BASH script that will capture my SQLite
> output.
> 
> I can do it for a single line with something like this:
> 
> somevar=`sqlite3 dbfilename "SELECT name FROM tablename WHERE name='smith'
> LIMIT 1;"`
> 
> However, if I want to do anything with multiple lines, I haven't figured out
> a good way.  So far, I'm using a workaround by outputting to a file, then
> reading it after I exit the SQLite commandline, but that slows down the
> script significantly.
> 
> e.g.
> 
> sqlite3 dbfilename << EOF
> 
> .output temp1
> select id from tablename where name = "bush";
> .output temp2
> select id from tablename where name = "osama";
> 
> .quit
> EOF
> 
> read id1 < temp1
> read id2 < temp2
> 
> What's the better way to do this without actually writing to a file?
> 
> Thanks!

If you're using bash you can simply do something like:
sqlite3 dbfilename 'SELECT name FROM tablename WHERE name="smith"' | ( while 
read name ; do echo "--> $name" ; done )

You can actually put whatever you want within the parentheses (even more 
parenthesized goodness).

That, or use a scripting language like perl or python :)

Nicolas

Re: [sqlite] Database replication question

2007-06-11 Thread spaminos-sqlite
> - Original Message 
> From: Joe Wilson <[EMAIL PROTECTED]>
> To: sqlite-users@sqlite.org
> Sent: Monday, June 11, 2007 8:36:32 PM
> Subject: Re: [sqlite] Database replication question
> 
> 
> Large bulk inserts with more than one index (implicit or explicit) 
> are not SQLite's strong suit.
> 
> If you search the mailing list archives you'll find a few suggestions:
> 
> - "BEGIN EXCLUSIVE" (or is it "BEGIN IMMEDIATE"?) on the 
>   database file and then copy the file over - fastest way

What do you mean by "copy the file over"? A straight copy of the binary content 
of the file? If so, I can't really do that because the versions of sqlite on 
the two machines are potentially different.

> 
> or
> 
> - increasing cache sizes 
> - pre-sorting the data in index order prior to bulk insert
> - creating the other indexes after all the data is inserted
> 
> If you do not require a live backup you could use the copy trick
> and augment that with a daily archive via 
> 
> sqlite3 file.db .dump | gzip etc...
> 
> in case the database file becomes corrupted.

If the performance problem is with the secondary index, is there a way to 
"pause" indexing before a large bulk insert and then "resume" it later without 
rebuilding the entire index (to avoid doing: drop index + inserts + create 
index)? Maybe it's a stupid question, but I am guessing that there is some sort 
of version number for the rows in the db, so playing "catch-up" on an index 
could work?

Nicolas

[sqlite] Database replication question

2007-06-11 Thread spaminos-sqlite
Hi all

I am trying to put in place a simple replication process to copy a database 
from one machine to an other.
The table I have is something like
CREATE TABLE sn2uid(sn VARCHAR(100) NOT NULL, uid INTEGER NOT NULL, PRIMARY KEY 
(sn));
CREATE INDEX uidindex on sn2uid ( uid )

Where the (sn,uid) pairs are pretty much random.

On my test data, I have around 3 million entries, the size on disk being about 
280 MB.
If I do a 'select * from sn2uid' > db, I get around 100 MB worth of data. I was 
thinking of simply streaming the result from that query over tcp (http really), 
and doing the inserts on the other side...

The problem I have is that, while doing this select takes about 10 seconds on 
my machine, I didn't find any way to insert the data quickly on the other 
machine.

After a while, the db file size grows very, very slowly, even when using 
transactions.

My question is:

is there a way to do a select or a .dump so that when inserting the data on the 
other end, things will be faster? Or maybe there are some pragmas I can use 
that would improve performance?

To me, it seems that the reason things are slow is that even though I do the 
inserts in a transaction, the btrees are modified independently, and in that 
case randomly. If I were getting the data in the right order in terms of the 
btree, I think things could be significantly faster...
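For what it's worth, a sketch of what I imagine the receiving side could look 
like, assuming the rows arrive pre-sorted by the primary key (sn), one big 
transaction, and the secondary index created only at the end; next_row() is a 
made-up placeholder for reading one (sn, uid) pair off the network stream:

#include <sqlite3.h>

extern int next_row(const char **sn, int *uid);   /* placeholder */

int bulk_load(sqlite3 *db)
{
    sqlite3_stmt *ins;
    const char *sn;
    int uid;

    sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS sn2uid("
                     "sn VARCHAR(100) NOT NULL, uid INTEGER NOT NULL, "
                     "PRIMARY KEY(sn));", NULL, NULL, NULL);
    sqlite3_exec(db, "BEGIN;", NULL, NULL, NULL);
    sqlite3_prepare_v2(db, "REPLACE INTO sn2uid(sn, uid) VALUES(?, ?);",
                       -1, &ins, NULL);
    while (next_row(&sn, &uid)) {
        sqlite3_bind_text(ins, 1, sn, -1, SQLITE_TRANSIENT);
        sqlite3_bind_int(ins, 2, uid);
        sqlite3_step(ins);
        sqlite3_reset(ins);
    }
    sqlite3_finalize(ins);
    /* build the secondary index only once all the rows are in */
    sqlite3_exec(db, "CREATE INDEX IF NOT EXISTS uidindex ON sn2uid(uid);",
                 NULL, NULL, NULL);
    sqlite3_exec(db, "COMMIT;", NULL, NULL, NULL);
    return SQLITE_OK;
}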

What I tried was simply something like:
sqlite3 myorg.db '.dump sn2uid' > ~/sn2uid.txt
sqlite3 mynew.db < ~/sn2uid.txt

Would grouping inserts together in batches of 10,000 or so make things faster 
than one gigantic transaction? I am wondering in particular if the btree 
insert code is smart enough to build the tree and merge it into the main db 
file faster in that case?

Thanks!

Nicolas

Re: [sqlite] sqlite internal structs don't make use of C bitfields?

2007-05-30 Thread spaminos-sqlite
- Original Message 
> From: Dennis Cote <[EMAIL PROTECTED]>
> To: sqlite-users@sqlite.org
> Sent: Wednesday, May 30, 2007 12:09:25 PM
> Subject: Re: [sqlite] sqlite internal structs don't make use of C bitfields?

> You may want to look at how the isInited field is used. You may be able 
> to combine it with the others as long as it stays in the first byte and 
> the code only checks for zero vs nonzero values on that byte (then again 
> that may not be safe if other combined bitfield are set nonzero before 
> the isInited field is set). If its safe, you could save another byte per 
> structure.

There seems to be room for removing other redundant fields:
u8 childPtrSize; /* 0 if leaf==1.  4 if leaf==0 */

implies that "leaf" can be replaced by "!childPtrSize", right? Well, in the 
bitfield version, that only saves 1 bit (we can go the other way and replace 
childPtrSize by something like leaf?0:4 and save more space).
hdrOffset seems to be another interesting candidate, as it looks like the same 
kind of deal.

Nicolas

Re: [sqlite] sqlite internal structs don't make use of C bitfields?

2007-05-30 Thread spaminos-sqlite
- Original Message 
> MemPage bitfield patch below. 
> 
> sizeof(MemPage) on Linux: 
> 
>   original: 84
>   patched:  76
> ...
> Break-even for memory is 904/8 = 113 MemPage structs allocated.

I didn't look at the code, so bear with me :)

If the MemPage structs are malloced individually (instead of being put in 
arrays), then they are 16-byte aligned on most platforms, making the allocated 
block effectively the same size either way (well, that depends on how many 
bytes malloc uses for its own bookkeeping before the user block in memory).

If on the other hand those structs are packed in arrays, then there can be a 
benefit.
But in that case, I would think a good experiment would be to split the fields 
into different arrays (the same old chunky vs. planar optimization, for those 
coming from computer graphics) and group data by frequency of use and/or 
locality for the caches.
An example I remember from back in the day was a struct containing data for 
each pixel that we split into two structs (putting the data used less 
frequently in a separate struct), and with this change we got over a 500% 
speed improvement on the typical workload, just because the processor had 
fewer cache misses and could prefetch much more efficiently when iterating 
over the data.
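A made-up illustration of that split (pixel example, not SQLite code): the 
fields touched on every iteration go in one array, the rarely used fields in 
another, so the hot loop streams through densely packed cache lines.

#include <stddef.h>
#include <stdint.h>

/* Before: one struct per pixel, hot and cold data interleaved. */
struct pixel {
    uint8_t  r, g, b, a;      /* hot: read every frame */
    float    gamma;           /* cold: only used during calibration */
    uint32_t last_modified;   /* cold */
};

/* After: split into a hot array and a cold array. */
struct pixel_hot  { uint8_t r, g, b, a; };
struct pixel_cold { float gamma; uint32_t last_modified; };

static uint32_t sum_red(const struct pixel_hot *hot, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += hot[i].r;      /* walks 4-byte elements instead of 12+ */
    return sum;
}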

Also, my take on bitfields is that they are not thread/multiprocessor friendly 
(there is no atomic "set bit"), and compilers typically don't optimize them 
very well (so before applying this patch, I would test on platforms other than 
gcc/Linux/x86).

Nicolas

Re: [sqlite] Transaction detection...

2007-03-27 Thread spaminos-sqlite
- Original Message 
> From: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> To: sqlite-users@sqlite.org
> Sent: Tuesday, March 27, 2007 1:50:51 PM
> Subject: Re: [sqlite] Transaction detection...
> 
> 
> [EMAIL PROTECTED] wrote:
> > Hi all
> > 
> > Is there a way to detect if a transaction is active on the current 
> > connection?
> > Basically the equivalent of "PQtransactionStatus" in postgresql.
> > 
> > I need that to automate rollback or commit depending on other variables 
> > within my application while keeping the connection open.
> > 
> 
> http://www.sqlite.org/capi3ref#sqlite3_get_autocommit

Cool, thanks, that should do it.

Sorry I missed that API call.
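For the archives, the kind of use I had in mind, as a sketch (should_commit is 
a made-up application hook):

#include <sqlite3.h>

void finish_pending_transaction(sqlite3 *db, int should_commit)
{
    /* sqlite3_get_autocommit() returns 0 while a BEGIN is still active. */
    if (sqlite3_get_autocommit(db) == 0) {
        sqlite3_exec(db, should_commit ? "COMMIT;" : "ROLLBACK;",
                     NULL, NULL, NULL);
    }
}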

Nicolas

[sqlite] Transaction detection...

2007-03-27 Thread spaminos-sqlite
Hi all

Is there a way to detect if a transaction is active on the current connection?
Basically the equivalent of "PQtransactionStatus" in postgresql.

I need that to automate rollback or commit depending on other variables within 
my application while keeping the connection open.

Thanks!

Nicolas

Re: [sqlite] indexes in memory

2006-11-09 Thread spaminos-sqlite
- Original Message 
From: Christian Smith <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Thursday, November 9, 2006 8:38:51 AM
Subject: Re: [sqlite] indexes in memory


> chetana bhargav uttered:

> > Hi,
>
> > I have a question regrading indexes,
> >
> > When I open a connection,
> >
> >  Will indexes be loaded into memory, if one of the tables in the DB (for 
> > the connection which I have opened) has an index?
> >   If so, is there any way to selectively load/unload that from memory?
> 
> 
> Indexes will be loaded into the cache as needed. The whole SQLite database 
> is page based, and the cache caches the pages. The tables and indexes are 
> implemented as page based btrees, with nodes represented by pages.
> 
> The cache is unaware of the higher level structure of the btrees, and 
> there is no way to selectively bring load/unload tables or indexes from 
> memory. The page cache will manage itself on an LRU basis.
> 
> > ...
> > Chetana.

> Christian
 
 
I found that when opening your connection, if you're about to do a lot of 
operations, it can be worth doing a "SELECT keyname FROM ..." over the whole 
table to prepopulate the cache with the index data.
 
Even on pretty large datasets this only takes a few seconds, and the following 
operations will be much faster (and the overall time to complete the batch is 
much smaller).
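Something along these lines (table and column names are placeholders):

#include <sqlite3.h>

/* Walk the indexed column once so its btree pages end up in the page
 * cache / OS cache before the real workload starts. */
void warm_up_index(sqlite3 *db)
{
    sqlite3_stmt *stmt;
    if (sqlite3_prepare_v2(db, "SELECT keyname FROM mytable;", -1,
                           &stmt, NULL) != SQLITE_OK)
        return;
    while (sqlite3_step(stmt) == SQLITE_ROW) {
        /* discard the values; the point is just to touch every page */
    }
    sqlite3_finalize(stmt);
}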
 
Nicolas

Re: [sqlite] Insert statement taking too long

2006-11-09 Thread spaminos-sqlite
- Original Message 
From: Unit 5 <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Thursday, November 9, 2006 8:02:51 AM
Subject: RE: [sqlite] Insert statement taking too long


> --- Robert Simpson <[EMAIL PROTECTED]> wrote:
> > You need to create an index on the columns you're
> > joining.  Otherwise I
> > believe 100,000,000 rows (10k x 10k) in table 2 will
> > be scanned while SQLite
> > looks for matches against the 10,000 rows in table
> > 1.

> While that makes sense, I suspect there is something
> else going on.  
> 
> I did a few more tests.  For example, if I remove the
> INSERT but keep the exact same SELECT statement with
> the joins, it is fast again.  So, it seems that it is
> quite slow when doing the insert's.  I was thinking
> that perhaps the statement was not in a transaction,
> but I tried that too.

Could it be that your data set is simply too big and doesn't fit in memory? 
Your statement most likely results in random inserts into the target table.
 
Talking about this, is there a way to tell sqlite to put "holes" in the file so 
that when doing random inserts (even in a transaction), only portions of the 
file need to be moved around?
 
It would waste some disk space, but for improved performance (it's a 
trade-off), I would be willing to give away a large amount of disk. I know this 
is quite the opposite of what (auto) vacuum does, but when the data doesn't fit 
in memory and most access is random, there is not much performance benefit in 
keeping the data densely packed in the DB file. The "holes" could be recreated 
from time to time to maintain the sparseness of the db file (hence giving a 
guarantee on insert times).
 
Nicolas

[sqlite] SQLite disk performance

2006-05-04 Thread spaminos-sqlite
Hi all

As discussed before, I have performance issues when using sqlite on big (multi 
gig) databases.
I am still trying to use sqlite as the main db because it simplifies deployment 
so much.

The main reason seems to be that inserting into the btree is very slow (even 
using transactions) because a lot of data has to be moved around in the file.

Would using sparse files, or at least leaving some space between groups of 
nodes, help, so that data doesn't need to be moved around in the file as much?

Maybe that could be a pragma of some sort (so that people who still want the 
most compact db file won't suffer)?

Even better, maybe it's a setting I didn't see :)

Another concern I have is row size: the average row in our data contains a 
blob of about 3 KB. My understanding is that in that case, the first kilobyte 
(primary key included) is stored in the b-tree node and the rest somewhere else 
in the file.
My question now is: is there some sort of fragmentation inside the db file 
happening with this extra data?

Nicolas



Re: [sqlite] SQLite :memory: performance difference between v2 and v3?

2006-05-01 Thread spaminos-sqlite
- Original Message 
From: Jay Sprenkle <[EMAIL PROTECTED]>
On 5/1/06, Ivan Voras <[EMAIL PROTECTED]> wrote:
> > I've spent a long time debugging low performance of an
> > application that uses :memory: database and have found
> > that sqlite v2 is much faster than v3. After some
> > digging around it seems that even two proof-of-concept
> > programs that are identical except for used SQLite
> > version reproduce this behaviour just fine:

> Testing is difficult to do correctly. As several people noted on this list 
> just
> today the first time they ran a query it had much different
> performance than subsequent
> runs of the query. Did you run these tests more than one time? What's your
> environment?

The precaching trick described earlier cannot be done on memory databases as 
they are already... in memory.

The main reason the first queries (sometimes it's more like the first few 
hundred queries if the db is big) are significantly slower is that sqlite 
relies blindly on the OS caching mechanism for caching the indexes/primary 
keys in memory.

In any case, when doing any kind of benchmarking that involves disk access, you 
must clear the OS disk cache so that the algorithm used by the OS is removed 
from the equation (and also so that results are comparable); otherwise all 
you're doing is benchmarking a moving target.

To discard the disk caches:
on Linux: easy, just umount and mount the partition that contains the db file
on Windows: I don't know of any other way than clearing the whole cache with a 
tool such as Clearmem.exe (from the Windows Server 2003 Resource Kit, but you 
can find it on the web very easily)

Hope this helps.

Nicolas






Re: [sqlite] questions about performance

2006-04-12 Thread spaminos-sqlite
<[EMAIL PROTECTED]> wrote:

> The problem (I suspect) is that you have an index on Table3.
> As you insert to records into Table3, those record go at the
> end, which is very efficient.  But the index entries have to
> be inserted in index order, which means they get scattered
> out all through the index.  This works fine as long as the
> index is small enough to fit in cache (either SQLite's private
> cache, or failing that your operating systems disk cache.)
> Once the index gets too big to fit in cache, you tend to
> start thrashing.
>
> The problem is a lack of locality of reference in the index.
> Each index insertion requires O(logN) disk reads and writes.
> This is no big deal as long as a "disk read and write" is
> serviced from cache but gets to be a very big deal when it
> actually involves real disk I/O.  You start to get real disk
> I/O when the index loses locality of reference and exceeds
> the size of your cache.  I do not know how to fix this and 
> still make the indices useful.

If you remember, I had this problem with my app; the workaround I found is to 
read the whole DB on startup (well, the one used for indexing my users) to help 
the OS cache it better (for 10M users it's about 600 megs, so it's still 
reasonable).
Since a separate file for indexes/primary keys is not an option (even though I 
still think it would be the easiest solution; that's how mysql does it, I don't 
know about postgresql), an alternative would be to change the layout in the 
file so that indexes/primary keys are stored together and the rest of the rows 
somewhere else in the file (the file could grow by alternating chunks of 
index/key data and other data).
At first this might seem slower, but in practice it won't be (well, it might 
just be worth a try):
the area of the file that contains indexes/primary keys will be accessed all 
the time and thus cached by the OS, while the rest of the data will be cached 
less and discarded from memory.
So even though it looks like 2 seeks and 2 reads (or writes) would be needed 
for every row, in practice the operations on the index/key part will come from 
cache and will be faster (and when writing with transactions, the added seeks 
should not really increase the time by much).

Nicolas







Re: [sqlite] sqlite performance with sizeable tables

2006-03-16 Thread spaminos-sqlite
Sorry it took me some time to get back to this thread.

- Original Message 
From: Christian Smith <[EMAIL PROTECTED]>

> When your database does not fit in memory, yes, you're right, the OS may
> well get caching wrong, and in the worst way possible. Two things though:
> - SQLite does have known scaling issues when using multi-gigabyte
>   databases.
> - Memory is cheap. If it don't fit, spend that few hundred dollars a few
>   days of your time is worth and buy another few gigs or RAM.

The machine has 2 GB of RAM, and the table that seems to cause the problems is 
less than 700 megs (proven by the fact that if I precache this database, things 
get zippy).
To me it seems like the problem is related to the way the reads are done, but I 
could be wrong:
it seems that the caches never really get "hot" (and with nothing else running 
on the machine, the OS is pretty much caching all reads done by sqlite).

>
> >Right now, sqlite shows performance that is on par with a simple
> >filesystem structure (except it's much faster to backup because
> >traversing a multimilion file structure takes several hours). I was
> >expecting a few things by moving to sqlite:
> >* getting a better ramp up (cold cache) performance than a dump
> >  filesystem structure.

> Nothing will speed up the physical disks.

I agree that disk I/O is the bottleneck, but what can be worked around is the 
latency needed to seek in the file and the way operations are sent out to the 
disk, to help the OS cache more useful information and/or have a better caching 
mechanism.

> >* having a [much] higher throughput (so that I can combine multiple
> >  servers into one), as the records are small and there is no
> >  fragmentation of the folders here.
> >* having consistant latencies (filesystems tend to vary a lot).
> >
> >> - Have you tested other databases? What sort of performance did you get
> >  from those?
> >I am in the process of setting up mysql with our app, I will keep you
> >posted on the result.
> 
> Prepare to be disappointed, IMO. The most important thing when dealing
> with big fast databases is disk IO, and anything (MySQL, PGSQL, Oracle,
> SQLite) will find disks a bottleneck.

The reason it took me a while to reply to this thread was that I set up a MySQL 
server (5.0.19, run on the same machine) and adapted my application to run with 
it.

The results on the exact same operations that I ran before:
* importing the data is faster with MySQL, and performance doesn't degrade 
exponentially with the number of rows (insertions at the beginning and at the 
end of the import operation are of similar speed). sqlite and MySQL started at 
around the same speed, but after a few million inserts, sqlite becomes slow.
* caches cleared, on my typical test run (read & write combination), MySQL ran 
2 minutes faster than sqlite (6 minutes vs 8 minutes), getting nice latencies 
after about 3 minutes (and that's where the difference is, I think). I think 
that after 3 minutes, MySQL manages to have the most critical data cached in 
RAM.
* with precached dbs (file sizes are very similar for sqlite and MySQL, with 
the difference that MySQL separates the data and index into 2 files), MySQL is 
faster too (1m30 vs 2 minutes).
* latencies stay low (as in, MySQL rarely blocks for > 4 seconds; sqlite would 
block for > 10 seconds, especially for the early queries when there is no 
cache).

Maybe the reason MySQL performs better is simply that it uses 2 files instead 
of 1.

It does make sense: if the index/offsets are all together on disk, then the OS 
can cache this information very efficiently for us.
Was splitting the file into 2 ever tried for sqlite?
I mean, there is already the -journal file, so why not have a -index file as 
well?

> > To accomodate a lot of users, I have one main DB that holds all users ->
> > unique ID The other dbs are a partitioned db really, so that later
> > threads conflict only when accessing users within the same range and the
> > dbs don't get out of proportion (because blobs are used I thought a lot
> > of records could become a performance bottleneck). If I precache the big
> > table name -> uid, the import of 650,000 records takes about 2 minutes.
> >If I don't precache it, it takes about 8 minutes. I start/stop
> >transactions every X seconds (right now, I have it set at 10 seconds).
> 
> Are these name->uid lookups used in a read-only fashion after import? If
> so, then multiple threads can indeed read the table in parallel, and you
> should have good performance.

A lot of the time this table is accessed read-only, yes.
I tried to have several threads access the table at the same time before, and 
the performance was a bit lower (maybe because transactions occur at the same 
time and block the readers?), but also (and this is why I rolled back this 
change) the busy handler, which does a simple "usleep", causes starvation and 
thus potentially high latencies.

> 
> For now, I'd say:
> - If the import/update case is 

Re: [sqlite] sqlite performance with sizeable tables

2006-03-13 Thread spaminos-sqlite
- Original Message 
> From: Joe Wilson <[EMAIL PROTECTED]>
> The disk read/write reordering optimization only works if the 
> database file is contiguous on the disk and not fragmented. 

> --- [EMAIL PROTECTED] wrote:
> > Basically, the db file is accessed with seek + read/write operations.
> > Given a set of such operations, it can be very beneficial to reorder those 
> > operations so that
> > the file is accessed going from the begining to the end of the file (and 
> > not pure random).
> > This is a problem mostly for reads (writes are reordered when doing a 
> > combination of async
> > writes + flush at the end).

This was my first idea for trying to speed things up; after that I realized 
that there had to be a better and more reliable way to do async I/O.

In fact, I found that the aio API does exactly that.

See http://mia.ece.uic.edu/~papers/WWW/books/posix4/DOCU_008.HTM

The aio_read/aio_write + aio_suspend calls are POSIX and available on Linux, 
BSD and Solaris (and the Windows event API is very similar).

This API, when used with aio_suspend, does not rely on signals like SIGIO, only 
on low-level kernel mechanisms (invisible to the application), which makes it 
thread safe and pretty much free of side effects on the rest of the application 
(that uses sqlite).

After finding out about this API, I found that at least mysql and postgresql 
use it, so I am guessing that changing the sql engine to generate batches of 
reads/writes is possible.

My guess is that using this API will increase performance a lot, as the hard 
drive heads won't have to go back and forth seeking at random places on the 
disk (thus reducing the impact of having small caches).
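To illustrate the idea, a rough sketch with POSIX AIO (not SQLite code; on 
Linux this links with -lrt): several reads at scattered offsets are submitted 
at once, then we wait for all of them with aio_suspend and let the OS/disk 
reorder the actual accesses.

#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define NREQ 4

int batched_reads(const char *path, const off_t offsets[NREQ])
{
    struct aiocb cbs[NREQ];
    const struct aiocb *list[NREQ];
    static char bufs[NREQ][1024];
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    for (int i = 0; i < NREQ; i++) {
        memset(&cbs[i], 0, sizeof cbs[i]);
        cbs[i].aio_fildes = fd;
        cbs[i].aio_buf    = bufs[i];
        cbs[i].aio_nbytes = sizeof bufs[i];
        cbs[i].aio_offset = offsets[i];
        aio_read(&cbs[i]);                 /* queue the read */
        list[i] = &cbs[i];
    }
    for (int remaining = NREQ; remaining > 0; ) {
        aio_suspend(list, NREQ, NULL);     /* wait for at least one */
        for (int i = 0; i < NREQ; i++) {
            if (list[i] && aio_error(&cbs[i]) != EINPROGRESS) {
                aio_return(&cbs[i]);       /* collect the result */
                list[i] = NULL;            /* ignored on later waits */
                remaining--;
            }
        }
    }
    close(fd);
    return 0;
}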

Nicolas






Re: [sqlite] sqlite performance with sizeable tables

2006-03-10 Thread spaminos-sqlite
- Original Message 
> From: Christian Smith <[EMAIL PROTECTED]>

> But SQLite depends on the OS caching abilities for much of it's
> performance. Removing it is like saying SQLite is rubbish on Intel
> processors after testing on a i486.

Yes and no: while it's nice to be able to rely somewhat on the OS cache, the OS 
cannot guess what the usage pattern of the app is going to be.
Most of the time, the OS will guess wrong too: a typical case is random access 
to a file, where the readahead the OS does is not beneficial (and can actually 
slow things down).
I always try to give the best hints possible to the underlying subsystem (just 
like I never rely 100% on the optimizer to know what to do with a bunch of 
code, because it won't).

When dealing with big databases (that typically don't fit in memory), the OS 
will most likely cache the wrong kind of data, whereas the library could cache 
important information such as key offsets in the file (the cache would contain 
some elements of the btree so that we can resume directly from there).

> The SQLite cache is very restricted. It only caches data over a single
> transaction, discarding it on COMMIT. This is because another process may
> update the file between transactions.
Oh I didn't know that the db cache was that short lived!?


> The file format has a file generation number, which could be used to delay
> the discarding of the SQLite cache between transactions, but I believe it
> is as yet unimplemented.

Oh I see, the cache problem is probably related to multiple-process access.
In my case I have a multithreaded process using proper locks to access the db 
(that way I never get a busy database either).

> >I was very excited about this result, so I tried async access on my
> >actual app (that holds 8M+ records), but I was disapointed to see that
> >async actually even slow things down?!
> By much? This surprises me. Perhaps there is another bottleneck in the app
> that isn't covered by this test case?

Well, this test focuses on accessing one database, and my app accesses several.
It was not much slower, maybe a 10% penalty, but I was expecting a dramatic 
gain in performance :)

On the other hand, I tried to make better use of the cache: if I run my 1M 
inserts in 10 transactions of 100,000, things get a bit slower than with 100 
transactions of 10,000 inserts.
I tried one transaction of 1,000,000 inserts and the test app has been hanging 
at 100% cpu for over 30 minutes now; I'm not sure what is going on here.

> >Oh also, a way to improve performance:
> >reads can be reordered, so if sqlite was doing async reads of several
> >places in the db file at the same time, the OS/drive would reorder things
> >the same way they are reordered when writing for a good boost in
> >performance.

> This would add complication to SQLite. We depend on the OS cache to shield
> us from having to do async IO. It's difficult to share SIGIO, so an app
> that also did async IO would have problems with SQLite. In general, read
> performance is not a problem in SQLite. The biggest bottleneck is
> transaction COMMIT, due to the synchronous nature of the operation.
On unix/linux systems SIGIO is one way of doing async I/O (I never use it 
because it doesn't play well with threads).
Using poll or select will not interfere with other parts of the application.
I do believe that batching requests can dramatically improve performance:
a typical join will read 2 tables on disk, causing a lot of seeks.
Read performance is important when dealing with large files (relative to RAM 
size), and I believe that quite a few people use sqlite with multi-GB files, so 
I don't think this is a stupid thing to look into.

> General observations/questions:
> - What did you expect from SQLite? Can you give indication of performance
>   expectations?
> - What did you use previously? How does SQLite compare on performance?
I was expecting much better performance than what I am getting right now:
before, I was using the filesystem as a DB (basically, folders + filenames -> 
data).
Namely, reiserfs performs pretty well for this kind of thing (millions of 
records).
But that was wasting a lot of disk space and hitting all sorts of limits in the 
various OSes.

Tried BDB a few years ago as a replacement, but "wedged" databases and 
licensing issues kept us away from it.
Then I read about sqlite giving performance on the order of 25000 
inserts/second.
I thought that I could probably get around 10,000 inserts/second on bigger 
databases.

Right now, sqlite shows performance that is on par with a simple filesystem 
structure (except it's much faster to back up, because traversing a 
multimillion-file structure takes several hours).
I was expecting a few things by moving to sqlite:
* getting better ramp-up (cold cache) performance than a dumb filesystem 
structure.
* having a [much] higher throughput (so that I can combine multiple servers 
into one), as the records are small and there is no fragmentation of the 
folders here.

Re: [sqlite] sqlite performance with sizeable tables

2006-03-10 Thread spaminos-sqlite
- Original Message 
> From: [EMAIL PROTECTED]
> <[EMAIL PROTECTED]> wrote:
> > 
> > One question though: are the file access "sorted", so that seeks are
> > minimised when performing a transaction (making the assumption that 
> > the file is not fragmented on disk)?
> > 

> I'm not sure what you are asking.  Can you restate your question?

Basically, the db file is accessed with seek + read/write operations.
Given a set of such operations, it can be very beneficial to reorder them so 
that the file is accessed going from the beginning to the end (and not in pure 
random order).
This is a problem mostly for reads (writes are reordered when doing a 
combination of async writes + a flush at the end).

I am guessing that, given the nature of sql (a lot of iteration through 
keys/indexes), it might be possible to create batches of disk I/O requests:
build a list of the next N reads (with their offsets) and issue those reads 
sorted by offset.

In the graphics world (where I come from), it's common practice to at least do 
prefetching of data (sent in batches over the bus), and I believe that the 
benefits would be similar here.
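A sketch of that batching idea (made up, not SQLite code): collect a batch of 
pending page reads, sort them by file offset, then issue them in ascending 
order so the head sweeps the file in one direction.

#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

struct read_req {
    off_t  offset;
    size_t len;
    void  *buf;
};

static int by_offset(const void *a, const void *b)
{
    const struct read_req *ra = a, *rb = b;
    return (ra->offset > rb->offset) - (ra->offset < rb->offset);
}

int read_batch(int fd, struct read_req *reqs, size_t n)
{
    qsort(reqs, n, sizeof *reqs, by_offset);   /* reorder by file offset */
    for (size_t i = 0; i < n; i++)
        if (pread(fd, reqs[i].buf, reqs[i].len, reqs[i].offset) < 0)
            return -1;
    return 0;
}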

Nicolas






Re: [sqlite] sqlite performance with sizeable tables

2006-03-09 Thread spaminos-sqlite
- Original Message 
> From: Christian Smith <[EMAIL PROTECTED]>

> Is this a likely usage scenario? Will your application regularly
> umount/mount the filesystem between transactions? While sounding
> facetious, I'm not trying to. Your otherwise excellent example is let down
> by a probably unreasonable usage pattern.

My example was just that, a test case trying to explain why my actual app slows 
to a crawl.
The umount/mount is there to ensure that I am not benchmarking the OS caching 
abilities.
What this does is give a better idea of the performance of sqlite's own 
caching.

> The pages containing nodes closer to the root of the affected btrees will
> be cached pretty quickly, but the random access pattern and constant
> random updates will make for very non-localised updates to the DB file,
> which would require lots of seeks on sync.

> You might be well advised to look at how long it takes to do the updates,
> then the COMMIT seperately. I'd hazard a guess that you'll be spending the
> majority of the time in the COMMIT commands. Something like:

All right, so I added the intermediate timing to see where the time is spent.

My results show that on the typical 7-second operation (after the numbers 
stabilise),
2 seconds are spent between the BEGIN / END and 5 seconds performing the 
COMMIT.
The first transactions show things like:


When I saw this, what I tried later on was to play around with
PRAGMA synchronous = OFF and PRAGMA synchronous = NORMAL.

While synchronous = NORMAL didn't show any real improvement on the timings 
compared to FULL,
full async mode was dramatically faster:
the random test that was taking 600+ seconds to execute now runs in 172 seconds 
(and running the global sync after the test only took a couple of seconds, 
meaning most data was already written to disk).
I looked at the timings, and now the time is basically spent between the 
BEGIN/END; the COMMIT is very fast.

I was very excited about this result, so I tried async access on my actual app 
(that holds 8M+ records), but I was disappointed to see that async actually 
even slowed things down?!

I also tried mounting the filesystem (reiser) in "journal" mode 
(rw,noatime,data=journal) instead of "writeback", and things get slower with 
this setting.

So what I tried next was to simply run my app after doing a
cat generaldb.db > /dev/null
(this takes about 15 seconds on my system; it's a 600 MB file)

This has the effect of preloading the db file into the OS caches.

After I do this, the app becomes very fast (as in 4 times faster).

Note: I noticed that even if I set PRAGMA cache_size to some ridiculously 
large number (that should be enough to cache the whole db, by the way), the 
memory footprint of my app stays pretty low, as if it was not caching much!?

So at this point, I think the caching/polling subsystem is the problem.

Oh, also, a way to improve performance:
reads can be reordered, so if sqlite were doing async reads of several places 
in the db file at the same time, the OS/drive could reorder them the same way 
they are reordered when writing, for a good boost in performance.

Nicolas






Re: [sqlite] sqlite performance with sizeable tables

2006-03-08 Thread spaminos-sqlite
- Original Message 
From: [EMAIL PROTECTED]
To: sqlite-users@sqlite.org

> SQLite inserts in records in primary key order.  (That is not
> strictly true - but it is close enough to being true for the
> purposes of what follows.)  So when you insert records that
> are already in fname order, they can be appended to the file
> and the insert goes relatively quickly. This is because the
> file does not need to be reorganized to make space for the
> new row (it is simply appended) and because nearby information
> is already in cache and can be fetched quickly.  But if you 
> insert with randomly ordered fnames, then records are constantly 
> being inserted into different places in the middle of the file.
> This takes more time because the file has to be reorganized 
> to make space for the new record and because each record is 
> in a different spot you are have no locality of reference 
> and the cache is much less effective.

OK, now I understand better why there is such a difference in performance. It 
seems that sqlite may not be an option for me, as I update records pretty 
often (roughly 50% of the time: access some data, perform some task with it and 
later write back the updated version).

One question though: are the file accesses "sorted", so that seeks are 
minimised when performing a transaction (making the assumption that the file is 
not fragmented on disk)?

Nicolas






[sqlite] sqlite performance with sizeable tables

2006-03-07 Thread spaminos-sqlite
Hi all

it seems that I am running into a problem with the way sqlite accesses the disk 
when inserting rows of data in databases that are large in number of records 
but not necessarily big on disk (I am talking millions of records in files that 
are on the order of a few hundred MBytes).

I reduced to this test case:

I have a table that I create with
PRAGMA cache_size = 32000
CREATE TABLE IF NOT EXISTS name2uid(fname VARCHAR(100) NOT NULL , lname 
VARCHAR(32) NOT NULL, uid INTEGER NOT NULL,
 PRIMARY KEY (fname, lname));

then for my test, I do inserts that look like this:
REPLACE INTO name2uid (fname, lname, uid) VALUES ('%s','SMITH',%d)
where I set fname to U%xser (%x replaced by the uid that I generate myself).

I create transactions of 10,000 REPLACEs at a time.
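The test loop itself is essentially this (a reconstruction, since the exact 
code is not in this message; next_uid stands in for either the sequential or 
the random uid generator):

#include <stdio.h>
#include <sqlite3.h>

#define BATCH 10000

void run_inserts(sqlite3 *db, int nrows, int (*next_uid)(void))
{
    char sql[256];
    for (int i = 0; i < nrows; i++) {
        if (i % BATCH == 0)
            sqlite3_exec(db, "BEGIN;", NULL, NULL, NULL);
        int uid = next_uid();  /* consecutive for Tests 1/2, random for Test 3 */
        snprintf(sql, sizeof sql,
                 "REPLACE INTO name2uid (fname, lname, uid) "
                 "VALUES ('U%xser', 'SMITH', %d);", uid, uid);
        sqlite3_exec(db, sql, NULL, NULL, NULL);
        if (i % BATCH == BATCH - 1)
            sqlite3_exec(db, "COMMIT;", NULL, NULL, NULL);
    }
}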

I am running on Linux Fedora Core 3 on an Opteron 146 (2GHz), with 2GB RAM and 
a SATA drive with a partition reserved for the sqlite database.

IMPORTANT: I umount/mount the partition between tests to clear the disk caches 
from the OS.

Test 1:
From an empty DB,
I loop so that the uids are consecutive numbers from 1 to 1 million.
At this point, each transaction takes less than 1 second to execute. The whole 
1M inserts (100 transactions) take 74 seconds, i.e. about 13,000 inserts/second.

Test 2:
With the DB from Test 1, I run the exact same sequence of inserts.
The 1M inserts take 103 seconds to execute; that's still about 9,700 
inserts/second.
The first transaction takes 8 seconds, then about 1 second each.
At this point everything is OK.

Test 3:
From the DB from Test 2, I run 1 million inserts where the uid is selected 
randomly between 1 and 1 million.
At this point, the performance is pretty bad:
the first 10,000-insert transaction takes 31 seconds to run, and the next ones 
take over 5 seconds each, for a total run time of 623 seconds.
That's about 1,600 inserts/second (6 times slower than the ordered case).

It seems that the performance degrades pretty badly with the number of records:
this is still a relatively small dataset (especially given the simplicity of 
the table).
To me, it looks like there is a problem in the way the files are accessed.

I tried to partition the dataset by creating separate databases or tables, but 
creating separate databases makes things slower (expected, as the problem is 
probably disk I/O seeks), and partitioning tables gives me only a 20% speed 
gain on those 1M insert tests.

Things get really slow afterwards, for example when moving to 2M records (137 
MB on disk):
Test 1 w/ 2M is 138 seconds, about 2 times slower than the 1M case (normal).
Test 2 w/ 2M is 192 seconds, 2 times slower (normal).
Test 3 w/ 2M is 2390 seconds, 4 times slower than in the 1M case (12 times 
slower than the ordered case).

I didn't try these tests with bigger sets (my original app was on a DB of about 
8M records, and things were very, very slow).

Given the size of the DB on disk, I would think that the speed could be much 
more consistent than that (especially considering that it should cache a lot of 
the information quickly).

Any ideas?

Nicolas


PS: I use the C API for running those tests.