Re: [sqlite] sqlite performance, locking threadi

2006-12-29 Thread Emerson Clarke

Ken,

Thanks.

Ok, i think i did miss the point with your test case.  I assumed that
it was always possible to perform multiple select, insert, delete, and
update statements within a single transaction anyway.  Which i guess
relates to you last point.  I dont see why if i have a single global
transaction i cant perform all the operations i want within that
transaction to maximise performance.  In fact i know i can. because
thats what i do with a single thread.

I have always assumed that the simplest and highest performance
solution would be to have multiple threads with one transaction and
one cache.  I will take a look at the test_server.c code and see if i
can understand what you are suggesting.

I dont see why writer starvation would be a problem, but again i
probably just dont understand what your suggesting there.  Anyway,
thanks for your help.  I suspect that you are right and the ultimate
solution will be to write my own synchronisation code, since that will
allow me to have a consistent api across multiple databases like
sqlite and mysql.

I could achieve this if i could have multiple threads executing
statements within the same transaction since that is the effective
behaviour i get from most other databases which are not in process.
In those cases it is the process boundary and the tcp/ip connections
which are effectivley serialising all interactions which is equivalent
of having only one thread when using sqlite.

Emerson

On 12/29/06, Ken [EMAIL PROTECTED] wrote:


 Emerson,

   You just need to lock the entire transaction using a mutex before hand. That 
means each thread will have its own sqlite cache.  Reread the sqlite locking 
and concurrency guide, you'll see that SQLITE is NOT a transaction based system 
such as Postgress/mysql/ oracle. It locks the entire database file at the point 
of writing, not at the first insert/update/delete. Its better to think of 
sqlite as an easy way to perform data access against a FLAT file. Rather than a 
transaction oriented system.

 Regarding my test case. You missed the point, there is a select statement in 
the middle that each thread executes. The nice thing is that each and every 
thread uses a shared cache and can execute the select statement concurrently 
without a mutex. The locking when using a shared cache works at the table level 
instead of file!

 Another thing you should look at: Create a master DB, then attach the 
individual databases to the master. That way you only need one Cache. How do 
you know you wouldn't benefit from merging your caches? Have you tried? Do you 
have test cases to prove that it doesn't help?

 You stated that you want to put your owner locking mutex wrapper around the 
database interactions so that there would be only 1 thread access the DB.. That 
means that you need to release the lock after each 
statement(insert/update/delete) and not perform a BEGIN transaction/commit... 
Which means very slow performance. And besides only one thread regardless could 
be accessing the DB. Whereas the test_server.c  architecture (you need to 
modify this for your own needs) will let you create multiple shared connections 
to the database. And each thread can perform selects concurrently. But all 
threads are serialized into the single thread for writing. This takes care of 
all locking issues and is technically the same as mutexing the writes across 
threads. And it addresses writer starvation, which you have not addressed with 
your simple mutex locking.

 It seems to me you might want write your own locking implementation or even 
abandoning sqlite if it doesn't fit your needs.


Emerson Clarke [EMAIL PROTECTED] wrote: Ken,

Thanks i understand your example well.

What im looking for is the ability to have multiple threads perform a
write operation based on my mutex, not some internal sqlite write
mutex.  If i am managing the concurrency and performing correct
synchronisation, why can i not have multiple threads writing to the
database at once and achive a better overall performance.

Given that the indexing process i am referring too has several steps
which it must perform in successive order, and not all of them are
database writes, i am simply trying to use threads to emulate a
situation where there is only one thread doing all the database
writes, if that makes sense.

So in this case, what ever synchronisation sqlite is doing internally
is actually getting in the way of what i am trying to do.

There were no gains in performance in your test case becuase there was
no reason to have multiple threads.  The only thing each thread was
doing was writing to the database, so of course the only thing you
will introduce by using multiple threads is overhead.

Emerson



-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] sqlite performance, locking threadi

2006-12-29 Thread Emerson Clarke

OK, thanks Joe.

I will look up those peculiarities as you suggested, im interested in
understanding what they are.

I agree with you, DRH has done a great job.  Sqlite is a fantastic
piece of software and and outstanding example of the way open source
should be.  It is small, efficient and incredibly easy to use.

It is remarkable that it has so many capabilities and as such it has
rapidly become the critical component in many of the things i am
working on.

I was just a bit confused about the way the threading works, hence
this discussion.

Emerson

On 12/29/06, Emerson Clarke [EMAIL PROTECTED] wrote:

Ken,

Thanks.

Ok, i think i did miss the point with your test case.  I assumed that
it was always possible to perform multiple select, insert, delete, and
update statements within a single transaction anyway.  Which i guess
relates to you last point.  I dont see why if i have a single global
transaction i cant perform all the operations i want within that
transaction to maximise performance.  In fact i know i can. because
thats what i do with a single thread.

I have always assumed that the simplest and highest performance
solution would be to have multiple threads with one transaction and
one cache.  I will take a look at the test_server.c code and see if i
can understand what you are suggesting.

I dont see why writer starvation would be a problem, but again i
probably just dont understand what your suggesting there.  Anyway,
thanks for your help.  I suspect that you are right and the ultimate
solution will be to write my own synchronisation code, since that will
allow me to have a consistent api across multiple databases like
sqlite and mysql.

I could achieve this if i could have multiple threads executing
statements within the same transaction since that is the effective
behaviour i get from most other databases which are not in process.
In those cases it is the process boundary and the tcp/ip connections
which are effectivley serialising all interactions which is equivalent
of having only one thread when using sqlite.

Emerson

On 12/29/06, Ken [EMAIL PROTECTED] wrote:

  Emerson,

You just need to lock the entire transaction using a mutex before hand. 
That means each thread will have its own sqlite cache.  Reread the sqlite locking 
and concurrency guide, you'll see that SQLITE is NOT a transaction based system 
such as Postgress/mysql/ oracle. It locks the entire database file at the point of 
writing, not at the first insert/update/delete. Its better to think of sqlite as 
an easy way to perform data access against a FLAT file. Rather than a transaction 
oriented system.

  Regarding my test case. You missed the point, there is a select statement in 
the middle that each thread executes. The nice thing is that each and every thread 
uses a shared cache and can execute the select statement concurrently without a 
mutex. The locking when using a shared cache works at the table level instead of 
file!

  Another thing you should look at: Create a master DB, then attach the 
individual databases to the master. That way you only need one Cache. How do you 
know you wouldn't benefit from merging your caches? Have you tried? Do you have 
test cases to prove that it doesn't help?

  You stated that you want to put your owner locking mutex wrapper around the 
database interactions so that there would be only 1 thread access the DB.. That 
means that you need to release the lock after each statement(insert/update/delete) 
and not perform a BEGIN transaction/commit... Which means very slow performance. 
And besides only one thread regardless could be accessing the DB. Whereas the 
test_server.c  architecture (you need to modify this for your own needs) will let 
you create multiple shared connections to the database. And each thread can 
perform selects concurrently. But all threads are serialized into the single 
thread for writing. This takes care of all locking issues and is technically the 
same as mutexing the writes across threads. And it addresses writer starvation, 
which you have not addressed with your simple mutex locking.

  It seems to me you might want write your own locking implementation or even 
abandoning sqlite if it doesn't fit your needs.


 Emerson Clarke [EMAIL PROTECTED] wrote: Ken,

 Thanks i understand your example well.

 What im looking for is the ability to have multiple threads perform a
 write operation based on my mutex, not some internal sqlite write
 mutex.  If i am managing the concurrency and performing correct
 synchronisation, why can i not have multiple threads writing to the
 database at once and achive a better overall performance.

 Given that the indexing process i am referring too has several steps
 which it must perform in successive order, and not all of them are
 database writes, i am simply trying to use threads to emulate a
 situation where there is only one thread doing all the database
 writes, if that makes sense.

 So in this case, what ever 

Re: [sqlite] sqlite performance, locking threadi

2006-12-29 Thread Ken
The test server.c code is an example that has a single thread that performs the 
DB interactions each client thread, communicates via a queue interface. Each 
client will get serialized into the  DBserver thread and get its work done. 
Thus eliminating any lower level locking and mutexing inside the application 
code.
 
 Your assumption regarding 1 thread/1 cache is pretty accurate. This is what 
the test_server.c code does, however each client thread does however get a 
database connection handle.
 
 If you have a single global transaction you can do insr/selec/upd/delete.. The 
thing is that the connection may not be passed around amongst threads. 
 
 You cannot have multiple threads executing within the same transaction!!! To 
my understanding, that is not allowed. Please someone else correct me if I'm 
wrong...
 
 The difference is with other DB's they utilize a finer grain of locking 
internally, either page locking, row locking etc.. Sqlite uses Database 
LOCKING, which is full file. So its really only designed to be used by a single 
thread of execution. (again DRH please correct me here if I'm wrong).
 
 
Emerson Clarke [EMAIL PROTECTED] wrote: Ken,

Thanks.

Ok, i think i did miss the point with your test case.  I assumed that
it was always possible to perform multiple select, insert, delete, and
update statements within a single transaction anyway.  Which i guess
relates to you last point.  I dont see why if i have a single global
transaction i cant perform all the operations i want within that
transaction to maximise performance.  In fact i know i can. because
thats what i do with a single thread.

I have always assumed that the simplest and highest performance
solution would be to have multiple threads with one transaction and
one cache.  I will take a look at the test_server.c code and see if i
can understand what you are suggesting.

I dont see why writer starvation would be a problem, but again i
probably just dont understand what your suggesting there.  Anyway,
thanks for your help.  I suspect that you are right and the ultimate
solution will be to write my own synchronisation code, since that will
allow me to have a consistent api across multiple databases like
sqlite and mysql.

I could achieve this if i could have multiple threads executing
statements within the same transaction since that is the effective
behaviour i get from most other databases which are not in process.
In those cases it is the process boundary and the tcp/ip connections
which are effectivley serialising all interactions which is equivalent
of having only one thread when using sqlite.

Emerson

On 12/29/06, Ken  wrote:

  Emerson,

 You just need to lock the entire transaction using a mutex before hand. That 
 means each thread will have its own sqlite cache. Reread the sqlite locking 
 and concurrency guide, you'll see that SQLITE is NOT a transaction based 
 system such as Postgress/mysql/ oracle. It locks the entire database file at 
 the point of writing, not at the first insert/update/delete. Its better to 
 think of sqlite as an easy way to perform data access against a FLAT file. 
 Rather than a transaction oriented system.

 Regarding my test case. You missed the point, there is a select statement in 
 the middle that each thread executes. The nice thing is that each and every 
 thread uses a shared cache and can execute the select statement concurrently 
 without a mutex. The locking when using a shared cache works at the table 
 level instead of file!

 Another thing you should look at: Create a master DB, then attach the 
 individual databases to the master. That way you only need one Cache. How do 
 you know you wouldn't benefit from merging your caches? Have you tried? Do 
 you have test cases to prove that it doesn't help?

 You stated that you want to put your owner locking mutex wrapper around the 
 database interactions so that there would be only 1 thread access the DB.. 
 That means that you need to release the lock after each 
 statement(insert/update/delete) and not perform a BEGIN transaction/commit... 
 Which means very slow performance. And besides only one thread regardless 
 could be accessing the DB. Whereas the test_server.c architecture (you need 
 to modify this for your own needs) will let you create multiple shared 
 connections to the database. And each thread can perform selects 
 concurrently. But all threads are serialized into the single thread for 
 writing. This takes care of all locking issues and is technically the same as 
 mutexing the writes across threads. And it addresses writer starvation, which 
 you have not addressed with your simple mutex locking.

 It seems to me you might want write your own locking implementation or even 
 abandoning sqlite if it doesn't fit your needs.


 Emerson Clarke  wrote: Ken,

 Thanks i understand your example well.

 What im looking for is the ability to have multiple threads perform a
 write operation based on my mutex, not some internal