"Rosemary Alles" <al...@ipac.caltech.edu> schrieb im
Newsbeitrag news:af79a266-b697-4924-b304-2b1feccba...@ipac.caltech.edu...

> Run on a single processor, the following query is quite fast:
> ...snip...
> When concurrency is introduced (simply running the query on several
> processors against the same database - say 300 instances of it) ...
Normally it does not make sense to start more instances than there
are available CPU cores (as long as each spawned process/thread gets
the chance to do some real work and does not have to "idle around"
in a wait state).

On our app server we typically use a worker thread pool of only about
twice the number of CPU cores - the pool is then fed new jobs from a
central server queue (where incoming requests are "parked" first),
as sketched below.

> ...a massive slow down with significant fluctuations in time between
> instances. Only reads are done (i.e. queries) - no writes. How does
> one optimize concurrency in sqlite3

As others have already commented, if your queries cannot be served
from the SQLite page cache ("locally visible" to one CPU core -
because the underlying table is too large to fit in), then the single
shared resource (your disk) comes into play and SQLite does not
scale well.

With smaller DBs or tables (or an increased cache setting for SQLite),
this little engine scales surprisingly well on parallel read requests
(speaking of the mode with a separate connection on each thread -
shared-cache mode is currently not used on our app server); a small
sketch of that setup follows below.
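
For illustration, such a per-thread setup could look like this
(a minimal sketch; the file name "nwind.db" and the cache_size value
are just example assumptions, to be tuned to the actual DB):

#include <sqlite3.h>
#include <stdio.h>

/* Every worker thread opens its *own* connection (so no
   sqlite3_enable_shared_cache(1) call anywhere), and the page cache
   of each connection is enlarged so the working set stays in memory. */

static sqlite3 *open_private_connection(void)
{
    sqlite3 *db = NULL;
    if (sqlite3_open("nwind.db", &db) != SQLITE_OK) {
        fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
        sqlite3_close(db);
        return NULL;
    }
    /* raise the per-connection cache, e.g. to ca. 20000 pages -
       the concrete value is an assumption, tune it to your DB size */
    sqlite3_exec(db, "PRAGMA cache_size=20000;", NULL, NULL, NULL);
    return db;
}

int main(void)
{
    sqlite3 *db = open_private_connection();
    if (db) sqlite3_close(db);
    return 0;
}

Each worker thread would call open_private_connection() once at
startup and then reuse its handle for all of its queries - no
connection is ever shared between threads.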

Here are some numbers (tested against a quad-core machine here,
hosting only a small NWind.db which fits entirely into the cache) -
each client running an endless stress loop with always the same query
("Select * from Invoices" - which stresses the engine a bit, because
that view already contains some joins; resultset size 2155 records
across 26 columns). No resultset caching was done on the app-server
end, so the query was always performed directly against the SQLite
engine (no tricks here); a simplified sketch of the stress loop
follows after the numbers.

Server responses per second (disconnected and serialized resultsets):
1 client: ca. 21
2 clients: ca. 37
3 clients: ca. 51
4 clients: ca. 62
5 clients: ca. 71
6 clients: ca. 80
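
For reference, the per-client stress loop boiled down to something
like the following (a simplified, purely local sketch - the real
clients went through the RPC layer and rendered into a DataGrid,
both omitted here):

#include <sqlite3.h>
#include <stdio.h>
#include <time.h>

int main(void)
{
    sqlite3 *db;
    if (sqlite3_open("nwind.db", &db) != SQLITE_OK) return 1;

    sqlite3_stmt *st;
    if (sqlite3_prepare_v2(db, "SELECT * FROM Invoices", -1, &st, NULL)
            != SQLITE_OK) {
        fprintf(stderr, "prepare failed: %s\n", sqlite3_errmsg(db));
        return 1;
    }

    long runs = 0;
    time_t start = time(NULL);
    for (;;) {                     /* endless stress loop */
        while (sqlite3_step(st) == SQLITE_ROW)
            ;                      /* fetch all 2155 rows, 26 columns */
        sqlite3_reset(st);
        runs++;
        time_t now = time(NULL);
        if (now - start >= 10) {   /* report every 10 seconds */
            printf("ca. %.1f queries/sec\n",
                   (double)runs / (now - start));
            runs = 0;
            start = now;
        }
    }
}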

The reason I was not able to get close to those 80 responses per
second with only 4 clients already is that the client side also had
some stress of its own (visualizing the retrieved resultsets in a
DataGrid before starting a new request) - and the first 4 clients
were started on a dual-core XP machine (causing enough stress on that
XP machine already); the 5th and 6th clients were then additionally
started on another client machine (running on Linux via the Wine
layer, in the same way the RPC server was running on the quad
machine).

Here's a screenshot of this running scenario (6 clients causing
stress; the screenshot was taken on the XP machine, which was running
4 of the 6 clients - the server is visible in a VNC session window):
http://www.datenhaus.de/Downloads/Quad-Stress.png

And as you can see in the KDE4 system monitor, the 4 CPU cores are
pretty equally involved in that scenario (one core a bit more, due to
the WineServer, which apparently was handling the IO translation into
the Linux socket stack "on its own"). Also interesting to see is that
Wine apparently has something like a cutoff point at ca. 80% CPU
usage - probably to "play fair" with the underlying Linux OS or
something like that... (at least on the Debian machine here).

As a side note (becoming somewhat more offtopic in this thread now):
the network utilization was at ca. 17 MB/sec on eth0 in this small
stress test (1 GBit network). The RPC transfers were running
FastLZ-compressed over the wire - if I had switched that off, we
would have seen ca. 50% utilization of the GBit channel (because
roughly a factor of 3 is achievable with compression on typical DB
resultsets). And it's not that easy to reach ca. 50% GBit utilization
with "typical DB server engines", measured on the server side under
concurrency stress (e.g. against Postgres or MySQL or MS SQL Server).
That's mainly due to the fact that app servers usually don't (have
to) support server-side cursors, so typically larger packets
(completely prepared and serialized resultsets) can be sent over the
wire (not that much ping-pong necessary on the protocol).
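
The back-of-the-envelope math behind that 50% claim, for those who
want to check it (the 17 MB/sec and the factor 3 are from the test
above; the ca. 100 MB/sec figure for practically achievable GBit
payload is my assumption):

#include <stdio.h>

int main(void)
{
    double compressed_mb_s = 17.0;   /* measured on eth0            */
    double ratio           = 3.0;    /* typical on DB resultsets    */
    double uncompressed    = compressed_mb_s * ratio; /* ~51 MB/sec */
    double gbit_practical  = 100.0;  /* assumed usable GBit payload */
    printf("logical throughput: %.0f MB/sec = ca. %.0f%% of GBit\n",
           uncompressed, 100.0 * uncompressed / gbit_practical);
    return 0;
}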

So for our purposes (having more or less smaller DBs here), the
current SQLite implementation scales well enough in these typical,
more read-oriented scenarios - the currently implemented locking
scheme seems not all that restrictive (when working against
non-shared SQLite caches) - so I hope the SQLite devs (although they
often recommend working single-threaded wherever possible <g>) keep
up their good work on that front. :-)

Regards,

Olaf Schmidt



_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
