Re: [sqlite] Multi-Thread Reads to SQLite Database

2012-08-13 Thread Boris Kolpackov
Hi Marc,

"Marc L. Allen"  writes:

> If I understand correctly, the purpose of the overall mutex is to prevent a
> page from being removed underneath a user.  If the standard DB locking
> semantics are working properly, I think there is no possibility of a page's
> data being modified underneath another user.

Yes, that's my understanding also. This is all taken care of at higher
(transaction and table) locking levels.


> If the above is true, cache protection semantics are strictly concerned with
> page management. That is, a page is requested that is not in the cache and
> needs to be inserted into it. If the cache is full, another page needs to
> be released. All that is required is protecting pages currently in use from
> being released.
>
> I think, instead of a mutex serializing access to the entire cache, all
> that is needed is a mutex serializing access to the cache meta-data and
> the use of reference counts to help the page replacement algorithm make
> a good choice in which page to remove.

All sounds very sensible to me.

Boris
-- 
Boris Kolpackov, Code Synthesis        http://codesynthesis.com/~boris/blog
Compiler-based ORM system for C++  http://codesynthesis.com/products/odb
Open-source XML data binding for C++   http://codesynthesis.com/products/xsd
XML data binding for embedded systems  http://codesynthesis.com/products/xsde

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi-Thread Reads to SQLite Database

2012-08-13 Thread Boris Kolpackov
Hi Pavel,

Pavel Ivanov  writes:

> > "There are four levels to the shared-cache locking model: transaction
> >  level locking, table level locking, cache level locking, and schema
>  level locking. They are described in the following four sub-sections."
>
> This would be a bad way of documenting this because "cache level
> locking" works on a completely different level than transaction, table
> or schema level locking. The exclusive mutex is taken inside one call
> to sqlite3_step() and released upon exit. But all the other "locks" have a
> "meta-level" nature and persist between calls to sqlite3_*() functions.

I agree. But I also think that documenting it like this is better
than not documenting it at all. This "read serialization" behavior
of SQLite was a complete surprise to me.

Maybe instead of documenting it, we should just fix it. I personally
would rather write code than documentation ;-).

Boris


Re: [sqlite] Multi-Thread Reads to SQLite Database

2012-08-13 Thread Marc L. Allen
I don't believe I'm suggesting one mutex per page.

If I understand correctly, the purpose of the overall mutex is to prevent a 
page from being removed underneath a user.  If the standard DB locking 
semantics are working properly, I think there is no possibility of a page's 
data being modified underneath another user.  That would be no different
than a physical DB page being modified underneath another user.

If the above is true, cache protection semantics are strictly concerned with
page management.  That is, a page is requested that is not in the cache and 
needs to be inserted into it.  If the cache is full, another page needs to be 
released.  All that is required is protecting pages currently in use from being 
released.

I think, instead of a mutex serializing access to the entire cache, all that is 
needed is a mutex serializing access to the cache meta-data and the use of 
reference counts to help the page replacement algorithm make a good choice in 
which page to remove.
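Roughly, the sketch I have in mind looks like this (illustrative only, not SQLite's actual pager code; all names are made up):

```python
import threading

class PinnedPageCache:
    """Toy page cache: one mutex guards only the metadata; a per-page
    reference count pins in-use pages against eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.meta_lock = threading.Lock()   # protects pages/refcounts only
        self.pages = {}                     # page number -> page data
        self.refcounts = {}                 # page number -> pin count

    def pin(self, pgno, load):
        with self.meta_lock:                # short critical section
            if pgno not in self.pages:
                if len(self.pages) >= self.capacity:
                    self._evict_unpinned()
                # (a real pager would drop the lock during I/O; kept simple)
                self.pages[pgno] = load(pgno)
            self.refcounts[pgno] = self.refcounts.get(pgno, 0) + 1
            return self.pages[pgno]         # caller reads it outside the lock

    def unpin(self, pgno):
        with self.meta_lock:
            self.refcounts[pgno] -= 1

    def _evict_unpinned(self):
        # evict any page with no active users (a real pager would pick LRU)
        for victim, refs in self.refcounts.items():
            if refs == 0:
                del self.pages[victim]
                del self.refcounts[victim]
                return
        raise RuntimeError("all pages pinned")
```

Multiple readers would pin the same page, read it concurrently without holding any lock, and unpin when done; the mutex is held only long enough to update the metadata.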

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Simon Slavin
Sent: Monday, August 13, 2012 10:23 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Multi-Thread Reads to SQLite Database


On 13 Aug 2012, at 3:11pm, Marc L. Allen  wrote:

> I wonder if it would be possible to refine the cache locking mechanism.
> 
> If I understand the modified DB/Table locking semantics when running under a 
> shared-cache, it appears that the cache page should be protected against 
> readers and writers.

All shared assets must be protected against readers and writers.  A reader 
needs to know that nothing is going to scramble the data while it's reading.  A 
writer must know that nothing is reading the data it's about to change.  (A 
little handwaving and a few exceptions, but that's the gist of it.)

> Perhaps only the list of pages in the cache needs to be protected by a
> higher-level mutex?  That is, if you want access to a page, you grab the 
> mutex, flag the page as 'in-use' with a reference counter, release the mutex, 
> and go on about your business.

You are proposing one mutex per page.  This is a system which many DBMSes use 
but it would /greatly/ slow down SQLite.  Also it would complicate the code 
quite a bit since a write to one page often leaks over to neighbouring pages.

> If you have multiple readers, they would be able to access the physical page 
> concurrently.  When access is complete, the reference count would be 
> decremented.

To get the effect of this, simply stop using shared-cache.  Let each process 
have its own cache.  That way each process knows nothing is messing with its 
cache.

Simon.


Re: [sqlite] Multi-Thread Reads to SQLite Database

2012-08-13 Thread Simon Slavin

On 13 Aug 2012, at 3:11pm, Marc L. Allen  wrote:

> I wonder if it would be possible to refine the cache locking mechanism.
> 
> If I understand the modified DB/Table locking semantics when running under a 
> shared-cache, it appears that the cache page should be protected against 
> readers and writers.

All shared assets must be protected against readers and writers.  A reader 
needs to know that nothing is going to scramble the data while it's reading.  A 
writer must know that nothing is reading the data it's about to change.  (A 
little handwaving and a few exceptions, but that's the gist of it.)

> Perhaps only the list of pages in the cache needs to be protected by a
> higher-level mutex?  That is, if you want access to a page, you grab the 
> mutex, flag the page as 'in-use' with a reference counter, release the mutex, 
> and go on about your business.

You are proposing one mutex per page.  This is a system which many DBMSes use 
but it would /greatly/ slow down SQLite.  Also it would complicate the code 
quite a bit since a write to one page often leaks over to neighbouring pages.

> If you have multiple readers, they would be able to access the physical page 
> concurrently.  When access is complete, the reference count would be 
> decremented.

To get the effect of this, simply stop using shared-cache.  Let each process 
have its own cache.  That way each process knows nothing is messing with its 
cache.

Simon.


Re: [sqlite] Multi-Thread Reads to SQLite Database

2012-08-13 Thread Marc L. Allen
I wonder if it would be possible to refine the cache locking mechanism.

If I understand the modified DB/Table locking semantics when running under a 
shared-cache, it appears that the cache page should be protected against 
readers and writers.

Perhaps only the list of pages in the cache needs to be protected by a
higher-level mutex?  That is, if you want access to a page, you grab the mutex, 
flag the page as 'in-use' with a reference counter, release the mutex, and go 
on about your business.  If you have multiple readers, they would be able to 
access the physical page concurrently.  When access is complete, the reference 
count would be decremented.

Marc


Re: [sqlite] Multi-Thread Reads to SQLite Database

2012-08-13 Thread Pavel Ivanov
> "There are four levels to the shared-cache locking model: transaction
>  level locking, table level locking, cache level locking, and schema
>  level locking. They are described in the following four sub-sections."

This would be a bad way of documenting this because "cache level
locking" works on a completely different level than transaction, table
or schema level locking. The exclusive mutex is taken inside one call
to sqlite3_step() and released upon exit. But all the other "locks" have a
"meta-level" nature and persist between calls to sqlite3_*() functions.
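Schematically, the two lifetimes look something like this (toy pseudocode of the idea, not SQLite's real call structure):

```python
import threading

cache_mutex = threading.Lock()   # "cache level": held only inside one step()
table_lock = threading.Lock()    # "meta level": held across many step() calls

def step(read_page):
    # The cache-level mutex is acquired and released within a single call.
    with cache_mutex:
        return read_page()

def read_transaction(pages):
    # The table-level lock is taken once and spans every step() in between.
    with table_lock:
        return [step(lambda p=p: p) for p in pages]
```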


Pavel


On Mon, Aug 13, 2012 at 6:45 AM, Boris Kolpackov
 wrote:
> Hi Richard,
>
> Richard Hipp  writes:
>
>> In shared-cache mode, SQLite acquires a mutex on the shared cache, thus
>> serializing access.
>
> Oh, that's quite a surprise. I would never have guessed that something
> like this was going on just from reading the documentation page on shared
> cache:
>
> http://www.sqlite.org/sharedcache.html
>
> Specifically, quoting Section 2.2, "Table Level Locking":
>
> "When two or more connections use a shared-cache, locks are used to
>  serialize concurrent access attempts on a per-table basis. Tables
>  support two types of locks, "read-locks" and "write-locks". Locks
>  are granted to connections - at any one time, each database connection
>  has either a read-lock, write-lock or no lock on each database table.
>
>  At any one time, a single table may have any number of active read-locks
>  or a single active write lock. To read data [from] a table, a connection
>  must first obtain a read-lock. To write to a table, a connection must
>  obtain a write-lock on that table."
>
> From this I concluded at the time that there could be multiple connections
> reading from a table concurrently. But it appears I was wrong.
>
> Perhaps this page should be updated to reflect this. Specifically, the
> last paragraph in Section 2.0:
>
> "There are three levels to the shared-cache locking model, transaction
>  level locking, table level locking and schema level locking. They are
>  described in the following three sub-sections."
>
> Should probably read something along these lines:
>
> "There are four levels to the shared-cache locking model: transaction
>  level locking, table level locking, cache level locking, and schema
>  level locking. They are described in the following four sub-sections."
>
> Then a new section should be inserted between 2.2 and 2.3 that reads
> along these lines:
>
> "2.3 Cache Level Locking
>
>  When a connection needs to access (read or write) data from a table,
>  it must acquire an exclusive mutex on the shared cache. In particular,
>  this means that in the shared cache mode actual access to the database
>  pages, regardless of whether for reading or writing, is always serialized."
>
> If you find it suitable, feel free to use this to update the documentation.
>
> Boris


Re: [sqlite] Multi-Thread Reads to SQLite Database

2012-08-13 Thread Pavel Ivanov
Another idea for you:

> In our test, on the other hand, each thread queries its own set of
> rows from the table. So, based on your explanation, here each thread
> should end up with its own set of pages (more or less). However, even
> in this case, I still observe a near sequential performance.

Even though each thread works basically with its own set of pages they
all use the same cache and thus are subject to the combined limit on
total number of pages in the cache. And to keep that combined limit
working they need to maintain a combined LRU list of pages to know
which pages should be evicted from cache first in case that's needed.
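As a toy illustration (not SQLite's actual cache code): even threads touching disjoint pages serialize on the lock guarding the shared LRU order, because every access must update it.

```python
import threading
from collections import OrderedDict

class SharedLRUCache:
    """Toy shared cache: one LRU list for all threads, so even reads of
    disjoint pages serialize on the same lock to update recency order."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lock = threading.Lock()
        self.lru = OrderedDict()            # page number -> data, oldest first

    def get(self, pgno, load):
        with self.lock:
            if pgno in self.lru:
                self.lru.move_to_end(pgno)  # touch: update shared recency
            else:
                if len(self.lru) >= self.capacity:
                    self.lru.popitem(last=False)  # evict least recently used
                self.lru[pgno] = load(pgno)
            return self.lru[pgno]
```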


Pavel


On Mon, Aug 13, 2012 at 5:51 AM, Boris Kolpackov
 wrote:
> Hi Richard,
>
> Richard Hipp  writes:
>
>> In shared-cache mode, the page cache is shared across threads. That means
>> that each thread must acquire a mutex on the page cache in order to read
>> it. Which means that access to the page cache is serialized.
>
> I just ran our concurrency test in different configurations and I
> observe similar behavior. That is, in the shared-cache mode,
> read-only transactions on the same table are executed pretty much
> sequentially.
>
> Also, your explanation doesn't feel quite satisfactory to me. In his
> original email, Eric mentioned that his table contains just 50 rows.
> Surely all this data would be loaded into the cache the first time
> it is requested and then accessed concurrently by all the threads.
> The only way I can see how the sequential performance could be
> explained here is if the cache mutex did not distinguish between
> readers and writers (which would seem to be a fairly natural thing
> to do).
>
> In our test, on the other hand, each thread queries its own set of
> rows from the table. So, based on your explanation, here each thread
> should end up with its own set of pages (more or less). However, even
> in this case, I still observe near-sequential performance.
>
> Any idea what else might be going on here?
>
> Boris


Re: [sqlite] Multi-Thread Reads to SQLite Database

2012-08-13 Thread Boris Kolpackov
Hi Richard,

Richard Hipp  writes:

> In shared-cache mode, SQLite acquires a mutex on the shared cache, thus
> serializing access.

Oh, that's quite a surprise. I would never have guessed that something
like this was going on just from reading the documentation page on shared
cache:

http://www.sqlite.org/sharedcache.html

Specifically, quoting Section 2.2, "Table Level Locking":

"When two or more connections use a shared-cache, locks are used to
 serialize concurrent access attempts on a per-table basis. Tables
 support two types of locks, "read-locks" and "write-locks". Locks
 are granted to connections - at any one time, each database connection
 has either a read-lock, write-lock or no lock on each database table.

 At any one time, a single table may have any number of active read-locks
 or a single active write lock. To read data [from] a table, a connection
 must first obtain a read-lock. To write to a table, a connection must
 obtain a write-lock on that table."

From this I concluded at the time that there could be multiple connections
reading from a table concurrently. But it appears I was wrong.

Perhaps this page should be updated to reflect this. Specifically, the
last paragraph in Section 2.0:

"There are three levels to the shared-cache locking model, transaction
 level locking, table level locking and schema level locking. They are
 described in the following three sub-sections."

Should probably read something along these lines:

"There are four levels to the shared-cache locking model: transaction
 level locking, table level locking, cache level locking, and schema
 level locking. They are described in the following four sub-sections."

Then a new section should be inserted between 2.2 and 2.3 that reads
along these lines:

"2.3 Cache Level Locking

 When a connection needs to access (read or write) data from a table,
 it must acquire an exclusive mutex on the shared cache. In particular,
 this means that in the shared cache mode actual access to the database
 pages, regardless of whether for reading or writing, is always serialized."

If you find it suitable, feel free to use this to update the documentation.

Boris


Re: [sqlite] Multi-Thread Reads to SQLite Database

2012-08-13 Thread Marc L. Allen
The shared cache does not know that the table is small, nor that there is
nothing else to load.

When a thread accesses that shared cache, it must protect itself against the
data page it is on being modified underneath it, either because the page is
simply flushed from the cache (if the cache does such things) or because it is
replaced by another page when it happens to be the least recently used (or
whatever rule the cache uses).

(As with others, I haven't actually looked at the code.)

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Boris Kolpackov
Sent: Monday, August 13, 2012 8:51 AM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Multi-Thread Reads to SQLite Database

Hi Richard,

Richard Hipp  writes:

> In shared-cache mode, the page cache is shared across threads. That 
> means that each thread must acquire a mutex on the page cache in order 
> to read it. Which means that access to the page cache is serialized.

I just ran our concurrency test in different configurations and I observe
similar behavior. That is, in the shared-cache mode, read-only transactions on
the same table are executed pretty much sequentially.

Also, your explanation doesn't feel quite satisfactory to me. In his original 
email, Eric mentioned that his table contains just 50 rows.
Surely all this data would be loaded into the cache the first time it is 
requested and then accessed concurrently by all the threads.
The only way I can see how the sequential performance could be explained here 
is if the cache mutex did not distinguish between readers and writers (which 
would seem to be a fairly natural thing to do).

In our test, on the other hand, each thread queries its own set of rows from 
the table. So, based on your explanation, here each thread should end up with 
its own set of pages (more or less). However, even in this case, I still 
observe near-sequential performance.

Any idea what else might be going on here?

Boris


Re: [sqlite] Multi-Thread Reads to SQLite Database

2012-08-13 Thread Richard Hipp
On Mon, Aug 13, 2012 at 8:51 AM, Boris Kolpackov wrote:

>
> Also, your explanation doesn't feel quite satisfactory to me.
>

In shared-cache mode, SQLite acquires a mutex on the shared cache, thus
serializing access.

The mutex is acquired because it is usually needed.  You can argue that
there are special cases where the mutex is not strictly needed.  That point
is debatable.  But in any case, SQLite does not attempt to implement those
special cases.  For safety and reliability, SQLite always acquires a mutex
when accessing the shared cache.


-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] Multi-Thread Reads to SQLite Database

2012-08-13 Thread Boris Kolpackov
Hi Richard,

Richard Hipp  writes:

> In shared-cache mode, the page cache is shared across threads. That means
> that each thread must acquire a mutex on the page cache in order to read
> it. Which means that access to the page cache is serialized.

I just ran our concurrency test in different configurations and I
observe similar behavior. That is, in the shared-cache mode,
read-only transactions on the same table are executed pretty much
sequentially.

Also, your explanation doesn't feel quite satisfactory to me. In his
original email, Eric mentioned that his table contains just 50 rows.
Surely all this data would be loaded into the cache the first time
it is requested and then accessed concurrently by all the threads.
The only way I can see how the sequential performance could be
explained here is if the cache mutex did not distinguish between
readers and writers (which would seem to be a fairly natural thing
to do).
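For illustration, a cache mutex that did distinguish readers from writers would look something like this minimal reader-writer lock (a generic sketch, not SQLite code):

```python
import threading

class RWLock:
    """Minimal reader-writer lock: many readers may hold it at once,
    while a writer waits for exclusive access."""

    def __init__(self):
        self.cond = threading.Condition()
        self.readers = 0
        self.writing = False

    def acquire_read(self):
        with self.cond:
            while self.writing:          # readers only wait for a writer
                self.cond.wait()
            self.readers += 1

    def release_read(self):
        with self.cond:
            self.readers -= 1
            if self.readers == 0:
                self.cond.notify_all()   # wake a waiting writer

    def acquire_write(self):
        with self.cond:
            while self.writing or self.readers:
                self.cond.wait()         # writer waits for everyone
            self.writing = True

    def release_write(self):
        with self.cond:
            self.writing = False
            self.cond.notify_all()
```

With something like this on the cache, the 50-row read-only workload would not serialize.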

In our test, on the other hand, each thread queries its own set of
rows from the table. So, based on your explanation, here each thread
should end up with its own set of pages (more or less). However, even
in this case, I still observe near-sequential performance.

Any idea what else might be going on here?

Boris


Re: [sqlite] Multi-Thread Reads to SQLite Database

2012-08-10 Thread Richard Hipp
On Fri, Aug 10, 2012 at 12:18 PM, esum  wrote:

>
> However, when I ran this same test with SQLITE_OPEN_READWRITE |
> SQLITE_OPEN_SHAREDCACHE for the flags, I get the following [slower]
> results:
>
> Why am I seeing such a high increase in times as I add threads in shared
> cache mode as opposed to without it?
>

In shared-cache mode, the page cache is shared across threads.  That means
that each thread must acquire a mutex on the page cache in order to read
it.  Which means that access to the page cache is serialized.
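The effect can be seen with a small experiment (an illustrative Python stand-in, not SQLite itself): with a single exclusive mutex, at most one "reader" is ever inside the critical section, no matter how many threads run.

```python
import threading
import time

def run_readers(lock_factory, nthreads=4):
    """Run nthreads 'readers' that must hold the given lock while reading,
    and report the peak number of threads inside the read section."""
    lock = lock_factory()
    state = {"inside": 0, "peak": 0}
    state_lock = threading.Lock()
    start = threading.Barrier(nthreads)

    def reader():
        start.wait()                       # release all threads together
        with lock:                         # exclusive mutex around the "read"
            with state_lock:
                state["inside"] += 1
                state["peak"] = max(state["peak"], state["inside"])
            time.sleep(0.005)              # simulate reading a page
            with state_lock:
                state["inside"] -= 1

    threads = [threading.Thread(target=reader) for _ in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]
```

With `threading.Lock` the peak concurrency is always 1, which is exactly the serialization being described.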


-- 
D. Richard Hipp
d...@sqlite.org