Re: [sqlite] Multi-Thread Reads to SQLite Database
Hi Marc, "Marc L. Allen" writes: > If I understand correctly, the purpose of the overall mutex is to prevent a > page from being removed underneath a user. If the standard DB locking > semantics are working properly, I think there is no possibility of a page's > data from being modified underneath another user. Yes, that's my understanding also. This is all taken care of at higher (transaction and table) locking levels. > If the above it true, cache protection semantics are strictly concerned with > page management. That is, a page is requested that is not in the cache and > needs to be inserted into it. If the cache is full, another page needs to > be released. All that is required is protecting pages currently in use from > being released. > > I think, instead of a mutex serializing access to the entire cache, all > that is needed is a mutex serializing access to the cache meta-data and > the use of reference counts to help the page replacement algorithm make > a good choice in which page to remove. All sounds very sensible to me. Boris -- Boris Kolpackov, Code Synthesishttp://codesynthesis.com/~boris/blog Compiler-based ORM system for C++ http://codesynthesis.com/products/odb Open-source XML data binding for C++ http://codesynthesis.com/products/xsd XML data binding for embedded systems http://codesynthesis.com/products/xsde ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-Thread Reads to SQLite Database
Hi Pavel, Pavel Ivanov writes: > > "There are four levels to the shared-cache locking model: transaction > > level locking, table level locking, cache level locking, and schema > > level locking. They are described in the following three sub-sections." > > This would be a bad way of documenting this because "cache level > locking" works on a completely different level than transaction, table > or schema level locking. The exclusive mutex is taken inside one call > to sqlite3_step() and released upon exit. But all other "locks" have > "meta-level" notion and work between calls to sqlite3_*() functions. I agree. But I also think that documenting it like this is better than not documenting it at all. This "read serialization" behavior of SQLite was a complete surprise to me. Maybe instead of documenting it, we should just fix it. I personally will rather write code than documentation ;-). Boris -- Boris Kolpackov, Code Synthesishttp://codesynthesis.com/~boris/blog Compiler-based ORM system for C++ http://codesynthesis.com/products/odb Open-source XML data binding for C++ http://codesynthesis.com/products/xsd XML data binding for embedded systems http://codesynthesis.com/products/xsde ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-Thread Reads to SQLite Database
I don't believe I'm suggesting one mutex per page. If I understand correctly, the purpose of the overall mutex is to prevent a page from being removed underneath a user. If the standard DB locking semantics are working properly, I think there is no possibility of a page's data from being modified underneath another user. That would be no different than a physical DB page being modified underneath another user. If the above it true, cache protection semantics are strictly concerned with page management. That is, a page is requested that is not in the cache and needs to be inserted into it. If the cache is full, another page needs to be released. All that is required is protecting pages currently in use from being released. I think, instead of a mutex serializing access to the entire cache, all that is needed is a mutex serializing access to the cache meta-data and the use of reference counts to help the page replacement algorithm make a good choice in which page to remove. -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Simon Slavin Sent: Monday, August 13, 2012 10:23 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Multi-Thread Reads to SQLite Database On 13 Aug 2012, at 3:11pm, Marc L. Allen wrote: > I wonder if it would be possible to refine the cache locking mechanism. > > If I understand the modified DB/Table locking semantics when running under a > shared-cache, it appears that the cache page should be protected against > readers and writers. All shared assets must be protected against readers and writers. A reader needs to know that nothing is going to scramble the data while it's reading. A writer must know that nothing is reading the data it's about to change. (A little handwaving and a few exceptions, but that's the gist of it.) > Perhaps only the list of pages in the cache need to be protected by a > higher-level mutex? That is, if you want access to a page, you grab the > mutex, flag the page as 'in-use' with a reference counter, release the mutex, > and go on about your business. You are proposing one mutex per page. This is a system which many DBMSes use but it would /greatly/ slow down SQLite. Also it would complicate the code quite a bit since a write to one page often leaks over to neighbouring pages. > If you have multiple readers, they would be able to access the physical page > concurrently. When access is complete, the reference count would be > decremented. To get the effect of this, simply stop using shared-cache. Let each process have its own cache. That way each process knows nothing is messing with its cache. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-Thread Reads to SQLite Database
On 13 Aug 2012, at 3:11pm, Marc L. Allen wrote: > I wonder if it would be possible to refine the cache locking mechanism. > > If I understand the modified DB/Table locking semantics when running under a > shared-cache, it appears that the cache page should be protected against > readers and writers. All shared assets must be protected against readers and writers. A reader needs to know that nothing is going to scramble the data while it's reading. A writer must know that nothing is reading the data it's about to change. (A little handwaving and a few exceptions, but that's the gist of it.) > Perhaps only the list of pages in the cache need to be protected by a > higher-level mutex? That is, if you want access to a page, you grab the > mutex, flag the page as 'in-use' with a reference counter, release the mutex, > and go on about your business. You are proposing one mutex per page. This is a system which many DBMSes use but it would /greatly/ slow down SQLite. Also it would complicate the code quite a bit since a write to one page often leaks over to neighbouring pages. > If you have multiple readers, they would be able to access the physical page > concurrently. When access is complete, the reference count would be > decremented. To get the effect of this, simply stop using shared-cache. Let each process have its own cache. That way each process knows nothing is messing with its cache. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-Thread Reads to SQLite Database
I wonder if it would be possible to refine the cache locking mechanism. If I understand the modified DB/Table locking semantics when running under a shared-cache, it appears that the cache page should be protected against readers and writers. Perhaps only the list of pages in the cache need to be protected by a higher-level mutex? That is, if you want access to a page, you grab the mutex, flag the page as 'in-use' with a reference counter, release the mutex, and go on about your business. If you have multiple readers, they would be able to access the physical page concurrently. When access is complete, the reference count would be decremented. Marc ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-Thread Reads to SQLite Database
> "There are four levels to the shared-cache locking model: transaction > level locking, table level locking, cache level locking, and schema > level locking. They are described in the following three sub-sections." This would be a bad way of documenting this because "cache level locking" works on a completely different level than transaction, table or schema level locking. The exclusive mutex is taken inside one call to sqlite3_step() and released upon exit. But all other "locks" have "meta-level" notion and work between calls to sqlite3_*() functions. Pavel On Mon, Aug 13, 2012 at 6:45 AM, Boris Kolpackov wrote: > Hi Richard, > > Richard Hipp writes: > >> In shared-cache mode, SQLite acquires a mutex on the shared cache, thus >> serializing access. > > Oh, that's quite a surprise. I would have never said that something > like this is going on just by reading the documentation page on shared > cache: > > http://www.sqlite.org/sharedcache.html > > Specifically, quoting Section 2.2, "Table Level Locking": > > "When two or more connections use a shared-cache, locks are used to > serialize concurrent access attempts on a per-table basis. Tables > support two types of locks, "read-locks" and "write-locks". Locks > are granted to connections - at any one time, each database connection > has either a read-lock, write-lock or no lock on each database table. > > At any one time, a single table may have any number of active read-locks > or a single active write lock. To read data [from] a table, a connection > must first obtain a read-lock. To write to a table, a connection must > obtain a write-lock on that table." > > From this I at the time concluded that there could be multiple connections > reading from a table concurrently. But it appears I was wrong. > > Perhaps this page should be updated to reflect this. Specifically, the > last paragraph in Section 2.0: > > "There are three levels to the shared-cache locking model, transaction > level locking, table level locking and schema level locking. They are > described in the following three sub-sections." > > Should probably read something along these lines: > > "There are four levels to the shared-cache locking model: transaction > level locking, table level locking, cache level locking, and schema > level locking. They are described in the following three sub-sections." > > Then a new section should be inserted between 2.2 and 2.3 that reads > along these lines: > > "2.3 Cache Level Locking > > When a connection needs to access (read or write) data from a table, > it must acquire an exclusive mutex on the shared cache. In particular, > this means that in the shared cache mode actual access to the database > pages, regardless of whether for reading or writing, is always serialized." > > If you find it suitable, feel free to use this to update the documentation. > > Boris > -- > Boris Kolpackov, Code Synthesishttp://codesynthesis.com/~boris/blog > Compiler-based ORM system for C++ http://codesynthesis.com/products/odb > Open-source XML data binding for C++ http://codesynthesis.com/products/xsd > XML data binding for embedded systems http://codesynthesis.com/products/xsde > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-Thread Reads to SQLite Database
Another idea for you: > In our test, on the other hand, each thread queries its own set of > rows from the table. So, based on your explanation, here each thread > should end up with its own set of pages (more or less). However, even > in this case, I still observe a near sequential performance. Even though each thread works basically with its own set of pages they all use the same cache and thus are subject to the combined limit on total number of pages in the cache. And to keep that combined limit working they need to maintain a combined LRU list of pages to know which pages should be evicted from cache first in case that's needed. Pavel On Mon, Aug 13, 2012 at 5:51 AM, Boris Kolpackov wrote: > Hi Richard, > > Richard Hipp writes: > >> In shared-cache mode, the page cache is shared across threads. That means >> that each thread must acquire a mutex on the page cache in order to read >> it. Which means that access to the page cache is serialized. > > I just ran our concurrency test in different configurations and I > observer a similar behavior. That is, in the shared-cache mode, > read-only transactions on the same table are executed pretty much > sequentially. > > Also, your explanation doesn't feel quite satisfactory to me. In his > original email, Eric mentioned that his table contains just 50 rows. > Surely all this data would be loaded into the cache the first time > it is requested and then accessed concurrently by all the threads. > The only way I can see how the sequential performance could be > explained here is if the cache mutex did not distinguish between > readers and writers (which would seem to be a fairly natural thing > to do). > > In our test, on the other hand, each thread queries its own set of > rows from the table. So, based on your explanation, here each thread > should end up with its own set of pages (more or less). However, even > in this case, I still observe a near sequential performance. > > Any idea what else might be going on here? > > Boris > -- > Boris Kolpackov, Code Synthesishttp://codesynthesis.com/~boris/blog > Compiler-based ORM system for C++ http://codesynthesis.com/products/odb > Open-source XML data binding for C++ http://codesynthesis.com/products/xsd > XML data binding for embedded systems http://codesynthesis.com/products/xsde > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-Thread Reads to SQLite Database
Hi Richard, Richard Hipp writes: > In shared-cache mode, SQLite acquires a mutex on the shared cache, thus > serializing access. Oh, that's quite a surprise. I would have never said that something like this is going on just by reading the documentation page on shared cache: http://www.sqlite.org/sharedcache.html Specifically, quoting Section 2.2, "Table Level Locking": "When two or more connections use a shared-cache, locks are used to serialize concurrent access attempts on a per-table basis. Tables support two types of locks, "read-locks" and "write-locks". Locks are granted to connections - at any one time, each database connection has either a read-lock, write-lock or no lock on each database table. At any one time, a single table may have any number of active read-locks or a single active write lock. To read data [from] a table, a connection must first obtain a read-lock. To write to a table, a connection must obtain a write-lock on that table." >From this I at the time concluded that there could be multiple connections reading from a table concurrently. But it appears I was wrong. Perhaps this page should be updated to reflect this. Specifically, the last paragraph in Section 2.0: "There are three levels to the shared-cache locking model, transaction level locking, table level locking and schema level locking. They are described in the following three sub-sections." Should probably read something along these lines: "There are four levels to the shared-cache locking model: transaction level locking, table level locking, cache level locking, and schema level locking. They are described in the following three sub-sections." Then a new section should be inserted between 2.2 and 2.3 that reads along these lines: "2.3 Cache Level Locking When a connection needs to access (read or write) data from a table, it must acquire an exclusive mutex on the shared cache. In particular, this means that in the shared cache mode actual access to the database pages, regardless of whether for reading or writing, is always serialized." If you find it suitable, feel free to use this to update the documentation. Boris -- Boris Kolpackov, Code Synthesishttp://codesynthesis.com/~boris/blog Compiler-based ORM system for C++ http://codesynthesis.com/products/odb Open-source XML data binding for C++ http://codesynthesis.com/products/xsd XML data binding for embedded systems http://codesynthesis.com/products/xsde ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-Thread Reads to SQLite Database
The shared cache does not know that the table is small nor that there is nothing else to load. When a thread accesses that shared cache, it must protect itself from the data page it's on being modified, either because the page is simply flushed from the cache (if the cache does such things) or replaced by another page because this one happened to be the least recently used (or whatever rule the cache uses). (As with others, I haven't actually looked at the code.) -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Boris Kolpackov Sent: Monday, August 13, 2012 8:51 AM To: sqlite-users@sqlite.org Subject: Re: [sqlite] Multi-Thread Reads to SQLite Database Hi Richard, Richard Hipp writes: > In shared-cache mode, the page cache is shared across threads. That > means that each thread must acquire a mutex on the page cache in order > to read it. Which means that access to the page cache is serialized. I just ran our concurrency test in different configurations and I observer a similar behavior. That is, in the shared-cache mode, read-only transactions on the same table are executed pretty much sequentially. Also, your explanation doesn't feel quite satisfactory to me. In his original email, Eric mentioned that his table contains just 50 rows. Surely all this data would be loaded into the cache the first time it is requested and then accessed concurrently by all the threads. The only way I can see how the sequential performance could be explained here is if the cache mutex did not distinguish between readers and writers (which would seem to be a fairly natural thing to do). In our test, on the other hand, each thread queries its own set of rows from the table. So, based on your explanation, here each thread should end up with its own set of pages (more or less). However, even in this case, I still observe a near sequential performance. Any idea what else might be going on here? Boris -- Boris Kolpackov, Code Synthesishttp://codesynthesis.com/~boris/blog Compiler-based ORM system for C++ http://codesynthesis.com/products/odb Open-source XML data binding for C++ http://codesynthesis.com/products/xsd XML data binding for embedded systems http://codesynthesis.com/products/xsde ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-Thread Reads to SQLite Database
On Mon, Aug 13, 2012 at 8:51 AM, Boris Kolpackov wrote: > > Also, your explanation doesn't feel quite satisfactory to me. > In shared-cache mode, SQLite acquires a mutex on the shared cache, thus serializing access. The mutex is acquired because it is usually needed. You can argue that there are special cases where the mutex is not strictly needed. That point is debatable. But in any case, SQLite does not attempt to implement those special cases. For safety and reliability, SQLite always acquires a mutex when accessing the shared cache. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-Thread Reads to SQLite Database
Hi Richard, Richard Hipp writes: > In shared-cache mode, the page cache is shared across threads. That means > that each thread must acquire a mutex on the page cache in order to read > it. Which means that access to the page cache is serialized. I just ran our concurrency test in different configurations and I observer a similar behavior. That is, in the shared-cache mode, read-only transactions on the same table are executed pretty much sequentially. Also, your explanation doesn't feel quite satisfactory to me. In his original email, Eric mentioned that his table contains just 50 rows. Surely all this data would be loaded into the cache the first time it is requested and then accessed concurrently by all the threads. The only way I can see how the sequential performance could be explained here is if the cache mutex did not distinguish between readers and writers (which would seem to be a fairly natural thing to do). In our test, on the other hand, each thread queries its own set of rows from the table. So, based on your explanation, here each thread should end up with its own set of pages (more or less). However, even in this case, I still observe a near sequential performance. Any idea what else might be going on here? Boris -- Boris Kolpackov, Code Synthesishttp://codesynthesis.com/~boris/blog Compiler-based ORM system for C++ http://codesynthesis.com/products/odb Open-source XML data binding for C++ http://codesynthesis.com/products/xsd XML data binding for embedded systems http://codesynthesis.com/products/xsde ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-Thread Reads to SQLite Database
On Fri, Aug 10, 2012 at 12:18 PM, esum wrote: > > However, when I ran this same test with SQLITE_OPEN_READWRITE | > SQLITE_OPEN_SHAREDCACHE for the flags, I get the following [slower] > results: > > Why am I seeing such a high increase in times as I add threads in shared > cache mode as opposed to without it? > In shared-cache mode, the page cache is shared across threads. That means that each thread must acquire a mutex on the page cache in order to read it. Which means that access to the page cache is serialized. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users