Thursday, April 15, 2004, 9:16:01 AM, Christian Smith wrote: > On Wed, 14 Apr 2004, Doug Currie wrote:
>>One way to get table level locking without a great deal of pain is to >>integrate the shadow paging ideas with BTree management. Rather than >>using page tables for the shadow pages, use the BTrees themselves. >>This means that any change to a BTree requires changes along the >>entire path back to the root so that only free pages are used to store >>new data, including the BTree itself. Writing the root page(s) of the >>BTree(s) commits the changes to that table (these tables). > Actually, this gets my vote. Keeps the pager layer the same, The pager gets *much* simpler because it doesn't need to make a log file. The log file is not necessary because writes only go to free pages. Well, there would be one write-ahead log. It's needed to prevent partial updates to the page number pointers to the roots page(s) of the BTree(s) at commit. This log is created at commit time, and is much simpler and much smaller than the present log file. > and only requires a cache of the root btree for each object > (table/index) in the database to be maintained on a per-transaction > basis Yes, you need to cache the page number of each BTree root at transaction start. You'd also need a forest of free pages organized by transaction so they can be freed at the right time (when the oldest read-transaction that can reference them has completed). > , reducing the complications of what to do under memory pressure > when pages are spilled from the cache as we should be able to keep > them in memory all the time. Yes. > Committing of a transaction would then be an atomic update root btree page > number in the catalog table. Yes, atomically for all the BTrees modified. This is probably a single page of data (4 to 8 bytes of root page number per BTree, i.e., per table and per index). Well, I usually assume fairly large pages compared with SQLite's default of 1K. Using larger pages also decreases the depth of the BTree which reduces the number of pages written. This design works well. It has the advantage (compared with shadow pager) that reads are not burdened with page table indirection. It has the potential disadvantage (compared with SQLite 2.8) that small writes can modify several pages (based on the depth of the BTree). I used this design in a proprietary database in the late 1980s. The only reason I didn't consider modifying SQLite this way up until now is that I was anticipating BTree changes for 3.0, so I confined my efforts to the pager layer. e --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]