Hi guys, while 1.0.0-M3 is being voted, I started to work on the next iteration. Kiran is also working on the bulk load tool at the same time, so I will work on the transaction support.
Updating the bTree is a fairly costly operation, as we have to copy many pages, reclaim some free pages to store new data, and update the header many times. We could save many of those writes if we do it at the end of the update. One of the biggest save would be to write the header only once (currently, it's being updated every time we ask for a free page). But even if we implement such a mechanism, there are cases where we would like to save more. Here are two possible use cases : - we are injecting many entries in a BTree - we are updating many BTrees but we want a consistant backend even if there is a crash in the middle of all those updates, across oall the updated BTrees (this is typically what we have to guarantee when updating entries in LDAP). So where does it bring us ? We need to avoid writing to disk as much as we can up to the point we can do it safely. The idea is to copy the pages in memory, without flushing them on disk, until we commit (or rollback) the operation. If we have to copy the same page more than once, then we will update the copy in memory. In any case, we operate on a given revision for all the operation, regardless the number of updates. When the user is done, we have two use cases : - a commit is called : we flush on disk the pages we have stored in memory, and when it's done, we update the header. We are all good. - a rollback is called : there is nothing else to do than to discard all the pages we copied in memory The only pb is to manage the free pages and the deleted pages. The free pages will be reclaimed using the existing list of free pages, and eventually from the end of the file if we don't have enough free pages. This is quite easy. The deleted pages is more complicated to deal with, as we may have many pages we will free. But again, those deleted pages can be put into the free page list. Worst case is if we can't update the copiedPage BTree : we may lose some pages (but that would not corrupt the backend). This is what I foresee, I need to go a bit deeper in the algorithm, but I don't think I'm too far from something that flies. ATM, I will probaby work on teh insert() operation only, and on a BTree only. Thoughts ? -- Regards, Cordialement, Emmanuel Lécharny www.iktek.com
