On 2008-06-16 23:37, Matthew Dillon wrote: > :<[EMAIL PROTECTED]> wrote: > :> I am not too familiar with Reiser so I can't really come to that > :> conclusion, but it doesn't seem likely that the reasons are similar. > : > :They used B+ in earlier versions. In Reiser4 they decided to use > : > :http://en.wikipedia.org/wiki/B*-tree > : > :and > : > :http://en.wikipedia.org/wiki/Dancing_trees > : > :-- > :Dan > > I can never keep all the variations straight, I need a Wiki reference > every time to get the name right :-). > > These are all minor variations of B+Trees. The B* variation keeps > nodes 2/3rds full instead of half full. It is an optimization which > is intended to maintain higher fill ratios to make better use of system > caches. > > A Dancing tree ... hehehehe. Looks like Hans invented that one. I'm > not sure it even counts as a variation of a B+Tree. It doesn't apply > to HAMMER at all because HAMMER makes no modifications to the B-Tree > whatsoever on the front-end. When you create, delete, rename, write, > etc... when you do those operations HAMMER caches them in a > virtualization layer in memory and doesn't make any modifications to > its on-media data structures (or their in-memory representations) at > all until the meta-data is synced to disk. > > HAMMER doesn't have to make real-time on-media modifications for any > meta-data change for ANY operation. Not a single one. Not rename, not > hard or soft links, not truncations, nothing. > > HAMMER uses a modified B+Tree for its on-disk representation, which is > a B-Tree with only keys at internal nodes and only records at the leafs. > This was done to reduce structural bloat, allow for a leaf->leaf > linking optimization in the future, and for other reasons. The Wiki > definition isn't quite right. The leaf->leaf links are optional (and > HAMMER doesn't implement them, though I've left room to do so). It's > still a B+Tree if you don't have them. > > HAMMER's custom modification is to formalize both a left AND a right > bounding element. So a HAMMER B+Tree looks like this: > > B B B B B B <---- internal node (e.g. 6 elements w/ 5 leaf ptrs) > L L L L L <---- leaf nodes > > Whereas a normal B+Tree looks like this (note the missing right bound): > > B B B B B <---- internal node > L L L L L <---- leaf nodes > > Note that HAMMER uses a radix of 64, so e.g. 64 internal elements but one > is a right-bound so only 63 pointers to the next level. The leafs then > have 64 elements. So HAMMER's B+Tree recursion is 63x63x63....x64. > > HAMMER's internal nodes have a left and right bounding element. A > standard B+Tree only has a left bounding element. By adding a right > bounding element HAMMER can cache pointers into its B+Tree and 'pick up' > searches, insertions, and deletions relative to the cached pointers > instead > of having to start at the root of the tree. More importantly, it can > pickup searches, insertions, and deletions at internal nodes, not just > leaf nodes. So I can cache a proximity pointer and if I do a good job > I never have to traverse the B+Tree above that point. > > Without the right bound a search cannot be picked up from mid-tree > without potentially going down the wrong branch and then having to > backtrack.
But the case when you have to backtrack should be easily detected (when the sought element is greater than the right-most bound), right? And with a radix of 64 that would on average only happen on every 64'th lookup. I'm not arguing the effectiveness of your implementation, just checking my understanding of B+trees. -- Erik Wikström
