On Sat, 2005-08-27 at 10:53 -0700, Andrew Morton wrote: > I'd say the main naivety in gang lookup is the awkward top-level iteration > algorithm. The way it bales out all the way to the top level of the tree > once __lookup() hits the end of the slots[] array, even though results[] > isn't full yet. It's surely possible to go back up the tree just a > sufficient distance to resume the iteration, rather than all the way to the > top. But it's hard, and it's all in CPU cache anyway there. > > It would be much simpler if it was using recursion, of course.
I agree; I actually checked this point: most page gang lookups do have to restart the search. At least using a bitmap gets it back on point much faster. The page radix tree lookups are usually at most four levels deep, anyway. > > This patch replaces > > the integer count with an unsigned long representing the bitmap of > > occupied elements. We then use that bitmap to find the first occupied > > entry instead of looping over all the entries from the beginning of the > > radix node. > > But only in __lookup(). I think most gang lookups use > radix_tree_gang_lookup_tag() -> __lookup_tag(). > > And __lookup_tag() could use find_next_bit() on the tags anyway, as the > comment says. I spent a bit of time doing that, had some bug, shelved it, > never got on to fixing it up. > > There's a userspace test/devel setup at > http://www.zip.com.au/~akpm/linux/patches/stuff/rtth.tar.gz, btw. OK ... I'll take a look. I didn't mean to do this, it's just that for the idr replacement code I had to use bitmap lookup, so this seemed like a natural precursor. > > The penalty of doing this is that on 32 bit machines, the size of the > > radix tree array is reduced from 64 to 32 (so an unsigned long can > > represent the bitmap). > > If we did the bitmap lookup in __lookup_tag() we wouldn't have this > restriction. > > Maybe we can > > a) fix radix_tree_gang_lookup() to use find_next_bit() Well, not quite; with the size changes, the tag map now never overflows an unsigned long. > b) remove radix_tree_node.count yes, did that. > c) Add a new tag field which simply means "present" > d) remove radix_tree_gang_lookup() and __lookup() altogether > e) Implement radix_tree_gang_lookup() via radix_tree_gang_lookup_tag() > > That would involve setting and clearing bit in the "present" tag field when > adding and removing items. OK, I see how to do all of this using the currently implemented logic (the occupied word is what you would like to be the present tag). I'll see what I can do. > > I also exported radix_tree_preload() so modules can make use of radix > > trees. > > uh, OK. Note that radix_tree_preload() uses prempt_disable() protection. > So it has the limitation that the guarantee which it provides will become > unreliable, kernel-wide, if anyone anywhere tries to do a > radix_tree_insert() from interrupt context. radix_tree_insert() is reliable from IRQ provided you don't try to use radix_tree_preload() and you defined your radix tree gfp flag to be GFP_ATOMIC. preloading is only optional, and should only be done really if you have process context to preload with GFP_KERNEL. Preloading with GFP_ATOMIC is pretty pointless since radix_tree_insert() will also try to allocate with the radix tree flags. James - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

