On Thu, 2008-01-10 at 10:28 -0800, Linus Torvalds wrote: > > On Thu, 10 Jan 2008, Matt Mackall wrote: > > > > > > (I'm not a fan of slabs per se - I think all the constructor/destructor > > > crap is just that: total crap - but the size/type binning is a big deal, > > > and I think SLOB was naïve to think a pure first-fit makes any sense. Now > > > you guys are size-binning by just two or three bins, and it seems to make > > > a difference for some loads, but compared to SLUB/SLAB it's a total hack). > > > > Here I'm going to differ with you. The premises of the SLAB concept > > (from the original paper) are: > > I really don't think we differ. > > The advantage of slab was largely the binning by type. Everything else was > just a big crock. SLUB does the binning better, by really just making the > type binning be about what really matters - the *size* of the type. > > So my argument was that the type/size binning makes sense (size more so > than type), but the rest of the original Sun arguments for why slab was > such a great idea were basically just the crap. > > Hard type binning was a mistake (but needed by slab due to the idiotic > notion that constructors/destructors are "good for caches" - bleargh). I > suspect that hard size binning is a mistake too (ie there are probably > cases where you do want to split unused bigger size areas), but the fact > that all of our allocators are two-level (with the page allocator acting > as a size-agnostic free space) may help it somewhat. > > And yes, I do agree that any current allocator has problems with the big > sizes that don't fit well into a page or two (like task_struct). That > said, most of those don't have lots of allocations under many normal > circumstances (even if there are uses that will really blow them up). > > The *big* slab users at least for me tend to be ext3_inode_cache and > dentry. Everything else is orders of magnitude less. And of the two bad > ones, ext3_inode_cache is the bad one at 700+ bytes or whatever (resulting > in ~10% fragmentation just due to the page thing, regardless of whether > you use an order-0 or order-1 page allocation). > > Of course, dentries fit better in a page (due to being smaller), but then > the bigger number of dentries per page make it harder to actually free > pages, so then you get fragmentation from that. Oh well. You can't win.
One idea I've been kicking around is pushing the boundary for the buddy allocator back a bit (to 64k, say) and using SL*B under that. The page allocators would call into buddy for larger than 64k (rare!) and SL*B otherwise. This would let us greatly improve our handling of things like task structs and skbs and possibly also things like 8k stacks and jumbo frames. As SL*B would never be competing with the page allocator for contiguous pages (the buddy allocator's granularity would be 64k), I don't think this would exacerbate the page-level fragmentation issues. Crazy? -- Mathematics is the supreme nostalgia of our time. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/