Josef Bacik <jo...@redhat.com> writes:

> When doing DIO tracing I noticed we were doing a ton of allocations, a lot of
> the time for extent_states. Some of the time we don't even use the prealloc'ed
> extent_state, it just gets freed up. So instead create a per-cpu cache like
> the radix tree stuff. So we will check to see if our per-cpu cache has a
> prealloc'ed extent_state in it and if so we just continue, else we alloc a new
> one and fill the cache. Then if we need to use a prealloc'ed extent_state we
> can just take it out of our per-cpu cache. We will also refill the cache on
> free to try and limit the number of times we have to ask the allocator for
> caches. With this patch dbench 50 goes from ~210 mb/s to ~260 mb/s. Thanks,
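For reference, the scheme described above can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions, not the patch's code: the struct layout, the state_cache_* names, and the use of a one-slot thread-local cache with malloc() in place of a per-cpu slot and the kernel allocator are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct extent_state. */
struct extent_state {
	unsigned long start;
	unsigned long end;
};

/* One-slot cache per thread (assumption: the patch keeps one per CPU). */
static _Thread_local struct extent_state *cached_state;

/* Counts trips to the allocator; only meaningful in this single-threaded demo. */
static unsigned long allocator_calls;

/* Fill the slot if it is empty; a full slot costs no allocation. */
static int state_cache_prefill(void)
{
	if (cached_state)
		return 0;
	cached_state = malloc(sizeof(*cached_state));
	if (!cached_state)
		return -1;
	allocator_calls++;
	return 0;
}

/* Take the preallocated object out of the slot (NULL if the slot is empty). */
static struct extent_state *state_cache_take(void)
{
	struct extent_state *state = cached_state;

	cached_state = NULL;
	return state;
}

/* On free, refill an empty slot instead of handing the memory back. */
static void state_cache_free(struct extent_state *state)
{
	if (!state)
		return;
	if (!cached_state) {
		cached_state = state;
		return;
	}
	free(state);
}

/* Run prefill/take/free cycles and report how often the allocator was hit. */
static unsigned long state_cache_demo(int rounds)
{
	free(state_cache_take());	/* start from an empty slot */
	allocator_calls = 0;

	for (int i = 0; i < rounds; i++) {
		struct extent_state *state;

		if (state_cache_prefill())
			break;
		state = state_cache_take();
		assert(state != NULL);
		state->start = (unsigned long)i;
		state->end = (unsigned long)i + 1;
		state_cache_free(state);
	}
	return allocator_calls;
}
```

In this sketch, calling state_cache_demo(1000) hits malloc() only once, because every free refills the slot before the next prefill looks at it. That shows the intended saving in allocator calls, but it says nothing about where the cycles actually go in the kernel case.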
You're just reimplementing a poor man's custom slab cache -- all of this
is already done in slab. If the difference is really that big, wouldn't it
be better to fix slab so that everyone benefits?

Did you use slub or slab? Did you analyze where the cycles are spent?

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only