On 18/12/06, Ingo Molnar <[EMAIL PROTECTED]> wrote:
* Catalin Marinas <[EMAIL PROTECTED]> wrote: > >> [...] It could be so simple that it would never need to free any > >> pages, just grow the size as required and reuse the freed memleak > >> objects from a list. > > > >sounds good to me. Please make it a per-CPU pool. > > Isn't there a risk for the pools to become imbalanced? A lot of > allocations would initially happen on the first CPU. hm, what's the problem with imbalance? These are trees and imbalance isnt a big issue.
It could just be more available (freed) memleak_objects on one CPU than on the others and use more memory. Not a big problem though.
> > We'll have to fix the locking too, to be per-CPU - memleak_lock is > > quite a scalability problem right now. > > The memleak_lock is indeed too coarse (but it was easier to track the > locking dependencies). With a new allocator, however, I could do a > finer grain locking. It probably still needs a (rw)lock for the hash > table. Having per-CPU hash tables is inefficient as we would have to > look up all the tables at every freeing or scanning for the > corresponding memleak_object. at freeing we only have to look up the tree belonging to object->cpu.
At freeing, kmemleak only gets a pointer value which has to be looked up in the hash table for the corresponding memleak_object. Only after that, we can know memleak_object->cpu. That's why I think we only need to have a global hash table. The hash table look-up can be RCU. It would work with per-CPU hash tables but we still need to look-up the other hash tables in case the freeing happened on a different CPU (i.e. look-up the current hash table and, if it fails, look-up the other per-CPU hashes). Freeing would need to remove the entry from the hash table and acquire a lock but this would be per-CPU. I'm not sure how often you get this scenario (allocation and freeing on different CPUs) but it might introduce an overhead to the memory freeing. Do you have a better solution here?
> There is a global object_list as well covered by memleak_lock (only > for insertions/deletions as traversing is RCU). [...] yeah, that would have to become per-CPU too.
That's not that difficult but, as above, we need the hash table look-up before we find which list it is on.
> [...] List insertion/deletion is very small compared to the hash-table > look-up and it wouldn't introduce a scalability problem. it's a common misconception to think that 'small' critical sections are fine. That's not the issue. The pure fact of having globally modified resource is the problem, the lock cacheline would ping-pong, etc.
You are right but I didn't mean that small critical sections are better, just that in case we need a critical section for the global hash table look-up, extending this critical region with list addition/deletion wouldn't make things any worse (than they are, regarding scalability). -- Catalin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/