>       at least one location:
>
>       When adding a new dva node into the tree, a kmem_alloc is done with
>       a KM_SLEEP argument.
>
>       Thus, this process thread could block waiting for memory.
>
>       I would suggest adding a pre-allocated pool of dva nodes.

This is how the Solaris memory allocator works.  It keeps pools of
"pre-allocated" nodes around until memory runs low.

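For concreteness, a minimal sketch of the object-cache interfaces in
question, using kmem_cache_create()/kmem_cache_alloc()/kmem_cache_free();
the dva_node_t type, its fields and the cache name are made up purely for
illustration:

#include <sys/types.h>
#include <sys/kmem.h>

typedef struct dva_node {                       /* illustrative only */
        struct dva_node *dn_next;
        uint64_t         dn_offset;
} dva_node_t;

static kmem_cache_t *dva_node_cache;

void
dva_node_cache_init(void)
{
        /* One cache per object type; the allocator keeps constructed,
         * "pre-allocated" buffers around in per-CPU magazines until the
         * system runs low on memory and reclaims them. */
        dva_node_cache = kmem_cache_create("dva_node_cache",
            sizeof (dva_node_t), 0, NULL, NULL, NULL, NULL, NULL, 0);
}

dva_node_t *
dva_node_alloc(void)
{
        /* Normally satisfied from the per-CPU magazine without sleeping;
         * KM_SLEEP only blocks when the whole system is short of memory. */
        return (kmem_cache_alloc(dva_node_cache, KM_SLEEP));
}

void
dva_node_free(dva_node_t *dn)
{
        /* Freed buffers go back into the magazine, ready for reuse. */
        kmem_cache_free(dva_node_cache, dn);
}
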
>       When a new dva node is needed, first check this pre-allocated
>       pool and allocate from there.

There are two reasons why this is a really bad idea:

        - the system will run out of memory even sooner if people
          start building their own free-lists

        - a single freelist does not scale; at two CPUs it becomes
          the allocation bottleneck, as the sketch below illustrates
          (I've measured and removed two such bottlenecks from Solaris 9)
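
To illustrate the second point, this is roughly what such a private
freelist ends up looking like (all names hypothetical, dva_node_t as in
the sketch above); note that every alloc and free on every CPU funnels
through the one lock:

#include <sys/types.h>
#include <sys/kmem.h>
#include <sys/mutex.h>

typedef struct dva_node {                       /* illustrative only */
        struct dva_node *dn_next;
} dva_node_t;

static kmutex_t    dva_free_lock;               /* one lock ... */
static dva_node_t *dva_free_list;               /* ... one list */

dva_node_t *
dva_node_alloc_private(void)
{
        dva_node_t *dn;

        mutex_enter(&dva_free_lock);            /* every CPU serializes here */
        if ((dn = dva_free_list) != NULL)
                dva_free_list = dn->dn_next;
        mutex_exit(&dva_free_lock);

        if (dn == NULL)                         /* pool empty: back to kmem anyway */
                dn = kmem_alloc(sizeof (dva_node_t), KM_SLEEP);
        return (dn);
}

void
dva_node_free_private(dva_node_t *dn)
{
        mutex_enter(&dva_free_lock);            /* ... and here on every free */
        dn->dn_next = dva_free_list;
        dva_free_list = dn;
        mutex_exit(&dva_free_lock);
}

The list also never gives anything back, which is the first point: that
memory just sits there even when the rest of the system is starved.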


You might want to learn how the Solaris memory allocator works;
it pretty much does what you want, except that it is all part of the
framework.  And, just as in your case, it does sometimes run out of
memory, but a private freelist does not help against that.
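
If the real worry is sleeping in the allocator, the flag is the thing to
change, not the data structure.  A rough sketch, assuming the
dva_node_cache from the sketch above:

#include <sys/kmem.h>

typedef struct dva_node dva_node_t;             /* illustrative, as above */

extern kmem_cache_t *dva_node_cache;            /* from the sketch above */

dva_node_t *
dva_node_tryalloc(void)
{
        /* KM_NOSLEEP returns NULL instead of blocking when memory is
         * tight; the caller has to back off or push the error upward.
         * A private freelist, filled from the same kmem pool, fails in
         * exactly the same situations. */
        return (kmem_cache_alloc(dva_node_cache, KM_NOSLEEP));
}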

>       Why? This would eliminate a possible sleep condition if memory
>            is not immediately available. The pool would add a working
>            set of dva nodes that could be monitored. Per-alloc latencies
>            could be amortized over a chunk allocation.

That's how the Solaris memory allocator already works.
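
Very roughly, the amortization comes from the per-CPU magazine layer
described in Bonwick and Adams, "Magazines and Vmem" (USENIX 2001).  The
sketch below is just the idea, with MAG_SIZE and magazine_refill() as
placeholders; the real code lives inside kmem, not in its callers:

#define MAG_SIZE        15                      /* illustrative; real sizes vary */

typedef struct cpu_magazine {
        int   mag_rounds;                       /* objects currently loaded */
        void *mag_objs[MAG_SIZE];               /* pre-allocated objects */
} cpu_magazine_t;

extern void *magazine_refill(cpu_magazine_t *); /* placeholder depot call */

void *
magazine_alloc(cpu_magazine_t *mag)
{
        /* Fast path: take an object from this CPU's magazine; no global
         * lock is touched. */
        if (mag->mag_rounds > 0)
                return (mag->mag_objs[--mag->mag_rounds]);

        /* Slow path: swap in a full magazine from the depot.  The depot
         * lock is taken once here, so its cost is spread over the next
         * MAG_SIZE allocations; the per-alloc latency is amortized over
         * a chunk, as the poster asked for. */
        return (magazine_refill(mag));
}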

Casper