On Fri, 22 Feb 2013 14:21:58 -0800
Tejun Heo <t...@kernel.org> wrote:

> Hello, Andrew.
> 
> On Fri, Feb 22, 2013 at 2:16 PM, Andrew Morton
> <a...@linux-foundation.org> wrote:
> > Given that this code is performing allocations under
> > spin_lock_irqsave(), it should be using idr_preload()/idr_preload_end()
> > (and perhaps even a repeat loop around those) to make the allocations
> > more reliable.
> 
> It's already using preload so GFP_NOWAIT is enough to guarantee
> GFP_KERNEL level allocation.
> 

Not "guarantee".

- idr_preload() might fail.  It is a design error that idr_preload()
  returns void - it should do what radix_tree_preload() does.

- even if idr_preload() completed the preload, an interrupt might
  come in and its handler might drain this cpu's preload pool.

This all sounds like crazy will-never-happen stuff.  That's what I
thought about the radix-tree code when used for pagecache.  But under
extreme worloads, the cant-happens kept on happening, which is how all
that funky preload/preempt-disable stuff occurred.

It would be more robust still if the per-cpu preload magazines were
per-radixtree and per-idr, rather than kernel-wide.  That would reduce
the risk of interrupt-time stealing.  Or make foo_preload() return with
interrupts disabled rather than jsut preempt_disable().

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to