The root cause of such a problem is that we don't do I/O scheduling at all. I think that pool(9) (and pool_cache(9)) is a great tool for managing limited resources. The way to go would be to extend it to manage bandwidth in I/O subsystems. (We already use pool_cache(9) for KVA.)
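To make the idea slightly more concrete, a bandwidth-like resource managed through pool_cache(9) might look roughly like the sketch below. This is only a hand-drawn illustration against the pool_cache(9) API: the "ioband" name, the `struct ioband_slot` type, and the parameters are all hypothetical, not anything that exists in the tree.

```
/* Sketch only: hypothetical "I/O bandwidth slot" resource built on
 * the pool_cache(9) API.  Names and sizes are made up. */
pool_cache_t ioband_cache;

ioband_cache = pool_cache_init(sizeof(struct ioband_slot), 0, 0, 0,
    "ioband", NULL, IPL_BIO, NULL, NULL, NULL);

/* A consumer takes a bandwidth slot, possibly sleeping for it ... */
slot = pool_cache_get(ioband_cache, PR_WAITOK);
/* ... performs the I/O, then returns the slot to the pool. */
pool_cache_put(ioband_cache, slot);
```

The hierarchy idea would then be expressed through the pool backend: a child pool's allocator callback draws from its parent, so pressure propagates up the chain.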
Resources are shared, and have dependencies. I/O resources would form some hierarchical structure chained with pool backend callbacks, probably combined with the device tree too.

That said, I have no idea how to *fix* our networking stack. It's unmaintainable IMO...

On Thu, Oct 14, 2010 at 05:07:39PM +0900, o...@netbsd.org wrote:
> Hello,
>
> Mindaugas said:
> > Can you explain what the actual problem M_WAIT causes i.e. how would
> > it "freeze network stack"?
>
> In the M_WAIT case, once the mclpool limit is reached, m_clget() will
> run m_reclaim() again and again until it gets an mbuf cluster. If
> m_reclaim() runs repeatedly but cannot free up a cluster, m_clget()
> will never return.
>
> Network stack code that uses mbufs with M_DONTWAIT will fail to get a
> new mbuf cluster in this case. That is what "freeze" means.
>
> Generally NMBCLUSTERS is large enough and SB_MAX is small, so the
> problem rarely occurs. It depends on the parameter configuration, I
> think.
>
> In other words, to trigger this problem:
> - many packets are received from the network, waiting to be read by
>   userland
> - a userland program sends many packets, but does not read packets
>   from the stack
> - the mclpool limit is reached
> - no one calls pool_put()
>
> In the M_DONTWAIT case, the M_EXT flag reflects the result of
> m_clget(). The code that checks M_EXT is already in place, but it was
> not used with M_WAIT. I think this is a careless mistake.
>
> Thanks.
> --
> Masaru OKI <o...@netbsd.org>