> Date: Thu, 22 Apr 2021 14:43:24 +0200
> From: Alexandr Nedvedicky <alexandr.nedvedi...@oracle.com>
> 
> Hello,
> 
> On Thu, Apr 22, 2021 at 01:09:34PM +0200, Alexander Bluhm wrote:
> > On Thu, Apr 22, 2021 at 12:33:13PM +0200, Hrvoje Popovski wrote:
> > > r620-1# papnpaiancini:cc :p :op
> > > opooolo_llc_ac_caccahhceh_ei_eti_tieetmme_mm__amgamigacigci__cc_hccehhcekcekc::
> > > k :m  bmubmfubfuppflp llc pc pcuup  uf rfferree eel el iilsitss tm tom
> > > omddoidfiiifeifeidde:d ::i ti etietmme m
> > > a  daddardd rd0 r0
> > > xx0fxfffffffffffffddf88d08c0cc0c6c76afc9b3f04500400++01+61 610 6x0
> > > fx0fxffffffffffdffdf88d08
> > > 00020720d72a8c0049703eb!ef!e==!0=x009x59x95995b9ebbaee3ae3ae344ef54f5a4bff7db07990a9
> > 
> > Wow.  3 CPUs panic in pool_cache_get() pool_cache_item_magic_check
> > simultaneously.  This makes me think we may have a bug there.
> > 
> 
>     I took a look at arch/amd64/include/intrdefs.h where interrupt
>     priorities are defined.
> 
>     IPL_NET has priority set to 7,
>     IPL_SOFTNET has higher priority set 5
> 
>     all allocations are coming from mbpool via m_gethdr(), interrupt
>     level priority for mbpool is set to IPL_NET. If I understand
>     code in m_pool_get() right, then the pool_cache_enter() does not
>     stop guys who call m_gethdr() with IPL_SOFTNET.
> 
>     if we put KERNEL_LOCK() there the problem is gone, most likely
>     because the IPL_SOFTNET guy waits for KERNEL_LOCK therefore it
>     can not interfere with our IPL_NET task, which forwards packet.
> 
>     I admit it's a poor speculation, I have no 'hard proof' for my
>     claim here.  So I might be very wrong here.

Not sure what you are trying to say here, but IPL_SOFTNET is lower
than IPL_NET.  So code that runs at IPL_SOFTNET will raise the IPL to
IPL_NET in pool_cache_enter(), blocking IPL_NET interrupts until
pool_cache_leave() is called and the IPL is lowered again to
IPL_SOFTNET.
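
To make the raise/restore behaviour concrete, here is a minimal
user-space sketch.  It is not the pool code: every sim_* name is
hypothetical, and the "an interrupt is delivered only if its level is
above the current one" rule is a simplified model.  Only the two IPL
values are taken from the numbers quoted above.

```c
#include <assert.h>

/* IPL values as quoted earlier in this thread (amd64 intrdefs.h). */
#define IPL_SOFTNET	5
#define IPL_NET		7

/* Hypothetical stand-in for the per-CPU interrupt priority level. */
static int sim_ipl;

/* Raise the level to at least `want`, never lower it; return the old level. */
static int
sim_splraise(int want)
{
	int old = sim_ipl;

	if (want > sim_ipl)
		sim_ipl = want;
	return old;
}

/* Restore a level previously returned by sim_splraise(). */
static void
sim_splx(int old)
{
	sim_ipl = old;
}

/* Simplified model: deliver an interrupt only if its level is above ours. */
static int
sim_can_deliver(int level)
{
	return level > sim_ipl;
}

/*
 * Model of softnet code entering and leaving a section protected at
 * IPL_NET.  Returns 1 if IPL_NET interrupts were blocked inside the
 * section and the level was back at IPL_SOFTNET afterwards, else 0.
 */
static int
sim_softnet_section(void)
{
	int s, blocked;

	sim_ipl = IPL_SOFTNET;		/* we start at softnet level */
	assert(sim_can_deliver(IPL_NET));	/* net interrupts still possible */

	s = sim_splraise(IPL_NET);	/* "enter": raise to the pool's IPL */
	blocked = !sim_can_deliver(IPL_NET);

	sim_splx(s);			/* "leave": drop back to the old level */
	return blocked && sim_ipl == IPL_SOFTNET;
}
```

In this model, code that is already at IPL_NET and asks for
IPL_SOFTNET stays at IPL_NET, because the raise never lowers the
level.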

I'm fairly confident the "normal" pools are mpsafe; we have been using
those in concurrent contexts without holding the kernel lock for a
long time already.  But the pool cache layer is still relatively new...
