On Sat, Feb 18, 2017 at 6:28 PM, Dmitry Vyukov <dvyu...@google.com> wrote:
> On Fri, Feb 10, 2017 at 6:17 AM, Jason Wang <jasow...@redhat.com> wrote:
>>
>>
>> On 2017-02-10 02:10, Michael S. Tsirkin wrote:
>>>
>>> On Thu, Feb 09, 2017 at 05:02:31AM -0500, Jason Wang wrote:
>>>>
>>>> ----- Original Message -----
>>>>>
>>>>> Hello,
>>>>>
>>>>> I've got the following report while running syzkaller fuzzer on mmotm
>>>>> (git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git)
>>>>> remotes/mmotm/auto-latest ee4ba7533626ba7bf2f8b992266467ac9fdc045e:
>>>>>
>>>> [...]
>>>>
>>>>> other info that might help us debug this:
>>>>>
>>>>> Possible interrupt unsafe locking scenario:
>>>>>
>>>>>        CPU0                    CPU1
>>>>>        ----                    ----
>>>>>   lock(&(&r->consumer_lock)->rlock);
>>>>>                                local_irq_disable();
>>>>>                                lock(&(&r->producer_lock)->rlock);
>>>>>                                lock(&(&r->consumer_lock)->rlock);
>>>>>   <Interrupt>
>>>>>     lock(&(&r->producer_lock)->rlock);
>>>>>
>>>> Thanks a lot for the testing.
>>>>
>>>> Looks like we could address this by using skb_array_consume_bh() instead.
>>>>
>>>> Could you pls verify if the following patch works?
>>>
>>> I think we should use _bh for the produce call as well,
>>> since resizing takes the producer lock.
>>
>> Looks not since irq was disabled during resizing?
>
>
> Hello,
>
> Is there a fix for this that we can pick up?
> This killed 10'000 VMs on our testing infra over the last day. Still
> happening on linux-next.

Ah, sorry, I see the patch above with skb_array_consume_bh. It's just that it's not in linux-next. Will manually apply it now then. Should we also do something with produce_skb?
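
For anyone following along, the rule behind the quoted lockdep report can be modeled outside the kernel. What follows is only a toy Python sketch, not kernel code and not how lockdep is implemented: the class `ToyLockdep`, the function `scenario`, and the flag `consume_with_irqs_off` are hypothetical names made up for illustration. It assumes the simplified rule "a lock that is ever taken in interrupt context must not be held while acquiring a lock that is ever taken with interrupts enabled", and it checks direct dependencies only.

```python
from collections import defaultdict


class ToyLockdep:
    """Toy model (hypothetical, simplified) of the irq-safe -> irq-unsafe check."""

    def __init__(self):
        self.irq_safe = set()          # locks ever acquired inside an interrupt
        self.irq_unsafe = set()        # locks ever acquired with interrupts enabled
        self.after = defaultdict(set)  # after[a] = locks acquired while holding a

    def acquire(self, lock, held=(), in_irq=False, irqs_enabled=True):
        if in_irq:
            self.irq_safe.add(lock)
        elif irqs_enabled:
            self.irq_unsafe.add(lock)
        for h in held:
            self.after[h].add(lock)

    def violations(self):
        # Report every direct "irq-safe lock held -> irq-unsafe lock taken" pair.
        return sorted(
            (a, b)
            for a in self.irq_safe
            for b in self.after.get(a, ())
            if b in self.irq_unsafe
        )


def scenario(consume_with_irqs_off):
    """Replay the four acquisitions from the quoted report."""
    ld = ToyLockdep()
    # CPU0: consumer path takes consumer_lock.
    ld.acquire("consumer_lock", irqs_enabled=not consume_with_irqs_off)
    # CPU1: local_irq_disable(), then producer_lock, then consumer_lock.
    ld.acquire("producer_lock", irqs_enabled=False)
    ld.acquire("consumer_lock", held=("producer_lock",), irqs_enabled=False)
    # CPU0: an interrupt arrives and takes producer_lock.
    ld.acquire("producer_lock", in_irq=True)
    return ld.violations()


if __name__ == "__main__":
    print(scenario(consume_with_irqs_off=False))
    print(scenario(consume_with_irqs_off=True))
```

In this model the first call reports the producer_lock -> consumer_lock pair, and the second reports nothing once the consumer side no longer takes its lock with interrupts enabled, which is the idea behind switching to skb_array_consume_bh(). The model lumps softirq and hardirq together, so it says nothing about whether the produce side needs the same treatment.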