Re: Bisected GFP in bfq_bfqq_expire on v5.1-rc1
> On 1 Apr 2019, at 10:55, Dmitrii Tcvetkov wrote:
>
> On Mon, 1 Apr 2019 09:29:16 +0200
> Paolo Valente wrote:
>>
>>> On 29 Mar 2019, at 15:10, Jens Axboe wrote:
>>>
>>> On 3/29/19 7:02 AM, Dmitrii Tcvetkov wrote:
>>>> Hi,
>>>>
>>>> I got kernel panic since v5.1-rc1 when working with files on block
>>>> device with BFQ scheduler assigned. I didn't find trivial way to
>>>> reproduce the panic but "git checkout origin/linux-5.0.y"
>>>> on linux-stable-rc[1] git repo on btrfs filesystem reproduces the
>>>> problem 100% of the time on my bare-metal machine and in a VM.
>>>>
>>>> Bisect led me to commit 9dee8b3b057e1 (block, bfq: fix queue
>>>> removal from weights tree). After reverting this commit on top of
>>>> current mainline master(9936328b41ce) I can't reproduce the
>>>> problem.
>>>>
>>>> dmesg with the panic and bisect log attached.
>>>>
>>>> [1]
>>>> https://kernel.googlesource.com/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
>>>
>>> Paolo, can you please take a look at this?
>>>
>>
>> Yep.
>>
>> Thank you very much Dmitrii for also bisecting. I feel like this
>> failure may be caused by the typo fixed by this patch:
>> https://patchwork.kernel.org/patch/10877113/
>>
>> Could you please give this fix a try?
>
> Still reproduces with the patch on top of current mainline
> master(v5.1-rc3).
>
> Crashes with and without CONFIG_BFQ_GROUP_IOSCHED look same to me.
> Original dmesg was also from kernel with CONFIG_BFQ_GROUP_IOSCHED=n.
>
> gpf.txt contains crash with the patch and CONFIG_BFQ_GROUP_IOSCHED=n
> gpf-w-bfq-group-iosched.txt - with the patch and CONFIG_BFQ_GROUP_IOSCHED=y
> config.txt - kernel config for the VM with CONFIG_BFQ_GROUP_IOSCHED=n
>

Ok, thank you. Could you please do a

list *(bfq_bfqq_expire+0x1f3)

for me?

Thanks,
Paolo
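For anyone following along: `list *(bfq_bfqq_expire+0x1f3)` is a gdb command, typically run against a vmlinux built with debug info, that maps the faulting offset back to a source line. The sketch below shows how that command can be derived from the RIP line of an oops. The helper name `gdb_list_command` and the regular expression are illustrative assumptions, not part of any kernel tooling.

```python
import re

# Matches the "symbol+offset/size" part of an oops RIP line, e.g.
# "RIP: 0010:bfq_bfqq_expire+0x1f3/0x3d0".
# Hypothetical helper pattern, written for this example only.
RIP_RE = re.compile(
    r"RIP:\s+[0-9a-fA-F]+:"
    r"(?P<sym>[A-Za-z_][\w.]*)\+(?P<off>0x[0-9a-fA-F]+)/(?P<size>0x[0-9a-fA-F]+)"
)


def gdb_list_command(line):
    """Return the gdb 'list' command for the faulting address, or None."""
    m = RIP_RE.search(line)
    if m is None:
        return None
    return "list *({}+{})".format(m.group("sym"), m.group("off"))


if __name__ == "__main__":
    oops_line = "[ 24.002359][C0] RIP: 0010:bfq_bfqq_expire+0x1f3/0x3d0"
    print(gdb_list_command(oops_line))
```

The printed string is what one would type at the gdb prompt after starting `gdb vmlinux`, assuming the vmlinux matches the crashing kernel build.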
Re: Bisected GFP in bfq_bfqq_expire on v5.1-rc1
On Mon, 1 Apr 2019 09:29:16 +0200
Paolo Valente wrote:

> > On 29 Mar 2019, at 15:10, Jens Axboe wrote:
> >
> > On 3/29/19 7:02 AM, Dmitrii Tcvetkov wrote:
> >> Hi,
> >>
> >> I got kernel panic since v5.1-rc1 when working with files on block
> >> device with BFQ scheduler assigned. I didn't find trivial way to
> >> reproduce the panic but "git checkout origin/linux-5.0.y"
> >> on linux-stable-rc[1] git repo on btrfs filesystem reproduces the
> >> problem 100% of the time on my bare-metal machine and in a VM.
> >>
> >> Bisect led me to commit 9dee8b3b057e1 (block, bfq: fix queue
> >> removal from weights tree). After reverting this commit on top of
> >> current mainline master(9936328b41ce) I can't reproduce the
> >> problem.
> >>
> >> dmesg with the panic and bisect log attached.
> >>
> >> [1]
> >> https://kernel.googlesource.com/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> >
> > Paolo, can you please take a look at this?
> >
>
> Yep.
>
> Thank you very much Dmitrii for also bisecting. I feel like this
> failure may be caused by the typo fixed by this patch:
> https://patchwork.kernel.org/patch/10877113/
>
> Could you please give this fix a try?

Still reproduces with the patch on top of current mainline
master(v5.1-rc3).

Crashes with and without CONFIG_BFQ_GROUP_IOSCHED look same to me.
Original dmesg was also from kernel with CONFIG_BFQ_GROUP_IOSCHED=n.
gpf.txt contains crash with the patch and CONFIG_BFQ_GROUP_IOSCHED=n
gpf-w-bfq-group-iosched.txt - with the patch and CONFIG_BFQ_GROUP_IOSCHED=y
config.txt - kernel config for the VM with CONFIG_BFQ_GROUP_IOSCHED=n

[ 23.996750][C0] general protection fault: [#1] SMP PTI
[ 23.998228][C0] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G T 5.1.0-rc3-ARCH-test #5
[ 24.000351][C0] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[ 24.002359][C0] RIP: 0010:bfq_bfqq_expire+0x1f3/0x3d0
[ 24.003630][C0] Code: 01 00 00 00 00 00 00 a8 02 75 3c 41 83 ec 01 41 83 fc 01 76 32 48 0f ba ab 08 01 00 00 03 48 8b 83 c8 00 00 00 48 85 c0 74 07 40 40 00 00 00 00 5b 5d 41 5c 41 5d 41 5e c3 e8 38 3 4 d3 ff 48
[ 24.008227][C0] RSP: 0018:97577ba03ed0 EFLAGS: 00010002
[ 24.009521][C0] RAX: 6b6b6b6b6b6b6b6b RBX: 975777e5ad98 RCX: 001b
[ 24.011155][C0] RDX: 0046 RSI: RDI: 9757796d8100
[ 24.012796][C0] RBP: 975777672948 R08: R09: 975777e5af00
[ 24.014428][C0] R10: 975777e5ad98 R11: 0001 R12:
[ 24.016058][C0] R13: 0002 R14: 975777672900 R15: 97577ba1b340
[ 24.017691][C0] FS: () GS:97577ba0() knlGS:
[ 24.019514][C0] CS: 0010 DS: ES: CR0: 80050033
[ 24.020862][C0] CR2: 75243010 CR3: 00017a7d CR4: 000406b0
[ 24.022490][C0] Call Trace:
[ 24.023179][C0]
[ 24.023757][C0] bfq_idle_slice_timer+0x5f/0xb0
[ 24.024864][C0] ? bfq_dispatch_request+0x870/0x870
[ 24.026125][C0] __hrtimer_run_queues+0xf4/0x1a0
[ 24.027350][C0] hrtimer_interrupt+0xfe/0x220
[ 24.028521][C0] smp_apic_timer_interrupt+0x57/0x90
[ 24.029816][C0] apic_timer_interrupt+0xf/0x20
[ 24.030997][C0]
[ 24.031695][C0] RIP: 0010:default_idle+0x9/0x20
[ 24.032911][C0] Code: 01 00 00 00 00 ad de 48 89 44 24 20 48 05 00 01 00 00 48 89 44 24 28 eb c5 e8 93 62 8a ff 90 90 90 65 8b 05 d9 f9 60 5d fb f4 <65> 8b 05 d0 f9 60 5d c3 66 66 2e 0f 1f 84 00 00 00 00 0 0 0f 1f 40
[ 24.037738][C0] RSP: 0018:a3003ee0 EFLAGS: 0246 ORIG_RAX: ff13
[ 24.039781][C0] RAX: RBX: RCX: 0001
[ 24.041682][C0] RDX: 633a RSI: 0083 RDI:
[ 24.043566][C0] RBP: a3079870 R08: R09:
[ 24.045508][C0] R10: R11: R12: a30154c0
[ 24.047433][C0] R13: 97577fff9a00 R14: 7bdd3d68 R15: 7e797c3e
[ 24.049370][C0] do_idle+0xd6/0x100
[ 24.050327][C0] cpu_startup_entry+0x14/0x20
[ 24.051474][C0] start_kernel+0x44d/0x46d
[ 24.052562][C0] secondary_startup_64+0xa4/0xb0
[ 24.053782][C0] Modules linked in:
[ 24.054727][C0] ---[ end trace 4834b676d8758fa9 ]---
[ 24.056060][C0] RIP: 0010:bfq_bfqq_expire+0x1f3/0x3d0
[ 24.057399][C0] Code: 01 00 00 00 00 00 00 a8 02 75 3c 41 83 ec 01 41 83 fc 01 76 32 48 0f ba ab 08 01 00 00 03 48 8b 83 c8 00 00 00 48 85 c0 74 07 40 40 00 00 00 00 5b 5d 41 5c 41 5d 41 5e c3 e8 38 3 4 d3 ff 48
[ 24.062130][C0] RSP: 0018:97577ba03ed0 EFLAGS: 00010002
[ 24.063630][C0] RAX: 6b6b6b6b6b6b6b6b RBX: 975777e5ad98 RCX: 001b
[ 24.065640][C0]
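
A note on reading the dump above: RAX holds 6b6b6b6b6b6b6b6b, and 0x6b is the POISON_FREE byte (include/linux/poison.h) that the slab allocator writes over freed objects when poisoning is enabled. That would be consistent with bfq_bfqq_expire loading a pointer from already-freed memory, though this is only a reading of the registers and not something confirmed in the thread. A minimal sketch for spotting such values in a register dump follows; the helper name `poisoned_registers` is hypothetical.

```python
import re

POISON_FREE = 0x6B  # value of POISON_FREE in include/linux/poison.h

# Register name followed by a full 16-digit hex value. Shorter values
# (some digits appear to be missing in the archived dump) are skipped.
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}):\s+([0-9a-fA-F]{16})\b")


def poisoned_registers(dump):
    """Return names of registers whose bytes are all POISON_FREE."""
    hits = []
    for name, value in REG_RE.findall(dump):
        if all(b == POISON_FREE for b in bytes.fromhex(value)):
            hits.append(name)
    return hits


if __name__ == "__main__":
    line = "RAX: 6b6b6b6b6b6b6b6b RBX: 975777e5ad98 RCX: 001b"
    print(poisoned_registers(line))
```

A hit only says the value looks like freed-slab poison; which object was freed still has to come from the source line at the faulting offset.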