Re: kernel 4.8-rc5 kernel BUG at block/blk-core.c:2032!
On Fri, Sep 09, 2016 at 08:03:42PM +0200, Stefan Priebe - Profihost AG wrote:
> On 08.09.2016 at 19:33, Shaohua Li wrote:
> > On Thu, Sep 08, 2016 at 10:16:59AM -0600, Jens Axboe wrote:
> >> On 09/08/2016 02:23 AM, Stefan Priebe - Profihost AG wrote:
> >>> Hi,
> >>>
> >>> while trying Kernel 4.8-rc5 my raid5 breaks every few minutes.
> >>>
> >>> Trace:
> >>> [ cut here ]
> >>> kernel BUG at block/blk-core.c:2032!
> >>> invalid opcode: [#1] SMP
> >>> Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 xt_multiport
> >>> iptable_filter ip_tables x_tables 8021q garp bonding sb_edac edac_core
> >>> x86_pkg_temp_thermal coretemp kvm_intel kvm i2c_i801 irqbypass i2c_smbus
> >>> ipmi_si crc32_pclmul i2c_core ghash_clmulni_intel shpchp ipmi_msghandler
> >>> button loop fuse btrfs dm_mod raid10 raid0 multipath linear raid456
> >>> async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq
> >>> raid1 md_mod sg sd_mod ixgbe i40e mdio usbhid ehci_pci ehci_hcd ahci
> >>> usbcore ptp libahci usb_common megaraid_sas pps_core
> >>> CPU: 8 PID: 1105 Comm: md0_raid5 Not tainted 4.8.0-rc5-3-g3abda5c #2
> >>> Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015
> >>> task: 97de5e1e task.stack: 97de597a
> >>> RIP: 0010:[] []
> >>> generic_make_request+0x1c0/0x1d0
> >>> RSP: 0018:97de597a3aa0 EFLAGS: 00010286
> >>> RAX: 97de5e1e RBX: 97dd227e5030 RCX:
> >>> RDX: c001 RSI: 0001 RDI: 97de5e7d9db8
> >>> RBP: 97de597a3ad8 R08: 0008 R09:
> >>> R10: R11: 0001 R12:
> >>> R13: 97de5aa20c00 R14: 02f0 R15: 97e65dce0e00
> >>> FS: () GS:97e67f20()
> >>> knlGS:
> >>> CS: 0010 DS: ES: CR0: 80050033
> >>> CR2: 7f0e4e1ec000 CR3: 78c06000 CR4: 001406e0
> >>> Stack:
> >>> 97de597a3b50 1000 97dd227e4c80
> >>> 97de5aa20c00 02f0 97e65dce0e00 97de597a3ba0
> >>> c02595db c025e04b 0001597a3b01 00020006
> >>> Call Trace:
> >>> [] ops_run_io+0x3bb/0x990 [raid456]
> >>> [] ? raid_run_ops+0xefb/0x1520 [raid456]
> >>> [] handle_stripe+0x9a6/0x2280 [raid456]
> >>> [] ? default_wake_function+0x12/0x20
> >>> [] ? autoremove_wake_function+0x12/0x40
> >>> [] handle_active_stripes.isra.54+0x193/0x4b0 [raid456]
> >>> [] ? __release_stripe+0x15/0x20 [raid456]
> >>> [] raid5d+0x4a9/0x740 [raid456]
> >>> [] ? init_timer_key+0xa0/0xa0
> >>> [] md_thread+0x12b/0x130 [md_mod]
> >>> [] ? wait_woken+0x90/0x90
> >>> [] ? find_pers+0x70/0x70 [md_mod]
> >>> [] kthread+0xdb/0x100
> >>> [] ret_from_fork+0x1f/0x40
> >>> [] ? kthread_park+0x60/0x60
> >>> Code: bd 70 08 00 00 f0 49 83 ad 70 08 00 00 01 74 05 e9 5a ff ff ff 41
> >>> ff 95 80 08 00 00 e9 4e ff ff ff 48 c7 40 08 00 00 00 00 eb 8c <0f> 0b
> >>> 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
> >>> RIP [] generic_make_request+0x1c0/0x1d0
> >>> RSP
> >>> ---[ end trace 457dbe5e9cdd3473 ]---
> >>
> >> CC'ing Shaohua - this is:
> >>
> >> BUG_ON(bio->bi_next);
> >>
> >> which doesn't look healthy.
> >
> > Hi Stefan,
> > does below patch help? Looks there is a race condition introduced recently.
>
> Yes this one fixes it.

Thanks, will push to Linus soon.
Re: kernel 4.8-rc5 kernel BUG at block/blk-core.c:2032!
On 08.09.2016 at 19:33, Shaohua Li wrote:
> On Thu, Sep 08, 2016 at 10:16:59AM -0600, Jens Axboe wrote:
>> On 09/08/2016 02:23 AM, Stefan Priebe - Profihost AG wrote:
>>> Hi,
>>>
>>> while trying Kernel 4.8-rc5 my raid5 breaks every few minutes.
>>>
>>> Trace:
>>> [ cut here ]
>>> kernel BUG at block/blk-core.c:2032!
>>> invalid opcode: [#1] SMP
>>> Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 xt_multiport
>>> iptable_filter ip_tables x_tables 8021q garp bonding sb_edac edac_core
>>> x86_pkg_temp_thermal coretemp kvm_intel kvm i2c_i801 irqbypass i2c_smbus
>>> ipmi_si crc32_pclmul i2c_core ghash_clmulni_intel shpchp ipmi_msghandler
>>> button loop fuse btrfs dm_mod raid10 raid0 multipath linear raid456
>>> async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq
>>> raid1 md_mod sg sd_mod ixgbe i40e mdio usbhid ehci_pci ehci_hcd ahci
>>> usbcore ptp libahci usb_common megaraid_sas pps_core
>>> CPU: 8 PID: 1105 Comm: md0_raid5 Not tainted 4.8.0-rc5-3-g3abda5c #2
>>> Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015
>>> task: 97de5e1e task.stack: 97de597a
>>> RIP: 0010:[] []
>>> generic_make_request+0x1c0/0x1d0
>>> RSP: 0018:97de597a3aa0 EFLAGS: 00010286
>>> RAX: 97de5e1e RBX: 97dd227e5030 RCX:
>>> RDX: c001 RSI: 0001 RDI: 97de5e7d9db8
>>> RBP: 97de597a3ad8 R08: 0008 R09:
>>> R10: R11: 0001 R12:
>>> R13: 97de5aa20c00 R14: 02f0 R15: 97e65dce0e00
>>> FS: () GS:97e67f20() knlGS:
>>> CS: 0010 DS: ES: CR0: 80050033
>>> CR2: 7f0e4e1ec000 CR3: 78c06000 CR4: 001406e0
>>> Stack:
>>> 97de597a3b50 1000 97dd227e4c80
>>> 97de5aa20c00 02f0 97e65dce0e00 97de597a3ba0
>>> c02595db c025e04b 0001597a3b01 00020006
>>> Call Trace:
>>> [] ops_run_io+0x3bb/0x990 [raid456]
>>> [] ? raid_run_ops+0xefb/0x1520 [raid456]
>>> [] handle_stripe+0x9a6/0x2280 [raid456]
>>> [] ? default_wake_function+0x12/0x20
>>> [] ? autoremove_wake_function+0x12/0x40
>>> [] handle_active_stripes.isra.54+0x193/0x4b0 [raid456]
>>> [] ? __release_stripe+0x15/0x20 [raid456]
>>> [] raid5d+0x4a9/0x740 [raid456]
>>> [] ? init_timer_key+0xa0/0xa0
>>> [] md_thread+0x12b/0x130 [md_mod]
>>> [] ? wait_woken+0x90/0x90
>>> [] ? find_pers+0x70/0x70 [md_mod]
>>> [] kthread+0xdb/0x100
>>> [] ret_from_fork+0x1f/0x40
>>> [] ? kthread_park+0x60/0x60
>>> Code: bd 70 08 00 00 f0 49 83 ad 70 08 00 00 01 74 05 e9 5a ff ff ff 41
>>> ff 95 80 08 00 00 e9 4e ff ff ff 48 c7 40 08 00 00 00 00 eb 8c <0f> 0b
>>> 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
>>> RIP [] generic_make_request+0x1c0/0x1d0
>>> RSP
>>> ---[ end trace 457dbe5e9cdd3473 ]---
>>
>> CC'ing Shaohua - this is:
>>
>> BUG_ON(bio->bi_next);
>>
>> which doesn't look healthy.
>
> Hi Stefan,
> does below patch help? Looks there is a race condition introduced recently.

Yes this one fixes it.

Thanks.

Stefan

>
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index b95c54c..ee7fc37 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -2423,10 +2423,10 @@ static void raid5_end_read_request(struct bio * bi)
>  		}
>  	}
>  	rdev_dec_pending(rdev, conf->mddev);
> +	bio_reset(bi);
>  	clear_bit(R5_LOCKED, &sh->dev[i].flags);
>  	set_bit(STRIPE_HANDLE, &sh->state);
>  	raid5_release_stripe(sh);
> -	bio_reset(bi);
>  }
>
>  static void raid5_end_write_request(struct bio *bi)
> @@ -2498,6 +2498,7 @@ static void raid5_end_write_request(struct bio *bi)
>  	if (sh->batch_head && bi->bi_error && !replacement)
>  		set_bit(STRIPE_BATCH_ERR, &sh->batch_head->state);
>
> +	bio_reset(bi);
>  	if (!test_and_clear_bit(R5_DOUBLE_LOCKED, &sh->dev[i].flags))
>  		clear_bit(R5_LOCKED, &sh->dev[i].flags);
>  	set_bit(STRIPE_HANDLE, &sh->state);
> @@ -2505,7 +2506,6 @@ static void raid5_end_write_request(struct bio *bi)
>
>  	if (sh->batch_head && sh != sh->batch_head)
>  		raid5_release_stripe(sh->batch_head);
> -	bio_reset(bi);
>  }
>
>  static void raid5_build_block(struct stripe_head *sh, int i, int previous)
>
Re: kernel 4.8-rc5 kernel BUG at block/blk-core.c:2032!
On Thu, Sep 08, 2016 at 10:16:59AM -0600, Jens Axboe wrote:
> On 09/08/2016 02:23 AM, Stefan Priebe - Profihost AG wrote:
> > Hi,
> >
> > while trying Kernel 4.8-rc5 my raid5 breaks every few minutes.
> >
> > Trace:
> > [ cut here ]
> > kernel BUG at block/blk-core.c:2032!
> > invalid opcode: [#1] SMP
> > Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 xt_multiport
> > iptable_filter ip_tables x_tables 8021q garp bonding sb_edac edac_core
> > x86_pkg_temp_thermal coretemp kvm_intel kvm i2c_i801 irqbypass i2c_smbus
> > ipmi_si crc32_pclmul i2c_core ghash_clmulni_intel shpchp ipmi_msghandler
> > button loop fuse btrfs dm_mod raid10 raid0 multipath linear raid456
> > async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq
> > raid1 md_mod sg sd_mod ixgbe i40e mdio usbhid ehci_pci ehci_hcd ahci
> > usbcore ptp libahci usb_common megaraid_sas pps_core
> > CPU: 8 PID: 1105 Comm: md0_raid5 Not tainted 4.8.0-rc5-3-g3abda5c #2
> > Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015
> > task: 97de5e1e task.stack: 97de597a
> > RIP: 0010:[] []
> > generic_make_request+0x1c0/0x1d0
> > RSP: 0018:97de597a3aa0 EFLAGS: 00010286
> > RAX: 97de5e1e RBX: 97dd227e5030 RCX:
> > RDX: c001 RSI: 0001 RDI: 97de5e7d9db8
> > RBP: 97de597a3ad8 R08: 0008 R09:
> > R10: R11: 0001 R12:
> > R13: 97de5aa20c00 R14: 02f0 R15: 97e65dce0e00
> > FS: () GS:97e67f20() knlGS:
> > CS: 0010 DS: ES: CR0: 80050033
> > CR2: 7f0e4e1ec000 CR3: 78c06000 CR4: 001406e0
> > Stack:
> > 97de597a3b50 1000 97dd227e4c80
> > 97de5aa20c00 02f0 97e65dce0e00 97de597a3ba0
> > c02595db c025e04b 0001597a3b01 00020006
> > Call Trace:
> > [] ops_run_io+0x3bb/0x990 [raid456]
> > [] ? raid_run_ops+0xefb/0x1520 [raid456]
> > [] handle_stripe+0x9a6/0x2280 [raid456]
> > [] ? default_wake_function+0x12/0x20
> > [] ? autoremove_wake_function+0x12/0x40
> > [] handle_active_stripes.isra.54+0x193/0x4b0 [raid456]
> > [] ? __release_stripe+0x15/0x20 [raid456]
> > [] raid5d+0x4a9/0x740 [raid456]
> > [] ? init_timer_key+0xa0/0xa0
> > [] md_thread+0x12b/0x130 [md_mod]
> > [] ? wait_woken+0x90/0x90
> > [] ? find_pers+0x70/0x70 [md_mod]
> > [] kthread+0xdb/0x100
> > [] ret_from_fork+0x1f/0x40
> > [] ? kthread_park+0x60/0x60
> > Code: bd 70 08 00 00 f0 49 83 ad 70 08 00 00 01 74 05 e9 5a ff ff ff 41
> > ff 95 80 08 00 00 e9 4e ff ff ff 48 c7 40 08 00 00 00 00 eb 8c <0f> 0b
> > 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
> > RIP [] generic_make_request+0x1c0/0x1d0
> > RSP
> > ---[ end trace 457dbe5e9cdd3473 ]---
>
> CC'ing Shaohua - this is:
>
> BUG_ON(bio->bi_next);
>
> which doesn't look healthy.

Hi Stefan,
does below patch help? Looks there is a race condition introduced recently.


diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index b95c54c..ee7fc37 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2423,10 +2423,10 @@ static void raid5_end_read_request(struct bio * bi)
 		}
 	}
 	rdev_dec_pending(rdev, conf->mddev);
+	bio_reset(bi);
 	clear_bit(R5_LOCKED, &sh->dev[i].flags);
 	set_bit(STRIPE_HANDLE, &sh->state);
 	raid5_release_stripe(sh);
-	bio_reset(bi);
 }
 
 static void raid5_end_write_request(struct bio *bi)
@@ -2498,6 +2498,7 @@ static void raid5_end_write_request(struct bio *bi)
 	if (sh->batch_head && bi->bi_error && !replacement)
 		set_bit(STRIPE_BATCH_ERR, &sh->batch_head->state);
 
+	bio_reset(bi);
 	if (!test_and_clear_bit(R5_DOUBLE_LOCKED, &sh->dev[i].flags))
 		clear_bit(R5_LOCKED, &sh->dev[i].flags);
 	set_bit(STRIPE_HANDLE, &sh->state);
@@ -2505,7 +2506,6 @@ static void raid5_end_write_request(struct bio *bi)
 
 	if (sh->batch_head && sh != sh->batch_head)
 		raid5_release_stripe(sh->batch_head);
-	bio_reset(bi);
 }
 
 static void raid5_build_block(struct stripe_head *sh, int i, int previous)
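
[Archive note] The patch above only changes ordering: bio_reset() now runs before R5_LOCKED is cleared and the stripe is released, rather than after. The following is a minimal user-space sketch of why that ordering can matter. It is not kernel code. It assumes (this is a reading of the patch, not something stated in the thread) that another context may resubmit the per-device bio as soon as the device is unlocked, and that a bio with a stale bi_next is what trips the BUG_ON. Every name here (fake_bio, fake_dev, fake_submit and so on) is hypothetical, and the race is modeled deterministically by letting the "other context" run at the exact moment of unlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct fake_bio {
	struct fake_bio *bi_next;	/* expected to be NULL at submission time */
};

struct fake_dev {
	bool locked;			/* models the R5_LOCKED flag */
	struct fake_bio req;		/* per-device bio that gets reused */
};

/* Models bio_reset(): wipe the bio back to a clean state. */
static void fake_bio_reset(struct fake_bio *bio)
{
	memset(bio, 0, sizeof(*bio));
}

/*
 * Models the check behind BUG_ON(bio->bi_next): returns false where the
 * real code would hit the BUG, true when the bio is clean.
 */
static bool fake_submit(const struct fake_bio *bio)
{
	return bio->bi_next == NULL;
}

/*
 * Worst-case interleaving, made deterministic: the other context picks up
 * the device and resubmits its bio immediately after the unlock.
 */
static bool unlock_and_let_worker_run(struct fake_dev *dev)
{
	dev->locked = false;
	return fake_submit(&dev->req);
}

/* Ordering before the patch: unlock first, reset afterwards (too late). */
static bool end_request_reset_last(struct fake_dev *dev)
{
	bool clean = unlock_and_let_worker_run(dev);

	fake_bio_reset(&dev->req);
	return clean;
}

/* Ordering after the patch: reset while the device is still locked. */
static bool end_request_reset_first(struct fake_dev *dev)
{
	fake_bio_reset(&dev->req);
	return unlock_and_let_worker_run(dev);
}
```

With a device whose bio still carries a non-NULL bi_next from its previous trip through the block layer, end_request_reset_last() reports a dirty resubmission while end_request_reset_first() reports a clean one. In the real driver the window is a true race between CPUs, so the failure would be intermittent, which is consistent with the "breaks every few minutes" report.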
Re: kernel 4.8-rc5 kernel BUG at block/blk-core.c:2032!
On 09/08/2016 02:23 AM, Stefan Priebe - Profihost AG wrote:
> Hi,
>
> while trying Kernel 4.8-rc5 my raid5 breaks every few minutes.
>
> Trace:
> [ cut here ]
> kernel BUG at block/blk-core.c:2032!
> invalid opcode: [#1] SMP
> Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 xt_multiport
> iptable_filter ip_tables x_tables 8021q garp bonding sb_edac edac_core
> x86_pkg_temp_thermal coretemp kvm_intel kvm i2c_i801 irqbypass i2c_smbus
> ipmi_si crc32_pclmul i2c_core ghash_clmulni_intel shpchp ipmi_msghandler
> button loop fuse btrfs dm_mod raid10 raid0 multipath linear raid456
> async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq
> raid1 md_mod sg sd_mod ixgbe i40e mdio usbhid ehci_pci ehci_hcd ahci
> usbcore ptp libahci usb_common megaraid_sas pps_core
> CPU: 8 PID: 1105 Comm: md0_raid5 Not tainted 4.8.0-rc5-3-g3abda5c #2
> Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015
> task: 97de5e1e task.stack: 97de597a
> RIP: 0010:[] []
> generic_make_request+0x1c0/0x1d0
> RSP: 0018:97de597a3aa0 EFLAGS: 00010286
> RAX: 97de5e1e RBX: 97dd227e5030 RCX:
> RDX: c001 RSI: 0001 RDI: 97de5e7d9db8
> RBP: 97de597a3ad8 R08: 0008 R09:
> R10: R11: 0001 R12:
> R13: 97de5aa20c00 R14: 02f0 R15: 97e65dce0e00
> FS: () GS:97e67f20() knlGS:
> CS: 0010 DS: ES: CR0: 80050033
> CR2: 7f0e4e1ec000 CR3: 78c06000 CR4: 001406e0
> Stack:
> 97de597a3b50 1000 97dd227e4c80
> 97de5aa20c00 02f0 97e65dce0e00 97de597a3ba0
> c02595db c025e04b 0001597a3b01 00020006
> Call Trace:
> [] ops_run_io+0x3bb/0x990 [raid456]
> [] ? raid_run_ops+0xefb/0x1520 [raid456]
> [] handle_stripe+0x9a6/0x2280 [raid456]
> [] ? default_wake_function+0x12/0x20
> [] ? autoremove_wake_function+0x12/0x40
> [] handle_active_stripes.isra.54+0x193/0x4b0 [raid456]
> [] ? __release_stripe+0x15/0x20 [raid456]
> [] raid5d+0x4a9/0x740 [raid456]
> [] ? init_timer_key+0xa0/0xa0
> [] md_thread+0x12b/0x130 [md_mod]
> [] ? wait_woken+0x90/0x90
> [] ? find_pers+0x70/0x70 [md_mod]
> [] kthread+0xdb/0x100
> [] ret_from_fork+0x1f/0x40
> [] ? kthread_park+0x60/0x60
> Code: bd 70 08 00 00 f0 49 83 ad 70 08 00 00 01 74 05 e9 5a ff ff ff 41
> ff 95 80 08 00 00 e9 4e ff ff ff 48 c7 40 08 00 00 00 00 eb 8c <0f> 0b
> 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
> RIP [] generic_make_request+0x1c0/0x1d0
> RSP
> ---[ end trace 457dbe5e9cdd3473 ]---

CC'ing Shaohua - this is:

BUG_ON(bio->bi_next);

which doesn't look healthy.

--
Jens Axboe