29.10.2015, 03:35, "Neil Brown" <ne...@suse.de>:
> On Wed, Oct 28 2015, Roman Gushchin wrote:
>
>> After commit 566c09c53455 ("raid5: relieve lock contention in
>> get_active_stripe()")
>> __find_stripe() is called under conf->hash_locks + hash.
>> But handle_stripe_clean_event() calls remove_hash() under
>> conf->device_lock.
>>
>> Under some circumstances the hash chain can become circular,
>> and we get an infinite loop with disabled interrupts and locked hash
>> lock in __find_stripe(). This leads to hard lockup on multiple CPUs
>> and following system crash.
>>
>> I was able to reproduce this behavior on raid6 over 6 ssd disks.
>> The devices_handle_discard_safely option should be set to enable trim
>> support. The following script was used:
>>
>> for i in `seq 1 32`; do
>>     dd if=/dev/zero of=large$i bs=10M count=100 &
>> done
>>
>> Signed-off-by: Roman Gushchin <kl...@yandex-team.ru>
>> Cc: Neil Brown <ne...@suse.de>
>> Cc: Shaohua Li <s...@kernel.org>
>> Cc: linux-r...@vger.kernel.org
>> Cc: <sta...@vger.kernel.org> # 3.10 - 3.19
>
> Hi Roman,
> thanks for reporting this and providing a fix.
>
> I'm a bit confused by that stable range: 3.10 - 3.19
>
> The commit you identify as introducing the bug was added in 3.13, so
> presumably 3.10, 3.11, 3.12 are not affected.

Sure, it's my mistake. Correct range is 3.13 - 3.19. Sorry.

> Also the bug is still present in mainline, so 4.0, 4.1, 4.2 are also
> affected, though the patch needs to be revised a bit for 4.1 and later.

Yes, exactly, but things are a bit more complicated in mainline.
I'll try to prepare a patch for mainline in a couple of days.

Thanks,
Roman
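
For readers following the thread, the locking mismatch described in the quoted commit message can be illustrated with a small userspace sketch. This is not the kernel code and not the patch under discussion: `struct table`, `bucket_of()`, `table_init()`, `insert_stripe()`, `find_stripe()` and `remove_stripe()` are hypothetical names, pthread mutexes stand in for the kernel's spinlocks, and a plain singly linked chain stands in for the real hash list. It only shows the general rule that lookup and removal on one chain should be serialized by the same per-bucket lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_BUCKETS 8 /* hypothetical size, chosen only for the sketch */

struct stripe {
    unsigned long sector;
    struct stripe *next; /* singly linked hash chain */
};

struct table {
    /* One lock per bucket, standing in for conf->hash_locks + hash. */
    pthread_mutex_t hash_locks[NR_BUCKETS];
    struct stripe *buckets[NR_BUCKETS];
};

static unsigned int bucket_of(unsigned long sector)
{
    return (unsigned int)(sector % NR_BUCKETS);
}

static void table_init(struct table *t)
{
    for (int i = 0; i < NR_BUCKETS; i++) {
        pthread_mutex_init(&t->hash_locks[i], NULL);
        t->buckets[i] = NULL;
    }
}

static void insert_stripe(struct table *t, struct stripe *sh)
{
    assert(sh != NULL);
    unsigned int hash = bucket_of(sh->sector);

    pthread_mutex_lock(&t->hash_locks[hash]);
    sh->next = t->buckets[hash];
    t->buckets[hash] = sh;
    pthread_mutex_unlock(&t->hash_locks[hash]);
}

/* Walks the chain under the bucket lock, like a lookup would. */
static struct stripe *find_stripe(struct table *t, unsigned long sector)
{
    unsigned int hash = bucket_of(sector);
    struct stripe *sh;

    pthread_mutex_lock(&t->hash_locks[hash]);
    for (sh = t->buckets[hash]; sh != NULL; sh = sh->next)
        if (sh->sector == sector)
            break;
    pthread_mutex_unlock(&t->hash_locks[hash]);
    return sh;
}

/*
 * Unlinks under the SAME bucket lock that find_stripe() takes.
 * Holding only some other, unrelated lock here would let a concurrent
 * walker observe the chain while its pointers are being rewritten.
 */
static void remove_stripe(struct table *t, struct stripe *sh)
{
    unsigned int hash = bucket_of(sh->sector);
    struct stripe **pp;

    pthread_mutex_lock(&t->hash_locks[hash]);
    for (pp = &t->buckets[hash]; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == sh) {
            *pp = sh->next;
            sh->next = NULL;
            break;
        }
    }
    pthread_mutex_unlock(&t->hash_locks[hash]);
}
```

If `remove_stripe()` took a different lock instead (the analogue of conf->device_lock), nothing would stop it from running concurrently with `find_stripe()` or `insert_stripe()` on the same chain, which is the kind of race the report describes.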