@Ryan we do not test Xenial or Disco.

On Thu, Aug 22, 2019 at 7:41 PM Ryan Harper <1784...@bugs.launchpad.net> wrote:

> Finally, I did verify xenial proposed with our original test. I had
> over 100 installs with no issue.
>
> @Jason
>
> Have you had any runs on Xenial or Disco? (or do you not test those)?
>
> --
> You received this bug notification because you are a member of Canonical
> Field Critical, which is subscribed to a duplicate bug report (1796292).
> https://bugs.launchpad.net/bugs/1784665
>
> Title:
>   bcache: bch_allocator_thread(): hung task timeout
>
> Status in linux package in Ubuntu:
>   Fix Committed
> Status in linux source package in Xenial:
>   Fix Committed
> Status in linux source package in Bionic:
>   New
> Status in linux source package in Disco:
>   Fix Committed
> Status in linux source package in Eoan:
>   Fix Committed
>
> Bug description:
> [Impact]
>
> bcache_allocator() can call the following:
>
> bch_allocator_thread()
>   -> bch_prio_write()
>     -> bch_bucket_alloc()
>       -> wait on &ca->set->bucket_wait
>
> But the wake up event on bucket_wait is supposed to come from
> bch_allocator_thread() itself causing a deadlock.
>
> [Test Case]
>
> This is a simple script that can easily trigger the deadlock condition:
> https://launchpadlibrarian.net/381282009/bcache-basic-repro.sh
>
> A better test case has been also provided in bug 1796292 (duplicate of
> this bug):
>
> https://bugs.launchpad.net/curtin/+bug/1796292/+attachment/5280353/+files/curtin-nvme.sh
>
> [Fix]
>
> Fix by making the call to bch_prio_write() non-blocking, so that
> bch_allocator_thread() never waits on itself. Moreover, make sure to
> wake up the garbage collector thread when bch_prio_write() is failing
> to allocate buckets to increase the chance of freeing up more buckets.
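
For anyone reading along, the [Impact] and [Fix] text quoted above can be illustrated with a toy Python model. This is only a sketch, not the kernel code or the actual patch: ToyCache, pass_completes and the gc_woken flag are names made up for illustration, and the real allocator does far more than this. The model assumes the allocator thread is the only one that ever signals bucket_wait, which is what the bug description states.

```python
import threading
from collections import deque


class ToyCache:
    """Hypothetical stand-in for a cache set; not the kernel's data structure."""

    def __init__(self):
        self.free = deque()                       # no free buckets to start with
        self.bucket_wait = threading.Condition()  # models ca->set->bucket_wait
        self.gc_woken = False                     # models "wake up the GC thread"

    def bucket_alloc(self, wait):
        with self.bucket_wait:
            while not self.free:
                if not wait:
                    return None                   # non-blocking: report failure
                self.bucket_wait.wait()           # blocking: sleep until notified
            return self.free.popleft()

    def prio_write(self, wait):
        if self.bucket_alloc(wait) is None:
            self.gc_woken = True                  # ask GC to free more buckets
            return False
        return True

    def allocator_pass(self, wait):
        # The allocator is the only thread that notifies bucket_wait, so if
        # prio_write() sleeps there, nobody is left to wake it up.
        self.prio_write(wait)
        with self.bucket_wait:
            self.free.append(0)
            self.bucket_wait.notify_all()


def pass_completes(wait, timeout=0.5):
    """Run one allocator pass in a thread; report (finished, gc_woken)."""
    cache = ToyCache()
    worker = threading.Thread(target=cache.allocator_pass, args=(wait,), daemon=True)
    worker.start()
    worker.join(timeout)
    return (not worker.is_alive(), cache.gc_woken)
```

Under these assumptions, pass_completes(wait=True) finds the worker still stuck when the timeout expires, while pass_completes(wait=False) sees the pass finish with the GC flag set.
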
>
> In addition to that it would be safe to also import other upstream
> bcache fixes (all clean cherry picks):
>
> 7e865eba00a3df2dc8c4746173a8ca1c1c7f042e bcache: fix potential deadlock
> in cached_def_free()
> 80265d8dfd77792e133793cef44a21323aac2908 bcache: acquire
> bch_register_lock later in cached_dev_free()
> ce4c3e19e5201424357a0c82176633b32a98d2ec bcache: Replace
> bch_read_string_list() by __sysfs_match_string()
> ecb37ce9baac653cc09e2b631393dde3df82979f bcache: Move couple of
> functions to sysfs.c
> 04cbc21137bfa4d7b8771a5b14f3d6c9b2aee671 bcache: Move couple of string
> arrays to sysfs.c
> 5f2b18ec8e1643410a2369f06888951cdedea0bf bcache: Fix a compiler warning
> in bcache_device_init()
> 20d3a518713e394efa5a899c84574b4b79ec5098 bcache: Reduce the number of
> sparse complaints about lock imbalances
> 42361469ae84c851e40cb1f94c8c9a14cdd94039 bcache: Suppress more warnings
> about set-but-not-used variables
> f0d3814090ac77de94c42b7124c37ece23629197 bcache: Remove an unused
> variable
> 47344e330eabc1515cbe6061eb337100a3ab6d37 bcache: Fix kernel-doc warnings
> 9dfbdec7b7fea1ff1b7b5d5d12980dbc7dca46c7 bcache: Annotate switch
> fall-through
> 4a4e443835a43a79113cc237c472c0d268eb1e1c bcache: Add __printf annotation
> to __bch_check_keys()
> fd01991d5c20098c5c1ffc4dca6c821cc60a2f74 bcache: Fix indentation
> ca71df31661a0518ed58a1a59cf1993962153ebb bcache: fix using of loop
> variable in memory shrink
> f3641c3abd1da978ee969b0203b71b86ec1bfa93 bcache: fix error return value
> in memory shrink
> 688892b3bc05e25da94866e32210e5f503f16f69 bcache: fix incorrect sysfs
> output value of strip size
> 09a44ca2114737e0932257619c16a2b50c7807f1 bcache: use pr_info() to inform
> duplicated CACHE_SET_IO_DISABLE set
> c4dc2497d50d9c6fb16aa0d07b6a14f3b2adb1e0 bcache: fix high CPU occupancy
> during journal
> a728eacbbdd229d1d903e46261c57d5206f87a4a bcache: add journal statistic
> 616486ab52ab7f9739b066d958bdd20e65aefd74 bcache: fix writeback target
> calc on large devices
> 1f0ffa67349c56ea54c03ccfd1e073c990e7411e bcache: only set
> BCACHE_DEV_WB_RUNNING when cached device attached
> eb8cbb6df38f6e5124a3d5f1f8a3dbf519537c60 bcache: improve bcache_reboot()
> 9951379b0ca88c95876ad9778b9099e19a95d566 bcache: never writeback a
> discard operation
>
> [Regression Potential]
>
> The upstream fixes are all clean cherry picks from stable (most of
> them are small cleanups), so regression potential is minimal.
>
> The only special patch is "UBUNTU: SAUCE: bcache: fix deadlock in
> bcache_allocator()" that is addressing the main deadlock bug (that
> seems to be a mainline bug - not fixed yet). We should spend more time
> trying to reproduce this deadlock with a mainline kernel and post the
> patch to the LKML for review / feedback.
>
> However, considering that this patch seems to fix/prevent the specific
> deadlock problem reported in this bug (tested on the affected
> platform) it can be considered safe to apply it.
>
> [Original Bug Report]
>
> $ cat /proc/version_signature
> Ubuntu 4.15.0-29.31-generic 4.15.18
>
> $ lsb_release -rd
> Description: Ubuntu Cosmic Cuttlefish (development branch)
> Release: 18.10
>
> $ apt-cache policy linux-image-`uname -r`
> linux-image-4.15.0-29-generic:
>   Installed: 4.15.0-29.31
>   Candidate: 4.15.0-29.31
>   Version table:
>   *** 4.15.0-29.31 500
>     500 http://archive.ubuntu.com/ubuntu cosmic/main amd64 Packages
>     100 /var/lib/dpkg/status
>
> 3) mkfs.ext4 /dev/bcache0 returns successful creating an ext4
> filesystem on top of a bcache device
>
> 4) mkfs.ext4 doesn't return and kernel prints hung process info
>
> [ 58.018099] cloud-init[920]: Running command ['mkfs.ext4', '-F',
> '-L', 'root-fs', '-U', 'f01aec97-9457-11e8-b8d6-525400123401',
> '/dev/bcache0'] with allowed return codes [0] (capture=True)
> [ 242.652018] INFO: task kworker/u4:0:5 blocked for more than 120
> seconds.
> [ 242.653767] Tainted: P O 4.15.0-29-generic
> #31-Ubuntu
> [ 242.655391] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 242.657397] INFO: task kworker/0:2:410 blocked for more than 120
> seconds.
> [ 242.659126] Tainted: P O 4.15.0-29-generic
> #31-Ubuntu
> [ 242.660980] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 242.663000] INFO: task bcache_allocato:2326 blocked for more than 120
> seconds.
> [ 242.664807] Tainted: P O 4.15.0-29-generic
> #31-Ubuntu
> [ 242.666516] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 242.668503] INFO: task bcache_writebac:2345 blocked for more than 120
> seconds.
> [ 242.670301] Tainted: P O 4.15.0-29-generic
> #31-Ubuntu
> [ 242.671936] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 242.673909] INFO: task mkfs.ext4:2803 blocked for more than 120
> seconds.
> [ 242.675414] Tainted: P O 4.15.0-29-generic
> #31-Ubuntu
> [ 242.677038] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 363.483998] INFO: task kworker/u4:0:5 blocked for more than 120
> seconds.
> [ 363.488441] Tainted: P O 4.15.0-29-generic
> #31-Ubuntu
> [ 363.489598] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 363.491043] INFO: task kworker/0:2:410 blocked for more than 120
> seconds.
> [ 363.492252] Tainted: P O 4.15.0-29-generic
> #31-Ubuntu
> [ 363.494085] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 363.495659] INFO: task bcache_allocato:2326 blocked for more than 120
> seconds.
> [ 363.496957] Tainted: P O 4.15.0-29-generic
> #31-Ubuntu
> [ 363.498454] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 363.499866] INFO: task bcache_writebac:2345 blocked for more than 120
> seconds.
> [ 363.501156] Tainted: P O 4.15.0-29-generic
> #31-Ubuntu
> [ 363.502597] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 363.504048] INFO: task mkfs.ext4:2803 blocked for more than 120
> seconds.
> [ 363.505505] Tainted: P O 4.15.0-29-generic
> #31-Ubuntu
> [ 363.506677] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
>
> System has two virtio block devices. bcache was created like so:
>
> make-bcache -C /dev/vdb
> make-bcache -B /dev/vda2
>
> resulting in /dev/bcache0
>
> ProblemType: Bug
> DistroRelease: Ubuntu 18.10
> Package: linux-image-4.15.0-29-generic 4.15.0-29.31
> ProcVersionSignature: User Name 4.15.0-29.31-generic 4.15.18
> Uname: Linux 4.15.0-29-generic x86_64
> AlsaDevices:
>  total 0
>  crw-rw---- 1 root audio 116, 1 Jul 31 15:52 seq
>  crw-rw---- 1 root audio 116, 33 Jul 31 15:52 timer
> AplayDevices: Error: [Errno 2] No such file or directory: 'aplay':
> 'aplay'
> ApportVersion: 2.20.10-0ubuntu7
> Architecture: amd64
> ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord':
> 'arecord'
> AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq',
> '/dev/snd/timer'] failed with exit code 1:
> CRDA: N/A
> Date: Tue Jul 31 15:53:56 2018
> IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig':
> 'iwconfig'
> Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
> MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
> PciMultimedia:
>
> ProcEnviron:
>  TERM=xterm
>  PATH=(custom, no user)
>  XDG_RUNTIME_DIR=<set>
>  LANG=C.UTF-8
>  SHELL=/bin/bash
> ProcFB:
>
> ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-29-generic
> root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0
> RelatedPackageVersions:
>  linux-restricted-modules-4.15.0-29-generic N/A
>  linux-backports-modules-4.15.0-29-generic N/A
>  linux-firmware N/A
> RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
> SourcePackage: linux
> UpgradeStatus: No upgrade log present (probably fresh install)
> dmi.bios.date: 04/01/2014
> dmi.bios.vendor: SeaBIOS
> dmi.bios.version: 1.11.1-1
> dmi.chassis.type: 1
> dmi.chassis.vendor: QEMU
> dmi.chassis.version: pc-i440fx-bionic
> dmi.modalias:
> dmi:bvnSeaBIOS:bvr1.11.1-1:bd04/01/2014:svnQEMU:pnStandardPC(i440FX+PIIX,1996):pvrpc-i440fx-bionic:cvnQEMU:ct1:cvrpc-i440fx-bionic:
> dmi.product.name: Standard PC (i440FX + PIIX, 1996)
> dmi.product.version: pc-i440fx-bionic
> dmi.sys.vendor: QEMU
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1784665/+subscriptions
>

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1784665

Title:
  bcache: bch_allocator_thread(): hung task timeout

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1784665/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs