Actually, no, we cannot be sure. The server had no metrics until a few
days ago, and the issue was already there at that point.

It's worth noting that only a few servers have this bcache configuration,
because the cache mode is configured as writethrough and the load is
pretty significant.
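For anyone comparing servers, here is a minimal sketch of how the active
cache mode could be read. It assumes the standard bcache sysfs attribute
/sys/block/bcacheN/bcache/cache_mode, which lists all modes and puts the
active one in square brackets. The helper name active_cache_mode and the
device name bcache0 are only illustrative.

```shell
#!/bin/sh
# active_cache_mode FILE
# Print the bracketed (active) entry from a bcache cache_mode file,
# e.g. "[writethrough] writeback writearound none" prints "writethrough".
active_cache_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "$1"
}

# Example on a live system (bcache0 is an assumed device name):
# active_cache_mode /sys/block/bcache0/bcache/cache_mode
```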

So there is no last known "good" version.

We also have various IO wait issues on another platform (running the
xenial-hwe kernel), and we already suspected bcache there. I mention it
to show that bcache behavior has seemed related to disk performance
problems for quite some time.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1898786

Title:
  Issue with bcache bch_mca_scan causing huge IO wait

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Bionic:
  New

Bug description:
  Hello,

  In short, we faced an issue with huge IO wait on Ubuntu bionic running the
  4.15.0-118.119-generic kernel.
  The full list of processes and the kernel functions they were stuck in is
  at [0].
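
How the list in [0] was gathered is not stated. One way to get a similar
view, assuming procps ps is available, is to print each process with its
state and kernel wait channel and keep only those in uninterruptible sleep
(state D). The helper name filter_blocked is hypothetical.

```shell
#!/bin/sh
# filter_blocked
# Read "ps"-style output on stdin and keep the header line plus every
# row whose second column (process state) starts with D.
filter_blocked() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Example on a live system; wchan is the kernel function the task sleeps in:
# ps -eo pid,stat,wchan:32,comm | filter_blocked
```

The wait channel only names one function per task. It is a quick first
look, not a replacement for the perf reports.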

  The main issue can probably be summarized by these perf reports:
  * first, identify that the CPUs are stuck in idle because of something [1]
  * second, see which kernel function seems to block the kswapd0 and kswapd1 processes [2].

  We could see that this seems to be the mutex_lock in the bch_mca_scan
  function [3].

  After running the command:

   | sudo bash -c "echo 1 > /sys/fs/bcache/f1a1e8cb-3e6b-40ea-852e-583c48d0c2b8/internal/btree_shrinker_disabled"

  The server started to respond normally and the IO wait dropped significantly.
  [0]: https://pastebin.canonical.com/p/wYYKwHdRXk/
  [1]: https://pastebin.canonical.com/p/n2Tw57QyBC/
  [2]: https://pastebin.canonical.com/p/3QqFTfdHhX/
  [3]: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic/tree/drivers/md/bcache/btree.c?h=Ubuntu-4.15.0-118.119#n674
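
The workaround command above targets one cache set by UUID. As a sketch
only, the same write could be applied to every cache set found under
/sys/fs/bcache. The function name set_btree_shrinker and its optional
base-directory argument are assumptions made for illustration, and the
function has to run as root on a real system.

```shell
#!/bin/sh
# set_btree_shrinker VALUE [BASE]
# Write VALUE (0 or 1) to internal/btree_shrinker_disabled of every
# cache set directory found under BASE (default: /sys/fs/bcache).
set_btree_shrinker() {
    value=$1
    base=${2:-/sys/fs/bcache}
    for knob in "$base"/*/internal/btree_shrinker_disabled; do
        # Skip the literal pattern when no cache set matches.
        [ -e "$knob" ] || continue
        printf '%s: %s -> %s\n' "$knob" "$(cat "$knob")" "$value"
        echo "$value" > "$knob"
    done
}

# Example (as root): disable the btree shrinker on all cache sets.
# set_btree_shrinker 1
```

A value written through sysfs does not survive a reboot, so it would need
to be reapplied after each boot.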

  ====================
  $ cat /proc/version_signature
  Ubuntu 4.15.0-118.119-generic 4.15.18

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-118-generic 4.15.0-118.119
  ProcVersionSignature: Ubuntu 4.15.0-118.119-generic 4.15.18
  Uname: Linux 4.15.0-118-generic x86_64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Sep 29 10:04 seq
   crw-rw---- 1 root audio 116, 33 Sep 29 10:04 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.16
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  Date: Tue Oct  6 20:36:18 2020
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  MachineType: HP ProLiant DL380 G7
  PciMultimedia:

  ProcFB: 0 radeondrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-118-generic root=UUID=c6ad1629-a506-4043-a339-6d57f0708d12 ro console=ttyS1,115200 nosplash
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-118-generic N/A
   linux-backports-modules-4.15.0-118-generic  N/A
   linux-firmware                              1.173.18
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  SourcePackage: linux
  UpgradeStatus: Upgraded to bionic on 2019-09-27 (375 days ago)
  dmi.bios.date: 05/05/2011
  dmi.bios.vendor: HP
  dmi.bios.version: P67
  dmi.chassis.type: 23
  dmi.chassis.vendor: HP
  dmi.modalias: dmi:bvnHP:bvrP67:bd05/05/2011:svnHP:pnProLiantDL380G7:pvr:cvnHP:ct23:cvr:
  dmi.product.family: ProLiant
  dmi.product.name: ProLiant DL380 G7
  dmi.sys.vendor: HP

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1898786/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
