[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2021-01-27 Thread Ponnuvel Palaniyappan
Thanks for the clarification, Heitor!

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1840043

Title:
  bcache: Performance degradation when querying priority_stats

Status in Linux:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Released
Status in linux source package in Bionic:
  Fix Released
Status in linux source package in Disco:
  Fix Released
Status in linux source package in Eoan:
  Fix Released

Bug description:
  [Impact]
  Querying bcache's priority_stats attribute in sysfs causes severe
  performance degradation for read/write workloads and occasional
  system stalls.

  [Test Case]
  Note: As the sorting step has the most noticeable performance impact,
  the test case below pins both a workload and the sysfs query to the
  same CPU. CPU contention issues still occur without any pinning; the
  pinning merely removes the scheduling factor of tasks landing on
  different CPUs and affecting different workloads.

  1) Start a read/write workload on the bcache device with e.g. fio or
  dd, pinned to a certain CPU:
  # taskset 0x10 dd if=/dev/zero of=/dev/bcache0 bs=4k status=progress

  2) Start a sysfs query loop for the priority_stats attribute, pinned
  to the same CPU:
  # for i in {1..10}; do taskset 0x10 cat /sys/fs/bcache/*/cache0/priority_stats > /dev/null; done

  3) Monitor the read/write workload for any performance impact

  [Fix]
  To fix the CPU contention and performance impact, a cond_resched()
  call is introduced in the priority_stats sort comparison function.
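
  The patch is tiny: the bionic changelog later in this thread names it
  as "bcache: add cond_resched() in __bch_cache_cmp()". A sketch of the
  patched comparator in drivers/md/bcache/sysfs.c (reconstructed here
  for illustration, not quoted verbatim from the commit):

    #include <linux/sched.h>   /* cond_resched() */

    /* Comparator used by sort() over the snapshot of per-bucket
     * priorities. cond_resched() offers a reschedule point on every
     * comparison, so the O(n log n) sort over millions of buckets can
     * no longer monopolize the CPU it runs on. */
    static int __bch_cache_cmp(const void *l, const void *r)
    {
            cond_resched();
            return *((uint16_t *)r) - *((uint16_t *)l);
    }

  The sort result is unchanged; the trade-off is that the sysfs query
  itself may now take longer on a contended CPU, as noted under
  [Regression Potential] below.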

  [Regression Potential]
  Regression potential is low, as the change is confined to the
  priority_stats sysfs query. In cases where frequent queries to bcache
  priority_stats take place (e.g. from node_exporter), the impact
  should be more noticeable, as those queries could now take a bit
  longer to complete. A regression due to this patch would most likely
  show up as a performance degradation in bcache-focused workloads.

  --

  [Description]
  In the latest bcache drivers, there's a sysfs attribute that
  calculates bucket priority statistics in
  /sys/fs/bcache/*/cache0/priority_stats. Querying this file has a big
  performance impact on tasks that run on the same CPU, and it also
  affects read/write performance of the bcache device itself.

  This is due to the way the driver calculates the stats: the bcache
  buckets are locked and iterated through, collecting information about
  each individual bucket. An array of nbuckets elements is constructed
  and sorted afterwards, which can cause very high CPU contention on
  larger bcache setups.
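
  For context, the pattern described above looks roughly like the
  sketch below. This is a simplified illustration, not the verbatim
  driver code: names such as ca->sb.nbuckets, ca->buckets[].prio and
  bucket_lock approximate the bcache driver, and the quantile output
  and GC bookkeeping are omitted.

    #include <linux/vmalloc.h>
    #include <linux/sort.h>

    static ssize_t priority_stats_sketch(struct cache *ca, char *buf)
    {
            size_t n = ca->sb.nbuckets;  /* millions on large caches */
            uint16_t *p;
            size_t i;

            p = vmalloc(n * sizeof(uint16_t));
            if (!p)
                    return -ENOMEM;

            /* Snapshot every bucket's priority under the bucket lock */
            mutex_lock(&ca->set->bucket_lock);
            for (i = 0; i < n; i++)
                    p[i] = ca->buckets[i].prio;
            mutex_unlock(&ca->set->bucket_lock);

            /* The expensive step: an in-kernel O(n log n) sort of the
             * whole array, from which the quantiles are derived */
            sort(p, n, sizeof(uint16_t), __bch_cache_cmp, NULL);

            /* ... derive quantiles/averages from p[] into buf ... */
            vfree(p);
            return 0;
    }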

  From our tests, the sorting step of the priority_stats query causes
  the most pronounced performance reduction, as it can hinder tasks
  that are not even doing any bcache IO. If a task is unlucky enough to
  be scheduled on the same CPU as the sysfs query, its performance will
  be severely reduced as both compete for CPU time. We've had users
  report system stalls of up to ~6s due to this, as a result of
  monitoring tools that query priority_stats periodically (e.g. the
  Prometheus Node Exporter from [0]). These system stalls have
  triggered several other issues such as ceph-mon re-elections,
  problems in percona-cluster, and general network stalls, so the
  impact is not isolated to bcache IO workloads.

  An example benchmark can be seen in [1], where the read performance on
  a bcache device suffered quite heavily (going from ~40k IOPS to ~4k
  IOPS due to priority_stats). Other comparison charts are found under
  [2].

  [0] https://github.com/prometheus/node_exporter
  [1] https://people.canonical.com/~halves/priority_stats/read/4k-iops-2Dsmooth.png
  [2] https://people.canonical.com/~halves/priority_stats/

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1840043/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2021-01-26 Thread Heitor Alves de Siqueira
@pponnuvel stat(2) doesn't perform a read(2) on the file, so it won't
call the show() method for priority_stats (which was the problematic
call for this specific sysfs attribute).

In any case, this should be fixed in all supported Ubuntu series, so
even read(2) shouldn't be an issue anymore. Please let us know if you
see any performance regressions related to this change!
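
To make the distinction concrete: a sysfs attribute's show() callback
only runs when the file contents are actually read, while the stat(2)
family operates on inode metadata and never reaches it. A minimal
sketch of how such an attribute is wired up (hypothetical names, not
the actual bcache registration code):

    #include <linux/kobject.h>
    #include <linux/sysfs.h>

    /* show() is invoked by read(2) on the sysfs file, e.g.
     * "cat priority_stats". stat()/lstat()/fstat() on the same path
     * only touch inode metadata and never enter this function. */
    static ssize_t priority_stats_show(struct kobject *kobj,
                                       struct kobj_attribute *attr,
                                       char *buf)
    {
            /* ...an expensive stats computation would happen here... */
            return snprintf(buf, PAGE_SIZE, "example\n");
    }

    static struct kobj_attribute priority_stats_attr =
            __ATTR_RO(priority_stats);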


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2021-01-25 Thread Ponnuvel Palaniyappan
A question: Do stat(2) calls (lstat, fstat, and the like) on priority_stats cause the same issue?
Context: https://github.com/sosreport/sos/pull/2384#discussion_r563975063


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2020-07-15 Thread Guilherme G. Piccoli
** Changed in: linux
   Status: Fix Committed => Fix Released


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-12-06 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 5.3.0-24.26

---
linux (5.3.0-24.26) eoan; urgency=medium

  * eoan/linux: 5.3.0-24.26 -proposed tracker (LP: #1852232)

  * Eoan update: 5.3.9 upstream stable release (LP: #1851550)
- io_uring: fix up O_NONBLOCK handling for sockets
- dm snapshot: introduce account_start_copy() and account_end_copy()
- dm snapshot: rework COW throttling to fix deadlock
- Btrfs: fix inode cache block reserve leak on failure to allocate
  data space
- btrfs: qgroup: Always free PREALLOC META reserve in
  btrfs_delalloc_release_extents()
- iio: adc: meson_saradc: Fix memory allocation order
- iio: fix center temperature of bmc150-accel-core
- libsubcmd: Make _FORTIFY_SOURCE defines dependent on the feature
- perf tests: Avoid raising SEGV using an obvious NULL dereference
- perf map: Fix overlapped map handling
- perf script brstackinsn: Fix recovery from LBR/binary mismatch
- perf jevents: Fix period for Intel fixed counters
- perf tools: Propagate get_cpuid() error
- perf annotate: Propagate perf_env__arch() error
- perf annotate: Fix the signedness of failure returns
- perf annotate: Propagate the symbol__annotate() error return
- perf annotate: Fix arch specific ->init() failure errors
- perf annotate: Return appropriate error code for allocation failures
- perf annotate: Don't return -1 for error when doing BPF disassembly
- staging: rtl8188eu: fix null dereference when kzalloc fails
- RDMA/siw: Fix serialization issue in write_space()
- RDMA/hfi1: Prevent memory leak in sdma_init
- RDMA/iw_cxgb4: fix SRQ access from dump_qp()
- RDMA/iwcm: Fix a lock inversion issue
- HID: hyperv: Use in-place iterator API in the channel callback
- kselftest: exclude failed TARGETS from runlist
- selftests/kselftest/runner.sh: Add 45 second timeout per test
- nfs: Fix nfsi->nrequests count error on nfs_inode_remove_request
- arm64: cpufeature: Effectively expose FRINT capability to userspace
- arm64: Fix incorrect irqflag restore for priority masking for compat
- arm64: ftrace: Ensure synchronisation in PLT setup for Neoverse-N1
  #1542419
- tty: serial: owl: Fix the link time qualifier of 'owl_uart_exit()'
- tty: serial: rda: Fix the link time qualifier of 'rda_uart_exit()'
- serial/sifive: select SERIAL_EARLYCON
- tty: n_hdlc: fix build on SPARC
- misc: fastrpc: prevent memory leak in fastrpc_dma_buf_attach
- RDMA/core: Fix an error handling path in 'res_get_common_doit()'
- RDMA/cm: Fix memory leak in cm_add/remove_one
- RDMA/nldev: Reshuffle the code to avoid need to rebind QP in error path
- RDMA/mlx5: Do not allow rereg of a ODP MR
- RDMA/mlx5: Order num_pending_prefetch properly with synchronize_srcu
- RDMA/mlx5: Add missing synchronize_srcu() for MW cases
- gpio: max77620: Use correct unit for debounce times
- fs: cifs: mute -Wunused-const-variable message
- arm64: vdso32: Fix broken compat vDSO build warnings
- arm64: vdso32: Detect binutils support for dmb ishld
- serial: mctrl_gpio: Check for NULL pointer
- serial: 8250_omap: Fix gpio check for auto RTS/CTS
- arm64: Default to building compat vDSO with clang when CONFIG_CC_IS_CLANG
- arm64: vdso32: Don't use KBUILD_CPPFLAGS unconditionally
- efi/cper: Fix endianness of PCIe class code
- efi/x86: Do not clean dummy variable in kexec path
- MIPS: include: Mark __cmpxchg as __always_inline
- riscv: avoid kernel hangs when trapped in BUG()
- riscv: avoid sending a SIGTRAP to a user thread trapped in WARN()
- riscv: Correct the handling of unexpected ebreak in do_trap_break()
- x86/xen: Return from panic notifier
- ocfs2: clear zero in unaligned direct IO
- fs: ocfs2: fix possible null-pointer dereferences in
  ocfs2_xa_prepare_entry()
- fs: ocfs2: fix a possible null-pointer dereference in
  ocfs2_write_end_nolock()
- fs: ocfs2: fix a possible null-pointer dereference in
  ocfs2_info_scan_inode_alloc()
- btrfs: silence maybe-uninitialized warning in clone_range
- arm64: armv8_deprecated: Checking return value for memory allocation
- sched/fair: Scale bandwidth quota and period without losing quota/period
  ratio precision
- sched/vtime: Fix guest/system mis-accounting on task switch
- perf/core: Rework memory accounting in perf_mmap()
- perf/core: Fix corner case in perf_rotate_context()
- perf/x86/amd: Change/fix NMI latency mitigation to use a timestamp
- drm/amdgpu: fix memory leak
- iio: imu: adis16400: release allocated memory on failure
- iio: imu: adis16400: fix memory leak
- iio: imu: st_lsm6dsx: fix waitime for st_lsm6dsx i2c controller
- MIPS: include: Mark __xchg as __always_inline
- MIPS: fw: sni: Fix out of bounds init of o32 stack
- s390/cio: fix virtio-ccw DMA without PV
- virt: vbox: fix memory 

[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-11-12 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 4.4.0-168.197

---
linux (4.4.0-168.197) xenial; urgency=medium

  * CVE-2018-12207
- KVM: x86: MMU: Encapsulate the type of rmap-chain head in a new struct
- KVM: x86: MMU: Consolidate quickly_check_mmio_pf() and
  is_mmio_page_fault()
- KVM: x86: MMU: Move handle_mmio_page_fault() call to kvm_mmu_page_fault()
- KVM: MMU: rename has_wrprotected_page to mmu_gfn_lpage_is_disallowed
- KVM: MMU: introduce kvm_mmu_gfn_{allow,disallow}_lpage
- KVM: x86: MMU: Make mmu_set_spte() return emulate value
- KVM: x86: MMU: Move initialization of parent_ptes out from
  kvm_mmu_alloc_page()
- KVM: x86: MMU: always set accessed bit in shadow PTEs
- KVM: x86: MMU: Move parent_pte handling from kvm_mmu_get_page() to
  link_shadow_page()
- KVM: x86: MMU: Remove unused parameter parent_pte from kvm_mmu_get_page()
- KVM: x86: simplify ept_misconfig
- KVM: x86: extend usage of RET_MMIO_PF_* constants
- KVM: MMU: drop vcpu param in gpte_access
- kvm: Convert kvm_lock to a mutex
- kvm: x86: Do not release the page inside mmu_set_spte()
- KVM: x86: make FNAME(fetch) and __direct_map more similar
- KVM: x86: remove now unneeded hugepage gfn adjustment
- KVM: x86: change kvm_mmu_page_get_gfn BUG_ON to WARN_ON
- KVM: x86: add tracepoints around __direct_map and FNAME(fetch)
- SAUCE: KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is
  active
- SAUCE: x86: Add ITLB_MULTIHIT bug infrastructure
- SAUCE: kvm: mmu: ITLB_MULTIHIT mitigation
- SAUCE: kvm: Add helper function for creating VM worker threads
- SAUCE: kvm: x86: mmu: Recovery of shattered NX large pages
- SAUCE: cpu/speculation: Uninline and export CPU mitigations helpers
- SAUCE: kvm: x86: mmu: Apply global mitigations knob to ITLB_MULTIHIT

  * CVE-2019-11135
- KVM: x86: Emulate MSR_IA32_ARCH_CAPABILITIES on AMD hosts
- KVM: x86: use Intel speculation bugs and features as derived in
  generic x86 code
- x86/msr: Add the IA32_TSX_CTRL MSR
- x86/cpu: Add a helper function x86_read_arch_cap_msr()
- x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
- x86/speculation/taa: Add mitigation for TSX Async Abort
- x86/speculation/taa: Add sysfs reporting for TSX Async Abort
- kvm/x86: Export MDS_NO=0 to guests when TSX is enabled
- x86/tsx: Add "auto" option to the tsx= cmdline parameter
- x86/speculation/taa: Add documentation for TSX Async Abort
- x86/tsx: Add config options to set tsx=on|off|auto
- SAUCE: x86/speculation/taa: Call tsx_init()
- SAUCE: x86/cpu: Include cpu header from bugs.c
- [Config] Disable TSX by default when possible

  * CVE-2019-0154
- SAUCE: i915_bpo: drm/i915: Lower RM timeout to avoid DSI hard hangs
- SAUCE: i915_bpo: drm/i915/gen8+: Add RC6 CTX corruption WA
- SAUCE: drm/i915/gen8+: Add RC6 CTX corruption WA

  * CVE-2019-0155
- SAUCE: i915_bpo: drm/i915/gtt: Add read only pages to gen8_pte_encode
- SAUCE: i915_bpo: drm/i915/gtt: Read-only pages for insert_entries on bdw+
- SAUCE: i915_bpo: drm/i915/gtt: Disable read-only support under GVT
- SAUCE: i915_bpo: drm/i915: Rename gen7 cmdparser tables
- SAUCE: i915_bpo: drm/i915: Disable Secure Batches for gen6+
- SAUCE: i915_bpo: drm/i915/cmdparser: Use binary search for faster register
  lookup
- SAUCE: i915_bpo: drm/i915/cmdparser: Check reg_table_count before
  derefencing.
- SAUCE: i915_bpo: drm/i915: Remove Master tables from cmdparser
- SAUCE: i915_bpo: drm/i915: Add support for mandatory cmdparsing
- SAUCE: i915_bpo: drm/i915: Support ro ppgtt mapped cmdparser shadow
  buffers
- SAUCE: i915_bpo: drm/i915: Allow parsing of unsized batches
- SAUCE: i915_bpo: drm/i915: Add gen9 BCS cmdparsing
- SAUCE: i915_bpo: drm/i915/cmdparser: Add support for backward jumps
- SAUCE: i915_bpo: drm/i915/cmdparser: Ignore Length operands during command
  matching

linux (4.4.0-167.196) xenial; urgency=medium

  * xenial/linux: 4.4.0-167.196 -proposed tracker (LP: #1849051)

  * Xenial update: 4.4.197 upstream stable release (LP: #1848780)
- KVM: s390: Test for bad access register and size at the start of
  S390_MEM_OP
- s390/topology: avoid firing events before kobjs are created
- s390/cio: avoid calling strlen on null pointer
- s390/cio: exclude subchannels with no parent from pseudo check
- KVM: nVMX: handle page fault in vmread fix
- ASoC: Define a set of DAPM pre/post-up events
- powerpc/powernv: Restrict OPAL symbol map to only be readable by root
- can: mcp251x: mcp251x_hw_reset(): allow more time after a reset
- crypto: qat - Silence smp_processor_id() warning
- ieee802154: atusb: fix use-after-free at disconnect
- cfg80211: initialize on-stack chandefs
- ima: always return negative code for error
- fs: nfs: Fix possible null-pointer de

[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-11-12 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 5.3.0-22.24

---
linux (5.3.0-22.24) eoan; urgency=medium

  * [REGRESSION] md/raid0: cannot assemble multi-zone RAID0 with
    default_layout setting (LP: #1849682)
- Revert "md/raid0: avoid RAID0 data corruption due to layout confusion."

  * refcount underflow and type confusion in shiftfs (LP: #1850867) //
    CVE-2019-15793
- SAUCE: shiftfs: Correct id translation for lower fs operations
- SAUCE: shiftfs: prevent type confusion
- SAUCE: shiftfs: Fix refcount underflow in btrfs ioctl handling

  * CVE-2018-12207
- kvm: x86, powerpc: do not allow clearing largepages debugfs entry
- SAUCE: KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is
  active
- SAUCE: x86: Add ITLB_MULTIHIT bug infrastructure
- SAUCE: kvm: mmu: ITLB_MULTIHIT mitigation
- SAUCE: kvm: Add helper function for creating VM worker threads
- SAUCE: kvm: x86: mmu: Recovery of shattered NX large pages
- SAUCE: cpu/speculation: Uninline and export CPU mitigations helpers
- SAUCE: kvm: x86: mmu: Apply global mitigations knob to ITLB_MULTIHIT

  * CVE-2019-11135
- x86/msr: Add the IA32_TSX_CTRL MSR
- x86/cpu: Add a helper function x86_read_arch_cap_msr()
- x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
- x86/speculation/taa: Add mitigation for TSX Async Abort
- x86/speculation/taa: Add sysfs reporting for TSX Async Abort
- kvm/x86: Export MDS_NO=0 to guests when TSX is enabled
- x86/tsx: Add "auto" option to the tsx= cmdline parameter
- x86/speculation/taa: Add documentation for TSX Async Abort
- x86/tsx: Add config options to set tsx=on|off|auto
- [Config] Disable TSX by default when possible

  * CVE-2019-0154
- SAUCE: drm/i915: Lower RM timeout to avoid DSI hard hangs
- SAUCE: drm/i915/gen8+: Add RC6 CTX corruption WA

  * CVE-2019-0155
- SAUCE: drm/i915: Rename gen7 cmdparser tables
- SAUCE: drm/i915: Disable Secure Batches for gen6+
- SAUCE: drm/i915: Remove Master tables from cmdparser
- SAUCE: drm/i915: Add support for mandatory cmdparsing
- SAUCE: drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
- SAUCE: drm/i915: Allow parsing of unsized batches
- SAUCE: drm/i915: Add gen9 BCS cmdparsing
- SAUCE: drm/i915/cmdparser: Use explicit goto for error paths
- SAUCE: drm/i915/cmdparser: Add support for backward jumps
- SAUCE: drm/i915/cmdparser: Ignore Length operands during command matching

linux (5.3.0-21.22) eoan; urgency=medium

  * eoan/linux: 5.3.0-21.22 -proposed tracker (LP: #1850486)

  * Fix signing of staging modules in eoan (LP: #1850234)
- [Packaging] Leave unsigned modules unsigned after adding .gnu_debuglink

linux (5.3.0-20.21) eoan; urgency=medium

  * eoan/linux: 5.3.0-20.21 -proposed tracker (LP: #1849064)

  * eoan: alsa/sof: Enable SOF_HDA link and codec (LP: #1848490)
- [Config] Enable SOF_HDA link and codec

  * Eoan update: 5.3.7 upstream stable release (LP: #1848750)
- panic: ensure preemption is disabled during panic()
- [Config] updateconfigs for USB_RIO500
- USB: rio500: Remove Rio 500 kernel driver
- USB: yurex: Don't retry on unexpected errors
- USB: yurex: fix NULL-derefs on disconnect
- USB: usb-skeleton: fix runtime PM after driver unbind
- USB: usb-skeleton: fix NULL-deref on disconnect
- xhci: Fix false warning message about wrong bounce buffer write length
- xhci: Prevent device initiated U1/U2 link pm if exit latency is too long
- xhci: Check all endpoints for LPM timeout
- xhci: Fix USB 3.1 capability detection on early xHCI 1.1 spec based hosts
- usb: xhci: wait for CNR controller not ready bit in xhci resume
- xhci: Prevent deadlock when xhci adapter breaks during init
- xhci: Fix NULL pointer dereference in xhci_clear_tt_buffer_complete()
- USB: adutux: fix use-after-free on disconnect
- USB: adutux: fix NULL-derefs on disconnect
- USB: adutux: fix use-after-free on release
- USB: iowarrior: fix use-after-free on disconnect
- USB: iowarrior: fix use-after-free on release
- USB: iowarrior: fix use-after-free after driver unbind
- USB: usblp: fix runtime PM after driver unbind
- USB: chaoskey: fix use-after-free on release
- USB: ldusb: fix NULL-derefs on driver unbind
- serial: uartlite: fix exit path null pointer
- serial: uartps: Fix uartps_major handling
- USB: serial: keyspan: fix NULL-derefs on open() and write()
- USB: serial: ftdi_sio: add device IDs for Sienna and Echelon PL-20
- USB: serial: option: add Telit FN980 compositions
- USB: serial: option: add support for Cinterion CLS8 devices
- USB: serial: fix runtime PM after driver unbind
- USB: usblcd: fix I/O after disconnect
- USB: microtek: fix info-leak at probe
- USB: dummy-hcd: fix power budget for SuperSpeed mode
- usb: renesas_usbhs: gadget: Do not discard queues in

[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-11-12 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 5.0.0-35.38

---
linux (5.0.0-35.38) disco; urgency=medium

  * [REGRESSION] md/raid0: cannot assemble multi-zone RAID0 with
    default_layout setting (LP: #1849682)
- SAUCE: Fix revert "md/raid0: avoid RAID0 data corruption due to layout
  confusion."

  * refcount underflow and type confusion in shiftfs (LP: #1850867) //
    CVE-2019-15793
- SAUCE: shiftfs: Correct id translation for lower fs operations
- SAUCE: shiftfs: prevent type confusion
- SAUCE: shiftfs: Fix refcount underflow in btrfs ioctl handling

  * CVE-2018-12207
- kvm: Convert kvm_lock to a mutex
- kvm: x86: Do not release the page inside mmu_set_spte()
- KVM: x86: make FNAME(fetch) and __direct_map more similar
- KVM: x86: remove now unneeded hugepage gfn adjustment
- KVM: x86: change kvm_mmu_page_get_gfn BUG_ON to WARN_ON
- KVM: x86: add tracepoints around __direct_map and FNAME(fetch)
- kvm: x86, powerpc: do not allow clearing largepages debugfs entry
- SAUCE: KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is
  active
- SAUCE: x86: Add ITLB_MULTIHIT bug infrastructure
- SAUCE: kvm: mmu: ITLB_MULTIHIT mitigation
- SAUCE: kvm: Add helper function for creating VM worker threads
- SAUCE: kvm: x86: mmu: Recovery of shattered NX large pages
- SAUCE: cpu/speculation: Uninline and export CPU mitigations helpers
- SAUCE: kvm: x86: mmu: Apply global mitigations knob to ITLB_MULTIHIT

  * CVE-2019-11135
- KVM: x86: use Intel speculation bugs and features as derived in
  generic x86 code
- x86/msr: Add the IA32_TSX_CTRL MSR
- x86/cpu: Add a helper function x86_read_arch_cap_msr()
- x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
- x86/speculation/taa: Add mitigation for TSX Async Abort
- x86/speculation/taa: Add sysfs reporting for TSX Async Abort
- kvm/x86: Export MDS_NO=0 to guests when TSX is enabled
- x86/tsx: Add "auto" option to the tsx= cmdline parameter
- x86/speculation/taa: Add documentation for TSX Async Abort
- x86/tsx: Add config options to set tsx=on|off|auto
- SAUCE: x86/speculation/taa: Call tsx_init()
- [Config] Disable TSX by default when possible

  * CVE-2019-0154
- SAUCE: drm/i915: Lower RM timeout to avoid DSI hard hangs
- SAUCE: drm/i915/gen8+: Add RC6 CTX corruption WA

  * CVE-2019-0155
- SAUCE: drm/i915: Rename gen7 cmdparser tables
- SAUCE: drm/i915: Disable Secure Batches for gen6+
- SAUCE: drm/i915: Remove Master tables from cmdparser
- SAUCE: drm/i915: Add support for mandatory cmdparsing
- SAUCE: drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
- SAUCE: drm/i915: Allow parsing of unsized batches
- SAUCE: drm/i915: Add gen9 BCS cmdparsing
- SAUCE: drm/i915/cmdparser: Use explicit goto for error paths
- SAUCE: drm/i915/cmdparser: Add support for backward jumps
- SAUCE: drm/i915/cmdparser: Ignore Length operands during command matching

linux (5.0.0-34.36) disco; urgency=medium

  * disco/linux:  -proposed tracker (LP: #1850574)

  * [REGRESSION] md/raid0: cannot assemble multi-zone RAID0 with
    default_layout setting (LP: #1849682)
- Revert "md/raid0: avoid RAID0 data corruption due to layout confusion."

linux (5.0.0-33.35) disco; urgency=medium

  * disco/linux: 5.0.0-33.35 -proposed tracker (LP: #1849003)

  * Disco update: upstream stable patchset 2019-10-18 (LP: #1848817)
- tpm: use tpm_try_get_ops() in tpm-sysfs.c.
- drm/bridge: tc358767: Increase AUX transfer length limit
- drm/panel: simple: fix AUO g185han01 horizontal blanking
- video: ssd1307fb: Start page range at page_offset
- drm/stm: attach gem fence to atomic state
- drm/panel: check failure cases in the probe func
- drm/rockchip: Check for fast link training before enabling psr
- drm/radeon: Fix EEH during kexec
- gpu: drm: radeon: Fix a possible null-pointer dereference in
  radeon_connector_set_property()
- PCI: rpaphp: Avoid a sometimes-uninitialized warning
- ipmi_si: Only schedule continuously in the thread in maintenance mode
- clk: qoriq: Fix -Wunused-const-variable
- clk: sunxi-ng: v3s: add missing clock slices for MMC2 module clocks
- drm/amd/display: fix issue where 252-255 values are clipped
- drm/amd/display: reprogram VM config when system resume
- powerpc/powernv/ioda2: Allocate TCE table levels on demand for default DMA
  window
- clk: actions: Don't reference clk_init_data after registration
- clk: sirf: Don't reference clk_init_data after registration
- clk: sprd: Don't reference clk_init_data after registration
- clk: zx296718: Don't reference clk_init_data after registration
- powerpc/xmon: Check for HV mode when dumping XIVE info from OPAL
- powerpc/rtas: use device model APIs and serialization during LPM
- powerpc/futex: Fix warning: 'oldval' may be used un

[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-11-12 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 4.15.0-69.78

---
linux (4.15.0-69.78) bionic; urgency=medium

  * KVM NULL pointer deref (LP: #1851205)
- KVM: nVMX: handle page fault in vmread fix

  * CVE-2018-12207
- KVM: MMU: drop vcpu param in gpte_access
- kvm: Convert kvm_lock to a mutex
- kvm: x86: Do not release the page inside mmu_set_spte()
- KVM: x86: make FNAME(fetch) and __direct_map more similar
- KVM: x86: remove now unneeded hugepage gfn adjustment
- KVM: x86: change kvm_mmu_page_get_gfn BUG_ON to WARN_ON
- KVM: x86: add tracepoints around __direct_map and FNAME(fetch)
- kvm: x86, powerpc: do not allow clearing largepages debugfs entry
- SAUCE: KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is
  active
- SAUCE: x86: Add ITLB_MULTIHIT bug infrastructure
- SAUCE: kvm: mmu: ITLB_MULTIHIT mitigation
- SAUCE: kvm: Add helper function for creating VM worker threads
- SAUCE: kvm: x86: mmu: Recovery of shattered NX large pages
- SAUCE: cpu/speculation: Uninline and export CPU mitigations helpers
- SAUCE: kvm: x86: mmu: Apply global mitigations knob to ITLB_MULTIHIT

  * CVE-2019-11135
- KVM: x86: use Intel speculation bugs and features as derived in
  generic x86 code
- x86/msr: Add the IA32_TSX_CTRL MSR
- x86/cpu: Add a helper function x86_read_arch_cap_msr()
- x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
- x86/speculation/taa: Add mitigation for TSX Async Abort
- x86/speculation/taa: Add sysfs reporting for TSX Async Abort
- kvm/x86: Export MDS_NO=0 to guests when TSX is enabled
- x86/tsx: Add "auto" option to the tsx= cmdline parameter
- x86/speculation/taa: Add documentation for TSX Async Abort
- x86/tsx: Add config options to set tsx=on|off|auto
- SAUCE: x86/speculation/taa: Call tsx_init()
- SAUCE: x86/cpu: Include cpu header from bugs.c
- [Config] Disable TSX by default when possible

  * CVE-2019-0154
- SAUCE: drm/i915: Lower RM timeout to avoid DSI hard hangs
- SAUCE: drm/i915/gen8+: Add RC6 CTX corruption WA

  * CVE-2019-0155
- drm/i915/gtt: Add read only pages to gen8_pte_encode
- drm/i915/gtt: Read-only pages for insert_entries on bdw+
- drm/i915/gtt: Disable read-only support under GVT
- drm/i915: Prevent writing into a read-only object via a GGTT mmap
- drm/i915/cmdparser: Check reg_table_count before derefencing.
- drm/i915/cmdparser: Do not check past the cmd length.
- drm/i915: Silence smatch for cmdparser
- drm/i915: Move engine->needs_cmd_parser to engine->flags
- SAUCE: drm/i915: Rename gen7 cmdparser tables
- SAUCE: drm/i915: Disable Secure Batches for gen6+
- SAUCE: drm/i915: Remove Master tables from cmdparser
- SAUCE: drm/i915: Add support for mandatory cmdparsing
- SAUCE: drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
- SAUCE: drm/i915: Allow parsing of unsized batches
- SAUCE: drm/i915: Add gen9 BCS cmdparsing
- SAUCE: drm/i915/cmdparser: Use explicit goto for error paths
- SAUCE: drm/i915/cmdparser: Add support for backward jumps
- SAUCE: drm/i915/cmdparser: Ignore Length operands during command matching

linux (4.15.0-68.77) bionic; urgency=medium

  * bionic/linux: 4.15.0-68.77 -proposed tracker (LP: #1849855)

  * [REGRESSION] md/raid0: cannot assemble multi-zone RAID0 with
    default_layout setting (LP: #1849682)
- Revert "md/raid0: avoid RAID0 data corruption due to layout confusion."

linux (4.15.0-67.76) bionic; urgency=medium

  * bionic/linux: 4.15.0-67.76 -proposed tracker (LP: #1849035)

  * Unexpected CFS throttling  (LP: #1832151)
- sched/fair: Add lsub_positive() and use it consistently
- sched/fair: Fix low cpu usage with high throttling by removing
  expiration of cpu-local slices
- sched/fair: Fix -Wunused-but-set-variable warnings

  * [CML] New device IDs for CML-U (LP: #1843774)
- i2c: i801: Add support for Intel Comet Lake
- spi: pxa2xx: Add support for Intel Comet Lake

  * CVE-2019-17666
- SAUCE: rtlwifi: rtl8822b: Fix potential overflow on P2P code
- SAUCE: rtlwifi: Fix potential overflow on P2P code

  * md raid0/linear doesn't show error state if an array member is
    removed and allows successful writes (LP: #1847773)
- md raid0/linear: Mark array as 'broken' and fail BIOs if a member is gone

  * Change Config Option CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE for s390x
    from yes to no (LP: #1848492)
- [Config] Change Config Option CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE
  for s390x from yes to no

  * [Packaging] Support building Flattened Image Tree (FIT) kernels
    (LP: #1847969)
- [Packaging] add rules to build FIT image
- [Packaging] force creation of headers directory

  * bcache: Performance degradation when querying priority_stats (LP: #1840043)
- bcache: add cond_resched() in __bch_cache_cmp()

  * Add installer support fo

[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-10-26 Thread Heitor Alves de Siqueira
Verified on eoan with test case from description:
# uname -r
5.3.0-20-generic 

Performance kept steady throughout the priority_stats sysfs query.

** Tags removed: verification-needed-eoan
** Tags added: verification-done-eoan


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-10-26 Thread Heitor Alves de Siqueira
Verified on disco with test case from description:
# uname -r
5.0.0-33-generic

Write performance wasn't significantly affected in this test,
fluctuating between ~120MB/s and ~110MB/s.

** Tags removed: verification-needed-disco
** Tags added: verification-done-disco


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-10-26 Thread Heitor Alves de Siqueira
Verified on bionic with test case from description:
# uname -r
4.15.0-67-generic

Write performance dropped from ~150MB/s to ~147MB/s, and the system is
still responsive.

** Tags removed: verification-needed-bionic
** Tags added: verification-done-bionic


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-10-25 Thread Heitor Alves de Siqueira
Verified on xenial with test case from description:
# uname -r
4.4.0-167-generic

Bcache write performance wasn't affected significantly by the sysfs
query (~200MB/s to ~197MB/s), so the patch looks to be working fine.

** Tags removed: verification-needed-xenial
** Tags added: verification-done-xenial


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-10-24 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-eoan' to 'verification-done-eoan'. If the problem
still exists, change the tag 'verification-needed-eoan' to
'verification-failed-eoan'.

If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-eoan


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-10-22 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-xenial' to 'verification-done-xenial'. If the
problem still exists, change the tag 'verification-needed-xenial' to
'verification-failed-xenial'.

If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-xenial


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-10-22 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-bionic' to 'verification-done-bionic'. If the
problem still exists, change the tag 'verification-needed-bionic' to
'verification-failed-bionic'.

If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-bionic


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-10-22 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-disco' to 'verification-done-disco'. If the
problem still exists, change the tag 'verification-needed-disco' to
'verification-failed-disco'.

If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-disco


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-10-17 Thread Kleber Sacilotto de Souza
** Changed in: linux (Ubuntu Xenial)
   Status: New => Fix Committed

** Changed in: linux (Ubuntu Bionic)
   Status: New => Fix Committed

** Changed in: linux (Ubuntu Disco)
   Status: New => Fix Committed

** Changed in: linux (Ubuntu Eoan)
   Status: In Progress => Fix Committed


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-10-17 Thread Kleber Sacilotto de Souza
** Also affects: linux (Ubuntu Eoan)
   Importance: Undecided
 Assignee: Heitor Alves de Siqueira (halves)
   Status: In Progress

** Also affects: linux (Ubuntu Disco)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Disco)
 Assignee: (unassigned) => Heitor Alves de Siqueira (halves)

** Changed in: linux (Ubuntu Bionic)
 Assignee: (unassigned) => Heitor Alves de Siqueira (halves)

** Changed in: linux (Ubuntu Xenial)
 Assignee: (unassigned) => Heitor Alves de Siqueira (halves)


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-10-10 Thread Heitor Alves de Siqueira
Patches sent to the kernel-team list:
https://lists.ubuntu.com/archives/kernel-team/2019-October/104658.html


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-10-10 Thread Heitor Alves de Siqueira
Upstream commit:
https://git.kernel.org/linus/d55a4ae9e1af
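
For reference, the fix is tiny: the comparison function used to sort the
priority_stats array gains a cond_resched() call, so the scheduler can
preempt the long-running sort. A sketch of the patched comparator (as in
the commit above; the exact surrounding context may differ slightly):

static int __bch_cache_cmp(const void *l, const void *r)
{
        cond_resched();                 /* offer a reschedule point */
        return *((uint16_t *)r) - *((uint16_t *)l);
}

With this in place every comparison is a potential preemption point,
which is also why the sysfs query itself can now take a bit longer.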


** Changed in: linux
   Status: In Progress => Fix Committed

** Changed in: linux (Ubuntu)
   Status: Confirmed => In Progress


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-10-10 Thread Heitor Alves de Siqueira
** Description changed:

  [Impact]
- Performance degradation for read/write workloads in bcache devices, 
occasional system stalls
+ Querying bcache's priority_stats attribute in sysfs causes severe performance 
degradation for read/write workloads and occasional system stalls
+ 
+ [Test Case]
+ Note: As the sorting step has the most noticeable performance impact, the 
test case below pins a workload and the sysfs query to the same CPU. CPU 
contention issues still occur without any pinning, this just removes the 
scheduling factor of landing in different CPUs and affecting different tasks.
+ 
+ 1) Start a read/write workload on the bcache device with e.g. fio or dd, 
pinned to a certain CPU:
+ # taskset 0x10 dd if=/dev/zero of=/dev/bcache0 bs=4k status=progress
+ 
+ 2) Start a sysfs query loop for the priority_stats attribute pinned to the 
same CPU:
+ # for i in {1..10}; do taskset 0x10 cat 
/sys/fs/bcache/*/cache0/priority_stats > /dev/null; done
+ 
+ 3) Monitor the read/write workload for any performance impact
+ 
+ [Fix]
+ To fix CPU contention and performance impact, a cond_resched() call is 
introduced in the priority_stats sort comparison.
+ 
+ [Regression Potential]
+ Regression potential is low, as the change is confined to the priority_stats 
sysfs query. In cases where frequent queries to bcache priority_stats take 
place (e.g. node_exporter), the impact should be more noticeable as those could 
now take a bit longer to complete. A regression due to this patch would most 
likely show up as a performance degradation in bcache-focused workloads.
+ 
+ --
  
  [Description]
  In the latest bcache drivers, there's a sysfs attribute that calculates 
bucket priority statistics in /sys/fs/bcache/*/cache0/priority_stats. Querying 
this file has a big performance impact on tasks that run in the same CPU, and 
also affects read/write performance of the bcache device itself.
  
  This is due to the way the driver calculates the stats: the bcache
  buckets are locked and iterated through, collecting information about
  each individual bucket. An array of nbucket elements is constructed and
  sorted afterwards, which can cause very high CPU contention in cases of
  larger bcache setups.
  
  From our tests, the sorting step of the priority_stats query causes the
  most expressive performance reduction, as it can hinder tasks that are
  not even doing any bcache IO. If a task is "unlucky" to be scheduled in
  the same CPU as the sysfs query, its performance will be harshly reduced
  as both compete for CPU time. We've had users report systems stalls of
  up to ~6s due to this, as a result from monitoring tools that query the
  priority_stats periodically (e.g. Prometheus Node Exporter from [0]).
  These system stalls have triggered several other issues such as ceph-mon
  re-elections, problems in percona-cluster and general network stalls, so
  the impact is not isolated to bcache IO workloads.
  
+ An example benchmark can be seen in [1], where the read performance on a
+ bcache device suffered quite heavily (going from ~40k IOPS to ~4k IOPS
+ due to priority_stats). Other comparison charts are found under [2].
+ 
  [0] https://github.com/prometheus/node_exporter
- 
- [Test Case]
- Note: As the sorting step has the most noticeable performance impact, the 
test case below pins a workload and the sysfs query to the same CPU. CPU 
contention issues still occur without any pinning, this just removes the 
scheduling factor of landing in different CPUs and affecting different tasks.
- 
- 1) Start a read/write workload on the bcache device with e.g. fio or dd, 
pinned to a certain CPU:
- # taskset 0x10 dd if=/dev/zero of=/dev/bcache0 bs=4k status=progress
- 
- 2) Start a sysfs query loop for the priority_stats attribute pinned to the 
same CPU:
- # for i in {1..10}; do taskset 0x10 cat 
/sys/fs/bcache/*/cache0/priority_stats > /dev/null; done
- 
- 3) Monitor the read/write workload for any performance impact
+ [1] 
https://people.canonical.com/~halves/priority_stats/read/4k-iops-2Dsmooth.png
+ [2] https://people.canonical.com/~halves/priority_stats/


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-08-20 Thread Peter Sabaini
** Tags added: canonical-bootstack


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-08-16 Thread Heitor Alves de Siqueira
The cond_resched() patch should make its way into the 5.4 merge window
(see the bcache maintainer's response at [0]).

[0] https://lore.kernel.org/lkml/74950e24-245a-c627-0e2e-32ac0b304...@suse.de/

** Also affects: linux
   Importance: Undecided
   Status: New

** Changed in: linux
   Status: New => In Progress


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-08-14 Thread Heitor Alves de Siqueira
I've uploaded comparison graphs at [0] to avoid decompressing the
.tar.gz archive. Complete logs and graphs from individual runs are in
the archive only, as they're too unwieldy to navigate.

[0] https://people.canonical.com/~halves/priority_stats/


[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-08-13 Thread Heitor Alves de Siqueira
From the comparison graphs, we can see that the impact of the sysfs
query is very significant for both read and write workloads. Taking just
the "sysfs" test from my previous comment into account:

In the random write tests:
* For 512b block sizes, we see about an 85% reduction in BW (down to ~2MB/s from ~16MB/s)
* For 4k block sizes, the reduction in BW is also about 85% (~30MB/s compared to ~240MB/s)
* For 512k block sizes (== bucket size) and higher, BW is reduced by about 64% (~90MB/s vs ~250MB/s)

In the random read tests:
* For 512b block sizes, BW goes down to ~3MB/s from ~25MB/s
* For 4k block sizes, the BW reduction is around 90% (~10MB/s compared to ~160MB/s)
* For 512k block sizes and higher, BW is reduced by about 90% (~150MB/s vs ~1.6GB/s)

We see similar results for the IOPS measurements, and the latency
numbers are also much worse in the sysfs test: we observe frequent
latency spikes (150ms+) when running fio together with the
priority_stats query, and latency averages increase by at least ~50ms.
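
(For reference, the fio runs were along these lines; the exact job files
are only in the archived logs, so the parameters below are an
approximation:
# taskset 0x10 fio --name=randread --filename=/dev/bcache0 --rw=randread \
      --bs=4k --direct=1 --ioengine=libaio --iodepth=32 \
      --runtime=60 --time_based --group_reporting
run once on its own for the "raw" numbers, and again with the
priority_stats query loop pinned to the same CPU for the "sysfs"
numbers.)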

Surprisingly, the "mutex" patch didn't improve the test results much.
This was the case for both read and write workloads, which suggests that
the bucket locking has much less impact on the system than the sorting
does.

The cond_resched() patch showed great results, even though it makes the
sysfs queries take a bit longer. The write throughput of the bcache device
is _much_ better with it, and the system no longer stalls (even when
pinning processes to the same CPU as the sysfs query). In some cases, it
brings performance back to values close to the "raw" tests (i.e. without
any sysfs queries). This patch seems like the best short-term solution, as
a slightly slower sysfs query shouldn't be a problem in most setups,
whereas the IO performance degradation and stalls are much more noticeable.
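
The trade-off is easy to verify on a given system by timing the reads
directly, before and after patching. A sketch, assuming a single cache set:

  # Time a few consecutive priority_stats reads; compare the results
  # from a patched and an unpatched kernel.
  for i in {1..5}; do
      time cat /sys/fs/bcache/*/cache0/priority_stats > /dev/null
  done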

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1840043

Title:
  bcache: Performance degradation when querying priority_stats

Status in linux package in Ubuntu:
  Confirmed



[Kernel-packages] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2019-08-13 Thread Heitor Alves de Siqueira
This has been reported upstream as well, with a tentative patch in
https://lkml.org/lkml/2019/3/7/8

I've done some fio tests to get more data on this issue, with the
following scenarios (an example invocation is sketched after the list):
- "raw" test -> fio test without any sysfs queries
- "sysfs" test   -> fio + scripted sysfs queries
- "mutex" test   -> fio + sysfs + mutex patch
- "resched" test -> fio + sysfs + cond_resched() patch

The "mutex patch" removed bucket locking from the sysfs query. This
caused the stats to be computed all wrong of course, but we weren't
interested in the stats themselves for now (just on the performance
impact of the bucket locking).

The "cond_resched()" patch was suggested upstream by Shile Zhang, and
introduces a cond_resched() call in the comparison function used for
sorting.

Tests were run on a NVMe-backed bcache device with writeback enabled.
The full logs and graphs are available in the bcache-results.tar.gz
attachment.

** Attachment added: "bcache-results.tar.gz"
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1840043/+attachment/5282348/+files/bcache-results.tar.gz

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1840043

Title:
  bcache: Performance degradation when querying priority_stats

Status in linux package in Ubuntu:
  Incomplete
