[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2023-11-07 Thread Mauricio Faria de Oliveira
** Attachment added: "dm_fake_dev.sh"
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+attachment/5716855/+files/dm_fake_dev.sh

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  Fix Released
Status in linux source package in Eoan:
  Fix Released

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of many patches
     with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
     read-write).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
     in bcache code, however, synthetic testing of the patches
     has been performed, and extensive regression/stress tests
     were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
     patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
     (e.g., BootStack, UA cloud, except rare all-flash cases).

  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-08-13 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 5.0.0-25.26

---
linux (5.0.0-25.26) disco; urgency=medium

  * CVE-2019-1125
- x86/cpufeatures: Carve out CQM features retrieval
- x86/cpufeatures: Combine word 11 and 12 into a new scattered features word
- x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations
- x86/speculation: Enable Spectre v1 swapgs mitigations
- x86/entry/64: Use JMP instead of JMPQ
- x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS

 -- Kleber Sacilotto de Souza   Thu, 01 Aug
2019 12:04:35 +0200

** Changed in: linux (Ubuntu Disco)
   Status: Fix Committed => Fix Released

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2019-1125

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  Fix Released
Status in linux source package in Eoan:
  Fix Released

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of many patches
     with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
     read-write).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
     in bcache code, however, synthetic testing of the patches
     has been performed, and extensive regression/stress tests
     were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
     patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
     (e.g., BootStack, UA cloud, except rare all-flash cases).

  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-08-09 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 5.2.0-10.11

---
linux (5.2.0-10.11) eoan; urgency=medium

  * eoan/linux: 5.2.0-10.11 -proposed tracker (LP: #1838113)

  * Packaging resync (LP: #1786013)
- [Packaging] resync git-ubuntu-log

  * Eoan update: v5.2.4 upstream stable release (LP: #1838428)
- bnx2x: Prevent load reordering in tx completion processing
- caif-hsi: fix possible deadlock in cfhsi_exit_module()
- hv_netvsc: Fix extra rcu_read_unlock in netvsc_recv_callback()
- igmp: fix memory leak in igmpv3_del_delrec()
- ipv4: don't set IPv6 only flags to IPv4 addresses
- ipv6: rt6_check should return NULL if 'from' is NULL
- ipv6: Unlink sibling route in case of failure
- net: bcmgenet: use promisc for unsupported filters
- net: dsa: mv88e6xxx: wait after reset deactivation
- net: make skb_dst_force return true when dst is refcounted
- net: neigh: fix multiple neigh timer scheduling
- net: openvswitch: fix csum updates for MPLS actions
- net: phy: sfp: hwmon: Fix scaling of RX power
- net_sched: unset TCQ_F_CAN_BYPASS when adding filters
- net: stmmac: Re-work the queue selection for TSO packets
- net/tls: make sure offload also gets the keys wiped
- nfc: fix potential illegal memory access
- r8169: fix issue with confused RX unit after PHY power-down on RTL8411b
- rxrpc: Fix send on a connected, but unbound socket
- sctp: fix error handling on stream scheduler initialization
- sctp: not bind the socket in sctp_connect
- sky2: Disable MSI on ASUS P6T
- tcp: be more careful in tcp_fragment()
- tcp: fix tcp_set_congestion_control() use from bpf hook
- tcp: Reset bytes_acked and bytes_received when disconnecting
- vrf: make sure skb->data contains ip header to make routing
- net/mlx5e: IPoIB, Add error path in mlx5_rdma_setup_rn
- net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling
- net: bridge: mcast: fix stale ipv6 hdr pointer when handling v6 query
- net: bridge: don't cache ether dest pointer on input
- net: bridge: stp: don't cache eth dest pointer before skb pull
- macsec: fix use-after-free of skb during RX
- macsec: fix checksumming after decryption
- netrom: fix a memory leak in nr_rx_frame()
- netrom: hold sock when setting skb->destructor
- selftests: txring_overwrite: fix incorrect test of mmap() return value
- net/tls: fix poll ignoring partially copied records
- net/tls: reject offload of TLS 1.3
- net/mlx5e: Fix port tunnel GRE entropy control
- net/mlx5e: Rx, Fix checksum calculation for new hardware
- net/mlx5e: Fix return value from timeout recover function
- net/mlx5e: Fix error flow in tx reporter diagnose
- bnxt_en: Fix VNIC accounting when enabling aRFS on 57500 chips.
- mlxsw: spectrum_dcb: Configure DSCP map as the last rule is removed
- net/mlx5: E-Switch, Fix default encap mode
- mlxsw: spectrum: Do not process learned records with a dummy FID
- dma-buf: balance refcount inbalance
- dma-buf: Discard old fence_excl on retrying get_fences_rcu for realloc
- Revert "gpio/spi: Fix spi-gpio regression on active high CS"
- gpiolib: of: fix a memory leak in of_gpio_flags_quirks()
- gpio: davinci: silence error prints in case of EPROBE_DEFER
- MIPS: lb60: Fix pin mappings
- perf script: Assume native_arch for pipe mode
- perf/core: Fix exclusive events' grouping
- perf/core: Fix race between close() and fork()
- ext4: don't allow any modifications to an immutable file
- ext4: enforce the immutable flag on open files
- mm: add filemap_fdatawait_range_keep_errors()
- jbd2: introduce jbd2_inode dirty range scoping
- ext4: use jbd2_inode dirty range scoping
- ext4: allow directory holes
- KVM: nVMX: do not use dangling shadow VMCS after guest reset
- KVM: nVMX: Clear pending KVM_REQ_GET_VMCS12_PAGES when leaving nested
- Revert "kvm: x86: Use task structs fpu field for user"
- sd_zbc: Fix report zones buffer allocation
- block: Limit zone array allocation size
- net: sched: verify that q!=NULL before setting q->flags
- Linux 5.2.4

  * linux hwe i386 kernel 5.0.0-21.22~18.04.1 crashes on Lenovo x220
(LP: #1838115)
- x86/mm: Check for pfn instead of page in vmalloc_sync_one()
- x86/mm: Sync also unmappings in vmalloc_sync_all()
- mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()

  * br_netfilter: namespace sysctl operations (LP: #1836910)
- netfilter: bridge: port sysctls to use brnf_net
- netfilter: bridge: namespace bridge netfilter sysctls
- netfilter: bridge: prevent UAF in brnf_exit_net()

  * Eoan update: v5.2.3 upstream stable release (LP: #1838089)
- ath10k: Check tx_stats before use it
- ath10k: htt: don't use txdone_fifo with SDIO
- ath10k: fix incorrect multicast/broadcast rate setting
- ath9k: Don't trust TX status TID number

[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-08-07 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag 'verification-needed-
xenial' to 'verification-done-xenial'. If the problem still exists,
change the tag 'verification-needed-xenial' to 'verification-failed-
xenial'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how
to enable and use -proposed. Thank you!


** Tags added: verification-needed-xenial

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  Fix Committed
Status in linux source package in Eoan:
  Fix Committed

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of many patches
     with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
     read-write).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
     in bcache code, however, synthetic testing of the patches
     has been performed, and extensive regression/stress tests
     were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
     patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
     (e.g., BootStack, UA cloud, except rare all-flash cases).

  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-31 Thread Mauricio Faria de Oliveira
Verification done for bionic-proposed
based on the 3 tests / comments above.

** Tags removed: verification-needed-bionic
** Tags added: verification-done-bionic

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  Fix Committed
Status in linux source package in Eoan:
  Fix Committed

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of many patches
     with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
     read-write).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
     in bcache code, however, synthetic testing of the patches
     has been performed, and extensive regression/stress tests
     were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
     patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
     (e.g., BootStack, UA cloud, except rare all-flash cases).

  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-31 Thread Mauricio Faria de Oliveira
Verification with bionic-proposed of fio results.

For details see comment #5.

Overall performance has improved on the -proposed kernel
(tested in build without the patchset) and the patchset
as is in the -proposed kernel did improve it a bit more.

The results below show 3 results for each cache-mode on
(rand)rw tests, in:
- 'orig'inal kernel (4.15.0-55), 
- 'test' kernel (4.15.0-55 + patchset), 
- 'tpwo' test-proposed without patchset,
- 'tpwp' test-proposed with patchset (i.e., bionic-proposed)

All numbers are good/better.

fio.test1 fio.orig fio.test.none.randrw.log:   read: IOPS=79, BW=318KiB/s 
(325kB/s)(55.9MiB/180092msec)
fio.test2 fio.orig fio.test.none.randrw.log:   read: IOPS=81, BW=328KiB/s 
(336kB/s)(57.7MiB/180137msec)
fio.test3 fio.orig fio.test.none.randrw.log:   read: IOPS=80, BW=324KiB/s 
(331kB/s)(56.9MiB/180142msec)
fio.test1 fio.test fio.test.none.randrw.log:   read: IOPS=81, BW=326KiB/s 
(334kB/s)(57.4MiB/180222msec)
fio.test2 fio.test fio.test.none.randrw.log:   read: IOPS=80, BW=324KiB/s 
(331kB/s)(56.0MiB/180177msec)
fio.test3 fio.test fio.test.none.randrw.log:   read: IOPS=77, BW=312KiB/s 
(319kB/s)(54.8MiB/180103msec)
fio.test1 fio.tpwo fio.test.none.randrw.log:   read: IOPS=189, BW=759KiB/s 
(777kB/s)(133MiB/180047msec)
fio.test2 fio.tpwo fio.test.none.randrw.log:   read: IOPS=192, BW=768KiB/s 
(787kB/s)(135MiB/180052msec)
fio.test3 fio.tpwo fio.test.none.randrw.log:   read: IOPS=190, BW=760KiB/s 
(778kB/s)(134MiB/180054msec)
fio.test1 fio.tpwp fio.test.none.randrw.log:   read: IOPS=212, BW=849KiB/s 
(869kB/s)(149MiB/180080msec)
fio.test2 fio.tpwp fio.test.none.randrw.log:   read: IOPS=211, BW=845KiB/s 
(865kB/s)(149MiB/180063msec)
fio.test3 fio.tpwp fio.test.none.randrw.log:   read: IOPS=210, BW=843KiB/s 
(863kB/s)(148MiB/180051msec)
fio.test1 fio.orig fio.test.none.rw.log:   read: IOPS=81.8k, BW=319MiB/s 
(335MB/s)(56.2GiB/180023msec)
fio.test2 fio.orig fio.test.none.rw.log:   read: IOPS=82.7k, BW=323MiB/s 
(339MB/s)(56.8GiB/180070msec)
fio.test3 fio.orig fio.test.none.rw.log:   read: IOPS=82.8k, BW=323MiB/s 
(339MB/s)(56.9GiB/180006msec)
fio.test1 fio.test fio.test.none.rw.log:   read: IOPS=83.2k, BW=325MiB/s 
(341MB/s)(57.1GiB/180035msec)
fio.test2 fio.test fio.test.none.rw.log:   read: IOPS=85.2k, BW=333MiB/s 
(349MB/s)(58.5GiB/180030msec)
fio.test3 fio.test fio.test.none.rw.log:   read: IOPS=82.9k, BW=324MiB/s 
(339MB/s)(56.9GiB/180003msec)
fio.test1 fio.tpwo fio.test.none.rw.log:   read: IOPS=119k, BW=463MiB/s 
(486MB/s)(19.0GiB/44178msec)
fio.test2 fio.tpwo fio.test.none.rw.log:   read: IOPS=120k, BW=468MiB/s 
(491MB/s)(19.0GiB/43727msec)
fio.test3 fio.tpwo fio.test.none.rw.log:   read: IOPS=123k, BW=481MiB/s 
(505MB/s)(19.0GiB/42553msec)
fio.test1 fio.tpwp fio.test.none.rw.log:   read: IOPS=128k, BW=501MiB/s 
(525MB/s)(19.0GiB/40860msec)
fio.test2 fio.tpwp fio.test.none.rw.log:   read: IOPS=134k, BW=525MiB/s 
(551MB/s)(19.0GiB/38971msec)
fio.test3 fio.tpwp fio.test.none.rw.log:   read: IOPS=144k, BW=563MiB/s 
(591MB/s)(19.0GiB/36352msec)
fio.test1 fio.orig fio.test.writearound.randrw.log:   read: IOPS=78, 
BW=315KiB/s (322kB/s)(55.3MiB/180123msec)
fio.test2 fio.orig fio.test.writearound.randrw.log:   read: IOPS=78, 
BW=313KiB/s (321kB/s)(55.1MiB/180114msec)
fio.test3 fio.orig fio.test.writearound.randrw.log:   read: IOPS=81, 
BW=325KiB/s (333kB/s)(57.2MiB/180108msec)
fio.test1 fio.test fio.test.writearound.randrw.log:   read: IOPS=81, 
BW=324KiB/s (332kB/s)(57.0MiB/180107msec)
fio.test2 fio.test fio.test.writearound.randrw.log:   read: IOPS=78, 
BW=315KiB/s (323kB/s)(55.4MiB/180069msec)
fio.test3 fio.test fio.test.writearound.randrw.log:   read: IOPS=78, 
BW=312KiB/s (320kB/s)(54.9MiB/180109msec)
fio.test1 fio.tpwo fio.test.writearound.randrw.log:   read: IOPS=188, 
BW=754KiB/s (772kB/s)(133MiB/180033msec)
fio.test2 fio.tpwo fio.test.writearound.randrw.log:   read: IOPS=190, 
BW=760KiB/s (778kB/s)(134MiB/180073msec)
fio.test3 fio.tpwo fio.test.writearound.randrw.log:   read: IOPS=190, 
BW=762KiB/s (781kB/s)(134MiB/180060msec)
fio.test1 fio.tpwp fio.test.writearound.randrw.log:   read: IOPS=210, 
BW=840KiB/s (861kB/s)(148MiB/180100msec)
fio.test2 fio.tpwp fio.test.writearound.randrw.log:   read: IOPS=212, 
BW=848KiB/s (869kB/s)(149MiB/180065msec)
fio.test3 fio.tpwp fio.test.writearound.randrw.log:   read: IOPS=211, 
BW=846KiB/s (867kB/s)(149MiB/180088msec)
fio.test1 fio.orig fio.test.writearound.rw.log:   read: IOPS=81.8k, BW=319MiB/s 
(335MB/s)(56.2GiB/180046msec)
fio.test2 fio.orig fio.test.writearound.rw.log:   read: IOPS=81.6k, BW=319MiB/s 
(334MB/s)(56.0GiB/180004msec)
fio.test3 fio.orig fio.test.writearound.rw.log:   read: IOPS=82.7k, BW=323MiB/s 
(339MB/s)(56.8GiB/180002msec)
fio.test1 fio.test fio.test.writearound.rw.log:   read: IOPS=82.8k, BW=323MiB/s 
(339MB/s)(56.9GiB/180097msec)
fio.test2 fio.test fio.test.writearound.rw.log:   read: IOPS=79.6k, BW=311MiB/s 
(326MB/s)(54.6GiB/180003msec)
fio.test3 fio.test fio.test.writearound.rw.log:   read: IOPS=81.7k, B

[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-31 Thread Mauricio Faria de Oliveira
Verification with bionic-proposed of xfstests results.

No regressions introduced by this bcache patchset.

The comparison between -updates and -proposed is not directly possible
because -proposed introduced failures via other components in I/O path
(e.g., block, ext4).

This is described below, and just to make sure, the -proposed kernel
has been rebuilt with the bcache patchset reverted, and test results
are the same (same failures with/without the patchset; no regression).

It's also been confirmed (below) that tests with a raw block device
(sda) instead of bcache device (thus eliminating the bcache code)
shows the new/introduced failures (ext4/035, generic/553, generic/554).

(the output below does look better on a very wide screen. :-)


proposed kernel: 4.15.0-55.62 (with patchset)
---

xfstests.test.none.log: Failures: ext4/032 ext4/035 generic/371 
--- generic/484 generic/491 --- generic/537 --- 
--- generic/553 generic/554
xfstests.test.writearound.log:  Failures: ext4/032 ext4/035 --- 
generic/451 generic/484 generic/491 --- generic/537 --- 
--- generic/553 generic/554
xfstests.test.writeback.log:Failures:  ext4/035 --- 
--- generic/484 generic/491 --- generic/537 --- 
--- generic/553 generic/554
xfstests.test.writethrough.log: Failures: ext4/032 ext4/035 --- 
--- generic/484 generic/491 --- generic/537 --- 
--- generic/553 generic/554

sda-only (no bcache)Failures: ext4/032 ext4/035 generic/371
--- generic/484 generic/491 --- --- generic/538
--- generic/553 generic/554


proposed kernel (4.15.0-56 without patchset)
---

$ uname -rv
4.15.0-56-generic #62+test20190730b1 SMP Tue Jul 30 18:25:01 -03 2019


xfstests.test.none.log: Failures: ext4/032 ext4/035 generic/371 
--- generic/484 generic/491 --- generic/537 --- 
--- generic/553 generic/554
xfstests.test.writearound.log:  Failures: ext4/032 ext4/035 --- 
generic/451 generic/484 generic/491 --- generic/537 --- 
generic/547 generic/553 generic/554
xfstests.test.writeback.log:Failures:  ext4/035 --- 
--- generic/484 generic/491 --- generic/537 --- 
--- generic/553 generic/554
xfstests.test.writethrough.log: Failures: ext4/032 ext4/035 --- 
--- generic/484 generic/491 --- generic/537 --- 
--- generic/553 generic/554


test kernel (4.15.0-55 with patchset)
---

xfstests.test.none.log: Failures: ext4/032  --- 
--- generic/484 generic/491 generic/504 generic/537 --- 
--- --- ---
xfstests.test.writearound.log:  Failures: ext4/032  --- 
generic/451 generic/484 generic/491 generic/504 generic/537 --- 
--- --- --- 
xfstests.test.writeback.log:Failures:   --- 
--- generic/484 generic/491 generic/504 generic/537 --- 
--- --- --- 
xfstests.test.writethrough.log: Failures: ext4/032  --- 
--- generic/484 generic/491 generic/504 generic/537 --- 
--- --- --- 

orig kernel (4.15.0-55)
---

xfstests.test.none.log: Failures: ext4/032  --- 
--- generic/484 generic/491 generic/504 generic/537 --- 
--- --- --- 
xfstests.test.writearound.log:  Failures: ext4/032  generic/371 
generic/451 generic/484 generic/491 generic/504 generic/537 --- 
--- --- --- 
xfstests.test.writeback.log:Failures:   --- 
--- generic/484 generic/491 generic/504 generic/537 --- 
--- --- --- 
xfstests.test.writethrough.log: Failures: ext4/032  --- 
--- generic/484 generic/491 generic/504 generic/537 --- 
--- --- ---

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  Fix Committed
Status in linux source package in Eoan:
  Fix Committed

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache de

[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-31 Thread Mauricio Faria de Oliveira
Verification done for Disco (one patch change only).

Only one of the two bcache devices stop working upon failures in one backing 
device.
(see comment #21 for details).

# uname -rv
5.0.0-22-generic #23-Ubuntu SMP Tue Jul 23 17:23:54 UTC 2019

# ./setup-two-bcache-one-cache.sh >/dev/null 2>&1
[   25.748828] bcache: register_bdev() registered backing device dm-1
[   25.759145] bcache: register_bdev() registered backing device dm-0
[   25.767247] bcache: run_cache_set() invalidating existing data
[   25.778928] bcache: register_cache() registered cache device dm-2
[   26.768350] bcache: bch_cached_dev_attach() Caching dm-0 as bcache1 on set 
2bf1e70a-6f20-4680-bc63-f803142f294d
[   26.795147] bcache: bch_cached_dev_attach() Caching dm-1 as bcache0 on set 
2bf1e70a-6f20-4680-bc63-f803142f294d

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
└─fake-loop0 253:00 1024M  0 dm   
  └─bcache1  251:128  0 1024M  0 disk 
loop1  7:101G  0 loop 
└─fake-loop1 253:10 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
loop2  7:201G  0 loop 
└─fake-loop2 253:20 1024M  0 dm   
  ├─bcache0  251:00 1024M  0 disk 
  └─bcache1  251:128  0 1024M  0 disk 

# echo writeback | tee /sys/block/bcache*/bcache/cache_mode
writeback

# echo always | tee /sys/block/bcache*/bcache/stop_when_cache_set_failed
always

# ./dm_fake_dev.sh /dev/loop0 bad
[   42.723192] Buffer I/O error on dev dm-0, logical block 262128, async page 
read
[   42.730031] Buffer I/O error on dev dm-0, logical block 262128, async page 
read
[   42.736198] bcache: register_bcache() error /dev/dm-0: device already 
registered (emitting change event)
[   42.738697] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   42.742277] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
# [   42.746748] Buffer I/O error on dev bcache1, logical block 262112, async 
page read
[   42.752642] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   42.755650] Buffer I/O error on dev bcache1, logical block 262112, async 
page read
[   42.758209] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   42.760642] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   42.762860] Buffer I/O error on dev bcache1, logical block 1, async page read


# dd if=/dev/zero of=/dev/bcache1 bs=4k & dd if=/dev/zero of=/dev/bcache0 bs=4k 
&
[1] 1557
[2] 1558
# [   58.982340] bcache: bch_count_backing_io_errors() dm-0: IO error on 
backing device, unrecoverable
[   58.984076] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   58.985718] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   58.987382] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   58.989011] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   58.990645] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   58.992293] Buffer I/O error on dev bcache1, logical block 0, lost async 
page write
[   58.993733] Buffer I/O error on dev bcache1, logical block 1, lost async 
page write
[   58.995201] Buffer I/O error on dev bcache1, logical block 2, lost async 
page write
[   58.996651] Buffer I/O error on dev bcache1, logical block 3, lost async 
page write
...
[   59.096950] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   59.098669] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   59.100621] bcache: bch_cached_dev_error() stop bcache1: too many IO errors 
on backing device dm-0
[   59.100621]
dd: error writing '/dev/bcache1': No space left on device
262142+0 records in
262141+0 records out

[   60.111733] bcache: bcache_device_free() bcache1 stopped

1073729536 bytes (1.1 GB, 1.0 GiB) copied, 2.10457 s, 510 MB/s
dd: error writing '/dev/bcache0': No space left on device
262142+0 records in
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 4.67245 s, 230 MB/s

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
loop1  7:101G  0 loop 
└─fake-loop1 253:10 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
loop2  7:201G  0 loop 
└─fake-loop2 253:20 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
fake-loop0   253:001G  0 dm 

only bcache1 was stopped. bcache0 remains working.

# reboot

# ./setup-two-bcache-one-cache.reboot.sh >/dev/null 2>&1
[   17.606164] bcache: register_bdev() registered backing device dm-0
[   17.672177] bcache: register_bdev() registered backing device dm-1
[   17.752456] bcache: bch_journal_replay() journal replay done, 4936 keys in 6 
entries, seq 

[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-31 Thread Mauricio Faria de Oliveira
Verification with bionic-proposed of the I/O Error path.

All good, working as expected (see comments #11 to #16).

# uname -rv
4.15.0-56-generic #62-Ubuntu SMP Wed Jul 24 20:18:55 UTC 2019


test 1
--

# ./setup.sh >/dev/null 2>&1
[  369.375820] bcache: register_bdev() registered backing device dm-0
[  369.395195] bcache: run_cache_set() invalidating existing data
[  369.410278] bcache: register_cache() registered cache device dm-1
[  371.393391] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set 
c1126837-e029-4d08-bad3-38ff8bc08054

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
└─fake-loop0 253:00 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
loop1  7:101G  0 loop 
└─fake-loop1 253:10 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 


On another shell:

# fio --name=write --rw=randwrite --filename=/dev/bcache0 --bs=4k
--iodepth=8 --ioengine=libaio --runtime=300s --continue_on_error=all

# [  425.656209] bcache: bch_count_io_errors() dm-1: IO error on writing btree, 
recovering
[  425.684837] bcache: error on c1126837-e029-4d08-bad3-38ff8bc08054: 
[  425.684840] journal io error
[  425.686537] , disabling caching
[  425.688849] Buffer I/O error on dev bcache0, logical block 2807, lost async 
page write
[  425.691541] Buffer I/O error on dev bcache0, logical block 2808, lost async 
page write
[  425.694131] bcache: conditional_stop_bcache_device() 
stop_when_cache_set_failed of bcache0 is "auto" and cache is clean, keep it 
alive.
[  425.698343] Buffer I/O error on dev bcache0, logical block 2810, lost async 
page write
[  425.702522] Buffer I/O error on dev bcache0, logical block 2812, lost async 
page write
[  425.705326] Buffer I/O error on dev bcache0, logical block 2813, lost async 
page write
[  425.707896] Buffer I/O error on dev bcache0, logical block 2814, lost async 
page write
[  425.710692] Buffer I/O error on dev bcache0, logical block 2816, lost async 
page write
[  425.713524] Buffer I/O error on dev bcache0, logical block 2817, lost async 
page write
[  425.716512] Buffer I/O error on dev bcache0, logical block 2818, lost async 
page write
[  425.719156] Buffer I/O error on dev bcache0, logical block 2819, lost async 
page write
[  425.742817] bcache: cached_dev_detach_finish() Caching disabled for dm-0
[  425.746933] bcache: bch_count_io_errors() dm-1: IO error on writing btree, 
recovering
[  425.750502] bcache: cache_set_free() Cache set 
c1126837-e029-4d08-bad3-38ff8bc08054 unregistered

fio finished:

Run status group 0 (all jobs):
  WRITE: bw=212MiB/s (222MB/s), 212MiB/s-212MiB/s (222MB/s-222MB/s), io=1024MiB 
(1074MB), run=4830-4830msec

bcache not on top of caching device:

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
└─fake-loop0 253:00 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
loop1  7:101G  0 loop 
fake-loop1   253:101G  0 dm  


test 2
--

# ./setup.sh >/dev/null 2>&1
[   23.946411] bcache: register_bdev() registered backing device dm-0   

[   23.952262] bcache: run_cache_set() invalidating existing data   

[   23.966564] bcache: register_cache() registered cache device dm-1

[   25.949934] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set 
d7a3c644-e21e-49bb-bcee-e14709a65745

# lsblk -e 252  

NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT

loop0  7:001G  0 loop   

└─fake-loop0 253:00 1024M  0 dm 

  └─bcache0  251:00 1024M  0 disk   

loop1  7:101G  0 loop   

└─fake-loop1 253:10 1024M  0 dm 
   
  └─bcache0  251:00 1024M  0 disk   


# echo writeback > /sys/block/bcache0/bcache/cache_mode

# dd if=/dev/zero of=/dev/bcache0 bs=4k 
dd: error writing '/dev/bcache0': No space left on device   

262142+0 records in 

262141+0 records out

1073729536 bytes (1.1 GB, 1.0 GiB) copied, 4.2152 s, 255 MB/s   


# ./dm_fake_dev.s

[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-25 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag 'verification-needed-
bionic' to 'verification-done-bionic'. If the problem still exists,
change the tag 'verification-needed-bionic' to 'verification-failed-
bionic'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how
to enable and use -proposed. Thank you!


** Tags added: verification-needed-bionic

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  Fix Committed
Status in linux source package in Eoan:
  Fix Committed

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of many patches
     with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
     read-write).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
     in bcache code, however, synthetic testing of the patches
     has been performed, and extensive regression/stress tests
     were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
     patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
     (e.g., BootStack, UA cloud, except rare all-flash cases).

  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-25 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag 'verification-needed-
disco' to 'verification-done-disco'. If the problem still exists, change
the tag 'verification-needed-disco' to 'verification-failed-disco'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how
to enable and use -proposed. Thank you!


** Tags added: verification-needed-disco

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  Fix Committed
Status in linux source package in Eoan:
  Fix Committed

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of many patches
     with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
     read-write).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
     in bcache code, however, synthetic testing of the patches
     has been performed, and extensive regression/stress tests
     were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
     patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
     (e.g., BootStack, UA cloud, except rare all-flash cases).

  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-22 Thread Khaled El Mously
** Changed in: linux (Ubuntu Disco)
   Status: In Progress => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  Fix Committed
Status in linux source package in Eoan:
  Fix Committed

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of many patches
     with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
     read-write).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
     in bcache code, however, synthetic testing of the patches
     has been performed, and extensive regression/stress tests
     were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
     patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
     (e.g., BootStack, UA cloud, except rare all-flash cases).

  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-18 Thread Stefan Bader
** Changed in: linux (Ubuntu Disco)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  In Progress
Status in linux source package in Eoan:
  Fix Committed

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of many patches
     with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
     read-write).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
     in bcache code, however, synthetic testing of the patches
     has been performed, and extensive regression/stress tests
     were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
     patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
     (e.g., BootStack, UA cloud, except rare all-flash cases).

  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-17 Thread Seth Forshee
** Changed in: linux (Ubuntu Eoan)
   Status: In Progress => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  In Progress
Status in linux source package in Eoan:
  Fix Committed

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of many patches
     with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
     read-write).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
     in bcache code, however, synthetic testing of the patches
     has been performed, and extensive regression/stress tests
     were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
     patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
     (e.g., BootStack, UA cloud, except rare all-flash cases).

  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-17 Thread Kleber Sacilotto de Souza
** Changed in: linux (Ubuntu Bionic)
   Status: In Progress => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  In Progress
Status in linux source package in Eoan:
  In Progress

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of many patches
     with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
     read-write).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
     in bcache code, however, synthetic testing of the patches
     has been performed, and extensive regression/stress tests
     were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
     patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
     (e.g., BootStack, UA cloud, except rare all-flash cases).

  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-17 Thread Kleber Sacilotto de Souza
Cosmic is reaching EOL and will not receive more kernel updates. Marking
the its task as 'Won't Fix'.

** Changed in: linux (Ubuntu Cosmic)
   Status: In Progress => Won't Fix

** Changed in: linux (Ubuntu Disco)
 Assignee: (unassigned) => Mauricio Faria de Oliveira (mfo)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  In Progress
Status in linux source package in Eoan:
  In Progress

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of many patches
     with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
     read-write).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
     in bcache code, however, synthetic testing of the patches
     has been performed, and extensive regression/stress tests
     were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
     patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
     (e.g., BootStack, UA cloud, except rare all-flash cases).

  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-08 Thread Mauricio Faria de Oliveira
** Also affects: linux (Ubuntu Eoan)
   Importance: Undecided
 Assignee: Mauricio Faria de Oliveira (mfo)
   Status: Invalid

** Also affects: linux (Ubuntu Disco)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Disco)
   Status: New => In Progress

** Changed in: linux (Ubuntu Eoan)
   Status: Invalid => In Progress

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  In Progress
Status in linux source package in Disco:
  In Progress
Status in linux source package in Eoan:
  In Progress

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of many patches
     with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
     read-write).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
     in bcache code, however, synthetic testing of the patches
     has been performed, and extensive regression/stress tests
     were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
     patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
     (e.g., BootStack, UA cloud, except rare all-flash cases).

  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
I/O Error Test 6 (for the Disco kernel)


commit: 'Revert "bcache: set CACHE_SET_IO_DISABLE in
bch_cached_dev_error()"'

Problem: if one backing device hits I/O errors the cache device
is disabled, but if that cache device is shared by other bcache
devices they stop too (even with non-failing backing devices).

Original kernel: all bcache devices that share cache device with
failing backing device are stopped.

Modified kernel: only the bcache device with the failing backing
device is stopped.


Original kernel:
---

root@bionic-bcache:~# uname -rv
5.0.0-21-generic #22-Ubuntu SMP Tue Jul 2 13:27:33 UTC 2019

root@bionic-bcache:~# ./setup-two-bcache-one-cache.sh >/dev/null 2>&1   

[   23.323929] bcache: register_bdev() registered backing device dm-0   

[   23.330821] bcache: register_bdev() registered backing device dm-1   

[   23.335493] bcache: run_cache_set() invalidating existing data   

[   23.347255] bcache: register_cache() registered cache device dm-2

[   24.335738] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set 
f816e09d-f744-4fc9-b3bd-239f3d5093c6
[   24.342388] bcache: bch_cached_dev_attach() Caching dm-1 as bcache1 on set 
f816e09d-f744-4fc9-b3bd-239f3d5093c6


root@bionic-bcache:~# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
└─fake-loop0 253:00 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
loop1  7:101G  0 loop 
└─fake-loop1 253:10 1024M  0 dm   
  └─bcache1  251:128  0 1024M  0 disk 
loop2  7:201G  0 loop 
└─fake-loop2 253:20 1024M  0 dm   
  ├─bcache0  251:00 1024M  0 disk 
  └─bcache1  251:128  0 1024M  0 disk 

# echo writeback | tee /sys/block/bcache*/bcache/cache_mode
writeback

# echo always | tee /sys/block/bcache*/bcache/stop_when_cache_set_failed
always


root@bionic-bcache:~# ./dm_fake_dev.sh /dev/loop0 bad
[   58.915344] Buffer I/O error on dev dm-0, logical block 262128, async page 
read
[   58.921948] Buffer I/O error on dev dm-0, logical block 262128, async page 
read
[   58.928886] bcache: register_bcache() error /dev/dm-0: device already 
registered (emitting change event)
[   58.931006] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   58.936386] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   58.939346] Buffer I/O error on dev bcache0, logical block 262112, async 
page read
root@bionic-bcache:~# [   58.944685] bcache: bch_count_backing_io_errors() 
dm-0: IO error on backing device, unrecoverable
[   58.948468] Buffer I/O error on dev bcache0, logical block 262112, async 
page read
[   58.951078] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   58.954231] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   58.957216] Buffer I/O error on dev bcache0, logical block 1, async page read


# ./dm_fake_dev.sh /dev/loop0 bad
[  167.341298] Buffer I/O error on dev dm-0, logical block 262128, async page 
read
[  167.347802] Buffer I/O error on dev dm-0, logical block 262128, async page 
read
[  167.354959] bcache: register_bcache() error /dev/dm-0: device already 
registered (emitting change event)
[  167.356585] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[  167.364784] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[  167.369083] Buffer I/O error on dev bcache0, logical block 262112, async 
page read
root@bionic-bcache:~# [  167.376976] bcache: bch_count_backing_io_errors() 
dm-0: IO error on backing device, unrecoverable
[  167.381644] Buffer I/O error on dev bcache0, logical block 262112, async 
page read
[  167.384195] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[  167.387144] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[  167.390040] Buffer I/O error on dev bcache0, logical block 1, async page read


root@guest-bcache:~# dd if=/dev/zero of=/dev/bcache1 bs=4k & dd if=/dev/zero 
of=/dev/bcache0 bs=4k &
[1] 1464

[2] 1465

root@bionic-bcache:~# [  178.103060] bcache: bch_count_backing_io_errors() 
dm-0: IO error on backing device, unrecoverable
[  178.107790] Buffer I/O error on dev bcache0, logical block 0, lost async 
page write  
[  178.111814] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[  178.116428] Buffer I/O error on dev bcache0, logical block 1, lost async 
page write  
[  178.119286] bcache: bch_

[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
[D/E][PATCH 0/1] LP#1829563 bcache: pending patch for Disco/Eoan
https://lists.ubuntu.com/archives/kernel-team/2019-July/102006.html

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  In Progress

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of many patches
     with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
     read-write).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
     in bcache code, however, synthetic testing of the patches
     has been performed, and extensive regression/stress tests
     were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
     patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
     (e.g., BootStack, UA cloud, except rare all-flash cases).

  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
[C][PATCH 0/2] LP#1829563 bcache: pending patches for Cosmic
https://lists.ubuntu.com/archives/kernel-team/2019-July/102001.html

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  In Progress

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of many patches
     with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
     read-write).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
     in bcache code, however, synthetic testing of the patches
     has been performed, and extensive regression/stress tests
     were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
     patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
     (e.g., BootStack, UA cloud, except rare all-flash cases).

  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
I/O Error Test 6 (for the Cosmic kernel)


commit: 'Revert "bcache: set CACHE_SET_IO_DISABLE in
bch_cached_dev_error()"'

Problem: if one backing device hits I/O errors the cache device
is disabled, but if that cache device is shared by other bcache
devices they stop too (even with non-failing backing devices).

Original kernel: all bcache devices that share cache device with
failing backing device are stopped.

Modified kernel: only the bcache device with the failing backing
device is stopped.


Original kernel
---

root@guest-bcache:~#  uname -rv
4.18.0-23-generic #24-Ubuntu SMP Wed Jun 12 18:17:39 UTC 2019

root@guest-bcache:~# lsblk -e 252
root@guest-bcache:~# 

root@guest-bcache:~# ./setup-two-bcache-one-cache.sh >/dev/null 2>&1
[   35.686002] bcache: register_bdev() registered backing device dm-0
[   35.695980] bcache: register_bdev() registered backing device dm-1
[   35.704662] bcache: run_cache_set() invalidating existing data
[   35.719046] bcache: register_cache() registered cache device dm-2
[   36.705686] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set 
fce8d558-4657-47dc-ab37-226ada14daf5
[   36.711827] bcache: bch_cached_dev_attach() Caching dm-1 as bcache1 on set 
fce8d558-4657-47dc-ab37-226ada14daf5

root@guest-bcache:~# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
└─fake-loop0 253:00 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
loop1  7:101G  0 loop 
└─fake-loop1 253:10 1024M  0 dm   
  └─bcache1  251:128  0 1024M  0 disk 
loop2  7:201G  0 loop 
└─fake-loop2 253:20 1024M  0 dm   
  ├─bcache0  251:00 1024M  0 disk 
  └─bcache1  251:128  0 1024M  0 disk 

root@guest-bcache:~# echo writeback | tee /sys/block/dm-*/bcache/cache_mode
writeback

root@guest-bcache:~# cat /sys/block/dm-*/bcache/cache_mode
writethrough [writeback] writearound none
writethrough [writeback] writearound none


root@guest-bcache:~# ./dm_fake_dev.sh /dev/loop0 bad
[   76.875749] Buffer I/O error on dev dm-0, logical block 262128, async page 
read
[   76.882159] Buffer I/O error on dev dm-0, logical block 262128, async page 
read
[   76.889453] bcache: register_bcache() error /dev/dm-0: device already 
registered (emitting change event)
[   76.892183] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   76.904907] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   76.907711] Buffer I/O error on dev bcache0, logical block 262112, async 
page read
[   76.912607] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   76.916905] Buffer I/O error on dev bcache0, logical block 262112, async 
page read
[   76.920345] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   76.924767] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[   76.928404] Buffer I/O error on dev bcache0, logical block 1, async page read



root@guest-bcache:~# dd if=/dev/zero of=/dev/bcache1 bs=4k & dd if=/dev/zero 
of=/dev/bcache0 bs=4k &
[  175.024811] Buffer I/O error on dev bcache0, logical block 0, lost async 
page write  
[  175.029844] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[  175.034652] Buffer I/O error on dev bcache0, logical block 1, lost async 
page write  
[  175.037465] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[  175.040373] Buffer I/O error on dev bcache0, logical block 2, lost async 
page write  
...
[  175.092196] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[  175.096635] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[  175.101272] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[  175.105829] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
...
[  175.235700] bcache: bch_count_backing_io_errors() dm-0: IO error on backing 
device, unrecoverable
[  175.239457] bcache: bch_cached_dev_error() stop bcache0: too many IO errors 
on backing device dm-0
[  175.239457]
[  175.324069] bcache: bch_cache_set_error() CACHE_SET_IO_DISABLE already set
[  175.328998] bcache: error on fce8d558-4657-47dc-ab37-226ada14daf5:
[  175.328999] journal io error
[  175.331022] , disabling caching
[  175.334264] bcache: conditional_stop_bcache_device() 
stop_when_cache_set_failed of bcache0 is "auto" and cache is dirty, stop it to 
avoid
 potential data corruption.
[  175.338865] bcache: conditional_stop_bcache_device() 
stop_when_cache_set_failed of bcache1 is "auto" and cache is dirty, stop it to 
avoid
 potential data corruption.
[  175.344097] bcache: cached_dev_detach_finish() Caching disabled for dm-1
[  176.080139] bcache: bcache_devi

[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
I/O Error Test 7


commit: "bcache: fix ioctl in flash device"

This test needs QEMU+GDB since there's no way to set the variable
from the kernel, because it references incorrect memory/pointer.

In the original kernel, the ioctl() to a flash-only volume
goes from ioctl_dev() down to flash_dev_ioctl()
(which always return -ENOTTY always) only if its backing device
has io_disable == 0, but gets -EIO if io_disable == 1
(which are both wrong as flash device has no backing device).
 
In the modified kernel, it always go down to flash_dev_ioctl(),
thus -ENOTTY in both cases.

Original kernel:

$ uname -rv
4.15.0-51-generic #55-Ubuntu SMP Wed May 15 14:27:21 UTC 2019

io_disable = 0:

# ./ioctl
ioctl: Inappropriate ioctl for device

io_disable = 1:

# ./ioctl
ioctl: Input/output error


Modified kernel:

# uname -rv
4.18.0-23-generic #24+test20190619b1 SMP Wed Jun 19 22:33:22 UTC 2019

io_disable = 0:

# ./ioctl
ioctl: Inappropriate ioctl for device

io_disable = 1:

# ./ioctl
ioctl: Inappropriate ioctl for device


Steps
-

$ sudo -i
# ./setup.sh >/dev/null 2>&1
[   57.607391] bcache: register_bdev() registered backing device dm-0
[   57.611885] bcache: run_cache_set() invalidating existing data
[   57.625712] bcache: register_cache() registered cache device dm-1
[   58.605153] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set 
08f1756b-081f-4545-a947-927d429f7ef1

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop
└─fake-loop0 253:00 1024M  0 dm
  └─bcache0  251:00 1024M  0 disk
loop1  7:101G  0 loop
└─fake-loop1 253:10 1024M  0 dm
  └─bcache0  251:00 1024M  0 disk

# echo $((128*1024**2)) > /sys/block/dm-1/bcache/set/flash_vol_create

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
└─fake-loop0 253:00 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
loop1  7:101G  0 loop 
└─fake-loop1 253:10 1024M  0 dm   
  ├─bcache0  251:00 1024M  0 disk 
  └─bcache1  251:128  0  128M  0 disk 


# ls -1d /sys/block/bcache1/bcache/cache/volume*
/sys/block/bcache1/bcache/cache/volume1

# cat /sys/module/bcache/sections/.text
0xc05f4000

# cat ioctl.c 
#include 
#include 
#include 
#include 

int main() {
int fd, rc;
fd = open("/dev/bcache1", O_RDWR);
if (fd == -1) {
perror("open");
return 0;
}
rc = ioctl(fd, 0xcc00ffee); // our cookie.
if (rc == -1)
perror("ioctl");
close(fd);
return 0;
}


# gcc -o ioctl ioctl.c
# ./ioctl


(gdb) add-symbol-file 
~/ddeb-4.18.0-23-generic/usr/lib/debug/lib/modules/4.18.0-23-generic/kernel/drivers/md/bcache/bcache.ko
 0xc05f4000 

(gdb) b *ioctl_dev +0x13
(gdb) c

Thread 4 hit Breakpoint 1, ioctl_dev (b=, mode=134610975, 
cmd=3422617582, arg=2) at 
/build/linux-JHFGaE/linux-4.18.0/drivers/md/bcache/super.c:641
...

(gdb) p/x 3422617582
$53 = 0xcc00ffee  // our cookie.

(gdb) print d.name
$54 = "volume1\000\000\000\000" // matches the volume above.


See it's an incorrect pointer.
No way to set it via any io_disable in sysfs.

(gdb) print ((struct cached_dev *)((void *)d - 0x10)).backing_dev_name
$75 = 
"\017\000\000\000\260\373\212s\277\230\377\377\260\373\212s\277\230\377\377 
nԇ\377\377\377\377\060qh\211"

(gdb) print ((struct cached_dev *)((void *)d - 0x10)).io_disable 
$76 = 0


With the default value (io_disable = 0):

# ./ioctl
ioctl: Inappropriate ioctl for device


(gdb) c

... repeat.

With the forced value (io_disable = 1):

(gdb) set ((struct cached_dev *)((void *)d - 0x10)).io_disable = 1
(gdb) c

# ./ioctl
ioctl: Input/output error

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  In Progress

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of many patches
     with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
     read-write).

  [Regre

[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
[B][PATCH 00/11] LP#1829563 bcache: risk of data loss on I/O errors in backing 
or caching devices
https://lists.ubuntu.com/archives/kernel-team/2019-July/101989.html

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  In Progress

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of many patches
     with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
     read-write).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
     in bcache code, however, synthetic testing of the patches
     has been performed, and extensive regression/stress tests
     were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
     patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
     (e.g., BootStack, UA cloud, except rare all-flash cases).

  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
** Description changed:

  [Impact]
  
-  * The bcache code in Bionic lacks several fixes to handle
-I/O errors in both backing devices and caching devices.
+  * The bcache code in Bionic lacks several fixes to handle
+    I/O errors in both backing devices and caching devices.
  
-  * Partial or permanent errors in backing or caching devices,
-specially in writeback mode, can lead to data loss and/or
-the application is not notified about failed I/O requests.
+  * Partial or permanent errors in backing or caching devices,
+    specially in writeback mode, can lead to data loss and/or
+    the application is not notified about failed I/O requests.
  
-  * The bcache device might remain available for I/O requests
-even if backing device is offline, so writes are undefined.
+  * The bcache device might remain available for I/O requests
+    even if backing device is offline, so writes are undefined.
  
  [Test Case]
  
-  * Detailed test cases/steps for the behavior of almost every
-patch with code logic changes are provided in bug comments.
+  * Detailed test cases/steps for the behavior of almost every
+    patch with code logic changes are provided in bug comments.
  
-  * The patchset has been tested for regressions on each cache
-mode (writethrough, writeback, writearound, none) with the
-xfstests test suite (on ext4), fio (random read-write) and
-iozone (several read/write tests).
+  * The patchset has been tested for regressions on each cache
+    mode (writethrough, writeback, writearound, none) with the
+    xfstests test suite (on ext4) and fio (sequential + random
+read-write).
  
  [Regression Potential]
  
-  * The patchset is relatively large and touches several areas
-in bcache code, however, synthetic testing of the patches
-has been performed, and extensive regression/stress tests
-were run (as mentioned in Test Case section).
+  * The patchset is relatively large and touches several areas
+    in bcache code, however, synthetic testing of the patches
+    has been performed, and extensive regression/stress tests
+    were run (as mentioned in Test Case section).
  
-  * Many patches in the patchset are 'Fixes' patches to other
-patches, and no further 'Fixes' currently exist upstream.
+  * Many patches in the patchset are 'Fixes' patches to other
+    patches, and no further 'Fixes' currently exist upstream.
  
  [Other Info]
  
-  * Canonical Field Eng. deploys bcache+writeback extensively
-(e.g., BootStack, UA cloud, except rare all-flash cases).
- 
+  * Canonical Field Eng. deploys bcache+writeback extensively
+    (e.g., BootStack, UA cloud, except rare all-flash cases).
  
  [Original Bug Description]
  
  This is a request for a backport of the following upstream patch from
  4.18:
  
  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee
  
  Field engineering uses bcache quite extensively and it would be good to
  have this in the GA/bionic kernel.

** Description changed:

  [Impact]
  
   * The bcache code in Bionic lacks several fixes to handle
     I/O errors in both backing devices and caching devices.
  
   * Partial or permanent errors in backing or caching devices,
     specially in writeback mode, can lead to data loss and/or
     the application is not notified about failed I/O requests.
  
   * The bcache device might remain available for I/O requests
     even if backing device is offline, so writes are undefined.
  
  [Test Case]
  
-  * Detailed test cases/steps for the behavior of almost every
-    patch with code logic changes are provided in bug comments.
+  * Detailed test cases/steps for the behavior of many patches
+    with code logic changes are provided in bug comments.
  
   * The patchset has been tested for regressions on each cache
     mode (writethrough, writeback, writearound, none) with the
     xfstests test suite (on ext4) and fio (sequential + random
-read-write).
+    read-write).
  
  [Regression Potential]
  
   * The patchset is relatively large and touches several areas
     in bcache code, however, synthetic testing of the patches
     has been performed, and extensive regression/stress tests
     were run (as mentioned in Test Case section).
  
   * Many patches in the patchset are 'Fixes' patches to other
     patches, and no further 'Fixes' currently exist upstream.
  
  [Other Info]
  
   * Canonical Field Eng. deploys bcache+writeback extensively
     (e.g., BootStack, UA cloud, except rare all-flash cases).
  
  [Original Bug Description]
  
  This is a request for a backport of the following upstream patch from
  4.18:
  
  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee
  
  Field engineering uses bcache quite extensively and it would be good to
  have this in the GA/bionic kernel.

-- 
You 

[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
I/O Error Test 1


commit "bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags"

Problem: the cacheset is not retired immediately on I/O errors in
cache device if I/O requests keep coming.

Original kernel: bcache device remains on top of caching device,
and 'fio' never finishes.

Modified kernel: bcache device is removed from caching device,
and fio finishes.

Original


# uname -rv
4.15.0-55-generic #60-Ubuntu SMP Tue Jul 2 18:22:20 UTC 2019

# ./setup.sh >/dev/null 2>&1
[  285.677682] bcache: register_bdev() registered backing device dm-0
[  285.697006] bcache: run_cache_set() invalidating existing data
[  285.710938] bcache: register_cache() registered cache device dm-1
[  287.686924] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set 
c589879b-b1c3-49b3-9603-9795ddc750f5

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
└─fake-loop0 253:00 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
loop1  7:101G  0 loop 
└─fake-loop1 253:10 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 

# ./dm_fake_dev.sh /dev/loop1 bad
[  766.102586] Buffer I/O error on dev dm-1, logical block 262128, async page 
read
[  766.107602] Buffer I/O error on dev dm-1, logical block 262128, async page 
read
[  766.113889] bcache: register_bcache() error /dev/dm-1: device already 
registered

On another shell:

# fio --name=write --rw=randwrite --filename=/dev/bcache0 --bs=4k --iodepth=8 
--ioengine=libaio --runtime=300s --continue_on_error=all
write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=libaio, iodepth=8
fio-3.1
Starting 1 process
Jobs: 1 (f=1): [f(1)][100.0%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 00m:00s]

< fio never finishes, keeps showing the last line above >
<  console prints as below when fio starts running  >

[  777.451177] bcache: bch_count_io_errors() dm-1: IO error on writing btree, 
recovering
[  777.490882] bcache: error on c589879b-b1c3-49b3-9603-9795ddc750f5:
[  777.490885] journal io error
[  777.494087] , disabling caching
[  807.900700] bcache: bch_count_io_errors() dm-1: IO error on writing btree, 
recovering
(error msgs looping)

bcache0 still present on top of cache device (fake-loop1)

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop
└─fake-loop0 253:00 1024M  0 dm
  └─bcache0  251:00 1024M  0 disk
loop1  7:101G  0 loop
fake-loop1   253:101G  0 dm
└─bcache0251:00 1024M  0 disk


Modified


# uname -rv
4.15.0-55-generic #60+test20190703build1bcache1-Ubuntu SMP Wed Jul 3 21:41:37 
UTC


# ./setup.sh >/dev/null 2>&1
[   60.542088] bcache: register_bdev() registered backing device dm-0
[   60.550509] bcache: run_cache_set() invalidating existing data
[   60.560109] bcache: register_cache() registered cache device dm-1
[   62.548849] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set 
f6833a2c-53e6-468e-bf1d-a9f48b73d783

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
└─fake-loop0 253:00 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
loop1  7:101G  0 loop 
└─fake-loop1 253:10 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 

# ./dm_fake_dev.sh /dev/loop1 bad
[   72.639185] Buffer I/O error on dev dm-1, logical block 262128, async page 
read
[   72.644876] Buffer I/O error on dev dm-1, logical block 262128, async page 
read
[   72.650707] bcache: register_bcache() error /dev/dm-1: device already 
registered

On another shell:

# fio --name=write --rw=randwrite --filename=/dev/bcache0 --bs=4k
--iodepth=8 --ioengine=libaio --runtime=300s --continue_on_error=all

[   97.858468] bcache: bch_count_io_errors() dm-1: IO error on writing btree, 
recovering
[   97.868519] bcache: error on f6833a2c-53e6-468e-bf1d-a9f48b73d783:
[   97.868520] journal io error
[   97.869998] , disabling caching
[   97.871441] bcache: conditional_stop_bcache_device() 
stop_when_cache_set_failed of bcache0 is "auto" and cache is clean, keep it 
alive.
[   97.874423] Buffer I/O error on dev bcache0, logical block 2814, lost async 
page write
[   97.878697] Buffer I/O error on dev bcache0, logical block 2816, lost async 
page write
[   97.881702] Buffer I/O error on dev bcache0, logical block 2817, lost async 
page write
[   97.884790] Buffer I/O error on dev bcache0, logical block 2818, lost async 
page write
[   97.887709] Buffer I/O error on dev bcache0, logical block 2819, lost async 
page write
[   97.890558] Buffer I/O error on dev bcache0, logical block 2820, lost async 
page write
[   97.892419] Buffer I/O error on dev bcache0, logical block 2821, lost async 
page write
[   97.894228] Buffer I/O error on dev bcache0, logical block 2822, lost async 
page write
[   97.896107] Buffer I/O error on dev bcache0, logical block 2823, lost async 
page write
[   97.8979

[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
I/O Error Test 4


commit "bcache: add io_disable to struct cached_dev"

Problem: in case of the backing device hits I/o errors or is
disconected, the I/O can still be accepted to the bcache device.

Original kernel: dd writes in writeback mode to failed backing device complete.
Modified kernel: the bcache0 device is removed after some I/O errors in backing 
device.


Original


# uname -rv
4.15.0-55-generic #60-Ubuntu SMP Tue Jul 2 18:22:20 UTC 2019


# ./setup.sh >/dev/null 2>&1

[   24.820401] bcache: register_bdev() registered backing device dm-0   

[   24.833268] bcache: run_cache_set() invalidating existing data
[   24.848314] bcache: register_cache() registered cache device dm-1
[   26.824645] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set 
dd465fa7-4e85-484b-89dd-353c24c6b041

# echo writeback > /sys/block/bcache0/bcache/cache_mode

# cat /sys/block/bcache0/bcache/cache_mode
writethrough [writeback] writearound none

# ./dm_fake_dev.sh /dev/loop0 bad
[   41.439684] Buffer I/O error on dev dm-0, logical block 262128, async page 
read
[   41.445284] Buffer I/O error on dev dm-0, logical block 262128, async page 
read
[   41.451846] bcache: register_bcache() error /dev/dm-0: device already 
registered (emitting change event)
[   41.454704] Buffer I/O error on dev bcache0, logical block 262112, async 
page read
[   41.457685] Buffer I/O error on dev bcache0, logical block 262112, async 
page read
[   41.457743] Buffer I/O error on dev bcache0, logical block 1, async page read

# dd if=/dev/zero of=/dev/bcache0 bs=4k
[   49.048036] Buffer I/O error on dev bcache0, logical block 0, lost async 
page write
[   49.051702] Buffer I/O error on dev bcache0, logical block 1, lost async 
page write
[   49.054062] Buffer I/O error on dev bcache0, logical block 2, lost async 
page write
[   49.056466] Buffer I/O error on dev bcache0, logical block 3, lost async 
page write
[   49.058867] Buffer I/O error on dev bcache0, logical block 4, lost async 
page write
[   49.072020] Buffer I/O error on dev bcache0, logical block 5, lost async 
page write
[   49.074440] Buffer I/O error on dev bcache0, logical block 6, lost async 
page write
[   49.078658] Buffer I/O error on dev bcache0, logical block 7, lost async 
page write
[   49.079008] Buffer I/O error on dev bcache0, logical block 6834, lost async 
page write
[   49.079022] Buffer I/O error on dev bcache0, logical block 6835, lost async 
page write
dd: error writing '/dev/bcache0': No space left on device
262142+0 records in
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 2.58342 s, 416 MB/s

# dd if=/dev/zero of=/dev/bcache0 bs=4k
[   62.696034] buffer_io_error: 260992 callbacks suppressed
[   62.696037] Buffer I/O error on dev bcache0, logical block 0, lost async 
page write
[   62.701996] Buffer I/O error on dev bcache0, logical block 1, lost async 
page write
[   62.704394] Buffer I/O error on dev bcache0, logical block 2, lost async 
page write
[   62.706763] Buffer I/O error on dev bcache0, logical block 3, lost async 
page write
[   62.716025] Buffer I/O error on dev bcache0, logical block 4, lost async 
page write
[   62.718421] Buffer I/O error on dev bcache0, logical block 5, lost async 
page write
[   62.720821] Buffer I/O error on dev bcache0, logical block 6, lost async 
page write
[   62.723193] Buffer I/O error on dev bcache0, logical block 7, lost async 
page write
[   62.725584] Buffer I/O error on dev bcache0, logical block 8, lost async 
page write
[   62.725763] Buffer I/O error on dev bcache0, logical block 5405, lost async 
page write
dd: error writing '/dev/bcache0': No space left on device
262142+0 records in
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 2.88915 s, 372 MB/s

# dd if=/dev/zero of=/dev/bcache0 bs=4k
[   67.700114] buffer_io_error: 290043 callbacks suppressed
[   67.700117] Buffer I/O error on dev bcache0, logical block 40750, lost async 
page write
[   67.706230] Buffer I/O error on dev bcache0, logical block 40751, lost async 
page write
[   67.709846] Buffer I/O error on dev bcache0, logical block 40752, lost async 
page write
[   67.713503] Buffer I/O error on dev bcache0, logical block 40753, lost async 
page write
[   67.717241] Buffer I/O error on dev bcache0, logical block 40754, lost async 
page write
[   67.720938] Buffer I/O error on dev bcache0, logical block 40755, lost async 
page write
[   67.741395] Buffer I/O error on dev bcache0, logical block 40756, lost async 
page write
dd: error writing '/dev/bcache0': No space left on device
[   67.748145] Buffer I/O error on dev bcache0, logical block 40757, lost async 
page write
[   67.752352] Buffer I/O error on dev bcache0, logical block 41038, lost async 
page write
[   67.756642] Buffer I/O error on dev bcache0, logical block 41313, lo

[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
I/O Error Test 2


commit "bcache: add stop_when_cache_set_failed option to backing device"

Problem: upon too many io errors on cache device, bcache code retires cache set,
but it doesn't stop the bcache device.

(in writeback mode, if it continue to write, writes go to backing device 
directly;
next time bcache re-registers, it writes dirty cached data again, overwriting 
data)

Original kernel: bcache device remains available for writes.
Modified kernel: bcache device is not available anymore.


Original


# uname -rv
4.15.0-55-generic #60-Ubuntu SMP Tue Jul 2 18:22:20 UTC 2019


# ./setup.sh >/dev/null 2>&1
[   27.427543] bcache: register_bdev() registered backing device dm-0
[   27.434844] bcache: run_cache_set() invalidating existing data
[   27.445652] bcache: register_cache() registered cache device dm-1
[   29.435245] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set 
68e969a1-0f60-4978-a109-b48559970472

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
└─fake-loop0 253:00 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
loop1  7:101G  0 loop 
└─fake-loop1 253:10 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 

# echo writeback > /sys/block/bcache0/bcache/cache_mode

# dd if=/dev/zero of=/dev/bcache0 bs=4k
dd: error writing '/dev/bcache0': No space left on device
262142+0 records in
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 5.14109 s, 209 MB/s
  
# ./dm_fake_dev.sh /dev/loop1 bad   

[  135.316335] Buffer I/O error on dev dm-1, logical block 262128, async page 
read  
[  135.330181] Buffer I/O error on dev dm-1, logical block 262128, async page 
read  
[  135.337019] bcache: register_bcache() error /dev/dm-1: device already 
registered 

# dd if=/dev/zero of=/dev/bcache0 bs=4k 

[  142.754766] bcache: error on 68e969a1-0f60-4978-a109-b48559970472:   

[  142.754769] journal io error 

[  142.757583] , disabling caching  

dd: error writing '/dev/bcache0': No space left on device   

[  143.299557] bcache: bch_count_io_errors() dm-1: IO error on reading dirty 
data from cache, recovering
[  143.305560] bcache: bch_count_io_errors() dm-1: IO error on reading dirty 
data from cache, recovering
[  143.308463] bcache: bch_count_io_errors() dm-1: IO error on reading dirty 
data from cache, recovering


# ./dm_fake_dev.sh /dev/loop1 good
[  171.220172] bcache: bch_count_io_errors() dm-1: IO error on reading dirty 
data from cache, recovering
[  171.426360] bcache: register_bcache() error /dev/dm-1: device already 
registered

262142+0 records in
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 32.066 s, 33.5 MB/s
dd: closing input file '/dev/zero': Bad file descriptor
# 
# [  174.589309] bcache: cached_dev_detach_finish() Caching disabled for dm-0
[  174.599062] bcache: cache_set_free() Cache set 
68e969a1-0f60-4978-a109-b48559970472 unregistered


# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
└─fake-loop0 253:00 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
loop1  7:101G  0 loop 
└─fake-loop1 253:10 1024M  0 dm



Modified


# uname -rv
4.15.0-55-generic #60+test20190703build1bcache1-Ubuntu SMP Wed Jul 3 21:41:37 
UTC


# ./setup.sh >/dev/null 2>&1
[   39.981825] bcache: register_bdev() registered backing device dm-0
[   39.987359] bcache: run_cache_set() invalidating existing data
[   40.000743] bcache: register_cache() registered cache device dm-1
[   41.985275] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set 
9b509864-84be-428e-ad8a-293d546ec2e9

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
└─fake-loop0 253:00 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
loop1  7:101G  0 loop 
└─fake-loop1 253:10 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 

# echo writeback > /sys/block/bcache0/bcache/cache_mode

# dd if=/dev/zero of=/dev/bcache0 bs=4k
dd: error writing '/dev/bcache0': No space left on device
262142+0 records in
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 5.96102 s, 180 MB/s

# ./dm_fake_dev.sh /dev/loop1 bad   

[   81.798129] Buffer I/O error on dev dm-1, logical block 262128, async page 
read  
[   81.804533] Buffer I/O error on dev dm-1, logical block 262128, async page 
read  
[   81.810630] b

[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
I/O Error Test 5


commit "bcache: stop bcache device when backing device is offline"

Problem: bcache is unaware that a backing device goes offline,
and accepts writes.

Original kernel: bcache doesn't realize backing device is offline.
Modified kernel: bcache device is removed after backing device is offline.


Original


# uname -rv
4.15.0-55-generic #60-Ubuntu SMP Tue Jul 2 18:22:20 UTC 2019

# apt install -y linux-modules-extra-$(uname -r)=4.15.0-55.60

# modprobe scsi_debug dev_size_mb=1024
# dmesg | tail
[   17.339336] scsi host2: scsi_debug: version 1.86 [20160430]
 dev_size_mb=1024, opts=0x0, submit_queues=1, statistics=0
[   17.339667] scsi 2:0:0:0: Direct-Access Linuxscsi_debug   0186 
PQ: 0 ANSI: 7
[   17.339989] sd 2:0:0:0: Power-on or device reset occurred
[   17.340184] sd 2:0:0:0: Attached scsi generic sg0 type 0
[   17.348436] sd 2:0:0:0: [sda] 2097152 512-byte logical blocks: (1.07 GB/1.00 
GiB)
[   17.352496] sd 2:0:0:0: [sda] Write Protect is off
[   17.352501] sd 2:0:0:0: [sda] Mode Sense: 73 00 10 08
[   17.360686] sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, 
supports DPO and FUA
[   17.451956] sd 2:0:0:0: [sda] Attached SCSI disk

# ./setup-sda.sh >/dev/null 2>&1
[   37.515474] bcache: run_cache_set() invalidating existing data
[   37.527275] bcache: register_cache() registered cache device dm-0
[   37.625160] bcache: register_bdev() registered backing device sda
[   38.466997] bcache: bch_cached_dev_attach() Caching sda as bcache0 on set 
0e23e266-6032-4700-a4c9-464964ed1349

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
└─fake-loop0 253:00 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
sda8:001G  0 disk 
└─bcache0251:00 1024M  0 disk 


# echo writeback > /sys/block/sda/bcache/cache_mode

# echo 1 > /sys/block/sda/device/delete
[   77.878624] sd 2:0:0:0: [sda] Synchronizing SCSI cache

# sleep 10
#

# dd if=/dev/zero of=/dev/bcache0 bs=4k
[  120.268072] Buffer I/O error on dev bcache0, logical block 0, lost async 
page write
[  120.271978] Buffer I/O error on dev bcache0, logical block 1, lost async 
page write
[  120.274554] Buffer I/O error on dev bcache0, logical block 2, lost async 
page write
[  120.276671] Buffer I/O error on dev bcache0, logical block 3, lost async 
page write
[  120.278760] Buffer I/O error on dev bcache0, logical block 4, lost async 
page write
[  120.288042] Buffer I/O error on dev bcache0, logical block 5, lost async 
page write
[  120.290178] Buffer I/O error on dev bcache0, logical block 6, lost async 
page write
[  120.296043] Buffer I/O error on dev bcache0, logical block 7, lost async 
page write
[  120.298167] Buffer I/O error on dev bcache0, logical block 8, lost async 
page write
[  120.300220] Buffer I/O error on dev bcache0, logical block 3260, lost async 
page write
dd: error writing '/dev/bcache0': No space left on device
262143+0 records in
262142+0 records out
1073733632 bytes (1.1 GB, 1.0 GiB) copied, 3.49705 s, 307 MB/s

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
└─fake-loop0 253:00 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 


Modified


# uname -rv
4.15.0-55-generic #60+test20190703build1bcache1-Ubuntu SMP Wed Jul 3 21:41:37 
UTC

# modprobe scsi_debug dev_size_mb=1024
# dmesg | tail
[   27.855163] scsi host2: scsi_debug: version 1.86 [20160430]
 dev_size_mb=1024, opts=0x0, submit_queues=1, statistics=0
[   27.855497] scsi 2:0:0:0: Direct-Access Linuxscsi_debug   0186 
PQ: 0 ANSI: 7
[   27.855829] sd 2:0:0:0: Power-on or device reset occurred
[   27.855883] sd 2:0:0:0: Attached scsi generic sg0 type 0
[   27.863979] sd 2:0:0:0: [sda] 2097152 512-byte logical blocks: (1.07 GB/1.00 
GiB)
[   27.868028] sd 2:0:0:0: [sda] Write Protect is off
[   27.868031] sd 2:0:0:0: [sda] Mode Sense: 73 00 10 08
[   27.876114] sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, 
supports DPO and FUA
[   27.966877] sd 2:0:0:0: [sda] Attached SCSI disk
# 

# ./setup-sda.sh >/dev/null 2>&1
[   46.294586] bcache: run_cache_set() invalidating existing data
[   46.306769] bcache: register_cache() registered cache device dm-0
[   46.400510] bcache: register_bdev() registered backing device sda
[   47.247662] bcache: bch_cached_dev_attach() Caching sda as bcache0 on set 
d3d7816b-6bd9-4dea-bf0c-4bde76b45f91

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
└─fake-loop0 253:00 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
sda8:001G  0 disk 
└─bcache0251:00 1024M  0 disk 

# echo writeback > /sys/block/sda/bcache/cache_mode

# echo 1 > /sys/block/sda/device/delete
[   85.713826] sd 2:0:0:0: [sda] Synchronizing SCSI cache

# sleep 10
[   90.428125] bcache: cached_dev_status_update() sda: device

[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
I/O Error Test 6


Make sure that if two bcache devices share a cache device,
only one bcache device is offlined in case of I/O errors
in only one backing device.

Original


# uname -rv
4.15.0-55-generic #60-Ubuntu SMP Tue Jul 2 18:22:20 UTC 2019


# ./setup-two-bcache-one-cache.sh >/dev/null 2>&1
[   20.464988] bcache: register_bdev() registered backing device dm-0
[   20.474082] bcache: register_bdev() registered backing device dm-1
[   20.485464] bcache: run_cache_set() invalidating existing data
[   20.496253] bcache: register_cache() registered cache device dm-2
[   21.478688] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set 
e247271f-eea4-46ea-8ffe-4b04299a7c24
[   21.484183] bcache: bch_cached_dev_attach() Caching dm-1 as bcache1 on set 
e247271f-eea4-46ea-8ffe-4b04299a7c24

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
└─fake-loop0 253:00 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
loop1  7:101G  0 loop 
└─fake-loop1 253:10 1024M  0 dm   
  └─bcache1  251:128  0 1024M  0 disk 
loop2  7:201G  0 loop 
└─fake-loop2 253:20 1024M  0 dm   
  ├─bcache0  251:00 1024M  0 disk 
  └─bcache1  251:128  0 1024M  0 disk 

# echo writeback | tee /sys/block/dm-*/bcache/cache_mode
writeback
# cat /sys/block/dm-*/bcache/cache_mode
writethrough [writeback] writearound none
writethrough [writeback] writearound none


# ./dm_fake_dev.sh /dev/loop0 bad
[   56.142318] Buffer I/O error on dev dm-0, logical block 262128, async page 
read
[   56.151754] Buffer I/O error on dev dm-0, logical block 262128, async page 
read
[   56.158706] bcache: register_bcache() error /dev/dm-0: device already 
registered (emitting change event)
[   56.161631] Buffer I/O error on dev bcache0, logical block 262112, async 
page read
[   56.169723] Buffer I/O error on dev bcache0, logical block 262112, async 
page read
[   56.172303] Buffer I/O error on dev bcache0, logical block 1, async page read


root@guest-bcache:~# dd if=/dev/zero of=/dev/bcache1 bs=4k & dd if=/dev/zero 
of=/dev/bcache0 bs=4k &
[1] 1400
[2] 1401
# [   85.400054] Buffer I/O error on dev bcache0, logical block 0, lost async 
page write
[   85.403138] Buffer I/O error on dev bcache0, logical block 1, lost async 
page write
[   85.417864] Buffer I/O error on dev bcache0, logical block 2, lost async 
page write
[   85.419699] Buffer I/O error on dev bcache0, logical block 3, lost async 
page write
[   85.421510] Buffer I/O error on dev bcache0, logical block 4, lost async 
page write
[   85.423301] Buffer I/O error on dev bcache0, logical block 5, lost async 
page write
[   85.450407] Buffer I/O error on dev bcache0, logical block 6, lost async 
page write
[   85.452248] Buffer I/O error on dev bcache0, logical block 7, lost async 
page write
[   85.453980] Buffer I/O error on dev bcache0, logical block 8, lost async 
page write
[   85.455722] Buffer I/O error on dev bcache0, logical block 9, lost async 
page write
dd: error writing '/dev/bcache0': No space left on device
dd: error writing '/dev/bcache1': No space left on device
262142+0 records in
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 4.23916 s, 253 MB/s

[2]+  Exit 1  dd if=/dev/zero of=/dev/bcache0 bs=4k
# 262142+0 records in
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 6.5355 s, 164 MB/s

[1]+  Exit 1  dd if=/dev/zero of=/dev/bcache1 bs=4k


# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
loop1  7:101G  0 loop 
└─fake-loop1 253:10 1024M  0 dm   
  └─bcache1  251:128  0 1024M  0 disk 
loop2  7:201G  0 loop 
└─fake-loop2 253:20 1024M  0 dm   
  ├─bcache0  251:00 1024M  0 disk 
  └─bcache1  251:128  0 1024M  0 disk 
fake-loop0   253:001G  0 dm   
└─bcache0251:00 1024M  0 disk 

Nothing is removed, since no I/O errors are detected.


Modified


# uname -rv
4.15.0-55-generic #60+test20190703build1bcache1-Ubuntu SMP Wed Jul 3 21:41:37 
UTC


# ./setup-two-bcache-one-cache.sh >/dev/null 2>&1
[   29.341483] bcache: register_bdev() registered backing device dm-0
[   29.351963] bcache: run_cache_set() invalidating existing data
[   29.365256] bcache: register_cache() registered cache device dm-2
[   29.365566] bcache: register_bdev() registered backing device dm-1
[   30.357267] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set 
ee57b847-da70-4b65-888b-e5795ecf6c46
[   30.363117] bcache: bch_cached_dev_attach() Caching dm-1 as bcache1 on set 
ee57b847-da70-4b65-888b-e5795ecf6c46

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop 
└─fake-loop0 253:00 1024M  0 dm   
  └─bcache0  251:00 1024M  0 disk 
loop1  7:101G  0 loop 
└─fake-loop1 253:10 1024M  0 dm   
  └─bcache1  251:128  0 1024M  0 disk 
l

[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
I/O Error Test 3


commit "bcache: fix inaccurate io state for detached bcache devices"

Probçem: on detached bcache devices, iostat reports non-zero 'avgqu-sz'
and '%util' even after IO stopped.

Original kernel: iostat fields are non-zero on  I/O idle.
Modified kernel: iostat fields are all zero on  I/O idle.


Original


# uname -rv
4.15.0-55-generic #60-Ubuntu SMP Tue Jul 2 18:22:20 UTC 2019

# ./setup.sh >/dev/null 2>&1
[  118.371172] bcache: register_bdev() registered backing device dm-0
[  118.377221] bcache: run_cache_set() invalidating existing data 
[  118.386614] bcache: register_cache() registered cache device dm-1
[  120.373682] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set 
4f2b6518-1a58-42e0-8720-42b535cb826f

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
   
loop0  7:001G  0 loop
└─fake-loop0 253:00 1024M  0 dm
  └─bcache0  251:00 1024M  0 disk  
loop1  7:101G  0 loop
└─fake-loop1 253:10 1024M  0 dm 

  └─bcache0  251:00 1024M  0 disk   
   


# echo 1 > /sys/block/bcache0/bcache/detach
[  131.847360] bcache: cached_dev_detach_finish() Caching disabled for dm-0
   
# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT

loop0  7:001G  0 loop   
   
└─fake-loop0 253:00 1024M  0 dm
  └─bcache0  251:00 1024M  0 disk  
loop1  7:101G  0 loop  
└─fake-loop1 253:10 1024M  0 dm

# dd if=/dev/zero of=/dev/bcache0 bs=4k 

dd: error writing '/dev/bcache0': No space left on device   
   
262142+0 records in
262141+0 records out   
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 3.86316 s, 278 MB/s


# sync

# iostat -xy 1 10 /dev/bcache0 | grep bcache0
bcache0  0.000.00  0.00  0.00 0.00 0.00   0.00   
0.000.000.00 262224.00 0.00 0.00   0.00 100.00
bcache0  0.000.00  0.00  0.00 0.00 0.00   0.00   
0.000.000.00 262224.00 0.00 0.00   0.00 100.00
bcache0  0.000.00  0.00  0.00 0.00 0.00   0.00   
0.000.000.00 262224.00 0.00 0.00   0.00 100.00
bcache0  0.000.00  0.00  0.00 0.00 0.00   0.00   
0.000.000.00 262224.00 0.00 0.00   0.00 100.00
bcache0  0.000.00  0.00  0.00 0.00 0.00   0.00   
0.000.000.00 262224.00 0.00 0.00   0.00 100.00
bcache0  0.000.00  0.00  0.00 0.00 0.00   0.00   
0.000.000.00 262224.00 0.00 0.00   0.00 100.00
bcache0  0.000.00  0.00  0.00 0.00 0.00   0.00   
0.000.000.00 262224.00 0.00 0.00   0.00 100.00
bcache0  0.000.00  0.00  0.00 0.00 0.00   0.00   
0.000.000.00 262224.00 0.00 0.00   0.00 100.00
bcache0  0.000.00  0.00  0.00 0.00 0.00   0.00   
0.000.000.00 262224.00 0.00 0.00   0.00 100.00
bcache0  0.000.00  0.00  0.00 0.00 0.00   0.00   
0.000.000.00 262224.00 0.00 0.00   0.00 100.00


Modified


# uname -rv
4.15.0-55-generic #60+test20190703build1bcache1-Ubuntu SMP Wed Jul 3 21:41:37 
UTC


# ./setup.sh >/dev/null 2>&1
[   26.923296] bcache: register_bdev() registered backing device dm-0
[   26.928998] bcache: run_cache_set() invalidating existing data
[   26.940144] bcache: register_cache() registered cache device dm-1
[   28.920772] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set 
bf256397-22a4-40ac-8ff3-0026fd28354c

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop
└─fake-loop0 253:00 1024M  0 dm
  └─bcache0  251:00 1024M  0 disk
loop1  7:101G  0 loop
└─fake-loop1 253:10 1024M  0 dm
  └─bcache0  251:00 1024M  0 disk

# echo 1 > /sys/block/bcache0/bcache/detach
[   42.292538] bcache: cached_dev_detach_finish() Caching disabled for dm-0

# lsblk -e 252
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0  7:001G  0 loop
└─fake-loop0 253:00 1024M  0 dm
  └─bcache0  251:00 1024M  0 disk
loop1  7:101G  0 loop
└─fake-loop1 253:1

[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
** Attachment added: "setup-two-bcache-one-cache.sh"
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+attachment/5275535/+files/setup-two-bcache-one-cache.sh

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  In Progress

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
 I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
 specially in writeback mode, can lead to data loss and/or
 the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
 even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of almost every
 patch with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
 mode (writethrough, writeback, writearound, none) with the
 xfstests test suite (on ext4), fio (random read-write) and
 iozone (several read/write tests).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
 in bcache code, however, synthetic testing of the patches
 has been performed, and extensive regression/stress tests
 were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
 patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
 (e.g., BootStack, UA cloud, except rare all-flash cases).

  
  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
** Attachment added: "setup.sh"
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+attachment/5275533/+files/setup.sh

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  In Progress

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
 I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
 specially in writeback mode, can lead to data loss and/or
 the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
 even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of almost every
 patch with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
 mode (writethrough, writeback, writearound, none) with the
 xfstests test suite (on ext4), fio (random read-write) and
 iozone (several read/write tests).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
 in bcache code, however, synthetic testing of the patches
 has been performed, and extensive regression/stress tests
 were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
 patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
 (e.g., BootStack, UA cloud, except rare all-flash cases).

  
  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
** Attachment added: "setup-sda.sh"
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+attachment/5275534/+files/setup-sda.sh

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  In Progress

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
 I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
 specially in writeback mode, can lead to data loss and/or
 the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
 even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of almost every
 patch with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
 mode (writethrough, writeback, writearound, none) with the
 xfstests test suite (on ext4), fio (random read-write) and
 iozone (several read/write tests).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
 in bcache code, however, synthetic testing of the patches
 has been performed, and extensive regression/stress tests
 were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
 patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
 (e.g., BootStack, UA cloud, except rare all-flash cases).

  
  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
Tests with I/O errors
=

The I/O error tests in the caching and backing device have been
simulated with the device mapper dm-linear and dm-error targets.

The tests are based on the patch messages, which describe a
problem and/or expected behavior, so to exercise the change.

Comparisons/comments between the original & modified/test kernels
are provided in the next comments.

The test script initially sets up fake DM devices in 'good' state
with the linear target, and later can switch them to 'bad' state
with the error target, and back again.
(attached script: 'dm_fake_dev.sh')

The bcache setup uses the DM devices above (for caching and backing
devices) on top of loop devices in a virtual machine for flexibility.
(attached script: 'setup.sh')

One test uses a scsi-debug SCSI disk as backing device to offline it.
(attached script: 'setup-sda.sh')

One test uses 2 bcache devices with 1 shared cache device to ensure
only one bcache device is stopped if 1 backing device has failures.
(attached script: 'setup-two-bcache-one-cache.sh')

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  In Progress

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
 I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
 specially in writeback mode, can lead to data loss and/or
 the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
 even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of almost every
 patch with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
 mode (writethrough, writeback, writearound, none) with the
 xfstests test suite (on ext4), fio (random read-write) and
 iozone (several read/write tests).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
 in bcache code, however, synthetic testing of the patches
 has been performed, and extensive regression/stress tests
 were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
 patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
 (e.g., BootStack, UA cloud, except rare all-flash cases).

  
  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
** Attachment added: "test-bcache-cache-modes.fio.sh"
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+attachment/5275532/+files/test-bcache-cache-modes.fio.sh

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  In Progress

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
 I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
 specially in writeback mode, can lead to data loss and/or
 the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
 even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of almost every
 patch with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
 mode (writethrough, writeback, writearound, none) with the
 xfstests test suite (on ext4), fio (random read-write) and
 iozone (several read/write tests).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
 in bcache code, however, synthetic testing of the patches
 has been performed, and extensive regression/stress tests
 were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
 patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
 (e.g., BootStack, UA cloud, except rare all-flash cases).

  
  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
Tests with fio
==

The fio tool has been run for performance tests on the original 
and modified kernels in sequential (rw) and random read/write (randrw)
on all cache modes available (writethrough, writeback, writearound, none).

Summary: no regressions observed, same IOPS and BW metrics,
run three times (test1, test2, test3) on 'orig' and 'test'.

# fio \
  --name=test \
  --filename=/dev/bcache0 \
  --runtime=180s \
  --ioengine=libaio \
  --rw=$RW_MODE \
  --bs=4k \
  --iodepth=16 \
  --numjobs=8 \
  --group_reporting

(attached script: 'test-bcache-cache-modes.fio.sh')

Details:
---

The bcache device is made of
- caching device: 100 GB partition in NVMe drive
- backing device:   1 TB partition in SCSI disk

$ dpkg -s fio | grep Version:
Version: 3.1-1

# grep -e read: -e write: fio.test*/fio.*/fio.test.*.log | sed 's,/fio, fio,g' 
| sort -s -k4,4 -k3,3 -k2,2
fio.test1 fio.orig fio.test.none.randrw.log:   read: IOPS=79, BW=318KiB/s 
(325kB/s)(55.9MiB/180092msec)
fio.test2 fio.orig fio.test.none.randrw.log:   read: IOPS=81, BW=328KiB/s 
(336kB/s)(57.7MiB/180137msec)
fio.test3 fio.orig fio.test.none.randrw.log:   read: IOPS=80, BW=324KiB/s 
(331kB/s)(56.9MiB/180142msec)
fio.test1 fio.test fio.test.none.randrw.log:   read: IOPS=81, BW=326KiB/s 
(334kB/s)(57.4MiB/180222msec)
fio.test2 fio.test fio.test.none.randrw.log:   read: IOPS=80, BW=324KiB/s 
(331kB/s)(56.0MiB/180177msec)
fio.test3 fio.test fio.test.none.randrw.log:   read: IOPS=77, BW=312KiB/s 
(319kB/s)(54.8MiB/180103msec)
fio.test1 fio.orig fio.test.none.rw.log:   read: IOPS=81.8k, BW=319MiB/s 
(335MB/s)(56.2GiB/180023msec)
fio.test2 fio.orig fio.test.none.rw.log:   read: IOPS=82.7k, BW=323MiB/s 
(339MB/s)(56.8GiB/180070msec)
fio.test3 fio.orig fio.test.none.rw.log:   read: IOPS=82.8k, BW=323MiB/s 
(339MB/s)(56.9GiB/180006msec)
fio.test1 fio.test fio.test.none.rw.log:   read: IOPS=83.2k, BW=325MiB/s 
(341MB/s)(57.1GiB/180035msec)
fio.test2 fio.test fio.test.none.rw.log:   read: IOPS=85.2k, BW=333MiB/s 
(349MB/s)(58.5GiB/180030msec)
fio.test3 fio.test fio.test.none.rw.log:   read: IOPS=82.9k, BW=324MiB/s 
(339MB/s)(56.9GiB/180003msec)
fio.test1 fio.orig fio.test.writearound.randrw.log:   read: IOPS=78, 
BW=315KiB/s (322kB/s)(55.3MiB/180123msec)
fio.test2 fio.orig fio.test.writearound.randrw.log:   read: IOPS=78, 
BW=313KiB/s (321kB/s)(55.1MiB/180114msec)
fio.test3 fio.orig fio.test.writearound.randrw.log:   read: IOPS=81, 
BW=325KiB/s (333kB/s)(57.2MiB/180108msec)
fio.test1 fio.test fio.test.writearound.randrw.log:   read: IOPS=81, 
BW=324KiB/s (332kB/s)(57.0MiB/180107msec)
fio.test2 fio.test fio.test.writearound.randrw.log:   read: IOPS=78, 
BW=315KiB/s (323kB/s)(55.4MiB/180069msec)
fio.test3 fio.test fio.test.writearound.randrw.log:   read: IOPS=78, 
BW=312KiB/s (320kB/s)(54.9MiB/180109msec)
fio.test1 fio.orig fio.test.writearound.rw.log:   read: IOPS=81.8k, BW=319MiB/s 
(335MB/s)(56.2GiB/180046msec)
fio.test2 fio.orig fio.test.writearound.rw.log:   read: IOPS=81.6k, BW=319MiB/s 
(334MB/s)(56.0GiB/180004msec)
fio.test3 fio.orig fio.test.writearound.rw.log:   read: IOPS=82.7k, BW=323MiB/s 
(339MB/s)(56.8GiB/180002msec)
fio.test1 fio.test fio.test.writearound.rw.log:   read: IOPS=82.8k, BW=323MiB/s 
(339MB/s)(56.9GiB/180097msec)
fio.test2 fio.test fio.test.writearound.rw.log:   read: IOPS=79.6k, BW=311MiB/s 
(326MB/s)(54.6GiB/180003msec)
fio.test3 fio.test fio.test.writearound.rw.log:   read: IOPS=81.7k, BW=319MiB/s 
(335MB/s)(56.1GiB/180056msec)
fio.test1 fio.orig fio.test.writeback.randrw.log:   read: IOPS=81, BW=327KiB/s 
(335kB/s)(57.5MiB/180146msec)
fio.test2 fio.orig fio.test.writeback.randrw.log:   read: IOPS=79, BW=318KiB/s 
(326kB/s)(55.9MiB/180075msec)
fio.test3 fio.orig fio.test.writeback.randrw.log:   read: IOPS=80, BW=321KiB/s 
(329kB/s)(56.5MiB/180103msec)
fio.test1 fio.test fio.test.writeback.randrw.log:   read: IOPS=80, BW=324KiB/s 
(331kB/s)(56.9MiB/180082msec)
fio.test2 fio.test fio.test.writeback.randrw.log:   read: IOPS=79, BW=317KiB/s 
(325kB/s)(55.8MiB/180106msec)
fio.test3 fio.test fio.test.writeback.randrw.log:   read: IOPS=79, BW=316KiB/s 
(324kB/s)(55.6MiB/180144msec)
fio.test1 fio.orig fio.test.writeback.rw.log:   read: IOPS=72.4k, BW=283MiB/s 
(297MB/s)(49.7GiB/180097msec)
fio.test2 fio.orig fio.test.writeback.rw.log:   read: IOPS=97.6k, BW=381MiB/s 
(400MB/s)(67.1GiB/180238msec)
fio.test3 fio.orig fio.test.writeback.rw.log:   read: IOPS=82.9k, BW=324MiB/s 
(339MB/s)(56.9GiB/180002msec)
fio.test1 fio.test fio.test.writeback.rw.log:   read: IOPS=81.5k, BW=318MiB/s 
(334MB/s)(55.0GiB/180002msec)
fio.test2 fio.test fio.test.writeback.rw.log:   read: IOPS=72.1k, BW=282MiB/s 
(295MB/s)(49.5GiB/180165msec)
fio.test3 fio.test fio.test.writeback.rw.log:   read: IOPS=81.8k, BW=319MiB/s 
(335MB/s)(56.2GiB/180037msec)
fio.test1 fio.orig fio.test.writethrough.randrw.log:   read: IOPS=80, 
BW=321KiB/s (329kB/s)(56.5MiB/180137msec)
fio.test2 fio.orig fio.test.writethrough.randrw

[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
** Attachment added: "test-bcache-cache-modes.xfstests.sh"
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+attachment/5275531/+files/test-bcache-cache-modes.xfstests.sh

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  In Progress

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
 I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
 specially in writeback mode, can lead to data loss and/or
 the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
 even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of almost every
 patch with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
 mode (writethrough, writeback, writearound, none) with the
 xfstests test suite (on ext4), fio (random read-write) and
 iozone (several read/write tests).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
 in bcache code, however, synthetic testing of the patches
 has been performed, and extensive regression/stress tests
 were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
 patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
 (e.g., BootStack, UA cloud, except rare all-flash cases).

  
  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1829563] Re: bcache: risk of data loss on I/O errors in backing or caching devices

2019-07-07 Thread Mauricio Faria de Oliveira
Tests with xfstests
===

The xfstests test suite has been run on the original
and modified kernels -- on all cache modes available
(writethrough, writeback, writearound, none).

Summary: no regressions observed; apparently one FIX
achieved in the writearound cache mode (generic/371).

(attached script: 'test-bcache-cache-modes.sh')

Details:
---

The bcache device is made of
- caching device: 1 GB partition in NVMe drive  
- backing device: 5 GB partition in SCSI disk

ubuntu@buneary:~/xfstests-dev$ git log --oneline -1
f769a923 (HEAD -> master, origin/master, origin/HEAD) xfs: project quota 
ineritance flag test

The comparison of the output of xfstests/check script
for the 'Failures' with original (orig) and modified
(test) test kernels is provided below.

# grep ^Failures: xfstests.*/xfstests.*.*.log | sed 's,/, ,' | sort -k 2,2 -k 
1,1
xfstests.orig xfstests.test.none.log:Failures: ext4/032 generic/484 generic/491 
generic/504 generic/537
xfstests.test xfstests.test.none.log:Failures: ext4/032 generic/484 generic/491 
generic/504 generic/537
xfstests.orig xfstests.test.writearound.log:Failures: ext4/032 generic/371 
generic/451 generic/484 generic/491 generic/504 generic/537
xfstests.test xfstests.test.writearound.log:Failures: ext4/032 <.> 
generic/451 generic/484 generic/491 generic/504 generic/537
xfstests.orig xfstests.test.writeback.log:Failures: generic/484 generic/491 
generic/504 generic/537
xfstests.test xfstests.test.writeback.log:Failures: generic/484 generic/491 
generic/504 generic/537
xfstests.orig xfstests.test.writethrough.log:Failures: ext4/032 generic/484 
generic/491 generic/504 generic/537
xfstests.test xfstests.test.writethrough.log:Failures: ext4/032 generic/484 
generic/491 generic/504 generic/537

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  In Progress

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle
 I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices,
 specially in writeback mode, can lead to data loss and/or
 the application is not notified about failed I/O requests.

   * The bcache device might remain available for I/O requests
 even if backing device is offline, so writes are undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of almost every
 patch with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache
 mode (writethrough, writeback, writearound, none) with the
 xfstests test suite (on ext4), fio (random read-write) and
 iozone (several read/write tests).

  [Regression Potential]

   * The patchset is relatively large and touches several areas
 in bcache code, however, synthetic testing of the patches
 has been performed, and extensive regression/stress tests
 were run (as mentioned in Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other
 patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively
 (e.g., BootStack, UA cloud, except rare all-flash cases).

  
  [Original Bug Description]

  This is a request for a backport of the following upstream patch from
  4.18:

  "bcache: stop bcache device when backing device is offline"
  
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good
  to have this in the GA/bionic kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1829563/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp