I/O Error Test 1
================

commit "bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags"

Problem: when the caching device hits I/O errors, the cache set is
not retired immediately if I/O requests keep coming.

Original kernel: the bcache device remains on top of the caching
device, and fio never finishes.

Modified kernel: the bcache device is removed from the caching
device, and fio finishes.

Original
--------

# uname -rv
4.15.0-55-generic #60-Ubuntu SMP Tue Jul 2 18:22:20 UTC 2019

# ./setup.sh >/dev/null 2>&1
[  285.677682] bcache: register_bdev() registered backing device dm-0
[  285.697006] bcache: run_cache_set() invalidating existing data
[  285.710938] bcache: register_cache() registered cache device dm-1
[  287.686924] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set c589879b-b1c3-49b3-9603-9795ddc750f5

# lsblk -e 252
NAME         MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0          7:0    0    1G  0 loop 
└─fake-loop0 253:0    0 1024M  0 dm   
  └─bcache0  251:0    0 1024M  0 disk 
loop1          7:1    0    1G  0 loop 
└─fake-loop1 253:1    0 1024M  0 dm   
  └─bcache0  251:0    0 1024M  0 disk 

# ./dm_fake_dev.sh /dev/loop1 bad
[  766.102586] Buffer I/O error on dev dm-1, logical block 262128, async page read
[  766.107602] Buffer I/O error on dev dm-1, logical block 262128, async page read
[  766.113889] bcache: register_bcache() error /dev/dm-1: device already registered

On another shell:

# fio --name=write --rw=randwrite --filename=/dev/bcache0 --bs=4k --iodepth=8 --ioengine=libaio --runtime=300s --continue_on_error=all
write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=8
fio-3.1
Starting 1 process
Jobs: 1 (f=1): [f(1)][100.0%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 00m:00s]

< fio never finishes; it keeps showing the last line above >
< the console prints the messages below once fio starts running >

[  777.451177] bcache: bch_count_io_errors() dm-1: IO error on writing btree, recovering
[  777.490882] bcache: error on c589879b-b1c3-49b3-9603-9795ddc750f5:
[  777.490885] journal io error
[  777.494087] , disabling caching
[  807.900700] bcache: bch_count_io_errors() dm-1: IO error on writing btree, recovering
(these error messages keep repeating)
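
Since fio never returns on the original kernel, a scripted version of this test needs a deadline. The following is a minimal sketch, assuming GNU coreutils `timeout` is available. The helper name `run_with_deadline` and the 600-second limit are illustrative and not part of the original test scripts.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original test scripts).
# Runs a command under a deadline and reports whether it finished.
# Relies on GNU coreutils `timeout`, which exits with status 124
# when the deadline expires.
run_with_deadline() {
    limit="$1"; shift
    timeout "$limit" "$@"
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "hung: command still running after ${limit}s"
        return 1
    fi
    echo "finished: exit status $rc"
    return 0
}

# Usage (as root, with the fio command line from this report):
# run_with_deadline 600 fio --name=write --rw=randwrite \
#     --filename=/dev/bcache0 --bs=4k --iodepth=8 --ioengine=libaio \
#     --runtime=300s --continue_on_error=all
```

One caveat: if the process is blocked in uninterruptible I/O, the signal sent by `timeout` may not take effect, so the wrapper itself could stall. Treat it as a convenience, not a guarantee.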

bcache0 is still present on top of the caching device (fake-loop1):

# lsblk -e 252
NAME         MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0          7:0    0    1G  0 loop
└─fake-loop0 253:0    0 1024M  0 dm
  └─bcache0  251:0    0 1024M  0 disk
loop1          7:1    0    1G  0 loop
fake-loop1   253:1    0    1G  0 dm
└─bcache0    251:0    0 1024M  0 disk


Modified
--------

# uname -rv
4.15.0-55-generic #60+test20190703build1bcache1-Ubuntu SMP Wed Jul 3 21:41:37 UTC

# ./setup.sh >/dev/null 2>&1
[   60.542088] bcache: register_bdev() registered backing device dm-0
[   60.550509] bcache: run_cache_set() invalidating existing data
[   60.560109] bcache: register_cache() registered cache device dm-1
[   62.548849] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set f6833a2c-53e6-468e-bf1d-a9f48b73d783

# lsblk -e 252
NAME         MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0          7:0    0    1G  0 loop 
└─fake-loop0 253:0    0 1024M  0 dm   
  └─bcache0  251:0    0 1024M  0 disk 
loop1          7:1    0    1G  0 loop 
└─fake-loop1 253:1    0 1024M  0 dm   
  └─bcache0  251:0    0 1024M  0 disk 

# ./dm_fake_dev.sh /dev/loop1 bad
[   72.639185] Buffer I/O error on dev dm-1, logical block 262128, async page read
[   72.644876] Buffer I/O error on dev dm-1, logical block 262128, async page read
[   72.650707] bcache: register_bcache() error /dev/dm-1: device already registered

On another shell:

# fio --name=write --rw=randwrite --filename=/dev/bcache0 --bs=4k --iodepth=8 --ioengine=libaio --runtime=300s --continue_on_error=all

[   97.858468] bcache: bch_count_io_errors() dm-1: IO error on writing btree, recovering
[   97.868519] bcache: error on f6833a2c-53e6-468e-bf1d-a9f48b73d783:
[   97.868520] journal io error
[   97.869998] , disabling caching
[   97.871441] bcache: conditional_stop_bcache_device() stop_when_cache_set_failed of bcache0 is "auto" and cache is clean, keep it alive.
[   97.874423] Buffer I/O error on dev bcache0, logical block 2814, lost async page write
[   97.878697] Buffer I/O error on dev bcache0, logical block 2816, lost async page write
[   97.881702] Buffer I/O error on dev bcache0, logical block 2817, lost async page write
[   97.884790] Buffer I/O error on dev bcache0, logical block 2818, lost async page write
[   97.887709] Buffer I/O error on dev bcache0, logical block 2819, lost async page write
[   97.890558] Buffer I/O error on dev bcache0, logical block 2820, lost async page write
[   97.892419] Buffer I/O error on dev bcache0, logical block 2821, lost async page write
[   97.894228] Buffer I/O error on dev bcache0, logical block 2822, lost async page write
[   97.896107] Buffer I/O error on dev bcache0, logical block 2823, lost async page write
[   97.897900] Buffer I/O error on dev bcache0, logical block 2824, lost async page write
[   97.916818] bcache: cached_dev_detach_finish() Caching disabled for dm-0
[   97.918511] bcache: bch_count_io_errors() dm-1: IO error on writing btree, recovering
[   97.920581] bcache: cache_set_free() Cache set f6833a2c-53e6-468e-bf1d-a9f48b73d783 unregistered
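
The cache_set_free() message is the clearest sign in the log that the cache set was retired. A small check along these lines could turn that into a pass/fail condition. It is only a sketch: the helper name `cache_set_retired` is made up here, and the match string is copied from the message format in this log, which other kernel versions may word differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original test scripts).
# Reads kernel log text on stdin and succeeds if it contains the
# cache_set_free() message for the cache set UUID given as $1.
# -F matches the string literally, so the parentheses are not
# treated as regex syntax.
cache_set_retired() {
    grep -qF "cache_set_free() Cache set $1 unregistered"
}

# Usage with the UUID from the modified-kernel run:
# dmesg | cache_set_retired f6833a2c-53e6-468e-bf1d-a9f48b73d783 \
#     && echo "cache set retired"
```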

fio finished:

write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=8
fio-3.1
Starting 1 process
Jobs: 1 (f=1): [f(1)][100.0%][r=0KiB/s,w=135MiB/s][r=0,w=34.7k IOPS][eta 00m:00s]
...
Run status group 0 (all jobs):
  WRITE: bw=219MiB/s (229MB/s), 219MiB/s-219MiB/s (229MB/s-229MB/s), io=1024MiB (1074MB), run=4685-4685msec

bcache0 is no longer on top of the caching device:

# lsblk -e 252
NAME         MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0          7:0    0    1G  0 loop 
└─fake-loop0 253:0    0 1024M  0 dm   
  └─bcache0  251:0    0 1024M  0 disk 
loop1          7:1    0    1G  0 loop 
fake-loop1   253:1    0    1G  0 dm
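
The lsblk comparison can also be scripted. lsblk builds its tree from the holders directories in sysfs, so a check like the one below should agree with the listings above. This is a sketch under that assumption: the helper name `is_stacked_on` and the optional sysfs-root argument are illustrative, and dm-1 is taken to be the caching device (fake-loop1) as in the kernel messages in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original test scripts).
# Succeeds if <holder> is stacked on <dev> according to sysfs.
# The optional third argument overrides the sysfs root, which
# makes the check usable against a fake directory tree.
is_stacked_on() {
    holder="$1"; dev="$2"; sysroot="${3:-/sys}"
    [ -e "$sysroot/block/$dev/holders/$holder" ]
}

# Usage:
# if is_stacked_on bcache0 dm-1; then
#     echo "bcache0 still on top of the caching device"
# else
#     echo "bcache0 detached from the caching device"
# fi
```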

https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices
