[PATCH 1/2] scsi: convert unrecovered read error to -EILSEQ
It is quite easily to get URE after power failure and get scary message. URE is happens due to internal drive crc mismatch due to partial sector update. Most people interpret such message as "My drive is dying", which isreasonable assumption if your dmesg is full of complain from disks and read(2) return EIO. In fact this error is not fatal. One can fix it easily by rewriting affected sector. So we have to handle URE like follows: - Return EILSEQ to signall caller that this is bad data related problem - Do not retry command, because this is useless. ### Test case #Test uses two HDD: disks sdb sdc #Write_phase # let fio work ~100sec and then cut the power fio --ioengine=libaio --direct=1 --rw=write --bs=1M --iodepth=16 \ --time_based=1 --runtime=600 --filesize=1G --size=1T \ --name /dev/sdb --name /dev/sdc # Check_phase after system goes back fio --ioengine=libaio --direct=1 --group_reporting --rw=read --bs=1M \ --iodepth=16 --size=1G --filesize=1G --name=/dev/sdb --name /dev/sdc More info about URE probability here: https://plus.google.com/101761226576930717211/posts/Pctq7kk1dLL Signed-off-by: Dmitry Monakhov --- drivers/scsi/scsi_lib.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 19125d7..59d64ad 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -961,6 +961,19 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) /* See SSC3rXX or current. */ action = ACTION_FAIL; break; + case MEDIUM_ERROR: + if (sshdr.asc == 0x11) { + /* Handle unrecovered read error */ + switch (sshdr.ascq) { + case 0x00: /* URE */ + case 0x04: /* URE auto reallocate failed */ + case 0x0B: /* URE recommend reassignment*/ + case 0x0C: /* URE recommend rewrite the data */ + action = ACTION_FAIL; + error = -EILSEQ; + break; + } + } default: action = ACTION_FAIL; break; -- 2.9.3
[PATCH 2/2] block: Improve error handling verbosity
EILSEQ is returned due to internal csum error on disk/fabric, let's add special message to distinguish it from others. Also dump original numerical error code. Signed-off-by: Dmitry Monakhov --- block/blk-core.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 071a998..8eab846 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -2576,13 +2576,16 @@ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes) case -ENODATA: error_type = "critical medium"; break; + case -EILSEQ: + error_type = "bad data"; + break; case -EIO: default: error_type = "I/O"; break; } - printk_ratelimited(KERN_ERR "%s: %s error, dev %s, sector %llu\n", - __func__, error_type, req->rq_disk ? + printk_ratelimited(KERN_ERR "%s: %s error (%d), dev %s, sector %llu\n", + __func__, error_type, error, req->rq_disk ? req->rq_disk->disk_name : "?", (unsigned long long)blk_rq_pos(req)); -- 2.9.3
Re: [PATCH 0/3] Improve block device testing coverage
Eryu Guan writes: > On Fri, Mar 31, 2017 at 10:43:19AM +0300, Dmitry Monakhov wrote: >> Christoph Hellwig writes: >> >> > On Thu, Mar 30, 2017 at 05:19:01PM +0400, Dmitry Monakhov wrote: >> >> During LSFMM we have discussed how to test lower-backend of linux >> >> IO-stack. >> >> Common opinion was that xfstests is the most obvious solution which cover >> >> most of use cases filesystem care about. >> >> >> >> I'm working on integration T10-DIF/DIF data integrity features to ext4, >> >> for that reason we need to be shure that linux integrity framework is >> >> in working state, which is currently broken in several places. >> >> >> >> In fact, it is relatively simple to add basic coverage tests for basic >> >> IO operations over virtual device with integrity support. All we need >> >> is to add lio target support. >> > >> > First: Thanks for adding block layer testing! >> > >> > Second: even more so than Darrick's blockdev fallocate test this is >> > the wrong place. If I run xfstests I want to test my file system, >> > not random block device features. Please start a proper block device >> > testsuite instead, possibly by copy and pasting code from xfstests. >> Fair enough. I also not happy to place blkdev feature to tests/generic >> namespace. But altearnative to fork xfstests infrastructure to dedicated >> test-framework only for blkdevice seems not very good. Because fork is >> always pain. I already maintain one internal fork of xfstests which >> tests our Vituozzo's speciffic features. >> >> May be it would be reasonable idea to add didicated namespace >> 'tests/blockdev' in xfstests, and move all blkdev related tests here? >> IMHO this is good idea. Because filesystem relay on some basic >> features from blkdev which should be tested explicitly, because >> implicit testing is too hard to debug/investigation. > > I'm not sure if xfstests is the right place for blockdev tests, > especially for the pure blockdev level features (at least Darrick's > blockdev tests are testing fallocate(2) interface). Another good example may be a bug with dirty page cache after blkdiscard https://lkml.org/lkml/2017/3/22/789 . This simple bug result in crappy fsimage if mkfs relay on discard_zeroes_data behaviour. So IMHO basic blkdev test coverage is important filesystem testing. i.e. important for xfstests. > > But yeah, a new tests/blockdev dir would be good if we eventually decide > adding blockdev tests to xfstests, so they're not run by default when > people want to test filesystems. > > Thanks, > Eryu
Re: [PATCH 0/3] Improve block device testing coverage
Christoph Hellwig writes: > On Thu, Mar 30, 2017 at 05:19:01PM +0400, Dmitry Monakhov wrote: >> During LSFMM we have discussed how to test lower-backend of linux IO-stack. >> Common opinion was that xfstests is the most obvious solution which cover >> most of use cases filesystem care about. >> >> I'm working on integration T10-DIF/DIF data integrity features to ext4, >> for that reason we need to be shure that linux integrity framework is >> in working state, which is currently broken in several places. >> >> In fact, it is relatively simple to add basic coverage tests for basic >> IO operations over virtual device with integrity support. All we need >> is to add lio target support. > > First: Thanks for adding block layer testing! > > Second: even more so than Darrick's blockdev fallocate test this is > the wrong place. If I run xfstests I want to test my file system, > not random block device features. Please start a proper block device > testsuite instead, possibly by copy and pasting code from xfstests. Fair enough. I also not happy to place blkdev feature to tests/generic namespace. But altearnative to fork xfstests infrastructure to dedicated test-framework only for blkdevice seems not very good. Because fork is always pain. I already maintain one internal fork of xfstests which tests our Vituozzo's speciffic features. May be it would be reasonable idea to add didicated namespace 'tests/blockdev' in xfstests, and move all blkdev related tests here? IMHO this is good idea. Because filesystem relay on some basic features from blkdev which should be tested explicitly, because implicit testing is too hard to debug/investigation. > > That's how I started the test suite for qemu's block layer for example. Do you mean qemu/tests/qemu-iotests ?
[PATCH 2/3] add test: generic/420 check information lead for lio-fileio
Signed-off-by: Dmitry Monakhov --- tests/generic/420 | 77 +++ tests/generic/420.out | 2 ++ tests/generic/group | 1 + 3 files changed, 80 insertions(+) create mode 100755 tests/generic/420 create mode 100644 tests/generic/420.out diff --git a/tests/generic/420 b/tests/generic/420 new file mode 100755 index 000..592703d --- /dev/null +++ b/tests/generic/420 @@ -0,0 +1,77 @@ +#! /bin/bash +# FS QA Test 420 +# +# Regression test for information leak for lio-fileio target +# BUG: If image file is less than virtual blockdev then read() may return +# unitilized data. +# +#--- +# Copyright (c) 2017 Dmitry Monakhov +# All Rights Reserved. +# +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License as +# published by the Free Software Foundation. +# +# This program is distributed in the hope that it would be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write the Free Software Foundation, +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +#--- +# + +seq=`basename $0` +seqres=$RESULT_DIR/$seq +echo "QA output created by $seq" + +here=`pwd` +tmp=/tmp/$$ +status=1 # failure is the default! +trap "_cleanup; exit \$status" 0 1 2 3 15 + + +_cleanup() +{ + cd / + rm -f $tmp.* + _liotgt_cleanup + rm -rf $TEST_DIR/$$ +} + +# get standard environment, filters and checks +. ./common/rc +. ./common/filter +. ./common/liotarget + +# remove previous $seqres.full before test +rm -f $seqres.full + +# real QA test starts here + +# Modify as appropriate. +_supported_fs generic +_supported_os Linux +_require_test +_require_liotarget + +mkdir -p $TEST_DIR/$$ || _fail "Can not make test dir" + +cfg_path=$(_liotgt_create_fileio 420-test.img $TEST_DIR/$$/420-test.img 128M) +dev=$(_liotgt_attach_target /backstores/fileio/420-test.img) + +$XFS_IO_PROG -f -c "truncate 1M" $TEST_DIR/$$/420-test.img >> $seqres.full + +$XFS_IO_PROG -c "pwrite -S 0xa0 -b 512 512 1k" -d $dev >>$seqres.full 2>&1 || \ +_fail "pwrite failed" + +$XFS_IO_PROG -c "pwrite -S 0xab -b 512 2M 1k" -d $dev >>$seqres.full 2>&1 || \ +_fail "pwrite failed" + +md5sum $dev | sed -e "s,$dev,loopdev,g" +# success, all done +status=0 +exit diff --git a/tests/generic/420.out b/tests/generic/420.out new file mode 100644 index 000..deed5da --- /dev/null +++ b/tests/generic/420.out @@ -0,0 +1,2 @@ +QA output created by 420 +f5cfb0d8d7bfadbc2127f4f6ca4a0eba loopdev diff --git a/tests/generic/group b/tests/generic/group index 0781f35..1261547 100644 --- a/tests/generic/group +++ b/tests/generic/group @@ -422,3 +422,4 @@ 417 auto quick shutdown log 418 auto rw 419 auto quick encrypt +420 auto blockdev liotarget -- 2.9.3
[PATCH 1/3] add lio-target helpers
Linux-IO Target is very good framework for testing block backend. It is more flexible than scsi_debug. http://linux-iscsi.org/wiki/LIO Signed-off-by: Dmitry Monakhov --- common/config| 2 + common/liotarget | 111 +++ 2 files changed, 113 insertions(+) create mode 100644 common/liotarget diff --git a/common/config b/common/config index 59041a3..cfe7913 100644 --- a/common/config +++ b/common/config @@ -212,6 +212,8 @@ export XZ_PROG="`set_prog_path xz`" export FLOCK_PROG="`set_prog_path flock`" export LDD_PROG="`set_prog_path ldd`" export TIMEOUT_PROG="`set_prog_path timeout`" +export TARGETCLI_PROG="`set_prog_path targetcli`" +export TARGETCTL_PROG="`set_prog_path targetctl`" # use 'udevadm settle' or 'udevsettle' to wait for lv to be settled. # newer systems have udevadm command but older systems like RHEL5 don't. diff --git a/common/liotarget b/common/liotarget new file mode 100644 index 000..f821692 --- /dev/null +++ b/common/liotarget @@ -0,0 +1,111 @@ +#!/bin/bash +# +# Copyright (c) 2017 Virtuozzo Inc +# All Rights Reserved. +# +# Written by Dmitry Monakhov +# +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License as +# published by the Free Software Foundation. +# +# This program is distributed in the hope that it would be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write the Free Software Foundation, +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# +# Functions for Linux-IO Target manipulation +# + +_require_liotarget() +{ + which $TARGETCLI_PROG >>$seqres.full 2>&1 || \ + _notrun "this test requires 'targetcli' tool" + which $TARGETCLI_PROG >>$seqres.full 2>&1 || \ + _notrun "this test requires 'targetcli' tool" + + $TARGETCLI_PROG ls /backstores/ramdisk >>$seqres.full 2>&1 ||\ + _notrun "kernel compiled w/o CONFIG_TARGET_CORE" + $TARGETCLI_PROG ls /backstores/fileio >>$seqres.full 2>&1 ||\ + _notrun "kernel compiled w/o CONFIG_TCM_FILEIO" + $TARGETCLI_PROG ls /loopback >>$seqres.full 2>&1 ||\ + _notrun "kernel compiled w/o CONFIG_LOOPBACK_TARGET" +} + +_liotgt_create_fileio() +{ + local name=$1 + local img=$2 + local sz=$3 + + $TARGETCLI_PROG /backstores/fileio create \ + name=$name file_or_dev=$img size=$sz >>$seqres.full ||\ + _fail "Can not create /backstores/fileio/$name" + + local cfg_path=`ls -d /sys/kernel/config/target/core/fileio_*/$name` + [ -d "$cfg_path" ] || _fail "Bad config path" + echo $cfg_path +} + +_liotgt_set_attribute() +{ + local path=$1 + local attr_name=$2 + local attr_val=$3 + + [ -f $path/attrib/$attr_name ] || _fail "Bad attribute $attr_name" + echo "echo $attr_val > $path/attrib/$attr_name " >>$seqres.full + echo $attr_val > $path/attrib/$attr_name +} + +_liotgt_create_loopback() +{ + local out=`$TARGETCLI_PROG /loopback/ create 2>&1` + [ $? -eq 0 ] || _fail "Can not create loopback target" + echo $out >>$seqres.full + + local naa=`echo $out | gawk '{ print $3 }'` + echo ${naa:0:20} >>$seqres.full + + echo ${naa:0:20} +} + +_liotgt_find_dev() +{ + local found="" + local name=$1 + local drives=`find /sys/devices -type f -name model` + for d in $drives;do + local dir=`dirname $d` + local vendor=`cat $dir/vendor` + local model=`cat $dir/model` + if [ "${vendor//[[:space:]]/}" == "LIO-ORG" ] && \ + [ "${model//[[:space:]]/}" == "$name" ]; then + found=/dev/$(ls $dir/block) + break + fi + done + [ -z "$found" ] && _fail "Can not find device with backend $name" + echo "$found" +} + +_liotgt_attach_target() +{ + local path=$1 + local name=`basename $path` + local naa=$(_liotgt_create_loopback) + + $TARGETCLI_PROG /loopback/$naa/luns create $path >>$seqres.full 2>&1 ||\ + _fail "Can not attach $path to tcm_loop" + _liotgt_find_dev $name +} + +_liotgt_cleanup() +{ + $TARGETCTL_PROG clear +} -- 2.9.3
[PATCH 3/3] add test: generic/421 basic blockdev T10-DIF-TYPE1 integrity checks
Test create virtual block device via lio-targed infastructure and perform basic IO operations with data corruption detection. Temprorally mark is as dangerous, because currently it trigger BUG_ON inside blkdev_issue_flush BTW: I use 'dd' to test read from corrupted image instead of xfs_io because even if pread failed, xfs_io still exit with success, BUG? Signed-off-by: Dmitry Monakhov --- tests/generic/421 | 136 ++ tests/generic/421.out | 16 ++ tests/generic/group | 1 + 3 files changed, 153 insertions(+) create mode 100644 tests/generic/421 create mode 100644 tests/generic/421.out diff --git a/tests/generic/421 b/tests/generic/421 new file mode 100644 index 000..de1ba73 --- /dev/null +++ b/tests/generic/421 @@ -0,0 +1,136 @@ +#! /bin/bash +# FS QA Test 421 +# +# Check basic T10-DIF integrity features for a block device +# +# DIF/DIX TYPE: T10-DIF-TYPE1-CRC +# Kernel docs: Documentation/block/data-integrity.txt +#--- +# Copyright (c) 2017 Dmitry Monakhov +# All Rights Reserved. +# +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License as +# published by the Free Software Foundation. +# +# This program is distributed in the hope that it would be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write the Free Software Foundation, +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +#--- +# + +seq=`basename $0` +seqres=$RESULT_DIR/$seq +echo "QA output created by $seq" + +here=`pwd` +tmp=/tmp/$$ +status=1 # failure is the default! +trap "_cleanup; exit \$status" 0 1 2 3 15 + + +_cleanup() +{ + cd / + rm -f $tmp.* + _liotgt_cleanup + rm -rf $TEST_DIR/$$ +} + +# get standard environment, filters and checks +. ./common/rc +. ./common/filter +. ./common/liotarget + +# remove previous $seqres.full before test +rm -f $seqres.full + +# real QA test starts here + +# Modify as appropriate. +_supported_fs generic +_supported_os Linux +_require_test +_require_liotarget + +mkdir -p $TEST_DIR/$$ || _fail "Can not make test dir" + +# Create virtual block device +cfg_path=$(_liotgt_create_fileio t10-dif-t1 $TEST_DIR/$$/t10-dif-t1 32M) + +_liotgt_set_attribute $cfg_path pi_prot_type 1 +_liotgt_set_attribute $cfg_path pi_prot_format 1 + +dev=$(_liotgt_attach_target /backstores/fileio/t10-dif-t1) + +echo "T0: Test basic IO" +$XFS_IO_PROG -c "pwrite -S 0xa0 -b 4M 0 16M" -d $dev >>$seqres.full 2>&1 || \ +_fail "pwrite failed" + +$XFS_IO_PROG -c "pwrite -S 0xa1 -b 1k -V8 20M 32k" -d $dev >>$seqres.full 2>&1 || \ +_fail "pwrite failed" + +$XFS_IO_PROG -c "pwrite -S 0xa2 -b 1k -V8 2M 32k" -f $dev >>$seqres.full 2>&1 || \ +_fail "pwrite failed" + +$XFS_IO_PROG -c "pwrite -S 0xa3 -b 4k 1536000 8k" -f $dev >>$seqres.full 2>&1 || \ +_fail "pwrite failed" + +$XFS_IO_PROG -c "fsync" -d $dev >>$seqres.full 2>&1 || _fail "fsync failed" + +echo "Check that buffered and direct read works" +dd if=$dev bs=4k 2>$seqres.full | md5sum +dd if=$dev bs=4M iflag=direct 2>$seqres.full | md5sum + +echo "Check csum corruption detection" +# LIO-fileio store t10 DIF data in separate file ${IMG}.protection +# struct t10_pi_tuple { +# __be16 guard_tag; /* Checksum */ +# __be16 app_tag; /* Opaque storage */ +# __be32 ref_tag; /* Target LBA or indirect LBA */ +#} +# Play with 3000'th sector -> t10_pi_tuple offset == 3000 * 8 == 24000 +# +echo "T1: Corrupt guard_tag, next read should fail" +$XFS_IO_PROG -c "pwrite -S 0xde -b2 24000 2 -w" \ +-f $TEST_DIR/$$/t10-dif-t1.protection >>$seqres.full 2>&1 +dd if=$dev bs=1M count=2 iflag=direct >>$seqres.full 2>&1 && +_fail "read should fail on 3000'th sector" + +echo "T2: Check that unaffected blocks are still readable" +dd if=$dev bs=1M count=1 iflag=direct >>$seqres.full 2>&1 || _fail "read failed" + +echo "T3: Rewrite corrupted sector and check that read works" +$XFS_IO_PROG -c "pwrite -S 0xa3 -b 4k 1536000 4k" -d $dev >>$seqres.full 2>&1 || \ +_fail "pwrite failed" +dd if=$dev bs=2M count=1 iflag=direct >>$seqres.full 2>&1 || _fail "read failed&
[PATCH 0/3] Improve block device testing coverage
During LSFMM we have discussed how to test lower-backend of linux IO-stack. Common opinion was that xfstests is the most obvious solution which cover most of use cases filesystem care about. I'm working on integration T10-DIF/DIF data integrity features to ext4, for that reason we need to be shure that linux integrity framework is in working state, which is currently broken in several places. In fact, it is relatively simple to add basic coverage tests for basic IO operations over virtual device with integrity support. All we need is to add lio target support. TOC: add lio target helpers add test: generic/420 check information lead for lio-fileio add test: generic/421 basic blockdev T10-DIF-TYPE1 IO
HDD Unrecovered readerror issue
Drive:WDC WD1003FZEX-00MK2A0 I have got this in logs: ata1.00: failed command: READ FPDMA QUEUED ata1.00: cmd 60/a0:a0:f0:c0:c5/00:00:04:00:00/40 tag 20 ncq 81920 in res 41/40:00:88:c1:c5/00:00:04:00:00/00 Emask 0x409 (media error) ata1.00: status: { DRDY ERR } ata1.00: error: { UNC } ata1.00: configured for UDMA/133 sd 0:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE sd 0:0:0:0: [sda] Sense Key : Medium Error[current] [descriptor] sd 0:0:0:0: [sda] Add. Sense: Unrecovered readerror - auto reallocate failed sd 0:0:0:0: [sda] CDB: Read(10) 28 00 04 c5 c0 f0 00 00 a0 00 blk_update_request: I/O error, dev sda, sector 80069000 ata1: EH complete I can reproduce this easily #xfs_io -c "pread $((80069000/2))k 4k" -d /dev/sda pread64: Input/output error ##Got EIO ##Smartctl also detect this #smartctl -t short /dev/sda #smartctl -l selftest /dev/sda Short offline Completed: read failure 90% 4682 80069000 But once I rewrite this block, problem goes away. #xfs_io -c "pwrite -S 0x0 $((80069000/2))k 4k" -d /dev/sda Now I can read it w/o any errors and smartctl is happy #smartctl -t short /dev/sda #smartctl -l selftest /dev/sda Num Test_DescriptionStatus Remaining LifeTime(hours) LBA_of_first_error # 1 Short offline Completed without error 00% 4683 - So my disk is not dead right? Why the hell HDD fail read from very beginning Is this because HDD firmware detect internal crcXX sum corruption? How this can happen? Is this because of power failure? AFAIK standard guarantees that sector will be updated atomically. But it happens! Please guide me how to fix such problems in general. signature.asc Description: PGP signature