On Tue, Mar 29, 2016 at 10:22:29PM +0800, Anand Jain wrote:
> Write and Flush errors are considered as critical errors,
> upon which the device will be brought offline and marked as
> failed. Write and Flush errors are identified using device
> error statistics.
>
> Signed-off-by: Anand Jain <anand.j...@oracle.com>
>
> btrfs: check for failed device and hot replace
>
> This patch creates casualty_kthread to check for the failed
> devices, and triggers device replace.
>
> Signed-off-by: Anand Jain <anand.j...@oracle.com>
> ---
>  fs/btrfs/ctree.h   |   2 +
>  fs/btrfs/disk-io.c | 161 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  fs/btrfs/disk-io.h |   2 +
>  fs/btrfs/volumes.c |   1 +
>  fs/btrfs/volumes.h |   4 ++
>  5 files changed, 169 insertions(+), 1 deletion(-)

btrfs_check_and_handle_casualty() tries to perform auto-replacement only
once after each failure. If no hot spare was added to the system before
the failure, the only remaining way to replace the drive is to do it
manually.

This sounds reasonable, so just a clarification: are you sure that we
shouldn't start auto-replacement if a hot spare is added after the drive
failure? V1 of the patchset retried the auto-replace endlessly until a
replacement drive was added.

-- 
Yauhen Kharuzhy