Re: [PATCH 3.13.y-ckt 10/60] md/raid10: always set reshape_safe when initializing reshape_position.
On Tue, 2015-09-01 at 17:57 -0700, Kamal Mostafa wrote:
> 3.13.11-ckt26 -stable review patch.  If anyone has any objections, please let
> me know.

I'm deferring this commit until the next 3.13-stable release (along with
"md: flush ->event_work before stopping array.") as per the guidance on
their cc: stable lines.

 -Kamal

> --
>
> From: NeilBrown
>
> commit 299b0685e31c9f3dcc2d58ee3beca761a40b44b3 upstream.
>
> 'reshape_position' tracks where in the reshape we have reached.
> 'reshape_safe' tracks where in the reshape we have safely recorded
> in the metadata.
>
> These are compared to determine when to update the metadata.
> So it is important that reshape_safe is initialised properly.
> Currently it isn't.  When starting a reshape from the beginning
> it usually has the correct value by luck.  But when reducing the
> number of devices in a RAID10, it has the wrong value and this leads
> to the metadata not being updated correctly.
> This can lead to corruption if the reshape is not allowed to complete.
>
> This patch is suitable for any -stable kernel which supports RAID10
> reshape, which is 3.5 and later.
>
> Fixes: 3ea7daa5d7fd ("md/raid10: add reshape support")
> Signed-off-by: NeilBrown
> Signed-off-by: Kamal Mostafa
> ---
>  drivers/md/raid10.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 1b707ad..b8215a3 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -3597,6 +3597,7 @@ static struct r10conf *setup_conf(struct mddev *mddev)
>  		/* far_copies must be 1 */
>  		conf->prev.stride = conf->dev_sectors;
>  	}
> +	conf->reshape_safe = conf->reshape_progress;
>  	spin_lock_init(&conf->device_lock);
>  	INIT_LIST_HEAD(&conf->retry_list);
>
> @@ -3804,7 +3805,6 @@ static int run(struct mddev *mddev)
>  		}
>  		conf->offset_diff = min_offset_diff;
>
> -		conf->reshape_safe = conf->reshape_progress;
>  		clear_bit(MD_RECOVERY_SYNC, &mddev->recovery);
>  		clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
>  		set_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
> @@ -4149,6 +4149,7 @@ static int raid10_start_reshape(struct mddev *mddev)
>  		conf->reshape_progress = size;
>  	} else
>  		conf->reshape_progress = 0;
> +	conf->reshape_safe = conf->reshape_progress;
>  	spin_unlock_irq(&conf->device_lock);
>
>  	if (mddev->delta_disks && mddev->bitmap) {
> @@ -4215,6 +4216,7 @@ abort:
>  		rdev->new_data_offset = rdev->data_offset;
>  	smp_wmb();
>  	conf->reshape_progress = MaxSector;
> +	conf->reshape_safe = MaxSector;
>  	mddev->reshape_position = MaxSector;
>  	spin_unlock_irq(&conf->device_lock);
>  	return ret;
> @@ -4566,6 +4568,7 @@ static void end_reshape(struct r10conf *conf)
>  	md_finish_reshape(conf->mddev);
>  	smp_wmb();
>  	conf->reshape_progress = MaxSector;
> +	conf->reshape_safe = MaxSector;
>  	spin_unlock_irq(&conf->device_lock);
>
>  	/* read-ahead size must cover two whole stripes, which is

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3.13.y-ckt 10/60] md/raid10: always set reshape_safe when initializing reshape_position.
3.13.11-ckt26 -stable review patch.  If anyone has any objections, please let me know.

--

From: NeilBrown

commit 299b0685e31c9f3dcc2d58ee3beca761a40b44b3 upstream.

'reshape_position' tracks where in the reshape we have reached.
'reshape_safe' tracks where in the reshape we have safely recorded
in the metadata.

These are compared to determine when to update the metadata.
So it is important that reshape_safe is initialised properly.
Currently it isn't.  When starting a reshape from the beginning
it usually has the correct value by luck.  But when reducing the
number of devices in a RAID10, it has the wrong value and this leads
to the metadata not being updated correctly.
This can lead to corruption if the reshape is not allowed to complete.

This patch is suitable for any -stable kernel which supports RAID10
reshape, which is 3.5 and later.

Fixes: 3ea7daa5d7fd ("md/raid10: add reshape support")
Signed-off-by: NeilBrown
Signed-off-by: Kamal Mostafa
---
 drivers/md/raid10.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 1b707ad..b8215a3 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3597,6 +3597,7 @@ static struct r10conf *setup_conf(struct mddev *mddev)
 		/* far_copies must be 1 */
 		conf->prev.stride = conf->dev_sectors;
 	}
+	conf->reshape_safe = conf->reshape_progress;
 	spin_lock_init(&conf->device_lock);
 	INIT_LIST_HEAD(&conf->retry_list);
 
@@ -3804,7 +3805,6 @@ static int run(struct mddev *mddev)
 		}
 		conf->offset_diff = min_offset_diff;
 
-		conf->reshape_safe = conf->reshape_progress;
 		clear_bit(MD_RECOVERY_SYNC, &mddev->recovery);
 		clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
 		set_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
@@ -4149,6 +4149,7 @@ static int raid10_start_reshape(struct mddev *mddev)
 		conf->reshape_progress = size;
 	} else
 		conf->reshape_progress = 0;
+	conf->reshape_safe = conf->reshape_progress;
 	spin_unlock_irq(&conf->device_lock);
 
 	if (mddev->delta_disks && mddev->bitmap) {
@@ -4215,6 +4216,7 @@ abort:
 		rdev->new_data_offset = rdev->data_offset;
 	smp_wmb();
 	conf->reshape_progress = MaxSector;
+	conf->reshape_safe = MaxSector;
 	mddev->reshape_position = MaxSector;
 	spin_unlock_irq(&conf->device_lock);
 	return ret;
@@ -4566,6 +4568,7 @@ static void end_reshape(struct r10conf *conf)
 	md_finish_reshape(conf->mddev);
 	smp_wmb();
 	conf->reshape_progress = MaxSector;
+	conf->reshape_safe = MaxSector;
 	spin_unlock_irq(&conf->device_lock);
 
 	/* read-ahead size must cover two whole stripes, which is
-- 
1.9.1
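
For readers following the commit message, here is a toy user-space model (not kernel code) of why the starting value of reshape_safe matters. It assumes a reshape that walks from the end of the array toward sector 0, as the `conf->reshape_progress = size` branch in the patch suggests, and it assumes the uninitialised value happens to be 0. The names toy_conf, toy_needs_metadata_update, toy_count_updates and the `slack` parameter are hypothetical and exist only for illustration; the real comparison in drivers/md/raid10.c is more involved.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical, simplified stand-in for the two fields discussed above. */
struct toy_conf {
	sector_t reshape_progress; /* where the reshape has reached */
	sector_t reshape_safe;     /* position last recorded in the metadata */
};

/*
 * Backwards reshape: progress shrinks toward 0.  In this toy model the
 * metadata is considered stale once progress has dropped more than
 * 'slack' sectors below the recorded position.
 */
static bool toy_needs_metadata_update(const struct toy_conf *c, sector_t slack)
{
	return c->reshape_progress + slack < c->reshape_safe;
}

/* Run a whole backwards reshape and count how often metadata is updated. */
static int toy_count_updates(sector_t size, sector_t initial_safe,
			     sector_t step, sector_t slack)
{
	struct toy_conf c = {
		.reshape_progress = size,
		.reshape_safe = initial_safe,
	};
	int updates = 0;

	while (c.reshape_progress >= step) {
		c.reshape_progress -= step;
		if (toy_needs_metadata_update(&c, slack)) {
			c.reshape_safe = c.reshape_progress; /* "write" metadata */
			updates++;
		}
	}
	return updates;
}
```

In this model, toy_count_updates(1000, 1000, 100, 150) records five metadata updates, while toy_count_updates(1000, 0, 100, 150), with reshape_safe left at 0, records none: the comparison never fires, which mirrors the "metadata not being updated correctly" symptom described in the commit message.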