Re: [Regression/Behavior change]dm-flakey corrupt read bio, even the feature is drop_writes

2016-08-23 Thread Qu Wenruo



At 08/22/2016 11:07 PM, Mike Snitzer wrote:

On Mon, Aug 22 2016 at  4:05am -0400,
Lukas Herbolt  wrote:


Hello,

There is patch from Mike. It's part of current pull request to 4.8-rc1
For more details check:
 - https://www.redhat.com/archives/dm-devel/2016-July/msg00561.html
 - https://www.redhat.com/archives/dm-devel/2016-August/msg00109.html

Lukas

On Mon, Aug 22, 2016 at 9:31 AM, Qu Wenruo  wrote:

Hi, Mike and btrfs and dm guys

When doing regression test on v4.8-rc1, we found that fstests/btrfs/056
always fails. With the following dmesg:
---
Buffer I/O error on dev dm-0, logical block 1310704, async page read
Buffer I/O error on dev dm-0, logical block 16, async page read
Buffer I/O error on dev dm-0, logical block 16, async page read
---

And bisect leads to the following commits:
---
commit 99f3c90d0d85708e7401a81ce3314e50bf7f2819
Author: Mike Snitzer 
Date:   Fri Jul 29 13:19:55 2016 -0400

dm flakey: error READ bios during the down_interval
---

While according to the document of dm-flakey, it says that when using
drop_writes feature, read bios are not affected:
---
  drop_writes:
All write I/O is silently ignored.
Read I/O is handled correctly.


I went back to the dm-flakey.c code at the time that the 'drop_writes'
feature was added via commit b26f5e3d.  It does confirm your
understanding of how reads should be handled if drop_writes is enabled.

Not sure why I thought differently.  Please try the following patch.


In fact, I quite understand the idea to corrupt READ bio, if drop_writes 
and corrupt_bio_byte is given at the same time.




diff --git a/drivers/md/dm-flakey.c b/drivers/md/dm-flakey.c
index 97e446d..6a2e8dd 100644
--- a/drivers/md/dm-flakey.c
+++ b/drivers/md/dm-flakey.c
@@ -289,15 +289,13 @@ static int flakey_map(struct dm_target *ti, struct bio 
*bio)
pb->bio_submitted = true;

/*
-* Map reads as normal only if corrupt_bio_byte set.
+* Error reads if neither corrupt_bio_byte or drop_writes are 
set.
+* Otherwise, flakey_end_io() will decide if the reads should 
be modified.
 */
if (bio_data_dir(bio) == READ) {
-   /* If flags were specified, only corrupt those that 
match. */
-   if (fc->corrupt_bio_byte && (fc->corrupt_bio_rw == READ) 
&&
-   all_corrupt_bio_flags_match(bio, fc))
-   goto map_bio;
-   else
+   if (!fc->corrupt_bio_byte && !test_bit(DROP_WRITES, 
>flags))
return -EIO;
+   goto map_bio;
}

/*
@@ -334,14 +332,21 @@ static int flakey_end_io(struct dm_target *ti, struct bio 
*bio, int error)
struct flakey_c *fc = ti->private;
struct per_bio_data *pb = dm_per_bio_data(bio, sizeof(struct 
per_bio_data));

-   /*
-* Corrupt successful READs while in down state.
-*/
if (!error && pb->bio_submitted && (bio_data_dir(bio) == READ)) {
-   if (fc->corrupt_bio_byte)
+   if (fc->corrupt_bio_byte && (fc->corrupt_bio_rw == READ) &&
+   all_corrupt_bio_flags_match(bio, fc)) {
+   /*
+* Corrupt successful matching READs while in down 
state.
+*/
corrupt_bio_data(bio, fc);
-   else
+
+   } else if (!test_bit(DROP_WRITES, >flags)) {
+   /*
+* Error read during the down_interval if drop_writes
+* wasn't configured.
+*/
return -EIO;
+   }
}

return error;



It looks good to me, as it can also handle corrupt_bio_byte and drop_writes.

Tested-by: Qu Wenruo 

Thanks,
Qu


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [dm-devel] [Regression/Behavior change]dm-flakey corrupt read bio, even the feature is drop_writes

2016-08-23 Thread Qu Wenruo

Hi Lukas,

Thanks for your patch, while I am a little concerned of it, even I'm a 
newbie to flakey code.


At 08/22/2016 10:53 PM, Lukas Herbolt wrote:

Hi Qu,

Sorry for the confusion. Reading the email again and the code it seems
that the READS are really returned as -EIO if you set the drop_writes.
I just tested it and you are right.

If I was reading the fstest correctly the flakey is created as:
---
flakey: 0 409600 flakey 8:64 0 0 180 1 drop_writes
---

I believe the READs are dropped because it does not have any flags set.

---
if (bio_data_dir(bio) == READ) {
/* If flags were specified, only corrupt those that match. */
if (fc->corrupt_bio_byte && (fc->corrupt_bio_rw == READ) &&
all_corrupt_bio_flags_match(bio, fc))
goto map_bio;
else
return -EIO;
}
---

with conclusion of setting:
---
/*
 * Flag this bio as submitted while down.
 */
pb->bio_submitted = true;
---

I have quick test patch ready, but it probably broke more thing than
fixes so I will continue on it.
Just in case you want to test it. Diff is done again 4.8-rc1

--- a/drivers/md/dm-flakey.c
+++ b/drivers/md/dm-flakey.c
@@ -292,6 +292,11 @@ static int flakey_map(struct dm_target *ti,
struct bio *bio)
 * Map reads as normal only if corrupt_bio_byte set.
 */
if (bio_data_dir(bio) == READ) {
+/* We should retunr all READS as ok in case
of DROP WRITES flag is set. */
+   if (test_bit(DROP_WRITES, >flags)) {
+   pb->bio_submitted = false;
+   goto map_bio;
+   }


According to my personal understanding, drop_writes should:
1) Drop any write bio silently
   Just as its name
2) For read
2.1) Read out data if the range doesn't include corrupt_bio_byte

2.2) Read out corrupted data if the range contains corrupt_bio_byte.

So it seems that 2.2) is not fulfilled.

While it solves the problem I reported, I'm still concerned if it 
matches the correct/designed behavior of flakey.


Thanks,
Qu

/* If flags were specified, only corrupt those
that match. */
if (fc->corrupt_bio_byte &&
(fc->corrupt_bio_rw == READ) &&
all_corrupt_bio_flags_match(bio, fc))



On Mon, Aug 22, 2016 at 10:05 AM, Lukas Herbolt  wrote:

Hello,

There is patch from Mike. It's part of current pull request to 4.8-rc1
For more details check:
 - https://www.redhat.com/archives/dm-devel/2016-July/msg00561.html
 - https://www.redhat.com/archives/dm-devel/2016-August/msg00109.html

Lukas

On Mon, Aug 22, 2016 at 9:31 AM, Qu Wenruo  wrote:

Hi, Mike and btrfs and dm guys

When doing regression test on v4.8-rc1, we found that fstests/btrfs/056
always fails. With the following dmesg:
---
Buffer I/O error on dev dm-0, logical block 1310704, async page read
Buffer I/O error on dev dm-0, logical block 16, async page read
Buffer I/O error on dev dm-0, logical block 16, async page read
---

And bisect leads to the following commits:
---
commit 99f3c90d0d85708e7401a81ce3314e50bf7f2819
Author: Mike Snitzer 
Date:   Fri Jul 29 13:19:55 2016 -0400

dm flakey: error READ bios during the down_interval
---

While according to the document of dm-flakey, it says that when using
drop_writes feature, read bios are not affected:
---
  drop_writes:
All write I/O is silently ignored.
Read I/O is handled correctly.
---

If I understand the word "correctly" correctly, it should means READ I/0 is
handled without problem.

However with this commit, it also corrupt the read bio, leading to the test
failure.


At least there are two fixes available here;
1) Fix fstest scripts
   The related macro is "_flakey_drop_and_remount yes", which will
   check the fs during the "drop_writes" time.

   Currently, only btrfs/056 calls "_flakey_drop_and_remount" with
   "yes". So other test cases are not affected.

   However, even we move the fsck outside of the "drop_writes" range,
   although test case can pass without problem, but we will still
   get a dmesg error:
   "Buffer I/O error on dev dm-0, logical block 1310704, async page read"

2) Revert to flakey behavior to allow READ bio
   Then everything is back to the old good days.

Not sure which one is correct for current use case, as I'm not familiar with
dm codes.

Any idea to fix dm-flaky and keep the READ bio behavior?

Thanks,
Qu





--
dm-devel mailing list
dm-de...@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel




--
Lukas Herbolt
RHCE, RH436, BSc, SSc
Senior Technical Support Engineer
Global Support Services (GSS)
Email:lherb...@redhat.com







--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to 

Re: [Regression/Behavior change]dm-flakey corrupt read bio, even the feature is drop_writes

2016-08-22 Thread Mike Snitzer
On Mon, Aug 22 2016 at  4:05am -0400,
Lukas Herbolt  wrote:

> Hello,
> 
> There is patch from Mike. It's part of current pull request to 4.8-rc1
> For more details check:
>  - https://www.redhat.com/archives/dm-devel/2016-July/msg00561.html
>  - https://www.redhat.com/archives/dm-devel/2016-August/msg00109.html
> 
> Lukas
> 
> On Mon, Aug 22, 2016 at 9:31 AM, Qu Wenruo  wrote:
> > Hi, Mike and btrfs and dm guys
> >
> > When doing regression test on v4.8-rc1, we found that fstests/btrfs/056
> > always fails. With the following dmesg:
> > ---
> > Buffer I/O error on dev dm-0, logical block 1310704, async page read
> > Buffer I/O error on dev dm-0, logical block 16, async page read
> > Buffer I/O error on dev dm-0, logical block 16, async page read
> > ---
> >
> > And bisect leads to the following commits:
> > ---
> > commit 99f3c90d0d85708e7401a81ce3314e50bf7f2819
> > Author: Mike Snitzer 
> > Date:   Fri Jul 29 13:19:55 2016 -0400
> >
> > dm flakey: error READ bios during the down_interval
> > ---
> >
> > While according to the document of dm-flakey, it says that when using
> > drop_writes feature, read bios are not affected:
> > ---
> >   drop_writes:
> > All write I/O is silently ignored.
> > Read I/O is handled correctly.

I went back to the dm-flakey.c code at the time that the 'drop_writes'
feature was added via commit b26f5e3d.  It does confirm your
understanding of how reads should be handled if drop_writes is enabled.

Not sure why I thought differently.  Please try the following patch.

diff --git a/drivers/md/dm-flakey.c b/drivers/md/dm-flakey.c
index 97e446d..6a2e8dd 100644
--- a/drivers/md/dm-flakey.c
+++ b/drivers/md/dm-flakey.c
@@ -289,15 +289,13 @@ static int flakey_map(struct dm_target *ti, struct bio 
*bio)
pb->bio_submitted = true;
 
/*
-* Map reads as normal only if corrupt_bio_byte set.
+* Error reads if neither corrupt_bio_byte or drop_writes are 
set.
+* Otherwise, flakey_end_io() will decide if the reads should 
be modified.
 */
if (bio_data_dir(bio) == READ) {
-   /* If flags were specified, only corrupt those that 
match. */
-   if (fc->corrupt_bio_byte && (fc->corrupt_bio_rw == 
READ) &&
-   all_corrupt_bio_flags_match(bio, fc))
-   goto map_bio;
-   else
+   if (!fc->corrupt_bio_byte && !test_bit(DROP_WRITES, 
>flags))
return -EIO;
+   goto map_bio;
}
 
/*
@@ -334,14 +332,21 @@ static int flakey_end_io(struct dm_target *ti, struct bio 
*bio, int error)
struct flakey_c *fc = ti->private;
struct per_bio_data *pb = dm_per_bio_data(bio, sizeof(struct 
per_bio_data));
 
-   /*
-* Corrupt successful READs while in down state.
-*/
if (!error && pb->bio_submitted && (bio_data_dir(bio) == READ)) {
-   if (fc->corrupt_bio_byte)
+   if (fc->corrupt_bio_byte && (fc->corrupt_bio_rw == READ) &&
+   all_corrupt_bio_flags_match(bio, fc)) {
+   /*
+* Corrupt successful matching READs while in down 
state.
+*/
corrupt_bio_data(bio, fc);
-   else
+
+   } else if (!test_bit(DROP_WRITES, >flags)) {
+   /*
+* Error read during the down_interval if drop_writes
+* wasn't configured.
+*/
return -EIO;
+   }
}
 
return error;
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [dm-devel] [Regression/Behavior change]dm-flakey corrupt read bio, even the feature is drop_writes

2016-08-22 Thread Lukas Herbolt
Hi Qu,

Sorry for the confusion. Reading the email again and the code it seems
that the READS are really returned as -EIO if you set the drop_writes.
I just tested it and you are right.

If I was reading the fstest correctly the flakey is created as:
---
flakey: 0 409600 flakey 8:64 0 0 180 1 drop_writes
---

I believe the READs are dropped because it does not have any flags set.

---
if (bio_data_dir(bio) == READ) {
/* If flags were specified, only corrupt those that match. */
if (fc->corrupt_bio_byte && (fc->corrupt_bio_rw == READ) &&
all_corrupt_bio_flags_match(bio, fc))
goto map_bio;
else
return -EIO;
}
---

with conclusion of setting:
---
/*
 * Flag this bio as submitted while down.
 */
pb->bio_submitted = true;
---

I have quick test patch ready, but it probably broke more thing than
fixes so I will continue on it.
Just in case you want to test it. Diff is done again 4.8-rc1

--- a/drivers/md/dm-flakey.c
+++ b/drivers/md/dm-flakey.c
@@ -292,6 +292,11 @@ static int flakey_map(struct dm_target *ti,
struct bio *bio)
 * Map reads as normal only if corrupt_bio_byte set.
 */
if (bio_data_dir(bio) == READ) {
+/* We should retunr all READS as ok in case
of DROP WRITES flag is set. */
+   if (test_bit(DROP_WRITES, >flags)) {
+   pb->bio_submitted = false;
+   goto map_bio;
+   }
/* If flags were specified, only corrupt those
that match. */
if (fc->corrupt_bio_byte &&
(fc->corrupt_bio_rw == READ) &&
all_corrupt_bio_flags_match(bio, fc))



On Mon, Aug 22, 2016 at 10:05 AM, Lukas Herbolt  wrote:
> Hello,
>
> There is patch from Mike. It's part of current pull request to 4.8-rc1
> For more details check:
>  - https://www.redhat.com/archives/dm-devel/2016-July/msg00561.html
>  - https://www.redhat.com/archives/dm-devel/2016-August/msg00109.html
>
> Lukas
>
> On Mon, Aug 22, 2016 at 9:31 AM, Qu Wenruo  wrote:
>> Hi, Mike and btrfs and dm guys
>>
>> When doing regression test on v4.8-rc1, we found that fstests/btrfs/056
>> always fails. With the following dmesg:
>> ---
>> Buffer I/O error on dev dm-0, logical block 1310704, async page read
>> Buffer I/O error on dev dm-0, logical block 16, async page read
>> Buffer I/O error on dev dm-0, logical block 16, async page read
>> ---
>>
>> And bisect leads to the following commits:
>> ---
>> commit 99f3c90d0d85708e7401a81ce3314e50bf7f2819
>> Author: Mike Snitzer 
>> Date:   Fri Jul 29 13:19:55 2016 -0400
>>
>> dm flakey: error READ bios during the down_interval
>> ---
>>
>> While according to the document of dm-flakey, it says that when using
>> drop_writes feature, read bios are not affected:
>> ---
>>   drop_writes:
>> All write I/O is silently ignored.
>> Read I/O is handled correctly.
>> ---
>>
>> If I understand the word "correctly" correctly, it should means READ I/0 is
>> handled without problem.
>>
>> However with this commit, it also corrupt the read bio, leading to the test
>> failure.
>>
>>
>> At least there are two fixes available here;
>> 1) Fix fstest scripts
>>The related macro is "_flakey_drop_and_remount yes", which will
>>check the fs during the "drop_writes" time.
>>
>>Currently, only btrfs/056 calls "_flakey_drop_and_remount" with
>>"yes". So other test cases are not affected.
>>
>>However, even we move the fsck outside of the "drop_writes" range,
>>although test case can pass without problem, but we will still
>>get a dmesg error:
>>"Buffer I/O error on dev dm-0, logical block 1310704, async page read"
>>
>> 2) Revert to flakey behavior to allow READ bio
>>Then everything is back to the old good days.
>>
>> Not sure which one is correct for current use case, as I'm not familiar with
>> dm codes.
>>
>> Any idea to fix dm-flaky and keep the READ bio behavior?
>>
>> Thanks,
>> Qu
>>
>>
>>
>>
>>
>> --
>> dm-devel mailing list
>> dm-de...@redhat.com
>> https://www.redhat.com/mailman/listinfo/dm-devel
>
>
>
> --
> Lukas Herbolt
> RHCE, RH436, BSc, SSc
> Senior Technical Support Engineer
> Global Support Services (GSS)
> Email:lherb...@redhat.com



-- 
Lukas Herbolt
RHCE, RH436, BSc, SSc
Senior Technical Support Engineer
Global Support Services (GSS)
Email:lherb...@redhat.com
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [dm-devel] [Regression/Behavior change]dm-flakey corrupt read bio, even the feature is drop_writes

2016-08-22 Thread Lukas Herbolt
Hello,

There is patch from Mike. It's part of current pull request to 4.8-rc1
For more details check:
 - https://www.redhat.com/archives/dm-devel/2016-July/msg00561.html
 - https://www.redhat.com/archives/dm-devel/2016-August/msg00109.html

Lukas

On Mon, Aug 22, 2016 at 9:31 AM, Qu Wenruo  wrote:
> Hi, Mike and btrfs and dm guys
>
> When doing regression test on v4.8-rc1, we found that fstests/btrfs/056
> always fails. With the following dmesg:
> ---
> Buffer I/O error on dev dm-0, logical block 1310704, async page read
> Buffer I/O error on dev dm-0, logical block 16, async page read
> Buffer I/O error on dev dm-0, logical block 16, async page read
> ---
>
> And bisect leads to the following commits:
> ---
> commit 99f3c90d0d85708e7401a81ce3314e50bf7f2819
> Author: Mike Snitzer 
> Date:   Fri Jul 29 13:19:55 2016 -0400
>
> dm flakey: error READ bios during the down_interval
> ---
>
> While according to the document of dm-flakey, it says that when using
> drop_writes feature, read bios are not affected:
> ---
>   drop_writes:
> All write I/O is silently ignored.
> Read I/O is handled correctly.
> ---
>
> If I understand the word "correctly" correctly, it should means READ I/0 is
> handled without problem.
>
> However with this commit, it also corrupt the read bio, leading to the test
> failure.
>
>
> At least there are two fixes available here;
> 1) Fix fstest scripts
>The related macro is "_flakey_drop_and_remount yes", which will
>check the fs during the "drop_writes" time.
>
>Currently, only btrfs/056 calls "_flakey_drop_and_remount" with
>"yes". So other test cases are not affected.
>
>However, even we move the fsck outside of the "drop_writes" range,
>although test case can pass without problem, but we will still
>get a dmesg error:
>"Buffer I/O error on dev dm-0, logical block 1310704, async page read"
>
> 2) Revert to flakey behavior to allow READ bio
>Then everything is back to the old good days.
>
> Not sure which one is correct for current use case, as I'm not familiar with
> dm codes.
>
> Any idea to fix dm-flaky and keep the READ bio behavior?
>
> Thanks,
> Qu
>
>
>
>
>
> --
> dm-devel mailing list
> dm-de...@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel



-- 
Lukas Herbolt
RHCE, RH436, BSc, SSc
Senior Technical Support Engineer
Global Support Services (GSS)
Email:lherb...@redhat.com
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Regression/Behavior change]dm-flakey corrupt read bio, even the feature is drop_writes

2016-08-22 Thread Qu Wenruo

Hi, Mike and btrfs and dm guys

When doing regression test on v4.8-rc1, we found that fstests/btrfs/056 
always fails. With the following dmesg:

---
Buffer I/O error on dev dm-0, logical block 1310704, async page read
Buffer I/O error on dev dm-0, logical block 16, async page read
Buffer I/O error on dev dm-0, logical block 16, async page read
---

And bisect leads to the following commits:
---
commit 99f3c90d0d85708e7401a81ce3314e50bf7f2819
Author: Mike Snitzer 
Date:   Fri Jul 29 13:19:55 2016 -0400

dm flakey: error READ bios during the down_interval
---

While according to the document of dm-flakey, it says that when using 
drop_writes feature, read bios are not affected:

---
  drop_writes:
All write I/O is silently ignored.
Read I/O is handled correctly.
---

If I understand the word "correctly" correctly, it should means READ I/0 
is handled without problem.


However with this commit, it also corrupt the read bio, leading to the 
test failure.



At least there are two fixes available here;
1) Fix fstest scripts
   The related macro is "_flakey_drop_and_remount yes", which will
   check the fs during the "drop_writes" time.

   Currently, only btrfs/056 calls "_flakey_drop_and_remount" with
   "yes". So other test cases are not affected.

   However, even we move the fsck outside of the "drop_writes" range,
   although test case can pass without problem, but we will still
   get a dmesg error:
   "Buffer I/O error on dev dm-0, logical block 1310704, async page read"

2) Revert to flakey behavior to allow READ bio
   Then everything is back to the old good days.

Not sure which one is correct for current use case, as I'm not familiar 
with dm codes.


Any idea to fix dm-flaky and keep the READ bio behavior?

Thanks,
Qu





--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html