Marco Peereboom wrote:
> First let me defend softraid.  The rebuild code is designed to offer
> maximum data protection.  With this in mind certain assumptions were
> made.

Sorry... I hadn't stated that I, too, think data protection is king. Any performance increase that could compromise disk consistency is bad in my view. So if my patch is not good enough for you to include, it (probably) isn't good enough for me to trust with my data.

My instinct, when editing the code, was that you had created the SR_WUF_REBUILD tag to enable this kind of optimisation later... oh well, it seems that I _cannot_ read between the lines.

---

With luck (assuming that it was I, not my mail client, who messed up the tabs
last time), this is my original 'nasty' patch:

Index: softraid_raid1.c
===================================================================
RCS file: /cvs/src/sys/dev/softraid_raid1.c,v
retrieving revision 1.23
diff -u softraid_raid1.c
--- softraid_raid1.c    26 Mar 2010 11:20:34 -0000      1.23
+++ softraid_raid1.c    15 Apr 2010 17:34:48 -0000
@@ -447,13 +447,25 @@
                        }
                } else {
                        /* writes go on all working disks */
+                       /* XXX - copy and paste code warning */
                        x = i;
                        scp = sd->sd_vol.sv_chunks[x];
                        switch (scp->src_meta.scm_status) {
-                       case BIOC_SDONLINE:
                        case BIOC_SDSCRUB:
                        case BIOC_SDREBUILD:
                                b->b_flags |= B_WRITE;
+                               break;
+
+                       case BIOC_SDONLINE:
+                               /* when rebuilding, take care to only send the
+                                * rebuild writes to disks that need them
+                                */
+                               if (wu->swu_flags & SR_WUF_REBUILD) {
+                                       wu->swu_io_count--;
+                                       sr_ccb_put(ccb);
+                                       continue;
+                               } else
+                                       b->b_flags |= B_WRITE;
                                break;

                        case BIOC_SDHOTSPARE: /* should never happen */



And at the end of this email is a slightly better version (i.e. less copy and pasting). That patch still feels a bit nasty, because it assumes that b->b_flags will not already have B_WRITE set (which is true) without stating that assumption anywhere. But those are 'just' details... your concerns were not (yet) about the exact shape of the code, after all.
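If I wanted to state that assumption explicitly, something like this (an
untested one-liner; KASSERT only fires in DIAGNOSTIC kernels) could sit just
before the switch:

			/* document the assumption that no write flag
			 * is carried in from earlier processing */
			KASSERT((b->b_flags & B_WRITE) == 0);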

---

Testing:

When I said that I had tested the original patch, I did slightly more than
you may have guessed.  When I said 'no data corruption', what I meant was
that I diff'd the chunks after a rebuild to check that I hadn't introduced
any. Obviously the chunks _did_ differ, but only in the metadata (see the caveat below).
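
For the curious, this is roughly the kind of check I mean, as a standalone
sketch (the DATA_OFFSET value is a placeholder for wherever softraid's
per-chunk metadata ends, not a value taken from the real headers):

/*
 * Hypothetical sketch: compare the data areas of two chunk files,
 * skipping the leading region where softraid keeps its metadata.
 */
#include <stdio.h>
#include <string.h>

#define DATA_OFFSET	(528 * 512)	/* placeholder metadata size */
#define BUFSZ		(64 * 1024)

int
main(int argc, char *argv[])
{
	static char ba[BUFSZ], bb[BUFSZ];
	long long off = DATA_OFFSET;
	size_t ra, rb;
	FILE *a, *b;

	if (argc != 3) {
		fprintf(stderr, "usage: %s chunk0 chunk1\n", argv[0]);
		return 2;
	}
	if ((a = fopen(argv[1], "rb")) == NULL ||
	    (b = fopen(argv[2], "rb")) == NULL) {
		perror("fopen");
		return 2;
	}
	fseeko(a, DATA_OFFSET, SEEK_SET);
	fseeko(b, DATA_OFFSET, SEEK_SET);
	for (;;) {
		ra = fread(ba, 1, BUFSZ, a);
		rb = fread(bb, 1, BUFSZ, b);
		if (ra != rb || memcmp(ba, bb, ra) != 0) {
			fprintf(stderr, "chunks differ near offset %lld\n",
			    off);
			return 1;
		}
		if (ra == 0)
			break;
		off += ra;
	}
	printf("data areas identical\n");
	return 0;
}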

While rebuilding, I did write data to the array, so I am reasonably sure that writes issued during a rebuild get passed to all the members of the raid1.

What I didn't test, and may struggle to test, is what happens on a hardware failure during the rebuild (or during normal operation). The only failure mode I tested was a disk missing at boot. It _may_ be that I cannot hotplug my virtual hard disks; I'll find out later.

I also tested that I could fail any of the chunks without ill effect. This probably also checked that I hadn't messed up the roaming detection.

There is one way in which the chunks can still legitimately differ: blocks
that have never been written may not be consistent between chunks.  The
original rebuild code would, as a side effect, take one good source chunk's version of each such block and write it to all disks. My patched code doesn't. (For example, if such a block holds X on chunk A and Y on chunk B, the old code propagates one of the two everywhere, while the new code leaves A and B untouched and only populates the rebuilt chunk.) So after a rebuild the original code always produces a perfectly consistent set of chunks, whereas the patched code only does so if the good chunks were already consistent.

---

Thought experiment time:

I think that the patch doesn't cause any problems. I have included my thoughts below, but I am a novice at this kernel hackery, so please tell me where I have made an error (if I have).

I haven't edited the reading half of the rebuild process. So I cannot have introduced any bugs there (except, perhaps, exposing race conditions).

Assuming that the chunks marked BIOC_SDONLINE are consistent (i.e. they all
contain the same data), a rebuild write (tagged SR_WUF_REBUILD) carries the
same data as is already on those disks.  If the source chunks are not
consistent for some block then, as noted in the caveat above, that block has never been written, so its contents don't matter. Either way we _can_ safely ignore the request to write the data to an online chunk.

For normal writes (not tagged SR_WUF_REBUILD) the data needs to be written to all the working disks, as before (here the data being written will probably differ from what is on disk!).

For chunks marked BIOC_SDREBUILD or BIOC_SDSCRUB the data needs to be
written, as their contents will (probably) differ - just as the original code did.

For chunks marked BIOC_SDOFFLINE (and BIOC_SDHOTSPARE) the write shouldn't be done, so the request is ignored - again, as the original code did.

The code that I use for either enacting or ignoring writes to a given disk is the same code that was used originally - so I don't think that I could have caused problems with colliding IO (or race conditions).
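
To summarise that case analysis as a table, here is a minimal standalone
sketch of the write decision the patch implements (my illustration only;
the enum and function names are made up, not softraid's):

#include <stdbool.h>

enum chunk_state { ONLINE, SCRUB, REBUILD, HOTSPARE, OFFLINE };

static bool
chunk_needs_write(enum chunk_state state, bool is_rebuild_write)
{
	switch (state) {
	case ONLINE:
		/* online chunks already hold what a rebuild write
		 * carries, so only normal writes go to them */
		return !is_rebuild_write;
	case SCRUB:
	case REBUILD:
		/* these chunks may hold stale data: always write */
		return true;
	case HOTSPARE:	/* should never happen */
	case OFFLINE:
	default:
		return false;
	}
}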

---

Slightly improved patch...

Index: softraid_raid1.c
===================================================================
RCS file: /cvs/src/sys/dev/softraid_raid1.c,v
retrieving revision 1.23
diff -u softraid_raid1.c
--- softraid_raid1.c    26 Mar 2010 11:20:34 -0000      1.23
+++ softraid_raid1.c    15 Apr 2010 17:35:02 -0000
@@ -446,11 +446,18 @@
                                goto bad;
                        }
                } else {
-                       /* writes go on all working disks */
+                       /* 'normal' writes go on all working disks,
+                        * 'rebuild' writes go on all disks that need the data
+                        */
                        x = i;
                        scp = sd->sd_vol.sv_chunks[x];
+
                        switch (scp->src_meta.scm_status) {
                        case BIOC_SDONLINE:
+                               if (!(wu->swu_flags & SR_WUF_REBUILD))
+                                       b->b_flags |= B_WRITE;
+                               break;
+
                        case BIOC_SDSCRUB:
                        case BIOC_SDREBUILD:
                                b->b_flags |= B_WRITE;
@@ -458,14 +465,17 @@

                        case BIOC_SDHOTSPARE: /* should never happen */
                        case BIOC_SDOFFLINE:
-                               wu->swu_io_count--;
-                               sr_ccb_put(ccb);
-                               continue;
+                               break;

                        default:
                                goto bad;
                        }

+                       if (!(b->b_flags & B_WRITE)) {
+                               wu->swu_io_count--;
+                               sr_ccb_put(ccb);
+                               continue;
+                       }
                }
                ccb->ccb_target = x;
                b->b_dev = sd->sd_vol.sv_chunks[x]->src_dev_mm;
