Hello,
I tested the h/w accelerated RAID-5 using a kernel with PAGE_SIZE set to
64KB and found that the bonnie++ application hangs during the "Re-writing"
test. I investigated and found that the hang occurs because one of the
mpage_end_io_read() calls is never made (these are the callbacks invoked
from the ops_complete_biofill() function).
My low-level ADMA driver (the ppc440spe one) does invoke the
ops_complete_biofill() callback, but ops_complete_biofill() itself skips
calling the bi_end_io() handler of the completed bio (the current dev->read)
because another read request arrived for the sh (the current dev_q->toread)
while that bio was being processed. As a result, another biofill operation
is scheduled, and it overwrites the unacknowledged bio (dev->read in
ops_run_biofill()), so the previous dev->read bio is lost completely.
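
To make the sequence above easier to follow, here is a minimal user-space
sketch of the race. It is a toy model only, not the kernel code: the toy_*
names and the completed flag are hypothetical stand-ins for struct bio, the
r5dev fields, R5_Wantfill and bi_end_io(), and locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio; "completed" models bi_end_io(). */
struct toy_bio {
	int id;
	bool completed;
};

/* Hypothetical stand-in for the per-device state discussed above. */
struct toy_dev {
	struct toy_bio *read;   /* bio being filled (dev->read) */
	struct toy_bio *toread; /* newly queued read (dev_q->toread) */
	bool wantfill;          /* stands in for R5_Wantfill */
};

/* Models ops_run_biofill(): toread becomes read, overwriting the old read. */
static void toy_run_biofill(struct toy_dev *d)
{
	if (d->wantfill) {
		d->read = d->toread;
		d->toread = NULL;
	}
}

/* Models the original check: skip the ack whenever toread is pending. */
static void toy_complete_original(struct toy_dev *d)
{
	if (d->wantfill && !d->toread) {
		d->wantfill = false;
		if (d->read) {
			d->read->completed = true;
			d->read = NULL;
		}
	}
}

/* Models the patched check: ack the finished bio, keep Wantfill if needed. */
static void toy_complete_patched(struct toy_dev *d)
{
	if (!d->wantfill)
		return;
	if (d->toread && !d->read)
		return; /* nothing completed yet, leave Wantfill set */
	if (!d->toread)
		d->wantfill = false;
	if (d->read) {
		d->read->completed = true;
		d->read = NULL;
	}
}

/*
 * Bio A is being filled when bio B arrives. Run completion, the next
 * biofill, and completion again. Returns whether A was acknowledged and
 * stores whether B was acknowledged in *b_done.
 */
static bool toy_scenario(void (*complete)(struct toy_dev *), bool *b_done)
{
	struct toy_bio a = { 1, false }, b = { 2, false };
	struct toy_dev d = { &a, &b, true };

	assert(complete != NULL);
	complete(&d);        /* fill for A finished while B is queued */
	toy_run_biofill(&d); /* next fill picks up B */
	complete(&d);        /* fill for B finished */
	*b_done = b.completed;
	return a.completed;
}
```

In this model, toy_scenario(toy_complete_original, &done) leaves bio A
unacknowledged, while toy_scenario(toy_complete_patched, &done) acknowledges
both bios.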
Here is a patch that solves this problem. Perhaps it could be implemented
in a more elegant and efficient way. What are your thoughts?
Regards, Yuri
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 08b4893..7abc96b 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -838,11 +838,24 @@ static void ops_complete_biofill(void *stripe_head_ref)
/* acknowledge completion of a biofill operation */
/* and check if we need to reply to a read request
*/
- if (test_bit(R5_Wantfill, &dev_q->flags) && !dev_q->toread) {
+ if (test_bit(R5_Wantfill, &dev_q->flags)) {
struct bio *rbi, *rbi2;
struct r5dev *dev = &sh->dev[i];
- clear_bit(R5_Wantfill, &dev_q->flags);
+ /* There is a chance that another fill operation
+ * had been scheduled for this dev while we
+ * processed sh. In this case do one of the following
+ * alternatives:
+ * - if there is no active completed biofill for the dev
+ * then go to the next dev leaving Wantfill set;
+ * - if there is active completed biofill for the dev
+ * then ack it but leave Wantfill set.
+ */
+ if (dev_q->toread && !dev->read)
+ continue;
+
+ if (!dev_q->toread)
+ clear_bit(R5_Wantfill, &dev_q->flags);
/* The access to dev->read is outside of the
* spin_lock_irq(&conf->device_lock), but is protected