On Wed, 18 Aug 1999, Gadi Oxman wrote:

> I'd recommend verifying if the following changes affect the s/w
> raid-5 performance:
> 
> 1.    A kernel compiled with HZ=1024 instead of HZ=100 -- this
>       will decrease the latency between "i/o submitted to the raid
>       layer" and "i/o submitted to the low level drivers" by allowing
>       the raid-5 kernel thread to run more often.
> 
> 2.    Increased NR_STRIPES constant in drivers/block/raid5.c from 128
>       to 256 or 512; this will potentially queue a larger amount of data
>       to the low level drivers simultaneously.

Another thing which might hurt performance is the hash table scanning
order in the raid-5 kernel thread.

In the default setup, the hash table can contain up to 1024 entries,
and the hash function is:

#define stripe_hash(conf, sect, size) \
        ((conf)->stripe_hashtbl[((sect) / (size >> 9)) & HASH_MASK])

With a 512-byte block size, this means that sectors 0 - 1023 map to
slots 0 - 1023 in that order, sectors 1024 - 2047 map to slots 0 - 1023
again, and so on.
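
To make the mapping concrete, here is a small user-space sketch of the
same arithmetic. It is not kernel code: stripe_slot() is a hypothetical
helper, and NR_HASH = 1024 is assumed from the default setup described
above.

```c
#include <assert.h>

/* Assumed for illustration: 1024 slots, as in the default setup. */
#define NR_HASH   1024
#define HASH_MASK (NR_HASH - 1)

/*
 * Hypothetical helper: returns the slot index that stripe_hash() would
 * select. The sector number is divided by the block size expressed in
 * 512-byte sectors, then masked to the table size.
 */
static unsigned long stripe_slot(unsigned long sect, unsigned long size)
{
        assert(size >= 512);    /* otherwise size >> 9 would be zero */
        return (sect / (size >> 9)) & HASH_MASK;
}
```

For example, stripe_slot(1023, 512) is 1023 and stripe_slot(1024, 512)
wraps back to 0.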

At most NR_STRIPES active stripes can be in the hash table at a time.
Besides using the hash table to find a stripe quickly, we also queue
the stripes to the low level drivers by scanning the table in
increasing slot order, starting from slot 0.

This means that if we currently have 128 pending write stripes which
wrap around the table, for example for sectors 950 - 1077, we will
actually queue sectors 1024 - 1077 first (slots 0 - 53), and only
then queue sectors 950 - 1023 (slots 950 - 1023), which might be one
of the causes for sub-optimal performance.
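
The wrap-around in this example can be reproduced with a small
user-space simulation. This is only a sketch under simplifying
assumptions (512-byte blocks, one stripe per slot, NR_HASH = 1024);
linear_scan_order() is a hypothetical name, not something in raid5.c.

```c
#include <stddef.h>

/* Assumed for illustration: 1024 slots, as in the default setup. */
#define NR_HASH   1024
#define HASH_MASK (NR_HASH - 1)

/*
 * Places sectors first .. first+count-1 into a table by slot, then
 * writes them to out[] in the order a scan of slots 0 .. NR_HASH-1
 * meets them. Requires count <= NR_HASH. Returns the number written.
 */
static size_t linear_scan_order(unsigned long first, size_t count,
                                unsigned long *out)
{
        unsigned long table[NR_HASH];
        char used[NR_HASH] = {0};
        size_t i, n = 0;

        for (i = 0; i < count; i++) {
                unsigned long s = first + i;

                table[s & HASH_MASK] = s;
                used[s & HASH_MASK] = 1;
        }
        /* Scan from slot 0 upwards, as the unpatched loop does. */
        for (i = 0; i < NR_HASH; i++)
                if (used[i])
                        out[n++] = table[i];
        return n;
}
```

Calling linear_scan_order(950, 128, out) puts sector 1024 in out[0]
and sector 950 only at out[54], after the 54 sectors 1024 - 1077.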

The following patch tries to find the slot holding the current minimum
sector, and scans the table from there in a circular manner, so that
in the above example we will queue the sectors in increasing order.
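
As a rough user-space illustration of the idea (not the patch itself),
the two passes could look like the sketch below. struct slot,
find_min_index() and circular_scan_order() are hypothetical names, and
each slot holds a single sector for simplicity. Note that the sketch
tracks "minimum found" with a separate flag rather than treating a
min_sector of 0 as unset; that is a choice made for the sketch.

```c
#include <stddef.h>

/* Assumed for illustration: 1024 slots, as in the default setup. */
#define NR_HASH   1024
#define HASH_MASK (NR_HASH - 1)

/* Hypothetical stand-in for a hash slot: at most one sector per slot. */
struct slot {
        int used;
        unsigned long sector;
};

/* First pass: index of the occupied slot with the smallest sector,
 * or 0 if the table is empty. */
static size_t find_min_index(const struct slot *tbl)
{
        size_t i, min_index = 0;
        unsigned long min_sector = 0;
        int found = 0;

        for (i = 0; i < NR_HASH; i++) {
                if (!tbl[i].used)
                        continue;
                if (!found || tbl[i].sector < min_sector) {
                        found = 1;
                        min_sector = tbl[i].sector;
                        min_index = i;
                }
        }
        return min_index;
}

/* Second pass: visit every slot once, starting at the minimum and
 * wrapping around. Returns the number of sectors written to out[]. */
static size_t circular_scan_order(const struct slot *tbl, unsigned long *out)
{
        size_t i, n = 0, start = find_min_index(tbl);

        for (i = 0; i < NR_HASH; i++) {
                const struct slot *s = &tbl[(i + start) & HASH_MASK];

                if (s->used)
                        out[n++] = s->sector;
        }
        return n;
}
```

With sectors 950 - 1077 in the table, find_min_index() returns 950 and
circular_scan_order() emits 950, 951, ..., 1077 in increasing order.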

Gadi

--- drivers/block/raid5.c~      Fri Jun 18 10:18:07 1999
+++ drivers/block/raid5.c       Wed Aug 18 22:39:06 1999
@@ -1322,7 +1322,8 @@
        struct stripe_head *sh;
        raid5_conf_t *conf = data;
        mddev_t *mddev = conf->mddev;
-       int i, handled = 0, unplug = 0;
+       int i, handled = 0, unplug = 0, min_index = 0;
+       unsigned long min_sector = 0;
        unsigned long flags;
 
        PRINTK(("+++ raid5d active\n"));
@@ -1332,8 +1333,22 @@
                md_update_sb(mddev);
        }
        for (i = 0; i < NR_HASH; i++) {
-repeat:
                sh = conf->stripe_hashtbl[i];
+               if (!sh || sh->phase == PHASE_COMPLETE || sh->nr_pending)
+                       continue;
+               if (!min_sector) {
+                       min_sector = sh->sector;
+                       min_index = i;
+                       continue;
+               }
+               if (sh->sector < min_sector) {
+                       min_sector = sh->sector;
+                       min_index = i;
+               }
+       }
+       for (i = 0; i < NR_HASH; i++) {
+repeat:
+               sh = conf->stripe_hashtbl[(i + min_index) & HASH_MASK];
                for (; sh; sh = sh->hash_next) {
                        if (sh->raid_conf != conf)
                                continue;
