Re: [PATCH 5/15] cfq-iosched: speed up rbtree handling

2007-04-26 Thread Jens Axboe
On Thu, Apr 26 2007, Alan D. Brunelle wrote:
> Jens Axboe wrote:
> >On Wed, Apr 25 2007, Jens Axboe wrote:
> >>On Wed, Apr 25 2007, Jens Axboe wrote:
> >>>On Wed, Apr 25 2007, Alan D. Brunelle wrote:
> >>>>Hi Jens -
> >>>>
> >>>>The attached patch speeds it up even more - I'm finding a >9% reduction 
> >>>>in %system with no loss in IO performance. This just sets the cached 
> >>>>element when the first is looked for.
> >>>Interesting, good thinking. It should not change the IO pattern, as the
> >>>end result should be the same. Thanks Alan, will commit!
> >>>
> >>>I'll give elevator.c the same treatment, should be even more beneficial.
> >>>Stay tuned for a test patch.
> >>Something like this, totally untested (it compiles). I initially wanted
> >>to fold the cfq addon into the elevator.h provided implementation, but
> >>that requires more extensive changes. Given how little code it is, I
> >>think I'll keep them separate.
> >
> >Booted, seems to work fine for me. In a null ended IO test, I get about
> >a 1-2% speedup for a single queue of depth 64 using libaio. So it's
> >definitely worth it, will commit.
> >
> After longer runs last night, I think the patched elevator code /does/ 
> help (albeit ever so slightly - about 0.6% performance improvement at a 
> 1.1% %system overhead).
> 
>     rkB_s  %system  Kernel
> ---------  -------  ------
> 1022942.2     3.69  Original patch + fix to cfq_rb_first
> 1029087.0     3.73  This patch stream (including fixes to elevator code)

Ah good, thanks for testing! It's all in the cfq branch now.

-- 
Jens Axboe

Re: [PATCH 5/15] cfq-iosched: speed up rbtree handling

2007-04-26 Thread Alan D. Brunelle

Jens Axboe wrote:
>On Wed, Apr 25 2007, Jens Axboe wrote:
>>On Wed, Apr 25 2007, Jens Axboe wrote:
>>>On Wed, Apr 25 2007, Alan D. Brunelle wrote:
>>>>Hi Jens -
>>>>
>>>>The attached patch speeds it up even more - I'm finding a >9% reduction
>>>>in %system with no loss in IO performance. This just sets the cached
>>>>element when the first is looked for.
>>>
>>>Interesting, good thinking. It should not change the IO pattern, as the
>>>end result should be the same. Thanks Alan, will commit!
>>>
>>>I'll give elevator.c the same treatment, should be even more beneficial.
>>>Stay tuned for a test patch.
>>
>>Something like this, totally untested (it compiles). I initially wanted
>>to fold the cfq addon into the elevator.h provided implementation, but
>>that requires more extensive changes. Given how little code it is, I
>>think I'll keep them separate.
>
>Booted, seems to work fine for me. In a null ended IO test, I get about
>a 1-2% speedup for a single queue of depth 64 using libaio. So it's
>definitely worth it, will commit.

After longer runs last night, I think the patched elevator code /does/
help (albeit ever so slightly - about 0.6% performance improvement at a
1.1% %system overhead).

    rkB_s  %system  Kernel
---------  -------  ------
1022942.2     3.69  Original patch + fix to cfq_rb_first
1029087.0     3.73  This patch stream (including fixes to elevator code)

Alan



Re: [PATCH 5/15] cfq-iosched: speed up rbtree handling

2007-04-25 Thread Jens Axboe
On Wed, Apr 25 2007, Jens Axboe wrote:
> On Wed, Apr 25 2007, Jens Axboe wrote:
> > On Wed, Apr 25 2007, Alan D. Brunelle wrote:
> > > Hi Jens -
> > > 
> > > The attached patch speeds it up even more - I'm finding a >9% reduction 
> > > in %system with no loss in IO performance. This just sets the cached 
> > > element when the first is looked for.
> > 
> > Interesting, good thinking. It should not change the IO pattern, as the
> > end result should be the same. Thanks Alan, will commit!
> > 
> > I'll give elevator.c the same treatment, should be even more beneficial.
> > Stay tuned for a test patch.
> 
> Something like this, totally untested (it compiles). I initially wanted
> to fold the cfq addon into the elevator.h provided implementation, but
> that requires more extensive changes. Given how little code it is, I
> think I'll keep them separate.

Booted, seems to work fine for me. In a null ended IO test, I get about
a 1-2% speedup for a single queue of depth 64 using libaio. So it's
definitely worth it, will commit.

-- 
Jens Axboe

Re: [PATCH 5/15] cfq-iosched: speed up rbtree handling

2007-04-25 Thread Jens Axboe
On Wed, Apr 25 2007, Jens Axboe wrote:
> On Wed, Apr 25 2007, Alan D. Brunelle wrote:
> > Hi Jens -
> > 
> > The attached patch speeds it up even more - I'm finding a >9% reduction 
> > in %system with no loss in IO performance. This just sets the cached 
> > element when the first is looked for.
> 
> Interesting, good thinking. It should not change the IO pattern, as the
> end result should be the same. Thanks Alan, will commit!
> 
> I'll give elevator.c the same treatment, should be even more beneficial.
> Stay tuned for a test patch.

Something like this, totally untested (it compiles). I initially wanted
to fold the cfq addon into the elevator.h provided implementation, but
that requires more extensive changes. Given how little code it is, I
think I'll keep them separate.

diff --git a/block/as-iosched.c b/block/as-iosched.c
index ef12627..37c6fb3 100644
--- a/block/as-iosched.c
+++ b/block/as-iosched.c
@@ -89,7 +89,7 @@ struct as_data {
/*
 * requests (as_rq s) are present on both sort_list and fifo_list
 */
-   struct rb_root sort_list[2];
+   struct blk_rb_root sort_list[2];
struct list_head fifo_list[2];
 
struct request *next_rq[2]; /* next in sort order */
@@ -369,9 +369,9 @@ as_find_next_rq(struct as_data *ad, struct request *last)
else {
const int data_dir = rq_is_sync(last);
 
-   rbnext = rb_first(&ad->sort_list[data_dir]);
-   if (rbnext && rbnext != &last->rb_node)
-   next = rb_entry_rq(rbnext);
+   next = elv_rb_first(&ad->sort_list[data_dir]);
+   if (next == last)
+   next = NULL;
}
 
return as_choose_req(ad, next, prev);
@@ -1057,7 +1057,7 @@ static int as_dispatch_request(request_queue_t *q, int 
force)
 */
 
if (reads) {
-   BUG_ON(RB_EMPTY_ROOT(&ad->sort_list[REQ_SYNC]));
+   BUG_ON(BLK_RB_EMPTY_ROOT(&ad->sort_list[REQ_SYNC]));
 
if (writes && ad->batch_data_dir == REQ_SYNC)
/*
@@ -1081,7 +1081,7 @@ static int as_dispatch_request(request_queue_t *q, int 
force)
 
if (writes) {
 dispatch_writes:
-   BUG_ON(RB_EMPTY_ROOT(&ad->sort_list[REQ_ASYNC]));
+   BUG_ON(BLK_RB_EMPTY_ROOT(&ad->sort_list[REQ_ASYNC]));
 
if (ad->batch_data_dir == REQ_SYNC) {
ad->changed_batch = 1;
@@ -1337,8 +1337,8 @@ static void *as_init_queue(request_queue_t *q)
 
INIT_LIST_HEAD(>fifo_list[REQ_SYNC]);
INIT_LIST_HEAD(>fifo_list[REQ_ASYNC]);
-   ad->sort_list[REQ_SYNC] = RB_ROOT;
-   ad->sort_list[REQ_ASYNC] = RB_ROOT;
+   ad->sort_list[REQ_SYNC] = BLK_RB_ROOT;
+   ad->sort_list[REQ_ASYNC] = BLK_RB_ROOT;
ad->fifo_expire[REQ_SYNC] = default_read_expire;
ad->fifo_expire[REQ_ASYNC] = default_write_expire;
ad->antic_expire = default_antic_expire;
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 7a18f31..f3020aa 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -127,7 +127,7 @@ struct cfq_queue {
/* service_tree key */
unsigned long rb_key;
/* sorted list of pending requests */
-   struct rb_root sort_list;
+   struct blk_rb_root sort_list;
/* if fifo isn't expired, next request to serve */
struct request *next_rq;
/* requests queued in sort_list */
@@ -419,9 +419,9 @@ cfq_find_next_rq(struct cfq_data *cfqd, struct cfq_queue 
*cfqq,
if (rbnext)
next = rb_entry_rq(rbnext);
else {
-   rbnext = rb_first(&cfqq->sort_list);
-   if (rbnext && rbnext != &last->rb_node)
-   next = rb_entry_rq(rbnext);
+   next = elv_rb_first(&cfqq->sort_list);
+   if (next == last)
+   next = NULL;
}
 
return cfq_choose_req(cfqd, next, prev);
@@ -564,7 +564,7 @@ static inline void cfq_del_rq_rb(struct request *rq)
 
elv_rb_del(&cfqq->sort_list, rq);
 
-   if (cfq_cfqq_on_rr(cfqq) && RB_EMPTY_ROOT(&cfqq->sort_list))
+   if (cfq_cfqq_on_rr(cfqq) && BLK_RB_EMPTY_ROOT(&cfqq->sort_list))
cfq_del_cfqq_rr(cfqd, cfqq);
 }
 
@@ -871,7 +871,7 @@ static void cfq_arm_slice_timer(struct cfq_data *cfqd)
struct cfq_io_context *cic;
unsigned long sl;
 
-   WARN_ON(!RB_EMPTY_ROOT(&cfqq->sort_list));
+   WARN_ON(!BLK_RB_EMPTY_ROOT(&cfqq->sort_list));
WARN_ON(cfq_cfqq_slice_new(cfqq));
 
/*
@@ -983,7 +983,7 @@ static struct cfq_queue *cfq_select_queue(struct cfq_data 
*cfqd)
 * The active queue has requests and isn't expired, allow it to
 * dispatch.
 */
-   if (!RB_EMPTY_ROOT(&cfqq->sort_list))
+   if (!BLK_RB_EMPTY_ROOT(&cfqq->sort_list))
goto keep_queue;
 
/*
@@ -1015,7 +1015,7 @@ __cfq_dispatch_requests(struct cfq_data *cfqd, struct 
cfq_queue *cfqq,
 {
int dispatched = 0;
 
-   BUG_ON(RB_EMPTY_ROOT(&cfqq->sort_list));
+ 
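
[The archived copy of this patch is cut off above, so the elevator.h hunk
that defines the new type is missing. A rough sketch of what the usage in
the hunks implies, mirroring the cfq_rb_root code posted earlier in the
thread - the bodies below are reconstructions, not necessarily the
committed code:]

/*
 * Sketch only: inferred from how as-iosched.c and cfq-iosched.c use
 * these names in the hunks above.
 */
struct blk_rb_root {
	struct rb_root rb;
	struct rb_node *left;	/* cached leftmost node */
};

#define BLK_RB_ROOT	(struct blk_rb_root) { RB_ROOT, NULL, }
#define BLK_RB_EMPTY_ROOT(root)	RB_EMPTY_ROOT(&(root)->rb)

/*
 * Presumably returns the request at the leftmost node, priming the
 * cache on the first call, or NULL if the tree is empty.
 */
static inline struct request *elv_rb_first(struct blk_rb_root *root)
{
	if (!root->left)
		root->left = rb_first(&root->rb);

	if (root->left)
		return rb_entry_rq(root->left);

	return NULL;
}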

Re: [PATCH 5/15] cfq-iosched: speed up rbtree handling

2007-04-25 Thread Jens Axboe
On Wed, Apr 25 2007, Alan D. Brunelle wrote:
> Hi Jens -
> 
> The attached patch speeds it up even more - I'm finding a >9% reduction 
> in %system with no loss in IO performance. This just sets the cached 
> element when the first is looked for.

Interesting, good thinking. It should not change the IO pattern, as the
end result should be the same. Thanks Alan, will commit!

I'll give elevator.c the same treatment, should be even more beneficial.
Stay tuned for a test patch.

-- 
Jens Axboe

Re: [PATCH 5/15] cfq-iosched: speed up rbtree handling

2007-04-25 Thread Alan D. Brunelle

Hi Jens -

The attached patch speeds it up even more - I'm finding a >9% reduction 
in %system with no loss in IO performance. This just sets the cached 
element when the first is looked for.


Alan
From: Alan D. Brunelle <[EMAIL PROTECTED]>

Update cached leftmost every time it is found.

Signed-off-by: Alan D. Brunelle <[EMAIL PROTECTED]>
---

 block/cfq-iosched.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 8093733..a86a7c3 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -388,10 +388,10 @@ cfq_choose_req(struct cfq_data *cfqd, struct request 
*rq1, struct request *rq2)
  */
 static struct rb_node *cfq_rb_first(struct cfq_rb_root *root)
 {
-   if (root->left)
-   return root->left;
+   if (!root->left)
+   root->left = rb_first(&root->rb);
 
-   return rb_first(&root->rb);
+   return root->left;
 }
 
 static void cfq_rb_erase(struct rb_node *n, struct cfq_rb_root *root)


[PATCH 5/15] cfq-iosched: speed up rbtree handling

2007-04-24 Thread Jens Axboe
For cases where the rbtree is mainly used for sorting and min retrieval,
a nice speedup of the rbtree code is to maintain a cache of the leftmost
node in the tree.

Also spotted in the CFS CPU scheduler code.

Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>
---
 block/cfq-iosched.c |   62 +++---
 1 files changed, 48 insertions(+), 14 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index ad29a99..7f964ee 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -70,6 +70,18 @@ static struct completion *ioc_gone;
 #define sample_valid(samples)  ((samples) > 80)
 
 /*
+ * Most of our rbtree usage is for sorting with min extraction, so
+ * if we cache the leftmost node we don't have to walk down the tree
+ * to find it. Idea borrowed from Ingo Molnar's CFS scheduler. We should
+ * move this into the elevator for the rq sorting as well.
+ */
+struct cfq_rb_root {
+   struct rb_root rb;
+   struct rb_node *left;
+};
+#define CFQ_RB_ROOT	(struct cfq_rb_root) { RB_ROOT, NULL, }
+
+/*
  * Per block device queue structure
  */
 struct cfq_data {
@@ -78,7 +90,7 @@ struct cfq_data {
/*
 * rr list of queues with requests and the count of them
 */
-   struct rb_root service_tree;
+   struct cfq_rb_root service_tree;
struct list_head cur_rr;
struct list_head idle_rr;
unsigned int busy_queues;
@@ -378,6 +390,23 @@ cfq_choose_req(struct cfq_data *cfqd, struct request *rq1, 
struct request *rq2)
}
 }
 
+static struct rb_node *cfq_rb_first(struct cfq_rb_root *root)
+{
+   if (root->left)
+   return root->left;
+
+   return rb_first(&root->rb);
+}
+
+static void cfq_rb_erase(struct rb_node *n, struct cfq_rb_root *root)
+{
+   if (root->left == n)
+   root->left = NULL;
+
+   rb_erase(n, &root->rb);
+   RB_CLEAR_NODE(n);
+}
+
 /*
  * would be nice to take fifo expire time into account as well
  */
@@ -417,10 +446,10 @@ static unsigned long cfq_slice_offset(struct cfq_data 
*cfqd,
 static void cfq_service_tree_add(struct cfq_data *cfqd,
struct cfq_queue *cfqq)
 {
-   struct rb_node **p = &cfqd->service_tree.rb_node;
+   struct rb_node **p = &cfqd->service_tree.rb.rb_node;
struct rb_node *parent = NULL;
-   struct cfq_queue *__cfqq;
unsigned long rb_key;
+   int left = 1;
 
rb_key = cfq_slice_offset(cfqd, cfqq) + jiffies;
rb_key += cfqq->slice_resid;
@@ -433,22 +462,29 @@ static void cfq_service_tree_add(struct cfq_data *cfqd,
if (rb_key == cfqq->rb_key)
return;
 
-   rb_erase(&cfqq->rb_node, &cfqd->service_tree);
+   cfq_rb_erase(&cfqq->rb_node, &cfqd->service_tree);
}
 
while (*p) {
+   struct cfq_queue *__cfqq;
+
parent = *p;
__cfqq = rb_entry(parent, struct cfq_queue, rb_node);
 
if (rb_key < __cfqq->rb_key)
p = &(*p)->rb_left;
-   else
+   else {
p = &(*p)->rb_right;
+   left = 0;
+   }
}
 
+   if (left)
+   cfqd->service_tree.left = &cfqq->rb_node;
+
cfqq->rb_key = rb_key;
rb_link_node(&cfqq->rb_node, parent, p);
-   rb_insert_color(&cfqq->rb_node, &cfqd->service_tree);
+   rb_insert_color(&cfqq->rb_node, &cfqd->service_tree.rb);
 }
 
 static void cfq_resort_rr_list(struct cfq_queue *cfqq, int preempted)
@@ -509,10 +545,8 @@ cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue 
*cfqq)
cfq_clear_cfqq_on_rr(cfqq);
list_del_init(&cfqq->cfq_list);
 
-   if (!RB_EMPTY_NODE(&cfqq->rb_node)) {
-   rb_erase(&cfqq->rb_node, &cfqd->service_tree);
-   RB_CLEAR_NODE(&cfqq->rb_node);
-   }
+   if (!RB_EMPTY_NODE(&cfqq->rb_node))
+   cfq_rb_erase(&cfqq->rb_node, &cfqd->service_tree);
 
BUG_ON(!cfqd->busy_queues);
cfqd->busy_queues--;
@@ -758,8 +792,8 @@ static struct cfq_queue *cfq_get_next_queue(struct cfq_data 
*cfqd)
 * if current list is non-empty, grab first entry.
 */
cfqq = list_entry_cfqq(cfqd->cur_rr.next);
-   } else if (!RB_EMPTY_ROOT(&cfqd->service_tree)) {
-   struct rb_node *n = rb_first(&cfqd->service_tree);
+   } else if (!RB_EMPTY_ROOT(&cfqd->service_tree.rb)) {
+   struct rb_node *n = cfq_rb_first(&cfqd->service_tree);
 
cfqq = rb_entry(n, struct cfq_queue, rb_node);
} else if (!list_empty(&cfqd->idle_rr)) {
@@ -1030,7 +1064,7 @@ static int cfq_forced_dispatch(struct cfq_data *cfqd)
int dispatched = 0;
struct rb_node *n;
 
-   while ((n = rb_first(&cfqd->service_tree)) != NULL) {
+   while ((n = cfq_rb_first(&cfqd->service_tree)) != NULL) {
struct cfq_queue *cfqq = rb_entry(n, struct cfq_queue, rb_node);
 
dispatched += __cfq_forced_dispatch_cfqq(cfqq);
@@ -2014,7 +2048,7 @@ static void 
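
[The archive truncates this original posting mid-diff as well. The idea
the changelog describes is independent of the rbtree itself; below is a
minimal standalone illustration in userspace C, using a plain unbalanced
BST instead of the kernel rbtree, with invented names (tree, tree_insert,
tree_first). The real patch additionally invalidates the cache in
cfq_rb_erase(), and Alan's follow-up adds the cache-on-lookup shown in
tree_first():]

#include <stdio.h>

struct node {
	long key;
	struct node *left, *right;
};

struct tree {
	struct node *root;
	struct node *leftmost;	/* cached minimum, as in the patch */
};

static void tree_insert(struct tree *t, struct node *n)
{
	struct node **p = &t->root;
	int is_leftmost = 1;	/* mirrors the 'left' flag in the patch */

	while (*p) {
		if (n->key < (*p)->key)
			p = &(*p)->left;
		else {
			p = &(*p)->right;
			is_leftmost = 0;	/* went right at least once */
		}
	}
	n->left = n->right = NULL;
	*p = n;
	if (is_leftmost)
		t->leftmost = n;	/* min retrieval is now O(1) */
}

static struct node *tree_first(struct tree *t)
{
	struct node *n;

	if (t->leftmost)
		return t->leftmost;

	for (n = t->root; n && n->left; n = n->left)
		;			/* O(height) fallback walk */
	t->leftmost = n;		/* cache the walk result */
	return n;
}

int main(void)
{
	struct tree t = { NULL, NULL };
	struct node n[4] = { { 30 }, { 10 }, { 20 }, { 5 } };
	int i;

	for (i = 0; i < 4; i++)
		tree_insert(&t, &n[i]);

	printf("min = %ld\n", tree_first(&t)->key);	/* prints "min = 5" */
	return 0;
}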
