Re: [PATCH][RFC] Use bio markers for request callback

2007-09-11 Thread Kiyoshi Ueda
Hi Hannes,

On Tue, 11 Sep 2007 12:44:46 +0200, Hannes Reinecke <[EMAIL PROTECTED]> wrote:
> this is a proposal for a different implementation of request  
> callbacks. The existing ->endio callback of a request is actually a  
> destructor function, to be called to terminate a request and free all  
> structures.
> 
> However, on certain occasions (like request-based multipathing) it is  
> desirable to have a callback function for a request which is called  
> right after the request is finished, ie in end_that_request_first()  
> before any bio->bi_endio callback is called.
> 
> So a simple solution for this is to clone the request and add a new  
> 'marker' bio in front of the bio list of the request. This callback  
> will be attached a structure in bi_private which keeps a pointer to  
> the cloned and the original request, thus serving as a callback for  
> the request itself.

Thank you for another idea for request-based multipath.

However, I disagree with the idea.
I think the design (bio->bi_end_io() completes a request) is complex
a little bit and is breaking layer stacking.
Also, I think that the design of completion handling by 2 hooks,
bio->bi_end_io() and rq->end_io(), makes error handing in dm-multipath
complex.
e.g. Even if we detect an error on the first bio->bi_end_io() hook,
 the request can't be retried until the second rq->end_io() hook
 is called, because the ownership of the request is still on
 low level drivers until the low level driver calls
 end_that_request_last().


Although this is just implementation issue, I think that the patch
can't handle leftover.
SCSI, scsi_end_request(), may have some leftovers in the request.
Then, the marker bio is freed in bio->bi_end_io() and the request
is resubmitted in SCSI level.  So the resubmitted request
doesn't have the marker bio any more.

Thanks,
Kiyoshi Ueda
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][RFC] Use bio markers for request callback

2007-09-11 Thread Hannes Reinecke
Am Tue 11 Sep 2007 01:11:34 PM CEST schrieb Jens Axboe  
<[EMAIL PROTECTED]>:



On Tue, Sep 11 2007, Hannes Reinecke wrote:

Hi all,

this is a proposal for a different implementation of request callbacks. The
existing ->endio callback of a request is actually a destructor function,
to be called to terminate a request and free all structures.

However, on certain occasions (like request-based multipathing) it is
desirable to have a callback function for a request which is called right
after the request is finished, ie in end_that_request_first() before any
bio->bi_endio callback is called.

So a simple solution for this is to clone the request and add a new
'marker' bio in front of the bio list of the request. This callback will be
attached a structure in bi_private which keeps a pointer to the cloned and
the original request, thus serving as a callback for the request itself.

Proposed patch attached. As usual comments are welcome.


Honestly, I think this approach is a much worse design than the NEC
callback option. Exporting __make_request() (which is an INTERNAL
function) aside, this looks like one big hack with pseudo bio attached
to front of list, the enable/disable elevator stuff (does that even work
on stacked devices?).


Hmm; agreed. Main reason for the patch was to show an alternative.
Having callbacks within callbacks is not the most straightforward way.
But if you're okay with it: you're the maintainer, you get to decide.

Exporting __make_request() is just for enabling elevator support based
on the target type. Of course this can be done differently.

Main point here was the marker bio idea. If that's rejected everything else
is moot to discuss.

Cheers,

Hannes
---
No .sig today, my .message went away ...
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][RFC] Use bio markers for request callback

2007-09-11 Thread Jens Axboe
On Tue, Sep 11 2007, Hannes Reinecke wrote:
> Hi all,
>
> this is a proposal for a different implementation of request callbacks. The 
> existing ->endio callback of a request is actually a destructor function, 
> to be called to terminate a request and free all structures.
>
> However, on certain occasions (like request-based multipathing) it is 
> desirable to have a callback function for a request which is called right 
> after the request is finished, ie in end_that_request_first() before any 
> bio->bi_endio callback is called.
>
> So a simple solution for this is to clone the request and add a new 
> 'marker' bio in front of the bio list of the request. This callback will be 
> attached a structure in bi_private which keeps a pointer to the cloned and 
> the original request, thus serving as a callback for the request itself.
>
> Proposed patch attached. As usual comments are welcome.

Honestly, I think this approach is a much worse design than the NEC
callback option. Exporting __make_request() (which is an INTERNAL
function) aside, this looks like one big hack with pseudo bio attached
to front of list, the enable/disable elevator stuff (does that even work
on stacked devices?).

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH][RFC] Use bio markers for request callback

2007-09-11 Thread Hannes Reinecke

Hi all,

this is a proposal for a different implementation of request  
callbacks. The existing ->endio callback of a request is actually a  
destructor function, to be called to terminate a request and free all  
structures.


However, on certain occasions (like request-based multipathing) it is  
desirable to have a callback function for a request which is called  
right after the request is finished, ie in end_that_request_first()  
before any bio->bi_endio callback is called.


So a simple solution for this is to clone the request and add a new  
'marker' bio in front of the bio list of the request. This callback  
will be attached a structure in bi_private which keeps a pointer to  
the cloned and the original request, thus serving as a callback for  
the request itself.


Proposed patch attached. As usual comments are welcome.

Cheers,

Hannes
diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index a15845c..c286f05 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -40,7 +40,6 @@ static void blk_unplug_work(struct work_struct *work);
 static void blk_unplug_timeout(unsigned long data);
 static void drive_stat_acct(struct request *rq, int nr_sectors, int new_io);
 static void init_request_from_bio(struct request *req, struct bio *bio);
-static int __make_request(struct request_queue *q, struct bio *bio);
 static struct io_context *current_io_context(gfp_t gfp_flags, int node);
 
 /*
@@ -2912,7 +2911,7 @@ static void init_request_from_bio(struct request *req, struct bio *bio)
 	req->start_time = jiffies;
 }
 
-static int __make_request(struct request_queue *q, struct bio *bio)
+int __make_request(struct request_queue *q, struct bio *bio)
 {
 	struct request *req;
 	int el_ret, nr_sectors, barrier, err;
@@ -3030,6 +3029,7 @@ end_io:
 	bio_endio(bio, nr_sectors << 9, err);
 	return 0;
 }
+EXPORT_SYMBOL(__make_request);
 
 /*
  * If bio->bi_dev is a partition, remap the location
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index d6ca9d0..76acda2 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1,6 +1,7 @@
 /*
  * Copyright (C) 2003 Sistina Software Limited.
  * Copyright (C) 2004-2005 Red Hat, Inc. All rights reserved.
+ * Copyright (C) 2007 NEC Corporation.
  *
  * This file is released under the GPL.
  */
@@ -8,8 +9,7 @@
 #include "dm.h"
 #include "dm-path-selector.h"
 #include "dm-hw-handler.h"
-#include "dm-bio-list.h"
-#include "dm-bio-record.h"
+#include "dm-rq-record.h"
 
 #include 
 #include 
@@ -77,7 +77,7 @@ struct multipath {
 	unsigned saved_queue_if_no_path;/* Saved state during suspension */
 
 	struct work_struct process_queued_ios;
-	struct bio_list queued_ios;
+	struct list_head queued_ios;
 	unsigned queue_size;
 
 	struct work_struct trigger_event;
@@ -94,7 +94,7 @@ struct multipath {
  */
 struct dm_mpath_io {
 	struct pgpath *pgpath;
-	struct dm_bio_details details;
+	struct dm_rq_details details;
 };
 
 typedef int (*action_fn) (struct pgpath *pgpath);
@@ -171,6 +171,7 @@ static struct multipath *alloc_multipath(struct dm_target *ti)
 	m = kzalloc(sizeof(*m), GFP_KERNEL);
 	if (m) {
 		INIT_LIST_HEAD(&m->priority_groups);
+		INIT_LIST_HEAD(&m->queued_ios);
 		spin_lock_init(&m->lock);
 		m->queue_io = 1;
 		INIT_WORK(&m->process_queued_ios, process_queued_ios);
@@ -180,6 +181,9 @@ static struct multipath *alloc_multipath(struct dm_target *ti)
 			kfree(m);
 			return NULL;
 		}
+
+		dm_table_enable_elevator(ti->table);
+
 		m->ti = ti;
 		ti->private = m;
 	}
@@ -202,6 +206,8 @@ static void free_multipath(struct multipath *m)
 		dm_put_hw_handler(hwh->type);
 	}
 
+	dm_table_disable_elevator(m->ti->table);
+
 	mempool_destroy(m->mpio_pool);
 	kfree(m);
 }
@@ -299,7 +305,7 @@ static int __must_push_back(struct multipath *m)
 		dm_noflush_suspending(m->ti));
 }
 
-static int map_io(struct multipath *m, struct bio *bio,
+static int map_io(struct multipath *m, struct request *clone,
 		  struct dm_mpath_io *mpio, unsigned was_queued)
 {
 	int r = DM_MAPIO_REMAPPED;
@@ -321,22 +327,30 @@ static int map_io(struct multipath *m, struct bio *bio,
 	if ((pgpath && m->queue_io) ||
 	(!pgpath && m->queue_if_no_path)) {
 		/* Queue for the daemon to resubmit */
-		bio_list_add(&m->queued_ios, bio);
+		list_add_tail(&clone->queuelist, &m->queued_ios);
 		m->queue_size++;
 		if ((m->pg_init_required && !m->pg_init_in_progress) ||
 		!m->queue_io)
 			queue_work(kmultipathd, &m->process_queued_ios);
 		pgpath = NULL;
+		clone->q = NULL;
+		clone->rq_disk = NULL;
 		r = DM_MAPIO_SUBMITTED;
-	} else if (pgpath)
-		bio->bi_bdev = pgpath->path.dev->bdev;
-	else if (__must_push_back(m))
-		r = DM_MAPIO_REQUEUE;
-	else
+	} else if (pgpath) {
+		clone->q = bdev_get_queue(pgpath->path.dev->bdev);
+		clone->rq_disk = pgpath->path.dev->bdev->bd_disk;
+	} else {
+		printk(KERN_INFO "request %p: failed to map\n", clone);
+		clone->q = NULL;
+		clone->rq_disk = NULL;
 		r = -EIO;	/* Failed */
+	}
 
 	mpio->pgpath = pgpath;
 
+	if (r == DM_MAPIO_REMAPPED && m->curr