Re: [PATCH][RFC] Use bio markers for request callback
Hi Hannes, On Tue, 11 Sep 2007 12:44:46 +0200, Hannes Reinecke <[EMAIL PROTECTED]> wrote: > this is a proposal for a different implementation of request > callbacks. The existing ->endio callback of a request is actually a > destructor function, to be called to terminate a request and free all > structures. > > However, on certain occasions (like request-based multipathing) it is > desirable to have a callback function for a request which is called > right after the request is finished, ie in end_that_request_first() > before any bio->bi_endio callback is called. > > So a simple solution for this is to clone the request and add a new > 'marker' bio in front of the bio list of the request. This callback > will be attached a structure in bi_private which keeps a pointer to > the cloned and the original request, thus serving as a callback for > the request itself. Thank you for another idea for request-based multipath. However, I disagree with the idea. I think the design (bio->bi_end_io() completes a request) is complex a little bit and is breaking layer stacking. Also, I think that the design of completion handling by 2 hooks, bio->bi_end_io() and rq->end_io(), makes error handing in dm-multipath complex. e.g. Even if we detect an error on the first bio->bi_end_io() hook, the request can't be retried until the second rq->end_io() hook is called, because the ownership of the request is still on low level drivers until the low level driver calls end_that_request_last(). Although this is just implementation issue, I think that the patch can't handle leftover. SCSI, scsi_end_request(), may have some leftovers in the request. Then, the marker bio is freed in bio->bi_end_io() and the request is resubmitted in SCSI level. So the resubmitted request doesn't have the marker bio any more. Thanks, Kiyoshi Ueda - To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][RFC] Use bio markers for request callback
Am Tue 11 Sep 2007 01:11:34 PM CEST schrieb Jens Axboe <[EMAIL PROTECTED]>: On Tue, Sep 11 2007, Hannes Reinecke wrote: Hi all, this is a proposal for a different implementation of request callbacks. The existing ->endio callback of a request is actually a destructor function, to be called to terminate a request and free all structures. However, on certain occasions (like request-based multipathing) it is desirable to have a callback function for a request which is called right after the request is finished, ie in end_that_request_first() before any bio->bi_endio callback is called. So a simple solution for this is to clone the request and add a new 'marker' bio in front of the bio list of the request. This callback will be attached a structure in bi_private which keeps a pointer to the cloned and the original request, thus serving as a callback for the request itself. Proposed patch attached. As usual comments are welcome. Honestly, I think this approach is a much worse design than the NEC callback option. Exporting __make_request() (which is an INTERNAL function) aside, this looks like one big hack with pseudo bio attached to front of list, the enable/disable elevator stuff (does that even work on stacked devices?). Hmm; agreed. Main reason for the patch was to show an alternative. Having callbacks within callbacks is not the most straightforward way. But if you're okay with it: you're the maintainer, you get to decide. Exporting __make_request() is just for enabling elevator support based on the target type. Of course this can be done differently. Main point here was the marker bio idea. If that's rejected everything else is moot to discuss. Cheers, Hannes --- No .sig today, my .message went away ... - To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][RFC] Use bio markers for request callback
On Tue, Sep 11 2007, Hannes Reinecke wrote: > Hi all, > > this is a proposal for a different implementation of request callbacks. The > existing ->endio callback of a request is actually a destructor function, > to be called to terminate a request and free all structures. > > However, on certain occasions (like request-based multipathing) it is > desirable to have a callback function for a request which is called right > after the request is finished, ie in end_that_request_first() before any > bio->bi_endio callback is called. > > So a simple solution for this is to clone the request and add a new > 'marker' bio in front of the bio list of the request. This callback will be > attached a structure in bi_private which keeps a pointer to the cloned and > the original request, thus serving as a callback for the request itself. > > Proposed patch attached. As usual comments are welcome. Honestly, I think this approach is a much worse design than the NEC callback option. Exporting __make_request() (which is an INTERNAL function) aside, this looks like one big hack with pseudo bio attached to front of list, the enable/disable elevator stuff (does that even work on stacked devices?). -- Jens Axboe - To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH][RFC] Use bio markers for request callback
Hi all, this is a proposal for a different implementation of request callbacks. The existing ->endio callback of a request is actually a destructor function, to be called to terminate a request and free all structures. However, on certain occasions (like request-based multipathing) it is desirable to have a callback function for a request which is called right after the request is finished, ie in end_that_request_first() before any bio->bi_endio callback is called. So a simple solution for this is to clone the request and add a new 'marker' bio in front of the bio list of the request. This callback will be attached a structure in bi_private which keeps a pointer to the cloned and the original request, thus serving as a callback for the request itself. Proposed patch attached. As usual comments are welcome. Cheers, Hannes diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c index a15845c..c286f05 100644 --- a/block/ll_rw_blk.c +++ b/block/ll_rw_blk.c @@ -40,7 +40,6 @@ static void blk_unplug_work(struct work_struct *work); static void blk_unplug_timeout(unsigned long data); static void drive_stat_acct(struct request *rq, int nr_sectors, int new_io); static void init_request_from_bio(struct request *req, struct bio *bio); -static int __make_request(struct request_queue *q, struct bio *bio); static struct io_context *current_io_context(gfp_t gfp_flags, int node); /* @@ -2912,7 +2911,7 @@ static void init_request_from_bio(struct request *req, struct bio *bio) req->start_time = jiffies; } -static int __make_request(struct request_queue *q, struct bio *bio) +int __make_request(struct request_queue *q, struct bio *bio) { struct request *req; int el_ret, nr_sectors, barrier, err; @@ -3030,6 +3029,7 @@ end_io: bio_endio(bio, nr_sectors << 9, err); return 0; } +EXPORT_SYMBOL(__make_request); /* * If bio->bi_dev is a partition, remap the location diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index d6ca9d0..76acda2 100644 --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -1,6 +1,7 @@ /* * Copyright (C) 2003 Sistina Software Limited. * Copyright (C) 2004-2005 Red Hat, Inc. All rights reserved. + * Copyright (C) 2007 NEC Corporation. * * This file is released under the GPL. */ @@ -8,8 +9,7 @@ #include "dm.h" #include "dm-path-selector.h" #include "dm-hw-handler.h" -#include "dm-bio-list.h" -#include "dm-bio-record.h" +#include "dm-rq-record.h" #include #include @@ -77,7 +77,7 @@ struct multipath { unsigned saved_queue_if_no_path;/* Saved state during suspension */ struct work_struct process_queued_ios; - struct bio_list queued_ios; + struct list_head queued_ios; unsigned queue_size; struct work_struct trigger_event; @@ -94,7 +94,7 @@ struct multipath { */ struct dm_mpath_io { struct pgpath *pgpath; - struct dm_bio_details details; + struct dm_rq_details details; }; typedef int (*action_fn) (struct pgpath *pgpath); @@ -171,6 +171,7 @@ static struct multipath *alloc_multipath(struct dm_target *ti) m = kzalloc(sizeof(*m), GFP_KERNEL); if (m) { INIT_LIST_HEAD(&m->priority_groups); + INIT_LIST_HEAD(&m->queued_ios); spin_lock_init(&m->lock); m->queue_io = 1; INIT_WORK(&m->process_queued_ios, process_queued_ios); @@ -180,6 +181,9 @@ static struct multipath *alloc_multipath(struct dm_target *ti) kfree(m); return NULL; } + + dm_table_enable_elevator(ti->table); + m->ti = ti; ti->private = m; } @@ -202,6 +206,8 @@ static void free_multipath(struct multipath *m) dm_put_hw_handler(hwh->type); } + dm_table_disable_elevator(m->ti->table); + mempool_destroy(m->mpio_pool); kfree(m); } @@ -299,7 +305,7 @@ static int __must_push_back(struct multipath *m) dm_noflush_suspending(m->ti)); } -static int map_io(struct multipath *m, struct bio *bio, +static int map_io(struct multipath *m, struct request *clone, struct dm_mpath_io *mpio, unsigned was_queued) { int r = DM_MAPIO_REMAPPED; @@ -321,22 +327,30 @@ static int map_io(struct multipath *m, struct bio *bio, if ((pgpath && m->queue_io) || (!pgpath && m->queue_if_no_path)) { /* Queue for the daemon to resubmit */ - bio_list_add(&m->queued_ios, bio); + list_add_tail(&clone->queuelist, &m->queued_ios); m->queue_size++; if ((m->pg_init_required && !m->pg_init_in_progress) || !m->queue_io) queue_work(kmultipathd, &m->process_queued_ios); pgpath = NULL; + clone->q = NULL; + clone->rq_disk = NULL; r = DM_MAPIO_SUBMITTED; - } else if (pgpath) - bio->bi_bdev = pgpath->path.dev->bdev; - else if (__must_push_back(m)) - r = DM_MAPIO_REQUEUE; - else + } else if (pgpath) { + clone->q = bdev_get_queue(pgpath->path.dev->bdev); + clone->rq_disk = pgpath->path.dev->bdev->bd_disk; + } else { + printk(KERN_INFO "request %p: failed to map\n", clone); + clone->q = NULL; + clone->rq_disk = NULL; r = -EIO; /* Failed */ + } mpio->pgpath = pgpath; + if (r == DM_MAPIO_REMAPPED && m->curr