Re: [PATCH 026 of 35] Split any large bios that arrive at __make_request.
On Thursday August 2, [EMAIL PROTECTED] wrote:
> Neil Brown wrote:
> >
> > If you confirm that 027 isn't applying, I'll track down what happened.
>
> You're right. I don't have patch 27. Looking Ummm... It's not in
> my LKML folder either. Can you resend it?
>
> Thanks.
>
> --
> tejun

It definitely got out:
   http://lkml.org/lkml/2007/7/30/504
but here it is.

Thanks,
NeilBrown

Subject: Remove bi_XXX_segments and related code.

__make_request now handles bios with too many segments, and it tracks
segment counts in 'struct request' so we no longer need to track the
counts in each bio, or to check the counts when adding a page to a
bio.

So bi_phys_segments, bi_hw_segments, blk_recount_segments(),
BIO_SEG_VALID, bio_phys_segments and bio_hw_segments can all go.

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>

### Diffstat output
 ./Documentation/block/biodoc.txt |    2 -
 ./block/ll_rw_blk.c              |   18 --
 ./drivers/md/dm.c                |    1
 ./drivers/md/raid1.c             |    5
 ./drivers/md/raid10.c            |    5
 ./drivers/md/raid5.c             |    5
 ./drivers/scsi/scsi_lib.c        |    1
 ./fs/bio.c                       |   47 ---
 ./include/linux/bio.h            |   14 ---
 ./include/linux/blkdev.h         |    1
 10 files changed, 1 insertion(+), 98 deletions(-)

diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c
--- .prev/block/ll_rw_blk.c	2007-07-31 11:21:20.0 +1000
+++ ./block/ll_rw_blk.c	2007-07-31 11:21:22.0 +1000
@@ -1193,19 +1193,6 @@ void blk_dump_rq_flags(struct request *r
 
 EXPORT_SYMBOL(blk_dump_rq_flags);
 
-void blk_recount_segments(struct request_queue *q, struct bio *bio)
-{
-	struct request rq;
-	rq.q = q;
-	rq.bio = rq.biotail = bio;
-	rq.first_offset = 0;
-	blk_recalc_rq_segments(&rq);
-	bio->bi_phys_segments = rq.nr_phys_segments;
-	bio->bi_hw_segments = rq.nr_hw_segments;
-	bio->bi_flags |= (1 << BIO_SEG_VALID);
-}
-EXPORT_SYMBOL(blk_recount_segments);
-
 static void blk_recalc_rq_segments(struct request *rq)
 {
 	int nr_phys_segs;
@@ -1326,11 +1313,6 @@ static int blk_phys_contig_segment(struc
 static int blk_hw_contig_segment(struct request_queue *q, struct request *req,
 				 struct request *nxt)
 {
-	if (unlikely(!bio_flagged(req->biotail, BIO_SEG_VALID)))
-		blk_recount_segments(q, req->biotail);
-	if (unlikely(!bio_flagged(nxt->bio, BIO_SEG_VALID)))
-		blk_recount_segments(q, nxt->bio);
-
 	if (!rq_virt_mergeable(req, nxt) ||
 	    BIOVEC_VIRT_OVERSIZE(req->hw_back_size + nxt->hw_front_size))

diff .prev/Documentation/block/biodoc.txt ./Documentation/block/biodoc.txt
--- .prev/Documentation/block/biodoc.txt	2007-07-31 11:21:06.0 +1000
+++ ./Documentation/block/biodoc.txt	2007-07-31 11:21:22.0 +1000
@@ -456,8 +456,6 @@ struct bio {
        unsigned int	bi_idx;          /* current index into bio_vec array */
 
        unsigned int	bi_size;     /* total size in bytes */
-       unsigned short	bi_phys_segments; /* segments after physaddr coalesce*/
-       unsigned short	bi_hw_segments; /* segments after DMA remapping */
        unsigned int	bi_max;	     /* max bio_vecs we can hold
                                        used as index into pool */
        struct bio_vec	*bi_io_vec;  /* the actual vec list */

diff .prev/drivers/md/dm.c ./drivers/md/dm.c
--- .prev/drivers/md/dm.c	2007-07-31 11:21:03.0 +1000
+++ ./drivers/md/dm.c	2007-07-31 11:21:22.0 +1000
@@ -660,7 +660,6 @@ static struct bio *clone_bio(struct bio
 	clone->bi_io_vec += idx;
 	clone->bi_vcnt = bv_count;
 	clone->bi_size = to_bytes(len);
-	clone->bi_flags &= ~(1 << BIO_SEG_VALID);
 
 	return clone;
 }

diff .prev/drivers/md/raid10.c ./drivers/md/raid10.c
--- .prev/drivers/md/raid10.c	2007-07-31 11:21:07.0 +1000
+++ ./drivers/md/raid10.c	2007-07-31 11:21:22.0 +1000
@@ -1277,8 +1277,6 @@ static void sync_request_write(mddev_t *
 		 */
 		tbio->bi_vcnt = vcnt;
 		tbio->bi_size = r10_bio->sectors << 9;
-		tbio->bi_phys_segments = 0;
-		tbio->bi_hw_segments = 0;
 		tbio->bi_flags &= ~(BIO_POOL_MASK - 1);
 		tbio->bi_flags |= 1 << BIO_UPTODATE;
 		tbio->bi_next = NULL;
@@ -1883,8 +1881,6 @@ static sector_t sync_request(mddev_t *md
 			if (bio->bi_end_io)
 				bio->bi_flags |= 1 << BIO_UPTODATE;
 			bio->bi_vcnt = 0;
-			bio->bi_phys_segments = 0;
-			bio->bi_hw_segments = 0;
 			bio->bi_size = 0;
 		}
 
@@ -1909,7 +1905,6 @@ static sector_t sync_request(mddev_t *md
 				/* remove last page from this bio */
Re: [PATCH 026 of 35] Split any large bios that arrive at __make_request.
Neil Brown wrote:
> On Thursday August 2, [EMAIL PROTECTED] wrote:
>> Hmmm... Patches don't apply beyond this one. I'm applying against
>> clean 2.6.23-rc1-mm1 grabbed using ketchup.
>>
>
> So do you mean 027 doesn't apply, or that 028 doesn't apply next?
>
> It is possible that you missed 027. It originally had 3 consecutive
> Xs in the subject line, so vger.kernel.org bounced it.
> I re-sent it, but it would have had a different References header and
> so it might not appear in the same thread.
>
> If you confirm that 027 isn't applying, I'll track down what happened.

You're right. I don't have patch 27. Looking Ummm... It's not in
my LKML folder either. Can you resend it?

Thanks.

--
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 026 of 35] Split any large bios that arrive at __make_request.
On Thursday August 2, [EMAIL PROTECTED] wrote:
> Hmmm... Patches don't apply beyond this one. I'm applying against
> clean 2.6.23-rc1-mm1 grabbed using ketchup.
>

So do you mean 027 doesn't apply, or that 028 doesn't apply next?

It is possible that you missed 027. It originally had 3 consecutive
Xs in the subject line, so vger.kernel.org bounced it.
I re-sent it, but it would have had a different References header and
so it might not appear in the same thread.

If you confirm that 027 isn't applying, I'll track down what happened.

Thanks,
NeilBrown
Re: [PATCH 026 of 35] Split any large bios that arrive at __make_request.
Hmmm... Patches don't apply beyond this one. I'm applying against
clean 2.6.23-rc1-mm1 grabbed using ketchup.

--
tejun
[PATCH 026 of 35] Split any large bios that arrive at __make_request.
Now that bi_io_vec and bio can be shared, we can handle arbitrarily
large bios in __make_request by splitting them over multiple
requests.
If we do split a request, we mark both halves as "REQ_NOMERGE". It is
only really necessary to mark the first part as NO_BACK_MERGE and the
second part as NO_FRONT_MERGE but that distinction isn't currently
supported.

Note that we do not try to merge part of a large bio to a neighbouring
request. That is a possible future enhancement.

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>

### Diffstat output
 ./block/ll_rw_blk.c      |  122 +++
 ./include/linux/blkdev.h |    5 +
 2 files changed, 107 insertions(+), 20 deletions(-)

diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c
--- .prev/block/ll_rw_blk.c	2007-07-31 11:21:15.0 +1000
+++ ./block/ll_rw_blk.c	2007-07-31 11:21:20.0 +1000
@@ -1221,13 +1221,21 @@ static void blk_recalc_rq_segments(struc
 	struct req_iterator i;
 	int high, highprv = 1;
 	struct request_queue *q = rq->q;
+	int curr_size = 0;
+	unsigned short max_sectors;
 
 	if (!rq->bio)
 		return;
 
+	if (unlikely(blk_pc_request(rq)))
+		max_sectors = q->max_hw_sectors;
+	else
+		max_sectors = q->max_sectors;
+
 	cluster = q->queue_flags & (1 << QUEUE_FLAG_CLUSTER);
 	hw_seg_size = seg_size = 0;
 	phys_size = hw_size = nr_phys_segs = nr_hw_segs = 0;
+	rq->max_allowed_size = 0;
 	rq_for_each_segment(rq, i, bv) {
 		/*
 		 * the trick here is making sure that a high page is never
@@ -1249,9 +1257,7 @@ static void blk_recalc_rq_segments(struc
 
 			seg_size += bv.bv_len;
 			hw_seg_size += bv.bv_len;
-			bvprv = bv;
-			prvidx = i.i.i;
-			continue;
+			goto same_seg;
 		}
 new_segment:
 		if (BIOVEC_VIRT_MERGEABLE(&bvprv, &bv) &&
@@ -1267,11 +1273,19 @@ new_hw_segment:
 		}
 
 		nr_phys_segs++;
+		seg_size = bv.bv_len;
+same_seg:
+		curr_size += bv.bv_len;
 		bvprv = bv;
 		prvidx = i.i.i;
-		seg_size = bv.bv_len;
 		highprv = high;
+
+		if (curr_size <= (max_sectors << 9) &&
+		    nr_phys_segs <= q->max_phys_segments &&
+		    nr_hw_segs <= q->max_hw_segments)
+			rq->max_allowed_size = curr_size;
 	}
+
 	rq->last_len = bvprv.bv_offset + bvprv.bv_len;
 	rq->last_idx = prvidx;
 
@@ -2924,6 +2938,70 @@ static void init_request_from_bio(struct
 	blk_rq_bio_prep(req->q, req, bio);
 }
 
+static void rq_split(struct request *orig, struct request *new)
+{
+
+	/* 'orig' contains exactly one bio, and may refer to
+	 * some section in the middle of that bio.
+	 * Make 'new' refer to the beginning of that section, up
+	 * to orig->max_allowed_size.
+	 * Remove from 'orig' everything that went into 'new'.
+	 * If 'orig' becomes empty, release its reference to the bio.
+	 */
+
+	new->cmd_type = orig->cmd_type;
+	new->cmd_flags |= orig->cmd_flags;
+	new->errors = 0;
+	new->hard_sector = new->sector = orig->hard_sector;
+	new->ioprio = orig->ioprio;
+	new->start_time = jiffies;
+	new->data_len = orig->data_len;
+	new->bio = orig->bio;
+	atomic_inc(&orig->bio->bi_iocnt);
+	new->biotail = orig->biotail;
+	new->current_nr_sectors = orig->current_nr_sectors;
+
+	new->buffer = orig->buffer;
+	new->rq_disk = orig->rq_disk;
+
+	if (orig->max_allowed_size == orig->hard_nr_sectors << 9) {
+		/* all of orig goes into new */
+		new->nr_sectors = new->hard_nr_sectors
+			= orig->hard_nr_sectors;
+		new->nr_phys_segments = orig->nr_phys_segments;
+		new->nr_hw_segments = orig->nr_hw_segments;
+		new->hw_front_size = orig->hw_front_size;
+		new->hw_back_size = orig->hw_back_size;
+		new->last_len = orig->last_len;
+		new->last_idx = orig->last_idx;
+
+		orig->nr_sectors = orig->hard_nr_sectors = 0;
+		atomic_dec(&orig->bio->bi_iocnt);
+		orig->bio = NULL;
+	} else {
+		/* start of orig goes into new, rest stays in orig */
+		int offset;
+		new->nr_sectors = new->hard_nr_sectors
+			= (orig->max_allowed_size >> 9);
+		new->data_len = new->nr_sectors << 9;
+		new->biotail = NULL;
+		new->cmd_flags |= REQ_NOMERGE;
+
+		orig->nr_sectors = orig->hard_nr_sectors
+			-= orig->max_allowed_size >> 9;
+		orig->data_len = orig->nr_sectors << 9;
+		orig->sector = orig->hard_sector += orig->max_allowed_size >> 9;
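
For readers following the thread, here is a rough userspace sketch of the
bookkeeping the description above relies on. It is not code from the patch:
the names sketch_limits, sketch_max_allowed and sketch_count_requests are
made up for illustration, every array entry is treated as its own segment
(the real code coalesces neighbouring bvecs), and segment lengths are
assumed to be non-zero. It keeps a running byte total, remembers the last
total that still fits the sector and segment limits (the role
max_allowed_size plays in the patch), and then counts how many requests a
large bio would need if each request takes that largest fitting prefix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the request_queue limits used by the patch. */
struct sketch_limits {
	unsigned int max_sectors;	/* per-request cap, 512-byte sectors */
	unsigned int max_segments;	/* per-request cap on segment count */
};

/*
 * Bytes in the longest prefix of seg_len[0..n) that fits both limits.
 * Same running-total idea as the max_allowed_size bookkeeping, but each
 * array entry counts as one segment (no merging of neighbours).
 */
static unsigned int sketch_max_allowed(const unsigned int *seg_len, size_t n,
				       const struct sketch_limits *lim)
{
	unsigned int curr_size = 0, allowed = 0;
	size_t i;

	assert(lim->max_sectors > 0 && lim->max_segments > 0);
	for (i = 0; i < n; i++) {
		curr_size += seg_len[i];
		if (curr_size > (lim->max_sectors << 9) ||
		    i + 1 > lim->max_segments)
			break;		/* totals only grow, so stop here */
		allowed = curr_size;
	}
	return allowed;
}

/*
 * Count the requests needed when each one takes the largest allowed
 * prefix of the segments that remain.  Returns -1 if the next segment
 * can never fit on its own.  Assumes every segment length is non-zero.
 */
static int sketch_count_requests(const unsigned int *seg_len, size_t n,
				 const struct sketch_limits *lim)
{
	int pieces = 0;
	size_t start = 0;

	while (start < n) {
		unsigned int take = sketch_max_allowed(seg_len + start,
						       n - start, lim);
		if (take == 0)
			return -1;
		/* step past the whole segments that made up 'take' bytes */
		while (take > 0) {
			take -= seg_len[start];
			start++;
		}
		pieces++;
	}
	return pieces;
}
```

With five 4096-byte segments and limits of 16 sectors and 4 segments, the
size limit is hit first, so each request carries two segments and the last
one carries the remainder.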