Re: [PATCH 025 of 35] Treat rq->hard_nr_sectors as setting an overriding limit in the size of the request

2007-08-01 Thread Tejun Heo
Neil Brown wrote:
> On Thursday August 2, [EMAIL PROTECTED] wrote:
>> This is pretty confusing.  In all other places, bi_size -> #sector
>> conversion is done by rounding down but only in blk_rq_bio_prep() it's
>> being rounded up.
>>
>> Is my following reasoning correct?
>>
>> It was okay till now because unaligned requests don't get merged and
>> also haven't done partial completions (end_that_request_first with
>> partial count)?  So till now, hard_nr_sectors and nr_sectors didn't
>> really matter for unaligned requests but now it matters because it's
>> considered while iterating over bvecs in rq.
> 
> Yes, that reasoning matches mine.
> 
>> If so, I think the correct thing to do would be changing bio_sectors()
>> to round up first or let block layer measure transfer in bytes not in
>> sectors.  I don't think everyone would agree with the latter tho.  I
>> (tentatively) think it would be better to represent length in bytes
>> tho.  A lot of requests which aren't aligned to 512 bytes pass through
>> the block layer and the mismatch can result in subtle bugs.
> 
> I suspect that having a byte count in 'struct request' would make
> sense too.  However I would rather avoid making that change myself - I
> think it would require reading and understanding a lot more code
> 
> I cannot see anything that would go wrong with rounding up bio_sectors
> unconditionally, so I think I will take that approach for this patch
> series.

Yes, converting to nbytes will probably take a lot of work and probably
deserves a separate series if it's ever gonna be done.

Thanks.

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 025 of 35] Treat rq->hard_nr_sectors as setting an overriding limit in the size of the request

2007-08-01 Thread Neil Brown
On Thursday August 2, [EMAIL PROTECTED] wrote:
> 
> This is pretty confusing.  In all other places, bi_size -> #sector
> conversion is done by rounding down but only in blk_rq_bio_prep() it's
> being rounded up.
> 
> Is my following reasoning correct?
> 
> It was okay till now because unaligned requests don't get merged and
> also haven't done partial completions (end_that_request_first with
> partial count)?  So till now, hard_nr_sectors and nr_sectors didn't
> really matter for unaligned requests but now it matters because it's
> considered while iterating over bvecs in rq.

Yes, that reasoning matches mine.

> 
> If so, I think the correct thing to do would be changing bio_sectors()
> to round up first or let block layer measure transfer in bytes not in
> sectors.  I don't think everyone would agree with the latter tho.  I
> (tentatively) think it would be better to represent length in bytes
> tho.  A lot of requests which aren't aligned to 512 bytes pass through
> the block layer and the mismatch can result in subtle bugs.

I suspect that having a byte count in 'struct request' would make
sense too.  However I would rather avoid making that change myself - I
think it would require reading and understanding a lot more code

I cannot see anything that would go wrong with rounding up bio_sectors
unconditionally, so I think I will take that approach for this patch
series.

Thanks.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 025 of 35] Treat rq->hard_nr_sectors as setting an overriding limit in the size of the request

2007-08-01 Thread Tejun Heo
On Tue, Jul 31, 2007 at 12:17:59PM +1000, NeilBrown wrote:
> 
> For a request to be able to refer to part of a bio, we need to be able
> to impose a size limit at the request level.  So allow hard_nr_sectors
> to be less than the size of the bios (and bio_vecs) and interpret it
> such that anything in the last bio beyond that limit is ignored.
> 
> As some bios can be less than one sector - as happens when a SCSI
> sense command is being submitted - we need to set hard_nr_sectors to
> bi_size rounded up in blk_rq_bio_prep, and we need to abort the
> rq_for_each_segment loop if _iter.bio becomes NULL even if _iter.size
> is still non-zero

This is pretty confusing.  In all other places, bi_size -> #sector
conversion is done by rounding down but only in blk_rq_bio_prep() it's
being rounded up.

Is my following reasoning correct?

It was okay till now because unaligned requests don't get merged and
also haven't done partial completions (end_that_request_first with
partial count)?  So till now, hard_nr_sectors and nr_sectors didn't
really matter for unaligned requests but now it matters because it's
considered while iterating over bvecs in rq.

If so, I think the correct thing to do would be changing bio_sectors()
to round up first or let block layer measure transfer in bytes not in
sectors.  I don't think everyone would agree with the latter tho.  I
(tentatively) think it would be better to represent length in bytes
tho.  A lot of requests which aren't aligned to 512 bytes pass through
the block layer and the mismatch can result in subtle bugs.

Thanks.

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 025 of 35] Treat rq-hard_nr_sectors as setting an overriding limit in the size of the request

2007-08-01 Thread Tejun Heo
On Tue, Jul 31, 2007 at 12:17:59PM +1000, NeilBrown wrote:
 
 For a request to be able to refer to part of a bio, we need to be able
 to impose a size limit at the request level.  So allow hard_nr_sectors
 to be less than the size of the bios (and bio_vecs) and interpret it
 such that anything in the last bio beyond that limit is ignored.
 
 As some bios can be less than one sector - as happens when a SCSI
 sense command is being submitted - we need to set hard_nr_sectors to
 bi_size rounded up in blk_rq_bio_prep, and we need to abort the
 rq_for_each_segment loop if _iter.bio becomes NULL even if _iter.size
 is still non-zero

This is pretty confusing.  In all other places, bi_size - #sector
conversion is done by rounding down but only in blk_rq_bio_prep() it's
being rounded up.

Is my following reasoning correct?

It was okay till now because unaligned requests don't get merged and
also haven't done partial completions (end_that_request_first with
partial count)?  So till now, hard_nr_sectors and nr_sectors didn't
really matter for unaligned requests but now it matters because it's
considered while iterating over bvecs in rq.

If so, I think the correct thing to do would be changing bio_sectors()
to round up first or let block layer measure transfer in bytes not in
sectors.  I don't think everyone would agree with the latter tho.  I
(tentatively) think it would be better to represent length in bytes
tho.  A lot of requests which aren't aligned to 512 bytes pass through
the block layer and the mismatch can result in subtle bugs.

Thanks.

-- 
tejun
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 025 of 35] Treat rq-hard_nr_sectors as setting an overriding limit in the size of the request

2007-08-01 Thread Neil Brown
On Thursday August 2, [EMAIL PROTECTED] wrote:
 
 This is pretty confusing.  In all other places, bi_size - #sector
 conversion is done by rounding down but only in blk_rq_bio_prep() it's
 being rounded up.
 
 Is my following reasoning correct?
 
 It was okay till now because unaligned requests don't get merged and
 also haven't done partial completions (end_that_request_first with
 partial count)?  So till now, hard_nr_sectors and nr_sectors didn't
 really matter for unaligned requests but now it matters because it's
 considered while iterating over bvecs in rq.

Yes, that reasoning matches mine.

 
 If so, I think the correct thing to do would be changing bio_sectors()
 to round up first or let block layer measure transfer in bytes not in
 sectors.  I don't think everyone would agree with the latter tho.  I
 (tentatively) think it would be better to represent length in bytes
 tho.  A lot of requests which aren't aligned to 512 bytes pass through
 the block layer and the mismatch can result in subtle bugs.

I suspect that having a byte count in 'struct request' would make
sense too.  However I would rather avoid making that change myself - I
think it would require reading and understanding a lot more code

I cannot see anything that would go wrong with rounding up bio_sectors
unconditionally, so I think I will take that approach for this patch
series.

Thanks.

NeilBrown
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 025 of 35] Treat rq-hard_nr_sectors as setting an overriding limit in the size of the request

2007-08-01 Thread Tejun Heo
Neil Brown wrote:
 On Thursday August 2, [EMAIL PROTECTED] wrote:
 This is pretty confusing.  In all other places, bi_size - #sector
 conversion is done by rounding down but only in blk_rq_bio_prep() it's
 being rounded up.

 Is my following reasoning correct?

 It was okay till now because unaligned requests don't get merged and
 also haven't done partial completions (end_that_request_first with
 partial count)?  So till now, hard_nr_sectors and nr_sectors didn't
 really matter for unaligned requests but now it matters because it's
 considered while iterating over bvecs in rq.
 
 Yes, that reasoning matches mine.
 
 If so, I think the correct thing to do would be changing bio_sectors()
 to round up first or let block layer measure transfer in bytes not in
 sectors.  I don't think everyone would agree with the latter tho.  I
 (tentatively) think it would be better to represent length in bytes
 tho.  A lot of requests which aren't aligned to 512 bytes pass through
 the block layer and the mismatch can result in subtle bugs.
 
 I suspect that having a byte count in 'struct request' would make
 sense too.  However I would rather avoid making that change myself - I
 think it would require reading and understanding a lot more code
 
 I cannot see anything that would go wrong with rounding up bio_sectors
 unconditionally, so I think I will take that approach for this patch
 series.

Yes, converting to nbytes will probably take a lot of work and probably
deserves a separate series if it's ever gonna be done.

Thanks.

-- 
tejun
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 025 of 35] Treat rq->hard_nr_sectors as setting an overriding limit in the size of the request

2007-07-30 Thread NeilBrown

For a request to be able to refer to part of a bio, we need to be able
to impose a size limit at the request level.  So allow hard_nr_sectors
to be less than the size of the bios (and bio_vecs) and interpret it
such that anything in the last bio beyond that limit is ignored.

As some bios can be less than one sector - as happens when a SCSI
sense command is being submitted - we need to set hard_nr_sectors to
bi_size rounded up in blk_rq_bio_prep, and we need to abort the
rq_for_each_segment loop if _iter.bio becomes NULL even if _iter.size
is still non-zero

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>

### Diffstat output
 ./block/ll_rw_blk.c  |   11 +--
 ./include/linux/bio.h|9 +
 ./include/linux/blkdev.h |   11 +++
 3 files changed, 17 insertions(+), 14 deletions(-)

diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c
--- .prev/block/ll_rw_blk.c 2007-07-31 11:21:14.0 +1000
+++ ./block/ll_rw_blk.c 2007-07-31 11:21:15.0 +1000
@@ -3364,12 +3364,10 @@ static void blk_recalc_rq_sectors(struct
 
/*
 * if total number of sectors is less than the first segment
-* size, something has gone terribly wrong
+* size, then we have hit an early end-of-request.
 */
-   if (rq->nr_sectors < rq->current_nr_sectors) {
-   printk("blk: request botched\n");
-   rq->nr_sectors = rq->current_nr_sectors;
-   }
+   if (rq->nr_sectors < rq->current_nr_sectors)
+   rq->current_nr_sectors = rq->nr_sectors;
}
 }
 
@@ -3661,7 +3659,8 @@ static void blk_rq_bio_prep(struct reque
/* first two bits are identical in rq->cmd_flags and bio->bi_rw */
rq->cmd_flags |= (bio->bi_rw & 3);
 
-   rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio);
+   rq->hard_nr_sectors = rq->nr_sectors
+   = DIV_ROUND_UP(bio->bi_size, 512);
rq->data_len = bio->bi_size;
 
rq->first_offset = 0;

diff .prev/include/linux/bio.h ./include/linux/bio.h
--- .prev/include/linux/bio.h   2007-07-31 11:21:07.0 +1000
+++ ./include/linux/bio.h   2007-07-31 11:21:15.0 +1000
@@ -222,7 +222,8 @@ struct bio_iterator {
 
 /* This macro probably need some explanation...
  * Its purpose is to find all the effective segments in a bio
- * missing the first 'offs' bytes.  We need to be sure to honour
+ * missing the first 'offs' bytes and only covering the next _size
+ * bytes.   We need to be sure to honour
  * bi_offset which can cause us to skip part of the firs segment,
  * and bi_size which may cause us to stop before the end of bi_io_vec.
  * The 'for' loop iterates through the segments in bi_io_vec until
@@ -233,9 +234,9 @@ struct bio_iterator {
  * follows this macro) if the size would be zero.
  * It also keeps 'offset' and 'size' (in the iterator) up to date.
  */
-#define bio_for_each_segment_offset(bv, bio, _i, offs) \
+#define bio_for_each_segment_offset(bv, bio, _i, offs, _size)  \
for (_i.i = 0, _i.offset = (bio)->bi_offset + offs, \
-_i.size = (bio)->bi_size - offs;   \
+_i.size = min_t(int, _size, (bio)->bi_size - offs);\
 _i.i < (bio)->bi_vcnt && _i.size > 0;  \
 _i.i++)\
if (bv = *bio_iovec_idx((bio), _i.i),   \
@@ -251,7 +252,7 @@ struct bio_iterator {
  bv.bv_len > 0)))
 
 #define bio_for_each_segment(bv, bio, __i) \
-   bio_for_each_segment_offset(bv, bio, __i, 0)
+   bio_for_each_segment_offset(bv, bio, __i, 0, (bio)->bi_size)
 /*
  * get a reference to a bio, so it won't disappear. the intended use is
  * something like:

diff .prev/include/linux/blkdev.h ./include/linux/blkdev.h
--- .prev/include/linux/blkdev.h2007-07-31 11:21:14.0 +1000
+++ ./include/linux/blkdev.h2007-07-31 11:21:15.0 +1000
@@ -657,13 +657,16 @@ struct req_iterator {
struct bio_iterator i;
struct bio *bio;
int offset;
+   int size;
 };
 #define rq_for_each_segment(rq, _iter, bvec)   \
-   for (_iter.bio = (rq)->bio, _iter.offset = (rq)->first_offset; \
-_iter.bio && _iter.bio != rq->biotail->bi_next;   \
-_iter.bio = _iter.bio->bi_next, _iter.offset = 0) \
+   for (_iter.bio = (rq)->bio, _iter.offset = (rq)->first_offset, \
+   _iter.size = (rq)->hard_nr_sectors << 9;   \
+_iter.size && _iter.bio;  \
+_iter.size -= (_iter.bio->bi_size - _iter.offset),\
+  _iter.bio = _iter.bio->bi_next, _iter.offset = 0)   \
 

[PATCH 025 of 35] Treat rq-hard_nr_sectors as setting an overriding limit in the size of the request

2007-07-30 Thread NeilBrown

For a request to be able to refer to part of a bio, we need to be able
to impose a size limit at the request level.  So allow hard_nr_sectors
to be less than the size of the bios (and bio_vecs) and interpret it
such that anything in the last bio beyond that limit is ignored.

As some bios can be less than one sector - as happens when a SCSI
sense command is being submitted - we need to set hard_nr_sectors to
bi_size rounded up in blk_rq_bio_prep, and we need to abort the
rq_for_each_segment loop if _iter.bio becomes NULL even if _iter.size
is still non-zero

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./block/ll_rw_blk.c  |   11 +--
 ./include/linux/bio.h|9 +
 ./include/linux/blkdev.h |   11 +++
 3 files changed, 17 insertions(+), 14 deletions(-)

diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c
--- .prev/block/ll_rw_blk.c 2007-07-31 11:21:14.0 +1000
+++ ./block/ll_rw_blk.c 2007-07-31 11:21:15.0 +1000
@@ -3364,12 +3364,10 @@ static void blk_recalc_rq_sectors(struct
 
/*
 * if total number of sectors is less than the first segment
-* size, something has gone terribly wrong
+* size, then we have hit an early end-of-request.
 */
-   if (rq-nr_sectors  rq-current_nr_sectors) {
-   printk(blk: request botched\n);
-   rq-nr_sectors = rq-current_nr_sectors;
-   }
+   if (rq-nr_sectors  rq-current_nr_sectors)
+   rq-current_nr_sectors = rq-nr_sectors;
}
 }
 
@@ -3661,7 +3659,8 @@ static void blk_rq_bio_prep(struct reque
/* first two bits are identical in rq-cmd_flags and bio-bi_rw */
rq-cmd_flags |= (bio-bi_rw  3);
 
-   rq-hard_nr_sectors = rq-nr_sectors = bio_sectors(bio);
+   rq-hard_nr_sectors = rq-nr_sectors
+   = DIV_ROUND_UP(bio-bi_size, 512);
rq-data_len = bio-bi_size;
 
rq-first_offset = 0;

diff .prev/include/linux/bio.h ./include/linux/bio.h
--- .prev/include/linux/bio.h   2007-07-31 11:21:07.0 +1000
+++ ./include/linux/bio.h   2007-07-31 11:21:15.0 +1000
@@ -222,7 +222,8 @@ struct bio_iterator {
 
 /* This macro probably need some explanation...
  * Its purpose is to find all the effective segments in a bio
- * missing the first 'offs' bytes.  We need to be sure to honour
+ * missing the first 'offs' bytes and only covering the next _size
+ * bytes.   We need to be sure to honour
  * bi_offset which can cause us to skip part of the firs segment,
  * and bi_size which may cause us to stop before the end of bi_io_vec.
  * The 'for' loop iterates through the segments in bi_io_vec until
@@ -233,9 +234,9 @@ struct bio_iterator {
  * follows this macro) if the size would be zero.
  * It also keeps 'offset' and 'size' (in the iterator) up to date.
  */
-#define bio_for_each_segment_offset(bv, bio, _i, offs) \
+#define bio_for_each_segment_offset(bv, bio, _i, offs, _size)  \
for (_i.i = 0, _i.offset = (bio)-bi_offset + offs, \
-_i.size = (bio)-bi_size - offs;   \
+_i.size = min_t(int, _size, (bio)-bi_size - offs);\
 _i.i  (bio)-bi_vcnt  _i.size  0;  \
 _i.i++)\
if (bv = *bio_iovec_idx((bio), _i.i),   \
@@ -251,7 +252,7 @@ struct bio_iterator {
  bv.bv_len  0)))
 
 #define bio_for_each_segment(bv, bio, __i) \
-   bio_for_each_segment_offset(bv, bio, __i, 0)
+   bio_for_each_segment_offset(bv, bio, __i, 0, (bio)-bi_size)
 /*
  * get a reference to a bio, so it won't disappear. the intended use is
  * something like:

diff .prev/include/linux/blkdev.h ./include/linux/blkdev.h
--- .prev/include/linux/blkdev.h2007-07-31 11:21:14.0 +1000
+++ ./include/linux/blkdev.h2007-07-31 11:21:15.0 +1000
@@ -657,13 +657,16 @@ struct req_iterator {
struct bio_iterator i;
struct bio *bio;
int offset;
+   int size;
 };
 #define rq_for_each_segment(rq, _iter, bvec)   \
-   for (_iter.bio = (rq)-bio, _iter.offset = (rq)-first_offset; \
-_iter.bio  _iter.bio != rq-biotail-bi_next;   \
-_iter.bio = _iter.bio-bi_next, _iter.offset = 0) \
+   for (_iter.bio = (rq)-bio, _iter.offset = (rq)-first_offset, \
+   _iter.size = (rq)-hard_nr_sectors  9;   \
+_iter.size  _iter.bio;  \
+_iter.size -= (_iter.bio-bi_size - _iter.offset),\
+  _iter.bio = _iter.bio-bi_next, _iter.offset = 0)   \
bio_for_each_segment_offset(bvec, _iter.bio,