Hi Guys, On Thu, Feb 18, 2016 at 2:16 PM, Ming Lei <ming....@canonical.com> wrote: > Hi Kent, > > Thanks for your review. > > On Thu, Feb 18, 2016 at 12:24 PM, Kent Overstreet > <kent.overstr...@gmail.com> wrote: >> On Mon, Feb 15, 2016 at 05:42:12PM +0800, Ming Lei wrote: >>> Cc Kent and Keith. >>> >>> Follows another version which should be more efficient. >>> Kent and Keith, I appreciate much if you may give a review on it. >>> >>> diff --git a/include/linux/bio.h b/include/linux/bio.h >>> index 56d2db8..ef45fec 100644 >>> --- a/include/linux/bio.h >>> +++ b/include/linux/bio.h >>> @@ -278,11 +278,21 @@ static inline void bio_get_first_bvec(struct bio >>> *bio, struct bio_vec *bv) >>> */ >>> static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv) >>> { >>> - struct bvec_iter iter; >>> + struct bvec_iter iter = bio->bi_iter; >>> + int idx; >>> + >>> + bio_advance_iter(bio, &iter, iter.bi_size); >>> + >>> + WARN_ON(!iter.bi_idx && !iter.bi_bvec_done); >>> + >>> + if (!iter.bi_bvec_done) >>> + idx = iter.bi_idx - 1; >>> + else /* in the middle of bvec */ >>> + idx = iter.bi_idx; >>> >>> - bio_for_each_segment(*bv, bio, iter) >>> - if (bv->bv_len == iter.bi_size) >>> - break; >>> + *bv = bio->bi_io_vec[idx]; >>> + if (iter.bi_bvec_done) >>> + bv->bv_len = iter.bi_bvec_done; >>> } >> >> It can't be done correctly without a loop. > > As we discussed in gtalk, the only case this patch can't cope with > is that one single bvec doesn't use up the remained io vector, > but it can be handled by putting the following code at the > function entry: > > if (!bio_multiple_segments(bio)) { > *bv = bio_iovec(bio); > return; > } > >> >> The reason is that if the bio was split in the middle of a segment, >> bv->bv_len >> on the last biovec will be larger than what's actually used by the bio (it's >> being shared between the two splits!). > > The last two lines in this helper should handle the situation. > >> >> You have to iterate over all the biovecs so that you can see where >> bi_iter->bi_size ends. > > I understand your concern is that this patch may not be much more > efficient than bio_for_each_segment(). > > IMO, one win of the patch is that 16bytes bvec copy is saved for all > vectors, and another 'win' is to just run bvec_iter_advance() once( > like move the outside for loop inside). > > I will run some benchmark to see if there is any performance > difference between the two patches.
When I call bench_last_bvec()(see below) from null_queue_rq(): drivers/block/null_blk.c, IOPS with the 2nd patch in fio test(libaio, randread, null_blk with default mod parameter) is better than the 1st one by > ~2%: ------------------------- |BS | IOPS | ------------------------ |64K | +2% | ----------------------- |512K | +3% | ------------------------ the 1st patch: use bio_for_each_segment() the 2nd patch: use single bio_advance_iter() static void bench_last_bvec(struct request *rq) { static unsigned long total = 0; struct bio *bio; struct bio_vec bv = {0}; __rq_for_each_bio(bio, rq) { bio_get_last_bvec(bio, &bv); total += bv.bv_len; } } Thanks Ming