Re: [dm-devel] [PATCH] mpathpersist: segmentation fault in mpath_persistent_reserve_in()

2016-10-31 Thread tang . junhui
Hello Hannes,

Since this issue was introduced by the RCU conversion, could you review
this patch?

Thanks,
Tang





From:	tang.we...@zte.com.cn
To:	christophe varoqui , 
Cc:	zhang.ka...@zte.com.cn, dm-devel@redhat.com, tang.jun...@zte.com.cn, tang.we...@zte.com.cn
Date:	2016/10/27 17:08
Subject:	[dm-devel] [PATCH] mpathpersist: segmentation fault in mpath_persistent_reserve_in()
Sender:	dm-devel-boun...@redhat.com



From: 10111224 

A segmentation fault occurred when executing the "mpathpersist -i -k
/dev/mapper/mpath1" command. The reason is that an uninitialized global
variable conf is used in mpath_persistent_reserve_in(). The same problem
also exists in mpath_persistent_reserve_out().

Signed-off-by: tang.wenji 
---
 libmpathpersist/mpath_persist.c | 21 +++++++++++++++++++--
 libmpathpersist/mpathpr.h       |  4 ----
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/libmpathpersist/mpath_persist.c b/libmpathpersist/mpath_persist.c
index 7501651..582d4ef 100644
--- a/libmpathpersist/mpath_persist.c
+++ b/libmpathpersist/mpath_persist.c
@@ -78,6 +78,7 @@ updatepaths (struct multipath * mpp)
 	int i, j;
 	struct pathgroup * pgp;
 	struct path * pp;
+	struct config *conf;
 
 	if (!mpp->pg)
 		return 0;
@@ -97,16 +98,24 @@ updatepaths (struct multipath * mpp)
 					continue;
 				}
 				pp->mpp = mpp;
+				conf = get_multipath_config();
 				pathinfo(pp, conf, DI_ALL);
+				put_multipath_config(conf);
 				continue;
 			}
 			pp->mpp = mpp;
 			if (pp->state == PATH_UNCHECKED ||
-			    pp->state == PATH_WILD)
+			    pp->state == PATH_WILD){
+				conf = get_multipath_config();
 				pathinfo(pp, conf, DI_CHECKER);
+				put_multipath_config(conf);
+			}
 
-			if (pp->priority == PRIO_UNDEF)
+			if (pp->priority == PRIO_UNDEF){
+				conf = get_multipath_config();
 				pathinfo(pp, conf, DI_PRIO);
+				put_multipath_config(conf);
+			}
 		}
 	}
 	return 0;
@@ -159,8 +168,11 @@ int mpath_persistent_reserve_in (int fd, int rq_servact,
 	int map_present;
 	int major, minor;
 	int ret;
+	struct config *conf;
 
+	conf = get_multipath_config();
 	conf->verbosity = verbose;
+	put_multipath_config(conf);
 
 	if (fstat( fd, ) != 0){
 		condlog(0, "stat error %d", fd);
@@ -252,8 +264,11 @@ int mpath_persistent_reserve_out ( int fd, int rq_servact, int rq_scope,
 	int j;
 	unsigned char *keyp;
 	uint64_t prkey;
+	struct config *conf;
 
+	conf = get_multipath_config();
 	conf->verbosity = verbose;
+	put_multipath_config(conf);
 
 	if (fstat( fd, ) != 0){
 		condlog(0, "stat error fd=%d", fd);
@@ -320,7 +335,9 @@ int mpath_persistent_reserve_out ( int fd, int rq_servact, int rq_scope,
 		goto out1;
 	}
 
+	conf = get_multipath_config();
 	select_reservation_key(conf, mpp);
+	put_multipath_config(conf);
 
 	switch(rq_servact)
 	{
diff --git a/libmpathpersist/mpathpr.h b/libmpathpersist/mpathpr.h
index cd58201..e6c2ded 100644
--- a/libmpathpersist/mpathpr.h
+++ b/libmpathpersist/mpathpr.h
@@ -25,10 +25,6 @@ struct threadinfo {
 struct prout_param param;
 };
 
-
-struct config * conf;
-
-
 int prin_do_scsi_ioctl(char * dev, int rq_servact, struct prin_resp * resp, int noisy);
 int prout_do_scsi_ioctl( char * dev, int rq_servact, int rq_scope,
 		unsigned int rq_type, struct prout_param_descriptor *paramp, int noisy);
-- 
2.8.1.windows.1
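
A note for readers wondering where the crash came from: the old mpathpr.h
(removed above) declared a global "struct config * conf" that nothing in
libmpathpersist ever initialized, so the first "conf->verbosity = verbose"
dereferenced a NULL pointer. A miniature of the bug in plain C - an
illustration only, not multipath-tools code:

#include <stdio.h>

struct config { int verbosity; };

/* Declared but never initialized, as in the old mpathpr.h. */
struct config *conf;

int main(void)
{
	conf->verbosity = 1;	/* NULL dereference -> segmentation fault */
	return 0;
}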

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Re: [dm-devel] [PATCH] add_feature: coredump

2016-10-31 Thread huang . wei56
Hi Bart,

Thanks for your answer.

I have submitted the code again with the title "segmentation fault in
add_feature()"; please review.

Thanks,

Wei huang.



Bart Van Assche  
2016-10-28 23:14

To:	, Christophe Varoqui , 
Cc:	, , 
Subject:	Re: [dm-devel] [PATCH] add_feature: coredump

On 10/13/2016 07:08 PM, huang.we...@zte.com.cn wrote:
> From: "wei.huang" 
>
> Problem:
> When we configure a device whose vendor is COMPELNT in multipath.conf,
> multipathd will core dump.
>
> Reason:
> Some vendors have no features configured in default_hw. In add_feature(),
> strstr's first parameter *f may be NULL.
>
> Signed-off-by: wei.huang 
> ---
>  libmultipath/structs.c | 14 ++
>  1 file changed, 14 insertions(+)
>
> diff --git a/libmultipath/structs.c b/libmultipath/structs.c
> index fee58e5..41e142f 100644
> --- a/libmultipath/structs.c
> +++ b/libmultipath/structs.c
> @@ -520,6 +520,20 @@ add_feature (char **f, char *n)
>  	if (!n || *n == '0')
>  		return 0;
>
> +	/* default feature is null */
> +	if (!*f)
> +	{
> +		l = strlen("1 ") + strlen(n) + 1;
> +		t = MALLOC(l);
> +		if (!t)
> +			return 1;
> +
> +		snprintf(t, l, "1 %s", n);
> +		*f = t;
> +
> +		return 0;
> +	}
> +
>  	/* Check if feature is already present */
>  	if (strstr(*f, n))
>  		return 0;
>

Hello Wei Huang,

Please use asprintf() instead of open-coding it, and please also make the
title of your patch comprehensible. Your patch prevents multipathd from
triggering a core dump for a certain vendor name, but that's not clear
from "add_feature: coredump".

Thanks,

Bart.


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Re: [dm-devel] REQ_OP for zeroing, was Re: [PATCH 1/4] brd: handle misaligned discard

2016-10-31 Thread Mikulas Patocka


On Fri, 28 Oct 2016, Christoph Hellwig wrote:

> [adding Chaitanya to Cc]
> 
> On Fri, Oct 28, 2016 at 07:43:41AM -0400, Mikulas Patocka wrote:
> > We could detect if the REQ_OP_WRITE_SAME command contains all zeroes and 
> > if it does, turn it into "Write Zeroes" or TRIM command (if the device 
> > guarantees zeroing on trim). If it doesn't contain all zeroes and the 
> > device doesn't support non-zero WRITE SAME, then reject it.
> 
> I don't like this because it's very inefficient - we have to allocate
> a payload first and then compare the whole payload for every operation.
> 
> > Or maybe we could add a new command REQ_OP_WRITE_ZEROES - I'm not sure 
> > which of these two possibilities is better.
> 
> Chaitanya actually did an initial prototype implementation of this for
> NVMe that he shared with me.  I liked it a lot and I think he'll be
> ready to post it in a few days.  Now that we have the REQ_OP* values
> instead of mapping different command types to flags it's actually
> surprisingly easy to add new block layer operations.

OK - when it is in the kernel, let me know, so that I can write device 
mapper support for that.

We should remove the flag "discard_zeroes_data" afterwards, because it is 
unreliable and impossible to support correctly in the device mapper.
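
For context, the detection step quoted above (which Christoph objects to
on cost grounds) would look roughly like the following - a sketch only,
assuming the payload is the single bvec a WRITE SAME bio carries:

#include <linux/bio.h>
#include <linux/highmem.h>
#include <linux/string.h>

/* Sketch: check whether a WRITE SAME payload is all zeroes. This is
 * the per-operation scan that Christoph calls too expensive. */
static bool write_same_payload_is_zero(struct bio *bio)
{
	struct bio_vec bv = bio_iovec(bio);
	void *p = kmap_atomic(bv.bv_page);
	bool zero = !memchr_inv(p + bv.bv_offset, 0, bv.bv_len);

	kunmap_atomic(p);
	return zero;
}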

Mikulas

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 1/4] brd: handle misaligned discard

2016-10-31 Thread Mikulas Patocka


On Fri, 28 Oct 2016, Bart Van Assche wrote:

> On 10/28/2016 04:39 AM, Mikulas Patocka wrote:
> > On Wed, 26 Oct 2016, Bart Van Assche wrote:
> > > On 10/26/2016 02:46 PM, Mikulas Patocka wrote:
> > > > I think the proper thing would be to move "discard_zeroes_data" flag
> > > > into
> > > > the bio itself - there would be REQ_OP_DISCARD and REQ_OP_DISCARD_ZERO -
> > > > and if the device doesn't support REQ_OP_DISCARD_ZERO, it rejects the
> > > > bio
> > > > and the caller is supposed to do zeroing manually.
> > > 
> > > Sorry but I don't like this proposal. I think that a much better solution
> > > would be to pause I/O before making any changes to the discard_zeroes_data
> > > flag.
> > 
> > The device mapper pauses all bios when it switches the table - but if the
> > bio was submitted with the assumption that it goes to a device with
> > "discard_zeroes_data" set, then the bio is paused, the device mapper table
> > is switched, the bio is unpaused, and now it can go to a device without
> > "discard_zeroes_data".
> 
> Hello Mikulas,
> 
> Sorry if I wasn't clear enough. What I meant is to wait until all outstanding
> requests have finished

It is possible that the process sends a never-ending stream of bios (for
example when reading linear data and using readahead), so waiting until
there are no outstanding bios never finishes.

> before modifying the discard_zeroes_data flag - the
> kind of operation that is performed by e.g. blk_mq_freeze_queue().

blk_mq_freeze_queue() works on request-based drivers; device mapper works
with bios, so that function has no effect on a device mapper device. Anyway
- blk_mq_freeze_queue() won't stop the process that issues the I/O
requests - it will just hold the requests in the queue and not forward
them to the device.

There is no way to stop the process that issues the bios. We can't stop 
the process that is looping in __blkdev_issue_discard, issuing discard 
requests. All that we can do is to hold the bios that the process issued.

Device mapper can freeze the filesystem with "freeze_bdev", but...
- some filesystems don't support freeze
- if the filesystem is not directly mounted on the frozen device, but 
there is a stack of intermediate block devices between the filesystem and 
the frozen device, then the filesystem will not be frozen
- the user can open the block device directly and won't be affected by
the freeze

> Modifying the discard_zeroes_data flag after a bio has been submitted 
> and before it has completed could lead to several classes of subtle 
> bugs. Functions like __blkdev_issue_discard() assume that the value of 
> the discard_zeroes_data flag does not change after this function has 
> been called and before the submitted requests completed.
>
> Bart.

I agree. That's the topic of the discussion - that the discard_zeroes_data 
flag is unreliable and the flag should be moved to the bio.
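
If the flag moved into the bio as proposed, callers would get an explicit
fallback path. A sketch under that assumption - blkdev_issue_discard_zero()
is a hypothetical name from this discussion, only blkdev_issue_zeroout()
exists today:

/* Hypothetical caller-side fallback: a device that cannot guarantee
 * zeroed discards rejects the REQ_OP_DISCARD_ZERO bio, and the caller
 * then writes zeroes explicitly. */
int ret = blkdev_issue_discard_zero(bdev, sector, nr_sects, GFP_KERNEL);

if (ret == -EOPNOTSUPP)
	ret = blkdev_issue_zeroout(bdev, sector, nr_sects, GFP_KERNEL, false);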

Mikulas

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 09/60] dm: dm.c: replace 'bio->bi_vcnt == 1' with !bio_multiple_segments

2016-10-31 Thread Christoph Hellwig
On Sat, Oct 29, 2016 at 04:08:08PM +0800, Ming Lei wrote:
> Avoid accessing .bi_vcnt directly, because it may no longer be what
> the driver expects after multipage bvec support.
> 
> Signed-off-by: Ming Lei 

It would be really nice to have a comment in the code explaining why it's
even checking for multiple segments.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 03/10] do not allow in-use path to change wwid

2016-10-31 Thread Benjamin Marzinski
On Sun, Oct 30, 2016 at 02:45:01PM +0100, Hannes Reinecke wrote:
> On 10/29/2016 04:55 AM, Benjamin Marzinski wrote:
> >When a path is part of a multipath device, it must not change its wwid.
> >If it can, then when multipathd is reconfigured, you can end up with two
> >multipath devices owning the same path, eventually leading to a crash.
> >
> >Signed-off-by: Benjamin Marzinski 
> >---
> > libmultipath/dmparser.c | 8 
> > 1 file changed, 8 insertions(+)
> >
> Hmm. While I do see that this is an issue, just continuing is probably as
> bad; the wwid change might be genuine, in which case this device has no
> business being part of that particular multipath device.
> Can't we just evict the offending path, e.g. by orphaning it, and let the
> admin figure things out?

Possibly, but sometimes devices change wwids temporarily when they get
temporarily unmapped, which can happen when you resize them. When I
tried orphaning them, I could get multipath devices created for that
temporary wwid, which was pretty confusing. My later patch also
disables access to these paths, so multipath can't keep writing to them,
but you don't get these annoying fake multipath devices.

-Ben

> 
> Cheers,
> 
> Hannes
> -- 
> Dr. Hannes Reinecke zSeries & Storage
> h...@suse.de+49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 02/60] block drivers: convert to bio_init_with_vec_table()

2016-10-31 Thread Ming Lei
Signed-off-by: Ming Lei 
---
 drivers/block/floppy.c|  3 +--
 drivers/md/bcache/io.c|  4 +---
 drivers/md/bcache/journal.c   |  4 +---
 drivers/md/bcache/movinggc.c  |  7 +++
 drivers/md/bcache/super.c | 13 -
 drivers/md/bcache/writeback.c |  6 +++---
 drivers/md/dm-bufio.c |  4 +---
 drivers/md/raid5.c|  9 ++---
 drivers/nvme/target/io-cmd.c  |  4 +---
 fs/logfs/dev_bdev.c   |  4 +---
 10 files changed, 18 insertions(+), 40 deletions(-)
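
The helper itself is introduced in patch 01, which is not part of this
message; judging from the conversions below, it presumably folds the
recurring three-line sequence into one call:

/* Presumed shape of bio_init_with_vec_table() from patch 01, inferred
 * from the open-coded sequences it replaces below. */
static inline void bio_init_with_vec_table(struct bio *bio,
					   struct bio_vec *table,
					   unsigned int max_vecs)
{
	bio_init(bio);
	bio->bi_io_vec = table;
	bio->bi_max_vecs = max_vecs;
}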

diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index e3d8e4ced4a2..cdc916a95137 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -3806,8 +3806,7 @@ static int __floppy_read_block_0(struct block_device *bdev, int drive)
 
 	cbdata.drive = drive;
 
-	bio_init(&bio);
-	bio.bi_io_vec = &bio_vec;
+	bio_init_with_vec_table(&bio, &bio_vec, 1);
 	bio_vec.bv_page = page;
 	bio_vec.bv_len = size;
 	bio_vec.bv_offset = 0;
diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c
index e97b0acf7b8d..af9489087cd3 100644
--- a/drivers/md/bcache/io.c
+++ b/drivers/md/bcache/io.c
@@ -24,9 +24,7 @@ struct bio *bch_bbio_alloc(struct cache_set *c)
struct bbio *b = mempool_alloc(c->bio_meta, GFP_NOIO);
 	struct bio *bio = &b->bio;
 
-   bio_init(bio);
-   bio->bi_max_vecs = bucket_pages(c);
-   bio->bi_io_vec   = bio->bi_inline_vecs;
+   bio_init_with_vec_table(bio, bio->bi_inline_vecs, bucket_pages(c));
 
return bio;
 }
diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index 6925023e12d4..b966f28d1b98 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -448,13 +448,11 @@ static void do_journal_discard(struct cache *ca)
 
 	atomic_set(&ja->discard_in_flight, DISCARD_IN_FLIGHT);
 
-   bio_init(bio);
+   bio_init_with_vec_table(bio, bio->bi_inline_vecs, 1);
bio_set_op_attrs(bio, REQ_OP_DISCARD, 0);
bio->bi_iter.bi_sector  = bucket_to_sector(ca->set,
ca->sb.d[ja->discard_idx]);
bio->bi_bdev= ca->bdev;
-   bio->bi_max_vecs= 1;
-   bio->bi_io_vec  = bio->bi_inline_vecs;
bio->bi_iter.bi_size= bucket_bytes(ca);
bio->bi_end_io  = journal_discard_endio;
 
diff --git a/drivers/md/bcache/movinggc.c b/drivers/md/bcache/movinggc.c
index 5c4bddecfaf0..9d7991f69030 100644
--- a/drivers/md/bcache/movinggc.c
+++ b/drivers/md/bcache/movinggc.c
@@ -77,15 +77,14 @@ static void moving_init(struct moving_io *io)
 {
 	struct bio *bio = &io->bio.bio;
 
-   bio_init(bio);
+	bio_init_with_vec_table(bio, bio->bi_inline_vecs,
+				DIV_ROUND_UP(KEY_SIZE(&io->w->key),
+					     PAGE_SECTORS));
bio_get(bio);
bio_set_prio(bio, IOPRIO_PRIO_VALUE(IOPRIO_CLASS_IDLE, 0));
 
 	bio->bi_iter.bi_size	= KEY_SIZE(&io->w->key) << 9;
-	bio->bi_max_vecs	= DIV_ROUND_UP(KEY_SIZE(&io->w->key),
-					       PAGE_SECTORS);
 	bio->bi_private		= &io->cl;
-	bio->bi_io_vec		= bio->bi_inline_vecs;
bch_bio_map(bio, NULL);
 }
 
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 849ad441cd76..d8a6d807b498 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1152,9 +1152,7 @@ static void register_bdev(struct cache_sb *sb, struct page *sb_page,
 	dc->bdev = bdev;
 	dc->bdev->bd_holder = dc;
 
-	bio_init(&dc->sb_bio);
-	dc->sb_bio.bi_max_vecs	= 1;
-	dc->sb_bio.bi_io_vec	= dc->sb_bio.bi_inline_vecs;
+	bio_init_with_vec_table(&dc->sb_bio, dc->sb_bio.bi_inline_vecs, 1);
dc->sb_bio.bi_io_vec[0].bv_page = sb_page;
get_page(sb_page);
 
@@ -1814,9 +1812,8 @@ static int cache_alloc(struct cache *ca)
__module_get(THIS_MODULE);
 	kobject_init(&ca->kobj, &bch_cache_ktype);
 
-	bio_init(&ca->journal.bio);
-	ca->journal.bio.bi_max_vecs = 8;
-	ca->journal.bio.bi_io_vec = ca->journal.bio.bi_inline_vecs;
+	bio_init_with_vec_table(&ca->journal.bio,
+				ca->journal.bio.bi_inline_vecs, 8);
 
free = roundup_pow_of_two(ca->sb.nbuckets) >> 10;
 
@@ -1852,9 +1849,7 @@ static int register_cache(struct cache_sb *sb, struct page *sb_page,
ca->bdev = bdev;
ca->bdev->bd_holder = ca;
 
-	bio_init(&ca->sb_bio);
-	ca->sb_bio.bi_max_vecs	= 1;
-	ca->sb_bio.bi_io_vec	= ca->sb_bio.bi_inline_vecs;
+	bio_init_with_vec_table(&ca->sb_bio, ca->sb_bio.bi_inline_vecs, 1);
ca->sb_bio.bi_io_vec[0].bv_page = sb_page;
get_page(sb_page);
 
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index e51644e503a5..b2568cef8c86 100644
--- 

[dm-devel] [PATCH 00/60] block: support multipage bvec

2016-10-31 Thread Ming Lei
Hi,

This patchset brings multipage bvecs into the block layer. Basic
xfstests (-a auto) over virtio-blk/virtio-scsi have been run and no
regressions were found, so it should be good enough to show the
approach now. Any comments are welcome!

1) what is multipage bvec?

Multipage bvecs mean that one 'struct bio_vec' can hold multiple
physically contiguous pages, instead of the single page that the
Linux kernel has used for a long time.
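
For reference, the structure itself does not change; what changes is how
much one entry may describe (the comments below are illustrative, not
taken from the patchset):

/* include/linux/bvec.h - unchanged by this series; with multipage
 * bvecs one entry may cover several physically contiguous pages. */
struct bio_vec {
	struct page	*bv_page;	/* first page of the segment */
	unsigned int	bv_len;		/* total bytes, may exceed PAGE_SIZE */
	unsigned int	bv_offset;	/* byte offset into the first page */
};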

2) why is multipage bvec introduced?

Kent proposed the idea[1] first. 

As system RAM becomes much bigger than before, and at the same time
huge pages, transparent huge pages and memory compaction are widely
used, it is now quite easy to see physically contiguous pages in the
fs/block stack. On the other hand, from the block layer's view, it
isn't necessary to store the intermediate pages in a bvec; it is
enough to just store the physically contiguous 'segment'.

Also, huge pages are being brought to filesystems[2], and we can do
I/O a hugepage at a time[3], which requires that one bio can transfer
at least one huge page at a time. It turns out that simply changing
BIO_MAX_PAGES isn't flexible enough[3]. Multipage bvecs can fit this
case very well.

With multipage bvecs:

- bio size can be increased, which should improve some high-bandwidth
I/O cases in theory[4].

- Inside the block layer, both bio splitting and sg mapping can become
more efficient than before, by traversing the physically contiguous
'segment' instead of each page.

- There is the possibility of improving the memory footprint of bvec
usage in the future.

3) how is multipage bvec implemented in this patchset?

The 1st 22 patches cleanup on direct access to bvec table,
and comments on some special cases. With this approach,
most of cases are found as safe for multipage bvec,
only fs/buffer, pktcdvd, dm-io, MD and btrfs need to deal
with.

Given a little more work is involved to cleanup pktcdvd,
MD and btrfs, this patchset introduces QUEUE_FLAG_NO_MP for
them, and these components can still see/use singlepage bvec.
In the future, once the cleanup is done, the flag can be killed.

The 2nd part (23 ~ 60) implements multipage bvecs in the block layer:

- put all the tricks into the bvec/bio/rq iterators; as long as
drivers and filesystems use these standard iterators, they are happy
with multipage bvecs

- bio_for_each_segment_all() changes
this helper passes a pointer to each bvec directly to the user, so
it has to be changed. Two new helpers (bio_for_each_segment_all_rd()
and bio_for_each_segment_all_wt()) are introduced.

- bio_clone() changes
by default bio_clone() still clones a new bio the multipage-bvec
way. A single page version of bio_clone() is also introduced for
some special cases where only singlepage bvecs may be used for the
new cloned bio (bio bounce, ...)

These patches can be found in the following git tree:

https://github.com/ming1/linux/tree/mp-bvec-0.3-v4.9

Thanks Christoph for looking at the early version and providing very
good suggestions, such as introducing bio_init_with_vec_table(),
removing other unnecessary helpers, and so on.

TODO:
- clean up direct access to the bvec table for MD & btrfs


[1], http://marc.info/?l=linux-kernel&m=141680246629547&w=2
[2], http://lwn.net/Articles/700781/
[3], http://marc.info/?t=14773544711&r=1&w=2
[4], http://marc.info/?l=linux-mm&m=147745525801433&w=2


Ming Lei (60):
  block: bio: introduce bio_init_with_vec_table()
  block drivers: convert to bio_init_with_vec_table()
  block: drbd: remove impossible failure handling
  block: floppy: use bio_add_page()
  target: avoid to access .bi_vcnt directly
  bcache: debug: avoid to access .bi_io_vec directly
  dm: crypt: use bio_add_page()
  dm: use bvec iterator helpers to implement .get_page and .next_page
  dm: dm.c: replace 'bio->bi_vcnt == 1' with !bio_multiple_segments
  fs: logfs: convert to bio_add_page() in sync_request()
  fs: logfs: use bio_add_page() in __bdev_writeseg()
  fs: logfs: use bio_add_page() in do_erase()
  fs: logfs: remove unnecessary check
  block: drbd: comment on direct access bvec table
  block: loop: comment on direct access to bvec table
  block: pktcdvd: comment on direct access to bvec table
  kernel/power/swap.c: comment on direct access to bvec table
  mm: page_io.c: comment on direct access to bvec table
  fs/buffer: comment on direct access to bvec table
  f2fs: f2fs_read_end_io: comment on direct access to bvec table
  bcache: comment on direct access to bvec table
  block: comment on bio_alloc_pages()
  block: introduce flag QUEUE_FLAG_NO_MP
  md: set NO_MP for request queue of md
  block: pktcdvd: set NO_MP for pktcdvd request queue
  btrfs: set NO_MP for request queues behind BTRFS
  block: introduce BIO_SP_MAX_SECTORS
  block: introduce QUEUE_FLAG_SPLIT_MP
  dm: limit the max bio size as BIO_SP_MAX_SECTORS << SECTOR_SHIFT
  bcache: set flag of QUEUE_FLAG_SPLIT_MP
  block: introduce multipage/single page bvec helpers
  block: implement sp version of bvec iterator helpers
  block: introduce bio_for_each_segment_mp()
  block: introduce 

[dm-devel] [PATCH 29/60] dm: limit the max bio size as BIO_SP_MAX_SECTORS << SECTOR_SHIFT

2016-10-31 Thread Ming Lei
For BIO-based DM, some targets aren't ready to deal with incoming
bios bigger than 1 Mbyte, such as the crypt and log write targets.

Signed-off-by: Ming Lei 
---
 drivers/md/dm.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index ef7bf1dd6900..ce454c6c1a4e 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -899,7 +899,16 @@ int dm_set_target_max_io_len(struct dm_target *ti, sector_t len)
return -EINVAL;
}
 
-   ti->max_io_len = (uint32_t) len;
+	/*
+	 * A BIO-based queue uses its own splitting. When multipage bvecs
+	 * are switched on, the size of an incoming bio may be too big
+	 * to be handled by some targets, such as crypt and log write.
+	 *
+	 * When these targets are ready for the big bio, we can remove
+	 * the limit.
+	 */
+	ti->max_io_len = min_t(uint32_t, len,
+			       BIO_SP_MAX_SECTORS << SECTOR_SHIFT);
 
return 0;
 }
-- 
2.7.4

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 08/60] dm: use bvec iterator helpers to implement .get_page and .next_page

2016-10-31 Thread Ming Lei
Firstly, we have mature bvec/bio iterator helpers for iterating over
each page in a bio; there is no need to reinvent the wheel to do that.

Secondly, the coming multipage bvec support requires this patch.

Also add comments about the direct access to the bvec table.

Signed-off-by: Ming Lei 
---
 drivers/md/dm-io.c | 34 --
 1 file changed, 24 insertions(+), 10 deletions(-)
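
The two helpers the patch builds on, bvec_iter_bvec() and
bvec_iter_advance(), come from the standard bvec iterator API; the
general pattern looks like this - a sketch of the idiom, not the dm-io
code itself:

/* Generic walk over a bvec table driven by a bvec_iter; dm-io adopts
 * this pattern instead of stepping through the table by hand with
 * context_ptr/context_u. */
struct bvec_iter iter = bio->bi_iter;

while (iter.bi_size) {
	struct bio_vec bv = bvec_iter_bvec(bio->bi_io_vec, iter);

	/* ... use bv.bv_page, bv.bv_offset, bv.bv_len ... */
	bvec_iter_advance(bio->bi_io_vec, &iter, bv.bv_len);
}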

diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 0bf1a12e35fe..2ef573c220fc 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -162,7 +162,10 @@ struct dpages {
 struct page **p, unsigned long *len, unsigned *offset);
void (*next_page)(struct dpages *dp);
 
-   unsigned context_u;
+   union {
+   unsigned context_u;
+   struct bvec_iter context_bi;
+   };
void *context_ptr;
 
void *vma_invalidate_address;
@@ -204,25 +207,36 @@ static void list_dp_init(struct dpages *dp, struct page_list *pl, unsigned offset)
 static void bio_get_page(struct dpages *dp, struct page **p,
 unsigned long *len, unsigned *offset)
 {
-   struct bio_vec *bvec = dp->context_ptr;
-   *p = bvec->bv_page;
-   *len = bvec->bv_len - dp->context_u;
-   *offset = bvec->bv_offset + dp->context_u;
+	struct bio_vec bv = bvec_iter_bvec((struct bio_vec *)dp->context_ptr,
+					   dp->context_bi);
+
+   *p = bv.bv_page;
+   *len = bv.bv_len;
+   *offset = bv.bv_offset;
+
+   /* avoid to figure out it in bio_next_page() again */
+   dp->context_bi.bi_sector = (sector_t)bv.bv_len;
 }
 
 static void bio_next_page(struct dpages *dp)
 {
-   struct bio_vec *bvec = dp->context_ptr;
-   dp->context_ptr = bvec + 1;
-   dp->context_u = 0;
+   unsigned int len = (unsigned int)dp->context_bi.bi_sector;
+
+	bvec_iter_advance((struct bio_vec *)dp->context_ptr,
+			  &dp->context_bi, len);
 }
 
 static void bio_dp_init(struct dpages *dp, struct bio *bio)
 {
dp->get_page = bio_get_page;
dp->next_page = bio_next_page;
-   dp->context_ptr = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
-   dp->context_u = bio->bi_iter.bi_bvec_done;
+
+   /*
+* We just use bvec iterator to retrieve pages, so it is ok to
+* access the bvec table directly here
+*/
+   dp->context_ptr = bio->bi_io_vec;
+   dp->context_bi = bio->bi_iter;
 }
 
 /*
-- 
2.7.4

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 58/60] dm-crypt: convert to bio_for_each_segment_all_rd()

2016-10-31 Thread Ming Lei
Signed-off-by: Ming Lei 
---
 drivers/md/dm-crypt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 4999c7497f95..ed0f54e51638 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1034,8 +1034,9 @@ static void crypt_free_buffer_pages(struct crypt_config *cc, struct bio *clone)
 {
unsigned int i;
struct bio_vec *bv;
+   struct bvec_iter_all bia;
 
-   bio_for_each_segment_all(bv, clone, i) {
+   bio_for_each_segment_all_rd(bv, clone, i, bia) {
BUG_ON(!bv->bv_page);
mempool_free(bv->bv_page, cc->page_pool);
bv->bv_page = NULL;
-- 
2.7.4

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel