On 10/01/2018 4:51 PM, tang.jun...@zte.com.cn wrote:
> From: Tang Junhui
>
> After long time run of random small IO writing,
> reboot the machine, and after the machine power on,
> bcache got stuck, the stack is:
> [root@ceph153 ~]# cat /proc/2510/task/*/stack
> [] closure_sync+0x25/0x90 [bcache]
On Tue, Jan 16, 2018 at 9:40 AM, Ming Lei wrote:
> On Mon, Jan 15, 2018 at 10:29:46AM -0700, Jens Axboe wrote:
>> On 1/15/18 9:58 AM, Ming Lei wrote:
>> > No functional change, just to clean up code a bit, so that the following
>> > change of using direct issue for blk_mq_request_bypass_insert() w
On Tue, Jan 16, 2018 at 7:13 AM, Mike Snitzer wrote:
> On Mon, Jan 15 2018 at 6:10P -0500,
> Mike Snitzer wrote:
>
>> On Mon, Jan 15 2018 at 5:51pm -0500,
>> Bart Van Assche wrote:
>>
>> > On Mon, 2018-01-15 at 17:15 -0500, Mike Snitzer wrote:
>> > > sysfs write op calls kernfs_fop_write which
On Mon, Jan 15, 2018 at 09:40:36AM -0800, Christoph Hellwig wrote:
> On Tue, Jan 16, 2018 at 12:03:43AM +0800, Ming Lei wrote:
> > Hi,
> >
> > These two patches fix the IO hang issue reported by Laurence.
> >
> > 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
> > may cause one
On Mon, Jan 15, 2018 at 12:43:44PM -0500, Mike Snitzer wrote:
> On Mon, Jan 15 2018 at 11:58am -0500,
> Ming Lei wrote:
>
> > Hi Guys,
> >
> > The 3 patches change the blk-mq part of blk_insert_cloned_request(),
> > in which we switch to blk_mq_try_issue_directly(), so that both dm-rq
> > and bl
On Mon, Jan 15 2018 at 8:43pm -0500,
Ming Lei wrote:
> On Mon, Jan 15, 2018 at 02:41:12PM -0500, Mike Snitzer wrote:
> > On Mon, Jan 15 2018 at 12:29pm -0500,
> > Jens Axboe wrote:
> >
> > > On 1/15/18 9:58 AM, Ming Lei wrote:
> > > > No functional change, just to clean up code a bit, so that
On Mon, Jan 15, 2018 at 02:41:12PM -0500, Mike Snitzer wrote:
> On Mon, Jan 15 2018 at 12:29pm -0500,
> Jens Axboe wrote:
>
> > On 1/15/18 9:58 AM, Ming Lei wrote:
> > > No functional change, just to clean up code a bit, so that the following
> > > change of using direct issue for blk_mq_request_
On Mon, Jan 15, 2018 at 10:29:46AM -0700, Jens Axboe wrote:
> On 1/15/18 9:58 AM, Ming Lei wrote:
> > No functional change, just to clean up code a bit, so that the following
> > change of using direct issue for blk_mq_request_bypass_insert() which is
> > needed by DM can be easier to do.
> >
> >
On Mon, Jan 15, 2018 at 12:15:47PM -0500, Mike Snitzer wrote:
> On Mon, Jan 15 2018 at 11:58am -0500,
> Ming Lei wrote:
>
> > No functional change, just to clean up code a bit, so that the following
> > change of using direct issue for blk_mq_request_bypass_insert() which is
> > needed by DM can
On Mon, Jan 15, 2018 at 06:43:47PM +0100, Thomas Gleixner wrote:
> On Tue, 16 Jan 2018, Ming Lei wrote:
> > These two patches fix the IO hang issue reported by Laurence.
> >
> > 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
> > may cause one irq vector assigned to all offline C
On Mon, Jan 15 2018 at 11:58am -0500,
Ming Lei wrote:
> blk_insert_cloned_request() is called in fast path of dm-rq driver, and
> in this function we append request to hctx->dispatch_list of the underlying
> queue directly.
>
> 1) This way isn't efficient enough because hctx lock is always requi
For the storage track, I would like to propose a topic for differentiated
blk-mq hardware contexts. Today, blk-mq considers all hardware contexts
equal, and are selected based on the software's CPU context. There are
use cases that benefit from having hardware context selection criteria
beyond whic
On Mon, Jan 15 2018 at 6:10P -0500,
Mike Snitzer wrote:
> On Mon, Jan 15 2018 at 5:51pm -0500,
> Bart Van Assche wrote:
>
> > On Mon, 2018-01-15 at 17:15 -0500, Mike Snitzer wrote:
> > > sysfs write op calls kernfs_fop_write which takes:
> > > of->mutex then kn->count#213 (no idea what that i
On Mon, Jan 15 2018 at 5:51pm -0500,
Bart Van Assche wrote:
> On Mon, 2018-01-15 at 17:15 -0500, Mike Snitzer wrote:
> > sysfs write op calls kernfs_fop_write which takes:
> > of->mutex then kn->count#213 (no idea what that is)
> > then q->sysfs_lock (via queue_attr_store)
> >
> > vs
> >
> >
On Mon, 2018-01-15 at 17:15 -0500, Mike Snitzer wrote:
> sysfs write op calls kernfs_fop_write which takes:
> of->mutex then kn->count#213 (no idea what that is)
> then q->sysfs_lock (via queue_attr_store)
>
> vs
>
> blk_unregister_queue takes:
> q->sysfs_lock then
> kernfs_mutex (via kernfs_rem
On Mon, Jan 15 2018 at 12:16pm -0500,
Bart Van Assche wrote:
> On Fri, 2018-01-12 at 10:06 -0500, Mike Snitzer wrote:
> > I'm submitting this v5 with more feeling now ;)
>
> Hello Mike,
>
> Have these patches been tested with lockdep enabled? The following appeared in
> the kernel log when afte
On Mon, Jan 15 2018 at 12:29pm -0500,
Jens Axboe wrote:
> On 1/15/18 9:58 AM, Ming Lei wrote:
> > No functional change, just to clean up code a bit, so that the following
> > change of using direct issue for blk_mq_request_bypass_insert() which is
> > needed by DM can be easier to do.
> >
> > Si
On Mon, Jan 15 2018 at 12:51pm -0500,
Bart Van Assche wrote:
> On Mon, 2018-01-15 at 12:48 -0500, Mike Snitzer wrote:
> > Do I need to do something more to enable lockdep aside from set
> > CONFIG_LOCKDEP_SUPPORT=y ?
>
> Hello Mike,
>
> I think you also need to set CONFIG_PROVE_LOCKING=y.
Ah o
On Mon, 2018-01-15 at 18:43 +0100, Thomas Gleixner wrote:
> On Tue, 16 Jan 2018, Ming Lei wrote:
> > These two patches fix the IO hang issue reported by Laurence.
> >
> > 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
> > may cause one irq vector assigned to all offline CPUs, th
On Mon, 2018-01-15 at 12:48 -0500, Mike Snitzer wrote:
> Do I need to do something more to enable lockdep aside from set
> CONFIG_LOCKDEP_SUPPORT=y ?
Hello Mike,
I think you also need to set CONFIG_PROVE_LOCKING=y.
Bart.
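
As a rough illustration of the exchange above: CONFIG_LOCKDEP_SUPPORT=y only says the architecture can support lockdep, while CONFIG_PROVE_LOCKING=y is what actually turns the lock-order checking on. The sketch below checks the text of a kernel .config for that. The function names and the sample config are made up for illustration, and where the config file lives on a given system is left to the reader.

```python
def enabled_options(config_text):
    """Return the set of CONFIG_* names set to 'y' in kernel .config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and value == "y":
            enabled.add(name)
    return enabled


def lockdep_checking_enabled(config_text):
    """True only if CONFIG_PROVE_LOCKING=y is present (hypothetical helper)."""
    return "CONFIG_PROVE_LOCKING" in enabled_options(config_text)


# Made-up sample: arch support is present, but checking is not enabled.
sample = """\
CONFIG_LOCKDEP_SUPPORT=y
# CONFIG_PROVE_LOCKING is not set
"""

if __name__ == "__main__":
    print(lockdep_checking_enabled(sample))
```

With the sample above the check reports that lock-order checking is off, which matches the situation described in the question.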
On Mon, Jan 15 2018 at 12:36pm -0500,
Bart Van Assche wrote:
> On Mon, 2018-01-15 at 12:29 -0500, Mike Snitzer wrote:
> > So you replied to v5, I emailed a v6 out for the relevant patch. Just
> > want to make sure you're testing with either Jens' latest tree or are
> > using my v6 that fixed blk
On Mon, Jan 15 2018 at 11:58am -0500,
Ming Lei wrote:
> Hi Guys,
>
> The 3 patches change the blk-mq part of blk_insert_cloned_request(),
> in which we switch to blk_mq_try_issue_directly(), so that both dm-rq
> and blk-mq can get the dispatch result of underlying queue, and with
> this informat
On Tue, 16 Jan 2018, Ming Lei wrote:
> These two patches fix the IO hang issue reported by Laurence.
>
> 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
> may cause one irq vector assigned to all offline CPUs, then this vector
> can't handle irq any more.
>
> The 1st patch moves
On Tue, Jan 16, 2018 at 12:03:43AM +0800, Ming Lei wrote:
> Hi,
>
> These two patches fix the IO hang issue reported by Laurence.
>
> 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
> may cause one irq vector assigned to all offline CPUs, then this vector
> can't handle irq any
On Mon, 2018-01-15 at 12:29 -0500, Mike Snitzer wrote:
> So you replied to v5, I emailed a v6 out for the relevant patch. Just
> want to make sure you're testing with either Jens' latest tree or are
> using my v6 that fixed blk_mq_unregister_dev() to require caller holds
> q->sysfs_lock ?
Hello M
On 1/15/18 9:58 AM, Ming Lei wrote:
> No functional change, just to clean up code a bit, so that the following
> change of using direct issue for blk_mq_request_bypass_insert() which is
> needed by DM can be easier to do.
>
> Signed-off-by: Ming Lei
> ---
> block/blk-mq.c | 39 ++
On Fri, 2018-01-12 at 10:06 -0500, Mike Snitzer wrote:
> I'm submitting this v5 with more feeling now ;)
Hello Mike,
Have these patches been tested with lockdep enabled? The following appeared in
the kernel log after I started testing Jens' for-next tree of this morning:
===
On Mon, Jan 15 2018 at 11:58am -0500,
Ming Lei wrote:
> No functional change, just to clean up code a bit, so that the following
> change of using direct issue for blk_mq_request_bypass_insert() which is
> needed by DM can be easier to do.
>
> Signed-off-by: Ming Lei
> ---
> block/blk-mq.c | 3
On Mon, Jan 15, 2018 at 08:33:41AM -0700, Jens Axboe wrote:
> On 1/14/18 7:59 PM, Mike Snitzer wrote:
> > Hi Jens,
> >
> > I prepared this pull request in the hope that it may help you review and
> > stage these changes for 4.16.
> >
> > I went over Ming's changes again to refine the headers and
blk_insert_cloned_request() is called in fast path of dm-rq driver, and
in this function we append request to hctx->dispatch_list of the underlying
queue directly.
1) This way isn't efficient enough because hctx lock is always required
2) With blk_insert_cloned_request(), we bypass underlying que
No functional change, just to clean up code a bit, so that the following
change of using direct issue for blk_mq_request_bypass_insert() which is
needed by DM can be easier to do.
Signed-off-by: Ming Lei
---
block/blk-mq.c | 39 +++
1 file changed, 27 insertio
In the following patch, we will use blk_mq_try_issue_directly() for DM
to return the dispatch result, and DM needs this information to improve
IO merge.
Signed-off-by: Ming Lei
---
block/blk-mq.c | 23 ++-
1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/block/blk
Hi Guys,
The 3 patches change the blk-mq part of blk_insert_cloned_request(),
in which we switch to blk_mq_try_issue_directly(), so that both dm-rq
and blk-mq can get the dispatch result of underlying queue, and with
this information, blk-mq can handle IO merge much better, then
sequential I/O per
The annual Linux Storage, Filesystem and Memory Management (LSF/MM)
Summit for 2018 will be held from April 23-25 at the Deer Valley
Lodges in Park City, Utah. LSF/MM is an invitation-only technical
workshop to map out improvements to the Linux storage, filesystem and
memory management subsystems t
Add a check, before allocating resources for blk_trace, for whether it is already in use.
Signed-off-by: weiping zhang
---
kernel/trace/blktrace.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 987d9a9a..16c 100644
--- a/kernel/trace/blktrace.c
+++ b
84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
causes irq vector assigned to all offline CPUs, and IO hang is reported
on HPSA by Laurence.
This patch fixes this issue by trying best to make sure online CPU can be
assigned to irq vector. And take two steps to spread irq vector
This patch is preparing for doing two steps spread:
- spread vectors across non-online CPUs
- spread vectors across online CPUs
This way is applied for trying best to avoid allocating all offline CPUs
to one single vector.
No functional change, and code gets cleaned up too.
Cc:
Hi,
These two patches fix the IO hang issue reported by Laurence.
84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
may cause one irq vector assigned to all offline CPUs, then this vector
can't handle irq any more.
The 1st patch moves irq vectors spread into one function, and pre
On 1/15/18 8:52 AM, Mike Snitzer wrote:
> On Mon, Jan 15 2018 at 10:33am -0500,
> Jens Axboe wrote:
>
>> On 1/14/18 7:59 PM, Mike Snitzer wrote:
> ...
>>> Ming Lei (3):
>>> blk-mq: move actual issue into __blk_mq_issue_req helper
>>
>> I don't like this patch at all - it's a 10 line functio
On Mon, Jan 15 2018 at 10:33am -0500,
Jens Axboe wrote:
> On 1/14/18 7:59 PM, Mike Snitzer wrote:
...
> > Ming Lei (3):
> > blk-mq: move actual issue into __blk_mq_issue_req helper
>
> I don't like this patch at all - it's a 10 line function (if that)
> that ends up with three outputs, two
On Mon, Jan 15, 2018 at 10:25:01AM -0500, Mike Snitzer wrote:
> On Mon, Jan 15 2018 at 8:27am -0500,
> Stephen Rothwell wrote:
>
> > Hi all,
> >
> > Commit
> >
> > 34e1467da673 ("Revert "genirq/affinity: assign vectors to all possible
> > CPUs"")
> >
> > is missing a Signed-off-by from its
On 1/14/18 7:59 PM, Mike Snitzer wrote:
> Hi Jens,
>
> I prepared this pull request in the hope that it may help you review and
> stage these changes for 4.16.
>
> I went over Ming's changes again to refine the headers and code comments
> for clarity to help ease review and inclusion.
>
> I've
On Mon, Jan 15 2018 at 8:27am -0500,
Stephen Rothwell wrote:
> Hi all,
>
> Commit
>
> 34e1467da673 ("Revert "genirq/affinity: assign vectors to all possible
> CPUs"")
>
> is missing a Signed-off-by from its author and committer.
>
> Reverts are commits as well.
Right, I'm aware. I stage
On Mon, Jan 15 2018 at 7:46am -0500,
Lars Ellenberg wrote:
> As I understood it,
> blkdev_issue_zeroout() was supposed to "always try to unmap",
> deprovision, the relevant region, and zero-out any unaligned
> head or tail, just like my work around above was doing.
>
> And that device mapper t
On Sat, Jan 13, 2018 at 12:46:40AM +, Eric Wheeler wrote:
> Hello All,
>
> We just noticed that discards to DRBD devices backed by dm-thin devices
> are fully allocating the thin blocks.
>
> This behavior does not exist before
> ee472d83 block: add a flags argument to (__)blkdev_issue_zeroo