+ Bartlomiej
[...]
> So my conclusion is, let's start, as you suggested, by not completing
> the request in ->done() so as to maintain existing behavior. Then we
> can address optimizations on top, which will very likely involve
> changes to host drivers as well.
On Mon, Oct 23, 2017 at 3:08 PM, Bart Van Assche wrote:
> On Mon, 2017-10-23 at 09:41 -0600, dann frazier wrote:
>> (gdb) list *(sg_io+0x120)
>> 0x084e71a8 is in sg_io (./include/linux/uaccess.h:113).
>> 108 static inline unsigned long
>> 109
On Mon, 2017-10-23 at 09:41 -0600, dann frazier wrote:
> (gdb) list *(sg_io+0x120)
> 0x084e71a8 is in sg_io (./include/linux/uaccess.h:113).
> 108 static inline unsigned long
> 109 _copy_from_user(void *to, const void __user *from, unsigned long n)
> 110 {
> 111 unsigned
static inline bool nvme_req_needs_retry(struct request *req)
{
if (blk_noretry_request(req))
@@ -143,6 +204,11 @@ static inline bool nvme_req_needs_retry(struct request *req)
void nvme_complete_rq(struct request *req)
{
if (unlikely(nvme_req(req)->status &&
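The condition above is cut off; a hedged sketch of how the completion/retry
split plausibly continues (the requeue call and nvme_error_status() are
assumptions based on the surrounding code, not the quoted patch):

        void nvme_complete_rq(struct request *req)
        {
                /* Retry transient failures before reporting them upward. */
                if (unlikely(nvme_req(req)->status && nvme_req_needs_retry(req))) {
                        nvme_req(req)->retries++;
                        blk_mq_requeue_request(req, true);
                        return;
                }

                blk_mq_end_request(req, nvme_error_status(req));
        }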
On Mon, Oct 23, 2017 at 06:32:47PM +0300, Sagi Grimberg wrote:
>> struct nvme_queue *nvmeq = hctx->driver_data;
>> + printk_ratelimited("%s: called\n", __func__);
>> +
>
> This must be a left-over...
Indeed, it is a left-over debug statement.
Hi Ming,
On Fri, Oct 20, 2017 at 3:39 PM, Ming Lei wrote:
> On Wed, Oct 18, 2017 at 12:22:06PM +0200, Roman Pen wrote:
>> Hi all,
>>
>> the patch below fixes queue stalling when a shared hctx is marked for
>> restart (BLK_MQ_S_SCHED_RESTART bit) but q->shared_hctx_restart stays
On Fri, Oct 20, 2017 at 11:30:55PM +0000, Bart Van Assche wrote:
> On Fri, 2017-10-20 at 16:54 -0600, dann frazier wrote:
> > hey,
> > I'm seeing a regression when executing 'dmraid -r -c' in an arm64
> > QEMU guest, which I've bisected to the following commit:
> >
> > ca18d6f7 "block: Make
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 7735571ffc9a..bbece5edabff 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1050,6 +1050,8 @@ static int nvme_poll(struct blk_mq_hw_ctx *hctx, unsigned int tag)
{
struct nvme_queue *nvmeq =
On Fri, Oct 20, 2017 at 10:05 PM, Bart Van Assche wrote:
> On Fri, 2017-10-20 at 11:39 +0200, Roman Penyaev wrote:
>> But what bothers me is these looong loops inside blk_mq_sched_restart(),
>> and since you are the author of the original 6d8c6c0f97ad ("blk-mq: Restart
>>
This flag should be before the operation-specific REQ_NOUNMAP bit.
Signed-off-by: Christoph Hellwig
Reviewed-by: Sagi Grimberg
Reviewed-by: Hannes Reinecke
Reviewed-by: Johannes Thumshirn
---
include/linux/blk_types.h | 4 ++--
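Hedged sketch of the resulting layout in include/linux/blk_types.h; the
identity of the moved flag is not visible in this preview (REQ_NOWAIT is an
assumption):

        __REQ_NOWAIT,           /* assumption: the flag being moved up */

        /* command specific flags for REQ_OP_WRITE_ZEROES: */
        __REQ_NOUNMAP,          /* do not free blocks when zeroing */

        __REQ_NR_BITS,          /* stops here */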
Use the core chrdev code to set up the link between the character device
and the nvme controller. This allows us to get rid of the global list
of all controllers, and also ensures that we have both a reference to
the controller and the transport module before the open method of the
character
This is a much more sensible check than just the admin queue.
Signed-off-by: Christoph Hellwig
Reviewed-by: Sagi Grimberg
Reviewed-by: Johannes Thumshirn
---
drivers/nvme/host/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff
That way we can also poll non-blk-mq queues. Mostly needed for
the NVMe multipath code, but could also be useful elsewhere.
Signed-off-by: Christoph Hellwig
---
block/blk-core.c | 11 +++
block/blk-mq.c | 14 +-
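A minimal sketch of the idea, assuming the series adds a poll_fn callback to
struct request_queue and blk-mq registers its own implementation:

        /* Generic entry point: dispatch to the queue's poll_fn so that
         * stacked, non-blk-mq queues can implement polling as well. */
        bool blk_poll(struct request_queue *q, blk_qc_t cookie)
        {
                if (!q->poll_fn || !test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
                        return false;
                return q->poll_fn(q, cookie);
        }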
Now that we are protected against lookup vs free races for the namespace
by using kref_get_unless_zero we don't need the hack of NULLing out the
disk private data during removal.
Signed-off-by: Christoph Hellwig
Reviewed-by: Sagi Grimberg
Reviewed-by: Johannes
Hi all,
this series adds support for multipathing, that is, accessing nvme
namespaces through multiple controllers, to the nvme core driver.
It is a very thin and efficient implementation that relies on
close cooperation with other bits of the nvme driver, and a few small
and simple block helpers.
Introduce a new struct nvme_ns_head that holds information about an actual
namespace, unlike struct nvme_ns, which only holds the per-controller
namespace information. For private namespaces there is a 1:1 relation of
the two, but for shared namespaces this lets us discover all the paths to
it.
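A hedged sketch of the resulting split (field names are assumptions):

        /* Per-subsystem state of an actual namespace, shared by all paths. */
        struct nvme_ns_head {
                struct list_head        list;   /* all nvme_ns paths to it */
                unsigned                ns_id;
                struct kref             ref;
        };

        /* Per-controller view of that namespace: one instance per path. */
        struct nvme_ns {
                struct nvme_ctrl        *ctrl;
                struct nvme_ns_head     *head;  /* the shared state above */
                /* ... remaining per-controller fields ... */
        };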
The hidden gendisks introduced in the next patch need to keep the dev
field in their struct device empty so that udev won't try to create
block device nodes for them. To support that, rewrite disk_devt to
look at the major and first_minor fields in the gendisk itself instead
of looking into the
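The rewrite plausibly reduces to reading the two fields straight from the
gendisk:

        static inline dev_t disk_devt(struct gendisk *disk)
        {
                /* Derive the dev_t from the gendisk itself rather than
                 * disk_to_dev(disk)->devt, which hidden gendisks keep empty. */
                return MKDEV(disk->major, disk->first_minor);
        }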
We do this by adding a helper that returns the ns_head for a device that
can belong to either the per-controller or per-subsystem block device
nodes, and otherwise reuse all the existing code.
Signed-off-by: Christoph Hellwig
Reviewed-by: Keith Busch
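A sketch of what such a helper could look like; the fops check and the
nvme_get_ns_from_dev() call are assumptions, not necessarily the actual
patch:

        static struct nvme_ns_head *dev_to_ns_head(struct device *dev)
        {
                struct gendisk *disk = dev_to_disk(dev);

                /* Per-controller node: reach the shared head via the ns. */
                if (disk->fops == &nvme_fops)
                        return nvme_get_ns_from_dev(dev)->head;
                /* Per-subsystem node: the head is the disk's private data. */
                return disk->private_data;
        }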
For kref_get_unless_zero to protect against lookup vs free races we need
to use it in all places where we aren't guaranteed to already hold a
reference. There is no such guarantee in nvme_find_get_ns, so switch to
kref_get_unless_zero in this function.
Signed-off-by: Christoph Hellwig
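A sketch of the lookup with the race closed, assuming the list is walked
under the namespaces mutex:

        static struct nvme_ns *nvme_find_get_ns(struct nvme_ctrl *ctrl,
                        unsigned nsid)
        {
                struct nvme_ns *ns, *ret = NULL;

                mutex_lock(&ctrl->namespaces_mutex);
                list_for_each_entry(ns, &ctrl->namespaces, list) {
                        if (ns->ns_id == nsid) {
                                /* Skip namespaces already being freed. */
                                if (!kref_get_unless_zero(&ns->kref))
                                        continue;
                                ret = ns;
                                break;
                        }
                }
                mutex_unlock(&ctrl->namespaces_mutex);
                return ret;
        }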
Instead of allocating a separate struct device for the character device
handle, embed it into struct nvme_ctrl and use it for the main controller
refcounting. This removes double refcounting and gets us an automatic
reference for the character device operations. We keep ctrl->device as a
pointer
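A sketch of the embedding; recovering the controller in the file operations
then becomes a plain container_of() (member names assumed):

        struct nvme_ctrl {
                /* ... */
                struct device   ctrl_device;    /* embedded; owns the refcount */
                struct device   *device;        /* kept as &ctrl_device alias */
                struct cdev     cdev;
        };

        static int nvme_dev_open(struct inode *inode, struct file *file)
        {
                struct nvme_ctrl *ctrl =
                        container_of(inode->i_cdev, struct nvme_ctrl, cdev);

                file->private_data = ctrl;
                return 0;
        }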
This patch adds native multipath support to the nvme driver. For each
namespace we create only a single block device node, which can be used
to access that namespace through any of the controllers that refer to it.
The gendisk for each controller's path to the namespace still exists
inside the
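A heavily hedged sketch of how I/O to the shared node could fan out over a
path; nvme_find_path() is a hypothetical path selector:

        static blk_qc_t nvme_ns_head_make_request(struct request_queue *q,
                        struct bio *bio)
        {
                struct nvme_ns_head *head = q->queuedata;
                struct nvme_ns *ns = nvme_find_path(head);      /* hypothetical */

                if (likely(ns)) {
                        /* Retarget the bio at the chosen path's gendisk. */
                        bio->bi_disk = ns->disk;
                        return direct_make_request(bio);
                }

                bio_io_error(bio);
                return BLK_QC_T_NONE;
        }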
This allows us to manage the various unique namespace identifiers
together instead of needing various variables and arguments.
Signed-off-by: Christoph Hellwig
Reviewed-by: Keith Busch
Reviewed-by: Sagi Grimberg
---
drivers/nvme/host/core.c
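Presumably a small container for the three identifier types reported by
Identify, along the lines of (sketch):

        struct nvme_ns_ids {
                u8      eui64[8];       /* IEEE EUI-64 */
                u8      nguid[16];      /* namespace globally unique id */
                uuid_t  uuid;           /* from the ns id descriptor list */
        };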
With this flag a driver can create a gendisk that can be used for I/O
submission inside the kernel, but which is not registered as a
user-facing block device. This will be useful for the NVMe multipath
implementation.
Signed-off-by: Christoph Hellwig
---
block/genhd.c | 57
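Usage would plausibly be a single flag set before registration;
GENHD_FL_HIDDEN as the flag name is an assumption from the description:

        /* The gendisk can do I/O but never shows up in /dev or in udev. */
        disk->flags |= GENHD_FL_HIDDEN;         /* assumed flag name */
        device_add_disk(parent_dev, disk);      /* parent_dev: hypothetical */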
This helper allows reinserting a bio into a new queue without much
overhead, but requires all queue limits to be the same for the upper
and lower queues, and it does not provide any recursion prevention.
Signed-off-by: Christoph Hellwig
Reviewed-by: Sagi Grimberg
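A hedged usage sketch, assuming the helper is a direct_make_request() that
takes a bio already retargeted at the lower queue:

        /* Reissue a bio on a lower queue with identical limits; there is
         * no splitting and no recursion protection, so the caller must
         * guarantee both. lower_disk is hypothetical. */
        bio->bi_disk = lower_disk;
        direct_make_request(bio);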
Set aside a bit in the request/bio flags for driver use.
Signed-off-by: Christoph Hellwig
Reviewed-by: Sagi Grimberg
Reviewed-by: Hannes Reinecke
Reviewed-by: Johannes Thumshirn
---
include/linux/blk_types.h | 5 +
1 file
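Claiming the bit in a driver is then a one-line define; the NVMe name below
is an assumption:

        /* Repurpose the driver-private bit for multipath bookkeeping. */
        #define REQ_NVME_MPATH  REQ_DRV

        bio->bi_opf |= REQ_NVME_MPATH;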
This helper allows stealing the uncompleted bios from a request so
that they can be reissued on another path.
Signed-off-by: Christoph Hellwig
Reviewed-by: Sagi Grimberg
---
block/blk-core.c | 20
include/linux/blkdev.h | 2 ++
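A sketch of the splice, assuming the usual bio_list head/tail layout:

        void blk_steal_bios(struct bio_list *list, struct request *rq)
        {
                if (rq->bio) {
                        /* Splice the request's bio chain onto the list. */
                        if (list->tail)
                                list->tail->bi_next = rq->bio;
                        else
                                list->head = rq->bio;
                        list->tail = rq->biotail;
                }

                /* The request no longer owns any data. */
                rq->bio = NULL;
                rq->__data_len = 0;
        }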
On 20/10/17 15:30, Adrian Hunter wrote:
> On 19/10/17 14:44, Adrian Hunter wrote:
>> On 18/10/17 09:16, Adrian Hunter wrote:
>>> On 11/10/17 16:58, Ulf Hansson wrote:
On 11 October 2017 at 14:58, Adrian Hunter wrote:
> On 11/10/17 15:13, Ulf Hansson wrote:
>>
On Fri, Oct 20, 2017 at 04:45:23PM +0200, Christoph Hellwig wrote:
> We need to look for an active PM request until the next softbarrier
> instead of looking for the first non-PM request. Otherwise any cause
> of request reordering might starve the PM request(s).
Hi Christoph,
Could you share
On Sun, Oct 22, 2017 at 01:47:54PM +0000, Israel Rukshin wrote:
> Currently, blk_mq_tagset_iter() iterates over the initial hctx tags only.
> In case a scheduler is used, it doesn't iterate over the hctx scheduler
> tags and the static requests aren't updated.
> For example, while using NVMe over Fabrics
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index c81b40e..c290de0 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -322,6 +322,22 @@ int blk_mq_tagset_iter(struct blk_mq_tag_set *set, void *data,
}
}
+ for (i = 0; i < set->nr_hw_queues; i++) {
+
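The added loop is truncated above; it plausibly continues along these lines
(the sched_tags field and the out label are assumptions taken from the
description):

        for (i = 0; i < set->nr_hw_queues; i++) {
                struct blk_mq_tags *tags = set->sched_tags[i];  /* assumed */
                int j;

                if (!tags)
                        continue;
                /* Also pass the scheduler's static requests to fn(). */
                for (j = 0; j < tags->nr_tags; j++) {
                        if (!tags->static_rqs[j])
                                continue;
                        ret = fn(data, tags->static_rqs[j]);
                        if (ret)
                                goto out;
                }
        }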
@@ -441,6 +442,8 @@ static int blk_mq_sched_alloc_tags(struct request_queue *q,
if (!hctx->sched_tags)
return -ENOMEM;
+ set->sched_tags[hctx_idx] = hctx->sched_tags;
+
ret = blk_mq_alloc_rqs(set, hctx->sched_tags, hctx_idx, q->nr_requests);
if (ret)
On Mon, Oct 23, 2017 at 10:27:29AM +0300, Sagi Grimberg wrote:
> Note that it does introduce a new spinlock to our hot-path, but given
> the current over-allocation scheme with schedulers, it's probably better
> off.
We could look into llists if it matters.
Currently, blk_mq_tagset_iter() iterates over the initial hctx tags only.
In case a scheduler is used, it doesn't iterate over the hctx scheduler
tags and the static requests aren't updated.
For example, while using an NVMe over Fabrics RDMA host, this causes us not to
reinit the scheduler requests and
On Sun, Oct 22, 2017 at 11:36:30PM -0700, Christoph Hellwig wrote:
> On Mon, Oct 23, 2017 at 08:53:35AM +0900, Byungchul Park wrote:
> > On Fri, Oct 20, 2017 at 07:44:51AM -0700, Christoph Hellwig wrote:
> > > The Subject prefix for this should be "block:".
> > >
> > > > @@ -945,7 +945,7 @@ int
On Sun, 2017-10-22 at 13:47 +0000, Israel Rukshin wrote:
> @@ -441,6 +442,8 @@ static int blk_mq_sched_alloc_tags(struct request_queue *q,
> if (!hctx->sched_tags)
> return -ENOMEM;
>
> + set->sched_tags[hctx_idx] = hctx->sched_tags;
> +
> ret =
On Mon, Oct 23, 2017 at 08:53:35AM +0900, Byungchul Park wrote:
> On Fri, Oct 20, 2017 at 07:44:51AM -0700, Christoph Hellwig wrote:
> > The Subject prefix for this should be "block:".
> >
> > > @@ -945,7 +945,7 @@ int submit_bio_wait(struct bio *bio)
> > > {
> > > struct submit_bio_ret ret;
>
Guan,
If per-controller block device nodes are hidden, how can user-space tools
such as multipath-tools and nvme-cli (if it supports this) know the status
of each path of the multipath device?
if at all, the path state is reflected on the controller class device
node, not on the namespace block
On Mon, Oct 23, 2017 at 02:16:03AM -0400, Martin K. Petersen wrote:
>
> Benjamin,
>
> >> Not sure it's worth it especially now that Martin has merged the patch.
> >
> > He did? I only saw a mail that he picked patches 2-5. So all the bsg
> > changes are still open I think.
>
> Yes, I expected
On Sun, Oct 22, 2017 at 09:32:00PM +0300, Sagi Grimberg wrote:
>
>> Currently, blk_mq_tagset_iter() iterates over the initial hctx tags only.
>> In case a scheduler is used, it doesn't iterate over the hctx scheduler
>> tags and the static requests aren't updated.
>> For example, while using NVMe over
Benjamin,
>> Not sure it's worth it especially now that Martin has merged the patch.
>
> He did? I only saw a mail that he picked patches 2-5. So all the bsg
> changes are still open I think.
Yes, I expected the bsg bits to go through Jens' tree.
--
Martin K. Petersen Oracle Linux