max_discard_bytes >> SECTOR_SHIFT;
> + q->limits.max_discard_sectors =
> + min_not_zero(q->limits.max_hw_discard_sectors,
> + q->limits.max_user_discard_sectors);
s/min_not_zero/min
Otherwise the whole series looks pretty good! And with that:
Reviewed-by: Keith Busch
On Wed, Jan 24, 2024 at 07:22:21PM +0800, yi sun wrote:
> In my case, I want all hw queues owned by this device to be clean.
> Because in the virtio device, each hw queue corresponds to a virtqueue,
> and all virtqueues will be deleted when vdev suspends.
>
> The blk_mq_tagset_wait_request_complet
On Mon, Jan 22, 2024 at 07:38:57PM +0100, Christoph Hellwig wrote:
> On Mon, Jan 22, 2024 at 11:27:15AM -0700, Keith Busch wrote:
> > > + q->limits.max_user_discard_sectors = max_discard_bytes >> SECTOR_SHIFT;
> > > + q->limits.max_discard_sectors
On Wed, Jan 24, 2024 at 07:59:54PM +0800, Ming Lei wrote:
> Requests are added to the plug list in reverse order, and both virtio-blk
> and nvme retrieve requests from the plug list in order, so requests are
> ultimately submitted to hardware in reverse order via nvme_queue_rqs() or
> virtio_queue_rqs(), see:
>
On Mon, Jan 22, 2024 at 07:07:21PM +0800, Yi Sun wrote:
> In some cases, it is necessary to wait for all requests to reach the
> completed state before performing other operations. Otherwise, these
> requests will never be processed successfully.
>
> For example, when the virtio device is in hiber
On Mon, Jan 22, 2024 at 06:36:35PM +0100, Christoph Hellwig wrote:
> @@ -174,23 +174,23 @@ static ssize_t queue_discard_max_show(struct
> request_queue *q, char *page)
> static ssize_t queue_discard_max_store(struct request_queue *q,
> const char *page, size_t
On Fri, Dec 08, 2023 at 10:00:36AM +0800, Ming Lei wrote:
> On Thu, Dec 07, 2023 at 12:31:05PM +0800, Li Feng wrote:
> > virtio-blk is generally used in cloud computing scenarios, where the
> > performance of virtual disks is very important. The mq-deadline scheduler
> > has a big performance drop
On Wed, Jan 25, 2023 at 02:34:36PM +0100, Christoph Hellwig wrote:
> @@ -363,8 +384,10 @@ void __swap_writepage(struct page *page, struct
> writeback_control *wbc)
> */
> if (data_race(sis->flags & SWP_FS_OPS))
> swap_writepage_fs(page, wbc);
> + else if (sis->flags
On Wed, Jan 25, 2023 at 02:34:31PM +0100, Christoph Hellwig wrote:
> -static inline int swap_readpage(struct page *page, bool do_poll,
> - struct swap_iocb **plug)
> +static inline void swap_readpage(struct page *page, bool do_poll,
> + struct swap_iocb **plu
mespace head.
Looks good, thank you.
Reviewed-by: Keith Busch
On Mon, Sep 27, 2021 at 03:00:32PM -0700, Luis Chamberlain wrote:
> + /*
> + * test_and_set_bit() is used because it is protecting against two nvme
> + * paths simultaneously calling device_add_disk() on the same namespace
> + * head.
> + */
> if (!test_and_set_bit(NVM
On Fri, Aug 27, 2021 at 12:18:02PM -0700, Luis Chamberlain wrote:
> @@ -479,13 +479,17 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl,
> struct nvme_ns_head *head)
> static void nvme_mpath_set_live(struct nvme_ns *ns)
> {
> struct nvme_ns_head *head = ns->head;
> + int rc;
>
>
On Mon, Oct 21, 2019 at 12:40:02PM +0900, Damien Le Moal wrote:
> This series is a couple of cleanup patches. The first one cleans up
> the helper function nvme_block_nr() using SECTOR_SHIFT instead of the
> hard-coded value 9 and clarifies the helper role by renaming it to
> nvme_sect_to_lba().
the request SGL.
>
> [chaitanya.kulka...@wdc.com: this was factored out of a patch
> originally authored by Chaitanya]
> Signed-off-by: Chaitanya Kulkarni
> Signed-off-by: Logan Gunthorpe
> Reviewed-by: Sagi Grimberg
> ---
Looks fine
Reviewed-by: Keith Busch
berg
> ---
Looks fine, but let's remove the version comments out of commit log if
we're applying this one.
Reviewed-by: Keith Busch
> drivers/nvme/target/core.c | 6 --
> drivers/nvme/target/nvmet.h | 2 +-
> 2 files changed, 5 insertions(+), 3 deletions(-)
>
> dif
target
> passthru feature.
>
> Signed-off-by: Chaitanya Kulkarni
> Signed-off-by: Logan Gunthorpe
> Reviewed-by: Sagi Grimberg
Looks fine.
Reviewed-by: Keith Busch
to obtain a pointer to the struct nvme_ctrl. If the fops of the
> file do not match, -EINVAL is returned.
>
> The purpose of this function is to support NVMe-OF target passthru.
>
> Signed-off-by: Logan Gunthorpe
> Reviewed-by: Max Gurtovoy
> Reviewed-by: Sagi Grimberg
Looks fine.
Reviewed-by: Keith Busch
On Sun, Sep 08, 2019 at 10:22:50PM -0400, Martin K. Petersen wrote:
>
> Max,
>
> > Only type 1 and type 2 have a reference tag by definition.
>
> DIX Type 0 needs remapping so this assertion is not correct.
At least for nvme, type 0 means you have metadata but not for protection
information, s
On Wed, Sep 04, 2019 at 07:27:32PM +0300, Max Gurtovoy wrote:
> + if (blk_integrity_rq(req) && req_op(req) == REQ_OP_READ &&
> + error == BLK_STS_OK)
> + t10_pi_complete(req,
> + nr_bytes >> blk_integrity_interval_shift(req->q));
This is not created by y
On Wed, Aug 14, 2019 at 11:29:17AM -0700, Guilherme G. Piccoli wrote:
> It is a suggestion from my colleague Dan (CCed here), something like:
> for non-multipath nvme, keep nvmeC and nvmeCnN (C=controller ida,
> N=namespace); for multipath nvme, use nvmeScCnN (S=subsystem ida).
This will inevitabl
On Wed, Aug 14, 2019 at 09:18:22AM -0700, Guilherme G. Piccoli wrote:
> On 14/08/2019 13:06, Keith Busch wrote:
> > On Wed, Aug 14, 2019 at 07:28:36AM -0700, Guilherme G. Piccoli wrote:
> >>[...]
> >
> > The subsystem lifetime is not tied to a single controlle
On Wed, Aug 14, 2019 at 07:28:36AM -0700, Guilherme G. Piccoli wrote:
> Since after the introduction of NVMe multipath, we have a struct to
> track subsystems, and more important, we have now the nvme block device
> name bound to the subsystem id instead of ctrl->instance as before.
> This is not a
case.
>
> So still create an affinity mask for a single vector, since
> irq_create_affinity_masks()
> is capable of handling that.
Hi Ming,
Looks good to me.
Reviewed-by: Keith Busch
> ---
> kernel/irq/affinity.c | 6 ++
> 1 file changed, 2 insertions(+), 4 deletions(
On Mon, Aug 05, 2019 at 10:52:07AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit: e21a712a Linux 5.3-rc3
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=10cf349a60
> kernel config: https://syzkaller.appspot.co
s is part a tree-wide conversion, as described in commit fc1d8e7cca2d
> ("mm: introduce put_user_page*(), placeholder versions").
>
> Cc: Dan Carpenter
> Cc: Greg Kroah-Hartman
> Cc: Keith Busch
> Cc: Kirill A. Shutemov
> Cc: Michael S. Tsirkin
> Cc: YueHaibing
On Thu, Jul 25, 2019 at 02:28:28PM -0600, Logan Gunthorpe wrote:
>
>
> On 2019-07-25 1:58 p.m., Keith Busch wrote:
> > On Thu, Jul 25, 2019 at 11:54:18AM -0600, Logan Gunthorpe wrote:
> >>
> >>
> >> On 2019-07-25 11:50 a.m., Matthew Wilcox wrote:
>
On Thu, Jul 25, 2019 at 11:54:18AM -0600, Logan Gunthorpe wrote:
>
>
> On 2019-07-25 11:50 a.m., Matthew Wilcox wrote:
> > On Thu, Jul 25, 2019 at 11:23:23AM -0600, Logan Gunthorpe wrote:
> >> nvme_get_by_path() is analogous to blkdev_get_by_path() except it
> >> gets a struct nvme_ctrl from the
On Tue, Jul 23, 2019 at 12:38:04PM +0800, Ming Lei wrote:
> From the IO trace, the discard command (nvme_cmd_dsm) failed:
>
> kworker/15:1H-462 [015] 91814.342452: nvme_setup_cmd: nvme0:
> disk=nvme0n1, qid=7, cmdid=552, nsid=1, flags=0x0, meta=0x0,
> cmd=(nvme_cmd_dsm nr=0, attributes=4)
PI.
> >
> > So don't abort a request if it is marked as completed; otherwise
> > we may abort a normally completed request.
> >
> > Cc: Max Gurtovoy
> > Cc: Sagi Grimberg
> > Cc: Keith Busch
> > Cc: Christoph Hellwig
> > Signed-off-
On Thu, May 23, 2019 at 04:10:54PM +0200, Christoph Hellwig wrote:
> On Thu, May 23, 2019 at 07:23:04AM -0600, Keith Busch wrote:
> > > Figure 49: Asynchronous Event Information - Notice
> > >
> > > 1hFirmware Activation Starting: The controller
On Thu, May 23, 2019 at 03:19:54AM -0700, Ming Lei wrote:
> On Wed, May 22, 2019 at 11:48:12AM -0600, Keith Busch wrote:
> > @@ -3605,6 +3606,11 @@ static void nvme_fw_act_work(struct work_struct
> > *work)
> > msecs_to_jiffies
On Thu, May 23, 2019 at 03:13:11AM -0700, Christoph Hellwig wrote:
> On Wed, May 22, 2019 at 09:48:10PM -0600, Keith Busch wrote:
> > Yeah, that's a good question. A FW update may have been initiated out
> > of band or from another host entirely. The driver can't c
On Wed, May 22, 2019, 9:29 PM Ming Lei wrote:
>
> On Wed, May 22, 2019 at 11:48:10AM -0600, Keith Busch wrote:
> > Hardware may temporarily stop processing commands that have
> > been dispatched to it while activating new firmware. Some target
> > implementation's
On Wed, May 22, 2019 at 10:20:45PM +0200, Bart Van Assche wrote:
> On 5/22/19 7:48 PM, Keith Busch wrote:
> > Hardware may temporarily stop processing commands that have
> > been dispatched to it while activating new firmware. Some target
> > implementation's paused stat
when the hardware
exits that state. This action applies to IO and admin queues.
Signed-off-by: Keith Busch
---
drivers/nvme/host/core.c | 20
1 file changed, 20 insertions(+)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 1b7c2afd84cb..37a9a66ada22 1
jiffies so that time accrued during the paused state doesn't
count against that request.
Signed-off-by: Keith Busch
---
block/blk-mq.c | 30 ++
include/linux/blk-mq.h | 1 +
2 files changed, 31 insertions(+)
diff --git a/block/blk-mq.c b/block/blk-mq.c
will time out, and handling this may interrupt
the firmware activation.
This two-part series provides a way for drivers to reset dispatched
requests' timeout deadline, then uses this new mechanism from the nvme
driver's fw activation work.
Keith Busch (2):
blk-mq: provide way to rese
uring
configure. If the option is set, fio can test THP when used with private
anonymous memory (i.e. mmap /dev/zero).
Signed-off-by: Keith Busch
---
v1 -> v2:
Added a 'configure' check for MADV_HUGEPAGE support rather than just
considering only OS Linux
Fixed cases when MADV_HUGEPAG
On Wed, Apr 17, 2019 at 04:01:28PM -0600, Jens Axboe wrote:
> On 4/17/19 3:28 PM, Keith Busch wrote:
> > The transparent hugepage Linux-specific memory advisory has potentially
> > significant implications for how the memory management behaves. Add a
> > new mmap specifi
: Keith Busch
---
engines/mmap.c | 39 ---
optgroup.h | 2 ++
2 files changed, 38 insertions(+), 3 deletions(-)
diff --git a/engines/mmap.c b/engines/mmap.c
index 308b4665..7bca6c20 100644
--- a/engines/mmap.c
+++ b/engines/mmap.c
@@ -11,6 +11,7
to Jan Kara for providing the solution and more clear comments
> for the code.
>
> Fixes: 2da78092dda1 ("block: Fix dev_t minor allocation lifetime")
> Cc: Al Viro
> Cc: Bart Van Assche
> Cc: Keith Busch
> Suggested-by: Jan Kara
> Signed-off-by: Yufen Yu
Looks good to me.
Reviewed-by: Keith Busch
+++
> > drivers/nvme/host/core.c | 2 +-
> > include/linux/blk-mq.h | 1 +
> > 3 files changed, 9 insertions(+), 1 deletion(-)
> >
> > Cc: Keith Busch
> > Cc: Sagi Grimberg
> > Cc: Bart Van Assche
> > Cc: James Smart
> > Cc: Christoph
q_ops->complete(rq) remotely and asynchronously, and
> ->complete(rq) may be run after #4.
>
> This patch introduces blk_mq_complete_request_sync() for fixing the
> above race.
>
> Cc: Keith Busch
> Cc: Sagi Grimberg
> Cc: Bart Van Assche
> Cc: James Smart
get that by registering a single callback for the request_queue and loop
only the affected hctx's.
But this patch looks good to me too.
Reviewed-by: Keith Busch
> Signed-off-by: Dongli Zhang
> ---
> block/blk-mq.c | 4
> 1 file changed, 4 insertions(+)
>
>
On Sun, Apr 07, 2019 at 12:51:23AM -0700, Christoph Hellwig wrote:
> On Fri, Apr 05, 2019 at 05:36:32PM -0600, Keith Busch wrote:
> > On Fri, Apr 5, 2019 at 5:04 PM Jens Axboe wrote:
> > > Looking at current peak testing, I've got around 1.2% in queue enter
> > &g
On Sat, Apr 06, 2019 at 02:27:10PM -0700, Ming Lei wrote:
> On Fri, Apr 05, 2019 at 05:36:32PM -0600, Keith Busch wrote:
> > On Fri, Apr 5, 2019 at 5:04 PM Jens Axboe wrote:
> > > Looking at current peak testing, I've got around 1.2% in queue enter
> > > and exit.
On Fri, Apr 5, 2019 at 5:04 PM Jens Axboe wrote:
> Looking at current peak testing, I've got around 1.2% in queue enter
> and exit. It's definitely not free, hence my question. Probably safe
> to assume that we'll double that cycle counter, per IO.
Okay, that's not negligible at all. I don't know
On Fri, Apr 05, 2019 at 04:23:27PM -0600, Jens Axboe wrote:
> On 4/5/19 3:59 PM, Keith Busch wrote:
> > Managed interrupts can not migrate affinity when their CPUs are offline.
> > If the CPU is allowed to shutdown before they're returned, commands
> > dispatched to manag
e CPU dead
notification for all allocated requests to complete if an hctx's last
CPU is being taken offline.
Cc: Ming Lei
Cc: Thomas Gleixner
Signed-off-by: Keith Busch
---
block/blk-mq-sched.c | 2 ++
block/blk-mq-sysfs.c | 1 +
block/blk-mq-tag.c | 1 +
block/blk-mq
On Thu, Mar 28, 2019 at 09:42:51AM +0800, jianchao.wang wrote:
> On 3/27/19 9:21 PM, Keith Busch wrote:
> > +void blk_mq_terminate_queued_requests(struct request_queue *q, int
> > hctx_idx)
> > +{
> > + if (WARN_ON_ONCE(!atomic_read(&q->mq_freeze_depth)))
On Wed, Mar 27, 2019 at 04:51:13PM +0800, Ming Lei wrote:
> @@ -594,8 +594,11 @@ static void __blk_mq_complete_request(struct request *rq)
> /*
>  * For a polled request, always complete locally, it's pointless
>  * to redirect the completion.
> + *
> + * If driver requ
t it through the proper tests (I no longer
have a hotplug machine), but this is what I'd written if you can give it
a quick look:
>From 5afd8e3765eabf859100fda84e646a96683d7751 Mon Sep 17 00:00:00 2001
From: Keith Busch
Date: Tue, 12 Mar 2019 13:58:12 -0600
Subject: [PATCH] blk-mq: Provide re
On Tue, Mar 26, 2019 at 01:07:12PM +0100, Hannes Reinecke wrote:
> When a queue is dying or dead there is no point in calling
> blk_mq_run_hw_queues() in blk_mq_unquiesce_queue(); in fact, doing
> so might crash the machine as the queue structures are in the
> process of being deleted.
>
> Signed-
On Fri, Mar 22, 2019 at 03:05:46PM +0100, Hannes Reinecke wrote:
> On 3/22/19 3:02 PM, Christoph Hellwig wrote:
> > But how do you manage to get the tiny on-stack bios split? What kind
> > of setup is this?
> >
> It's not tiny if you send a 2M file via direct-io, _and_ have a non-zero
> MDTS sett
ic
> testing because I've been a bit busy, but I thought it might be
> worthwhile to get it out for feedback.
Tests well here with a measurable IOPs improvement at lower queue depths.
Series looks good to me, especially patch 5! :p
Reviewed-by: Keith Busch
On Mon, Mar 18, 2019 at 06:14:01PM -0400, Nikhil Sambhus wrote:
> Hi,
>
> On a Linux Kernel 5.0.0+ machine (Ubuntu 16.04) I am using the
> following command as a root user to enable polling for a NVMe SSD
> device.
>
> # echo 1 > /sys/block/nvme2n1/queue/io_poll
>
> I get the following error:
>
On Sun, Mar 17, 2019 at 09:09:09PM -0700, Bart Van Assche wrote:
> On 3/17/19 8:29 PM, Ming Lei wrote:
> > In NVMe's error handler, follows the typical steps for tearing down
> > hardware:
> >
> > 1) stop blk_mq hw queues
> > 2) stop the real hw queues
> > 3) cancel in-flight requests via
> >
On Mon, Mar 11, 2019 at 07:40:31PM +0100, Christoph Hellwig wrote:
> From a quick look the code seems reasonably sensible here,
> but any chance we could have this in common code?
>
> > +static bool nvme_fail_queue_request(struct request *req, void *data, bool
> > reserved)
> > +{
> > + struct
On Fri, Mar 08, 2019 at 09:31:02PM -0700, Jens Axboe wrote:
> On 3/8/19 2:59 PM, Keith Busch wrote:
> > Make depth options command line parameters so a recompile isn't
> > required to see how it affects performance.
>
> Thanks, everything really should be command
On Sun, Mar 10, 2019 at 08:58:21PM -0700, jianchao.wang wrote:
> Hi Keith
>
> How about introducing a per hctx queue_rq callback, then install a
> separate .queue_rq callback for the dead hctx. Then we just need to
> start and complete the request there.
That sounds like it could work, though I t
On Mon, Mar 11, 2019 at 10:24:42AM +0800, Ming Lei wrote:
> Hi,
>
> It is observed that ext4 is corrupted easily by running some workloads
> on QEMU NVMe, such as:
>
> 1) mkfs.ext4 /dev/nvme0n1
>
> 2) mount /dev/nvme0n1 /mnt
>
> 3) cd /mnt; git clone
> git://git.kernel.org/pub/scm/linux/kernel
On Fri, Mar 08, 2019 at 01:54:06PM -0800, Bart Van Assche wrote:
> On Fri, 2019-03-08 at 11:19 -0700, Keith Busch wrote:
> > On Fri, Mar 08, 2019 at 10:15:27AM -0800, Bart Van Assche wrote:
> > > On Fri, 2019-03-08 at 10:40 -0700, Keith Busch wrote:
> > > > End the
Make depth options command line parameters so a recompile isn't
required to see how it affects performance.
Signed-off-by: Keith Busch
---
t/io_uring.c | 70
1 file changed, 52 insertions(+), 18 deletions(-)
diff --git a/t/io_ur
On Fri, Mar 08, 2019 at 01:25:16PM -0800, Bart Van Assche wrote:
> On Fri, 2019-03-08 at 14:14 -0700, Keith Busch wrote:
> > On Fri, Mar 08, 2019 at 12:47:10PM -0800, Bart Van Assche wrote:
> > > If no such mechanism has been defined in the NVMe spec: have you
> > > con
On Fri, Mar 08, 2019 at 12:47:10PM -0800, Bart Van Assche wrote:
> Thanks for the clarification. Are you aware of any mechanism in the NVMe spec
> that causes all outstanding requests to fail? With RDMA this is easy - all
> one has to do is to change the queue pair state into IB_QPS_ERR. See also
>
On Fri, Mar 08, 2019 at 12:21:16PM -0800, Sagi Grimberg wrote:
> For some reason I didn't get patches 2/5 and 3/5...
Unreliable 'git send-email'?! :)
They're copied to patchwork too:
https://patchwork.kernel.org/patch/10845225/
https://patchwork.kernel.org/patch/10845229/
On Fri, Mar 08, 2019 at 10:42:17AM -0800, Bart Van Assche wrote:
> On Fri, 2019-03-08 at 11:15 -0700, Keith Busch wrote:
> > On Fri, Mar 08, 2019 at 10:07:23AM -0800, Bart Van Assche wrote:
> > > On Fri, 2019-03-08 at 10:40 -0700, Keith Busch wrote:
> > > > Drivers
On Fri, Mar 08, 2019 at 10:15:27AM -0800, Bart Van Assche wrote:
> On Fri, 2019-03-08 at 10:40 -0700, Keith Busch wrote:
> > End the entered requests on a quiesced queue directly rather than flush
> > them through the low level driver's queue_rq().
> >
>
On Fri, Mar 08, 2019 at 10:07:23AM -0800, Bart Van Assche wrote:
> On Fri, 2019-03-08 at 10:40 -0700, Keith Busch wrote:
> > Drivers may need to know the state of their requests.
>
> Hi Keith,
>
> What makes you think that drivers should be able to check the state of the
On Fri, Mar 08, 2019 at 10:08:47AM -0800, Bart Van Assche wrote:
> On Fri, 2019-03-08 at 10:40 -0700, Keith Busch wrote:
> > A driver may need to iterate a particular queue's tagged request rather
> > than the whole tagset.
>
> Since iterating over requests triggers ra
A driver may need to iterate a particular queue's tagged request rather
than the whole tagset.
Signed-off-by: Keith Busch
---
block/blk-mq-tag.c | 1 +
block/blk-mq-tag.h | 2 --
include/linux/blk-mq.h | 2 ++
3 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/block/b
Drivers may need to know the state of their requests.
Signed-off-by: Keith Busch
---
block/blk-mq.h | 9 -
include/linux/blkdev.h | 9 +
2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/block/blk-mq.h b/block/blk-mq.h
index c11353a3749d..99ab7e472e62 100644
End the entered requests on a quiesced queue directly rather than flush
them through the low level driver's queue_rq().
Signed-off-by: Keith Busch
---
drivers/nvme/host/core.c | 10 --
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers
o see COMPLETED requests
that were being returned before, so this also fixes that for all existing
callback handlers.
Signed-off-by: Keith Busch
---
block/blk-mq-tag.c| 12 ++--
drivers/block/mtip32xx/mtip32xx.c | 6 ++
drivers/block/nbd.c | 2 ++
dr
when the queue
isn't going to be restarted so the IO path doesn't have to deal with
these conditions.
Signed-off-by: Keith Busch
---
drivers/nvme/host/pci.c | 45 +
1 file changed, 29 insertions(+), 16 deletions(-)
diff --git a/drivers/nvme/
On Wed, Mar 06, 2019 at 06:48:28PM +, alex_gagn...@dellteam.com wrote:
> Hi,
>
> I'm seeing a list error when we take away, then add back a bunch of nvme
> drives. It's not very easy to repro, and the one surviving log is pasted
> below.
This looks like a double completion coming from the b
On Thu, Feb 21, 2019 at 09:51:12PM -0500, Martin K. Petersen wrote:
>
> Keith,
>
> > With respect to fs block sizes, one thing making discards suck is that
> > many high capacity SSDs' physical page sizes are larger than the fs
> > block size, and a sub-page discard is worse than doing nothing.
>
On Sun, Feb 17, 2019 at 06:42:59PM -0500, Ric Wheeler wrote:
> I think the variability makes life really miserable for layers above it.
>
> Might be worth constructing some tooling that we can use to validate or
> shame vendors over - testing things like a full device discard, discard of
> fs bloc
On Wed, Feb 20, 2019 at 06:43:46AM -0800, Matthew Wilcox wrote:
> What NVMe doesn't have is a way for the host to tell the controller
> "Here's a 2MB sized I/O; bytes 40960 to 45056 are most important to
> me; please give me a completion event once those bytes are valid and
> then another completio
On Mon, Feb 18, 2019 at 04:42:27PM -0800, 陈华才 wrote:
> I've tested it; this patch fixes the nvme problem, but it can't be applied
> to 4.19 because of a different context. And, I still think my original solution
> (genirq/affinity: Assign default affinity to pre/post vectors) is correct.
> There may b
On Fri, Feb 15, 2019 at 09:19:02PM +, Felipe Franciosi wrote:
> Over the last year or two, I have done extensive experimentation comparing
> applications using libaio to those using SDPK.
Try the io_uring interface instead. Its queued up in the linux-block
for-next tree.
> For hypervisors,
On Thu, Feb 14, 2019 at 10:02:02PM -0500, Theodore Y. Ts'o wrote:
> > My (undocumented) rule of thumb has been that blktests shouldn't assume
> > anything newer than whatever ships on Debian oldstable. I can document
> > that requirement.
>
> That's definitely not true for the nvme tests; the nvme
On Wed, Feb 13, 2019 at 10:41:55PM +0100, Thomas Gleixner wrote:
> Btw, while I have your attention. There popped up an issue recently related
> to that affinity logic.
>
> The current implementation fails when:
>
> /*
> * If there aren't any vectors left after applying the pre/p
On Wed, Feb 13, 2019 at 09:56:36PM +0100, Thomas Gleixner wrote:
> On Wed, 13 Feb 2019, Bjorn Helgaas wrote:
> > On Wed, Feb 13, 2019 at 06:50:37PM +0800, Ming Lei wrote:
> > > We have to ask driver to re-caculate set vectors after the whole IRQ
> > > vectors are allocated later, and the result nee
ew for the whole series
if you spin a v3 for the other minor comments.
Reviewed-by: Keith Busch
> +static void nvme_calc_irq_sets(struct irq_affinity *affd, int nvecs)
> +{
> + struct nvme_dev *dev = affd->priv;
> +
> + nvme_calc_io_queues(dev, nvecs);
> +
> +
On Thu, Feb 07, 2019 at 12:55:39PM -0700, Jens Axboe wrote:
> IO submissions use the io_uring_sqe data structure, and completions
> are generated in the form of io_uring_sqe data structures.
^^^
Completions use _cqe, right?
On Tue, Feb 05, 2019 at 04:10:47PM +0100, Hannes Reinecke wrote:
> On 2/5/19 3:52 PM, Keith Busch wrote:
> > Whichever layer dispatched the IO to a CPU specific context should
> > be the one to wait for its completion. That should be blk-mq for most
> > block drivers.
> >
On Tue, Feb 05, 2019 at 03:09:28PM +, John Garry wrote:
> On 05/02/2019 14:52, Keith Busch wrote:
> > On Tue, Feb 05, 2019 at 05:24:11AM -0800, John Garry wrote:
> > > On 04/02/2019 07:12, Hannes Reinecke wrote:
> > >
> > > Hi Hannes,
> > >
>
On Tue, Feb 05, 2019 at 05:24:11AM -0800, John Garry wrote:
> On 04/02/2019 07:12, Hannes Reinecke wrote:
>
> Hi Hannes,
>
> >
> > So, as the user then has to wait for the system to declare 'ready for
> > CPU remove', why can't we just disable the SQ and wait for all I/O to
> > complete?
> > We c
On Mon, Jan 28, 2019 at 04:47:09AM -0800, Jan Kara wrote:
> On Fri 25-01-19 09:23:53, Keith Busch wrote:
> > On Wed, Jan 09, 2019 at 09:00:57PM +0530, Kanchan Joshi wrote:
> > > Towards supporting write-hints/streams for
On Wed, Jan 09, 2019 at 09:00:57PM +0530, Kanchan Joshi wrote:
> Towards supporting write-hints/streams for filesystem journal.
>
>
>
> Here is the v1 patch for background -
On Tue, Dec 18, 2018 at 06:47:50PM +0100, h...@lst.de wrote:
> On Tue, Dec 18, 2018 at 10:26:46AM -0700, Keith Busch wrote:
> > No need for a space after the %s. __print_disk_name already appends a
> > space if there's a disk name, and we don't want the extra space if
On Mon, Dec 17, 2018 at 08:51:38PM -0800, yupeng wrote:
> +TRACE_EVENT(nvme_sq,
> + TP_PROTO(void *rq_disk, int qid, int sq_head, int sq_tail),
> + TP_ARGS(rq_disk, qid, sq_head, sq_tail),
> + TP_STRUCT__entry(
> + __array(char, disk, DISK_NAME_LEN)
> + __field(i
On Wed, Dec 12, 2018 at 09:36:36AM -0700, Jens Axboe wrote:
> On 12/12/18 9:28 AM, Keith Busch wrote:
> > On Wed, Dec 12, 2018 at 09:18:11AM -0700, Jens Axboe wrote:
> >> When boxes are run near (or to) OOM, we have a problem with the discard
> >> page allocation in nvme
On Wed, Dec 12, 2018 at 09:18:11AM -0700, Jens Axboe wrote:
> When boxes are run near (or to) OOM, we have a problem with the discard
> page allocation in nvme. If we fail allocating the special page, we
> return busy, and it'll get retried. But since ordering is honored for
> dispatch requests, we
On Tue, Dec 11, 2018 at 02:49:36AM -0800, Sagi Grimberg wrote:
> if (cfg.host_traddr) {
> len = sprintf(p, ",host_traddr=%s", cfg.host_traddr);
> if (len < 0)
> @@ -1009,6 +1019,7 @@ int connect(const char *desc, int argc, char **argv)
> {"hostnqn",
On Mon, Dec 10, 2018 at 03:06:52PM -0500, Laurence Oberman wrote:
> Tested and works fine.
> Thanks All
>
> Tested-by: Laurence Oberman
Cool, thank you for confirming.
We don't need to zero-fill the bio if it is not using kernel-allocated pages.
Fixes: f3587d76da05 ("block: Clear kernel memory before copying to user") #
v4.20-rc2
Reported-by: Todd Aiken
Cc: Laurence Oberman
Cc: sta...@vger.kernel.org
Cc: Bart Van Assche
Signed-off-by: Keith Bu
On Sun, Dec 09, 2018 at 07:08:14PM -0800, Bart Van Assche wrote:
> According to what I found in
> https://bugzilla.kernel.org/show_bug.cgi?id=201935 patch "block: Clear
> kernel memory before copying to user" broke tape access. Hence revert
> that patch.
Instead of reverting back to the leaking ar
On Tue, Dec 04, 2018 at 02:21:17PM -0700, Keith Busch wrote:
> On Tue, Dec 04, 2018 at 11:33:33AM -0800, James Smart wrote:
> > On 12/4/2018 9:48 AM, Keith Busch wrote:
> > > Once quiesced, the proposed iterator can handle the final termination
> > > of the request, perf