Re: [PATCH 12/17] bcache: fix code comments style

2018-08-09 Thread Coly Li
On 2018/8/10 11:10 AM, shenghui wrote:
>
>
> On 08/09/2018 02:43 PM, Coly Li wrote:
>> This patch fixes 3 style issues warned by checkpatch.pl,
>> - Comment lines are not aligned
>> - Comments use "/*" on subsequent lines
>> - Comment lines use a trailing "*/"
>>
>> Signed-off-by: Coly Li
>> ---

Re: [RFC PATCH 1/5] block: move call of scheduler's ->completed_request() hook

2018-08-09 Thread jianchao.wang
Hi Omar
On 08/10/2018 04:26 AM, Omar Sandoval wrote:
> @@ -524,6 +524,9 @@ inline void __blk_mq_end_request(struct request *rq, blk_status_t error)
>  		blk_stat_add(rq, now);
>  	}
>
> +	if (rq->internal_tag != -1)
> +		blk_mq_sched_completed_request(rq, now);
>

Re: [PATCH v6 08/12] block, scsi: Introduce blk_pm_runtime_exit()

2018-08-09 Thread jianchao.wang
Hi Bart
On 08/10/2018 03:41 AM, Bart Van Assche wrote:
> +void blk_pm_runtime_exit(struct request_queue *q)
> +{
> +	if (!q->dev)
> +		return;
> +
> +	pm_runtime_get_sync(q->dev);
> +	q->dev = NULL;
> +}
> +EXPORT_SYMBOL(blk_pm_runtime_exit);
> +
>  /**
>   * blk_pre_runtime

Fix for 84676c1f (b5b6e8c8) missing in 4.14.y

2018-08-09 Thread Felipe Franciosi
Hi Ming (and all), Your series "scsi: virtio_scsi: fix IO hang caused by irq vector automatic affinity" which forces virtio-scsi to use blk-mq fixes an issue introduced by 84676c1f. We noticed that this bug also exists in 4.14.y (as ef86f3a72adb), but your series was not backported to that stab

Re: Fix for 84676c1f (b5b6e8c8) missing in 4.14.y

2018-08-09 Thread Ming Lei
On Fri, Aug 10, 2018 at 02:09:01AM +, Felipe Franciosi wrote: > Hi Ming (and all), > > Your series "scsi: virtio_scsi: fix IO hang caused by irq vector automatic > affinity" which forces virtio-scsi to use blk-mq fixes an issue introduced by > 84676c1f. We noticed that this bug also exists i

Re: [PATCH v6 10/12] block: Change the runtime power management approach (1/2)

2018-08-09 Thread jianchao.wang
Hi Bart On 08/10/2018 03:41 AM, Bart Van Assche wrote: > Instead of scheduling runtime resume of a request queue after a > request has been queued, schedule asynchronous resume during request > allocation. The new pm_request_resume() calls occur after > blk_queue_enter() has increased the q_usage_

Re: [PATCH v6 11/12] block: Change the runtime power management approach (2/2)

2018-08-09 Thread jianchao.wang
Hi Bart
On 08/10/2018 03:41 AM, Bart Van Assche wrote:
> +
> +	blk_set_pm_only(q);
> +	/*
> +	 * This function only gets called if the most recent
> +	 * pm_request_resume() call occurred at least autosuspend_delay_ms
> +	 * ago. Since blk_queue_enter() is called by the request

Re: [PATCH v6 05/12] block, scsi: Rename QUEUE_FLAG_PREEMPT_ONLY into DV_ONLY and introduce PM_ONLY

2018-08-09 Thread jianchao.wang
Hi Bart
On 08/10/2018 03:41 AM, Bart Van Assche wrote:
> +/*
> + * Whether or not blk_queue_enter() should proceed. RQF_PM requests are always
> + * allowed. RQF_DV requests are allowed if the PM_ONLY queue flag has not been
> + * set. Other requests are only allowed if neither PM_ONLY nor D
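The admission rules quoted in that comment can be sketched as a small userspace predicate. This is only an illustration, not the patch's code: the SK_* bit values and the function name are hypothetical, and because the preview is cut off at "nor D", the third rule assumes the truncated flag is DV_ONLY.

```c
#include <stdbool.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define SK_RQF_PM		(1u << 0)	/* power management request */
#define SK_RQF_DV		(1u << 1)	/* domain validation request */
#define SK_QUEUE_PM_ONLY	(1u << 0)	/* queue accepts PM requests only */
#define SK_QUEUE_DV_ONLY	(1u << 1)	/* assumed name of the truncated flag */

/* Returns true if a request with rq_flags may enter a queue with queue_flags. */
bool sk_queue_enter_allowed(unsigned int rq_flags, unsigned int queue_flags)
{
	/* RQF_PM requests are always allowed. */
	if (rq_flags & SK_RQF_PM)
		return true;

	/* RQF_DV requests are allowed unless PM_ONLY is set. */
	if (rq_flags & SK_RQF_DV)
		return !(queue_flags & SK_QUEUE_PM_ONLY);

	/* Other requests need both flags clear (assumption, see above). */
	return !(queue_flags & (SK_QUEUE_PM_ONLY | SK_QUEUE_DV_ONLY));
}
```

Checking the request flags in priority order keeps each rule a single test, which is one reasonable way to read the quoted comment.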

Re: [PATCH v6 02/12] scsi: Alter handling of RQF_DV requests

2018-08-09 Thread Ming Lei
On Thu, Aug 09, 2018 at 12:41:39PM -0700, Bart Van Assche wrote: > Process all requests in state SDEV_CREATED instead of only RQF_DV > requests. This does not change the behavior of the SCSI core because > the SCSI device state is modified into another state before SCSI > devices become visible in

[RFC PATCH 4/5] kyber: implement improved heuristics

2018-08-09 Thread Omar Sandoval
From: Omar Sandoval Kyber's current heuristics have a few flaws: - It's based on the mean latency, but p99 latency tends to be more meaningful to anyone who cares about latency. The mean can also be skewed by rare outliers that the scheduler can't do anything about. - The statistics calculat
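To illustrate the first point (a rare outlier skews the mean far more than the p99), here is a minimal userspace sketch. It is not Kyber's code: the function names are made up for this example, and the nearest-rank percentile is just one common definition, chosen here as an assumption.

```c
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for qsort(): ascending order of unsigned values. */
static int cmp_u32(const void *a, const void *b)
{
	unsigned int x = *(const unsigned int *)a;
	unsigned int y = *(const unsigned int *)b;

	return (x > y) - (x < y);
}

/* Arithmetic mean of n latency samples (integer division, n > 0). */
unsigned int mean_latency(const unsigned int *samples, size_t n)
{
	unsigned long long sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += samples[i];
	return (unsigned int)(sum / n);
}

/*
 * Nearest-rank percentile of n samples (n > 0, 0 < pct <= 100).
 * Sorts the caller's array in place.
 */
unsigned int percentile_latency(unsigned int *samples, size_t n,
				unsigned int pct)
{
	size_t rank = (n * pct + 99) / 100;	/* ceil(n * pct / 100) */

	qsort(samples, n, sizeof(*samples), cmp_u32);
	return samples[rank - 1];
}
```

With 99 samples of 100 and a single sample of 100000, the mean comes out at 1099 while the p99 stays at 100.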

[RFC PATCH 5/5] kyber: add tracepoints

2018-08-09 Thread Omar Sandoval
From: Omar Sandoval When debugging Kyber, it's really useful to know what latencies we've been having and how the domain depths have been adjusted. Add two tracepoints, kyber_latency and kyber_adjust, to record that. Signed-off-by: Omar Sandoval --- block/kyber-iosched.c| 46 ++

[RFC PATCH 3/5] kyber: don't make domain token sbitmap larger than necessary

2018-08-09 Thread Omar Sandoval
From: Omar Sandoval The domain token sbitmaps are currently initialized to the device queue depth or 256, whichever is larger, and immediately resized to the maximum depth for that domain (256, 128, or 64 for read, write, and other, respectively). The sbitmap is never resized larger than that, so
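A rough sketch of the sizing described above, in plain C. The names are hypothetical, and the "after" function is only my reading of the subject line (allocate at the per-domain maximum directly), since the preview is cut off before the change itself is described.

```c
/* Scheduling domains and their maximum depths, as listed in the message. */
enum sk_domain { SK_READ, SK_WRITE, SK_OTHER, SK_NUM_DOMAINS };

static const unsigned int sk_domain_depth[SK_NUM_DOMAINS] = {
	[SK_READ]  = 256,
	[SK_WRITE] = 128,
	[SK_OTHER] = 64,
};

/* Current behaviour: allocate max(queue depth, 256), then shrink. */
unsigned int sk_alloc_depth_before(unsigned int queue_depth)
{
	return queue_depth > 256 ? queue_depth : 256;
}

/* Assumed new behaviour: allocate exactly the domain's maximum depth. */
unsigned int sk_alloc_depth_after(enum sk_domain domain)
{
	return sk_domain_depth[domain];
}
```

For a device with a queue depth of 1024, the first function yields 1024 bits for every domain, while the second yields 64 for the "other" domain.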

[RFC PATCH 1/5] block: move call of scheduler's ->completed_request() hook

2018-08-09 Thread Omar Sandoval
From: Omar Sandoval Commit 4bc6339a583c ("block: move blk_stat_add() to __blk_mq_end_request()") consolidated some calls using ktime_get() so we'd only need to call it once. Kyber's ->completed_request() hook also calls ktime_get(), so let's move it to the same place, too. Signed-off-by: Omar Sa

[RFC PATCH 0/5] kyber: better heuristics

2018-08-09 Thread Omar Sandoval
From: Omar Sandoval Hello, I've spent the past few weeks experimenting with different heuristics for Kyber in order to deal with some edge cases we've hit here. This series is my progress so far, implementing less handwavy heuristics while keeping the same basic mechanisms. Patches 1 and 2 are p

[RFC PATCH 2/5] block: export blk_stat_enable_accounting()

2018-08-09 Thread Omar Sandoval
From: Omar Sandoval Kyber will need this in a future change if it is built as a module. Signed-off-by: Omar Sandoval --- block/blk-stat.c | 1 + 1 file changed, 1 insertion(+) diff --git a/block/blk-stat.c b/block/blk-stat.c index 175c143ac5b9..d98f3ad6794e 100644 --- a/block/blk-stat.c +++ b

Re: [PATCH v5 1/3] blkcg: Introduce blkg_root_lookup()

2018-08-09 Thread Jens Axboe
On 8/9/18 2:17 PM, Bart Van Assche wrote:
> On Thu, 2018-08-09 at 12:56 -0700, Tejun Heo wrote:
>> Hello,
>>
>> On Thu, Aug 09, 2018 at 07:53:36AM -0700, Bart Van Assche wrote:
>>> +/**
>>> + * blkg_lookup - look up blkg for the specified request queue
>>> + * @q: request_queue of interest
>>> + *

Re: [PATCH v5 1/3] blkcg: Introduce blkg_root_lookup()

2018-08-09 Thread Bart Van Assche
On Thu, 2018-08-09 at 12:56 -0700, Tejun Heo wrote:
> Hello,
>
> On Thu, Aug 09, 2018 at 07:53:36AM -0700, Bart Van Assche wrote:
> > +/**
> > + * blkg_lookup - look up blkg for the specified request queue
> > + * @q: request_queue of interest
> > + *
> > + * Lookup blkg for @q at the root level.

Re: [PATCH v5 1/3] blkcg: Introduce blkg_root_lookup()

2018-08-09 Thread Tejun Heo
Hello,
On Thu, Aug 09, 2018 at 07:53:36AM -0700, Bart Van Assche wrote:
> +/**
> + * blkg_lookup - look up blkg for the specified request queue
> + * @q: request_queue of interest
> + *
> + * Lookup blkg for @q at the root level. See also blkg_lookup().
> + */
> +static inline struct blkcg_gq *blk

[PATCH v6 01/12] block, scsi: Introduce request flag RQF_DV

2018-08-09 Thread Bart Van Assche
Instead of marking all power management, SCSI domain validation and IDE preempt requests with RQF_PREEMPT, only mark IDE preempt requests with RQF_PREEMPT. Use RQF_DV to mark requests submitted by scsi_execute() and RQF_PM to mark power management requests. Most but not all power management request

[PATCH v6 06/12] scsi: Reallow SPI domain validation during system suspend

2018-08-09 Thread Bart Van Assche
Now that SCSI power management and SPI domain validation use different mechanisms for blocking SCSI command execution, remove the mechanism again that prevents system suspend during SPI domain validation. This patch reverts 203f8c250e21 ("block, scsi: Fix race between SPI domain validation and sys

[PATCH v6 03/12] scsi: Only set RQF_DV for requests used for domain validation

2018-08-09 Thread Bart Van Assche
Instead of setting RQF_DV for all requests submitted by scsi_execute(), only set that flag for requests that are used for domain validation. Move the SCSI Parallel Interface (SPI) domain validation status from the transport data to struct scsi_target such that this status information can be accesse

[PATCH v6 04/12] scsi: Introduce the SDEV_SUSPENDED device status

2018-08-09 Thread Bart Van Assche
Instead of using the SDEV_QUIESCE state for both SCSI domain validation and runtime suspend, use the SDEV_QUIESCE state only for SCSI domain validation. Keep using scsi_device_quiesce() and scsi_device_unquiesce() for SCSI domain validation. Add new functions scsi_device_suspend() and scsi_device_u

[PATCH v6 08/12] block, scsi: Introduce blk_pm_runtime_exit()

2018-08-09 Thread Bart Van Assche
Since it is possible to unbind a SCSI ULD and since unbinding removes the association between a request queue and struct device, the q->dev pointer has to be reset during unbind. Hence introduce a function in the block layer that clears request_queue.dev. Signed-off-by: Bart Van Assche Cc: Martin

[PATCH v6 05/12] block, scsi: Rename QUEUE_FLAG_PREEMPT_ONLY into DV_ONLY and introduce PM_ONLY

2018-08-09 Thread Bart Van Assche
Instead of having one queue state (PREEMPT_ONLY) in which both power management and SCSI domain validation requests are processed, rename the PREEMPT_ONLY state into DV_ONLY and introduce a new queue flag QUEUE_FLAG_PM_ONLY. Provide the new functions blk_set_pm_only() and blk_clear_pm_only() for po

[PATCH v6 07/12] block: Move power management code into a new source file

2018-08-09 Thread Bart Van Assche
Move the code for runtime power management from blk-core.c into the new source file blk-pm.c. Move the corresponding declarations from into . For CONFIG_PM=n, leave out the declarations of the functions that are not used in that mode. This patch not only reduces the number of #ifdefs in the block

[PATCH v6 02/12] scsi: Alter handling of RQF_DV requests

2018-08-09 Thread Bart Van Assche
Process all requests in state SDEV_CREATED instead of only RQF_DV requests. This does not change the behavior of the SCSI core because the SCSI device state is modified into another state before SCSI devices become visible in sysfs and before any device nodes are created in /dev. Do not process RQF

[PATCH v6 12/12] blk-mq: Enable support for runtime power management

2018-08-09 Thread Bart Van Assche
Now that the blk-mq core processes power management requests (marked with RQF_PREEMPT) in other states than RPM_ACTIVE, enable runtime power management for blk-mq. Signed-off-by: Bart Van Assche Cc: Christoph Hellwig Cc: Ming Lei Cc: Jianchao Wang Cc: Hannes Reinecke Cc: Johannes Thumshirn C

[PATCH v6 11/12] block: Change the runtime power management approach (2/2)

2018-08-09 Thread Bart Van Assche
Instead of allowing requests that are not power management requests to enter the queue in runtime suspended status (RPM_SUSPENDED), make the blk_get_request() caller block. This change fixes a starvation issue: it is now guaranteed that power management requests will be executed no matter how many

[PATCH v6 09/12] block: Split blk_pm_add_request() and blk_pm_put_request()

2018-08-09 Thread Bart Van Assche
Move the pm_request_resume() and pm_runtime_mark_last_busy() calls into two new functions. Signed-off-by: Bart Van Assche Cc: Martin K. Petersen Cc: Christoph Hellwig Cc: Ming Lei Cc: Jianchao Wang Cc: Hannes Reinecke Cc: Johannes Thumshirn Cc: Alan Stern --- block/blk-core.c | 1 + bloc

[PATCH v6 10/12] block: Change the runtime power management approach (1/2)

2018-08-09 Thread Bart Van Assche
Instead of scheduling runtime resume of a request queue after a request has been queued, schedule asynchronous resume during request allocation. The new pm_request_resume() calls occur after blk_queue_enter() has increased the q_usage_counter request queue member. This change is needed for a later

[PATCH v6 00/12] blk-mq: Implement runtime power management

2018-08-09 Thread Bart Van Assche
Hello Jens, This patch series not only implements runtime power management for blk-mq but also fixes a starvation issue in the power management code for the legacy block layer. Please consider this patch series for the upstream kernel. Thanks, Bart. Changes compared to v5: - Introduced a new fl

Re: [PATCH RESEND] Blk-throttle: reduce tail io latency when iops limit is enforced

2018-08-09 Thread Jens Axboe
On 8/9/18 11:47 AM, Liu Bo wrote: > When an application's iops has exceeded its cgroup's iops limit, surely it > is throttled and kernel will set a timer for dispatching, thus IO latency > includes the delay. > > However, the dispatch delay which is calculated by the limit and the > elapsed jiffie

[PATCH RESEND] Blk-throttle: reduce tail io latency when iops limit is enforced

2018-08-09 Thread Liu Bo
When an application's iops has exceeded its cgroup's iops limit, surely it is throttled and kernel will set a timer for dispatching, thus IO latency includes the delay. However, the dispatch delay which is calculated by the limit and the elapsed jiffies is suboptimal. As the dispatch delay is onl

Re: [PATCH] Blk-throttle: reduce tail io latency when iops limit is enforced

2018-08-09 Thread Liu Bo
On Thu, Aug 9, 2018 at 10:22 AM, Jens Axboe wrote: > On 7/20/18 6:29 PM, Liu Bo wrote: >> When an application's iops has exceeded its cgroup's iops limit, surely it >> is throttled and kernel will set a timer for dispatching, thus IO latency >> includes the delay. >> >> However, the dispatch delay

Re: [PATCH] Blk-throttle: reduce tail io latency when iops limit is enforced

2018-08-09 Thread Jens Axboe
On 7/20/18 6:29 PM, Liu Bo wrote: > When an application's iops has exceeded its cgroup's iops limit, surely it > is throttled and kernel will set a timer for dispatching, thus IO latency > includes the delay. > > However, the dispatch delay which is calculated by the limit and the > elapsed jiffie

Re: [PATCH] Blk-throttle: reduce tail io latency when iops limit is enforced

2018-08-09 Thread Liu Bo
ping? On Fri, Jul 20, 2018 at 5:29 PM, Liu Bo wrote: > When an application's iops has exceeded its cgroup's iops limit, surely it > is throttled and kernel will set a timer for dispatching, thus IO latency > includes the delay. > > However, the dispatch delay which is calculated by the limit and

Re: [PATCH v5 5/9] block: Change the runtime power management approach (1/2)

2018-08-09 Thread Bart Van Assche
On Thu, 2018-08-09 at 10:52 +0800, Ming Lei wrote: > On Wed, Aug 08, 2018 at 05:28:43PM +, Bart Van Assche wrote: > > Some but not all blk_queue_enter() calls are related to request allocation > > so > > The only one I remember is scsi_ioctl_reset(), in which scsi_autopm_get_host() > is calle

Re: [PATCH v5 0/3] Ensure that a request queue is dissociated from the cgroup controller

2018-08-09 Thread Jens Axboe
On 8/9/18 8:53 AM, Bart Van Assche wrote: > Hello Jens, > > Several block drivers call alloc_disk() followed by put_disk() if something > fails before device_add_disk() is called without calling blk_cleanup_queue(). > Make sure that also for this scenario a request queue is dissociated from the >

[PATCH v5 3/3] block: Ensure that a request queue is dissociated from the cgroup controller

2018-08-09 Thread Bart Van Assche
Several block drivers call alloc_disk() followed by put_disk() if something fails before device_add_disk() is called without calling blk_cleanup_queue(). Make sure that also for this scenario a request queue is dissociated from the cgroup controller. This patch avoids that loading the parport_pc, p

[PATCH v5 0/3] Ensure that a request queue is dissociated from the cgroup controller

2018-08-09 Thread Bart Van Assche
Hello Jens, Several block drivers call alloc_disk() followed by put_disk() if something fails before device_add_disk() is called without calling blk_cleanup_queue(). Make sure that also for this scenario a request queue is dissociated from the cgroup controller. This patch avoids that loading the

[PATCH v5 1/3] blkcg: Introduce blkg_root_lookup()

2018-08-09 Thread Bart Van Assche
This new function will be used in a later patch to verify whether a queue has been dissociated from the cgroup controller before being released. Signed-off-by: Bart Van Assche Cc: Tejun Heo Cc: Christoph Hellwig Cc: Ming Lei Cc: Omar Sandoval Cc: Johannes Thumshirn Cc: Alexandru Moise <00mos

[PATCH v5 2/3] block: Introduce blk_exit_queue()

2018-08-09 Thread Bart Van Assche
This patch does not change any functionality. Signed-off-by: Bart Van Assche Reviewed-by: Johannes Thumshirn Cc: Christoph Hellwig Cc: Ming Lei Cc: Omar Sandoval Cc: Alexandru Moise <00moses.alexande...@gmail.com> Cc: Joseph Qi Cc: --- block/blk-core.c | 54 +++-

[PATCH] block: Remove two superfluous #include directives

2018-08-09 Thread Bart Van Assche
Commit 12f5b9314545 ("blk-mq: Remove generation seqeunce") removed the only seqcount_t and u64_stats_sync instances from but did not remove the corresponding #include directives. Since these include directives are no longer needed, remove them. Signed-off-by: Bart Van Assche Cc: Christoph Hellwi

Re: [PATCH v5 4/9] percpu-refcount: Introduce percpu_ref_is_in_use()

2018-08-09 Thread Bart Van Assche
On Wed, 2018-08-08 at 08:23 -0700, Tejun Heo wrote: > On Tue, Aug 07, 2018 at 03:51:28PM -0700, Bart Van Assche wrote: > > Introduce a function that allows to determine whether a per-cpu refcount > > is in use. This function will be used in a later patch to determine > > whether or not any block la

Re: [PATCH] block: bvec_nr_vecs() returns value for wrong slab

2018-08-09 Thread Jens Axboe
On 8/8/18 1:27 PM, Greg Edwards wrote: > In commit ed996a52c868 ("block: simplify and cleanup bvec pool > handling"), the value of the slab index is incremented by one in > bvec_alloc() after the allocation is done to indicate an index value of > 0 does not need to be later freed. > > bvec_nr_vecs
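The off-by-one described above can be sketched in userspace C. Everything here is illustrative: the table contents and function names are hypothetical, and since the preview is cut off before the fix is described, the decrement in the lookup is an assumption drawn from the subject line.

```c
#define SK_NR_SLABS 6

/* Hypothetical per-slab vector counts, for illustration only. */
static const unsigned short sk_slab_nr_vecs[SK_NR_SLABS] = {
	1, 4, 16, 64, 128, 256,
};

/*
 * The allocator stores "slab index + 1" so that a stored value of 0
 * can mean "nothing to free later".
 */
unsigned int sk_encode_idx(unsigned int slab)
{
	return slab + 1;
}

/*
 * Look up the vector count for a stored index. The stored value must be
 * decremented first; indexing the table with it directly would return
 * the entry of the next slab. Returns 0 for an out-of-range value.
 */
unsigned short sk_nr_vecs(unsigned int stored_idx)
{
	if (stored_idx == 0 || stored_idx > SK_NR_SLABS)
		return 0;
	return sk_slab_nr_vecs[stored_idx - 1];
}
```

For example, slab 2 is stored as 3; without the decrement the lookup would report 64 vectors instead of 16.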

Re: [GIT PULL] last round of nvme updates for 4.19

2018-08-09 Thread Jens Axboe
On 8/9/18 2:07 AM, Christoph Hellwig wrote: > Hi Jens, > > this should be the last round of NVMe updates before the 4.19 merge > window opens. It contains support for write protected (aka read-only) > namespaces from Chaitanya, two ANA fixes from Hannes and a fabrics > fix from Tal Shorer. Thanks

Re: [PATCH 00/10] bcache patches for 4.19, 2nd wave

2018-08-09 Thread Coly Li
On 2018/8/9 10:21 PM, Jens Axboe wrote: > On 8/9/18 1:48 AM, Coly Li wrote: >> Hi Jens, >> >> Here are 2nd wave bcache patches for 4.19. >> >> The patches from me were either verified by other people or posted >> for quite long time. Except for the debugfs_create_dir() fix and >> "set max writebac

Re: [PATCH 00/10] bcache patches for 4.19, 2nd wave

2018-08-09 Thread Jens Axboe
On 8/9/18 1:48 AM, Coly Li wrote: > Hi Jens, > > Here are 2nd wave bcache patches for 4.19. > > The patches from me were either verified by other people or posted > for quite long time. Except for the debugfs_create_dir() fix and > "set max writeback rate" fix, rested patches are simple or trivi

[GIT PULL] last round of nvme updates for 4.19

2018-08-09 Thread Christoph Hellwig
Hi Jens, this should be the last round of NVMe updates before the 4.19 merge window opens. It contains support for write protected (aka read-only) namespaces from Chaitanya, two ANA fixes from Hannes and a fabrics fix from Tal Shorer. The following changes since commit f10fe9d85dc0802b54519c9177

[PATCH 10/10] bcache: trivial - remove tailing backslash in macro BTREE_FLAG

2018-08-09 Thread Coly Li
From: Shenghui Wang Remove the tailing backslash in macro BTREE_FLAG in btree.h Signed-off-by: Shenghui Wang Signed-off-by: Coly Li --- drivers/md/bcache/btree.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/md/bcache/btree.h b/drivers/md/bcache/btree.h index d21

[PATCH 09/10] bcache: make the pr_err statement used for ENOENT only in sysfs_attatch section

2018-08-09 Thread Coly Li
From: Shenghui Wang The pr_err statement in the code for sysfs_attatch section would run for various error codes, which may be confusing. E.g., run the command twice: echo 796b5c05-b03c-4bc7-9cbd-a8df5e8be891 > \ /sys/block/bcache0/bcache/attach [the backing

[PATCH 08/10] bcache: set max writeback rate when I/O request is idle

2018-08-09 Thread Coly Li
Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle") allows the writeback rate to be faster if there is no I/O request on a bcache device. It works well if there is only one bcache device attached to the cache set. If there are many bcache devices attached to a cache set, it may

[PATCH 07/10] bcache: add code comments for bset.c

2018-08-09 Thread Coly Li
This patch adds code comments in bset.c to make some tricky code and design more comprehensible. Most of the information in this patch comes from the discussion between Kent and me; he offered very informative details. If there is any mistake of the idea behind the code, no doubt that's f

[PATCH 06/10] bcache: fix mistaken comments in request.c

2018-08-09 Thread Coly Li
This patch updates the code comment in bch_keylist_realloc() by fixing incorrect function names, to make the code more comprehensible. Signed-off-by: Coly Li --- drivers/md/bcache/request.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/md/bcache/request.c b/dri

[PATCH 04/10] bcache: add a comment in super.c

2018-08-09 Thread Coly Li
This patch adds a line of code comment in super.c:register_bdev(), to make the code more comprehensible. Signed-off-by: Coly Li --- drivers/md/bcache/super.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index c7ffa6ef3f82..f517d7d1fa

[PATCH 03/10] bcache: avoid unncessary cache prefetch bch_btree_node_get()

2018-08-09 Thread Coly Li
In bch_btree_node_get() the read-in btree node will be partially prefetched into L1 cache for the following bset iteration (if there is one). But if the btree node read fails, the prefetch operations will waste L1 cache space. This patch checks the read operation and only does cache prefetch when
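The idea of skipping the prefetch after a failed read can be sketched as follows. This is not bcache code: the struct and function are hypothetical stand-ins, and __builtin_prefetch is the GCC/Clang builtin rather than the kernel's own prefetch helper.

```c
/* Hypothetical stand-in for a btree node that has just been read in. */
struct sk_node {
	int io_error;		/* nonzero if the read failed */
	const void *sets[4];	/* data the next iteration would touch */
	unsigned int nsets;	/* number of valid entries in sets[] */
};

/*
 * Prefetch the node's sets only if the read succeeded.
 * Returns the number of prefetch hints issued.
 */
unsigned int sk_prefetch_node(const struct sk_node *n)
{
	unsigned int i;

	/* A failed read leaves nothing useful to pull into the cache. */
	if (n->io_error)
		return 0;

	for (i = 0; i < n->nsets; i++)
		__builtin_prefetch(n->sets[i]);
	return n->nsets;
}
```

Returning the hint count is only there to make the behaviour observable in a test; a prefetch hint itself has no visible effect on program state.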

[PATCH 05/10] bcache: fix mistaken code comments in bcache.h

2018-08-09 Thread Coly Li
This patch updates the code comment in struct cache with correct array names, to make the code more comprehensible. Signed-off-by: Coly Li --- drivers/md/bcache/bcache.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache

[PATCH 02/10] bcache: display rate debug parameters to 0 when writeback is not running

2018-08-09 Thread Coly Li
When writeback is not running, the writeback rate should be 0; any other value is misleading. The following dynamic writeback rate debug parameters should be 0 too: rate, proportional, integral, change; otherwise they are misleading when writeback is not running. Signed-off-by: Coly Li --- dr

[PATCH 00/10] bcache patches for 4.19, 2nd wave

2018-08-09 Thread Coly Li
Hi Jens, Here are 2nd wave bcache patches for 4.19. The patches from me were either verified by other people or posted for quite long time. Except for the debugfs_create_dir() fix and "set max writeback rate" fix, rested patches are simple or trivial IMHO. Our new bcache developer Shenghui Wang

[PATCH 01/10] bcache: do not check return value of debugfs_create_dir()

2018-08-09 Thread Coly Li
Greg KH suggests that normal code should not care about debugfs. Therefore, whether debugfs_create_dir() succeeds or fails, it is unnecessary to check its return value. Two functions call debugfs_create_dir() and check the return value: bch_debug_init() and cl