Re: bfq: BUG bfq_queue: Objects remaining in bfq_queue on __kmem_cache_shutdown() after rmmod

2017-12-20 Thread Paolo Valente
> On 21 Dec 2017, at 08:08, Guoqing Jiang wrote: > > Hi, > > > On 12/08/2017 08:34 AM, Holger Hoffstätte wrote: >> So plugging in a device on USB with BFQ as scheduler now works without >> hiccup (probably thanks to Ming Lei's last patch), but of course

Re: bfq: BUG bfq_queue: Objects remaining in bfq_queue on __kmem_cache_shutdown() after rmmod

2017-12-20 Thread Guoqing Jiang
Hi, On 12/08/2017 08:34 AM, Holger Hoffstätte wrote: So plugging in a device on USB with BFQ as scheduler now works without hiccup (probably thanks to Ming Lei's last patch), but of course I found another problem. Unmounting the device after use, changing the scheduler back to deadline or

[PATCH V9 3/7] mq-deadline: Introduce zone locking support

2017-12-20 Thread Damien Le Moal
Introduce zone write locking to avoid write request reordering with zoned block devices. This is achieved using a finer selection of the next request to dispatch: 1) Any non-write request is always allowed to proceed. 2) Any write to a conventional zone is always allowed to proceed. 3) For a write
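
The dispatch rules listed in this snippet can be modelled in a few lines. The following is a minimal Python sketch, not the kernel code: the names `Request`, `ZonedQueue`, `seq_zones` and `write_locked` are hypothetical, and because the snippet is cut off at rule 3, the sketch assumes that a write to a sequential zone may proceed only while that zone is not already write-locked.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    is_write: bool
    zone: int  # index of the zone the request targets


@dataclass
class ZonedQueue:
    # Hypothetical model: indices of the sequential (non-conventional) zones.
    seq_zones: set
    # Sequential zones that currently have a write in flight.
    write_locked: set = field(default_factory=set)

    def can_dispatch(self, rq: Request) -> bool:
        if not rq.is_write:
            return True  # rule 1: non-write requests always proceed
        if rq.zone not in self.seq_zones:
            return True  # rule 2: writes to conventional zones always proceed
        # Assumed rule 3: one write at a time per sequential zone.
        return rq.zone not in self.write_locked

    def dispatch(self, rq: Request) -> bool:
        """Dispatch rq if allowed, taking the zone write lock when needed."""
        if not self.can_dispatch(rq):
            return False
        if rq.is_write and rq.zone in self.seq_zones:
            self.write_locked.add(rq.zone)
        return True

    def complete(self, rq: Request) -> None:
        """Release the zone write lock when a write completes."""
        if rq.is_write:
            self.write_locked.discard(rq.zone)
```

In this model a second write to a locked sequential zone is simply skipped until the first one completes, which is what keeps writes within a zone from being reordered.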

[PATCH V9 4/7] deadline-iosched: Introduce dispatch helpers

2017-12-20 Thread Damien Le Moal
Avoid directly referencing the next_rq and fifo_list arrays using the helper functions deadline_next_request() and deadline_fifo_request() to facilitate changes in the dispatch request selection in deadline_dispatch_requests() for zoned block devices. While at it, also remove the unnecessary

[PATCH V9 7/7] sd: Remove zone write locking

2017-12-20 Thread Damien Le Moal
The block layer now handles zone write locking. Signed-off-by: Damien Le Moal Reviewed-by: Christoph Hellwig Reviewed-by: Martin K. Petersen --- drivers/scsi/sd.c| 41 +++- drivers/scsi/sd.h|

[PATCH V9 6/7] sd_zbc: Initialize device request queue zoned data

2017-12-20 Thread Damien Le Moal
Initialize the seq_zones_bitmap, seq_zones_wlock and nr_zones fields of the disk request queue on disk revalidate. As the seq_zones_bitmap and seq_zones_wlock allocations are identical, introduce the helper sd_zbc_alloc_zone_bitmap(). Using this helper, reallocate the bitmaps whenever the disk

[PATCH V9 5/7] deadline-iosched: Introduce zone locking support

2017-12-20 Thread Damien Le Moal
Introduce zone write locking to avoid write request reordering with zoned block devices. This is achieved using a finer selection of the next request to dispatch: 1) Any non-write request is always allowed to proceed. 2) Any write to a conventional zone is always allowed to proceed. 3) For a write

[PATCH V9 2/7] mq-deadline: Introduce dispatch helpers

2017-12-20 Thread Damien Le Moal
Avoid directly referencing the next_rq and fifo_list arrays using the helper functions deadline_next_request() and deadline_fifo_request() to facilitate changes in the dispatch request selection in __dd_dispatch_request() for zoned block devices. Signed-off-by: Damien Le Moal

[PATCH V9 1/7] block: introduce zoned block devices zone write locking

2017-12-20 Thread Damien Le Moal
From: Christoph Hellwig Components relying only on the request_queue structure for accessing block devices (e.g. I/O schedulers) have a limited knowledge of the device characteristics. In particular, the device capacity cannot be easily discovered, which for a zoned block device

[PATCH V9 0/7] blk-mq support for ZBC disks

2017-12-20 Thread Damien Le Moal
This series, formerly titled "scsi-mq support for ZBC disks", implements support for ZBC disks for systems using the scsi-mq I/O path. The current scsi level support of ZBC disks guarantees write request ordering using a per-zone write lock which prevents issuing simultaneously multiple write

Re: [PATCHSET v2] blk-mq: reimplement timeout handling

2017-12-20 Thread Bart Van Assche
On Wed, 2017-12-20 at 16:08 -0800, t...@kernel.org wrote: > On Wed, Dec 20, 2017 at 11:41:02PM +, Bart Van Assche wrote: > > On Tue, 2017-12-12 at 11:01 -0800, Tejun Heo wrote: > > > Currently, blk-mq timeout path synchronizes against the usual > > > issue/completion path using a complex

Re: [PATCHSET v2] blk-mq: reimplement timeout handling

2017-12-20 Thread t...@kernel.org
On Wed, Dec 20, 2017 at 11:41:02PM +, Bart Van Assche wrote: > On Tue, 2017-12-12 at 11:01 -0800, Tejun Heo wrote: > > Currently, blk-mq timeout path synchronizes against the usual > > issue/completion path using a complex scheme involving atomic > > bitflags, REQ_ATOM_*, memory barriers and

Re: [PATCHSET v2] blk-mq: reimplement timeout handling

2017-12-20 Thread Bart Van Assche
On Tue, 2017-12-12 at 11:01 -0800, Tejun Heo wrote: > Currently, blk-mq timeout path synchronizes against the usual > issue/completion path using a complex scheme involving atomic > bitflags, REQ_ATOM_*, memory barriers and subtle memory coherence > rules. Unfortunately, it contains quite a few

Re: [PATCH V3] block: drain queue before waiting for q_usage_counter becoming zero

2017-12-20 Thread Ming Lei
On Thu, Nov 30, 2017 at 07:56:35AM +0800, Ming Lei wrote: > Now we track legacy requests with .q_usage_counter in commit 055f6e18e08f > ("block: Make q_usage_counter also track legacy requests"), but that > commit never runs and drains legacy queue before waiting for this counter > becoming zero,

Re: [dm-devel] [for-4.16 PATCH 4/5] dm mpath: use NVMe error handling to know when an error is retryable

2017-12-20 Thread Sagi Grimberg
But interestingly, with my "mptest" link failure test (test_01_nvme_offline) I'm not actually seeing NVMe trigger a failure that needs a multipath layer (be it NVMe multipath or DM multipath) to fail a path and retry the IO. The pattern is that the link goes down, and nvme waits for it to come

Re: random call_single_data alignment

2017-12-20 Thread Jens Axboe
On 12/20/17 1:18 PM, Peter Zijlstra wrote: > On Wed, Dec 20, 2017 at 12:40:25PM -0700, Jens Axboe wrote: >> On 12/20/17 12:10 PM, Jens Axboe wrote: >>> For some reason, commit 966a967116e6 was added to the tree without >>> CC'ing relevant maintainers, even though it's touching random subsystems.

Re: random call_single_data alignment

2017-12-20 Thread Peter Zijlstra
On Wed, Dec 20, 2017 at 12:40:25PM -0700, Jens Axboe wrote: > On 12/20/17 12:10 PM, Jens Axboe wrote: > > For some reason, commit 966a967116e6 was added to the tree without > > CC'ing relevant maintainers, even though it's touching random subsystems. > > One example is struct request, a core

Re: random call_single_data alignment

2017-12-20 Thread Jens Axboe
On 12/20/17 12:10 PM, Jens Axboe wrote: > For some reason, commit 966a967116e6 was added to the tree without > CC'ing relevant maintainers, even though it's touching random subsystems. > One example is struct request, a core structure in the block layer. > After this change, struct request grows

[PATCH] null_blk: unalign call_single_data

2017-12-20 Thread Jens Axboe
Commit 966a967116e6 randomly added alignment to this structure, but it's actually detrimental to performance of null_blk. Test case: # modprobe null_blk queue_mode=1 irqmode=1 home_node={0,1} # echo noop > /sys/block/nullb0/queue/scheduler # fio --name=csd --filename=/dev/nullb0 --numjobs=8

[PATCH] block: unalign call_single_data in struct request

2017-12-20 Thread Jens Axboe
A previous change blindly added massive alignment to the call_single_data structure in struct request. This ballooned it in size from 296 to 320 bytes on my setup, for no valid reason at all. Use the unaligned struct __call_single_data variant instead. Fixes: 966a967116e69 ("smp: Avoid using two
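
The jump from 296 to 320 bytes is what one would expect when a structure's size is rounded up to a multiple of a larger alignment. A minimal Python sketch of that arithmetic follows; the 32-byte alignment is an assumption (the snippet does not state the value), and padding inside the structure is ignored.

```python
def round_up(size: int, align: int) -> int:
    """Round size up to the next multiple of align (align must be > 0)."""
    return (size + align - 1) // align * align


# Assumption, not stated in the snippet: the aligned member forces
# a 32-byte alignment on the enclosing structure.
ASSUMED_ALIGN = 32

unaligned_size = 296  # size reported before the change
padded_size = round_up(unaligned_size, ASSUMED_ALIGN)
```

Under that assumption `padded_size` matches the 320 bytes reported, while with ordinary 8-byte alignment the size stays at 296.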

random call_single_data alignment

2017-12-20 Thread Jens Axboe
For some reason, commit 966a967116e6 was added to the tree without CC'ing relevant maintainers, even though it's touching random subsystems. One example is struct request, a core structure in the block layer. After this change, struct request grows from 296 to 320 bytes on my setup. Why are we

Re: [PATCH V2] block-throttle: avoid double charge

2017-12-20 Thread Jens Axboe
On 11/13/17 1:37 PM, Shaohua Li wrote: > If a bio is throttled and split after throttling, the bio could be > resubmitted and enter the throttling again. This will cause part of the > bio to be charged multiple times. If the cgroup has an IO limit, the double > charge will significantly harm the
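
The double charge described here can be illustrated with a toy model. This is a hedged Python sketch, not the block layer: `Bio`, `Throttler`, `split_and_resubmit` and the `charged` marker are hypothetical names, and the marker only stands in for a per-bio "already throttled" flag.

```python
from dataclasses import dataclass


@dataclass
class Bio:
    size: int
    charged: bool = False  # hypothetical "already accounted" marker


class Throttler:
    def __init__(self) -> None:
        self.total_charged = 0

    def submit(self, bio: Bio) -> None:
        # Charge the bio only the first time it passes through.
        if not bio.charged:
            self.total_charged += bio.size
            bio.charged = True


def split_and_resubmit(throttler: Throttler, bio: Bio,
                       front_size: int, keep_mark: bool) -> Bio:
    """Split off front_size bytes and resubmit the remainder.

    keep_mark=False models a split that loses the marker, so the
    remainder is charged a second time.
    """
    remainder = Bio(bio.size - front_size, charged=bio.charged and keep_mark)
    throttler.submit(remainder)
    return remainder
```

With a 100-byte bio and a 40-byte front part, losing the marker charges 160 bytes in total, while preserving it keeps the charge at 100.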

[PATCH 01/25] null_blk: remove lightnvm support

2017-12-20 Thread Matias Bjørling
With rrpc to be removed, the null_blk lightnvm support is no longer functional. Remove the lightnvm implementation and maybe add it to another module in the future if someone takes on the challenge. Signed-off-by: Matias Bjørling --- drivers/block/null_blk.c | 220

[PATCH 00/25] Updates to lightnvm and pblk

2017-12-20 Thread Matias Bjørling
Hi, A bunch of patches for the lightnvm subsystem and pblk. The first part is preparation patches for the 2.0 revision of the specification. This includes removing the null_blk implementation and killing the rrpc implementation, which used already deprecated definitions from the 1.2 revision.

[PATCH 05/25] lightnvm: remove unnecessary field from nvm_rq

2017-12-20 Thread Matias Bjørling
From: Javier González Remove the wait field in nvm_rq. It is not used anymore, as targets rely on the functionality provided by the LightNVM subsystem when sending sync I/O. Signed-off-by: Javier González Signed-off-by: Matias Bjørling

[PATCH 06/25] lightnvm: remove lower page tables

2017-12-20 Thread Matias Bjørling
The lower page table is unused. All page tables reported by 1.2 devices report a sequential 1:1 page mapping. This is also not used going forward with the 2.0 revision. Signed-off-by: Matias Bjørling Reviewed-by: Javier González Signed-off-by:

Re: [PATCH] lightnvm: pblk: remove some unnecessary NULL checks

2017-12-20 Thread Matias Bjørling
On 11/06/2017 12:48 PM, Dan Carpenter wrote: Smatch complains that flush_workqueue() dereferences the work queue pointer but then we check if it's NULL on the next line when it's too late. These NULL checks can be removed because the module won't load if we can't allocate the work queues.

[PATCH 11/25] lightnvm: pblk: remove pblk_for_each_lun helper

2017-12-20 Thread Matias Bjørling
From: Javier González Signed-off-by: Javier González Signed-off-by: Matias Bjørling --- drivers/lightnvm/pblk.h | 4 1 file changed, 4 deletions(-) diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h index

[PATCH 03/25] lightnvm: use internal pblk methods

2017-12-20 Thread Matias Bjørling
Now that rrpc has been removed, the only user of the ppa helpers is pblk. However, pblk already defines similar functions. Switch pblk to use the internal ones, and remove the generic ppa helpers. Signed-off-by: Matias Bjørling --- drivers/lightnvm/pblk-map.c | 2 +-

[PATCH 07/25] lightnvm: make geometry structures 2.0 ready

2017-12-20 Thread Matias Bjørling
From: Matias Bjørling Prepare for the 2.0 revision by adapting the geometry structures to coexist with the 1.2 revision. Signed-off-by: Matias Bjørling Reviewed-by: Javier González Signed-off-by: Matias Bjørling

[PATCH 09/25] lightnvm: guarantee target unique name across devs.

2017-12-20 Thread Matias Bjørling
From: Javier González Until now, target unique naming is only guaranteed per device. This is ok from a lightnvm perspective, but not from a sysfs one, since groups will collide regardless of the underlying device. Check that names are unique across all lightnvm-capable

[PATCH 10/25] lightnvm: pblk: compress and reorder helper functions

2017-12-20 Thread Matias Bjørling
From: Javier González Over time, we have generated some redundant helper functions. Refactor them to eliminate redundant and unnecessary code. Also, reorder them to improve readability. Signed-off-by: Javier González Signed-off-by: Matias Bjørling

[PATCH 08/25] lightnvm: refactor target type lookup

2017-12-20 Thread Matias Bjørling
From: Javier González Refactor target type lookup to use/not use locks explicitly instead of using a hidden parameter to make the function locking. Signed-off-by: Javier González Signed-off-by: Matias Bjørling ---

[PATCH 15/25] lightnvm: pblk: prevent premature sync point resets

2017-12-20 Thread Matias Bjørling
From: Hans Holmberg Unless we protect flush pointer updates with a lock, we risk resetting new flush points before we've synced all sectors up to that point. This patch protects new flush points with the same spin lock that is being held when advancing the sync

[PATCH 18/25] lightnvm: set target over-provision on create ioctl

2017-12-20 Thread Matias Bjørling
From: Javier González Allow setting the over-provision percentage on target creation. If the value is not provided, fall back to the default value set by the target. In pblk, set the default OP to 11% of the total size of the device. Signed-off-by: Javier González
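
The fallback behaviour described here amounts to a small piece of arithmetic. A rough Python sketch, under stated assumptions: `effective_op` and `reserved_blocks` are hypothetical helpers, a missing value is modelled as `None` (how the ioctl actually signals "not provided" is not shown in the snippet), and only the 11% default comes from the patch description.

```python
from typing import Optional

PBLK_DEFAULT_OP = 11  # percent; default stated in the patch description


def effective_op(requested: Optional[int] = None,
                 target_default: int = PBLK_DEFAULT_OP) -> int:
    """Return the over-provision percentage to use for a new target."""
    # Assumption: None means the caller did not provide a value.
    return target_default if requested is None else requested


def reserved_blocks(total_blocks: int, op_percent: int) -> int:
    """Blocks set aside for over-provisioning (integer arithmetic)."""
    return total_blocks * op_percent // 100
```

For a device modelled as 1000 blocks, the default would reserve 110 of them, and an explicit request of 20% would reserve 200.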

[PATCH 12/25] lightnvm: pblk: refactor emeta consistency check

2017-12-20 Thread Matias Bjørling
From: Hans Holmberg Currently pblk_recov_get_lba list does two separate things: it checks the consistency of the emeta and extracts the lba list. This patch separates the consistency check to make the code easier to read and to prepare for version checks of the line

[PATCH 16/25] lightnvm: pblk: remove pblk_gc_stop

2017-12-20 Thread Matias Bjørling
From: Hans Holmberg pblk_gc_stop just sets pblk->gc->gc_active to zero, ignoring the flush parameter. This is plain confusing, so remove the function and set the gc active flag at the call points instead. Signed-off-by: Hans Holmberg

[PATCH 17/25] lightnvm: pblk: use exact free block counter in RL

2017-12-20 Thread Matias Bjørling
From: Javier González Until now, pblk's rate-limiter has used a heuristic to reserve space for GC I/O given that the over-provision area was fixed. In preparation for allowing to define the over-provision area on target creation, define a dedicated free_block counter in the

[PATCH 25/25] lightnvm: pblk: refactor pblk_ppa_comp function

2017-12-20 Thread Matias Bjørling
Shorten function to simply return the value of the if statement. Signed-off-by: Matias Bjørling --- drivers/lightnvm/pblk.h | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h index 1e9eafd..a1434da 100644

[PATCH 24/25] lightnvm: pblk: add iostat support

2017-12-20 Thread Matias Bjørling
From: Javier González Since pblk registers its own block device, the iostat accounting is not automatically done for us. Therefore, add the necessary accounting logic to satisfy the iostat interface. Signed-off-by: Javier González Signed-off-by: Matias

[PATCH 20/25] lightnvm: pblk: do not log recovery read errors

2017-12-20 Thread Matias Bjørling
From: Javier González On scan recovery, reads can fail. This happens because the first page for each line is read in order to determine if the line has been used (and thus needs to be recovered), or not. This can lead to "empty page" read errors. Since these errors are

[PATCH 21/25] lightnvm: pblk: ensure kthread alloc. before kicking it

2017-12-20 Thread Matias Bjørling
From: Javier González When creating the write thread, ensure that the kthread has been created before initializing the timer responsible for kicking it. Otherwise, if the kthread creation fails or gets killed from user space, we risk kicking an empty thread structure.

[PATCH 19/25] lightnvm: pblk: ignore high ecc errors on recovery

2017-12-20 Thread Matias Bjørling
From: Javier González On recovery, do not stop L2P recovery if reads report high ECC error as the data is still available. Signed-off-by: Javier González Signed-off-by: Matias Bjørling --- drivers/lightnvm/pblk-recovery.c | 2 +- 1

[PATCH 23/25] lightnvm: pblk: print instance name on instance info

2017-12-20 Thread Matias Bjørling
From: Javier González Add the instance name to the information printed out on target creation. Signed-off-by: Javier González Signed-off-by: Matias Bjørling --- drivers/lightnvm/pblk-init.c | 3 ++- 1 file changed, 2 insertions(+),

[PATCH 14/25] lightnvm: pblk: clear flush point on completed writes

2017-12-20 Thread Matias Bjørling
From: Hans Holmberg Move completion of syncs and clearing of flush points to the write completion path - this ensures that the data has been committed to the media before completing bios containing syncs. Signed-off-by: Hans Holmberg

[PATCH 13/25] lightnvm: pblk: rename sync_point to flush_point

2017-12-20 Thread Matias Bjørling
From: Hans Holmberg Sync point is a really confusing name for keeping track of the last entry that needs to be flushed, so change the name to flush_point instead. Signed-off-by: Hans Holmberg Signed-off-by: Javier González

[PATCH 04/25] lightnvm: remove hybrid ocssd 1.2 support

2017-12-20 Thread Matias Bjørling
Now that rrpc has been removed, also remove the hybrid 1.2 support from the core. Signed-off-by: Matias Bjørling --- drivers/lightnvm/core.c | 141 --- drivers/nvme/host/lightnvm.c | 59 -- include/linux/lightnvm.h

[PATCH 02/25] lightnvm: remove rrpc

2017-12-20 Thread Matias Bjørling
The hybrid mode for 1.2 revision was deprecated, and has no users. Remove to make it easier to move to the 2.0 revision. Signed-off-by: Matias Bjørling --- drivers/lightnvm/Kconfig |7 - drivers/lightnvm/Makefile |1 - drivers/lightnvm/rrpc.c | 1625

[PATCH] BLOCK: blk-flush: fixed line with more than 80 character

2017-12-20 Thread Khan M Rashedun-Naby
The line has more than 80 characters, which violates the Linux kernel coding style, so it has been rewritten as two lines. Signed-off-by: Khan M Rashedun-Naby --- block/blk-flush.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/block/blk-flush.c

Re: [PATCH V2] block-throttle: avoid double charge

2017-12-20 Thread Shaohua Li
On Wed, Dec 20, 2017 at 02:42:20PM +0800, xuejiufei wrote: > Hi Shaohua, > > I noticed that the split bio will go to the scheduler directly while > the cloned bio enters generic_make_request again. So can we just > leave the BIO_THROTTLED flag in the original bio and do not copy the >

[PATCH IMPROVEMENT] block, bfq: increase threshold to deem I/O as random

2017-12-20 Thread Paolo Valente
If two processes do I/O close to each other, i.e., are cooperating processes in BFQ (and CFQ's) nomenclature, then BFQ merges their associated bfq_queues, so as to get sequential I/O from the union of the I/O requests of the processes, and thus reach a higher throughput. A merged queue is then

[PATCH] Block: blk-flush: removed an unnecessary else statement

2017-12-20 Thread Khan M Rashedun-Naby
As both the if and else blocks return a value, the else statement is not needed. Thus this else statement has been removed. Signed-off-by: Khan M Rashedun-Naby --- block/blk-flush.c | 13 +++-- 1 file changed, 7 insertions(+), 6

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-12-20 Thread Christian Borntraeger
On 12/18/2017 02:56 PM, Stefan Haberland wrote: > On 07.12.2017 00:29, Christoph Hellwig wrote: >> On Wed, Dec 06, 2017 at 01:25:11PM +0100, Christian Borntraeger wrote: >> t > commit 11b2025c3326f7096ceb588c3117c7883850c068    -> bad >>> blk-mq: create a blk_mq_ctx for each possible CPU >>>

Re: [PATCH V2 0/2] block: fix queue freeze and cleanup

2017-12-20 Thread Mauricio Faria de Oliveira
Hi Bart, On 12/13/2017 07:49 PM, Bart Van Assche wrote: Would it be possible to repeat your test with the patch below applied on your kernel tree? This patch has just been posted on the dm-devel mailing list. Sorry for the delay. I missed this. Unfortunately the oops problem still happens on

[PATCH IMPROVEMENT/BUGFIX 2/4] block, bfq: check low_latency flag in bfq_bfqq_save_state()

2017-12-20 Thread Paolo Valente
From: Angelo Ruocco A just-created bfq_queue will certainly be deemed as interactive on the arrival of its first I/O request, if the low_latency flag is set. Yet, if the queue is merged with another queue on the arrival of its first I/O request, it will not have the

[PATCH IMPROVEMENT/BUGFIX 3/4] block, bfq: let a queue be merged only shortly after starting I/O

2017-12-20 Thread Paolo Valente
In BFQ and CFQ, two processes are said to be cooperating if they do I/O in such a way that the union of their I/O requests yields a sequential I/O pattern. To get such a sequential I/O pattern out of the non-sequential pattern of each cooperating process, BFQ and CFQ merge the queues associated

[PATCH IMPROVEMENT/BUGFIX 4/4] block, bfq: remove superfluous check in queue-merging setup

2017-12-20 Thread Paolo Valente
From: Angelo Ruocco When two or more processes do I/O in such a way that their requests are sequential with respect to one another, BFQ merges the bfq_queues associated with the processes. This way the overall I/O pattern becomes sequential, and thus there is a boost in

[PATCH IMPROVEMENT/BUGFIX 1/4] block, bfq: add missing rq_pos_tree update on rq removal

2017-12-20 Thread Paolo Valente
If two processes do I/O close to each other, then BFQ merges the bfq_queues associated with these processes, to get a more sequential I/O, and thus a higher throughput. In this respect, to detect whether two processes are doing I/O close to each other, BFQ keeps a list of the head-of-line I/O

[PATCH IMPROVEMENT/BUGFIX 0/4] remove start-up time outlier caused by wrong detection of cooperating processes

2017-12-20 Thread Paolo Valente
Hi, the main patch in this series ("block, bfq: let a queue be merged only shortly after starting I/O") eliminates an outlier in the application start-up time guaranteed by BFQ. This outlier occurs more or less frequently, as a function of the characteristics of the system, and is caused by a

[PATCH blktests] Fix syntax error with bash v4.1.2(e.g RHEL6)

2017-12-20 Thread xiao yang
The -v option is not supported by conditional expressions in bash v4.1.2, so use -n instead of -v to fix this issue. Signed-off-by: xiao yang --- check| 12 ++-- common/fio | 2 +- common/rc| 2 +- tests/meta/group | 4 ++-- 4 files

Re: [PATCH blktests] block/013: Add test for BLKRRPART ioctl

2017-12-20 Thread Johannes Thumshirn
Omar Sandoval writes: > On Tue, Dec 19, 2017 at 11:47:09AM +0100, Johannes Thumshirn wrote: >> xiao yang writes: >> >> > +requires() { >> > + _have_program mkfs.ext3 >> > +} >> [...] >> > + # Format >> > + mkfs.ext3 -F "$TEST_DEV" >> "$FULL"