> On 21 Dec 2017, at 08:08, Guoqing Jiang wrote:
>
> Hi,
>
>
> On 12/08/2017 08:34 AM, Holger Hoffstätte wrote:
>> So plugging in a device on USB with BFQ as scheduler now works without
>> hiccup (probably thanks to Ming Lei's last patch), but of course
Hi,
On 12/08/2017 08:34 AM, Holger Hoffstätte wrote:
So plugging in a device on USB with BFQ as scheduler now works without
hiccup (probably thanks to Ming Lei's last patch), but of course I found
another problem. Unmounting the device after use, changing the scheduler
back to deadline or
Introduce zone write locking to avoid write request reordering with
zoned block devices. This is achieved using a finer selection of the
next request to dispatch:
1) Any non-write request is always allowed to proceed.
2) Any write to a conventional zone is always allowed to proceed.
3) For a write
Avoid directly referencing the next_rq and fifo_list arrays using the
helper functions deadline_next_request() and deadline_fifo_request() to
facilitate changes in the dispatch request selection in
deadline_dispatch_requests() for zoned block devices.
While at it, also remove the unnecessary
The block layer now handles zone write locking.
Signed-off-by: Damien Le Moal
Reviewed-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
---
drivers/scsi/sd.c| 41 +++-
drivers/scsi/sd.h|
Initialize the seq_zones_bitmap, seq_zones_wlock and nr_zones fields of
the disk request queue on disk revalidate. As the seq_zones_bitmap
and seq_zones_wlock allocations are identical, introduce the helper
sd_zbc_alloc_zone_bitmap(). Using this helper, reallocate the bitmaps
whenever the disk
Introduce zone write locking to avoid write request reordering with
zoned block devices. This is achieved using a finer selection of the
next request to dispatch:
1) Any non-write request is always allowed to proceed.
2) Any write to a conventional zone is always allowed to proceed.
3) For a write
Avoid directly referencing the next_rq and fifo_list arrays using the
helper functions deadline_next_request() and deadline_fifo_request() to
facilitate changes in the dispatch request selection in
__dd_dispatch_request() for zoned block devices.
Signed-off-by: Damien Le Moal
From: Christoph Hellwig
Components relying only on the request_queue structure for accessing
block devices (e.g. I/O schedulers) have limited knowledge of the
device characteristics. In particular, the device capacity cannot be
easily discovered, which for a zoned block device
This series, formerly titled "scsi-mq support for ZBC disks", implements
support for ZBC disks for systems using the scsi-mq I/O path.
The current SCSI-level support of ZBC disks guarantees write request ordering
using a per-zone write lock which prevents simultaneously issuing multiple
write
On Wed, 2017-12-20 at 16:08 -0800, t...@kernel.org wrote:
> On Wed, Dec 20, 2017 at 11:41:02PM +, Bart Van Assche wrote:
> > On Tue, 2017-12-12 at 11:01 -0800, Tejun Heo wrote:
> > > Currently, blk-mq timeout path synchronizes against the usual
> > > issue/completion path using a complex
On Wed, Dec 20, 2017 at 11:41:02PM +, Bart Van Assche wrote:
> On Tue, 2017-12-12 at 11:01 -0800, Tejun Heo wrote:
> > Currently, blk-mq timeout path synchronizes against the usual
> > issue/completion path using a complex scheme involving atomic
> > bitflags, REQ_ATOM_*, memory barriers and
On Tue, 2017-12-12 at 11:01 -0800, Tejun Heo wrote:
> Currently, blk-mq timeout path synchronizes against the usual
> issue/completion path using a complex scheme involving atomic
> bitflags, REQ_ATOM_*, memory barriers and subtle memory coherence
> rules. Unfortunately, it contains quite a few
On Thu, Nov 30, 2017 at 07:56:35AM +0800, Ming Lei wrote:
> Now we track legacy requests with .q_usage_counter in commit 055f6e18e08f
> ("block: Make q_usage_counter also track legacy requests"), but that
> commit never runs and drains the legacy queue before waiting for this
> counter to become zero,
But interestingly, with my "mptest" link failure test
(test_01_nvme_offline) I'm not actually seeing NVMe trigger a failure
that needs a multipath layer (be it NVMe multipath or DM multipath) to
fail a path and retry the IO. The pattern is that the link goes down,
and nvme waits for it to come
On 12/20/17 1:18 PM, Peter Zijlstra wrote:
> On Wed, Dec 20, 2017 at 12:40:25PM -0700, Jens Axboe wrote:
>> On 12/20/17 12:10 PM, Jens Axboe wrote:
>>> For some reason, commit 966a967116e6 was added to the tree without
>>> CC'ing relevant maintainers, even though it's touching random subsystems.
On Wed, Dec 20, 2017 at 12:40:25PM -0700, Jens Axboe wrote:
> On 12/20/17 12:10 PM, Jens Axboe wrote:
> > For some reason, commit 966a967116e6 was added to the tree without
> > CC'ing relevant maintainers, even though it's touching random subsystems.
> > One example is struct request, a core
On 12/20/17 12:10 PM, Jens Axboe wrote:
> For some reason, commit 966a967116e6 was added to the tree without
> CC'ing relevant maintainers, even though it's touching random subsystems.
> One example is struct request, a core structure in the block layer.
> After this change, struct request grows
Commit 966a967116e6 randomly added alignment to this structure, but
it's actually detrimental to performance of null_blk. Test case:
# modprobe null_blk queue_mode=1 irqmode=1 home_node={0,1}
# echo noop > /sys/block/nullb0/queue/scheduler
# fio --name=csd --filename=/dev/nullb0 --numjobs=8
A previous change blindly added massive alignment to the
call_single_data structure in struct request. This ballooned it in
size from 296 to 320 bytes on my setup, for no valid reason at all.
Use the unaligned struct __call_single_data variant instead.
Fixes: 966a967116e69 ("smp: Avoid using two
For some reason, commit 966a967116e6 was added to the tree without
CC'ing relevant maintainers, even though it's touching random subsystems.
One example is struct request, a core structure in the block layer.
After this change, struct request grows from 296 to 320 bytes on my
setup.
Why are we
On 11/13/17 1:37 PM, Shaohua Li wrote:
> If a bio is throttled and split after throttling, the bio could be
> resubmitted and enter the throttling again. This will cause part of the
> bio to be charged multiple times. If the cgroup has an IO limit, the double
> charge will significantly harm the
With rrpc to be removed, the null_blk lightnvm support is no longer
functional. Remove the lightnvm implementation and maybe add it to
another module in the future if someone takes on the challenge.
Signed-off-by: Matias Bjørling
---
drivers/block/null_blk.c | 220
Hi,
A bunch of patches for the lightnvm subsystem and pblk.
The first part is preparation patches for the 2.0 revision of the
specification. This includes removing the null_blk implementation
and killing the rrpc implementation, which used already deprecated
definitions from the 1.2 revision.
From: Javier González
Remove the wait field in nvm_rq. It is not used anymore, as targets rely
on the functionality provided by the LightNVM subsystem when sending
sync I/O.
Signed-off-by: Javier González
Signed-off-by: Matias Bjørling
The lower page table is unused. All page tables reported by 1.2
devices report a sequential 1:1 page mapping. This is
also not used going forward with the 2.0 revision.
Signed-off-by: Matias Bjørling
Reviewed-by: Javier González
Signed-off-by:
On 11/06/2017 12:48 PM, Dan Carpenter wrote:
Smatch complains that flush_workqueue() dereferences the work queue
pointer but then we check if it's NULL on the next line when it's too
late. These NULL checks can be removed because the module won't load if
we can't allocate the work queues.
From: Javier González
Signed-off-by: Javier González
Signed-off-by: Matias Bjørling
---
drivers/lightnvm/pblk.h | 4
1 file changed, 4 deletions(-)
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index
Now that rrpc has been removed, the only user of the ppa helpers
is pblk. However, pblk already defines similar functions.
Switch pblk to use the internal ones, and remove the generic ppa
helpers.
Signed-off-by: Matias Bjørling
---
drivers/lightnvm/pblk-map.c | 2 +-
From: Matias Bjørling
Prepare for the 2.0 revision by adapting the geometry
structures to coexist with the 1.2 revision.
Signed-off-by: Matias Bjørling
Reviewed-by: Javier González
Signed-off-by: Matias Bjørling
From: Javier González
Until now, target unique naming is only guaranteed per device. This is
ok from a lightnvm perspective, but not from a sysfs one, since groups
will collide regardless of the underlying device.
Check that names are unique across all lightnvm-capable
From: Javier González
Over time, we have accumulated some redundant helper functions.
Refactor them to eliminate redundant and unnecessary code. Also, reorder
them to improve readability.
Signed-off-by: Javier González
Signed-off-by: Matias Bjørling
From: Javier González
Refactor target type lookup to use/not use locks explicitly instead of
using a hidden parameter to make the function take the lock.
Signed-off-by: Javier González
Signed-off-by: Matias Bjørling
---
From: Hans Holmberg
Unless we protect flush pointer updates with a lock, we risk
resetting new flush points before we've synced all sectors
up to that point.
This patch protects new flush points with the same spin lock
that is being held when advancing the sync
From: Javier González
Allow setting the over-provision percentage on target creation. If
the value is not provided, fall back to the default value set by
the target.
In pblk, set the default OP to 11% of the total size of the device.
Signed-off-by: Javier González
From: Hans Holmberg
Currently pblk_recov_get_lba list does two separate things:
it checks the consistency of the emeta and extracts the lba list.
This patch separates the consistency check to make the code easier
to read and to prepare for version checks of the line
From: Hans Holmberg
pblk_gc_stop just sets pblk->gc->gc_active to zero, ignoring
the flush parameter. This is plain confusing, so remove the
function and set the gc active flag at the call points instead.
Signed-off-by: Hans Holmberg
From: Javier González
Until now, pblk's rate-limiter has used a heuristic to reserve space for
GC I/O given that the over-provision area was fixed.
In preparation for allowing to define the over-provision area on target
creation, define a dedicated free_block counter in the
Shorten function to simply return the value of the if statement.
Signed-off-by: Matias Bjørling
---
drivers/lightnvm/pblk.h | 5 +
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index 1e9eafd..a1434da 100644
From: Javier González
Since pblk registers its own block device, the iostat accounting is
not automatically done for us. Therefore, add the necessary
accounting logic to satisfy the iostat interface.
Signed-off-by: Javier González
Signed-off-by: Matias
From: Javier González
On scan recovery, reads can fail. This happens because the first page
for each line is read in order to determine if the line has been used
(and thus needs to be recovered), or not. This can lead to "empty page"
read errors.
Since these errors are
From: Javier González
When creating the write thread, ensure that the kthread has been created
before initializing the timer responsible for kicking it. Otherwise, if
the kthread creation fails or gets killed from user space, we risk
kicking an empty thread structure.
From: Javier González
On recovery, do not stop L2P recovery if reads report high ECC error
as the data is still available.
Signed-off-by: Javier González
Signed-off-by: Matias Bjørling
---
drivers/lightnvm/pblk-recovery.c | 2 +-
1
From: Javier González
Add the instance name to the information printed out on target creation.
Signed-off-by: Javier González
Signed-off-by: Matias Bjørling
---
drivers/lightnvm/pblk-init.c | 3 ++-
1 file changed, 2 insertions(+),
From: Hans Holmberg
Move completion of syncs and clearing of flush points to the
write completion path - this ensures that the data has been
committed to the media before completing bios containing syncs.
Signed-off-by: Hans Holmberg
From: Hans Holmberg
Sync point is a really confusing name for keeping track of
the last entry that needs to be flushed so change the name
to flush_point instead.
Signed-off-by: Hans Holmberg
Signed-off-by: Javier González
Now that rrpc has been removed, also remove the hybrid 1.2 support
from the core.
Signed-off-by: Matias Bjørling
---
drivers/lightnvm/core.c | 141 ---
drivers/nvme/host/lightnvm.c | 59 --
include/linux/lightnvm.h
The hybrid mode for 1.2 revision was deprecated, and has
no users. Remove to make it easier to move to the 2.0 revision.
Signed-off-by: Matias Bjørling
---
drivers/lightnvm/Kconfig |7 -
drivers/lightnvm/Makefile |1 -
drivers/lightnvm/rrpc.c | 1625
The line has more than 80 characters, which violates the Linux kernel
coding style. Thus it has been rewritten as two lines.
Signed-off-by: Khan M Rashedun-Naby
---
block/blk-flush.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/block/blk-flush.c
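
A minimal sketch of how lines over the 80-column limit mentioned above could be spotted before sending a patch. The function name long_lines is hypothetical and not part of the patch; tabs are expanded to 8 columns first, which is an assumption about how width is counted. In a kernel tree, scripts/checkpatch.pl remains the authoritative check.

```shell
# long_lines FILE [LIMIT] (hypothetical helper, for illustration only):
# print "line-number: width" for every line wider than LIMIT columns
# (default 80), after expanding tabs to 8-column stops.
long_lines() {
	expand -t 8 "$1" |
		awk -v limit="${2:-80}" 'length($0) > limit { print NR ": " length($0) }'
}
```

For example, `long_lines block/blk-flush.c` would list any offending lines in that file, and prints nothing when every line fits.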
On Wed, Dec 20, 2017 at 02:42:20PM +0800, xuejiufei wrote:
> Hi Shaohua,
>
> I noticed that the split bio will go to the scheduler directly while
> the cloned bio enters generic_make_request again. So can we just
> leave the BIO_THROTTLED flag in the original bio and not copy the
>
If two processes do I/O close to each other, i.e., are cooperating
processes in BFQ (and CFQ's) nomenclature, then BFQ merges their
associated bfq_queues, so as to get sequential I/O from the union of
the I/O requests of the processes, and thus reach a higher
throughput. A merged queue is then
As both the if and the else block return a value, there is no need
for the else statement. Thus this else statement has been removed.
Signed-off-by: Khan M Rashedun-Naby
---
block/blk-flush.c | 13 +++--
1 file changed, 7 insertions(+), 6
On 12/18/2017 02:56 PM, Stefan Haberland wrote:
> On 07.12.2017 00:29, Christoph Hellwig wrote:
>> On Wed, Dec 06, 2017 at 01:25:11PM +0100, Christian Borntraeger wrote:
>>> commit 11b2025c3326f7096ceb588c3117c7883850c068 -> bad
>>> blk-mq: create a blk_mq_ctx for each possible CPU
>>>
Hi Bart,
On 12/13/2017 07:49 PM, Bart Van Assche wrote:
Would it be possible to repeat your test with the patch below applied on your
kernel tree? This patch has just been posted on the dm-devel mailing list.
Sorry for the delay. I missed this.
Unfortunately the oops problem still happens on
From: Angelo Ruocco
A just-created bfq_queue will certainly be deemed as interactive on
the arrival of its first I/O request, if the low_latency flag is
set. Yet, if the queue is merged with another queue on the arrival of
its first I/O request, it will not have the
In BFQ and CFQ, two processes are said to be cooperating if they do
I/O in such a way that the union of their I/O requests yields a
sequential I/O pattern. To get such a sequential I/O pattern out of
the non-sequential pattern of each cooperating process, BFQ and CFQ
merge the queues associated
From: Angelo Ruocco
When two or more processes do I/O in a way that their requests are
sequential with respect to one another, BFQ merges the bfq_queues associated
with the processes. This way the overall I/O pattern becomes sequential,
and thus there is a boost in
If two processes do I/O close to each other, then BFQ merges the
bfq_queues associated with these processes, to get a more sequential
I/O, and thus a higher throughput. In this respect, to detect whether
two processes are doing I/O close to each other, BFQ keeps a list of
the head-of-line I/O
Hi,
the main patch in this series ("block, bfq: let a queue be merged only
shortly after starting I/O") eliminates an outlier in the application
start-up time guaranteed by BFQ. This outlier occurs more or less
frequently, as a function of the characteristics of the system, and is
caused by a
The -v option is not supported by conditional expressions in
bash v4.1.2, so we use -n instead of -v to fix this issue.
Signed-off-by: xiao yang
---
check| 12 ++--
common/fio | 2 +-
common/rc| 2 +-
tests/meta/group | 4 ++--
4 files
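
A small sketch of the kind of substitution described above, not the patch itself. The -v test was added to bash in version 4.2, so older shells need another way to ask whether a variable is set. One portable option is -n on a "${VAR+x}" expansion, which is non-empty exactly when VAR is set, even if it is set to the empty string. TEST_DEV is used only as an illustrative variable name, and test_dev_state is a hypothetical helper.

```shell
# Report "set" or "unset" for TEST_DEV without relying on the -v operator.
# "${TEST_DEV+x}" expands to "x" when TEST_DEV is set (even if empty)
# and to nothing when it is unset, so -n distinguishes the two cases.
test_dev_state() {
	if [ -n "${TEST_DEV+x}" ]; then
		echo "set"
	else
		echo "unset"
	fi
}

unset TEST_DEV
state_when_unset=$(test_dev_state)

TEST_DEV=""
state_when_empty=$(test_dev_state)
```

Note that a plain `[ -n "$TEST_DEV" ]` is not an exact replacement for -v: it treats a variable that is set but empty the same as an unset one, which may or may not matter for a given caller.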
Omar Sandoval writes:
> On Tue, Dec 19, 2017 at 11:47:09AM +0100, Johannes Thumshirn wrote:
>> xiao yang writes:
>>
>> > +requires() {
>> > + _have_program mkfs.ext3
>> > +}
>> [...]
>> > + # Format
>> > + mkfs.ext3 -F "$TEST_DEV" >> "$FULL"