On 8/20/24 01:51, Philipp Stanner wrote:
> All users of pcim_iounmap_regions() have been removed by now.
>
> Remove pcim_iounmap_regions().
>
> Signed-off-by: Philipp Stanner
Reviewed-by: Damien Le Moal
--
Damien Le Moal
Western Digital Research
ok better squashed
with patch 1.
Anyway:
Reviewed-by: Damien Le Moal
--
Damien Le Moal
Western Digital Research
pcim_release_region() a public function.
>
> Signed-off-by: Philipp Stanner
Looks fine to me. But I think this should be squashed with patch 2 (I do not see
the point of having 2 patches to export 2 functions that are complementary).
Either way:
Reviewed-by: Damien Le Moal
--
Damien Le Moal
Western Digital Research
> Reviewed-by: Christoph Hellwig
> Signed-off-by: John Garry
Reviewed-by: Damien Le Moal
--
Damien Le Moal
Western Digital Research
ZE,
> struct virtio_blk_config, blk_size,
> &lim->logical_block_size);
I really wonder why this does not check that the VIRTIO_BLK_F_BLK_SIZE feature
exists... But that is not the fault of this patch :)
Reviewed-by: Damien
On 7/8/24 18:16, John Garry wrote:
> The block queue limits validation does this for us now.
>
> Reviewed-by: Christoph Hellwig
> Signed-off-by: John Garry
Reviewed-by: Damien Le Moal
--
Damien Le Moal
Western Digital Research
l_block_size = SECTOR_SIZE;
> + else if (blk_validate_block_size(lim->logical_block_size)) {
> + pr_warn("Invalid logical block size (%d)\n",
> lim->logical_block_size);
logical_block_size is an unsigned int so this needs to use %u.
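To illustrate the nit in user space: the sketch below formats the same message with %u, the conversion that matches an unsigned int argument. snprintf() stands in for pr_warn(), and format_block_size() is a hypothetical helper name, not something from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats the warning into buf using %u so that
 * values above INT_MAX are printed as-is rather than as negative numbers.
 * Returns the number of characters snprintf() would have written.
 */
static inline int format_block_size(char *buf, size_t len, unsigned int lbs)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "Invalid logical block size (%u)\n", lbs);
}
```

With %d, a value such as 3000000000 would typically be shown as a negative number on platforms with a 32-bit int.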
With this nit fixed, feel free
> Signed-off-by: Christoph Hellwig
Looks good to me.
Reviewed-by: Damien Le Moal
--
Damien Le Moal
Western Digital Research
nts)
> + lim->max_segments = BLK_MAX_SEGMENTS;
Other than the above, looks OK to me.
Reviewed-by: Damien Le Moal
--
Damien Le Moal
Western Digital Research
> /* Inherit limits from component devices */
> lim->max_segments = USHRT_MAX;
Reviewed-by: Damien Le Moal
--
Damien Le Moal
Western Digital Research
S and zoned virtio-blk drives... Cannot use io_uring at the moment. But I do
not think we reliably can anyway, unless the issuer is CPU/ring aware and always
issues writes to a zone using the same ring.
--
Damien Le Moal
Western Digital Research
d-by: Damien Le Moal
--
Damien Le Moal
Western Digital Research
On 1/23/24 02:36, Christoph Hellwig wrote:
> Pass the max_hw_sector limit loop sets at initialization time directly to
> blk_mq_alloc_disk instead of updating it right after the allocation.
>
> Signed-off-by: Christoph Hellwig
Looks OK to me.
Reviewed-by: Damien Le Moal
--
Da
d-by: Damien Le Moal
--
Damien Le Moal
Western Digital Research
k to set the queue up with the right parameters
> from the start and only leave a few final touches for zoned devices
> to be done just before adding the disk.
>
> Signed-off-by: Christoph Hellwig
Looks good to me.
Reviewed-by: Damien Le Moal
--
Damien Le Moal
Western Digital Research
On 1/23/24 02:36, Christoph Hellwig wrote:
> Split out a virtblk_read_limits helper that just reads the various
> queue limits to separate it from the higher level probing logic.
>
> Signed-off-by: Christoph Hellwig
Looks good to me.
Reviewed-by: Damien Le Moal
--
Damien Le
e.
Reviewed-by: Damien Le Moal
--
Damien Le Moal
Western Digital Research
that is a much better
> name for a function that allocates a queue and always pass the queuedata
> argument instead of having a separate version for the extra argument.
>
> Signed-off-by: Christoph Hellwig
Looks good to me.
Reviewed-by: Damien Le Moal
--
Damien Le Moal
Western Digital Research
ewed-by: Damien Le Moal
--
Damien Le Moal
Western Digital Research
n flight while changing
s/request/requests
> the limits.
>
> Signed-off-by: Christoph Hellwig
With the typos fixed, looks OK to me.
Reviewed-by: Damien Le Moal
> ---
> block/blk-sysfs.c | 14 ++
> 1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff
s.max_discard_sectors)
> - return;
> -
> blk_queue_max_discard_sectors(queue, max_discard_sectors);
This function references max_user_discard_sectors but that access is done
without holding the queue limits mutex. Is that safe ?
> if (ctrl->dmrl)
> blk_queue_max_discard_segments(queue, ctrl->dmrl);
--
Damien Le Moal
Western Digital Research
equests
> the limits.
>
> Note that this removes the previously held queue_lock that doesn't
> protect against any other read or writer.
s/read/reader
>
> Signed-off-by: Christoph Hellwig
Other than these typos, looks good to me.
Reviewed-by: Damien Le Moal
--
Damien Le Moal
Western Digital Research
eft(sector_t offset,
> return chunk_sectors - (offset & (chunk_sectors - 1));
> }
>
> +/**
> + * queue_limits_start_update - start an atomic update of queue limits
> + * @q: queue to update
> + *
> + * This functions starts an atomic update of the queue limits. It takes a lock
> + * to prevent other updates and returns a snapshot of the current limits that
> + * the caller can modify. The caller must call queue_limits_commit_update()
> + * to finish the update.
> + *
> + * Context: process context. The caller must have frozen the queue or ensured
> + * that there is outstanding I/O by other means.
> + */
> +static inline struct queue_limits
> +queue_limits_start_update(struct request_queue *q)
> + __acquires(q->limits_lock)
> +{
> + mutex_lock(&q->limits_lock);
> + return q->limits;
> +}
> +int queue_limits_commit_update(struct request_queue *q,
> + struct queue_limits *lim);
> +
> /*
> * Access functions for manipulating queue properties
> */
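The start/commit pattern quoted above can be sketched in user space as follows. This is only an illustration under stated assumptions: a pthread mutex stands in for q->limits_lock, all ex_* names and the two struct fields are hypothetical, and the power-of-two check in the commit step is an assumed stand-in for whatever validation the real queue_limits_commit_update() performs.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for struct queue_limits / request_queue. */
struct ex_limits {
	unsigned int max_sectors;
	unsigned int logical_block_size;
};

struct ex_queue {
	pthread_mutex_t limits_lock;
	struct ex_limits limits;
};

/* Take the lock and hand the caller a snapshot it can freely modify. */
static inline struct ex_limits ex_limits_start_update(struct ex_queue *q)
{
	assert(q != NULL);
	pthread_mutex_lock(&q->limits_lock);
	return q->limits;
}

/*
 * Validate the modified snapshot (assumed rule: block size must be a
 * non-zero power of two), apply it only if valid, and always drop the
 * lock taken by ex_limits_start_update().
 */
static inline int ex_limits_commit_update(struct ex_queue *q,
					  const struct ex_limits *lim)
{
	unsigned int lbs = lim->logical_block_size;
	int ret = 0;

	if (lbs == 0 || (lbs & (lbs - 1)))
		ret = -1;
	else
		q->limits = *lim;

	pthread_mutex_unlock(&q->limits_lock);
	return ret;
}
```

Because the lock is held from start to commit, two updaters cannot interleave their read-modify-write cycles, and a rejected snapshot leaves the queue limits untouched.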
--
Damien Le Moal
Western Digital Research
* at least twice the optimal I/O size.
> + */
> + bdi->ra_pages = max(lim->io_opt * 2 / PAGE_SIZE, VM_READAHEAD_PAGES);
Nit: while at it, you could replace that division by PAGE_SIZE with a right
shift by PAGE_SHIFT.
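A small user-space sketch of that nit, assuming 4 KiB pages (EX_PAGE_SHIFT of 12) purely for illustration. The ex-prefixed names and the min_pages parameter, which plays the role of VM_READAHEAD_PAGES, are hypothetical.

```c
#include <assert.h>

/* Assumed for illustration only: 4 KiB pages. */
#define EX_PAGE_SHIFT 12
#define EX_PAGE_SIZE (1UL << EX_PAGE_SHIFT)

/* Readahead pages computed with a division, as in the quoted hunk. */
static inline unsigned long ra_pages_div(unsigned long io_opt,
					 unsigned long min_pages)
{
	unsigned long pages = io_opt * 2 / EX_PAGE_SIZE;

	return pages > min_pages ? pages : min_pages;
}

/* Same computation with a right shift by the page shift. */
static inline unsigned long ra_pages_shift(unsigned long io_opt,
					   unsigned long min_pages)
{
	unsigned long pages = (io_opt * 2) >> EX_PAGE_SHIFT;

	/* For unsigned values and a power-of-two page size both forms agree. */
	assert(pages == io_opt * 2 / EX_PAGE_SIZE);
	return pages > min_pages ? pages : min_pages;
}
```

Compilers usually generate the shift on their own for a constant power-of-two divisor, so the change is mostly about making the intent explicit.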
Other than that, looks good to me.
Reviewed-by
d-by: Damien Le Moal
--
Damien Le Moal
Western Digital Research
On 2019/10/22 18:15, Jan Kara wrote:
> On Tue 22-10-19 07:58:08, Damien Le Moal wrote:
>> On 2019/10/21 17:38, Jan Kara wrote:
>>> Factor out code handling revalidation of bdev on disk change into a
>>> common helper.
>>>
>>> Signed-off-b
ev);
> - }
> + if (bdev->bd_invalidated &&
> + (!ret || ret == -ENOMEDIUM))
> + bdev_disk_changed(bdev, ret == -ENOMEDIUM);
> if (ret)
> goto out_unlock_bdev;
> }
>
--
Damien Le Moal
Western Digital Research
Introduce the new helper function nvme_lba_to_sect() to convert a device
logical block number to a 512B sector number. Use this new helper in
obvious places, cleaning up the code.
Signed-off-by: Damien Le Moal
---
drivers/nvme/host/core.c | 14 +++---
drivers/nvme/host/nvme.h | 8
Rename nvme_block_nr() to nvme_sect_to_lba() and use SECTOR_SHIFT
instead of its hard coded value 9. Also add a comment to describe this
helper.
Signed-off-by: Damien Le Moal
---
drivers/nvme/host/core.c | 6 +++---
drivers/nvme/host/nvme.h | 7 +--
2 files changed, 8 insertions(+), 5
device logical block number into a 512B sector
value.
Please consider this series for kernel 5.5.
Damien Le Moal (2):
nvme: Cleanup and rename nvme_block_nr()
nvme: Introduce nvme_lba_to_sect()
drivers/nvme/host/core.c | 20 ++--
drivers/nvme/host/nvme.h | 15
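As a rough sketch of the two conversions this series describes, assuming the namespace stores the log2 of its logical block size in a field named lba_shift: the ex_* names and the reduced struct are hypothetical, and only the shift arithmetic is meant to carry over.

```c
#include <assert.h>
#include <stdint.h>

/* 512 B sectors, i.e. the value the kernel's SECTOR_SHIFT stands for. */
#define EX_SECTOR_SHIFT 9

/* Hypothetical, reduced namespace: log2 of the logical block size. */
struct ex_ns {
	unsigned int lba_shift;
};

/* Convert a 512 B sector number to a device logical block number. */
static inline uint64_t ex_sect_to_lba(const struct ex_ns *ns, uint64_t sector)
{
	assert(ns->lba_shift >= EX_SECTOR_SHIFT);
	return sector >> (ns->lba_shift - EX_SECTOR_SHIFT);
}

/* Convert a device logical block number to a 512 B sector number. */
static inline uint64_t ex_lba_to_sect(const struct ex_ns *ns, uint64_t lba)
{
	assert(ns->lba_shift >= EX_SECTOR_SHIFT);
	return lba << (ns->lba_shift - EX_SECTOR_SHIFT);
}
```

For a 4096 B logical block size (lba_shift of 12), one logical block covers eight 512 B sectors, so the two helpers are inverses for block-aligned sector values.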
Use SECTOR_SHIFT instead of its hard coded value 9. Also add a comment
to describe this helper.
Signed-off-by: Damien Le Moal
---
drivers/nvme/host/nvme.h | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index
Introduce the new helper function nvme_block_sect() to convert a device
logical block number to a 512B sector number. Use this new helper in
obvious places, cleaning up the code.
Signed-off-by: Damien Le Moal
---
drivers/nvme/host/core.c | 14 +++---
drivers/nvme/host/nvme.h | 8
series for kernel 5.5.
Damien Le Moal (2):
nvme: Cleanup nvme_block_nr()
nvme: Introduce nvme_block_sect()
drivers/nvme/host/core.c | 14 +++---
drivers/nvme/host/nvme.h | 13 -
2 files changed, 19 insertions(+), 8 deletions(-)
--
2.21.0
On 2019/10/09 7:40, Damien Le Moal wrote:
> A BIO based request queue does not have a tag_set, which prevents testing
> for the flag BLK_MQ_F_NO_SCHED indicating that the queue does not
> require an elevator. This leads to an incorrect initialization of a
> default elevator in some c
mode enabled as the default elevator in
this case is mq-deadline instead of "none".
Fix this by testing for a NULL queue mq_ops field which indicates that
the queue is BIO based and should not have an elevator.
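The idea can be sketched as below, under loud assumptions: the ex_* types are hypothetical reductions of the real structures, and the single-queue rule for mq-deadline is taken from the default selection described elsewhere in this thread, not from the patch itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced types, for illustration only. */
enum ex_elevator { EX_ELV_NONE, EX_ELV_MQ_DEADLINE };

struct ex_mq_ops {
	int placeholder;
};

struct ex_queue {
	const struct ex_mq_ops *mq_ops;	/* NULL for a BIO based queue */
	unsigned int nr_hw_queues;
};

/*
 * Pick the default elevator: a BIO based queue (no mq_ops) never gets
 * one; otherwise assume mq-deadline for single hardware queue devices.
 */
static inline enum ex_elevator ex_default_elevator(const struct ex_queue *q)
{
	assert(q != NULL);
	if (!q->mq_ops)
		return EX_ELV_NONE;
	return q->nr_hw_queues == 1 ? EX_ELV_MQ_DEADLINE : EX_ELV_NONE;
}
```

The mq_ops test comes first so that the hardware queue count is never consulted for a BIO based queue.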
Reported-by: Shinichiro Kawasaki
Signed-off-by: Damien Le Moal
---
Chang
mode enabled as the default elevator in this case is
mq-deadline instead of "none".
Fix this by including the absence of a tag_set for a queue as an
indicator that the queue should not have an elevator.
Reported-by: Shinichiro Kawasaki
Signed-off-by: Damien Le Moal
---
block/elevator.c
On 2019/09/29 22:50, Xiubo Li wrote:
> On 2019/9/30 13:28, Damien Le Moal wrote:
>> On 2019/09/29 18:52, xiu...@redhat.com wrote:
>>> From: Xiubo Li
>>>
>>> For some storage drivers, such as the nbd, when there has new socket
>>> connections added
that
argument with a hardcoded value here. So why not simply call kzalloc_node() in
that function with the flags GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY ? That
would avoid needing to add the "gfp_t flags" argument and still fit with your
patch 2 definition of BLK_MQ_GFP_FLAGS.
> if (!tags)
> return NULL;
>
>
--
Damien Le Moal
Western Digital Research
On 2019/09/27 10:32, Jens Axboe wrote:
> On 9/27/19 7:25 PM, Damien Le Moal wrote:
>> On 2019/09/27 0:25, Ming Lei wrote:
>>> Now in case of real MQ, io scheduler may be bypassed, and not only this
>>> way may hurt performance for some slow MQ device, but also break zon
On 2019/09/27 0:25, Ming Lei wrote:
> Some HDD drive may expose multiple hw queue, such as MegraRaid, so
> still apply the normal plugging for such devices because sequential IO
> may benefit a lot from plug merging.
>
> Cc: Bart Van Assche
> Cc: Hannes Reinecke
> Cc: Damie
So don't bypass io scheduler if we have one setup.
>
> This patch can double sequential write performance basically on MQ
> scsi_debug when mq-deadline is applied.
>
> Cc: Bart Van Assche
> Cc: Hannes Reinecke
> Cc: Damien Le Moal
> Cc: Dave Chinner
> Signed-off-by:
tly
> by 'mutex_lock(&q->sysfs_lock)'.
>
> So moving the lockdep_assert_held() from blk_mq_sched_free_requests()
> into elevator_exit() for fixing the report by syzbot.
>
> Cc: Bart Van Assche
> Cc: Damien Le Moal
> Reported-by: syzbot+da3b7677bb913dc1b...@syz
On 2019/09/25 10:56, Damien Le Moal wrote:
> On 2019/09/25 9:56, syzbot wrote:
>> Hello,
>>
>> syzbot found the following crash on:
>>
>> HEAD commit:f7c3bf8f Merge tag 'gfs2-for-5.4' of git://git.kernel.org/..
>> git tree: upstream
&
R15:
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkal...@googlegroups.com.
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
--
Damien Le Moal
Western Digital Research
nes sent
this as an RFC. We now got plenty of comments (thanks to all who provided
feedback !) and will work on a proper patch series backed by more testing.
Best regards.
--
Damien Le Moal
Western Digital Research
t busy, it isn't
>> necessary to enqueue request to sw queue and dequeue it from
>> sw queue because request may be submitted to hw queue asap without
>> extra cost, meantime there shouldn't be much request in sw queue,
>> and we don't need to worry about effect on IO merge.
>>
>> There are still some single hw queue SCSI HBAs(HPSA, megaraid_sas,
> ...)
>> which may connect high performance devices, so 'none' is often
> required
>> for obtaining good performance.
>>
>> This patch improves IOPS and decreases CPU unilization on
> megaraid_sas,
>> per Kashyap's test.
>>
>>
>> Thanks,
>> Ming
>
--
Damien Le Moal
Western Digital Research
r can solve the 1m and upper seq write problem with
> multiple threads?
Not sure what the problem is here. You could look at a blktrace of each case to
see if there is any major difference in the command patterns sent to the disks
of your array, in particular command size.
--
Damien Le Moal
Western Digital Research
g queue->mq_ops->queue_rq() while __blk_mq_insert_request() will put
the request in ctx->rq_lists[type].
This removes the optimized case !q->elevator && !data.hctx->dispatch_busy, but I
am not sure of the actual performance impact yet. We may want to patch
blk_mq_sched_insert_request() to handle that case.
> } else {
> blk_mq_sched_insert_request(rq, false, true, true);
> }
>
--
Damien Le Moal
Western Digital Research
inux-block-ow...@vger.kernel.org
> On Behalf Of Hannes Reinecke Sent: September 19, 2019 17:46
> To: Jens Axboe Cc: linux-s...@vger.kernel.org;
> Martin K. Petersen ; James Bottomley
> ; Christoph Hellwig ;
> linux-block@vger.kernel.org; Hans Holmberg ; Damien Le
> Moal ; Hannes Reineck
+ Miklos
On 2019/09/10 13:41, Kirill A. Shutemov wrote:
> On Tue, Sep 10, 2019 at 12:05:33PM +0000, Damien Le Moal wrote:
>> On 2019/09/10 11:00, Kirill A. Shutemov wrote:
>>> On Mon, Sep 09, 2019 at 11:28:04AM -0500, Mike Christie wrote:
>>>> There are several sto
hink that it's great idea in general to expose this low-level
> machinery to userspace. But it's better to get comment from people move
> familiar with reclaim path.
Any setup with stacked file systems and one of the IO path component being a
user level process can benefit from this. See the problem described in this
patch I pushed for (unsuccessfully as it was a heavy handed solution):
https://www.spinics.net/lists/linux-fsdevel/msg148912.html
As the discussion in this thread shows, there is no existing simple solution to
deal with this reclaim recursion problem. And automatic detection is too hard,
if at all possible. With the proper access rights added, this user accessible
interface does look very sensible to me.
Best regards.
--
Damien Le Moal
Western Digital Research
oc_read,
>> +.write = memalloc_write,
>> +.llseek = default_llseek,
>> +};
>> +
>> #ifdef CONFIG_AUDIT
>> #define TMPBUFLEN 11
>> static ssize_t proc_loginuid_read(struct file * file, char __user * buf,
>> @@ -3097,6 +3148,7 @@ static const struct pid_entry tgid_base_stuff[] = {
>> #ifdef CONFIG_PROC_PID_ARCH_STATUS
>> ONE("arch_status", S_IRUGO, proc_pid_arch_status),
>> #endif
>> +REG("memalloc", S_IRUGO|S_IWUSR, proc_memalloc_operations),
>> };
>>
>> static int proc_tgid_base_readdir(struct file *file, struct dir_context
>> *ctx)
>> @@ -3487,6 +3539,7 @@ static const struct pid_entry tid_base_stuff[] = {
>> #ifdef CONFIG_PROC_PID_ARCH_STATUS
>> ONE("arch_status", S_IRUGO, proc_pid_arch_status),
>> #endif
>> +REG("memalloc", S_IRUGO|S_IWUSR, proc_memalloc_operations),
>> };
>>
>> static int proc_tid_base_readdir(struct file *file, struct dir_context *ctx)
>>
>
>
--
Damien Le Moal
Western Digital Research
On 2019/09/04 21:57, Jens Axboe wrote:
> On 9/4/19 3:02 AM, Damien Le Moal wrote:
>> On 2019/09/04 17:56, Johannes Thumshirn wrote:
>>> On 04/09/2019 10:42, Damien Le Moal wrote:
>>>> @@ -734,6 +741,7 @@ static void __device_add_disk(struct device *pare
required features. Elevators not matching the device requirements are
not shown in the device sysfs queue/scheduler file to prevent their use.
The "none" elevator can always be selected as before.
Signed-off-by: Damien Le Moal
Reviewed-by: Johannes Thumshirn
Reviewed-by: Christo
first available elevator providing the required
features.
In all cases, default to "none" if no elevator is available or if the
initialization of the default elevator fails.
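A minimal sketch of that selection rule, with every name hypothetical: the table, the init callbacks and the bit value of EX_F_ZBD_SEQ_WRITE (modelled on the ELEVATOR_F_ZBD_SEQ_WRITE feature named in this series) are assumptions. A NULL return stands for "none".

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bit value; only the name mirrors the feature in this series. */
#define EX_F_ZBD_SEQ_WRITE (1U << 0)

/* Hypothetical reduced elevator descriptor. */
struct ex_elevator {
	const char *name;
	unsigned int features;
	int (*init)(void);
};

/* Example init callbacks: one that succeeds, one that fails. */
static inline int ex_init_ok(void) { return 0; }
static inline int ex_init_fail(void) { return -1; }

/*
 * Return the first elevator providing all required features, or NULL
 * ("none") if no elevator matches or if its initialization fails.
 */
static inline const struct ex_elevator *
ex_select_default(const struct ex_elevator *tbl, size_t n,
		  unsigned int required)
{
	size_t i;

	assert(tbl != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if ((tbl[i].features & required) != required)
			continue;
		if (tbl[i].init && tbl[i].init() != 0)
			return NULL;	/* init failure: fall back to "none" */
		return &tbl[i];
	}
	return NULL;			/* nothing suitable: "none" */
}
```

Failing over to "none" instead of trying the next candidate follows the wording of the commit message above; trying further elevators would be a different policy.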
Signed-off-by: Damien Le Moal
---
block/elevator.c | 51 +---
1 fi
() does the right thing based on the queue
settings.
Signed-off-by: Damien Le Moal
Reviewed-by: Johannes Thumshirn
Reviewed-by: Christoph Hellwig
---
block/blk-mq.c | 8 +++-
block/elevator.c | 23 +--
2 files changed, 16 insertions(+), 15 deletions(-)
diff --git a
-kernel compilation when
CONFIG_BLK_DEV_ZONED (zoned block device support) is enabled.
Signed-off-by: Damien Le Moal
Reviewed-by: Christoph Hellwig
---
drivers/block/null_blk_main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/block/null_blk_main.c b/drivers/block/null_blk_main.c
taking such extreme measure, handle mq-deadline
initialization failures in the same manner as when mq-deadline is not
available (no module to load), that is, default to the "none" scheduler.
With this change, elevator_init_mq() return type can be changed to void.
Signed-off-by: Dami
Christoph (patch 5)
* Fixed title of patch 7
Changes from v1:
* Addressed Johannes comments
* Rebased on newest for-next branch to include Ming's sysfs lock changes
Damien Le Moal (7):
block: Cleanup elevator_init_mq() use
block: Change elevator_init_mq() to always succeed
block: Intr
hile requests are in-flight (there should be none when the device
driver calls device_add_disk()), freeze and quiesce the device request
queue before calling blk_mq_init_sched() in elevator_init_mq().
Signed-off-by: Damien Le Moal
---
block/blk-mq.c | 2 --
block/elevator.c | 7 +++
CONFIG_BLK_DEV_ZONED (zoned block device support) is enabled.
Signed-off-by: Damien Le Moal
Reviewed-by: Christoph Hellwig
---
drivers/scsi/sd_zbc.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 367614f0e34f..de4019dc0f0b 100644
--- a/drivers
On 2019/09/04 17:56, Johannes Thumshirn wrote:
> On 04/09/2019 10:42, Damien Le Moal wrote:
>> @@ -734,6 +741,7 @@ static void __device_add_disk(struct device *parent,
>> struct gendisk *disk,
>> exact_match, exact_lock, disk);
>>
On Tue, 2019-09-03 at 01:57 -0700, Christoph Hellwig wrote:
> On Wed, Aug 28, 2019 at 11:29:44AM +0900, Damien Le Moal wrote:
> > For block devices that do not specify required features, preserve the
> > current default elevator selection (mq-deadline for single queue
> > de
sysfs lock changes
Damien Le Moal (7):
block: Cleanup elevator_init_mq() use
block: Change elevator_init_mq() to always succeed
block: Introduce elevator features
block: Improve default elevator selection
block: Delay default elevator initialization
block: Set ELEVATOR_F_ZBD_SEQ_WRIT
eue
before executing blk_mq_init_sched().
Signed-off-by: Damien Le Moal
---
block/blk-mq.c | 2 --
block/elevator.c | 7 +++
block/genhd.c| 8
3 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index ee4caf0c0807..a37503984206 100644
On 2019/09/03 18:02, Christoph Hellwig wrote:
> On Wed, Aug 28, 2019 at 11:29:45AM +0900, Damien Le Moal wrote:
>> When elevator_init_mq() is called from blk_mq_init_allocated_queue(),
>> the only information known about the device is the number of hardware
>> queues as the
On 2019/08/28 19:43, Johannes Thumshirn wrote:
> On 28/08/2019 12:41, Damien Le Moal wrote:
>> On 2019/08/28 17:16, Johannes Thumshirn wrote:
>>> What happened to my review comment for v1 of this patch?
>>>
>>
>> I merged the renamed ELEVATOR_F_ZBD_SEQ_WRITE
hat is what I understood you wanted... Did I misunderstand ?
When tired, my English becomes fuzzy sometimes :)
Please let me know if that is not what you wanted (it does seem so).
Cheers.
--
Damien Le Moal
Western Digital Research
tween write request dispatch selection
and zone unlock on write request completion.
Fixes: 7211aef86f79 ("block: mq-deadline: Fix write completion handling")
Cc: sta...@vger.kernel.org
Reported-by: Hans Holmberg
Signed-off-by: Damien Le Moal
---
block/mq-deadline.c | 19 +-
CONFIG_BLK_DEV_ZONED (zoned block device support) is enabled.
Signed-off-by: Damien Le Moal
---
drivers/scsi/sd_zbc.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 367614f0e34f..de4019dc0f0b 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi
eue
before executing blk_mq_init_sched().
Signed-off-by: Damien Le Moal
---
block/blk-mq.c | 2 --
block/elevator.c | 7 +++
block/genhd.c| 3 +++
3 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 0c9b1f403db8..baf0c9cd8237 100644
-kernel compilation when
CONFIG_BLK_DEV_ZONED (zoned block device support) is enabled.
Signed-off-by: Damien Le Moal
---
drivers/block/null_blk_main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/block/null_blk_main.c b/drivers/block/null_blk_main.c
index b26a178d064d
first available elevator providing the required
features.
In all cases, default to "none" if no elevator is available or if the
initialization of the default elevator fails.
Signed-off-by: Damien Le Moal
---
block/elevator.c | 48
1 fi
required features. Elevators not matching the device requirements are
not shown in the device sysfs queue/scheduler file to prevent their use.
The "none" elevator can always be selected as before.
Signed-off-by: Damien Le Moal
---
block/blk-settings.c | 16 +
block/
, namely,
multi-queue zoned block devices.
Damien Le Moal (7):
block: Cleanup elevator_init_mq() use
block: Change elevator_init_mq() to always succeed
block: Introduce elevator features
block: Improve default elevator selection
block: Delay default elevator initialization
block: Set
taking such extreme measure, handle mq-deadline
initialization failures in the same manner as when mq-deadline is not
available (no module to load), that is, default to the "none" scheduler.
With this change, elevator_init_mq() return type can be changed to void.
Signed-off-by: Dami
() does the right thing based on the queue
settings.
Signed-off-by: Damien Le Moal
Reviewed-by: Johannes Thumshirn
---
block/blk-mq.c | 8 +++-
block/elevator.c | 23 +--
2 files changed, 16 insertions(+), 15 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
Since elevator_init_rq() is called before the device queue is registered
in sysfs, there is no possible conflict with elevator_switch(). Remove
the unnecessary locking of q->sysfs_lock mutex.
Signed-off-by: Damien Le Moal
---
block/elevator.c | 12 ++--
1 file changed, 2 inserti
eue
before executing blk_mq_init_sched().
Signed-off-by: Damien Le Moal
---
block/blk-mq.c | 2 --
block/elevator.c | 7 +++
block/genhd.c| 3 +++
3 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 274e168c8535..34e9541945dc 100644
ulti-queue,
such as null_blk devices with zoned mode enabled or any SMR disk
connected to a smartpqi HBA (and exposed as multi-queue devices by the
HBA).
Signed-off-by: Damien Le Moal
---
block/elevator.c | 50 +++-
1 file changed, 45 insertions(+), 5
,
multi-queue zoned block devices.
Damien Le Moal (7):
block: Cleanup elevator_init_mq() use
block: Change elevator_init_mq() to always succeed
block: Remove sysfs lock from elevator_init_rq()
block: Introduce elevator features
block: Introduce zoned block device elevator feature
block
. Elevators not matching the device requirements are not
listed in the device sysfs queue/scheduler file to prevent their use.
The "none" elevator can always be selected as before.
Signed-off-by: Damien Le Moal
---
block/blk-settings.c | 15
block/elevator.c
.
Signed-off-by: Damien Le Moal
---
block/mq-deadline.c | 1 +
drivers/block/null_blk_main.c | 5 +
drivers/scsi/sd_zbc.c | 2 ++
include/linux/elevator.h | 7 +++
4 files changed, 15 insertions(+)
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index
, handle mq-deadline
initialization failures in the same manner as when mq-deadline is not
available (no module to load), that is, default to the "none" scheduler.
With this change, elevator_init_mq() return type can be changed to void.
Signed-off-by: Damien Le Moal
---
block/blk-m
() does the right thing based on the queue
settings.
Signed-off-by: Damien Le Moal
---
block/blk-mq.c | 8 +++-
block/elevator.c | 23 +--
2 files changed, 16 insertions(+), 15 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 509f69fdfcf2..556c774a0f0d
request for 524288 bytes!
[ 191.841553] sd 10:0:0:0: scsi_dma_map failed: request for 524288 bytes!
[ 191.875544] sd 10:0:0:0: scsi_dma_map failed: request for 524288 bytes!
[ 191.909547] sd 10:0:0:0: scsi_dma_map failed: request for 524288 bytes!
[ 191.943466] sd 10:0:0:0: scsi_dma_map failed: request for 524288 bytes!
[ 191.977543] sd 10:0:0:0: scsi_dma_map failed: request for 524288 bytes!
[ 192.011547] sd 10:0:0:0: scsi_dma_map failed: request for 524288 bytes!
[ 192.045550] sd 10:0:0:0: scsi_dma_map failed: request for 524288 bytes!
[ 192.079539] sd 10:0:0:0: scsi_dma_map failed: request for 524288 bytes!
[ 192.113537] sd 10:0:0:0: scsi_dma_map failed: request for 524288 bytes!
[ 192.147543] sd 10:0:0:0: scsi_dma_map failed: request for 524288 bytes!
I am still digging into all this, but strongly suspecting that this is related
to the NOWAIT flag not correctly handling cases where a BIO gets split into
smaller fragments and some of the fragments failing to be created with -EAGAIN.
Not 100% sure yet though.
Best regards.
--
Damien Le Moal
Western Digital Research
On 2019/08/06 22:34, Jens Axboe wrote:
> On 8/6/19 12:05 AM, Damien Le Moal wrote:
>> On 2019/08/06 13:09, Jens Axboe wrote:
>>> On 8/5/19 5:05 PM, Damien Le Moal wrote:
>>>> On 2019/08/06 7:05, Damien Le Moal wrote:
>>>>> On 2019/08/06 6:59, Damien L
On 2019/08/06 13:09, Jens Axboe wrote:
> On 8/5/19 5:05 PM, Damien Le Moal wrote:
>> On 2019/08/06 7:05, Damien Le Moal wrote:
>>> On 2019/08/06 6:59, Damien Le Moal wrote:
>>>> On 2019/08/06 6:28, Jens Axboe wrote:
>>>>> On 8/5/19 2:27 PM, Damien L
On 2019/08/06 9:25, Dave Chinner wrote:
> On Tue, Aug 06, 2019 at 12:05:51AM +0000, Damien Le Moal wrote:
>> On 2019/08/06 7:05, Damien Le Moal wrote:
>>> On 2019/08/06 6:59, Damien Le Moal wrote:
>>>> On 2019/08/06 6:28, Jens Axboe wrote:
>>>>> On 8/
On 2019/08/06 7:05, Damien Le Moal wrote:
> On 2019/08/06 6:59, Damien Le Moal wrote:
>> On 2019/08/06 6:28, Jens Axboe wrote:
>>> On 8/5/19 2:27 PM, Damien Le Moal wrote:
>>>> On 2019/08/06 6:26, Jens Axboe wrote:
>>>>>> In any case, look
On 2019/08/06 6:59, Damien Le Moal wrote:
> On 2019/08/06 6:28, Jens Axboe wrote:
>> On 8/5/19 2:27 PM, Damien Le Moal wrote:
>>> On 2019/08/06 6:26, Jens Axboe wrote:
>>>>> In any case, looking again at this code, it looks like there is a
>>>>> prob
On 2019/08/06 6:28, Jens Axboe wrote:
> On 8/5/19 2:27 PM, Damien Le Moal wrote:
>> On 2019/08/06 6:26, Jens Axboe wrote:
>>>> In any case, looking again at this code, it looks like there is a
>>>> problem with dio->size being incremented early, even for fra
unting with
> this_size and dio->size, and we retain the old style ordering for the
> ret value.
Do you want a proper patch with real testing backup ? I can send that later
today.
--
Damien Le Moal
Western Digital Research