vdimm/nd_perf.c
> @@ -323,7 +323,8 @@ EXPORT_SYMBOL_GPL(register_nvdimm_pmu);
> void unregister_nvdimm_pmu(struct nvdimm_pmu *nd_pmu)
> {
> perf_pmu_unregister(&nd_pmu->pmu);
> nvdimm_pmu_free_hotplug_memory(nd_pmu);
> + kfree(nd_pmu->pmu.attr_groups);
> kfree(nd_pmu);
> }
> EXPORT_SYMBOL_GPL(unregister_nvdimm_pmu);
Reviewed-by: Jeff Moyer
platform_device *pdev)
> }
>
> rc = perf_pmu_register(&nd_pmu->pmu, nd_pmu->pmu.name, -1);
> if (rc) {
> - kfree(nd_pmu->pmu.attr_groups);
> nvdimm_pmu_free_hotplug_memory(nd_pmu);
> + kfree(nd_pmu->pmu.attr_groups);
> return rc;
> }
>
> pr_info("%s NVDIMM performance monitor support registered\n",
Reviewed-by: Jeff Moyer
David Hildenbrand writes:
> On 13.07.23 21:12, Jeff Moyer wrote:
>> David Hildenbrand writes:
>>
>>> On 16.06.23 00:00, Vishal Verma wrote:
>>>> The dax/kmem driver can potentially hot-add large amounts of memory
>>>> originating from CX
David Hildenbrand writes:
> On 16.06.23 00:00, Vishal Verma wrote:
>> The dax/kmem driver can potentially hot-add large amounts of memory
>> originating from CXL memory expanders, or NVDIMMs, or other 'device
>> memories'. There is a chance there isn't enough regular system memory
>> available to
Tyler Hicks writes:
> The alignment constraint for namespace creation in a region was
> increased, from 2M to 16M, for non-PowerPC architectures in v5.7 with
> commit 2522afb86a8c ("libnvdimm/region: Introduce an 'align'
> attribute"). The thought behind the change was that region alignment
> sho
"Li, Zhijian" writes:
> Copied to nvdimm list
>
> Thanks
>
> Zhijian
>
>
> on 2022/1/6 14:12, Li Zhijian wrote:
>>
>> Add Dan to the party :)
>>
>> May I know whether there are any existing APIs to check whether
>> a va/page is backed by nvdimm/pmem?
I don't know of one. You could try walk_system
Christoph Hellwig writes:
> Dan,
>
> can you make any sense of this report?
>
> name='nfit_test'
> path='/lib/modules/5.11.0-rc5-3-g52f019d43c22/extra/test/nfit_test.ko'
>> check_set_config_data: dimm: 0 read2 data miscompare: 0
>> check_set_config_data: dimm: 0x1 read2 data miscompare: 0
>>
ges[] (and any fields placed after it) are not zeroed out at
* allocation time. Don't add new fields after pages[] unless you
* wish that they not be zeroed.
*/
union {
struct page *pages[DIO_PAGES]; /* page buffer */
struct
!test_bit(family, &nd_desc->bus_family_mask))
> return -EINVAL;
> + family = array_index_nospec(family,
> + NVDIMM_BUS_FAMILY_MAX + 1);
> dsm_mask = acpi_desc->family_dsm_mask[family];
> guid = to_nfit_bus_uuid(family);
> } else {
Reviewed-by: Jeff Moyer
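For readers unfamiliar with the clamp added in the hunk above, here is a minimal user-space sketch of the idea behind array_index_nospec(): after the bounds check, the index is ANDed with a mask that is computed without a conditional, so even a mispredicted branch cannot produce an out-of-range index. The demo_* names are hypothetical, and the arithmetic is only modeled on the kernel's generic fallback; the real helper lives in the kernel and may be overridden per architecture.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical user-space stand-in, modeled on the kernel's generic
 * array_index_mask_nospec() fallback; not the kernel implementation.
 * Returns all ones when index < size and zero otherwise. The mask
 * arithmetic itself contains no conditional.
 * Assumptions: size <= LONG_MAX, and the compiler converts out-of-range
 * values to long by wrapping and shifts negative values arithmetically
 * (both true for GCC and Clang).
 */
static unsigned long demo_index_mask(unsigned long index, unsigned long size)
{
	assert(size <= (unsigned long)LONG_MAX); /* debug-only precondition */
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (sizeof(long) * CHAR_BIT - 1));
}

/* Clamp index into [0, size): in-range values pass through, others become 0. */
static unsigned long demo_index_nospec(unsigned long index, unsigned long size)
{
	return index & demo_index_mask(index, size);
}
```

In the quoted patch the size argument is NVDIMM_BUS_FAMILY_MAX + 1, so a speculated out-of-range family would index element 0 of family_dsm_mask[] rather than memory past the array.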
Dan Williams writes:
> Check for NULL entries before checking the entry order, otherwise NULL
> is misinterpreted as a present pte conflict. The 'order' check needs to
> happen before the locked check as an unlocked entry at the wrong order
> must fallback to lookup the correct order.
Please inc
Joe Perches writes:
> Rather than have a local coding style, use the typical kernel style.
The coding style isn't that different from the core kernel, and it's
still quite readable. I'd rather avoid the churn and the risk of
introducing regressions. This will also make backports to stable more
Dan Williams writes:
> Changes since v1 [1]:
> - Cleanup patch1, simplify flags return in the overwrite case and
> consolidate frozen-state cases (Jeff)
> - Clarify the motivation for patch2 (Jeff)
> - Collect Dave's Reviewed-by
The series tests out fine for me.
Thanks, Dan!
-Jeff
>
> [1]:
__security_store() rather than each of the helper routines to enable
> freeze to be run regardless of busy state.
>
> Reviewed-by: Dave Jiang
> Signed-off-by: Dan Williams
Reviewed-by: Jeff Moyer
> ---
> drivers/nvdimm/dimm_devs.c | 33 -
>
'locked' state, but will need to revisit if there were cases where
> applications need 'frozen' to show up in the primary 'security'
> attribute. The expectation is that communicating 'frozen' is mostly a
> helper for debug and status monito
Jia He writes:
> commit c221c0b0308f ("device-dax: "Hotplug" persistent memory for use
> like normal RAM") helps to add persistent memory as normal RAM blocks.
> But this driver doesn't work if CONFIG_DEV_DAX_PMEM_COMPAT is enabled.
>
> Here is the debugging call trace when CONFIG_DEV_DAX_PMEM_CO
r compat_uptr_t instead of a compat_sigset_t pointer.
>
> Fixes: 7a074e96 ("aio: implement io_pgetevents")
> Signed-off-by: Guillem Jover
Looks good, thanks for finding and fixing this!
Reviewed-by: Jeff Moyer
> ---
> fs/aio.c | 6 +++---
> 1 file changed, 3 insertions
struct
> iocb *iocb,
> file = req->ki_filp;
> if (unlikely(!(file->f_mode & FMODE_READ)))
> return -EBADF;
> - ret = -EINVAL;
> if (unlikely(!file->f_op->read_iter))
> return -EINVAL;
Acked-by: Jeff Moyer
Dan Williams writes:
> On Fri, Aug 16, 2019 at 1:49 PM Jeff Moyer wrote:
>>
>> Dan Williams writes:
>>
>> > The blanket blocking of all security operations while the DIMM is in
>> > active use in a region is too restrictive. The only security operations
>
, just move the
> __security_store() entry point to live with the helpers.
>
> Cc: Dave Jiang
> Signed-off-by: Dan Williams
Fine by me.
Acked-by: Jeff Moyer
Dan Williams writes:
> The blanket blocking of all security operations while the DIMM is in
> active use in a region is too restrictive. The only security operations
> that need to be aware of the ->busy state are those that mutate the
> state of data, i.e. erase and overwrite.
>
> Refactor the -
'locked' state, but will need to revisit if there were cases where
> applications need 'frozen' to show up in the primary 'security'
> attribute. The expectation is that communicating 'frozen' is mostly a
> helper for debug and status monitoring.
>
Al Viro writes:
> On Mon, Jul 29, 2019 at 10:57:41AM -0400, Jeff Moyer wrote:
>> Al, can you take this through your tree?
>
> Umm... Can do, but I had an impression that Arnd and Deepa
> had a tree for timespec-related work. OTOH, it had been
> relatively quiet last cycle
Al, can you take this through your tree?
Thanks,
Jeff
Jeff Moyer writes:
> "zhangyi (F)" writes:
>
>> io_[p]getevents syscall should return -EINVAL if timeout is out of
>> range, add this validity check.
>>
>> Signed-off-by: zhangyi (F)
>>
> + return ret;
> + until = timespec64_to_ktime(*ts);
> + }
> +
> + ioctx = lookup_ioctx(ctx_id);
> if (likely(ioctx)) {
> if (likely(min_nr <= nr && min_nr >= 0))
> ret = read_eve
"Aneesh Kumar K.V" writes:
> On 6/14/19 10:06 PM, Dan Williams wrote:
>> On Fri, Jun 14, 2019 at 9:26 AM Aneesh Kumar K.V
>> wrote:
>
>>> Why not let the arch
>>> decide the SUBSECTION_SHIFT and default to one subsection per
>>> section if arch is not enabled to work with subsection.
>>
>>
Dan Williams writes:
>> Great! Now let's create another one.
>>
>> # ndctl create-namespace -m fsdax -s 132m
>> libndctl: ndctl_pfn_enable: pfn1.1: failed to enable
>> Error: namespace1.2: failed to enable
>>
>> failed to create namespace: No such device or address
>>
>> (along with a kernel w
Dan Williams writes:
>> > However, to fix this situation a non-backwards compatible change
>> > needs to be made to the interpretation of the nd_pfn info-block.
>> > ->start_pad needs to be accounted in ->map.map_offset (formerly
>> > ->data_offset), and ->map.map_base (formerly ->phys_addr) need
Hi, Dan,
Thanks for the comprehensive write-up. Comments below.
Dan Williams writes:
> In the beginning the pmem driver simply passed the persistent memory
> resource range to memremap and was done. With the introduction of
> devm_memremap_pages() and vmem_altmap the implementation needed to
>
Dan Williams writes:
> On Tue, Feb 12, 2019 at 1:37 PM Dan Williams wrote:
>>
>> Lately Linux has encountered platforms that collide Persistent Memory
>> regions between each other, specifically cases where ->start_pad needed
>> to be non-zero. This led to commit ae86cbfef381 "libnvdimm, pfn: P
s case and call it in the common path.
>
> Fixes: 11189c1089da ("acpi/nfit: Fix command-supported detection")
> Cc:
> Cc: Vishal Verma
> Reported-by: Grzegorz Burzynski
> Signed-off-by: Dan Williams
Tricky code path, eh?
Tested-by: Jeff Moyer
-Jeff
> ---
>
spec64_to_ktime(*ts) : KTIME_MAX;
> + ktime_t until;
> struct kioctx *ioctx = lookup_ioctx(ctx_id);
> long ret = -EINVAL;
>
> + if (ts && !timespec64_valid(ts))
> + return -EINVAL;
> +
> + until = ts ? timespec64_to_ktime(*ts) : KTIME_MAX;
> +
> if (likely(ioctx)) {
> if (likely(min_nr <= nr && min_nr >= 0))
> ret = read_events(ioctx, min_nr, nr, events, until);
Looks good to me. Thanks for fixing this.
Reviewed-by: Jeff Moyer
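The ordering in the hunk above (validate first, convert second) can be sketched in user space as follows. This is only an illustration under stated assumptions: demo_timeout_to_ns() and DEMO_NSEC_PER_SEC are hypothetical names, INT64_MAX stands in for KTIME_MAX, and the saturation on overflow is a choice made for this sketch rather than something taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define DEMO_NSEC_PER_SEC 1000000000LL

/*
 * Hypothetical helper: reject an out-of-range timeout before converting it.
 * Returns 0 and stores the timeout in nanoseconds in *until, or -EINVAL if
 * ts is malformed. A NULL ts means "no timeout" and yields INT64_MAX.
 */
static int demo_timeout_to_ns(const struct timespec *ts, int64_t *until)
{
	assert(until != NULL);

	if (!ts) {
		*until = INT64_MAX;
		return 0;
	}

	/* Same conditions the quoted patch checks via timespec64_valid(). */
	if (ts->tv_sec < 0 || ts->tv_nsec < 0 ||
	    ts->tv_nsec >= DEMO_NSEC_PER_SEC)
		return -EINVAL;

	/* Saturate instead of overflowing (an assumption of this sketch). */
	if ((int64_t)ts->tv_sec > (INT64_MAX - ts->tv_nsec) / DEMO_NSEC_PER_SEC)
		*until = INT64_MAX;
	else
		*until = (int64_t)ts->tv_sec * DEMO_NSEC_PER_SEC + ts->tv_nsec;
	return 0;
}
```

Doing the check before the conversion matters because a negative or oversized tv_nsec would otherwise be folded silently into the computed deadline.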
FLAG_NAME(NOUNMAP),
> CMD_FLAG_NAME(NOWAIT),
> + CMD_FLAG_NAME(NOUNMAP),
> + CMD_FLAG_NAME(HIPRI),
> };
> #undef CMD_FLAG_NAME
Acked-by: Jeff Moyer
You might consider also adding a comment above the req_flag_bits enum
noting that modifications also need to be propagated to cmd_flag_name.
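A comment is what is suggested above. As a different technique with the same goal, an X-macro list can generate both the enum and the name table from one place, so the two cannot drift apart. The sketch below is stand-alone and uses hypothetical demo_* names; it is not the block layer's code, and only the three flag names visible in the quoted hunk are used.

```c
#include <assert.h>
#include <stddef.h>

/* Single source of truth: add a flag here and both tables pick it up. */
#define DEMO_FLAG_LIST(X) \
	X(NOWAIT)         \
	X(NOUNMAP)        \
	X(HIPRI)

enum demo_flag_bits {
#define DEMO_AS_ENUM(name) DEMO_FLAG_##name,
	DEMO_FLAG_LIST(DEMO_AS_ENUM)
#undef DEMO_AS_ENUM
	DEMO_FLAG_NR /* number of flags, always last */
};

static const char *const demo_flag_names[] = {
#define DEMO_AS_NAME(name) [DEMO_FLAG_##name] = #name,
	DEMO_FLAG_LIST(DEMO_AS_NAME)
#undef DEMO_AS_NAME
};

/* Compile-time guard in case someone edits one table by hand. */
static_assert(sizeof(demo_flag_names) / sizeof(demo_flag_names[0]) == DEMO_FLAG_NR,
	      "name table out of sync with enum");

/* Look up a flag name, returning "?" for an unknown bit. */
static const char *demo_flag_to_name(unsigned int bit)
{
	return bit < DEMO_FLAG_NR ? demo_flag_names[bit] : "?";
}
```

The trade-off is readability: the generated enum is harder to grep for than a hand-written one, which may be why a plain comment is the lighter-weight suggestion here.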
Keith Busch writes:
>> Keith, you seem to be implying that there are platforms that won't
>> support memory mode. Do you also have some insight into how customers
>> want to use this, beyond my speculation? It's really frustrating to see
>> patch sets like this go by without any real use cases
Keith Busch writes:
> On Thu, Jan 17, 2019 at 11:29:10AM -0500, Jeff Moyer wrote:
>> Dave Hansen writes:
>> > Persistent memory is cool. But, currently, you have to rewrite
>> > your applications to use it. Wouldn't it be cool if you could
>> > just h
Dave Hansen writes:
> Persistent memory is cool. But, currently, you have to rewrite
> your applications to use it. Wouldn't it be cool if you could
> just have it show up in your system like normal RAM and get to
> it like a slow blob of memory? Well... have I got the patch
> series for you!
Dan Williams writes:
> Changes since v2 [1]:
> * Don't allow ND_CMD_CALL to bypass dsm_mask restrictions (Jeff)
>
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2019-January/019498.html
>
> ---
>
> One last resend to make sure all the last bits of thrash have settled.
LGTM.
Thanks!
Jeff
Dan Williams writes:
> On Tue, Jan 15, 2019 at 6:16 AM Jeff Moyer wrote:
>>
>> Dan Williams writes:
>>
>> > Changes since v1 [1]:
>> > * Include another patch to make sure that function-number zero can be
>> > safely used as an invalid function n
ff)
> * Collect a Tested-by from Sujith
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2019-January/019435.html
For the series:
Reviewed-by: Jeff Moyer
Thanks, Dan!
>
> ---
>
> Quote patch2 changelog:
>
> The _DSM function number validation only happens to succeed
Dan Williams writes:
> On Mon, Jan 14, 2019 at 8:43 AM Dan Williams wrote:
>> On Mon, Jan 14, 2019 at 7:19 AM Jeff Moyer wrote:
> [..]
>> > > +
>> > > + if (cmd == ND_CMD_CALL) {
>> > > + int i;
>> > > +
>>
Dan Williams writes:
> The _DSM function number validation only happens to succeed when the
> generic Linux command number translation corresponds with a
> DSM-family-specific function number. This breaks NVDIMM-N
> implementations that correctly implement _LSR, _LSW, and _LSI, but do
> not happe
> Cc: Intel SCU Linux support
> Cc: Artur Paszkiewicz
> Cc: "James E.J. Bottomley"
> Cc: "Martin K. Petersen"
> Cc: Christoph Hellwig
> Cc: Jens Axboe
> Cc: Jeff Moyer
Nice job, and excellent commit message. We'll need a similar patch for
lpfc.
Rev
Hi, Logan,
Logan Gunthorpe writes:
> Hey,
>
> I found a regression in v5.0-rc1 this morning. My system panics on boot.
> I've attached a log of the panic.
>
> I bisected to find the problematic commit is:
>
> Fixes: 9d037ad707ed ("block: remove req->timeout_list")
>
> But it makes no sense to me
Jens Axboe writes:
> On 12/11/18 11:02 AM, Jeff Moyer wrote:
>> Matthew Wilcox writes:
>>
>>> On Tue, Dec 11, 2018 at 12:21:52PM -0500, Jeff Moyer wrote:
>>>> I'm going to submit this version formally. If you're interested in
>>>
Matthew Wilcox writes:
> On Tue, Dec 11, 2018 at 12:21:52PM -0500, Jeff Moyer wrote:
>> I'm going to submit this version formally. If you're interested in
>> converting the ioctx_table to xarray, you can do that separately from a
>> security fix. I would incl
Jeff Moyer writes:
> Jeff Moyer writes:
>
>> Matthew Wilcox writes:
>>
>>> This custom resizing array was vulnerable to a Spectre attack (speculating
>>> off the end of an array to a user-controlled offset). The XArray is
>>> not vulnerable to
Jeff Moyer writes:
> Matthew Wilcox writes:
>
>> This custom resizing array was vulnerable to a Spectre attack (speculating
>> off the end of an array to a user-controlled offset). The XArray is
>> not vulnerable to Spectre as it always masks its lookups to be within
>
Matthew Wilcox writes:
> This custom resizing array was vulnerable to a Spectre attack (speculating
> off the end of an array to a user-controlled offset). The XArray is
> not vulnerable to Spectre as it always masks its lookups to be within
> the bounds of the array.
I'm not a big fan of compl
Alex Richman writes:
> Hi,
>
> I'm seeing some weirdness with AIO, specifically SYS_io_destroy() is
> taking upwards of ~10 microseconds (~100 milliseconds) per call.
> How long is that call expected to take? I can see from the source
Well, it waits for an RCU grace period. I would not expe
adam.manzana...@wdc.com writes:
> From: Adam Manzanares
>
> Now that kiocb has an ioprio field copy this over to the bio when it is
> created from the kiocb.
>
> Signed-off-by: Adam Manzanares
Reviewed-by: Jeff Moyer
Thanks!
Jeff
> ---
> fs/block_dev.c | 2
adam.manzana...@wdc.com writes:
> From: Adam Manzanares
>
> Now that kiocb has an ioprio field copy this over to the bio when it is
> created from the kiocb.
>
> Signed-off-by: Adam Manzanares
> ---
> fs/block_dev.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/fs/block_dev.c b/fs/bl
ld.
>
> We set the blkdev bio iopriority unconditionally, so we need to guarantee the
> kiocb is initialized properly. Added changes to the loopback driver and
> init_sync_kiocb to achieve this.
>
> This patch depends on block: add ioprio_check_cap function.
>
> Signed-o
adam.manzana...@wdc.com writes:
> From: Adam Manzanares
>
> Now that kiocb has an ioprio field copy this over to the bio when it is
> created from the kiocb during direct IO.
>
> Signed-off-by: Adam Manzanares
Reviewed-by: Jeff Moyer
> ---
> fs/iomap.c | 1 +
>
adam.manzana...@wdc.com writes:
> From: Adam Manzanares
>
> Now that kiocb has an ioprio field copy this over to the bio when it is
> created from the kiocb.
>
> Signed-off-by: Adam Manzanares
Reviewed-by: Jeff Moyer
> ---
> fs/block_dev.c | 1 +
> 1 file chang
has sufficient
> privileges to submit IOPRIO_RT commands. This patch creates the
> ioprio_check_cap function to be used by the ioprio_set system call and also by
> the aio interface.
>
> Signed-off-by: Adam Manzanares
> Reviewed-by: Christoph Hellwig
Reviewed-by: Jeff Moyer
>
Hi, Adam,
adam.manzana...@wdc.com writes:
> From: Adam Manzanares
>
> This is the per-I/O equivalent of the ioprio_set system call.
> See the following link for performance implications on a SATA HDD:
> https://lkml.org/lkml/2016/12/6/495
>
> First patch factors ioprio_check_cap function out of
Dan Williams writes:
> On Mon, May 7, 2018 at 12:08 PM, Jeff Moyer wrote:
>> Dan Williams writes:
>>
>>> On Mon, May 7, 2018 at 11:46 AM, Matthew Wilcox wrote:
>>>> On Mon, May 07, 2018 at 10:50:21PM +0800, Huaisheng Ye wrote:
>>>>> Traditi
Dan Williams writes:
> On Mon, May 7, 2018 at 11:46 AM, Matthew Wilcox wrote:
>> On Mon, May 07, 2018 at 10:50:21PM +0800, Huaisheng Ye wrote:
>>> Traditionally, NVDIMMs are treated by mm(memory management) subsystem as
>>> DEVICE zone, which is a virtual zone and both its start and end of pfn
>
Hi, Adam,
adam.manzana...@wdc.com writes:
> From: Adam Manzanares
>
> This is the per-I/O equivalent of the ioprio_set system call.
>
> When IOCB_FLAG_IOPRIO is set on the iocb aio_flags field, then we set the
> newly added kiocb ki_ioprio field to the value in the iocb aio_reqprio field.
>
> Wh
e8d513483300 ("memremap: change devm_memremap_pages interface to use
> struct dev_pagemap")
> Signed-off-by: Toshi Kani
> Cc: Christoph Hellwig
> Cc: Dan Williams
> Cc:
Reviewed-by: Jeff Moyer
> ---
> drivers/nvdimm/pmem.c |4 +++-
> 1 file change
Pankaj Gupta writes:
>> Ideally, qemu (seabios?) would advertise a platform capabilities
>> sub-table that doesn't fill in the flush bits.
>
> Could you please elaborate on this, how its related to disabling
> MAP_SYNC? We are not doing entire nvdimm device emulation.
My mistake. If you're not
Dan Williams writes:
> [ adding Jeff directly since he has also been looking at
> infrastructure to track when MAP_SYNC should be disabled ]
>
> On Wed, Apr 25, 2018 at 7:21 AM, Dan Williams
> wrote:
>> On Wed, Apr 25, 2018 at 4:24 AM, Pankaj Gupta wrote:
>>> This patch adds virtio-pmem driver
Christoph Hellwig writes:
> On Fri, Apr 06, 2018 at 04:16:30AM +0100, Al Viro wrote:
>> BTW, this is only tangentially related, but... does *anything* call
>> io_submit() for huge amounts of iocb?
I don't know. If an application did that, as many I/Os as could fit
into the ring buffer would be
Christoph Hellwig writes:
> On Tue, Mar 20, 2018 at 11:30:29AM -0400, Jeff Moyer wrote:
>> > In this commit:
>> >
>> > http://git.infradead.org/users/hch/libaio.git/commitdiff/49f77d595210393ce7b125cb00233cf737402f56
>>
>> BTW, the man pages
Christoph Hellwig writes:
> On Mon, Mar 19, 2018 at 07:12:41PM -0700, Darrick J. Wong wrote:
>> > Note that unlike many other signal related calls we do not pass a sigmask
>> > size, as that would get us to 7 arguments, which aren't easily supported
>> > by the syscall infrastructure. It seems a
of the iocb.
>
> Unlike poll or epoll without EPOLLONESHOT this interface always works
> in one shot mode, that is once the iocb is completed, it will have to be
> resubmitted.
>
> Signed-off-by: Christoph Hellwig
Also acked this one in the last posting.
Acked-by: J
, which aren't easily supported
> by the syscall infrastructure. It seems a lot less painful to just add a
> new syscall variant in the unlikely case we're going to increase the
> sigset size.
>
> Signed-off-by: Christoph Hellwig
I acked this in the last set, so...
Ack
Dan Williams writes:
> On Mon, Feb 12, 2018 at 2:53 PM, Jeff Moyer wrote:
>> Dave Jiang writes:
>>
>>> Re-enable deep flush so that users always have a way to be sure that a write
>>> does make it all the way out to the NVDIMM. The PMEM driver writes alway
filesystem-dax case.
That's still very confusing text. Specifically, the part where you say
that pmem driver writes always make it to the DIMM. I think the
changelog could start with "Deep flush is there to explicitly flush
write buffers" Anyway, the fix looks right to me.
Reviewed-by: Jeff Moyer
, which aren't easily supported
> by the syscall infrastructure. It seems a lot less painful to just add a
> new syscall variant in the unlikely case we're going to increase the
> sigset size.
>
> Signed-off-by: Christoph Hellwig
Acked-by: Jeff Moyer
I assume other
of the iocb.
>
> Unlike poll or epoll without EPOLLONESHOT this interface always works
> in one shot mode, that is once the iocb is completed, it will have to be
> resubmitted.
>
> Signed-off-by: Christoph Hellwig
Acked-by: Jeff Moyer
Christoph Hellwig writes:
> On Thu, Jan 18, 2018 at 11:44:03AM -0500, Jeff Moyer wrote:
>> Jeff Moyer writes:
>>
>> > FYI, this kernel has issues. It will boot up, but I don't have
>> > networking, and even rebooting doesn't succeed. I'm lo
Avi Kivity writes:
> On 01/18/2018 05:46 PM, Jeff Moyer wrote:
>> FYI, this kernel has issues. It will boot up, but I don't have
>> networking, and even rebooting doesn't succeed. I'm looking into it.
>
> FWIW, I'm running an older version of this pat
Jeff Moyer writes:
> FYI, this kernel has issues. It will boot up, but I don't have
> networking, and even rebooting doesn't succeed. I'm looking into it.
A bisect lands on: eventfd: switch to ->poll_mask. That's not super
helpful, though. I did run the ltp even
FYI, this kernel has issues. It will boot up, but I don't have
networking, and even rebooting doesn't succeed. I'm looking into it.
-Jeff
Christoph Hellwig writes:
> Hi all,
>
> this series adds support for the IOCB_CMD_POLL operation to poll for the
> readiness of file descriptors using the
Christoph Hellwig writes:
> On Tue, Jan 16, 2018 at 07:41:24PM -0500, Jeff Moyer wrote:
>> I'd be willing to bet the issue is in your io_syscall6 implementation.
>> You pass in arg5 where arg6 should be used. Don't feel bad, it took me
>> the better p
Christoph Hellwig writes:
> On Wed, Jan 17, 2018 at 04:27:21AM +, Al Viro wrote:
>> On Tue, Jan 16, 2018 at 07:41:24PM -0500, Jeff Moyer wrote:
>> >if (sigmask) {
>> > - if (copy_from_user(&ksigmask, sigmask, sizeof(ksigmask)))
>> > +
Hi, Christoph,
Christoph Hellwig writes:
> On Mon, Jan 15, 2018 at 09:53:10AM +0100, Christoph Hellwig wrote:
>> > pselect, as an example, crams the sigmask and size together. Why not
>> > just do that? libaio can take care of setting that up.
>>
>> Yes, I could try that. It's just another d
Christoph Hellwig writes:
> This is the io_getevents equivalent of ppoll/pselect and allows to
> properly mix signals and aio completions (especially with IOCB_CMD_POLL)
> and atomically executes the following sequence:
>
> sigset_t origmask;
>
> pthread_sigmask(SIG_SETMASK, &sigmask,
Matthew Wilcox writes:
> On Thu, Dec 14, 2017 at 11:18:30AM +0800, Leizhen (ThunderTown) wrote:
>> On 2017/12/14 3:31, Matthew Wilcox wrote:
>> > On Wed, Dec 13, 2017 at 11:27:00AM -0500, Jeff Moyer wrote:
>> >> Matthew Wilcox writes:
>> >>
>> >
Christoph Hellwig writes:
> On Wed, Jan 10, 2018 at 06:26:39PM -0500, Jeff Moyer wrote:
>> >> The upcoming aio poll support would like to be able to complete the
>> >> iocb inline from the cancellation context, but that would cause
>> >> a lock order rever
"Michael Kerrisk (man-pages)" writes:
Hi, Michael,
> Are there some man pages patches already for these changes?
https://patchwork.kernel.org/patch/10144129/
Cheers,
Jeff
Jeff Moyer writes:
> Christoph Hellwig writes:
>
>> The upcoming aio poll support would like to be able to complete the
>> iocb inline from the cancellation context, but that would cause
>> a lock order reversal. Add support for optionally moving the cancelation
>>
his reversal.
>
> Signed-off-by: Christoph Hellwig
Acked-by: Jeff Moyer
Christoph Hellwig writes:
> Once we cancel an iocb there is no reason to keep it on the active_reqs
> list, given that the list is only used to look for cancelation candidates.
>
> Signed-off-by: Christoph Hellwig
Acked-by: Jeff Moyer
path or the
cancellation path delivered the completion event. Now, the completion
is always delivered via aio_complete.
So yeah, we can get rid of this.
Acked-by: Jeff Moyer
;
>
> + spin_lock_irqsave(&ctx->ctx_lock, flags);
> + list_add_tail(&req->ki_list, &ctx->active_reqs);
> req->ki_cancel = cancel;
So, this changes behavior from quietly overwriting the prior cancel
function to not doing it. I don't think it matters one bit, though.
Callers shouldn't do that.
Acked-by: Jeff Moyer
b->aio_fildes);
> + if (unlikely(!req->ki_filp))
> + return -EBADF;
> + req->ki_complete = aio_complete_rw;
> + req->ki_flags = 0;
The above assignment seems superfluous...
> + req->ki_pos = iocb->aio_offset;
> + req->ki_flags = iocb_flags(req->ki_filp);
because of this.
Acked-by: Jeff Moyer
ld have been part of commit 599bd19bdc4c6 ("fs: don't
allow to complete sync iocbs through aio_complete").
Reviewed-by: Jeff Moyer
> ---
> fs/aio.c | 11 ++-
> 1 file changed, 2 insertions(+), 9 deletions(-)
>
> diff --git a/fs/aio.c b/fs/aio.c
> index 03d59
Christoph Hellwig writes:
> The page size is in no way related to the aio code, and printing it in
> the (debug) dmesg at every boot serves no purpose.
>
> Signed-off-by: Christoph Hellwig
Acked-by: Jeff Moyer
Christoph Hellwig writes:
> + p = fork();
> + switch (p) {
[snip]
> + default:
> + close(pipe1[0]);
> + close(pipe2[1]);
> +
> + io_prep_poll(&iocb, pipe2[0], POLLIN);
> +
> + ret = io_setup(1, &ctx);
> + if (ret) {
> +
Christoph Hellwig writes:
> Hi all,
>
> this series resurrects IOCB_CMD_POLL support and adds support for the
> new io_pgetevents system call, as well as adding a test case.
This looks good to me. There may be a couple of changes to the syscall
bits, but I can take care of that. I'll review th
Hi, Ben,
Thanks for the quick reply.
Benjamin LaHaise writes:
> On Fri, Jan 05, 2018 at 11:25:17AM -0500, Jeff Moyer wrote:
>> Christoph Hellwig writes:
>>
>> > This way it can be used for the fallback 6-argument version on
>> > all architectures.
>>
Christoph Hellwig writes:
> This way it can be used for the fallback 6-argument version on
> all architectures.
>
> Signed-off-by: Christoph Hellwig
This is a strange way to do things. However, I was never really sold on
libaio having to implement its own system call wrappers. That decision
d
Matthew Wilcox writes:
> On Wed, Dec 13, 2017 at 09:42:52PM +0800, Zhen Lei wrote:
>> The information below was reported on an older kernel version, and I see
>> the problem still exists in the current version.
>
> I think you're right, but what an awful interface we have here!
> The user must not only fetc
Kirill Tkhai writes:
>> I think you just need to account the completion ring.
>
> A request of struct aio_kiocb type consumes much more memory, than
> struct io_event does. Shouldn't we account it too?
Not in my opinion. The completion ring is the part that gets pinned for
long periods of time.
Kirill Tkhai writes:
> On 05.12.2017 00:52, Tejun Heo wrote:
>> Hello, Kirill.
>>
>> On Tue, Dec 05, 2017 at 12:44:00AM +0300, Kirill Tkhai wrote:
Can you please explain how this is a fundamental resource which can't
be controlled otherwise?
>>>
>>> Currently, aio_nr and aio_max_nr are
Kirill Tkhai writes:
> Hi, Benjamin,
>
> On 04.12.2017 19:52, Benjamin LaHaise wrote:
>> Hi Kirill,
>>
>> On Mon, Dec 04, 2017 at 07:12:51PM +0300, Kirill Tkhai wrote:
>>> Hi,
>>>
>>> this patch set introduces accounting aio_nr and aio_max_nr per blkio cgroup.
>>> It may be used to limit number
...not ENOMEM. Fix it.
Signed-off-by: Jeff Moyer
diff --git a/tools/testing/radix-tree/idr-test.c b/tools/testing/radix-tree/idr-test.c
index 30cd0b2..2056d83 100644
--- a/tools/testing/radix-tree/idr-test.c
+++ b/tools/testing/radix-tree/idr-test.c
@@ -380,7 +380,7 @@ void ida_check_random
kernel test robot writes:
> FYI, we noticed the following commit (built with gcc-6):
>
> commit: 332391a9935da939319e473b4680e173df75afcf ("fs: Fix page cache
> inconsistency when mixing buffered and AIO DIO")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[...]
> [ 1
Nikolay Borisov writes:
> We already get the block counts and calculate the end block at the
> beginning of the function. Let's use the local variables for consistency and
> readability. No functional changes
>
> Signed-off-by: Nikolay Borisov
Looks ok to me.
Reviewed-by: Jeff Moyer