[no subject]

2019-04-03 Thread Mail Delivery Subsystem
This Message was undeliverable due to the following reason:

Your message was not delivered because the destination computer was
not reachable within the allowed queue period. The amount of time
a message is queued before it is returned depends on local configura-
tion parameters.

Most likely there is a network problem that prevented delivery, but
it is also possible that the computer is turned off, or does not
have a mail system running right now.

Your message was not delivered within 1 days:
Host 66.24.245.203 is not responding.

The following recipients did not receive this message:


Please reply to postmas...@lists.01.org
if you feel this message to be in error.



___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


[PATCH] libnvdimm, pmem: fix a possible OOB access when read and write pmem

2019-04-03 Thread Li RongQing
If offset is not zero and length is bigger than PAGE_SIZE,
this will cause to out of boundary access to a page memory

Fixes: 98cc093cba1e "(block, THP: make block_device_operations.rw_page support 
THP)"
Co-developed-by: Liang ZhiCheng 
Signed-off-by: Liang ZhiCheng 
Signed-off-by: Li RongQing 
---
 drivers/nvdimm/pmem.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index bc2f700feef8..0279eb1da3ef 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -113,13 +113,13 @@ static void write_pmem(void *pmem_addr, struct page *page,
 
while (len) {
mem = kmap_atomic(page);
-   chunk = min_t(unsigned int, len, PAGE_SIZE);
+   chunk = min_t(unsigned int, len, PAGE_SIZE - off);
memcpy_flushcache(pmem_addr, mem + off, chunk);
kunmap_atomic(mem);
len -= chunk;
off = 0;
page++;
-   pmem_addr += PAGE_SIZE;
+   pmem_addr += chunk;
}
 }
 
@@ -132,7 +132,7 @@ static blk_status_t read_pmem(struct page *page, unsigned 
int off,
 
while (len) {
mem = kmap_atomic(page);
-   chunk = min_t(unsigned int, len, PAGE_SIZE);
+   chunk = min_t(unsigned int, len, PAGE_SIZE - off);
rem = memcpy_mcsafe(mem + off, pmem_addr, chunk);
kunmap_atomic(mem);
if (rem)
@@ -140,7 +140,7 @@ static blk_status_t read_pmem(struct page *page, unsigned 
int off,
len -= chunk;
off = 0;
page++;
-   pmem_addr += PAGE_SIZE;
+   pmem_addr += chunk;
}
return BLK_STS_OK;
 }
-- 
2.16.2

___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH v4 5/5] xfs: disable map_sync for async flush

2019-04-03 Thread Darrick J. Wong
On Thu, Apr 04, 2019 at 09:09:12AM +1100, Dave Chinner wrote:
> On Wed, Apr 03, 2019 at 04:10:18PM +0530, Pankaj Gupta wrote:
> > Virtio pmem provides asynchronous host page cache flush
> > mechanism. we don't support 'MAP_SYNC' with virtio pmem 
> > and xfs.
> > 
> > Signed-off-by: Pankaj Gupta 
> > ---
> >  fs/xfs/xfs_file.c | 8 
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > index 1f2e2845eb76..dced2eb8c91a 100644
> > --- a/fs/xfs/xfs_file.c
> > +++ b/fs/xfs/xfs_file.c
> > @@ -1203,6 +1203,14 @@ xfs_file_mmap(
> > if (!IS_DAX(file_inode(filp)) && (vma->vm_flags & VM_SYNC))
> > return -EOPNOTSUPP;
> >  
> > +   /* We don't support synchronous mappings with DAX files if
> > +* dax_device is not synchronous.
> > +*/
> > +   if (IS_DAX(file_inode(filp)) && !dax_synchronous(
> > +   xfs_find_daxdev_for_inode(file_inode(filp))) &&
> > +   (vma->vm_flags & VM_SYNC))
> > +   return -EOPNOTSUPP;
> > +
> > file_accessed(filp);
> > vma->vm_ops = _file_vm_ops;
> > if (IS_DAX(file_inode(filp)))
> 
> All this ad hoc IS_DAX conditional logic is getting pretty nasty.
> 
> xfs_file_mmap(
> 
> {
>   struct inode*inode = file_inode(filp);
> 
>   if (vma->vm_flags & VM_SYNC) {
>   if (!IS_DAX(inode))
>   return -EOPNOTSUPP;
>   if (!dax_synchronous(xfs_find_daxdev_for_inode(inode))
>   return -EOPNOTSUPP;
>   }
> 
>   file_accessed(filp);
>   vma->vm_ops = _file_vm_ops;
>   if (IS_DAX(inode))
>   vma->vm_flags |= VM_HUGEPAGE;
>   return 0;
> }
> 
> 
> Even better, factor out all the "MAP_SYNC supported" checks into a
> helper so that the filesystem code just doesn't have to care about
> the details of checking for DAX+MAP_SYNC support

Seconded, since ext4 has nearly the same flag validation logic.

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> da...@fromorbit.com
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH v4 5/5] xfs: disable map_sync for async flush

2019-04-03 Thread Dave Chinner
On Wed, Apr 03, 2019 at 04:10:18PM +0530, Pankaj Gupta wrote:
> Virtio pmem provides asynchronous host page cache flush
> mechanism. we don't support 'MAP_SYNC' with virtio pmem 
> and xfs.
> 
> Signed-off-by: Pankaj Gupta 
> ---
>  fs/xfs/xfs_file.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 1f2e2845eb76..dced2eb8c91a 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1203,6 +1203,14 @@ xfs_file_mmap(
>   if (!IS_DAX(file_inode(filp)) && (vma->vm_flags & VM_SYNC))
>   return -EOPNOTSUPP;
>  
> + /* We don't support synchronous mappings with DAX files if
> +  * dax_device is not synchronous.
> +  */
> + if (IS_DAX(file_inode(filp)) && !dax_synchronous(
> + xfs_find_daxdev_for_inode(file_inode(filp))) &&
> + (vma->vm_flags & VM_SYNC))
> + return -EOPNOTSUPP;
> +
>   file_accessed(filp);
>   vma->vm_ops = _file_vm_ops;
>   if (IS_DAX(file_inode(filp)))

All this ad hoc IS_DAX conditional logic is getting pretty nasty.

xfs_file_mmap(

{
struct inode*inode = file_inode(filp);

if (vma->vm_flags & VM_SYNC) {
if (!IS_DAX(inode))
return -EOPNOTSUPP;
if (!dax_synchronous(xfs_find_daxdev_for_inode(inode))
return -EOPNOTSUPP;
}

file_accessed(filp);
vma->vm_ops = _file_vm_ops;
if (IS_DAX(inode))
vma->vm_flags |= VM_HUGEPAGE;
return 0;
}


Even better, factor out all the "MAP_SYNC supported" checks into a
helper so that the filesystem code just doesn't have to care about
the details of checking for DAX+MAP_SYNC support

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH RFC tip/core/rcu 0/4] Forbid static SRCU use in modules

2019-04-03 Thread Joel Fernandes
On Wed, Apr 03, 2019 at 09:20:39AM -0700, Paul E. McKenney wrote:
> On Wed, Apr 03, 2019 at 10:27:42AM -0400, Mathieu Desnoyers wrote:
> > - On Apr 3, 2019, at 9:32 AM, paulmck paul...@linux.ibm.com wrote:
> > 
> > > On Tue, Apr 02, 2019 at 11:34:07AM -0400, Mathieu Desnoyers wrote:
> > >> - On Apr 2, 2019, at 11:23 AM, paulmck paul...@linux.ibm.com wrote:
> > >> 
> > >> > On Tue, Apr 02, 2019 at 11:14:40AM -0400, Mathieu Desnoyers wrote:
> > >> >> - On Apr 2, 2019, at 10:28 AM, paulmck paul...@linux.ibm.com 
> > >> >> wrote:
> > >> >> 
> > >> >> > Hello!
> > >> >> > 
> > >> >> > This series prohibits use of DEFINE_SRCU() and DEFINE_STATIC_SRCU()
> > >> >> > by loadable modules.  The reason for this prohibition is the fact
> > >> >> > that using these two macros within modules requires that the size of
> > >> >> > the reserved region be increased, which is not something we want to
> > >> >> > be doing all that often.  Instead, loadable modules should define an
> > >> >> > srcu_struct and invoke init_srcu_struct() from their module_init 
> > >> >> > function
> > >> >> > and cleanup_srcu_struct() from their module_exit function.  Note 
> > >> >> > that
> > >> >> > modules using call_srcu() will also need to invoke srcu_barrier() 
> > >> >> > from
> > >> >> > their module_exit function.
> > >> >> 
> > >> >> This arbitrary API limitation seems weird.
> > >> >> 
> > >> >> Isn't there a way to allow modules to use DEFINE_SRCU and 
> > >> >> DEFINE_STATIC_SRCU
> > >> >> while implementing them with dynamic allocation under the hood ?
> > >> > 
> > >> > Although call_srcu() already has initialization hooks, some would
> > >> > also be required in srcu_read_lock(), and I am concerned about adding
> > >> > memory allocation at that point, especially given the possibility
> > >> > of memory-allocation failure.  And the possibility that the first
> > >> > srcu_read_lock() happens in an interrupt handler or similar.
> > >> > 
> > >> > Or am I missing a trick here?
> > >> 
> > >> I was more thinking that under #ifdef MODULE, both DEFINE_SRCU and
> > >> DEFINE_STATIC_SRCU could append data in a dedicated section. module.c
> > >> would additionally lookup that section on module load, and deal with
> > >> those statically defined SRCU entries as if they were dynamically
> > >> allocated ones. It would of course cleanup those resources on module
> > >> unload.
> > >> 
> > >> Am I missing some subtlety there ?
> > > 
> > > If I understand you correctly, that is actually what is already done.  The
> > > size of this dedicated section is currently set by PERCPU_MODULE_RESERVE,
> > > and the additions of DEFINE{_STATIC}_SRCU() in modules was requiring that
> > > this to be increased frequently.  That led to a request that something
> > > be done, in turn leading to this patch series.
> > 
> > I think we are not expressing quite the same idea.
> > 
> > AFAIU, yours is to have DEFINE*_SRCU directly define per-cpu data within 
> > modules,
> > which ends up using percpu module reserved memory.
> > 
> > My idea is to make DEFINE*_SRCU have a different behavior under #ifdef 
> > MODULE.
> > It could emit a _global variable_ (_not_ per-cpu) within a new section. That
> > section would then be used by module init/exit code to figure out what "srcu
> > descriptors" are present in the modules. It would therefore rely on dynamic
> > allocation for those, therefore removing the need to involve the percpu 
> > module
> > reserved pool at all.
> > 
> > > 
> > > I don't see a way around this short of changing module loading to do
> > > alloc_percpu() and then updating the relocation based on this result.
> > > Which would admittedly be far more convenient.  I was assuming that
> > > this would be difficult due to varying CPU offsets or the like.
> > > 
> > > But if it can be done reasonably, it would be quite a bit nicer than
> > > forcing dynamic allocation in cases where it is not otherwise needed.
> > 
> > Hopefully my explanation above helps clear out what I have in mind.
> > 
> > You can find similar tricks performed by include/linux/tracepoint.h:
> > 
> > #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
> > static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
> > {
> > return offset_to_ptr(p);
> > }
> > 
> > #define __TRACEPOINT_ENTRY(name)\
> > asm("   .section \"__tracepoints_ptrs\", \"a\"  \n" \
> > "   .balign 4   \n" \
> > "   .long   __tracepoint_" #name " - .  \n" \
> > "   .previous   \n")
> > #else
> > static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
> > {
> > return *p;
> > }
> > 
> > #define __TRACEPOINT_ENTRY(name) \
> > static tracepoint_ptr_t __tracepoint_ptr_##name __used   \
> > 

Re: [PATCH RFC tip/core/rcu 1/4] dax/super: Dynamically allocate dax_srcu

2019-04-03 Thread Dan Williams
On Tue, Apr 2, 2019 at 7:29 AM Paul E. McKenney  wrote:
>
> Having DEFINE_SRCU() or DEFINE_STATIC_SRCU() in a loadable module
> requires that the size of the reserved region be increased, which is
> not something we really want to be doing.  This commit therefore removes
> the DEFINE_STATIC_SRCU() from drivers/dax/super.c in favor of defining
> dax_srcu as a simple srcu_struct, initializing it in dax_core_init(),
> and cleaning it up in dax_core_exit().
>
> Reported-by: kbuild test robot 
> Signed-off-by: Paul E. McKenney 

Looks good to me.

Reviewed-by: Dan Williams 
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH RFC tip/core/rcu 0/4] Forbid static SRCU use in modules

2019-04-03 Thread Paul E. McKenney
On Wed, Apr 03, 2019 at 10:27:42AM -0400, Mathieu Desnoyers wrote:
> - On Apr 3, 2019, at 9:32 AM, paulmck paul...@linux.ibm.com wrote:
> 
> > On Tue, Apr 02, 2019 at 11:34:07AM -0400, Mathieu Desnoyers wrote:
> >> - On Apr 2, 2019, at 11:23 AM, paulmck paul...@linux.ibm.com wrote:
> >> 
> >> > On Tue, Apr 02, 2019 at 11:14:40AM -0400, Mathieu Desnoyers wrote:
> >> >> - On Apr 2, 2019, at 10:28 AM, paulmck paul...@linux.ibm.com wrote:
> >> >> 
> >> >> > Hello!
> >> >> > 
> >> >> > This series prohibits use of DEFINE_SRCU() and DEFINE_STATIC_SRCU()
> >> >> > by loadable modules.  The reason for this prohibition is the fact
> >> >> > that using these two macros within modules requires that the size of
> >> >> > the reserved region be increased, which is not something we want to
> >> >> > be doing all that often.  Instead, loadable modules should define an
> >> >> > srcu_struct and invoke init_srcu_struct() from their module_init 
> >> >> > function
> >> >> > and cleanup_srcu_struct() from their module_exit function.  Note that
> >> >> > modules using call_srcu() will also need to invoke srcu_barrier() from
> >> >> > their module_exit function.
> >> >> 
> >> >> This arbitrary API limitation seems weird.
> >> >> 
> >> >> Isn't there a way to allow modules to use DEFINE_SRCU and 
> >> >> DEFINE_STATIC_SRCU
> >> >> while implementing them with dynamic allocation under the hood ?
> >> > 
> >> > Although call_srcu() already has initialization hooks, some would
> >> > also be required in srcu_read_lock(), and I am concerned about adding
> >> > memory allocation at that point, especially given the possibility
> >> > of memory-allocation failure.  And the possibility that the first
> >> > srcu_read_lock() happens in an interrupt handler or similar.
> >> > 
> >> > Or am I missing a trick here?
> >> 
> >> I was more thinking that under #ifdef MODULE, both DEFINE_SRCU and
> >> DEFINE_STATIC_SRCU could append data in a dedicated section. module.c
> >> would additionally lookup that section on module load, and deal with
> >> those statically defined SRCU entries as if they were dynamically
> >> allocated ones. It would of course cleanup those resources on module
> >> unload.
> >> 
> >> Am I missing some subtlety there ?
> > 
> > If I understand you correctly, that is actually what is already done.  The
> > size of this dedicated section is currently set by PERCPU_MODULE_RESERVE,
> > and the additions of DEFINE{_STATIC}_SRCU() in modules was requiring that
> > this to be increased frequently.  That led to a request that something
> > be done, in turn leading to this patch series.
> 
> I think we are not expressing quite the same idea.
> 
> AFAIU, yours is to have DEFINE*_SRCU directly define per-cpu data within 
> modules,
> which ends up using percpu module reserved memory.
> 
> My idea is to make DEFINE*_SRCU have a different behavior under #ifdef MODULE.
> It could emit a _global variable_ (_not_ per-cpu) within a new section. That
> section would then be used by module init/exit code to figure out what "srcu
> descriptors" are present in the modules. It would therefore rely on dynamic
> allocation for those, therefore removing the need to involve the percpu module
> reserved pool at all.
> 
> > 
> > I don't see a way around this short of changing module loading to do
> > alloc_percpu() and then updating the relocation based on this result.
> > Which would admittedly be far more convenient.  I was assuming that
> > this would be difficult due to varying CPU offsets or the like.
> > 
> > But if it can be done reasonably, it would be quite a bit nicer than
> > forcing dynamic allocation in cases where it is not otherwise needed.
> 
> Hopefully my explanation above helps clear out what I have in mind.
> 
> You can find similar tricks performed by include/linux/tracepoint.h:
> 
> #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
> static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
> {
> return offset_to_ptr(p);
> }
> 
> #define __TRACEPOINT_ENTRY(name)\
> asm("   .section \"__tracepoints_ptrs\", \"a\"  \n" \
> "   .balign 4   \n" \
> "   .long   __tracepoint_" #name " - .  \n" \
> "   .previous   \n")
> #else
> static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
> {
> return *p;
> }
> 
> #define __TRACEPOINT_ENTRY(name) \
> static tracepoint_ptr_t __tracepoint_ptr_##name __used   \
> __attribute__((section("__tracepoints_ptrs"))) = \
> &__tracepoint_##name
> #endif
> 
> [...]
> 
> #define DEFINE_TRACE_FN(name, reg, unreg)\
> static const char __tpstrtab_##name[]\
> 

Re: [PATCH RFC tip/core/rcu 0/4] Forbid static SRCU use in modules

2019-04-03 Thread Mathieu Desnoyers
- On Apr 3, 2019, at 9:32 AM, paulmck paul...@linux.ibm.com wrote:

> On Tue, Apr 02, 2019 at 11:34:07AM -0400, Mathieu Desnoyers wrote:
>> - On Apr 2, 2019, at 11:23 AM, paulmck paul...@linux.ibm.com wrote:
>> 
>> > On Tue, Apr 02, 2019 at 11:14:40AM -0400, Mathieu Desnoyers wrote:
>> >> - On Apr 2, 2019, at 10:28 AM, paulmck paul...@linux.ibm.com wrote:
>> >> 
>> >> > Hello!
>> >> > 
>> >> > This series prohibits use of DEFINE_SRCU() and DEFINE_STATIC_SRCU()
>> >> > by loadable modules.  The reason for this prohibition is the fact
>> >> > that using these two macros within modules requires that the size of
>> >> > the reserved region be increased, which is not something we want to
>> >> > be doing all that often.  Instead, loadable modules should define an
>> >> > srcu_struct and invoke init_srcu_struct() from their module_init 
>> >> > function
>> >> > and cleanup_srcu_struct() from their module_exit function.  Note that
>> >> > modules using call_srcu() will also need to invoke srcu_barrier() from
>> >> > their module_exit function.
>> >> 
>> >> This arbitrary API limitation seems weird.
>> >> 
>> >> Isn't there a way to allow modules to use DEFINE_SRCU and 
>> >> DEFINE_STATIC_SRCU
>> >> while implementing them with dynamic allocation under the hood ?
>> > 
>> > Although call_srcu() already has initialization hooks, some would
>> > also be required in srcu_read_lock(), and I am concerned about adding
>> > memory allocation at that point, especially given the possibility
>> > of memory-allocation failure.  And the possibility that the first
>> > srcu_read_lock() happens in an interrupt handler or similar.
>> > 
>> > Or am I missing a trick here?
>> 
>> I was more thinking that under #ifdef MODULE, both DEFINE_SRCU and
>> DEFINE_STATIC_SRCU could append data in a dedicated section. module.c
>> would additionally lookup that section on module load, and deal with
>> those statically defined SRCU entries as if they were dynamically
>> allocated ones. It would of course cleanup those resources on module
>> unload.
>> 
>> Am I missing some subtlety there ?
> 
> If I understand you correctly, that is actually what is already done.  The
> size of this dedicated section is currently set by PERCPU_MODULE_RESERVE,
> and the additions of DEFINE{_STATIC}_SRCU() in modules was requiring that
> this to be increased frequently.  That led to a request that something
> be done, in turn leading to this patch series.

I think we are not expressing quite the same idea.

AFAIU, yours is to have DEFINE*_SRCU directly define per-cpu data within 
modules,
which ends up using percpu module reserved memory.

My idea is to make DEFINE*_SRCU have a different behavior under #ifdef MODULE.
It could emit a _global variable_ (_not_ per-cpu) within a new section. That
section would then be used by module init/exit code to figure out what "srcu
descriptors" are present in the modules. It would therefore rely on dynamic
allocation for those, therefore removing the need to involve the percpu module
reserved pool at all.

> 
> I don't see a way around this short of changing module loading to do
> alloc_percpu() and then updating the relocation based on this result.
> Which would admittedly be far more convenient.  I was assuming that
> this would be difficult due to varying CPU offsets or the like.
> 
> But if it can be done reasonably, it would be quite a bit nicer than
> forcing dynamic allocation in cases where it is not otherwise needed.

Hopefully my explanation above helps clear out what I have in mind.

You can find similar tricks performed by include/linux/tracepoint.h:

#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
{
return offset_to_ptr(p);
}

#define __TRACEPOINT_ENTRY(name)\
asm("   .section \"__tracepoints_ptrs\", \"a\"  \n" \
"   .balign 4   \n" \
"   .long   __tracepoint_" #name " - .  \n" \
"   .previous   \n")
#else
static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
{
return *p;
}

#define __TRACEPOINT_ENTRY(name) \
static tracepoint_ptr_t __tracepoint_ptr_##name __used   \
__attribute__((section("__tracepoints_ptrs"))) = \
&__tracepoint_##name
#endif

[...]

#define DEFINE_TRACE_FN(name, reg, unreg)\
static const char __tpstrtab_##name[]\
__attribute__((section("__tracepoints_strings"))) = #name;   \
struct tracepoint __tracepoint_##name\
__attribute__((section("__tracepoints"), used)) =\
{ __tpstrtab_##name, STATIC_KEY_INIT_FALSE, reg, unreg, NULL };\

Re: [PATCH RFC tip/core/rcu 0/4] Forbid static SRCU use in modules

2019-04-03 Thread Paul E. McKenney
On Tue, Apr 02, 2019 at 11:34:07AM -0400, Mathieu Desnoyers wrote:
> - On Apr 2, 2019, at 11:23 AM, paulmck paul...@linux.ibm.com wrote:
> 
> > On Tue, Apr 02, 2019 at 11:14:40AM -0400, Mathieu Desnoyers wrote:
> >> - On Apr 2, 2019, at 10:28 AM, paulmck paul...@linux.ibm.com wrote:
> >> 
> >> > Hello!
> >> > 
> >> > This series prohibits use of DEFINE_SRCU() and DEFINE_STATIC_SRCU()
> >> > by loadable modules.  The reason for this prohibition is the fact
> >> > that using these two macros within modules requires that the size of
> >> > the reserved region be increased, which is not something we want to
> >> > be doing all that often.  Instead, loadable modules should define an
> >> > srcu_struct and invoke init_srcu_struct() from their module_init function
> >> > and cleanup_srcu_struct() from their module_exit function.  Note that
> >> > modules using call_srcu() will also need to invoke srcu_barrier() from
> >> > their module_exit function.
> >> 
> >> This arbitrary API limitation seems weird.
> >> 
> >> Isn't there a way to allow modules to use DEFINE_SRCU and 
> >> DEFINE_STATIC_SRCU
> >> while implementing them with dynamic allocation under the hood ?
> > 
> > Although call_srcu() already has initialization hooks, some would
> > also be required in srcu_read_lock(), and I am concerned about adding
> > memory allocation at that point, especially given the possibility
> > of memory-allocation failure.  And the possibility that the first
> > srcu_read_lock() happens in an interrupt handler or similar.
> > 
> > Or am I missing a trick here?
> 
> I was more thinking that under #ifdef MODULE, both DEFINE_SRCU and
> DEFINE_STATIC_SRCU could append data in a dedicated section. module.c
> would additionally lookup that section on module load, and deal with
> those statically defined SRCU entries as if they were dynamically
> allocated ones. It would of course cleanup those resources on module
> unload.
> 
> Am I missing some subtlety there ?

If I understand you correctly, that is actually what is already done.  The
size of this dedicated section is currently set by PERCPU_MODULE_RESERVE,
and the additions of DEFINE{_STATIC}_SRCU() in modules was requiring that
this to be increased frequently.  That led to a request that something
be done, in turn leading to this patch series.

I don't see a way around this short of changing module loading to do
alloc_percpu() and then updating the relocation based on this result.
Which would admittedly be far more convenient.  I was assuming that
this would be difficult due to varying CPU offsets or the like.

But if it can be done reasonably, it would be quite a bit nicer than
forcing dynamic allocation in cases where it is not otherwise needed.

Thanx, Paul

> Thanks,
> 
> Mathieu
> 
> 
> > 
> > Thanx, Paul
> > 
> >> Thanks,
> >> 
> >> Mathieu
> >> 
> >> 
> >> > 
> >> > This series consist of the following:
> >> > 
> >> > 1.   Dynamically allocate dax_srcu.
> >> > 
> >> > 2.   Dynamically allocate drm_unplug_srcu.
> >> > 
> >> > 3.   Dynamically allocate kfd_processes_srcu.
> >> > 
> >> > These build and have been subjected to 0day testing, but might also need
> >> > testing by someone having the requisite hardware.
> >> > 
> >> >  Thanx, Paul
> >> > 
> >> > 
> >> > 
> >> > drivers/dax/super.c|   10 +-
> >> > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c |5 +++
> >> > drivers/gpu/drm/amd/amdkfd/kfd_process.c   |2 -
> >> > drivers/gpu/drm/drm_drv.c  |8 
> >> > include/linux/srcutree.h   |   19 +--
> >> > kernel/rcu/rcuperf.c   |   40 
> >> > +++-
> >> > kernel/rcu/rcutorture.c|   48 
> >> > +
> >> >  7 files changed, 105 insertions(+), 27 deletions(-)
> >> 
> >> --
> >> Mathieu Desnoyers
> >> EfficiOS Inc.
> >> http://www.efficios.com
> 
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
> 

___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH RFC tip/core/rcu 0/4] Forbid static SRCU use in modules

2019-04-03 Thread Paul E. McKenney
On Tue, Apr 02, 2019 at 02:40:54PM -0400, Joel Fernandes wrote:
> On Tue, Apr 02, 2019 at 08:23:34AM -0700, Paul E. McKenney wrote:
> > On Tue, Apr 02, 2019 at 11:14:40AM -0400, Mathieu Desnoyers wrote:
> > > - On Apr 2, 2019, at 10:28 AM, paulmck paul...@linux.ibm.com wrote:
> > > 
> > > > Hello!
> > > > 
> > > > This series prohibits use of DEFINE_SRCU() and DEFINE_STATIC_SRCU()
> > > > by loadable modules.  The reason for this prohibition is the fact
> > > > that using these two macros within modules requires that the size of
> > > > the reserved region be increased, which is not something we want to
> > > > be doing all that often.  Instead, loadable modules should define an
> > > > srcu_struct and invoke init_srcu_struct() from their module_init 
> > > > function
> > > > and cleanup_srcu_struct() from their module_exit function.  Note that
> > > > modules using call_srcu() will also need to invoke srcu_barrier() from
> > > > their module_exit function.
> > > 
> > > This arbitrary API limitation seems weird.
> > > 
> > > Isn't there a way to allow modules to use DEFINE_SRCU and 
> > > DEFINE_STATIC_SRCU
> > > while implementing them with dynamic allocation under the hood ?
> > 
> > Although call_srcu() already has initialization hooks, some would
> > also be required in srcu_read_lock(), and I am concerned about adding
> > memory allocation at that point, especially given the possibility
> > of memory-allocation failure.  And the possibility that the first
> > srcu_read_lock() happens in an interrupt handler or similar.
> > 
> > Or am I missing a trick here?
> 
> Hi Paul,
> 
> Which 'reserved region' are you referring to? Isn't this region also
> used for non-module cases in which case the same problem applies to
> non-modules?

The percpu/module reservation discussed in this thread:

https://lore.kernel.org/lkml/c72402f2-967e-cd56-99d8-9139c9e7f...@google.com/T/#mbcacf60ddc0b3fd6e831a3ea71efc90da359a3bf

For non-modules, global per-CPU variables are statically allocated.
For modules, they must be dynamically allocated at modprobe time, and
their size is set by PERCPU_MODULE_RESERVE.

Thanx, Paul

___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [Qemu-devel] [PATCH v4 2/5] virtio-pmem: Add virtio pmem driver

2019-04-03 Thread Pankaj Gupta


> Subject: Re: [Qemu-devel] [PATCH v4 2/5] virtio-pmem: Add virtio pmem driver
> 
> On Wed, Apr 03, 2019 at 04:10:15PM +0530, Pankaj Gupta wrote:
> > This patch adds virtio-pmem driver for KVM guest.
> > 
> > Guest reads the persistent memory range information from
> > Qemu over VIRTIO and registers it on nvdimm_bus. It also
> > creates a nd_region object with the persistent memory
> > range information so that existing 'nvdimm/pmem' driver
> > can reserve this into system memory map. This way
> > 'virtio-pmem' driver uses existing functionality of pmem
> > driver to register persistent memory compatible for DAX
> > capable filesystems.
> > 
> > This also provides function to perform guest flush over
> > VIRTIO from 'pmem' driver when userspace performs flush
> > on DAX memory range.
> > 
> > Signed-off-by: Pankaj Gupta 
> > ---
> >  drivers/nvdimm/virtio_pmem.c |  84 +
> >  drivers/virtio/Kconfig   |  10 +++
> >  drivers/virtio/Makefile  |   1 +
> >  drivers/virtio/pmem.c| 125 +++
> >  include/linux/virtio_pmem.h  |  60 +++
> >  include/uapi/linux/virtio_ids.h  |   1 +
> >  include/uapi/linux/virtio_pmem.h |  10 +++
> >  7 files changed, 291 insertions(+)
> >  create mode 100644 drivers/nvdimm/virtio_pmem.c
> >  create mode 100644 drivers/virtio/pmem.c
> >  create mode 100644 include/linux/virtio_pmem.h
> >  create mode 100644 include/uapi/linux/virtio_pmem.h
> > 
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > new file mode 100644
> > index ..2a1b1ba2c1ff
> > --- /dev/null
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -0,0 +1,84 @@
> > +// SPDX-License-Identifier: GPL-2.0
> 
> Is this comment stile (//) acceptable?

In existing code, i can see same comment
pattern for license at some places.

> 
> > +/*
> > + * virtio_pmem.c: Virtio pmem Driver
> > + *
> > + * Discovers persistent memory range information
> > + * from host and provides a virtio based flushing
> > + * interface.
> > + */
> > +#include 
> > +#include "nd.h"
> > +
> > + /* The interrupt handler */
> > +void host_ack(struct virtqueue *vq)
> > +{
> > +   unsigned int len;
> > +   unsigned long flags;
> > +   struct virtio_pmem_request *req, *req_buf;
> > +   struct virtio_pmem *vpmem = vq->vdev->priv;
> > +
> > +   spin_lock_irqsave(>pmem_lock, flags);
> > +   while ((req = virtqueue_get_buf(vq, )) != NULL) {
> > +   req->done = true;
> > +   wake_up(>host_acked);
> > +
> > +   if (!list_empty(>req_list)) {
> > +   req_buf = list_first_entry(>req_list,
> > +   struct virtio_pmem_request, list);
> > +   list_del(>req_list);
> > +   req_buf->wq_buf_avail = true;
> > +   wake_up(_buf->wq_buf);
> > +   }
> > +   }
> > +   spin_unlock_irqrestore(>pmem_lock, flags);
> > +}
> > +EXPORT_SYMBOL_GPL(host_ack);
> > +
> > + /* The request submission function */
> > +int virtio_pmem_flush(struct nd_region *nd_region)
> > +{
> > +   int err;
> > +   unsigned long flags;
> > +   struct scatterlist *sgs[2], sg, ret;
> > +   struct virtio_device *vdev = nd_region->provider_data;
> > +   struct virtio_pmem *vpmem = vdev->priv;
> > +   struct virtio_pmem_request *req;
> > +
> > +   might_sleep();
> 
> [1]
> 
> > +   req = kmalloc(sizeof(*req), GFP_KERNEL);
> > +   if (!req)
> > +   return -ENOMEM;
> > +
> > +   req->done = req->wq_buf_avail = false;
> > +   strcpy(req->name, "FLUSH");
> > +   init_waitqueue_head(>host_acked);
> > +   init_waitqueue_head(>wq_buf);
> > +   sg_init_one(, req->name, strlen(req->name));
> > +   sgs[0] = 
> > +   sg_init_one(, >ret, sizeof(req->ret));
> > +   sgs[1] = 
> > +
> > +   spin_lock_irqsave(>pmem_lock, flags);
> > +   err = virtqueue_add_sgs(vpmem->req_vq, sgs, 1, 1, req, GFP_ATOMIC);
> 
> Is it okay to use GFP_ATOMIC in a might-sleep ([1]) function?

might sleep will give us a warning if we try to sleep from non-sleepable
context. 

We are doing it other way, i.e might_sleep is not inside GFP_ATOMIC. 

> 
> > +   if (err) {
> > +   dev_err(>dev, "failed to send command to virtio pmem 
> > device\n");
> > +
> > +   list_add_tail(>req_list, >list);
> > +   spin_unlock_irqrestore(>pmem_lock, flags);
> > +
> > +   /* When host has read buffer, this completes via host_ack */
> > +   wait_event(req->wq_buf, req->wq_buf_avail);
> > +   spin_lock_irqsave(>pmem_lock, flags);
> > +   }
> > +   virtqueue_kick(vpmem->req_vq);
> 
> You probably want to check return value here.

Don't think it will matter in this case?

> 
> > +   spin_unlock_irqrestore(>pmem_lock, flags);
> > +
> > +   /* When host has read buffer, this completes via host_ack */
> > +   wait_event(req->host_acked, req->done);
> > +   err = req->ret;
> > +   kfree(req);
> > +
> > +   return err;
> > +};
> > +EXPORT_SYMBOL_GPL(virtio_pmem_flush);
> 

Re: [PATCH v2] mm: Fix modifying of page protection by insert_pfn_pmd()

2019-04-03 Thread Sasha Levin
Hi,

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: all

The bot has tested the following trees: v5.0.5, v4.19.32, v4.14.109, v4.9.166, 
v4.4.177, v3.18.137.

v5.0.5: Build OK!
v4.19.32: Build OK!
v4.14.109: Failed to apply! Possible dependencies:
b4e98d9ac775 ("mm: account pud page tables")
c4812909f5d5 ("mm: introduce wrappers to access mm->nr_ptes")

v4.9.166: Failed to apply! Possible dependencies:
166f61b9435a ("mm: codgin-style fixes")
505a60e22560 ("asm-generic: introduce 5level-fixup.h")
5c6a84a3f455 ("mm/kasan: Switch to using __pa_symbol and lm_alias")
82b0f8c39a38 ("mm: join struct fault_env and vm_fault")
953c66c2b22a ("mm: THP page cache support for ppc64")
b279ddc33824 ("mm: clarify mm_struct.mm_{users,count} documentation")
b4e98d9ac775 ("mm: account pud page tables")
c2febafc6773 ("mm: convert generic code to 5-level paging")
c4812909f5d5 ("mm: introduce wrappers to access mm->nr_ptes")

v4.4.177: Failed to apply! Possible dependencies:
01871e59af5c ("mm, dax: fix livelock, allow dax pmd mappings to become 
writeable")
01c8f1c44b83 ("mm, dax, gpu: convert vm_insert_mixed to pfn_t")
0e749e54244e ("dax: increase granularity of dax_clear_blocks() operations")
1bdb2d4ee05f ("ARM: split off core mapping logic from create_mapping")
34c0fd540e79 ("mm, dax, pmem: introduce pfn_t")
3ed3a4f0ddff ("mm: cleanup *pte_alloc* interfaces")
52db400fcd50 ("pmem, dax: clean up clear_pmem()")
7bc3777ca19c ("sparc64: Trim page tables for 8M hugepages")
b2e0d1625e19 ("dax: fix lifetime of in-kernel dax mappings with 
dax_map_atomic()")
c4812909f5d5 ("mm: introduce wrappers to access mm->nr_ptes")
c7936206b971 ("ARM: implement create_mapping_late() for EFI use")
f25748e3c34e ("mm, dax: convert vmf_insert_pfn_pmd() to pfn_t")
f579b2b10412 ("ARM: factor out allocation routine from __create_mapping()")

v3.18.137: Failed to apply! Possible dependencies:
047fc8a1f9a6 ("libnvdimm, nfit, nd_blk: driver for BLK-mode access 
persistent memory")
2a3746984c98 ("x86: Use new cache mode type in track_pfn_remap() and 
track_pfn_insert()")
34c0fd540e79 ("mm, dax, pmem: introduce pfn_t")
4c1eaa2344fb ("drivers/block/pmem: Fix 32-bit build warning in 
pmem_alloc()")
5cad465d7fa6 ("mm: add vmf_insert_pfn_pmd()")
61031952f4c8 ("arch, x86: pmem api for ensuring durability of persistent 
memory updates")
62232e45f4a2 ("libnvdimm: control (ioctl) messages for nvdimm_bus and 
nvdimm devices")
83e0abae ("staging: android: binder: move to the "real" part of the 
kernel")
957e3facd147 ("gcov: enable GCOV_PROFILE_ALL from ARCH Kconfigs")
9e853f2313e5 ("drivers/block/pmem: Add a driver for persistent memory")
9f53f9fa4ad1 ("libnvdimm, pmem: add libnvdimm support to the pmem driver")
b94d5230d06e ("libnvdimm, nfit: initial libnvdimm infrastructure and NFIT 
support")
cb389b9c0e00 ("dax: drop size parameter to ->direct_access()")
dd22f551ac0a ("block: Change direct_access calling convention")
e2e05394e4a3 ("pmem, dax: have direct_access use __pmem annotation")
ec776ef6bbe1 ("x86/mm: Add support for the non-standard protected e820 
type")
f0dc089ce217 ("libnvdimm: enable iostat")
f25748e3c34e ("mm, dax: convert vmf_insert_pfn_pmd() to pfn_t")


How should we proceed with this patch?

--
Thanks,
Sasha
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [Qemu-devel] [PATCH v4 2/5] virtio-pmem: Add virtio pmem driver

2019-04-03 Thread Yuval Shaia
On Wed, Apr 03, 2019 at 04:10:15PM +0530, Pankaj Gupta wrote:
> This patch adds virtio-pmem driver for KVM guest.
> 
> Guest reads the persistent memory range information from
> Qemu over VIRTIO and registers it on nvdimm_bus. It also
> creates a nd_region object with the persistent memory
> range information so that existing 'nvdimm/pmem' driver
> can reserve this into system memory map. This way
> 'virtio-pmem' driver uses existing functionality of pmem
> driver to register persistent memory compatible for DAX
> capable filesystems.
> 
> This also provides function to perform guest flush over
> VIRTIO from 'pmem' driver when userspace performs flush
> on DAX memory range.
> 
> Signed-off-by: Pankaj Gupta 
> ---
>  drivers/nvdimm/virtio_pmem.c |  84 +
>  drivers/virtio/Kconfig   |  10 +++
>  drivers/virtio/Makefile  |   1 +
>  drivers/virtio/pmem.c| 125 +++
>  include/linux/virtio_pmem.h  |  60 +++
>  include/uapi/linux/virtio_ids.h  |   1 +
>  include/uapi/linux/virtio_pmem.h |  10 +++
>  7 files changed, 291 insertions(+)
>  create mode 100644 drivers/nvdimm/virtio_pmem.c
>  create mode 100644 drivers/virtio/pmem.c
>  create mode 100644 include/linux/virtio_pmem.h
>  create mode 100644 include/uapi/linux/virtio_pmem.h
> 
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> new file mode 100644
> index ..2a1b1ba2c1ff
> --- /dev/null
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -0,0 +1,84 @@
> +// SPDX-License-Identifier: GPL-2.0

Is this comment stile (//) acceptable?

> +/*
> + * virtio_pmem.c: Virtio pmem Driver
> + *
> + * Discovers persistent memory range information
> + * from host and provides a virtio based flushing
> + * interface.
> + */
> +#include 
> +#include "nd.h"
> +
> + /* The interrupt handler */
> +void host_ack(struct virtqueue *vq)
> +{
> + unsigned int len;
> + unsigned long flags;
> + struct virtio_pmem_request *req, *req_buf;
> + struct virtio_pmem *vpmem = vq->vdev->priv;
> +
> + spin_lock_irqsave(>pmem_lock, flags);
> + while ((req = virtqueue_get_buf(vq, )) != NULL) {
> + req->done = true;
> + wake_up(>host_acked);
> +
> + if (!list_empty(>req_list)) {
> + req_buf = list_first_entry(>req_list,
> + struct virtio_pmem_request, list);
> + list_del(>req_list);
> + req_buf->wq_buf_avail = true;
> + wake_up(_buf->wq_buf);
> + }
> + }
> + spin_unlock_irqrestore(>pmem_lock, flags);
> +}
> +EXPORT_SYMBOL_GPL(host_ack);
> +
> + /* The request submission function */
> +int virtio_pmem_flush(struct nd_region *nd_region)
> +{
> + int err;
> + unsigned long flags;
> + struct scatterlist *sgs[2], sg, ret;
> + struct virtio_device *vdev = nd_region->provider_data;
> + struct virtio_pmem *vpmem = vdev->priv;
> + struct virtio_pmem_request *req;
> +
> + might_sleep();

[1]

> + req = kmalloc(sizeof(*req), GFP_KERNEL);
> + if (!req)
> + return -ENOMEM;
> +
> + req->done = req->wq_buf_avail = false;
> + strcpy(req->name, "FLUSH");
> + init_waitqueue_head(>host_acked);
> + init_waitqueue_head(>wq_buf);
> + sg_init_one(, req->name, strlen(req->name));
> + sgs[0] = 
> + sg_init_one(, >ret, sizeof(req->ret));
> + sgs[1] = 
> +
> + spin_lock_irqsave(>pmem_lock, flags);
> + err = virtqueue_add_sgs(vpmem->req_vq, sgs, 1, 1, req, GFP_ATOMIC);

Is it okay to use GFP_ATOMIC in a might-sleep ([1]) function?

> + if (err) {
> + dev_err(>dev, "failed to send command to virtio pmem 
> device\n");
> +
> + list_add_tail(>req_list, >list);
> + spin_unlock_irqrestore(>pmem_lock, flags);
> +
> + /* When host has read buffer, this completes via host_ack */
> + wait_event(req->wq_buf, req->wq_buf_avail);
> + spin_lock_irqsave(>pmem_lock, flags);
> + }
> + virtqueue_kick(vpmem->req_vq);

You probably want to check return value here.

> + spin_unlock_irqrestore(>pmem_lock, flags);
> +
> + /* When host has read buffer, this completes via host_ack */
> + wait_event(req->host_acked, req->done);
> + err = req->ret;
> + kfree(req);
> +
> + return err;
> +};
> +EXPORT_SYMBOL_GPL(virtio_pmem_flush);
> +MODULE_LICENSE("GPL");
> diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> index 35897649c24f..9f634a2ed638 100644
> --- a/drivers/virtio/Kconfig
> +++ b/drivers/virtio/Kconfig
> @@ -42,6 +42,16 @@ config VIRTIO_PCI_LEGACY
>  
> If unsure, say Y.
>  
> +config VIRTIO_PMEM
> + tristate "Support for virtio pmem driver"
> + depends on VIRTIO
> + depends on LIBNVDIMM
> + help
> + This driver provides support for virtio based flushing interface
> + for persistent memory range.
> 

Re: [PATCH v4 4/5] ext4: disable map_sync for async flush

2019-04-03 Thread Jan Kara
On Wed 03-04-19 16:10:17, Pankaj Gupta wrote:
> Virtio pmem provides asynchronous host page cache flush
> mechanism. We don't support 'MAP_SYNC' with virtio pmem 
> and ext4. 
> 
> Signed-off-by: Pankaj Gupta 

The patch looks good to me. You can add:

Reviewed-by: Jan Kara 

Honza

> ---
>  fs/ext4/file.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index 69d65d49837b..86e4bf464320 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -360,8 +360,10 @@ static const struct vm_operations_struct 
> ext4_file_vm_ops = {
>  static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
>  {
>   struct inode *inode = file->f_mapping->host;
> + struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> + struct dax_device *dax_dev = sbi->s_daxdev;
>  
> - if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb
> + if (unlikely(ext4_forced_shutdown(sbi)))
>   return -EIO;
>  
>   /*
> @@ -371,6 +373,13 @@ static int ext4_file_mmap(struct file *file, struct 
> vm_area_struct *vma)
>   if (!IS_DAX(file_inode(file)) && (vma->vm_flags & VM_SYNC))
>   return -EOPNOTSUPP;
>  
> + /* We don't support synchronous mappings with DAX files if
> +  * dax_device is not synchronous.
> +  */
> + if (IS_DAX(file_inode(file)) && !dax_synchronous(dax_dev)
> + && (vma->vm_flags & VM_SYNC))
> + return -EOPNOTSUPP;
> +
>   file_accessed(file);
>   if (IS_DAX(file_inode(file))) {
>   vma->vm_ops = _dax_vm_ops;
> -- 
> 2.20.1
> 
-- 
Jan Kara 
SUSE Labs, CR
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH v2 1/1] treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively

2019-04-03 Thread Sakari Ailus
Ping.

On Tue, Mar 26, 2019 at 02:35:10PM +0100, Petr Mladek wrote:
> Linus,
> 
> On Mon 2019-03-25 21:32:28, Sakari Ailus wrote:
> > %pF and %pf are functionally equivalent to %pS and %ps conversion
> > specifiers. The former are deprecated, therefore switch the current users
> > to use the preferred variant.
> > 
> > The changes have been produced by the following command:
> > 
> > git grep -l '%p[fF]' | grep -v '^\(tools\|Documentation\)/' | \
> > while read i; do perl -i -pe 's/%pf/%ps/g; s/%pF/%pS/g;' $i; done
> > 
> > And verifying the result.
> 
> I guess that the best timing for such tree-wide clean up is the end
> of the merge window. Should we wait for 5.2 or is it still acceptable
> to push this for 5.1-rc3?

The patch still cleanly applies to linux-next as wells as Linus's tree.
Some %pf bits have appeared and fixed since (include/trace/events/timer.h);
the fix is in linux-next so once that and this patch are merged, there are
no remaining %pf (or %pF) users left.

The original patch:

https://lore.kernel.org/lkml/tip-6849cbb0f9a8dbc1ba56e9abc6955613103e0...@git.kernel.org/>

The fix:

https://lore.kernel.org/lkml/20190321120921.16463-4-anna-ma...@linutronix.de/T/#u>

> 
> The %pf/%pF modifiers are deprecated since the commit
> 04b8eb7a4ccd9ef93 ("symbol lookup: introduce
> dereference_symbol_descriptor()") in v4.16-rc1.
> 
> Note that include/trace/events/timer.h is handled separately
> in linux-next by
> https://lkml.kernel.org/r/20190321120921.16463-4-anna-ma...@linutronix.de
> 
> Or we could use v1 of this patch, see
> https://lkml.kernel.org/r/20190322132108.25501-2-sakari.ai...@linux.intel.com
> 
> > Signed-off-by: Sakari Ailus 
> > Acked-by: David Sterba  (for btrfs)
> > Acked-by: Mike Rapoport  (for mm/memblock.c)
> > Acked-by: Rafael J. Wysocki 
> 
> Reviewed-by: Petr Mladek 

Thank you!

> 
> Best Regards,
> Petr
> 
> > ---
> > I split this off from the set as there's a change in
> > include/trace/events/timer.h that conflicts with v1 of this patch and
> > should the second patch to be applied without that change, results into
> > invalid use of %pf. To address the matter safely without conflicts or
> > trying to print invalid pointer conversions, this patch and the other one
> > changing include/trace/events/timer.h must be merged before second patch
> > in v1 of this set can go in.
> > 
> > since v1:
> > 
> > - Drop such changes to include/trace/events/timer.h where %pf has already
> >   been converted to %ps in linux-next master.
> > 
> >  arch/alpha/kernel/pci_iommu.c   | 20 ++--
> >  arch/arm/mach-imx/pm-imx6.c |  2 +-
> >  arch/arm/mm/alignment.c |  2 +-
> >  arch/arm/nwfpe/fpmodule.c   |  2 +-
> >  arch/microblaze/mm/pgtable.c|  2 +-
> >  arch/sparc/kernel/ds.c  |  2 +-
> >  arch/um/kernel/sysrq.c  |  2 +-
> >  arch/x86/include/asm/trace/exceptions.h |  2 +-
> >  arch/x86/kernel/irq_64.c|  2 +-
> >  arch/x86/mm/extable.c   |  4 ++--
> >  arch/x86/xen/multicalls.c   |  2 +-
> >  drivers/acpi/device_pm.c|  2 +-
> >  drivers/base/power/main.c   |  6 +++---
> >  drivers/base/syscore.c  | 12 ++--
> >  drivers/block/drbd/drbd_receiver.c  |  2 +-
> >  drivers/block/floppy.c  | 10 +-
> >  drivers/cpufreq/cpufreq.c   |  2 +-
> >  drivers/mmc/core/quirks.h   |  2 +-
> >  drivers/nvdimm/bus.c|  2 +-
> >  drivers/nvdimm/dimm_devs.c  |  2 +-
> >  drivers/pci/pci-driver.c| 14 +++---
> >  drivers/pci/quirks.c|  4 ++--
> >  drivers/pnp/quirks.c|  2 +-
> >  drivers/scsi/esp_scsi.c |  2 +-
> >  fs/btrfs/tests/free-space-tree-tests.c  |  4 ++--
> >  fs/f2fs/f2fs.h  |  2 +-
> >  fs/pstore/inode.c   |  2 +-
> >  include/trace/events/btrfs.h|  2 +-
> >  include/trace/events/cpuhp.h|  4 ++--
> >  include/trace/events/preemptirq.h   |  2 +-
> >  include/trace/events/rcu.h  |  4 ++--
> >  include/trace/events/sunrpc.h   |  2 +-
> >  include/trace/events/vmscan.h   |  4 ++--
> >  include/trace/events/workqueue.h|  4 ++--
> >  include/trace/events/xen.h  |  2 +-
> >  init/main.c |  6 +++---
> >  kernel/async.c  |  4 ++--
> >  kernel/events/uprobes.c |  2 +-
> >  kernel/fail_function.c  |  2 +-
> >  kernel/irq/debugfs.c|  2 +-
> >  kernel/irq/handle.c |  2 +-
> >  kernel/irq/manage.c |  2 +-
> >  kernel/irq/spurious.c   |  4 ++--
> >  kernel/rcu/tree.c   |  2 +-
> >  kernel/stop_machine.c   |  2 +-
> >  

[PATCH v4 5/5] xfs: disable map_sync for async flush

2019-04-03 Thread Pankaj Gupta
Virtio pmem provides asynchronous host page cache flush
mechanism. we don't support 'MAP_SYNC' with virtio pmem 
and xfs.

Signed-off-by: Pankaj Gupta 
---
 fs/xfs/xfs_file.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 1f2e2845eb76..dced2eb8c91a 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1203,6 +1203,14 @@ xfs_file_mmap(
if (!IS_DAX(file_inode(filp)) && (vma->vm_flags & VM_SYNC))
return -EOPNOTSUPP;
 
+   /* We don't support synchronous mappings with DAX files if
+* dax_device is not synchronous.
+*/
+   if (IS_DAX(file_inode(filp)) && !dax_synchronous(
+   xfs_find_daxdev_for_inode(file_inode(filp))) &&
+   (vma->vm_flags & VM_SYNC))
+   return -EOPNOTSUPP;
+
file_accessed(filp);
vma->vm_ops = _file_vm_ops;
if (IS_DAX(file_inode(filp)))
-- 
2.20.1

___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


[PATCH v4 4/5] ext4: disable map_sync for async flush

2019-04-03 Thread Pankaj Gupta
Virtio pmem provides asynchronous host page cache flush
mechanism. We don't support 'MAP_SYNC' with virtio pmem 
and ext4. 

Signed-off-by: Pankaj Gupta 
---
 fs/ext4/file.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 69d65d49837b..86e4bf464320 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -360,8 +360,10 @@ static const struct vm_operations_struct ext4_file_vm_ops 
= {
 static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
 {
struct inode *inode = file->f_mapping->host;
+   struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+   struct dax_device *dax_dev = sbi->s_daxdev;
 
-   if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb
+   if (unlikely(ext4_forced_shutdown(sbi)))
return -EIO;
 
/*
@@ -371,6 +373,13 @@ static int ext4_file_mmap(struct file *file, struct 
vm_area_struct *vma)
if (!IS_DAX(file_inode(file)) && (vma->vm_flags & VM_SYNC))
return -EOPNOTSUPP;
 
+   /* We don't support synchronous mappings with DAX files if
+* dax_device is not synchronous.
+*/
+   if (IS_DAX(file_inode(file)) && !dax_synchronous(dax_dev)
+   && (vma->vm_flags & VM_SYNC))
+   return -EOPNOTSUPP;
+
file_accessed(file);
if (IS_DAX(file_inode(file))) {
vma->vm_ops = _dax_vm_ops;
-- 
2.20.1

___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


[PATCH v4 3/5] libnvdimm: add dax_dev sync flag

2019-04-03 Thread Pankaj Gupta
This patch adds 'DAXDEV_SYNC' flag which is set
for nd_region doing synchronous flush. This later 
is used to disable MAP_SYNC functionality for 
ext4 & xfs filesystem for devices don't support
synchronous flush.

Signed-off-by: Pankaj Gupta 
---
 drivers/dax/bus.c|  2 +-
 drivers/dax/super.c  | 13 -
 drivers/md/dm.c  |  2 +-
 drivers/nvdimm/pmem.c|  3 ++-
 drivers/nvdimm/region_devs.c |  7 +++
 include/linux/dax.h  |  9 +++--
 include/linux/libnvdimm.h|  1 +
 7 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 2109cfe80219..431bf7d2a7f9 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -388,7 +388,7 @@ struct dev_dax *__devm_create_dev_dax(struct dax_region 
*dax_region, int id,
 * No 'host' or dax_operations since there is no access to this
 * device outside of mmap of the resulting character device.
 */
-   dax_dev = alloc_dax(dev_dax, NULL, NULL);
+   dax_dev = alloc_dax(dev_dax, NULL, NULL, true);
if (!dax_dev)
goto err;
 
diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 0a339b85133e..bd6509308d05 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -186,6 +186,8 @@ enum dax_device_flags {
DAXDEV_ALIVE,
/* gate whether dax_flush() calls the low level flush routine */
DAXDEV_WRITE_CACHE,
+   /* flag to check if device supports synchronous flush */
+   DAXDEV_SYNC,
 };
 
 /**
@@ -354,6 +356,12 @@ bool dax_write_cache_enabled(struct dax_device *dax_dev)
 }
 EXPORT_SYMBOL_GPL(dax_write_cache_enabled);
 
+bool dax_synchronous(struct dax_device *dax_dev)
+{
+   return test_bit(DAXDEV_SYNC, _dev->flags);
+}
+EXPORT_SYMBOL_GPL(dax_synchronous);
+
 bool dax_alive(struct dax_device *dax_dev)
 {
lockdep_assert_held(_srcu);
@@ -511,7 +519,7 @@ static void dax_add_host(struct dax_device *dax_dev, const 
char *host)
 }
 
 struct dax_device *alloc_dax(void *private, const char *__host,
-   const struct dax_operations *ops)
+   const struct dax_operations *ops, bool sync)
 {
struct dax_device *dax_dev;
const char *host;
@@ -534,6 +542,9 @@ struct dax_device *alloc_dax(void *private, const char 
*__host,
dax_add_host(dax_dev, host);
dax_dev->ops = ops;
dax_dev->private = private;
+   if (sync)
+   set_bit(DAXDEV_SYNC, _dev->flags);
+
return dax_dev;
 
  err_dev:
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 68d24056d0b1..534e12ca6329 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1965,7 +1965,7 @@ static struct mapped_device *alloc_dev(int minor)
sprintf(md->disk->disk_name, "dm-%d", minor);
 
if (IS_ENABLED(CONFIG_DAX_DRIVER)) {
-   dax_dev = alloc_dax(md, md->disk->disk_name, _dax_ops);
+   dax_dev = alloc_dax(md, md->disk->disk_name, _dax_ops, true);
if (!dax_dev)
goto bad;
}
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 5a5b3ea4d073..78f71ba0e7cf 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -466,7 +466,8 @@ static int pmem_attach_disk(struct device *dev,
nvdimm_badblocks_populate(nd_region, >bb, _res);
disk->bb = >bb;
 
-   dax_dev = alloc_dax(pmem, disk->disk_name, _dax_ops);
+   dax_dev = alloc_dax(pmem, disk->disk_name, _dax_ops,
+   is_nvdimm_sync(nd_region));
 
if (!dax_dev) {
put_disk(disk);
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index fb1041ab32a6..8c7aa047fe2b 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -1231,6 +1231,13 @@ int nvdimm_has_cache(struct nd_region *nd_region)
 }
 EXPORT_SYMBOL_GPL(nvdimm_has_cache);
 
+bool is_nvdimm_sync(struct nd_region *nd_region)
+{
+   return is_nd_pmem(_region->dev) &&
+   !test_bit(ND_REGION_ASYNC, _region->flags);
+}
+EXPORT_SYMBOL_GPL(is_nvdimm_sync);
+
 struct conflict_context {
struct nd_region *nd_region;
resource_size_t start, size;
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 0dd316a74a29..9bdd50d06ef6 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -32,18 +32,19 @@ extern struct attribute_group dax_attribute_group;
 #if IS_ENABLED(CONFIG_DAX)
 struct dax_device *dax_get_by_host(const char *host);
 struct dax_device *alloc_dax(void *private, const char *host,
-   const struct dax_operations *ops);
+   const struct dax_operations *ops, bool sync);
 void put_dax(struct dax_device *dax_dev);
 void kill_dax(struct dax_device *dax_dev);
 void dax_write_cache(struct dax_device *dax_dev, bool wc);
 bool dax_write_cache_enabled(struct dax_device *dax_dev);
+bool dax_synchronous(struct dax_device *dax_dev);
 #else
 static inline struct dax_device 

[PATCH v4 2/5] virtio-pmem: Add virtio pmem driver

2019-04-03 Thread Pankaj Gupta
This patch adds virtio-pmem driver for KVM guest.

Guest reads the persistent memory range information from
Qemu over VIRTIO and registers it on nvdimm_bus. It also
creates a nd_region object with the persistent memory
range information so that existing 'nvdimm/pmem' driver
can reserve this into system memory map. This way
'virtio-pmem' driver uses existing functionality of pmem
driver to register persistent memory compatible for DAX
capable filesystems.

This also provides function to perform guest flush over
VIRTIO from 'pmem' driver when userspace performs flush
on DAX memory range.

Signed-off-by: Pankaj Gupta 
---
 drivers/nvdimm/virtio_pmem.c |  84 +
 drivers/virtio/Kconfig   |  10 +++
 drivers/virtio/Makefile  |   1 +
 drivers/virtio/pmem.c| 125 +++
 include/linux/virtio_pmem.h  |  60 +++
 include/uapi/linux/virtio_ids.h  |   1 +
 include/uapi/linux/virtio_pmem.h |  10 +++
 7 files changed, 291 insertions(+)
 create mode 100644 drivers/nvdimm/virtio_pmem.c
 create mode 100644 drivers/virtio/pmem.c
 create mode 100644 include/linux/virtio_pmem.h
 create mode 100644 include/uapi/linux/virtio_pmem.h

diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
new file mode 100644
index ..2a1b1ba2c1ff
--- /dev/null
+++ b/drivers/nvdimm/virtio_pmem.c
@@ -0,0 +1,84 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * virtio_pmem.c: Virtio pmem Driver
+ *
+ * Discovers persistent memory range information
+ * from host and provides a virtio based flushing
+ * interface.
+ */
+#include 
+#include "nd.h"
+
+ /* The interrupt handler */
+void host_ack(struct virtqueue *vq)
+{
+   unsigned int len;
+   unsigned long flags;
+   struct virtio_pmem_request *req, *req_buf;
+   struct virtio_pmem *vpmem = vq->vdev->priv;
+
+   spin_lock_irqsave(>pmem_lock, flags);
+   while ((req = virtqueue_get_buf(vq, )) != NULL) {
+   req->done = true;
+   wake_up(>host_acked);
+
+   if (!list_empty(>req_list)) {
+   req_buf = list_first_entry(>req_list,
+   struct virtio_pmem_request, list);
+   list_del(>req_list);
+   req_buf->wq_buf_avail = true;
+   wake_up(_buf->wq_buf);
+   }
+   }
+   spin_unlock_irqrestore(>pmem_lock, flags);
+}
+EXPORT_SYMBOL_GPL(host_ack);
+
+ /* The request submission function */
+int virtio_pmem_flush(struct nd_region *nd_region)
+{
+   int err;
+   unsigned long flags;
+   struct scatterlist *sgs[2], sg, ret;
+   struct virtio_device *vdev = nd_region->provider_data;
+   struct virtio_pmem *vpmem = vdev->priv;
+   struct virtio_pmem_request *req;
+
+   might_sleep();
+   req = kmalloc(sizeof(*req), GFP_KERNEL);
+   if (!req)
+   return -ENOMEM;
+
+   req->done = req->wq_buf_avail = false;
+   strcpy(req->name, "FLUSH");
+   init_waitqueue_head(>host_acked);
+   init_waitqueue_head(>wq_buf);
+   sg_init_one(, req->name, strlen(req->name));
+   sgs[0] = 
+   sg_init_one(, >ret, sizeof(req->ret));
+   sgs[1] = 
+
+   spin_lock_irqsave(>pmem_lock, flags);
+   err = virtqueue_add_sgs(vpmem->req_vq, sgs, 1, 1, req, GFP_ATOMIC);
+   if (err) {
+   dev_err(>dev, "failed to send command to virtio pmem 
device\n");
+
+   list_add_tail(>req_list, >list);
+   spin_unlock_irqrestore(>pmem_lock, flags);
+
+   /* When host has read buffer, this completes via host_ack */
+   wait_event(req->wq_buf, req->wq_buf_avail);
+   spin_lock_irqsave(>pmem_lock, flags);
+   }
+   virtqueue_kick(vpmem->req_vq);
+   spin_unlock_irqrestore(>pmem_lock, flags);
+
+   /* When host has read buffer, this completes via host_ack */
+   wait_event(req->host_acked, req->done);
+   err = req->ret;
+   kfree(req);
+
+   return err;
+};
+EXPORT_SYMBOL_GPL(virtio_pmem_flush);
+MODULE_LICENSE("GPL");
diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 35897649c24f..9f634a2ed638 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -42,6 +42,16 @@ config VIRTIO_PCI_LEGACY
 
  If unsure, say Y.
 
+config VIRTIO_PMEM
+   tristate "Support for virtio pmem driver"
+   depends on VIRTIO
+   depends on LIBNVDIMM
+   help
+   This driver provides support for virtio based flushing interface
+   for persistent memory range.
+
+   If unsure, say M.
+
 config VIRTIO_BALLOON
tristate "Virtio balloon driver"
depends on VIRTIO
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index 3a2b5c5dcf46..143ce91eabe9 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -6,3 +6,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
 

[PATCH v4 1/5] ibnvdimm: nd_region flush callback support

2019-04-03 Thread Pankaj Gupta
This patch adds functionality to perform flush from guest
to host over VIRTIO. We are registering a callback based
on 'nd_region' type. virtio_pmem driver requires this special
flush function. For rest of the region types we are registering
existing flush function. Report error returned by host fsync
failure to userspace.

This also handles asynchronous flush requests from the block layer
by creating a child bio and chaining it with parent bio.

Signed-off-by: Pankaj Gupta 
---
 drivers/acpi/nfit/core.c |  4 ++--
 drivers/nvdimm/claim.c   |  6 --
 drivers/nvdimm/nd.h  |  1 +
 drivers/nvdimm/pmem.c| 14 -
 drivers/nvdimm/region_devs.c | 38 ++--
 include/linux/libnvdimm.h|  8 +++-
 6 files changed, 59 insertions(+), 12 deletions(-)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 5a389a4f4f65..567017a2190e 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -2434,7 +2434,7 @@ static void write_blk_ctl(struct nfit_blk *nfit_blk, 
unsigned int bw,
offset = to_interleave_offset(offset, mmio);
 
writeq(cmd, mmio->addr.base + offset);
-   nvdimm_flush(nfit_blk->nd_region);
+   nvdimm_flush(nfit_blk->nd_region, NULL, false);
 
if (nfit_blk->dimm_flags & NFIT_BLK_DCR_LATCH)
readq(mmio->addr.base + offset);
@@ -2483,7 +2483,7 @@ static int acpi_nfit_blk_single_io(struct nfit_blk 
*nfit_blk,
}
 
if (rw)
-   nvdimm_flush(nfit_blk->nd_region);
+   nvdimm_flush(nfit_blk->nd_region, NULL, false);
 
rc = read_blk_stat(nfit_blk, lane) ? -EIO : 0;
return rc;
diff --git a/drivers/nvdimm/claim.c b/drivers/nvdimm/claim.c
index fb667bf469c7..a1dfa066786b 100644
--- a/drivers/nvdimm/claim.c
+++ b/drivers/nvdimm/claim.c
@@ -263,7 +263,7 @@ static int nsio_rw_bytes(struct nd_namespace_common *ndns,
struct nd_namespace_io *nsio = to_nd_namespace_io(>dev);
unsigned int sz_align = ALIGN(size + (offset & (512 - 1)), 512);
sector_t sector = offset >> 9;
-   int rc = 0;
+   int rc = 0, ret = 0;
 
if (unlikely(!size))
return 0;
@@ -301,7 +301,9 @@ static int nsio_rw_bytes(struct nd_namespace_common *ndns,
}
 
memcpy_flushcache(nsio->addr + offset, buf, size);
-   nvdimm_flush(to_nd_region(ndns->dev.parent));
+   ret = nvdimm_flush(to_nd_region(ndns->dev.parent), NULL, false);
+   if (ret)
+   rc = ret;
 
return rc;
 }
diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
index a5ac3b240293..916cd6c5451a 100644
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -159,6 +159,7 @@ struct nd_region {
struct badblocks bb;
struct nd_interleave_set *nd_set;
struct nd_percpu_lane __percpu *lane;
+   int (*flush)(struct nd_region *nd_region);
struct nd_mapping mapping[0];
 };
 
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index bc2f700feef8..5a5b3ea4d073 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -192,6 +192,7 @@ static blk_status_t pmem_do_bvec(struct pmem_device *pmem, 
struct page *page,
 
 static blk_qc_t pmem_make_request(struct request_queue *q, struct bio *bio)
 {
+   int ret = 0;
blk_status_t rc = 0;
bool do_acct;
unsigned long start;
@@ -201,7 +202,7 @@ static blk_qc_t pmem_make_request(struct request_queue *q, 
struct bio *bio)
struct nd_region *nd_region = to_region(pmem);
 
if (bio->bi_opf & REQ_PREFLUSH)
-   nvdimm_flush(nd_region);
+   ret = nvdimm_flush(nd_region, bio, true);
 
do_acct = nd_iostat_start(bio, );
bio_for_each_segment(bvec, bio, iter) {
@@ -216,7 +217,10 @@ static blk_qc_t pmem_make_request(struct request_queue *q, 
struct bio *bio)
nd_iostat_end(bio, start);
 
if (bio->bi_opf & REQ_FUA)
-   nvdimm_flush(nd_region);
+   ret = nvdimm_flush(nd_region, bio, true);
+
+   if (ret)
+   bio->bi_status = errno_to_blk_status(ret);
 
bio_endio(bio);
return BLK_QC_T_NONE;
@@ -463,13 +467,13 @@ static int pmem_attach_disk(struct device *dev,
disk->bb = >bb;
 
dax_dev = alloc_dax(pmem, disk->disk_name, _dax_ops);
+
if (!dax_dev) {
put_disk(disk);
return -ENOMEM;
}
dax_write_cache(dax_dev, nvdimm_has_cache(nd_region));
pmem->dax_dev = dax_dev;
-
gendev = disk_to_dev(disk);
gendev->groups = pmem_attribute_groups;
 
@@ -527,14 +531,14 @@ static int nd_pmem_remove(struct device *dev)
sysfs_put(pmem->bb_state);
pmem->bb_state = NULL;
}
-   nvdimm_flush(to_nd_region(dev->parent));
+   nvdimm_flush(to_nd_region(dev->parent), NULL, false);
 
return 0;
 }
 
 static void nd_pmem_shutdown(struct device *dev)
 {
-   

[PATCH v4 0/5] virtio pmem driver

2019-04-03 Thread Pankaj Gupta
 This patch series has implementation for "virtio pmem". 
 "virtio pmem" is fake persistent memory(nvdimm) in guest 
 which allows to bypass the guest page cache. This also
 implements a VIRTIO based asynchronous flush mechanism.  
 
 Sharing guest kernel driver in this patchset with the 
 changes suggested in v3. Tested with Qemu side device 
 emulation [6] for virtio-pmem. 

 We have incorporated all the suggestions in V3. Documented 
 the impact of possible page cache side channel attacks with 
 suggested countermeasures.
 
 Details of project idea for 'virtio pmem' flushing interface 
 is shared [3] & [4].

 Implementation is divided into two parts:
 New virtio pmem guest driver and qemu code changes for new 
 virtio pmem paravirtualized device.

1. Guest virtio-pmem kernel driver
-
   - Reads persistent memory range from paravirt device and 
 registers with 'nvdimm_bus'.  
   - 'nvdimm/pmem' driver uses this information to allocate 
 persistent memory region and setup filesystem operations 
 to the allocated memory. 
   - virtio pmem driver implements asynchronous flushing 
 interface to flush from guest to host.

2. Qemu virtio-pmem device
-
   - Creates virtio pmem device and exposes a memory range to 
 KVM guest. 
   - At host side this is file backed memory which acts as 
 persistent memory. 
   - Qemu side flush uses aio thread pool API's and virtio 
 for asynchronous guest multi request handling. 

   David Hildenbrand CCed also posted a modified version[7] of 
   qemu virtio-pmem code based on updated Qemu memory device API. 

 Virtio-pmem security implications and countermeasures:
 -

 In previous posting of kernel driver, there was discussion [9]
 on possible implications of page cache side channel attacks with 
 virtio pmem. After thorough analysis of details of known side 
 channel attacks, below are the suggestions:

 - Depends entirely on how host backing image file is mapped 
   into guest address space. 

 - virtio-pmem device emulation, by default shared mapping is used
   to map host backing file. It is recommended to use separate
   backing file at host side for every guest. This will prevent
   any possibility of executing common code from multiple guests
   and any chance of inferring guest local data based based on 
   execution time.

 - If backing file is required to be shared among multiple guests 
   it is recommended to don't support host page cache eviction 
   commands from the guest driver. This will avoid any possibility
   of inferring guest local data or host data from another guest. 

 - Proposed device specification [8] for virtio-pmem device with 
   details of possible security implications and suggested 
   countermeasures for device emulation.

 Virtio-pmem errors handling:
 
  Checked behaviour of virtio-pmem for below types of errors
  Need suggestions on expected behaviour for handling these errors?

  - Hardware Errors: Uncorrectable recoverable Errors: 
  a] virtio-pmem: 
- As per current logic if error page belongs to Qemu process, 
  host MCE handler isolates(hwpoison) that page and send SIGBUS. 
  Qemu SIGBUS handler injects exception to KVM guest. 
- KVM guest then isolates the page and send SIGBUS to guest 
  userspace process which has mapped the page. 
  
  b] Existing implementation for ACPI pmem driver: 
- Handles such errors with MCE notifier and creates a list 
  of bad blocks. Read/direct access DAX operation return EIO 
  if accessed memory page fall in bad block list.
- It also starts backgound scrubbing.  
- Similar functionality can be reused in virtio-pmem with MCE 
  notifier but without scrubbing(no ACPI/ARS)? Need inputs to 
  confirm if this behaviour is ok or needs any change?

Changes from PATCH v3: [1]

- Use generic dax_synchronous() helper to check for DAXDEV_SYNC 
  flag - [Dan, Darrick, Jan]
- Add 'is_nvdimm_async' function
- Document page cache side channel attacks implications & 
  countermeasures - [Dave Chinner, Michael]

Changes from PATCH v2: [2]
- Disable MAP_SYNC for ext4 & XFS filesystems - [Dan] 
- Use name 'virtio pmem' in place of 'fake dax' 

Changes from PATCH v1: 
- 0-day build test for build dependency on libnvdimm 

 Changes suggested by - [Dan Williams]
- Split the driver into two parts virtio & pmem  
- Move queuing of async block request to block layer
- Add "sync" parameter in nvdimm_flush function
- Use indirect call for nvdimm_flush
- Don’t move declarations to common global header e.g nd.h
- nvdimm_flush() return 0 or -EIO if it fails
- Teach nsio_rw_bytes() that the flush can fail
- Rename nvdimm_flush() to generic_nvdimm_flush()
- Use 'nd_region->provider_data' for long dereferencing
- Remove virtio_pmem_freeze/restore functions
- Remove BSD license text with SPDX license text


朱经理:会不会补征?会不会有罚款及滞纳金?

2019-04-03 Thread 朱经理
 请 查 收 附 件 内 容 


 原邮件信息 -
发件人:朱经理
收件人:linux-nvdimm 
发送时间:2019-4-3  17:13:08
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm