failure notice

2018-05-17 Thread MAILER-DAEMON
Hi. This is the qmail-send program at mail01.weave-sv.jp.
I'm afraid I wasn't able to deliver your message to the following addresses.
This is a permanent error; I've given up. Sorry it didn't work out.

:
Sorry, no mailbox here by that name. (#5.1.1)

--- Below this line is a copy of the message.

Return-Path: 
Received: (qmail 6010 invoked from network); 18 May 2018 08:16:57 +0900
Received: from unknown (HELO 187-162-140-207.static.axtel.net) (187.162.140.207)
  by mail01.weave-sv.jp with SMTP; 18 May 2018 08:16:57 +0900
Message-ID: 
From: 
To: 
Subject: The world is full of love, I want to be a part of it!
Date: 17 May 2018 12:02:47 -0600
MIME-Version: 1.0
Content-Type: text/plain;
charset="cp-850"
Content-Transfer-Encoding: 8bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.5931
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5994

Dear,

It's a beautiful day and I'm in a hurry to get in touch with you asap!

My name is Sevgi, and I'm from Turkey. 
I really do believe in a destiny with a bright future for myself, and that you 
could become a part of it and become my true soulmate. 
I do want to be next to a loving man. 

I love traveling, movies, pop music, seafood, and doing crazy things, but I 
sometimes feel like loneliness is swallowing me. 
I wish to find my second half, a man who will give me real hope and 
true love! 

Hope you're interested in becoming a part of my adventure and will reply back 
soon.
In the next letter, I'll send you my photo.

Please write me back using my personal email: dentistsevg...@aol.com


Your true soul, 
Sevgi.



RE: [qemu PATCH v2 3/4] nvdimm, acpi: support NFIT platform capabilities

2018-05-17 Thread Elliott, Robert (Persistent Memory)


> -----Original Message-----
> From: Linux-nvdimm [mailto:linux-nvdimm-boun...@lists.01.org] On Behalf Of
> Ross Zwisler
> Sent: Thursday, May 17, 2018 12:00 AM
> Subject: [qemu PATCH v2 3/4] nvdimm, acpi: support NFIT platform
> capabilities
> 
> Add a machine command line option to allow the user to control the
> Platform
> Capabilities Structure in the virtualized NFIT.  This Platform
> Capabilities
> Structure was added in ACPI 6.2 Errata A.
> 
...
> +Platform Capabilities
> +---------------------
> +
> +ACPI 6.2 Errata A added support for a new Platform Capabilities Structure
> +which allows the platform to communicate what features it supports
> related to
> +NVDIMM data durability.  Users can provide a capabilities value to a
> guest via
> +the optional "nvdimm-cap" machine command line option:
> +
> +-machine pc,accel=kvm,nvdimm,nvdimm-cap=2
> +
> +As of ACPI 6.2 Errata A, the following values are valid for the bottom
> two
> +bits:
> +
> +2 - Memory Controller Flush to NVDIMM Durability on Power Loss Capable.
> +3 - CPU Cache Flush to NVDIMM Durability on Power Loss Capable.

It's a bit unclear that those are decimal values for the whole field rather 
than bit numbers.
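
In other words (my reading of ACPI 6.2 Errata A, offered as a hedged
illustration rather than part of the patch): the documented values are the
whole field in decimal, so nvdimm-cap=2 is binary 10 (bit 1 only) and
nvdimm-cap=3 is binary 11 (bits 0 and 1):

#include <stdint.h>

#define CAP_CPU_CACHE_FLUSH  (1u << 0)  /* CPU Cache Flush ... Capable */
#define CAP_MEM_CTRL_FLUSH   (1u << 1)  /* Memory Controller Flush ... Capable */

static inline int has_cap(uint32_t nvdimm_cap, uint32_t cap_bit)
{
        /* e.g. has_cap(2, CAP_MEM_CTRL_FLUSH) == 1,
         *      has_cap(2, CAP_CPU_CACHE_FLUSH) == 0 */
        return (nvdimm_cap & cap_bit) != 0;
}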

...
> -static GArray *nvdimm_build_device_structure(void)
> +/*
> + * ACPI 6.2 Errata A: 5.2.25.9 NVDIMM Platform Capabilities Structure
> + */
> +static void
> +nvdimm_build_structure_caps(GArray *structures, uint32_t capabilities)
> +{
> +NvdimmNfitPlatformCaps *nfit_caps;
> +
> +nfit_caps = acpi_data_push(structures, sizeof(*nfit_caps));
> +
> +nfit_caps->type = cpu_to_le16(7 /* NVDIMM Platform Capabilities */);
> +nfit_caps->length = cpu_to_le16(sizeof(*nfit_caps));
> +nfit_caps->highest_cap = 2;
> +nfit_caps->capabilities = cpu_to_le32(capabilities);

highest_cap needs to be set to a value that covers at least the highest 
bit set to 1 in capabilities.

As capabilities bits are added, there are three different meanings:
* 1: bit within highest_cap range, platform is claiming the 1 meaning
* 0: bit within highest_cap range, platform is claiming the 0 meaning
* not reported: bit not within highest_cap range, so the platform's
  implementation of this feature is unknown. Not necessarily the same 
  as the 0 meaning.

So, there should be a way to specify a highest_cap value to convey that
some of the upper capabilities bits are valid and contain 0.
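
One possible shape for that, as an illustrative sketch only (the helper and
any new option name are hypothetical, not in the posted patch): derive a
floor for highest_cap from the mask, and let an explicit knob raise it to
mark upper bits as valid-but-zero.

#include <stdint.h>

/* Hypothetical helper: the smallest highest_cap value that still covers
 * every bit set to 1 in the user-supplied capabilities mask. */
static uint8_t nvdimm_min_highest_cap(uint32_t capabilities)
{
        uint8_t highest = 0;
        int bit;

        for (bit = 31; bit >= 0; bit--) {
                if (capabilities & (1u << bit)) {
                        highest = (uint8_t)bit;
                        break;
                }
        }
        return highest;
}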

---
Robert Elliott, HPE Persistent Memory




Re: [PATCH] dax: Fix use of zero page

2018-05-17 Thread Ross Zwisler
On Thu, May 17, 2018 at 12:57:39PM -0700, Matthew Wilcox wrote:
> On Thu, May 17, 2018 at 01:29:10PM -0600, Ross Zwisler wrote:
> > On Thu, May 17, 2018 at 01:24:00PM -0600, Ross Zwisler wrote:
> > > On Thu, May 17, 2018 at 11:37:11AM -0700, Matthew Wilcox wrote:
> > > > 
> > > > I plucked this patch from my XArray work.  It seems self-contained 
> > > > enough
> > > > that it could go into the DAX tree for merging this cycle.
> > > > 
> > > > From 8cb56f4ba36af38814ca7b8ba030a66384e59a21 Mon Sep 17 00:00:00 2001
> > > > From: Matthew Wilcox 
> > > > Date: Thu, 29 Mar 2018 22:41:18 -0400
> > > > Subject: [PATCH] dax: Fix use of zero page
> > > > 
> > > > Use my_zero_pfn instead of ZERO_PAGE, and pass the vaddr to it so it
> > > > works on MIPS and s390.
> > > > 
> > > > Signed-off-by: Matthew Wilcox 
> > > 
> > > Yep, this looks fine.
> > > 
> > > Reviewed-by: Ross Zwisler 
> > 
> > Huh, actually, it looks like this relies on patch 01/63 of your full Xarray
> > series where you s/RADIX_DAX_ZERO_PAGE/DAX_ZERO_PAGE/g.
> > 
> > Ditto for the 2nd patch you sent today.
> 
> Argh, thanks.  I can respin them against linux-next if you like.

Yep, that would be awesome, thanks.


Re: [PATCH] dax: Fix use of zero page

2018-05-17 Thread Ross Zwisler
On Thu, May 17, 2018 at 01:03:48PM -0700, Dan Williams wrote:
> On Thu, May 17, 2018 at 12:56 PM, Matthew Wilcox  wrote:
> > On Thu, May 17, 2018 at 12:32:07PM -0700, Dan Williams wrote:
> >> On Thu, May 17, 2018 at 11:37 AM, Matthew Wilcox  
> >> wrote:
> >> >
> >> > I plucked this patch from my XArray work.  It seems self-contained enough
> >> > that it could go into the DAX tree for merging this cycle.
> >> >
> >> > From 8cb56f4ba36af38814ca7b8ba030a66384e59a21 Mon Sep 17 00:00:00 2001
> >> > From: Matthew Wilcox 
> >> > Date: Thu, 29 Mar 2018 22:41:18 -0400
> >> > Subject: [PATCH] dax: Fix use of zero page
> >> >
> >> > Use my_zero_pfn instead of ZERO_PAGE, and pass the vaddr to it so it
> >> > works on MIPS and s390.
> >> >
> >> > Signed-off-by: Matthew Wilcox 
> >>
> >> I'm being thick and / or lazy, what's the user visible effect of this fix?
> >
> > For s390 it appears to be a performance issue:
> >
> > Author: Martin Schwidefsky 
> > Date:   Mon Oct 25 16:10:07 2010 +0200
> >
> > [S390] zero page cache synonyms
> >
> > If the zero page is mapped to virtual user space addresses that differ
> > only in bit 2^12 or 2^13 we get L1 cache synonyms which can affect
> > performance. Follow the mips model and use multiple zero pages to avoid
> > the synonyms.
> >
> > MIPS' use of multiple ZERO_PAGEs predates git history.  Given the
> > history of MIPS' caches behaving in incredibly weird ways, I'd assume
> > that getting this wrong results in miniature black holes forming and/or
> > the CPU calculating the largest prime number.
> 
> Unless I am missing something I think this sounds like 4.18-rc1
> material with a cc: stable. Last I heard no one is really using
> dcssblk + dax, and MIPS has no way to describe pmem outside of
> memmap= which is only a development tool.

Yea, I agree that this is v4.18 material.


[PATCH v10] mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS

2018-05-17 Thread Dan Williams
In preparation for fixing dax-dma-vs-unmap issues, filesystems need to
be able to rely on the fact that they will get wakeups on dev_pagemap
page-idle events. Introduce MEMORY_DEVICE_FS_DAX and
generic_dax_page_free() as common indicator / infrastructure for dax
filesystems to require. With this change there are no users of the
MEMORY_DEVICE_HOST designation, so remove it.

The HMM sub-system extended dev_pagemap to arrange a callback when a
dev_pagemap managed page is freed. Since a dev_pagemap page is free /
idle when its reference count is 1, it requires an additional branch to
check the page type at put_page() time. Given put_page() is a hot path,
we do not want to incur that check if HMM is not in use, so a static
branch is used to avoid that overhead when not necessary.

Now, the FS_DAX implementation wants to reuse this mechanism for
receiving dev_pagemap ->page_free() callbacks. Rework the HMM-specific
static-key into a generic mechanism that either HMM or FS_DAX code paths
can enable.
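
A rough sketch of the static-branch pattern being described (names are
illustrative, this is not the code in the patch):

#include <linux/jump_label.h>
#include <linux/mm.h>

/* Off by default; HMM or FS_DAX flips it on when it registers a
 * dev_pagemap that wants ->page_free() notifications. */
DEFINE_STATIC_KEY_FALSE(devmap_managed_key);

static void enable_devmap_managed_sketch(void)
{
        static_branch_inc(&devmap_managed_key);
}

/* put_page()-style hot path: the page-type test is only reached once the
 * key is on, so unrelated workloads pay just a patched-out branch. */
static inline void put_page_sketch(struct page *page)
{
        if (static_branch_unlikely(&devmap_managed_key) &&
            is_zone_device_page(page)) {
                /* route the final reference drop to the pagemap owner */
                return;
        }
        put_page(page);
}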

For ARCH=um builds, and any other arch that lacks ZONE_DEVICE support,
care must be taken to compile out the DEV_PAGEMAP_OPS infrastructure.
However, we still need to support FS_DAX in the FS_DAX_LIMITED case
implemented by the s390/dcssblk driver.

Cc: Martin Schwidefsky 
Cc: Heiko Carstens 
Cc: Michal Hocko 
Reported-by: kbuild test robot 
Reported-by: Thomas Meyer 
Reported-by: Dave Jiang 
Cc: Christoph Hellwig 
Cc: "Jérôme Glisse" 
Cc: Jan Kara 
Signed-off-by: Dan Williams 
---

This patch replaces and consolidates patches 2 [1] and 4 [2] from the v9
series [3] for "dax: fix dma vs truncate/hole-punch".

The original implementation, which introduced fs_dax_claim(), was broken
in the presence of partitions, as filesystems on partitions of a pmem
device would collide when attempting to issue fs_dax_claim().

Instead, since this new page wakeup behavior is a property of
dev_pagemap pages and there is a 1:1 relationship between a pmem device
and its dev_pagemap instance, make the pmem driver own the page wakeup
initialization rather than the filesystem.
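
For illustration only (the helper is hypothetical; the field is the one in
struct dev_pagemap): the pmem driver tags the single pagemap it owns at
setup time, instead of a filesystem claiming it later.

#include <linux/memremap.h>

static void pmem_tag_pagemap_sketch(struct dev_pagemap *pgmap)
{
        /* one dev_pagemap per pmem device, so ownership is unambiguous */
        pgmap->type = MEMORY_DEVICE_FS_DAX;
}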

This simplifies the implementation considerably. The diffstat for the
series is now:

21 files changed, 546 insertions(+), 277 deletions(-)

...down from:

24 files changed, 730 insertions(+), 297 deletions(-)

The other patches in the series are not included since they did not
change in any meaningful way. Let me know if anyone wants a full resend;
it will otherwise be available in -next shortly. Given the change in
approach I did not carry Reviewed-by tags from patch 2 and 4 to this
patch.

[1]: [PATCH v9 2/9] mm, dax: enable filesystems to trigger dev_pagemap 
->page_free callbacks
https://lists.01.org/pipermail/linux-nvdimm/2018-April/015459.html

[2]: [PATCH v9 4/9] mm, dev_pagemap: introduce CONFIG_DEV_PAGEMAP_OPS
https://lists.01.org/pipermail/linux-nvdimm/2018-April/015461.html

[3]: [PATCH v9 0/9] dax: fix dma vs truncate/hole-punch
https://lists.01.org/pipermail/linux-nvdimm/2018-April/015457.html


 drivers/dax/super.c   |   18 ---
 drivers/nvdimm/pfn_devs.c |2 -
 drivers/nvdimm/pmem.c |   20 +
 fs/Kconfig|1 +
 include/linux/memremap.h  |   41 ++
 include/linux/mm.h|   71 ++---
 kernel/memremap.c |   37 +--
 mm/Kconfig|5 +++
 mm/hmm.c  |   13 +---
 mm/swap.c |3 +-
 10 files changed, 143 insertions(+), 68 deletions(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 2b2332b605e4..4928b7fcfb71 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -134,15 +134,21 @@ int __bdev_dax_supported(struct super_block *sb, int 
blocksize)
 * on being able to do (page_address(pfn_to_page())).
 */
WARN_ON(IS_ENABLED(CONFIG_ARCH_HAS_PMEM_API));
+   return 0;
} else if (pfn_t_devmap(pfn)) {
-   /* pass */;
-   } else {
-   pr_debug("VFS (%s): error: dax support not enabled\n",
-   sb->s_id);
-   return -EOPNOTSUPP;
+   struct dev_pagemap *pgmap;
+
+   pgmap = get_dev_pagemap(pfn_t_to_pfn(pfn), NULL);
+   if (pgmap && pgmap->type == MEMORY_DEVICE_FS_DAX) {
+   put_dev_pagemap(pgmap);
+   return 0;
+   }
+   put_dev_pagemap(pgmap);
}
 
-   return 0;
+   pr_debug("VFS (%s): error: dax support not enabled\n",
+   sb->s_id);
+   return -EOPNOTSUPP;
 }
 EXPORT_SYMBOL_GPL(__bdev_dax_supported);
 #endif
diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index 

Re: [PATCH] dax: Fix use of zero page

2018-05-17 Thread Dan Williams
On Thu, May 17, 2018 at 12:56 PM, Matthew Wilcox  wrote:
> On Thu, May 17, 2018 at 12:32:07PM -0700, Dan Williams wrote:
>> On Thu, May 17, 2018 at 11:37 AM, Matthew Wilcox  wrote:
>> >
>> > I plucked this patch from my XArray work.  It seems self-contained enough
>> > that it could go into the DAX tree for merging this cycle.
>> >
>> > From 8cb56f4ba36af38814ca7b8ba030a66384e59a21 Mon Sep 17 00:00:00 2001
>> > From: Matthew Wilcox 
>> > Date: Thu, 29 Mar 2018 22:41:18 -0400
>> > Subject: [PATCH] dax: Fix use of zero page
>> >
>> > Use my_zero_pfn instead of ZERO_PAGE, and pass the vaddr to it so it
>> > works on MIPS and s390.
>> >
>> > Signed-off-by: Matthew Wilcox 
>>
>> I'm being thick and / or lazy, what's the user visible effect of this fix?
>
> For s390 it appears to be a performance issue:
>
> Author: Martin Schwidefsky 
> Date:   Mon Oct 25 16:10:07 2010 +0200
>
> [S390] zero page cache synonyms
>
> If the zero page is mapped to virtual user space addresses that differ
> only in bit 2^12 or 2^13 we get L1 cache synonyms which can affect
> performance. Follow the mips model and use multiple zero pages to avoid
> the synonyms.
>
> MIPS' use of multiple ZERO_PAGEs predates git history.  Given the
> history of MIPS' caches behaving in incredibly weird ways, I'd assume
> that getting this wrong results in miniature black holes forming and/or
> the CPU calculating the largest prime number.

Unless I am missing something I think this sounds like 4.18-rc1
material with a cc: stable. Last I heard no one is really using
dcssblk + dax, and MIPS has no way to describe pmem outside of
memmap= which is only a development tool.


Re: [PATCH] dax: Fix use of zero page

2018-05-17 Thread Matthew Wilcox
On Thu, May 17, 2018 at 01:29:10PM -0600, Ross Zwisler wrote:
> On Thu, May 17, 2018 at 01:24:00PM -0600, Ross Zwisler wrote:
> > On Thu, May 17, 2018 at 11:37:11AM -0700, Matthew Wilcox wrote:
> > > 
> > > I plucked this patch from my XArray work.  It seems self-contained enough
> > > that it could go into the DAX tree for merging this cycle.
> > > 
> > > From 8cb56f4ba36af38814ca7b8ba030a66384e59a21 Mon Sep 17 00:00:00 2001
> > > From: Matthew Wilcox 
> > > Date: Thu, 29 Mar 2018 22:41:18 -0400
> > > Subject: [PATCH] dax: Fix use of zero page
> > > 
> > > Use my_zero_pfn instead of ZERO_PAGE, and pass the vaddr to it so it
> > > works on MIPS and s390.
> > > 
> > > Signed-off-by: Matthew Wilcox 
> > 
> > Yep, this looks fine.
> > 
> > Reviewed-by: Ross Zwisler 
> 
> Huh, actually, it looks like this relies on patch 01/63 of your full Xarray
> series where you s/RADIX_DAX_ZERO_PAGE/DAX_ZERO_PAGE/g.
> 
> Ditto for the 2nd patch you sent today.

Argh, thanks.  I can respin them against linux-next if you like.


Re: [PATCH] dax: Fix use of zero page

2018-05-17 Thread Matthew Wilcox
On Thu, May 17, 2018 at 12:32:07PM -0700, Dan Williams wrote:
> On Thu, May 17, 2018 at 11:37 AM, Matthew Wilcox  wrote:
> >
> > I plucked this patch from my XArray work.  It seems self-contained enough
> > that it could go into the DAX tree for merging this cycle.
> >
> > From 8cb56f4ba36af38814ca7b8ba030a66384e59a21 Mon Sep 17 00:00:00 2001
> > From: Matthew Wilcox 
> > Date: Thu, 29 Mar 2018 22:41:18 -0400
> > Subject: [PATCH] dax: Fix use of zero page
> >
> > Use my_zero_pfn instead of ZERO_PAGE, and pass the vaddr to it so it
> > works on MIPS and s390.
> >
> > Signed-off-by: Matthew Wilcox 
> 
> I'm being thick and / or lazy, what's the user visible effect of this fix?

For s390 it appears to be a performance issue:

Author: Martin Schwidefsky 
Date:   Mon Oct 25 16:10:07 2010 +0200

[S390] zero page cache synonyms

If the zero page is mapped to virtual user space addresses that differ
only in bit 2^12 or 2^13 we get L1 cache synonyms which can affect
performance. Follow the mips model and use multiple zero pages to avoid
the synonyms.

MIPS' use of multiple ZERO_PAGEs predates git history.  Given the
history of MIPS' caches behaving in incredibly weird ways, I'd assume
that getting this wrong results in miniature black holes forming and/or
the CPU calculating the largest prime number.
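
Roughly how the colored zero pages work, as a sketch rather than the real
MIPS/s390 code: several zero pages exist and the one whose cache color
matches the faulting virtual address is chosen, which is why my_zero_pfn()
needs the vaddr while ZERO_PAGE(0) always hands back color 0.

#define ZP_PAGE_SHIFT 12
#define ZP_COLORS     4                     /* illustrative */

static unsigned long zero_pfns[ZP_COLORS];  /* filled at boot, illustrative */

/* Pick the zero page whose low address bits match the user mapping, so
 * the 2^12/2^13 synonyms described above cannot occur. */
static unsigned long colored_zero_pfn(unsigned long vaddr)
{
        return zero_pfns[(vaddr >> ZP_PAGE_SHIFT) & (ZP_COLORS - 1)];
}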


Re: [PATCH] dax: Fix use of zero page

2018-05-17 Thread Dan Williams
On Thu, May 17, 2018 at 11:37 AM, Matthew Wilcox  wrote:
>
> I plucked this patch from my XArray work.  It seems self-contained enough
> that it could go into the DAX tree for merging this cycle.
>
> From 8cb56f4ba36af38814ca7b8ba030a66384e59a21 Mon Sep 17 00:00:00 2001
> From: Matthew Wilcox 
> Date: Thu, 29 Mar 2018 22:41:18 -0400
> Subject: [PATCH] dax: Fix use of zero page
>
> Use my_zero_pfn instead of ZERO_PAGE, and pass the vaddr to it so it
> works on MIPS and s390.
>
> Signed-off-by: Matthew Wilcox 

I'm being thick and / or lazy, what's the user visible effect of this fix?


Re: [PATCH] dax: Fix use of zero page

2018-05-17 Thread Ross Zwisler
On Thu, May 17, 2018 at 01:24:00PM -0600, Ross Zwisler wrote:
> On Thu, May 17, 2018 at 11:37:11AM -0700, Matthew Wilcox wrote:
> > 
> > I plucked this patch from my XArray work.  It seems self-contained enough
> > that it could go into the DAX tree for merging this cycle.
> > 
> > From 8cb56f4ba36af38814ca7b8ba030a66384e59a21 Mon Sep 17 00:00:00 2001
> > From: Matthew Wilcox 
> > Date: Thu, 29 Mar 2018 22:41:18 -0400
> > Subject: [PATCH] dax: Fix use of zero page
> > 
> > Use my_zero_pfn instead of ZERO_PAGE, and pass the vaddr to it so it
> > works on MIPS and s390.
> > 
> > Signed-off-by: Matthew Wilcox 
> 
> Yep, this looks fine.
> 
> Reviewed-by: Ross Zwisler 

Huh, actually, it looks like this relies on patch 01/63 of your full Xarray
series where you s/RADIX_DAX_ZERO_PAGE/DAX_ZERO_PAGE/g.

Ditto for the 2nd patch you sent today.


Re: [PATCH] dax: Fix use of zero page

2018-05-17 Thread Ross Zwisler
On Thu, May 17, 2018 at 11:37:11AM -0700, Matthew Wilcox wrote:
> 
> I plucked this patch from my XArray work.  It seems self-contained enough
> that it could go into the DAX tree for merging this cycle.
> 
> From 8cb56f4ba36af38814ca7b8ba030a66384e59a21 Mon Sep 17 00:00:00 2001
> From: Matthew Wilcox 
> Date: Thu, 29 Mar 2018 22:41:18 -0400
> Subject: [PATCH] dax: Fix use of zero page
> 
> Use my_zero_pfn instead of ZERO_PAGE, and pass the vaddr to it so it
> works on MIPS and s390.
> 
> Signed-off-by: Matthew Wilcox 

Yep, this looks fine.

Reviewed-by: Ross Zwisler 


[PATCH] dax: dax_insert_mapping_entry always succeeds

2018-05-17 Thread Matthew Wilcox

Another bugfix from the XArray work; please queue for merge.

From 4a53cab1968d2a1022f35d00b29519970ef624e9 Mon Sep 17 00:00:00 2001
From: Matthew Wilcox 
Date: Thu, 29 Mar 2018 22:47:50 -0400
Subject: [PATCH] dax: dax_insert_mapping_entry always succeeds

It does not return an error, so we don't need to check the return value
for IS_ERR().  Indeed, it is a bug to do so; with a sufficiently large
PFN, a legitimate DAX entry may be mistaken for an error return.
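
To make the aliasing concrete, a small userspace illustration of the
kernel's IS_ERR_VALUE() rule (MAX_ERRNO is 4095; the flag shift is
illustrative):

#include <stdio.h>

#define MAX_ERRNO 4095UL
#define FLAG_BITS 6UL   /* illustrative: low bits reserved for entry flags */

int main(void)
{
        unsigned long err_floor = (unsigned long)-MAX_ERRNO;

        /* Anything at or above err_floor "looks like" an error pointer... */
        printf("values >= 0x%lx read as errors\n", err_floor);
        /* ...so a pfn at or above this threshold would be misread once
         * shifted into an entry, even though the entry is legitimate. */
        printf("pfn >= 0x%lx would be mistaken for an error\n",
               err_floor >> FLAG_BITS);
        return 0;
}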

Signed-off-by: Matthew Wilcox 
---
 fs/dax.c | 18 ++
 1 file changed, 2 insertions(+), 16 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index f643e8fc34ee..99c5084b845c 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1005,19 +1005,13 @@ static vm_fault_t dax_load_hole(struct address_space 
*mapping, void *entry,
 {
struct inode *inode = mapping->host;
unsigned long vaddr = vmf->address;
-   vm_fault_t ret = VM_FAULT_NOPAGE;
-   void *entry2;
+   vm_fault_t ret;
pfn_t pfn = pfn_to_pfn_t(my_zero_pfn(vaddr));
 
-   entry2 = dax_insert_mapping_entry(mapping, vmf, entry, pfn,
+   dax_insert_mapping_entry(mapping, vmf, entry, pfn,
DAX_ZERO_PAGE, false);
-   if (IS_ERR(entry2)) {
-   ret = VM_FAULT_SIGBUS;
-   goto out;
-   }
 
ret = vmf_insert_mixed(vmf->vma, vaddr, pfn);
-out:
trace_dax_load_hole(inode, vmf, ret);
return ret;
 }
@@ -1327,10 +1321,6 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault 
*vmf, pfn_t *pfnp,
 
entry = dax_insert_mapping_entry(mapping, vmf, entry, pfn,
 0, write && !sync);
-   if (IS_ERR(entry)) {
-   error = PTR_ERR(entry);
-   goto error_finish_iomap;
-   }
 
/*
 * If we are doing synchronous page fault and inode needs fsync,
@@ -1411,8 +1401,6 @@ static vm_fault_t dax_pmd_load_hole(struct vm_fault *vmf, 
struct iomap *iomap,
pfn = page_to_pfn_t(zero_page);
ret = dax_insert_mapping_entry(mapping, vmf, entry, pfn,
DAX_PMD | DAX_ZERO_PAGE, false);
-   if (IS_ERR(ret))
-   goto fallback;
 
ptl = pmd_lock(vmf->vma->vm_mm, vmf->pmd);
if (!pmd_none(*(vmf->pmd))) {
@@ -1534,8 +1522,6 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault 
*vmf, pfn_t *pfnp,
 
entry = dax_insert_mapping_entry(mapping, vmf, entry, pfn,
DAX_PMD, write && !sync);
-   if (IS_ERR(entry))
-   goto finish_iomap;
 
/*
 * If we are doing synchronous page fault and inode needs fsync,
-- 
2.17.0



[PATCH] dax: Fix use of zero page

2018-05-17 Thread Matthew Wilcox

I plucked this patch from my XArray work.  It seems self-contained enough
that it could go into the DAX tree for merging this cycle.

From 8cb56f4ba36af38814ca7b8ba030a66384e59a21 Mon Sep 17 00:00:00 2001
From: Matthew Wilcox 
Date: Thu, 29 Mar 2018 22:41:18 -0400
Subject: [PATCH] dax: Fix use of zero page

Use my_zero_pfn instead of ZERO_PAGE, and pass the vaddr to it so it
works on MIPS and s390.

Signed-off-by: Matthew Wilcox 
---
 fs/dax.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 6a26626f20f3..f643e8fc34ee 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1006,17 +1006,9 @@ static vm_fault_t dax_load_hole(struct address_space 
*mapping, void *entry,
struct inode *inode = mapping->host;
unsigned long vaddr = vmf->address;
vm_fault_t ret = VM_FAULT_NOPAGE;
-   struct page *zero_page;
void *entry2;
-   pfn_t pfn;
-
-   zero_page = ZERO_PAGE(0);
-   if (unlikely(!zero_page)) {
-   ret = VM_FAULT_OOM;
-   goto out;
-   }
+   pfn_t pfn = pfn_to_pfn_t(my_zero_pfn(vaddr));
 
-   pfn = page_to_pfn_t(zero_page);
entry2 = dax_insert_mapping_entry(mapping, vmf, entry, pfn,
DAX_ZERO_PAGE, false);
if (IS_ERR(entry2)) {
-- 
2.17.0



[ndctl PATCH] ndctl: fix libtool versioning

2018-05-17 Thread Vishal Verma
Commit fdd4f06 ("libndctl, ars: add an API to retrieve clear_err_unit")
incremented the libtool versioning 'current', but neglected to increment
the 'age', resulting in an soname bump.
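
For context, a quick worked example of the libtool rule involved (on Linux
the soname major is current minus age; the pre-fdd4f06 current of 15 is
inferred from the commit message):

#include <stdio.h>

int main(void)
{
        printf("before fdd4f06: 15 - 9  = %d\n", 15 - 9);   /* soname major 6 */
        printf("after  fdd4f06: 16 - 9  = %d\n", 16 - 9);   /* 7: unintended bump */
        printf("with this fix : 16 - 10 = %d\n", 16 - 10);  /* back to 6 */
        return 0;
}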

Fixes: commit fdd4f06 ("libndctl, ars: add an API to retrieve clear_err_unit")
Reported-by: Arjan van de Ven 
Cc: Dan Williams 
Signed-off-by: Vishal Verma 
---
 Makefile.am.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile.am.in b/Makefile.am.in
index b23b175..479d4a5 100644
--- a/Makefile.am.in
+++ b/Makefile.am.in
@@ -37,7 +37,7 @@ SED_PROCESS = \
 
 LIBNDCTL_CURRENT=16
 LIBNDCTL_REVISION=0
-LIBNDCTL_AGE=9
+LIBNDCTL_AGE=10
 
 LIBDAXCTL_CURRENT=3
 LIBDAXCTL_REVISION=0
-- 
2.14.3



[qemu PATCH v3 3/4] nvdimm, acpi: support NFIT platform capabilities

2018-05-17 Thread Ross Zwisler
Add a machine command line option to allow the user to control the Platform
Capabilities Structure in the virtualized NFIT.  This Platform Capabilities
Structure was added in ACPI 6.2 Errata A.

Signed-off-by: Ross Zwisler 
---

v3: Added brackets around single statement "if" conditional to comply
with qemu coding style.

---
 docs/nvdimm.txt | 18 ++
 hw/acpi/nvdimm.c| 45 +
 hw/i386/pc.c| 31 +++
 include/hw/i386/pc.h|  1 +
 include/hw/mem/nvdimm.h |  5 +
 5 files changed, 96 insertions(+), 4 deletions(-)

diff --git a/docs/nvdimm.txt b/docs/nvdimm.txt
index e903d8bb09..3f18013880 100644
--- a/docs/nvdimm.txt
+++ b/docs/nvdimm.txt
@@ -153,3 +153,21 @@ guest NVDIMM region mapping structure.  This unarmed flag 
indicates
 guest software that this vNVDIMM device contains a region that cannot
 accept persistent writes. In result, for example, the guest Linux
 NVDIMM driver, marks such vNVDIMM device as read-only.
+
+Platform Capabilities
+---------------------
+
+ACPI 6.2 Errata A added support for a new Platform Capabilities Structure
+which allows the platform to communicate what features it supports related to
+NVDIMM data durability.  Users can provide a capabilities value to a guest via
+the optional "nvdimm-cap" machine command line option:
+
+-machine pc,accel=kvm,nvdimm,nvdimm-cap=2
+
+As of ACPI 6.2 Errata A, the following values are valid for the bottom two
+bits:
+
+2 - Memory Controller Flush to NVDIMM Durability on Power Loss Capable.
+3 - CPU Cache Flush to NVDIMM Durability on Power Loss Capable.
+
+For a complete list of the flags available please consult the ACPI spec.
diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 59d6e4254c..946937f3ca 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -169,6 +169,21 @@ struct NvdimmNfitControlRegion {
 } QEMU_PACKED;
 typedef struct NvdimmNfitControlRegion NvdimmNfitControlRegion;
 
+/*
+ * NVDIMM Platform Capabilities Structure
+ *
+ * Defined in section 5.2.25.9 of ACPI 6.2 Errata A, September 2017
+ */
+struct NvdimmNfitPlatformCaps {
+uint16_t type;
+uint16_t length;
+uint8_t highest_cap;
+uint8_t reserved[3];
+uint32_t capabilities;
+uint8_t reserved2[4];
+} QEMU_PACKED;
+typedef struct NvdimmNfitPlatformCaps NvdimmNfitPlatformCaps;
+
 /*
  * Module serial number is a unique number for each device. We use the
  * slot id of NVDIMM device to generate this number so that each device
@@ -351,7 +366,23 @@ static void nvdimm_build_structure_dcr(GArray *structures, 
DeviceState *dev)
  JEDEC Annex L Release 3. */);
 }
 
-static GArray *nvdimm_build_device_structure(void)
+/*
+ * ACPI 6.2 Errata A: 5.2.25.9 NVDIMM Platform Capabilities Structure
+ */
+static void
+nvdimm_build_structure_caps(GArray *structures, uint32_t capabilities)
+{
+NvdimmNfitPlatformCaps *nfit_caps;
+
+nfit_caps = acpi_data_push(structures, sizeof(*nfit_caps));
+
+nfit_caps->type = cpu_to_le16(7 /* NVDIMM Platform Capabilities */);
+nfit_caps->length = cpu_to_le16(sizeof(*nfit_caps));
+nfit_caps->highest_cap = 2;
+nfit_caps->capabilities = cpu_to_le32(capabilities);
+}
+
+static GArray *nvdimm_build_device_structure(AcpiNVDIMMState *state)
 {
 GSList *device_list = nvdimm_get_device_list();
 GArray *structures = g_array_new(false, true /* clear */, 1);
@@ -373,6 +404,10 @@ static GArray *nvdimm_build_device_structure(void)
 }
 g_slist_free(device_list);
 
+if (state->capabilities) {
+nvdimm_build_structure_caps(structures, state->capabilities);
+}
+
 return structures;
 }
 
@@ -381,16 +416,18 @@ static void nvdimm_init_fit_buffer(NvdimmFitBuffer 
*fit_buf)
 fit_buf->fit = g_array_new(false, true /* clear */, 1);
 }
 
-static void nvdimm_build_fit_buffer(NvdimmFitBuffer *fit_buf)
+static void nvdimm_build_fit_buffer(AcpiNVDIMMState *state)
 {
+NvdimmFitBuffer *fit_buf = &state->fit_buf;
+
 g_array_free(fit_buf->fit, true);
-fit_buf->fit = nvdimm_build_device_structure();
+fit_buf->fit = nvdimm_build_device_structure(state);
 fit_buf->dirty = true;
 }
 
 void nvdimm_plug(AcpiNVDIMMState *state)
 {
-nvdimm_build_fit_buffer(&state->fit_buf);
+nvdimm_build_fit_buffer(state);
 }
 
 static void nvdimm_build_nfit(AcpiNVDIMMState *state, GArray *table_offsets,
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index d768930d02..1b2684c549 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2182,6 +2182,33 @@ static void pc_machine_set_nvdimm(Object *obj, bool 
value, Error **errp)
 pcms->acpi_nvdimm_state.is_enabled = value;
 }
 
+static void pc_machine_get_nvdimm_capabilities(Object *obj, Visitor *v,
+   const char *name, void *opaque,
+   Error **errp)
+{
+PCMachineState *pcms = PC_MACHINE(obj);
+

Re: Draft NVDIMM proposal

2018-05-17 Thread George Dunlap
On 05/15/2018 07:06 PM, Dan Williams wrote:
> On Tue, May 15, 2018 at 7:19 AM, George Dunlap  
> wrote:
>> So, who decides what this SPA range and interleave set is?  Can the
>> operating system change these interleave sets and mappings, or change
>> data from PMEM to BLK, and is so, how?
> 
> The interleave-set to SPA range association and delineation of
> capacity between PMEM and BLK access modes is currently out of scope for
> ACPI. The BIOS reports the configuration to the OS via the NFIT, but
> the configuration is currently written by vendor-specific tooling.
> Longer term it would be great for this mechanism to become
> standardized and available to the OS, but for now it requires
> platform-specific tooling to change the DIMM interleave configuration.

OK -- I was sort of assuming that different hardware would have
different drivers in Linux that ndctl knew how to drive (just like any
other hardware with vendor-specific interfaces); but it sounds a bit
more like at the moment it's binary blobs either in the BIOS/firmware,
or a vendor-supplied tool.

>> And so (here's another guess) -- when you're talking about namespaces
>> and label areas, you're talking about namespaces stored *within a
>> pre-existing SPA range*.  You use the same format as described in the
>> UEFI spec, but ignore all the stuff about interleave sets and whatever,
>> and use system physical addresses relative to the SPA range rather than
>> DPAs.
> 
> Well, we don't ignore it because we need to validate in the driver
> that the interleave set configuration matches a checksum that we
> generated when the namespace was first instantiated on the interleave
> set. However, you are right, for accesses at run time all we care
> about is the SPA for PMEM accesses.
[snip]
> They can change, but only under the control of the BIOS. All changes
> to the interleave set configuration need a reboot because the memory
> controller needs to be set up differently at system-init time.
[snip]
> No, the checksum I'm referring to is the interleave set cookie (see:
> "SetCookie" in the UEFI 2.7 specification). It validates that the
> interleave set backing the SPA has not changed configuration since the
> last boot.
[snip]
> The NVDIMM just provides storage area for the OS to write opaque data
> that just happens to conform to the UEFI Namespace label format. The
> interleave-set configuration is stored in yet another out-of-band
> location on the DIMM or on some platform-specific storage location and
> is consulted / restored by the BIOS each boot. The NFIT is the output
> from the platform specific physical mappings of the DIMMs, and
> Namespaces are logical volumes built on top of those hard-defined NFIT
> boundaries.

OK, so what I'm hearing is:

The label area isn't "within a pre-existing SPA range" as I was guessing
(i.e., similar to a partition table residing within a disk); it is the
per-DIMM label area as described by the UEFI spec.

But, the interleave set data in the label area doesn't *control* the
hardware -- the NVDIMM controller / bios / firmware don't read it or do
anything based on what's in it.  Rather, the interleave set data in the
label area is there to *record*, for the operating system's benefit,
what the hardware configuration was when the labels were created, so
that if it changes, the OS knows that the label area is invalid; it must
either refrain from touching the NVRAM (if it wants to preserve the
data), or write a new label area.

The OS can also use labels to partition a single SPA range into several
namespaces.  It can't change the interleaving, but it can specify that
[0-A) is one namespace, [A-B) is another namespace,  and these
namespaces will naturally map into the SPA range advertised in the NFIT.
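
A toy sketch of that mapping (types and names are illustrative): a label
carves a namespace out of the advertised SPA range by offset and size, so a
namespace-relative offset translates by simple addition plus a bounds check.

#include <stdint.h>

struct spa_range { uint64_t base, size; };   /* from the NFIT           */
struct ns_label  { uint64_t offset, size; }; /* from a namespace label  */

static int ns_to_spa(const struct spa_range *spa, const struct ns_label *ns,
                     uint64_t ns_off, uint64_t *spa_addr)
{
        if (ns_off >= ns->size || ns->offset + ns->size > spa->size)
                return -1;                   /* out of bounds */
        *spa_addr = spa->base + ns->offset + ns_off;
        return 0;
}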

And if a controller allows the same memory to be used either as PMEM or
PBLK, the OS can record in the labels which *should* be used for which, and
then avoid accessing the same underlying NVRAM in two different ways (which
would yield unpredictable results).

That makes sense.

>> If SPA regions don't change after boot, and if Xen can find its own
>> Xen-specific namespace to use for the frame tables by reading the NFIT
>> table, then that significantly reduces the amount of interaction it
>> needs with Linux.
>>
>> If SPA regions *can* change after boot, and if Xen must rely on Linux to
>> read labels and find out what it can safely use for frame tables, then
>> it makes things significantly more involved.  Not impossible by any
>> means, but a lot more complicated.
>>
>> Hope all that makes sense -- thanks again for your help.
> 
> I think it does, but it seems namespaces are out of reach for Xen
> without some agent / enabling that can execute the necessary AML
> methods.

Sure, we're pretty much used to that. :-)  We'll have Linux read the
label area and tell Xen what it needs to know.  But:

* Xen can know the SPA ranges of all potential NVDIMMs before dom0
starts.  So it can tell, for instance, if a page