Re: [PATCH v6 02/12] mm/sparsemem: Introduce common definitions for the size and mask of a section

2019-05-03 Thread Robin Murphy

On 03/05/2019 01:41, Dan Williams wrote:

On Thu, May 2, 2019 at 7:53 AM Pavel Tatashin  wrote:


On Wed, Apr 17, 2019 at 2:52 PM Dan Williams  wrote:


Up-level the local section size and mask from kernel/memremap.c to
global definitions.  These will be used by the new sub-section hotplug
support.

Cc: Michal Hocko 
Cc: Vlastimil Babka 
Cc: Jérôme Glisse 
Cc: Logan Gunthorpe 
Signed-off-by: Dan Williams 


Should be dropped from this series as it has been replaced by a very
similar patch in the mainline:

7c697d7fb5cb14ef60e2b687333ba3efb74f73da
  mm/memremap: Rename and consolidate SECTION_SIZE


I saw that patch fly by and acked it, but I have not seen it picked up
anywhere. I grabbed latest -linus and -next, but don't see that
commit.

$ git show 7c697d7fb5cb14ef60e2b687333ba3efb74f73da
fatal: bad object 7c697d7fb5cb14ef60e2b687333ba3efb74f73da


Yeah, I don't recognise that ID either, nor have I had any notifications 
that Andrew's picked up anything of mine yet :/


Robin.
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: arch_wb_cache_pmem on ARM64

2018-05-30 Thread Robin Murphy

On 30/05/18 18:00, Mikulas Patocka wrote:

Hi

I'd like to ask what's the purpose of dmb(osh) in the function
arch_wb_cache_pmem in arch/arm64/mm/flush.c?

void arch_wb_cache_pmem(void *addr, size_t size)
{
 /* Ensure order against any prior non-cacheable writes */
 dmb(osh);
 __clean_dcache_area_pop(addr, size);
}

The processor may flush the cache spontaneously, that means that all the
flushing may actually happen before the dmb(osh) instruction - so what
does that dmb instruction guard against?


IIRC the (very subtle) problem was to do with the odd case of a 
transparent (i.e. beyond the PoC) system cache - if data has been 
written to the pmem region via some non-cacheable alias, then the 
barrier was necessary to ensure that cache maintenance via the 
inner-shareable kernel mapping can push any data already at the PoC 
further along to the PoP.
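
For illustration, the hazard can be modelled as a toy executable sketch (this is an abstract model, not how the hardware actually behaves; all names are invented for the example). The barrier is modelled as draining a still-buffered non-cacheable write into a transparent system cache at the PoC, and the clean-to-PoP as pushing whatever has reached the PoC on to persistent media:

```c
#include <stdbool.h>

static int syscache_data;       /* data sitting in a transparent system cache (at PoC) */
static bool syscache_valid;
static int pmem_media;          /* data actually at the Point of Persistence */
static int pending_uncached;    /* a non-cacheable write not yet ordered w.r.t. later ops */
static bool pending_valid;

/* a non-cacheable store may still be buffered before it reaches the system cache */
static void uncached_write(int v) { pending_uncached = v; pending_valid = true; }

/* dmb(osh) modelled as: order prior writes, i.e. drain them to the PoC */
static void dmb_osh(void)
{
    if (pending_valid) {
        syscache_data = pending_uncached;
        syscache_valid = true;
        pending_valid = false;
    }
}

/* DC CVAP via the cacheable alias: push whatever is at the PoC out to the PoP */
static void clean_to_pop(void)
{
    if (syscache_valid) {
        pmem_media = syscache_data;
        syscache_valid = false;
    }
}

/* arch_wb_cache_pmem() pattern: barrier first, then clean - the write reaches media */
int wb_with_barrier(int v) { uncached_write(v); dmb_osh(); clean_to_pop(); return pmem_media; }

/* without the barrier, the clean can complete before the store is visible at the PoC */
int wb_without_barrier(int v) { uncached_write(v); clean_to_pop(); return pmem_media; }
```

In the model, omitting the barrier leaves the uncached write stranded before the PoC, so the clean pushes nothing new to media.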


Robin.


Re: [PATCH v8] dma-mapping: introduce dma_get_iommu_domain()

2017-10-09 Thread Robin Murphy
Hi Dan,

On 08/10/17 04:45, Dan Williams wrote:
> Add a dma-mapping api helper to retrieve the generic iommu_domain for a 
> device.
> The motivation for this interface is making RDMA transfers to DAX mappings
> safe. If the DAX file's block map changes we need to be able to reliably stop
> accesses to blocks that have been freed or re-assigned to a new file.

...which is also going to require some way to force the IOMMU drivers
(on x86 at least) to do a fully-synchronous unmap, instead of just
throwing the IOVA onto a flush queue to invalidate the TLBs at some
point in the future. Assuming of course that there's an IOMMU both
present and performing DMA translation in the first place.

> With the
> iommu_domain and a callback from the DAX filesystem the kernel can safely
> revoke access to a DMA device. The process that performed the RDMA memory
> registration is also notified of this revocation event, but the kernel can not
> otherwise be in the position of waiting for userspace to quiesce the device.

OK, but why reinvent iommu_get_domain_for_dev()?

> Since PMEM+DAX is currently only enabled for x86, we only update the x86
> iommu drivers.

Note in particular that those two drivers happen to be the *only* place
this approach could work - everyone else is going to have to fall back
to the generic IOMMU API function anyway.
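
A sketch of that fallback: rather than returning NULL when the arch dma_map_ops lack a ->get_iommu hook, defer to the generic lookup. The structs here are simplified stand-ins for the kernel's, purely for illustration:

```c
#include <stddef.h>

struct iommu_domain { int id; };
struct device;

struct dma_map_ops {
    struct iommu_domain *(*get_iommu)(struct device *dev);
};

struct device {
    const struct dma_map_ops *dma_ops;
    struct iommu_domain *generic_domain; /* what the generic lookup would find */
};

/* stand-in for the generic IOMMU API every translated device already supports */
static struct iommu_domain *iommu_get_domain_for_dev(struct device *dev)
{
    return dev->generic_domain;
}

struct iommu_domain *dma_get_iommu_domain(struct device *dev)
{
    const struct dma_map_ops *ops = dev->dma_ops;

    if (ops && ops->get_iommu)
        return ops->get_iommu(dev);
    /* fall back to the generic lookup instead of giving up */
    return iommu_get_domain_for_dev(dev);
}

/* example arch hook and domains for demonstration */
static struct iommu_domain arch_dom = { .id = 1 };
static struct iommu_domain generic_dom = { .id = 2 };
static struct iommu_domain *arch_get_iommu(struct device *dev) { (void)dev; return &arch_dom; }
static const struct dma_map_ops arch_ops = { .get_iommu = arch_get_iommu };
struct device dev_with_op = { &arch_ops, &generic_dom };
struct device dev_without = { NULL, &generic_dom };
```

With a fallback like this, drivers other than the two x86 ones would still get a usable domain back.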

Robin.

> Cc: Marek Szyprowski <m.szyprow...@samsung.com>
> Cc: Robin Murphy <robin.mur...@arm.com>
> Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org>
> Cc: Joerg Roedel <j...@8bytes.org>
> Cc: David Woodhouse <dw...@infradead.org>
> Cc: Ashok Raj <ashok@intel.com>
> Cc: Jan Kara <j...@suse.cz>
> Cc: Jeff Moyer <jmo...@redhat.com>
> Cc: Christoph Hellwig <h...@lst.de>
> Cc: Dave Chinner <da...@fromorbit.com>
> Cc: "Darrick J. Wong" <darrick.w...@oracle.com>
> Cc: Ross Zwisler <ross.zwis...@linux.intel.com>
> Signed-off-by: Dan Williams <dan.j.willi...@intel.com>
> ---
> Changes since v7:
> * retrieve the iommu_domain so that we can later pass the results of
>   dma_map_* to iommu_unmap() in advance of the actual dma_unmap_*.
> 
>  drivers/base/dma-mapping.c  |   10 ++
>  drivers/iommu/amd_iommu.c   |   10 ++
>  drivers/iommu/intel-iommu.c |   15 +++
>  include/linux/dma-mapping.h |3 +++
>  4 files changed, 38 insertions(+)
> 
> diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
> index e584eddef0a7..fdb9764f95a4 100644
> --- a/drivers/base/dma-mapping.c
> +++ b/drivers/base/dma-mapping.c
> @@ -369,3 +369,13 @@ void dma_deconfigure(struct device *dev)
>   of_dma_deconfigure(dev);
>   acpi_dma_deconfigure(dev);
>  }
> +
> +struct iommu_domain *dma_get_iommu_domain(struct device *dev)
> +{
> + const struct dma_map_ops *ops = get_dma_ops(dev);
> +
> + if (ops && ops->get_iommu)
> + return ops->get_iommu(dev);
> + return NULL;
> +}
> +EXPORT_SYMBOL(dma_get_iommu_domain);
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 51f8215877f5..c8e1a45af182 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2271,6 +2271,15 @@ static struct protection_domain *get_domain(struct device *dev)
>   return domain;
>  }
>  
> +static struct iommu_domain *amd_dma_get_iommu(struct device *dev)
> +{
> + struct protection_domain *domain = get_domain(dev);
> +
> + if (IS_ERR(domain))
> + return NULL;
> + return &domain->domain;
> +}
> +
>  static void update_device_table(struct protection_domain *domain)
>  {
>   struct iommu_dev_data *dev_data;
> @@ -2689,6 +2698,7 @@ static const struct dma_map_ops amd_iommu_dma_ops = {
>   .unmap_sg   = unmap_sg,
>   .dma_supported  = amd_iommu_dma_supported,
>   .mapping_error  = amd_iommu_mapping_error,
> + .get_iommu  = amd_dma_get_iommu,
>  };
>  
>  static int init_reserved_iova_ranges(void)
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 6784a05dd6b2..f3f4939cebad 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -3578,6 +3578,20 @@ static int iommu_no_mapping(struct device *dev)
>   return 0;
>  }
>  
> +static struct iommu_domain *intel_dma_get_iommu(struct device *dev)
> +{
> + struct dmar_domain *domain;
> +
> + if (iommu_no_mapping(dev))
> + return NULL;
> +
> + domain = get_valid_domain_for_dev(dev);
> + if (!domain)
> + return NULL;
> +
> + return &domain->domain;
> +}
> +
>  static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr,

[PATCH v2] nd_blk: Remove mmio_flush_range()

2017-08-31 Thread Robin Murphy
mmio_flush_range() suffers from a lack of clearly-defined semantics,
and is somewhat ambiguous to port to other architectures where the
scope of the writeback implied by "flush" and ordering might matter,
but MMIO would tend to imply non-cacheable anyway. Per the rationale
in 67a3e8fe9015 ("nd_blk: change aperture mapping from WC to WB"), the
only existing use is actually to invalidate clean cache lines for
ARCH_MEMREMAP_PMEM type mappings *without* writeback. Since the recent
cleanup of the pmem API, that also now happens to be the exact purpose
of arch_invalidate_pmem(), which would be a far more well-defined tool
for the job.

Rather than risk potentially inconsistent implementations of
mmio_flush_range() for the sake of one callsite, streamline things by
removing it entirely and instead move the ARCH_MEMREMAP_PMEM related
definitions up to the libnvdimm level, so they can be shared by NFIT
as well. This allows NFIT to be enabled for arm64.

Signed-off-by: Robin Murphy <robin.mur...@arm.com>
---
 arch/x86/Kconfig  |  1 -
 arch/x86/include/asm/cacheflush.h |  2 --
 drivers/acpi/nfit/Kconfig |  2 +-
 drivers/acpi/nfit/core.c  |  2 +-
 drivers/nvdimm/pmem.h | 14 --
 include/linux/libnvdimm.h | 15 +++
 lib/Kconfig   |  3 ---
 tools/testing/nvdimm/test/nfit.c  |  4 ++--
 8 files changed, 19 insertions(+), 24 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 781521b7cf9e..5f3b756ec0d3 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -53,7 +53,6 @@ config X86
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_KCOV if X86_64
-   select ARCH_HAS_MMIO_FLUSH
select ARCH_HAS_PMEM_API if X86_64
select ARCH_HAS_UACCESS_FLUSHCACHE  if X86_64
select ARCH_HAS_SET_MEMORY
diff --git a/arch/x86/include/asm/cacheflush.h b/arch/x86/include/asm/cacheflush.h
index 8b4140f6724f..cb9a1af109b4 100644
--- a/arch/x86/include/asm/cacheflush.h
+++ b/arch/x86/include/asm/cacheflush.h
@@ -7,6 +7,4 @@
 
 void clflush_cache_range(void *addr, unsigned int size);
 
-#define mmio_flush_range(addr, size) clflush_cache_range(addr, size)
-
 #endif /* _ASM_X86_CACHEFLUSH_H */
diff --git a/drivers/acpi/nfit/Kconfig b/drivers/acpi/nfit/Kconfig
index 6d3351452ea2..929ba4da0b30 100644
--- a/drivers/acpi/nfit/Kconfig
+++ b/drivers/acpi/nfit/Kconfig
@@ -2,7 +2,7 @@ config ACPI_NFIT
tristate "ACPI NVDIMM Firmware Interface Table (NFIT)"
depends on PHYS_ADDR_T_64BIT
depends on BLK_DEV
-   depends on ARCH_HAS_MMIO_FLUSH
+   depends on ARCH_HAS_PMEM_API
select LIBNVDIMM
help
  Infrastructure to probe ACPI 6 compliant platforms for
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 19182d091587..ee7726a16693 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -1930,7 +1930,7 @@ static int acpi_nfit_blk_single_io(struct nfit_blk *nfit_blk,
		memcpy_flushcache(mmio->addr.aperture + offset, iobuf + copied, c);
	else {
		if (nfit_blk->dimm_flags & NFIT_BLK_READ_FLUSH)
-			mmio_flush_range((void __force *)
+			arch_invalidate_pmem((void __force *)
				mmio->addr.aperture + offset, c);
 
memcpy(iobuf + copied, mmio->addr.aperture + offset, c);
diff --git a/drivers/nvdimm/pmem.h b/drivers/nvdimm/pmem.h
index 5434321cad67..c5917f040fa7 100644
--- a/drivers/nvdimm/pmem.h
+++ b/drivers/nvdimm/pmem.h
@@ -5,20 +5,6 @@
 #include 
 #include 
 
-#ifdef CONFIG_ARCH_HAS_PMEM_API
-#define ARCH_MEMREMAP_PMEM MEMREMAP_WB
-void arch_wb_cache_pmem(void *addr, size_t size);
-void arch_invalidate_pmem(void *addr, size_t size);
-#else
-#define ARCH_MEMREMAP_PMEM MEMREMAP_WT
-static inline void arch_wb_cache_pmem(void *addr, size_t size)
-{
-}
-static inline void arch_invalidate_pmem(void *addr, size_t size)
-{
-}
-#endif
-
 /* this definition is in it's own header for tools/testing/nvdimm to consume */
 struct pmem_device {
/* One contiguous memory region per device */
diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
index f3d3e6af8838..d11bc9881206 100644
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -173,4 +173,19 @@ u64 nd_fletcher64(void *addr, size_t len, bool le);
 void nvdimm_flush(struct nd_region *nd_region);
 int nvdimm_has_flush(struct nd_region *nd_region);
 int nvdimm_has_cache(struct nd_region *nd_region);
+
+#ifdef CONFIG_ARCH_HAS_PMEM_API
+#define ARCH_MEMREMAP_PMEM MEMREMAP_WB
+void arch_wb_cache_pmem(void *addr, size_t size);
+void arch_invalidate_pmem(void *addr, size_t size);
+#else
+#define ARCH_MEMREMAP_PMEM MEMREMAP_WT
+static inline void arch_wb_cache_pmem(void *addr, size_t size)
+{
+}
+static inline void arch_invalidate_pmem(void *addr, size_t size)
+{
+}
+#endif

Re: [v6,2/8] dmaengine: Add DMA_MEMCPY_SG transaction op

2017-08-31 Thread Robin Murphy
On 30/08/17 19:25, Dave Jiang wrote:
> On 08/30/2017 11:18 AM, Robin Murphy wrote:
>> On 25/08/17 21:59, Dave Jiang wrote:
>>> Adding a dmaengine transaction operation that allows copy to/from a
>>> scatterlist and a flat buffer.
>>
>> Apologies if I'm late to the party, but doesn't DMA_SG already cover
>> this use-case? As far as I can see, all this does is save the caller
>> from setting up a single-entry scatterlist to describe the buffer - even
>> if such a simplified interface is justified it seems like something that
>> could be implemented as a wrapper around dmaengine_prep_dma_sg() rather
>> than the providers having to implement a whole extra callback.
>>
> 
> DMA_SG is queued to be removed in 4.14. There is no in kernel consumer
> for the code.

Ah, I see, that's what I was missing. So we're effectively just
replacing that interface with a more pragmatic alternative - that makes
sense.

Thanks,
Robin.


Re: [v6,2/8] dmaengine: Add DMA_MEMCPY_SG transaction op

2017-08-30 Thread Robin Murphy
On 25/08/17 21:59, Dave Jiang wrote:
> Adding a dmaengine transaction operation that allows copy to/from a
> scatterlist and a flat buffer.

Apologies if I'm late to the party, but doesn't DMA_SG already cover
this use-case? As far as I can see, all this does is save the caller
from setting up a single-entry scatterlist to describe the buffer - even
if such a simplified interface is justified it seems like something that
could be implemented as a wrapper around dmaengine_prep_dma_sg() rather
than the providers having to implement a whole extra callback.
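
The wrapper being suggested could look roughly like this. The types are simplified stand-ins for the kernel's scatterlist and descriptor structures, and prep_dma_sg() is a stub standing in for a provider's existing DMA_SG path; it's a sketch of the shape of the idea, not kernel code:

```c
#include <stddef.h>

typedef unsigned long long dma_addr_t;

struct scatterlist { dma_addr_t dma_address; unsigned int length; };
struct dma_descriptor { int valid; };

/* stand-in for the provider's existing sg-to-sg (DMA_SG) implementation */
struct dma_descriptor *prep_dma_sg(struct scatterlist *dst, unsigned int dst_nents,
                                   struct scatterlist *src, unsigned int src_nents)
{
    static struct dma_descriptor desc;

    if (!dst || !src || !dst_nents || !src_nents)
        return NULL;
    desc.valid = 1;
    return &desc;
}

/* the proposed memcpy_sg op expressed as a wrapper: describe the flat
 * buffer with a one-entry scatterlist and reuse the sg-to-sg path */
struct dma_descriptor *prep_dma_memcpy_sg(struct scatterlist *sg, unsigned int sg_nents,
                                          dma_addr_t buf, size_t len, int to_sg)
{
    struct scatterlist single = { .dma_address = buf, .length = (unsigned int)len };

    return to_sg ? prep_dma_sg(sg, sg_nents, &single, 1)
                 : prep_dma_sg(&single, 1, sg, sg_nents);
}
```

The point being that only the wrapper is new; providers would not need to implement another callback.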

Robin.

> 
> Signed-off-by: Dave Jiang 
> ---
>  Documentation/dmaengine/provider.txt |3 +++
>  drivers/dma/dmaengine.c  |2 ++
>  include/linux/dmaengine.h|   19 +++
>  3 files changed, 24 insertions(+)
> 
> diff --git a/Documentation/dmaengine/provider.txt 
> b/Documentation/dmaengine/provider.txt
> index a75f52f..6241e36 100644
> --- a/Documentation/dmaengine/provider.txt
> +++ b/Documentation/dmaengine/provider.txt
> @@ -181,6 +181,9 @@ Currently, the types available are:
>  - Used by the client drivers to register a callback that will be
>called on a regular basis through the DMA controller interrupt
>  
> +  * DMA_MEMCPY_SG
> +- The device supports scatterlist to/from memory.
> +
>* DMA_PRIVATE
>  - The devices only supports slave transfers, and as such isn't
>available for async transfers.
> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> index 428b141..4d2c4e1 100644
> --- a/drivers/dma/dmaengine.c
> +++ b/drivers/dma/dmaengine.c
> @@ -937,6 +937,8 @@ int dma_async_device_register(struct dma_device *device)
>   !device->device_prep_dma_memset);
>   BUG_ON(dma_has_cap(DMA_INTERRUPT, device->cap_mask) &&
>   !device->device_prep_dma_interrupt);
> + BUG_ON(dma_has_cap(DMA_MEMCPY_SG, device->cap_mask) &&
> + !device->device_prep_dma_memcpy_sg);
>   BUG_ON(dma_has_cap(DMA_CYCLIC, device->cap_mask) &&
>   !device->device_prep_dma_cyclic);
>   BUG_ON(dma_has_cap(DMA_INTERLEAVE, device->cap_mask) &&
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 64fbd38..0c91411 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -67,6 +67,7 @@ enum dma_transaction_type {
>   DMA_PQ_VAL,
>   DMA_MEMSET,
>   DMA_MEMSET_SG,
> + DMA_MEMCPY_SG,
>   DMA_INTERRUPT,
>   DMA_PRIVATE,
>   DMA_ASYNC_TX,
> @@ -692,6 +693,7 @@ struct dma_filter {
>   * @device_prep_dma_pq_val: prepares a pqzero_sum operation
>   * @device_prep_dma_memset: prepares a memset operation
>   * @device_prep_dma_memset_sg: prepares a memset operation over a scatter 
> list
> + * @device_prep_dma_memcpy_sg: prepares memcpy between scatterlist and buffer
>   * @device_prep_dma_interrupt: prepares an end of chain interrupt operation
>   * @device_prep_slave_sg: prepares a slave dma operation
>   * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for 
> audio.
> @@ -768,6 +770,10 @@ struct dma_device {
>   struct dma_async_tx_descriptor *(*device_prep_dma_memset_sg)(
>   struct dma_chan *chan, struct scatterlist *sg,
>   unsigned int nents, int value, unsigned long flags);
> + struct dma_async_tx_descriptor *(*device_prep_dma_memcpy_sg)(
> + struct dma_chan *chan,
> + struct scatterlist *sg, unsigned int sg_nents,
> + dma_addr_t buf, bool to_sg, unsigned long flags);
>   struct dma_async_tx_descriptor *(*device_prep_dma_interrupt)(
>   struct dma_chan *chan, unsigned long flags);
>  
> @@ -899,6 +905,19 @@ static inline struct dma_async_tx_descriptor 
> *dmaengine_prep_dma_memcpy(
>   len, flags);
>  }
>  
> +static inline struct dma_async_tx_descriptor *dmaengine_prep_dma_memcpy_sg(
> + struct dma_chan *chan, struct scatterlist *sg,
> + unsigned int sg_nents, dma_addr_t buf, bool to_sg,
> + unsigned long flags)
> +{
> + if (!chan || !chan->device ||
> + !chan->device->device_prep_dma_memcpy_sg)
> + return NULL;
> +
> + return chan->device->device_prep_dma_memcpy_sg(chan, sg, sg_nents,
> +buf, to_sg, flags);
> +}
> +
>  /**
>   * dmaengine_terminate_all() - Terminate all active DMA transfers
>   * @chan: The channel for which to terminate the transfers
> 


Re: [PATCH 5/6] arm64: Implement pmem API support

2017-08-04 Thread Robin Murphy
On 04/08/17 19:09, Dan Williams wrote:
> On Fri, Aug 4, 2017 at 10:43 AM, Robin Murphy <robin.mur...@arm.com> wrote:
>> On 04/08/17 16:25, Catalin Marinas wrote:
>>> Two minor comments below.
>>>
>>> On Tue, Jul 25, 2017 at 11:55:42AM +0100, Robin Murphy wrote:
>>>> --- a/arch/arm64/Kconfig
>>>> +++ b/arch/arm64/Kconfig
>>>> @@ -960,6 +960,17 @@ config ARM64_UAO
>>>>regular load/store instructions if the cpu does not implement the
>>>>feature.
>>>>
>>>> +config ARM64_PMEM
>>>> +bool "Enable support for persistent memory"
>>>> +select ARCH_HAS_PMEM_API
>>>> +help
>>>> +  Say Y to enable support for the persistent memory API based on the
>>>> +  ARMv8.2 DCPoP feature.
>>>> +
>>>> +  The feature is detected at runtime, and the kernel will use DC CVAC
>>>> +  operations if DC CVAP is not supported (following the behaviour of
>>>> +  DC CVAP itself if the system does not define a point of 
>>>> persistence).
>>>
>>> Any reason not to have this default y?
>>
>> Mostly because it's untested, and not actually useful without some way
>> of describing persistent memory regions to the kernel (I'm currently
>> trying to make sense of what exactly ARCH_HAS_MMIO_FLUSH is supposed to
>> mean in order to enable ACPI NFIT support).
> 
> This is related to block-aperture support described by the NFIT where
> a sliding-memory-mapped window can be programmed to access different
> ranges of the NVDIMM. Before the window is programmed to a new
> DIMM-address we need to flush any dirty data through the current
> window setting to media. See the call to mmio_flush_range() in
> acpi_nfit_blk_single_io(). I think it's ok to omit ARCH_HAS_MMIO_FLUSH
> support, and add a configuration option to compile out the
> block-aperture support.

Oh, I have every intention of implementing it one way or another if
necessary - it's not difficult, it's just been a question of working
through the NFIT code to figure out the subtleties of translation to
arm64 ;)

If mmio_flush_range() is for true MMIO (i.e. __iomem) mappings, then
arm64 should only need a barrier, rather than actual cache operations.
If on the other hand it's misleadingly named and only actually used on
MEMREMAP_WB mappings (as I'm starting to think it might be), then I can't
help thinking it could simply go away in favour of arch_wb_pmem(), since
that now seems to have those same semantics and intent, plus a much more
appropriate name.

Robin.


Re: [PATCH 5/6] arm64: Implement pmem API support

2017-08-04 Thread Robin Murphy
On 04/08/17 16:25, Catalin Marinas wrote:
> Two minor comments below.
> 
> On Tue, Jul 25, 2017 at 11:55:42AM +0100, Robin Murphy wrote:
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -960,6 +960,17 @@ config ARM64_UAO
>>regular load/store instructions if the cpu does not implement the
>>feature.
>>  
>> +config ARM64_PMEM
>> +bool "Enable support for persistent memory"
>> +select ARCH_HAS_PMEM_API
>> +help
>> +  Say Y to enable support for the persistent memory API based on the
>> +  ARMv8.2 DCPoP feature.
>> +
>> +  The feature is detected at runtime, and the kernel will use DC CVAC
>> +  operations if DC CVAP is not supported (following the behaviour of
>> +  DC CVAP itself if the system does not define a point of persistence).
> 
> Any reason not to have this default y?

Mostly because it's untested, and not actually useful without some way
of describing persistent memory regions to the kernel (I'm currently
trying to make sense of what exactly ARCH_HAS_MMIO_FLUSH is supposed to
mean in order to enable ACPI NFIT support).

There's also the potential issue that we can't disable ARCH_HAS_PMEM_API
at runtime on pre-v8.2 systems where DC CVAC may not strictly give the
guarantee of persistence that that is supposed to imply. However, I
guess that's more of an open problem, since even on a v8.2 CPU reporting
(mandatory) DC CVAP support we've still no way to actually know whether
the interconnect/memory controller/etc. of any old system is up to the
job. At this point I'm mostly hoping that people will only be sticking
NVDIMMs into systems that *are* properly designed to support them, v8.2
CPUs or not.

>> --- a/arch/arm64/mm/cache.S
>> +++ b/arch/arm64/mm/cache.S
>> @@ -172,6 +172,20 @@ ENDPIPROC(__clean_dcache_area_poc)
>>  ENDPROC(__dma_clean_area)
>>  
>>  /*
>> + *  __clean_dcache_area_pop(kaddr, size)
>> + *
>> + *  Ensure that any D-cache lines for the interval [kaddr, kaddr+size)
>> + *  are cleaned to the PoP.
>> + *
>> + *  - kaddr   - kernel address
>> + *  - size- size in question
>> + */
>> +ENTRY(__clean_dcache_area_pop)
>> +dcache_by_line_op cvap, sy, x0, x1, x2, x3
>> +ret
>> +ENDPIPROC(__clean_dcache_area_pop)
>> +
>> +/*
>>   *  __dma_flush_area(start, size)
>>   *
>>   *  clean & invalidate D / U line
>> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
>> index a682a0a2a0fa..a461a00ceb3e 100644
>> --- a/arch/arm64/mm/pageattr.c
>> +++ b/arch/arm64/mm/pageattr.c
>> @@ -183,3 +183,21 @@ bool kernel_page_present(struct page *page)
>>  }
>>  #endif /* CONFIG_HIBERNATION */
>>  #endif /* CONFIG_DEBUG_PAGEALLOC */
>> +
>> +#ifdef CONFIG_ARCH_HAS_PMEM_API
>> +#include 
>> +
>> +static inline void arch_wb_cache_pmem(void *addr, size_t size)
>> +{
>> +/* Ensure order against any prior non-cacheable writes */
>> +dmb(sy);
>> +__clean_dcache_area_pop(addr, size);
>> +}
> 
> Could we keep the dmb() in the actual __clean_dcache_area_pop()
> implementation?

Mark held the opinion that it should follow the same pattern as the
other cache maintenance primitives - e.g. we don't have such a dmb in
__inval_cache_range(), but do place them at callsites where we know it
may be necessary (head.S) - and I found it hard to disagree. The callers
in patch #6 should never need a barrier, and arguably we may not even
need this one, since it looks like pmem should currently always be
mapped as MEMREMAP_WB if ARCH_HAS_PMEM_API.

> I can do the changes myself if you don't have any objections.

If you would prefer to play safe and move it back into the assembly
that's fine by me, but note that the associated comments in patch #6
should also be removed if so.

Robin.


[PATCH 0/6] arm64 pmem support

2017-07-25 Thread Robin Murphy
Hi all,

With the latest updates to the pmem API, the arch code contribution
becomes very straightforward to wire up - I think there's about as
much code here to just cope with the existence of our new instruction
as there is to actually make use of it. I don't have access to any
NVDIMMs nor suitable hardware to put them in, so this is written purely
to spec - the extent of testing has been the feature detection on a
v8.2 Fast Model vs. v8.0 systems.

Patch #1 could go in as a fix ahead of the rest; it just needs to come
before patch #5 to prevent that blowing up the build.

Robin.

Robin Murphy (6):
  arm64: mm: Fix set_memory_valid() declaration
  arm64: Convert __inval_cache_range() to area-based
  arm64: Expose DC CVAP to userspace
  arm64: Handle trapped DC CVAP
  arm64: Implement pmem API support
  arm64: uaccess: Implement *_flushcache variants

 Documentation/arm64/cpu-feature-registers.txt |  2 ++
 arch/arm64/Kconfig| 12 +++
 arch/arm64/include/asm/assembler.h|  6 
 arch/arm64/include/asm/cacheflush.h   |  4 ++-
 arch/arm64/include/asm/cpucaps.h  |  3 +-
 arch/arm64/include/asm/esr.h  |  3 +-
 arch/arm64/include/asm/string.h   |  4 +++
 arch/arm64/include/asm/sysreg.h   |  1 +
 arch/arm64/include/asm/uaccess.h  | 12 +++
 arch/arm64/include/uapi/asm/hwcap.h   |  1 +
 arch/arm64/kernel/cpufeature.c| 13 
 arch/arm64/kernel/cpuinfo.c   |  1 +
 arch/arm64/kernel/head.S  | 18 +-
 arch/arm64/kernel/traps.c |  3 ++
 arch/arm64/lib/Makefile   |  2 ++
 arch/arm64/lib/uaccess_flushcache.c   | 47 +++
 arch/arm64/mm/cache.S | 37 -
 arch/arm64/mm/pageattr.c  | 18 ++
 18 files changed, 166 insertions(+), 21 deletions(-)
 create mode 100644 arch/arm64/lib/uaccess_flushcache.c

-- 
2.12.2.dirty



[PATCH 2/6] arm64: Convert __inval_cache_range() to area-based

2017-07-25 Thread Robin Murphy
__inval_cache_range() is already the odd one out among our data cache
maintenance routines as the only remaining range-based one; as we're
going to want an invalidation routine to call from C code for the pmem
API, let's tweak the prototype and name to bring it in line with the
clean operations, and to make its relationship with __dma_inv_area()
neatly mirror that of __clean_dcache_area_poc() and __dma_clean_area().
The loop clearing the early page tables gets mildly massaged in the
process for the sake of consistency.

Signed-off-by: Robin Murphy <robin.mur...@arm.com>
---
 arch/arm64/include/asm/cacheflush.h |  1 +
 arch/arm64/kernel/head.S| 18 +-
 arch/arm64/mm/cache.S   | 23 ++-
 3 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/include/asm/cacheflush.h 
b/arch/arm64/include/asm/cacheflush.h
index 4d4f650c290e..b4b43a94dffd 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -67,6 +67,7 @@
  */
 extern void flush_icache_range(unsigned long start, unsigned long end);
 extern void __flush_dcache_area(void *addr, size_t len);
+extern void __inval_dcache_area(void *addr, size_t len);
 extern void __clean_dcache_area_poc(void *addr, size_t len);
 extern void __clean_dcache_area_pou(void *addr, size_t len);
 extern long __flush_cache_user_range(unsigned long start, unsigned long end);
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 973df7de7bf8..73a0531e0187 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -143,8 +143,8 @@ preserve_boot_args:
dmb sy  // needed before dc ivac with
// MMU off
 
-   add x1, x0, #0x20   // 4 x 8 bytes
-   b   __inval_cache_range // tail call
+   mov x1, #0x20   // 4 x 8 bytes
+   b   __inval_dcache_area // tail call
 ENDPROC(preserve_boot_args)
 
 /*
@@ -221,20 +221,20 @@ __create_page_tables:
 * dirty cache lines being evicted.
 */
	adrp	x0, idmap_pg_dir
-	adrp	x1, swapper_pg_dir + SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE
-	bl	__inval_cache_range
+	ldr	x1, =(IDMAP_DIR_SIZE + SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE)
+	bl	__inval_dcache_area
 
/*
 * Clear the idmap and swapper page tables.
 */
	adrp	x0, idmap_pg_dir
-	adrp	x6, swapper_pg_dir + SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE
+	ldr	x1, =(IDMAP_DIR_SIZE + SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE)
 1:	stp	xzr, xzr, [x0], #16
	stp	xzr, xzr, [x0], #16
	stp	xzr, xzr, [x0], #16
	stp	xzr, xzr, [x0], #16
-	cmp	x0, x6
-	b.lo	1b
+	subs	x1, x1, #64
+	b.ne	1b
 
mov x7, SWAPPER_MM_MMUFLAGS
 
@@ -307,9 +307,9 @@ __create_page_tables:
 * tables again to remove any speculatively loaded cache lines.
 */
	adrp	x0, idmap_pg_dir
-	adrp	x1, swapper_pg_dir + SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE
+	ldr	x1, =(IDMAP_DIR_SIZE + SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE)
	dmb	sy
-	bl	__inval_cache_range
+	bl	__inval_dcache_area
 
ret x28
 ENDPROC(__create_page_tables)
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 83c27b6e6dca..ed47fbbb4b05 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -109,20 +109,25 @@ ENTRY(__clean_dcache_area_pou)
 ENDPROC(__clean_dcache_area_pou)
 
 /*
+ * __inval_dcache_area(kaddr, size)
+ *
+ * Ensure that any D-cache lines for the interval [kaddr, kaddr+size)
+ * are invalidated. Any partial lines at the ends of the interval are
+ * also cleaned to PoC to prevent data loss.
+ *
+ * - kaddr   - kernel address
+ * - size- size in question
+ */
+ENTRY(__inval_dcache_area)
+   /* FALLTHROUGH */
+
+/*
  * __dma_inv_area(start, size)
  * - start   - virtual start address of region
  * - size- size in question
  */
 __dma_inv_area:
add x1, x1, x0
-   /* FALLTHROUGH */
-
-/*
- * __inval_cache_range(start, end)
- * - start   - start address of region
- * - end - end address of region
- */
-ENTRY(__inval_cache_range)
dcache_line_size x2, x3
sub x3, x2, #1
tst x1, x3  // end cache line aligned?
@@ -140,7 +145,7 @@ ENTRY(__inval_cache_range)
b.lo2b
dsb sy
ret
-ENDPIPROC(__inval_cache_range)
+ENDPIPROC(__inval_dcache_area)
 ENDPROC(__dma_inv_area)
 
 /*
-- 
2.12.2.dirty

___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


[PATCH 3/6] arm64: Expose DC CVAP to userspace

2017-07-25 Thread Robin Murphy
The ARMv8.2-DCPoP feature introduces persistent memory support to the
architecture, by defining a point of persistence in the memory
hierarchy, and a corresponding cache maintenance operation, DC CVAP.
Expose the support via HWCAP and MRS emulation.

Signed-off-by: Robin Murphy <robin.mur...@arm.com>
---
 Documentation/arm64/cpu-feature-registers.txt | 2 ++
 arch/arm64/include/asm/sysreg.h   | 1 +
 arch/arm64/include/uapi/asm/hwcap.h   | 1 +
 arch/arm64/kernel/cpufeature.c| 2 ++
 arch/arm64/kernel/cpuinfo.c   | 1 +
 5 files changed, 7 insertions(+)

diff --git a/Documentation/arm64/cpu-feature-registers.txt b/Documentation/arm64/cpu-feature-registers.txt
index d1c97f9f51cc..dad411d635d8 100644
--- a/Documentation/arm64/cpu-feature-registers.txt
+++ b/Documentation/arm64/cpu-feature-registers.txt
@@ -179,6 +179,8 @@ infrastructure:
  | FCMA | [19-16] |y|
  |--|
  | JSCVT| [15-12] |y|
+ |--|
+ | DPB  | [3-0]   |y|
  x--x
 
 Appendix I: Example
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 16e44fa9b3b6..1974731baa91 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -329,6 +329,7 @@
 #define ID_AA64ISAR1_LRCPC_SHIFT   20
 #define ID_AA64ISAR1_FCMA_SHIFT16
 #define ID_AA64ISAR1_JSCVT_SHIFT   12
+#define ID_AA64ISAR1_DPB_SHIFT 0
 
 /* id_aa64pfr0 */
 #define ID_AA64PFR0_GIC_SHIFT  24
diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
index 4e187ce2a811..4b9344cba83a 100644
--- a/arch/arm64/include/uapi/asm/hwcap.h
+++ b/arch/arm64/include/uapi/asm/hwcap.h
@@ -35,5 +35,6 @@
 #define HWCAP_JSCVT		(1 << 13)
 #define HWCAP_FCMA		(1 << 14)
 #define HWCAP_LRCPC		(1 << 15)
+#define HWCAP_DCPOP		(1 << 16)
 
 #endif /* _UAPI__ASM_HWCAP_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 9f9e0064c8c1..a2542ef3ff25 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -120,6 +120,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_LRCPC_SHIFT, 4, 0),
	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_FCMA_SHIFT, 4, 0),
	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_JSCVT_SHIFT, 4, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_DPB_SHIFT, 4, 0),
ARM64_FTR_END,
 };
 
@@ -916,6 +917,7 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
	HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_FP_SHIFT, FTR_SIGNED, 1, CAP_HWCAP, HWCAP_FPHP),
	HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_ASIMD_SHIFT, FTR_SIGNED, 0, CAP_HWCAP, HWCAP_ASIMD),
	HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_ASIMD_SHIFT, FTR_SIGNED, 1, CAP_HWCAP, HWCAP_ASIMDHP),
+	HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_DPB_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_DCPOP),
	HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_JSCVT_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_JSCVT),
	HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_FCMA_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_FCMA),
	HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_LRCPC_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_LRCPC),
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index f495ee5049fd..311885962830 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -68,6 +68,7 @@ static const char *const hwcap_str[] = {
"jscvt",
"fcma",
"lrcpc",
+   "dcpop",
NULL
 };
 
-- 
2.12.2.dirty



[PATCH 6/6] arm64: uaccess: Implement *_flushcache variants

2017-07-25 Thread Robin Murphy
Implement the set of copy functions with guarantees of a clean cache
upon completion necessary to support the pmem driver.

Signed-off-by: Robin Murphy <robin.mur...@arm.com>
---
 arch/arm64/Kconfig  |  1 +
 arch/arm64/include/asm/string.h |  4 
 arch/arm64/include/asm/uaccess.h| 12 ++
 arch/arm64/lib/Makefile |  2 ++
 arch/arm64/lib/uaccess_flushcache.c | 47 +
 5 files changed, 66 insertions(+)
 create mode 100644 arch/arm64/lib/uaccess_flushcache.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0b0576a54724..e43a63b3d14b 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -963,6 +963,7 @@ config ARM64_UAO
 config ARM64_PMEM
bool "Enable support for persistent memory"
select ARCH_HAS_PMEM_API
+   select ARCH_HAS_UACCESS_FLUSHCACHE
help
  Say Y to enable support for the persistent memory API based on the
  ARMv8.2 DCPoP feature.
diff --git a/arch/arm64/include/asm/string.h b/arch/arm64/include/asm/string.h
index d0aa42907569..dd95d33a5bd5 100644
--- a/arch/arm64/include/asm/string.h
+++ b/arch/arm64/include/asm/string.h
@@ -52,6 +52,10 @@ extern void *__memset(void *, int, __kernel_size_t);
 #define __HAVE_ARCH_MEMCMP
 extern int memcmp(const void *, const void *, size_t);
 
+#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
+#define __HAVE_ARCH_MEMCPY_FLUSHCACHE
+void memcpy_flushcache(void *dst, const void *src, size_t cnt);
+#endif
 
 #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
 
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 8f0a1de11e4a..bb056fee297c 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -347,4 +347,16 @@ extern long strncpy_from_user(char *dest, const char __user *src, long count);
 
 extern __must_check long strnlen_user(const char __user *str, long n);
 
+#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
+struct page;
+void memcpy_page_flushcache(char *to, struct page *page, size_t offset, size_t len);
+extern unsigned long __must_check __copy_user_flushcache(void *to, const void __user *from, unsigned long n);
+
+static inline int __copy_from_user_flushcache(void *dst, const void __user *src, unsigned size)
+{
+   kasan_check_write(dst, size);
+   return __copy_user_flushcache(dst, src, size);
+}
+#endif
+
 #endif /* __ASM_UACCESS_H */
diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
index c86b7909ef31..a0abc142c92b 100644
--- a/arch/arm64/lib/Makefile
+++ b/arch/arm64/lib/Makefile
@@ -17,3 +17,5 @@ CFLAGS_atomic_ll_sc.o := -fcall-used-x0 -ffixed-x1 -ffixed-x2 \
   -fcall-saved-x10 -fcall-saved-x11 -fcall-saved-x12   \
   -fcall-saved-x13 -fcall-saved-x14 -fcall-saved-x15   \
   -fcall-saved-x18
+
+lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
diff --git a/arch/arm64/lib/uaccess_flushcache.c b/arch/arm64/lib/uaccess_flushcache.c
new file mode 100644
index 000000000000..b6ceafdb8b72
--- /dev/null
+++ b/arch/arm64/lib/uaccess_flushcache.c
@@ -0,0 +1,47 @@
+/*
+ * Copyright (C) 2017 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/uaccess.h>
+#include <asm/barrier.h>
+#include <asm/cacheflush.h>
+
+void memcpy_flushcache(void *dst, const void *src, size_t cnt)
+{
+   /*
+* We assume this should not be called with @dst pointing to
+* non-cacheable memory, such that we don't need an explicit
+* barrier to order the cache maintenance against the memcpy.
+*/
+   memcpy(dst, src, cnt);
+   __clean_dcache_area_pop(dst, cnt);
+}
+EXPORT_SYMBOL_GPL(memcpy_flushcache);
+
+void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
+   size_t len)
+{
+   memcpy_flushcache(to, page_address(page) + offset, len);
+}
+
+unsigned long __copy_user_flushcache(void *to, const void __user *from,
+unsigned long n)
+{
+   unsigned long rc = __arch_copy_from_user(to, from, n);
+
+   /* See above */
+   __clean_dcache_area_pop(to, n - rc);
+   return rc;
+}
-- 
2.12.2.dirty



[PATCH 1/6] arm64: mm: Fix set_memory_valid() declaration

2017-07-25 Thread Robin Murphy
Clearly, set_memory_valid() has never been seen in the same room as its
declaration... Whilst the type mismatch is such that kexec probably
wasn't broken in practice, fix it to match the definition as it should.

Fixes: 9b0aa14e3155 ("arm64: mm: add set_memory_valid()")
Signed-off-by: Robin Murphy <robin.mur...@arm.com>
---
 arch/arm64/include/asm/cacheflush.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index d74a284abdc2..4d4f650c290e 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -150,6 +150,6 @@ static inline void flush_cache_vunmap(unsigned long start, unsigned long end)
 {
 }
 
-int set_memory_valid(unsigned long addr, unsigned long size, int enable);
+int set_memory_valid(unsigned long addr, int numpages, int enable);
 
 #endif
-- 
2.12.2.dirty



[PATCH 5/6] arm64: Implement pmem API support

2017-07-25 Thread Robin Murphy
Add a clean-to-point-of-persistence cache maintenance helper, and wire
up the basic architectural support for the pmem driver based on it.

Signed-off-by: Robin Murphy <robin.mur...@arm.com>
---
 arch/arm64/Kconfig  | 11 +++
 arch/arm64/include/asm/assembler.h  |  6 ++
 arch/arm64/include/asm/cacheflush.h |  1 +
 arch/arm64/include/asm/cpucaps.h|  3 ++-
 arch/arm64/kernel/cpufeature.c  | 11 +++
 arch/arm64/mm/cache.S   | 14 ++
 arch/arm64/mm/pageattr.c| 18 ++
 7 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index dfd908630631..0b0576a54724 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -960,6 +960,17 @@ config ARM64_UAO
  regular load/store instructions if the cpu does not implement the
  feature.
 
+config ARM64_PMEM
+   bool "Enable support for persistent memory"
+   select ARCH_HAS_PMEM_API
+   help
+ Say Y to enable support for the persistent memory API based on the
+ ARMv8.2 DCPoP feature.
+
+ The feature is detected at runtime, and the kernel will use DC CVAC
+ operations if DC CVAP is not supported (following the behaviour of
+ DC CVAP itself if the system does not define a point of persistence).
+
 endmenu
 
 config ARM64_MODULE_CMODEL_LARGE
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 1b67c3782d00..5d8903c45031 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -353,6 +353,12 @@ alternative_if_not ARM64_WORKAROUND_CLEAN_CACHE
 alternative_else
dc  civac, \kaddr
 alternative_endif
+   .elseif (\op == cvap)
+alternative_if ARM64_HAS_DCPOP
+   sys 3, c7, c12, 1, \kaddr   // dc cvap
+alternative_else
+   dc  cvac, \kaddr
+alternative_endif
.else
dc  \op, \kaddr
.endif
diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index b4b43a94dffd..76d1cc85d5b1 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -69,6 +69,7 @@ extern void flush_icache_range(unsigned long start, unsigned long end);
 extern void __flush_dcache_area(void *addr, size_t len);
 extern void __inval_dcache_area(void *addr, size_t len);
 extern void __clean_dcache_area_poc(void *addr, size_t len);
+extern void __clean_dcache_area_pop(void *addr, size_t len);
 extern void __clean_dcache_area_pou(void *addr, size_t len);
 extern long __flush_cache_user_range(unsigned long start, unsigned long end);
 extern void sync_icache_aliases(void *kaddr, unsigned long len);
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 8d2272c6822c..8da621627d7c 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -39,7 +39,8 @@
 #define ARM64_WORKAROUND_QCOM_FALKOR_E1003 18
 #define ARM64_WORKAROUND_858921		19
 #define ARM64_WORKAROUND_CAVIUM_30115		20
+#define ARM64_HAS_DCPOP			21
 
-#define ARM64_NCAPS			21
+#define ARM64_NCAPS			22
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index a2542ef3ff25..cd52d365d1f0 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -889,6 +889,17 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.min_field_value = 0,
.matches = has_no_fpsimd,
},
+#ifdef CONFIG_ARM64_PMEM
+   {
+   .desc = "Data cache clean to Point of Persistence",
+   .capability = ARM64_HAS_DCPOP,
+   .def_scope = SCOPE_SYSTEM,
+   .matches = has_cpuid_feature,
+   .sys_reg = SYS_ID_AA64ISAR1_EL1,
+   .field_pos = ID_AA64ISAR1_DPB_SHIFT,
+   .min_field_value = 1,
+   },
+#endif
{},
 };
 
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index ed47fbbb4b05..7f1dbe962cf5 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -172,6 +172,20 @@ ENDPIPROC(__clean_dcache_area_poc)
 ENDPROC(__dma_clean_area)
 
 /*
+ * __clean_dcache_area_pop(kaddr, size)
+ *
+ * Ensure that any D-cache lines for the interval [kaddr, kaddr+size)
+ * are cleaned to the PoP.
+ *
+ * - kaddr   - kernel address
+ * - size- size in question
+ */
+ENTRY(__clean_dcache_area_pop)
+   dcache_by_line_op cvap, sy, x0, x1, x2, x3
+   ret
+ENDPIPROC(__clean_dcache_area_pop)
+
+/*
  * __dma_flush_area(start, size)
  *
  * clean & invalidate D / U line
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index a682a0a2a0fa..a461a00ceb3e 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -183,3 +183,