Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits

2024-09-13 Thread Catalin Marinas
On Fri, Sep 13, 2024 at 11:08:23AM +0100, Catalin Marinas wrote:
> On Thu, Sep 12, 2024 at 02:15:59PM -0700, Charlie Jenkins wrote:
> > On Thu, Sep 12, 2024 at 11:53:49AM +0100, Catalin Marinas wrote:
> > > On Wed, Sep 11, 2024 at 11:18:12PM -0700, Charlie Jenkins wrote:
> > > > Opting-in to the higher address space is reasonable. However, it is not
> > > > my preference, because the purpose of this flag is to ensure that
> > > > allocations do not exceed 47-bits, so it is a clearer ABI to have the
> > > > applications that want this guarantee to be the ones setting the flag,
> > > > rather than the applications that want the higher bits setting the flag.
[...]
> Anyway, the prctl() can go both ways, either expanding or limiting the
> default address space. So I'd be fine with such interface.

Ah, I just realised (while reading Lorenzo's reply) that we can't really
restrict the space via a prctl() as we have the main thread stack
already allocated by the kernel before the user code starts. You may
need to limit this stack as well, not just the later heap allocations
(anonymous mmap()).
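
That is effectively the argument for the series' personality approach:
the flag is inherited across execve(), so a wrapper can set it before
the kernel maps the new process' stack. A rough sketch of such a
wrapper (ADDR_LIMIT_47BIT is the flag proposed by this RFC; the value
below is purely illustrative, not from any kernel header):

#include <sys/personality.h>
#include <unistd.h>

#ifndef ADDR_LIMIT_47BIT
#define ADDR_LIMIT_47BIT	0x10000000	/* illustrative value only */
#endif

int main(int argc, char *argv[])
{
	/* OR the flag into the current personality, then exec: the new
	 * image starts with its stack and all mappings below 2^47 */
	personality(personality(0xffffffff) | ADDR_LIMIT_47BIT);
	return execvp(argv[1], &argv[1]);
}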

-- 
Catalin



Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits

2024-09-13 Thread Catalin Marinas
On Thu, Sep 12, 2024 at 02:15:59PM -0700, Charlie Jenkins wrote:
> On Thu, Sep 12, 2024 at 11:53:49AM +0100, Catalin Marinas wrote:
> > On Wed, Sep 11, 2024 at 11:18:12PM -0700, Charlie Jenkins wrote:
> > > Opting-in to the higher address space is reasonable. However, it is not
> > > my preference, because the purpose of this flag is to ensure that
> > > allocations do not exceed 47-bits, so it is a clearer ABI to have the
> > > applications that want this guarantee to be the ones setting the flag,
> > > rather than the applications that want the higher bits setting the flag.
> > 
> > Yes, this would be ideal. Unfortunately those applications don't know
> > they need to set a flag in order to work.
> 
> It's not a regression; the applications never worked (on platforms that
> do not have this default). The 47-bit default would allow applications
> that didn't work to start working at the cost of a non-ideal ABI. That
> doesn't seem like a reasonable tradeoff to me.  If applications want to
> run on new hardware that has different requirements, shouldn't they be
> required to update rather than expect the kernel will solve their
> problems for them?

That's a valid point, but it depends on the application and how much
effort you want to spend updating user-space. OpenJDK is fine: if you
need a JIT you'll have to add support for that architecture anyway. But
others are arch-agnostic; you just recompile for your target. It's not
an ABI problem, more of an API one.

The x86 case (and powerpc/arm64) was different: the 47-bit limit had
worked for a long time before it was expanded, so it made a lot of
sense to keep the same default.

Anyway, the prctl() can go both ways, either expanding or limiting the
default address space. So I'd be fine with such interface.

-- 
Catalin



Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits

2024-09-12 Thread Catalin Marinas
On Wed, Sep 11, 2024 at 11:18:12PM -0700, Charlie Jenkins wrote:
> Opting-in to the higher address space is reasonable. However, it is not
> my preference, because the purpose of this flag is to ensure that
> allocations do not exceed 47-bits, so it is a clearer ABI to have the
> applications that want this guarantee to be the ones setting the flag,
> rather than the applications that want the higher bits setting the flag.

Yes, this would be ideal. Unfortunately those applications don't know
they need to set a flag in order to work.

A slightly better option is to leave the default 47-bit at the kernel
ABI level and have the libc/dynamic loader issue the prctl(). You can
control the default with environment variables if needed.
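
As a rough sketch of what the loader side could look like (the prctl()
constants and the LD_FULL_ADDRESS_SPACE variable are made up for
illustration only; nothing of the sort has been merged):

#include <stdlib.h>
#include <sys/prctl.h>

/* hypothetical interface, for illustration only */
#define PR_SET_ADDR_LIMIT	0x59410001
#define PR_ADDR_LIMIT_FULL	1

static void setup_address_space(void)
{
	/* the kernel default stays at 47 bits; opt in to the full
	 * range only when the environment says the app can cope */
	if (getenv("LD_FULL_ADDRESS_SPACE"))
		prctl(PR_SET_ADDR_LIMIT, PR_ADDR_LIMIT_FULL, 0, 0, 0);
}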

We do something similar in glibc for arm64 MTE. When MTE is enabled, the
top byte of an allocated pointer contains the tag that must not be
corrupted. We left the decision to the C library via the
glibc.mem.tagging tunable (Android has something similar via the app
manifest). An app can change the default if it wants but if you run with
old glibc or no environment variable to say otherwise, the default would
be safe. Distros can set the environment to be the maximum range by
default if they know the apps included have been upgraded and tested.

-- 
Catalin



Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits

2024-09-11 Thread Catalin Marinas
On Tue, Sep 10, 2024 at 05:45:07PM -0700, Charlie Jenkins wrote:
> On Tue, Sep 10, 2024 at 03:08:14PM -0400, Liam R. Howlett wrote:
> > * Catalin Marinas  [240906 07:44]:
> > > On Fri, Sep 06, 2024 at 09:55:42AM +, Arnd Bergmann wrote:
> > > > On Fri, Sep 6, 2024, at 09:14, Guo Ren wrote:
> > > > > On Fri, Sep 6, 2024 at 3:18 PM Arnd Bergmann  wrote:
> > > > >> It's also unclear to me how we want this flag to interact with
> > > > >> the existing logic in arch_get_mmap_end(), which attempts to
> > > > >> limit the default mapping to a 47-bit address space already.
> > > > >
> > > > > To optimize RISC-V progress, I recommend:
> > > > >
> > > > > Step 1: Approve the patch.
> > > > > Step 2: Update Go and OpenJDK's RISC-V backend to utilize it.
> > > > > Step 3: Wait several release iterations for Go & OpenJDK
> > > > > Step 4: Remove the 47-bit constraint in arch_get_mmap_end()

Point 4 is an ABI change. What guarantees that there isn't still
software out there that relies on the old behaviour?

> > > > I really want to first see a plausible explanation about why
> > > > RISC-V can't just implement this using a 47-bit DEFAULT_MAP_WINDOW
> > > > like all the other major architectures (x86, arm64, powerpc64),
> > > 
> > > FWIW arm64 actually limits DEFAULT_MAP_WINDOW to 48-bit in the default
> > > configuration. We end up with a 47-bit limit with 16K pages but for a
> > > different reason that has to do with LPA2 support (I doubt we need this
> > > for the user mapping but we need to untangle some of the macros there;
> > > that's for a separate discussion).
> > > 
> > > That said, we haven't encountered any user space problems with a 48-bit
> > > DEFAULT_MAP_WINDOW. So I also think RISC-V should follow a similar
> > > approach (47 or 48 bit default limit). Better to have some ABI
> > > consistency between architectures. One can still ask for addresses above
> > > this default limit via mmap().
> > 
> > I think that is best as well.
> > 
> > Can we please just do what x86 and arm64 does?
> 
> I responded to Arnd in the other thread, but I am still not convinced
> that the solution that x86 and arm64 have selected is the best solution.
> The solution of defaulting to 47 bits does give applications the
> ability to get addresses that are below 47 bits. However, due to
> differences across architectures it doesn't seem possible to have all
> architectures default to the same value. Additionally, this flag will be
> able to help users avoid potential bugs where a hint address is passed
> that causes upper bits of a VA to be used.

The reason we added this limit on arm64 is that we noticed programs
using the top 8 bits of a 64-bit pointer for additional information.
IIRC, it wasn't even OpenJDK but some JavaScript JIT. We could have
taught those programs about a new flag but, since we couldn't tell how
many are out there, it was safest to default to the smaller limit and
opt in to the higher one. Such opt-in is via mmap() but if you prefer a
prctl() flag, that's fine by me as well (though I think this should be
opt-in to higher addresses rather than opt-out of them).
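
For illustration, a NaN-boxing-style scheme of the kind such JITs use
(a hypothetical sketch, not code from this thread) silently corrupts
pointers the moment mmap() returns an address at or above bit 48:

#include <stdint.h>

#define PTR_BITS	48
#define PTR_MASK	((UINT64_C(1) << PTR_BITS) - 1)

static inline uint64_t box_ptr(void *p, uint64_t tag)
{
	/* pack a type tag above bit 47 of the pointer */
	return ((uintptr_t)p & PTR_MASK) | (tag << PTR_BITS);
}

static inline void *unbox_ptr(uint64_t v)
{
	/* assumes no address bits at or above bit 48 */
	return (void *)(v & PTR_MASK);
}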

-- 
Catalin



Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits

2024-09-06 Thread Catalin Marinas
On Fri, Sep 06, 2024 at 09:55:42AM +, Arnd Bergmann wrote:
> On Fri, Sep 6, 2024, at 09:14, Guo Ren wrote:
> > On Fri, Sep 6, 2024 at 3:18 PM Arnd Bergmann  wrote:
> >> It's also unclear to me how we want this flag to interact with
> >> the existing logic in arch_get_mmap_end(), which attempts to
> >> limit the default mapping to a 47-bit address space already.
> >
> > To optimize RISC-V progress, I recommend:
> >
> > Step 1: Approve the patch.
> > Step 2: Update Go and OpenJDK's RISC-V backend to utilize it.
> > Step 3: Wait several release iterations for Go & OpenJDK
> > Step 4: Remove the 47-bit constraint in arch_get_mmap_end()
> 
> I really want to first see a plausible explanation about why
> RISC-V can't just implement this using a 47-bit DEFAULT_MAP_WINDOW
> like all the other major architectures (x86, arm64, powerpc64),

FWIW arm64 actually limits DEFAULT_MAP_WINDOW to 48-bit in the default
configuration. We end up with a 47-bit limit with 16K pages but for a
different reason that has to do with LPA2 support (I doubt we need this
for the user mapping but we need to untangle some of the macros there;
that's for a separate discussion).

That said, we haven't encountered any user space problems with a 48-bit
DEFAULT_MAP_WINDOW. So I also think RISC-V should follow a similar
approach (47 or 48 bit default limit). Better to have some ABI
consistency between architectures. One can still ask for addresses above
this default limit via mmap().
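
For reference, opting in from user space is just a hint above the
default window; a minimal sketch (this only returns such addresses on
kernels/CPUs with >48-bit VA support, otherwise the kernel falls back
to the default range):

#include <sys/mman.h>

void *map_high(size_t size)
{
	void *hint = (void *)(1UL << 50);	/* above the 48-bit default */

	return mmap(hint, size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}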

-- 
Catalin



Re: [PATCH 11/17] arm64: rework compat syscall macros

2024-07-05 Thread Catalin Marinas
On Thu, Jul 04, 2024 at 04:36:05PM +0200, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> The generated asm/unistd_compat_32.h header file now contains
> macros that can be used directly in the vdso and the signal
> trampolines, so remove the duplicate definitions.
> 
> Signed-off-by: Arnd Bergmann 

Acked-by: Catalin Marinas 



Re: [PATCH 10/17] arm64: generate 64-bit syscall.tbl

2024-07-05 Thread Catalin Marinas
On Thu, Jul 04, 2024 at 04:36:04PM +0200, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> Change the asm/unistd.h header for arm64 to no longer include
> asm-generic/unistd.h itself, but instead generate both the asm/unistd.h
> contents and the list of entry points using the syscall.tbl scripts that
> we use on most other architectures.
> 
> Once this is done for the remaining architectures, the generic unistd.h
> header can be removed and the generated tbl file put in its place.
> 
> The Makefile changes are more complex than they should be, I need
> a little help to improve those. Ideally this should be done in an
> architecture-independent way as well.
> 
> Signed-off-by: Arnd Bergmann 

Acked-by: Catalin Marinas 



Re: [PATCH 09/17] arm64: convert unistd_32.h to syscall.tbl format

2024-07-05 Thread Catalin Marinas
On Thu, Jul 04, 2024 at 04:36:03PM +0200, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> This is a straight conversion from the old asm/unistd32.h into the
> format used by 32-bit arm and most other architectures, calling scripts
> to generate the asm/unistd32.h header and a new asm/syscalls32.h headers.
> 
> I used a semi-automated text replacement method to do the conversion,
> and then used 'vimdiff' to synchronize the whitespace and the (unused)
> names of the non-compat syscalls with the arm version.
> 
> There are two differences between the generated syscalls names and the
> old version:
> 
>  - the old asm/unistd32.h contained only a __NR_sync_file_range2
>entry, while the arm32 version also defines
>__NR_arm_sync_file_range with the same number. I added this
>duplicate back in asm/unistd32.h.
> 
>  - __NR__sysctl was removed from the arm64 file a while ago, but
>all the tables still contain it. This should probably get removed
>everywhere but I added it here for consistency.
> 
> On top of that, the arm64 version does not contain any references to
> the 32-bit OABI syscalls that are not supported by arm64. If we ever
> want to share the file between arm32 and arm64, it would not be
> hard to add support for both in one file.
> 
> Signed-off-by: Arnd Bergmann 

Acked-by: Catalin Marinas 



Re: [PATCH 4/4] vdso: avoid including asm/page.h

2024-02-27 Thread Catalin Marinas
On Mon, Feb 26, 2024 at 05:14:14PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> The recent change to the vdso_data_store broke building compat VDSO
> on at least arm64 because it includes headers outside of the include/vdso/
> namespace:
> 
> In file included from arch/arm64/include/asm/lse.h:5,
>  from arch/arm64/include/asm/cmpxchg.h:14,
>  from arch/arm64/include/asm/atomic.h:16,
>  from include/linux/atomic.h:7,
>  from include/asm-generic/bitops/atomic.h:5,
>  from arch/arm64/include/asm/bitops.h:25,
>  from include/linux/bitops.h:68,
>  from arch/arm64/include/asm/memory.h:209,
>  from arch/arm64/include/asm/page.h:46,
>  from include/vdso/datapage.h:22,
>  from lib/vdso/gettimeofday.c:5,
>  from :
> arch/arm64/include/asm/atomic_ll_sc.h:298:9: error: unknown type name 'u128'
>   298 | u128 full;
> 
> Use an open-coded page size calculation based on the new CONFIG_PAGE_SHIFT
> Kconfig symbol instead.
> 
> Reported-by: Linux Kernel Functional Testing 
> Fixes: a0d2fcd62ac2 ("vdso/ARM: Make union vdso_data_store available for all 
> architectures")
> Link: 
> https://lore.kernel.org/lkml/ca+g9fytrxxm_ko9fnpz3xarxhv7ud_yqp-teupqrnrhu+_0...@mail.gmail.com/
> Signed-off-by: Arnd Bergmann 

Acked-by: Catalin Marinas 



Re: [PATCH 2/4] arch: simplify architecture specific page size configuration

2024-02-27 Thread Catalin Marinas
On Mon, Feb 26, 2024 at 05:14:12PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> arc, arm64, parisc and powerpc all have their own Kconfig symbols
> in place of the common CONFIG_PAGE_SIZE_4KB symbols. Change these
> so the common symbols are the ones that are actually used, while
> leaving the architecture-specific ones as the user-visible
> place for configuring it, to avoid breaking user configs.
> 
> Signed-off-by: Arnd Bergmann 

For arm64:

Acked-by: Catalin Marinas 



Re: [PATCH 11/12] arm64: memory: Make virt_to_pfn() a static inline

2023-05-12 Thread Catalin Marinas
On Thu, May 11, 2023 at 01:59:28PM +0200, Linus Walleij wrote:
> Making virt_to_pfn() a static inline taking a strongly typed
> (const void *) makes the contract of a passing a pointer of that
> type to the function explicit and exposes any misuse of the
> macro virt_to_pfn() acting polymorphic and accepting many types
> such as (void *), (unitptr_t) or (unsigned long) as arguments
> without warnings.
> 
> Since arm64 is using  to provide
> __phys_to_pfn() we need to move the inclusion of that header
> up, so we can resolve the static inline at compile time.
> 
> Signed-off-by: Linus Walleij 

Acked-by: Catalin Marinas 



Re: [PATCH 08/12] arm64: vdso: Pass (void *) to virt_to_page()

2023-05-12 Thread Catalin Marinas
On Thu, May 11, 2023 at 01:59:25PM +0200, Linus Walleij wrote:
> Like the other calls in this function virt_to_page() expects
> a pointer, not an integer.
> 
> However since many architectures implement virt_to_pfn() as
> a macro, this function becomes polymorphic and accepts both a
> (unsigned long) and a (void *).
> 
> Fix this up with an explicit cast.
> 
> Signed-off-by: Linus Walleij 

Acked-by: Catalin Marinas 



Re: [PATCH 18/21] ARM: drop SMP support for ARM11MPCore

2023-03-31 Thread Catalin Marinas
On Mon, Mar 27, 2023 at 02:13:14PM +0200, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> The cache management operations for noncoherent DMA on ARMv6 work
> in two different ways:
> 
>  * When CONFIG_DMA_CACHE_RWFO is set, speculative prefetches on in-flight
>    DMA buffers lead to data corruption when the prefetched data is written
>    back on top of data from the device.
> 
>  * When CONFIG_DMA_CACHE_RWFO is disabled, a cache flush on one CPU
>    is not seen by the other core(s), leading to inconsistent contents
>    across the system.
> 
> As a consequence, neither configuration is actually safe to use in a
> general-purpose kernel that is used on both MPCore systems and ARM1176
> with prefetching enabled.

As the author of this terrible hack (created under duress ;)):

Acked-by: Catalin Marinas 

IIRC, RWFO works in combination with the cache operations. Because
the cache maintenance broadcast did not happen, we forced the cache
lines to migrate to a CPU via a write (for ownership) and did the
cache maintenance on that CPU (that was the FROM_DEVICE case). For the
TO_DEVICE case, reading on a CPU would cause dirty lines on another CPU
to be evicted (or migrated as dirty to the current CPU, IIRC), followed
by cache maintenance to clean them to PoC on the local CPU.

But there's always a small window between read/write for ownership and
the actual cache maintenance which can cause a cache line to migrate to
other CPUs if they do speculative prefetches. At the time ARM11MPCore
was deemed safe-ish but I haven't followed what later implementations
actually did (luckily we fixed the architecture in ARMv7).

-- 
Catalin



Re: [PATCH 00/21] dma-mapping: unify support for cache flushes

2023-03-31 Thread Catalin Marinas
On Mon, Mar 27, 2023 at 02:12:56PM +0200, Arnd Bergmann wrote:
> Another difference that I do not address here is what cache invalidation
> does for partial cache lines. On arm32, arm64 and powerpc, a partial
> cache line always gets written back before invalidation in order to
> ensure that data before or after the buffer is not discarded. On all
> other architectures, the assumption is cache lines are never shared
> between DMA buffer and data that is accessed by the CPU.

I don't think sharing the DMA buffer with other data is safe even with
this clean+invalidate on the unaligned cache lines. Mapping the DMA
buffer as FROM_DEVICE or BIDIRECTIONAL can cause the shared cache line
to be evicted and overwrite the device-written data. This sharing only
works if the CPU guarantees not to dirty the corresponding cache line.

I'm fine with removing this partial cache line hack from arm64 as it's
not safe anyway. We'll see if any driver stops working. If there's some
benign sharing (I wouldn't trust it), the cache cleaning prior to
mapping and invalidate on unmap would not lose any data.
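
For drivers, the usual way to stay out of trouble is to never let
device-written data share a cache line with CPU-owned data in the first
place; a minimal sketch (using the generic ARCH_DMA_MINALIGN annotation
on architectures that define it -- not code from this thread):

#include <linux/cache.h>
#include <linux/spinlock.h>
#include <linux/types.h>

struct my_dev_state {
	spinlock_t lock;	/* CPU-only data */
	/*
	 * Device-written buffer placed in its own cache lines, so an
	 * invalidate can never discard adjacent CPU-owned data.
	 */
	u8 rx_buf[128] __aligned(ARCH_DMA_MINALIGN);
};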

-- 
Catalin



Re: [PATCH v3 02/24] arm64: Remove COMMAND_LINE_SIZE from uapi

2023-02-14 Thread Catalin Marinas
On Tue, Feb 14, 2023 at 08:49:03AM +0100, Alexandre Ghiti wrote:
> From: Palmer Dabbelt 
> 
> As far as I can tell this is not used by userspace and thus should not
> be part of the user-visible API.
> 
> Signed-off-by: Palmer Dabbelt 

Acked-by: Catalin Marinas 



Re: [PATCH] mm: remove kern_addr_valid() completely

2022-11-01 Thread Catalin Marinas
On Tue, Oct 18, 2022 at 03:40:14PM +0800, Kefeng Wang wrote:
> Most architectures (except arm64/x86/sparc) simply return 1 for
> kern_addr_valid(), which is only used in read_kcore(), and it
> calls copy_from_kernel_nofault() which can check whether the
> address is a valid kernel address, so kern_addr_valid() is not
> needed; let's remove it completely.
> 
> Signed-off-by: Kefeng Wang 

For arm64:

Acked-by: Catalin Marinas 



Re: [PATCH] kernel: exit: cleanup release_thread()

2022-08-21 Thread Catalin Marinas
On Fri, Aug 19, 2022 at 09:44:06AM +0800, Kefeng Wang wrote:
> Only x86 has its own release_thread(); introduce a new weak
> release_thread() function to clean up the empty definitions in
> other ARCHs.
> 
> Signed-off-by: Kefeng Wang 
[...]
>  arch/arm64/include/asm/processor.h  | 3 ---
>  arch/arm64/kernel/process.c | 4 ----

Acked-by: Catalin Marinas 



Re: [PATCH] arch: mm: rename FORCE_MAX_ZONEORDER to ARCH_FORCE_MAX_ORDER

2022-08-16 Thread Catalin Marinas
On Mon, Aug 15, 2022 at 10:39:59AM -0400, Zi Yan wrote:
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 571cc234d0b3..c6fcd8746f60 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1401,7 +1401,7 @@ config XEN
>   help
> Say Y if you want to run Linux in a Virtual Machine on Xen on ARM64.
>  
> -config FORCE_MAX_ZONEORDER
> +config ARCH_FORCE_MAX_ORDER
>   int
>   default "14" if ARM64_64K_PAGES
>   default "12" if ARM64_16K_PAGES

For arm64:

Acked-by: Catalin Marinas 



Re: [PATCH V4 05/26] arm64/mm: Move protection_map[] inside the platform

2022-06-24 Thread Catalin Marinas
On Fri, Jun 24, 2022 at 10:13:18AM +0530, Anshuman Khandual wrote:
> This moves protection_map[] inside the platform and makes it static.
> 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> Signed-off-by: Anshuman Khandual 

Reviewed-by: Catalin Marinas 



Re: [PATCH v4] mm: Avoid unnecessary page fault retires on shared memory types

2022-05-30 Thread Catalin Marinas
On Fri, May 27, 2022 at 03:39:36PM -0400, Peter Xu wrote:
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 77341b160aca..e401d416bbd6 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -607,6 +607,10 @@ static int __kprobes do_page_fault(unsigned long far, 
> unsigned int esr,
>   return 0;
>   }
>  
> + /* The fault is fully completed (including releasing mmap lock) */
> + if (fault & VM_FAULT_COMPLETED)
> + return 0;
> +
>   if (fault & VM_FAULT_RETRY) {
>   mm_flags |= FAULT_FLAG_TRIED;
>       goto retry;

For arm64:

Acked-by: Catalin Marinas 



Re: [PATCH V3 05/30] arm64/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT

2022-03-03 Thread Catalin Marinas
Hi Anshuman,

On Mon, Feb 28, 2022 at 04:17:28PM +0530, Anshuman Khandual wrote:
> +static inline pgprot_t __vm_get_page_prot(unsigned long vm_flags)
> +{
> + switch (vm_flags & (VM_READ | VM_WRITE | VM_EXEC | VM_SHARED)) {
> + case VM_NONE:
> + return PAGE_NONE;
> + case VM_READ:
> + case VM_WRITE:
> + case VM_WRITE | VM_READ:
> + return PAGE_READONLY;
> + case VM_EXEC:
> + return PAGE_EXECONLY;
> + case VM_EXEC | VM_READ:
> + case VM_EXEC | VM_WRITE:
> + case VM_EXEC | VM_WRITE | VM_READ:
> + return PAGE_READONLY_EXEC;
> + case VM_SHARED:
> + return PAGE_NONE;
> + case VM_SHARED | VM_READ:
> + return PAGE_READONLY;
> + case VM_SHARED | VM_WRITE:
> + case VM_SHARED | VM_WRITE | VM_READ:
> + return PAGE_SHARED;
> + case VM_SHARED | VM_EXEC:
> + return PAGE_EXECONLY;
> + case VM_SHARED | VM_EXEC | VM_READ:
> + return PAGE_READONLY_EXEC;
> + case VM_SHARED | VM_EXEC | VM_WRITE:
> + case VM_SHARED | VM_EXEC | VM_WRITE | VM_READ:
> + return PAGE_SHARED_EXEC;
> + default:
> + BUILD_BUG();
> + }
> +}

I'd say ack for trying to get rid of the extra arch_vm_get_page_prot()
and arch_filter_pgprot() but, TBH, I'm not so keen on the outcome. I
haven't built the code to see what's generated but I suspect it's no
significant improvement. As for the code readability, the arm64 parts
don't look much better either. The only advantage with this patch is
that all the functions have been moved under arch/arm64.

I'd keep most architectures that don't have own arch_vm_get_page_prot()
or arch_filter_pgprot() unchanged and with a generic protection_map[]
array. For architectures that need fancier stuff, add a
CONFIG_ARCH_HAS_VM_GET_PAGE_PROT (as you do) and allow them to define
vm_get_page_prot() while getting rid of arch_vm_get_page_prot() and
arch_filter_pgprot(). I think you could also duplicate protection_map[]
for architectures with own vm_get_page_prot() (make it static) and
#ifdef it out in mm/mmap.c.
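
In mm/mmap.c that would look roughly like this (untested, using the
CONFIG name from this series):

#ifndef CONFIG_ARCH_HAS_VM_GET_PAGE_PROT
pgprot_t vm_get_page_prot(unsigned long vm_flags)
{
	return protection_map[vm_flags &
			      (VM_READ | VM_WRITE | VM_EXEC | VM_SHARED)];
}
EXPORT_SYMBOL(vm_get_page_prot);
#endif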

If later you have more complex needs or a switch statement generates
better code, go for it, but for this series I'd keep things simple, only
focus on getting rid of arch_vm_get_page_prot() and
arch_filter_pgprot().

If I grep'ed correctly, there are only 4 architectures that have own
arch_vm_get_page_prot() (arm64, powerpc, sparc, x86) and 2 that have own
arch_filter_pgprot() (arm64, x86). Try to only change these for the time
being, together with the other generic mm cleanups you have in this
series. I think there are a couple more that touch protection_map[]
(arm, m68k). You can leave the generic protection_map[] global if the
arch does not select ARCH_HAS_VM_GET_PAGE_PROT.

> +static pgprot_t arm64_arch_filter_pgprot(pgprot_t prot)
> +{
> + if (cpus_have_const_cap(ARM64_HAS_EPAN))
> + return prot;
> +
> + if (pgprot_val(prot) != pgprot_val(PAGE_EXECONLY))
> + return prot;
> +
> + return PAGE_READONLY_EXEC;
> +}
> +
> +static pgprot_t arm64_arch_vm_get_page_prot(unsigned long vm_flags)
> +{
> + pteval_t prot = 0;
> +
> + if (vm_flags & VM_ARM64_BTI)
> + prot |= PTE_GP;
> +
> + /*
> +  * There are two conditions required for returning a Normal Tagged
> +  * memory type: (1) the user requested it via PROT_MTE passed to
> +  * mmap() or mprotect() and (2) the corresponding vma supports MTE. We
> +  * register (1) as VM_MTE in the vma->vm_flags and (2) as
> +  * VM_MTE_ALLOWED. Note that the latter can only be set during the
> +  * mmap() call since mprotect() does not accept MAP_* flags.
> +  * Checking for VM_MTE only is sufficient since arch_validate_flags()
> +  * does not permit (VM_MTE & !VM_MTE_ALLOWED).
> +  */
> + if (vm_flags & VM_MTE)
> + prot |= PTE_ATTRINDX(MT_NORMAL_TAGGED);
> +
> + return __pgprot(prot);
> +}
> +
> +pgprot_t vm_get_page_prot(unsigned long vm_flags)
> +{
> + pgprot_t ret = __pgprot(pgprot_val(__vm_get_page_prot(vm_flags)) |
> + pgprot_val(arm64_arch_vm_get_page_prot(vm_flags)));
> +
> + return arm64_arch_filter_pgprot(ret);
> +}

If we kept the array, we can have everything in a single function
(untested and with my own comments for future changes):

pgprot_t vm_get_page_prot(unsigned long vm_flags)
{
pgprot_t prot = __pgprot(pgprot_val(protection_map[vm_flags &
(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]));

/*
 * We could get rid of this test if we updated protection_map[]
 * to turn exec-only into read-exec during boot.
 */
if (!cpus_have_const_cap(ARM64_HAS_EPAN) &&
pgprot_val(prot) == pgprot_val(PAGE_EXECONLY))
prot = PAGE_READONLY_EXEC;

if (vm_flags & VM_ARM64_BTI)
prot |= PTE_GP;

/*
 * We can get rid of the 

Re: [PATCH 1/3] arch: Export machine_restart() instances so they can be called from modules

2021-08-05 Thread Catalin Marinas
On Thu, Aug 05, 2021 at 08:50:30AM +0100, Lee Jones wrote:
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index b4bb67f17a2ca..cf89ce91d7145 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -212,6 +212,7 @@ void machine_restart(char *cmd)
>   printk("Reboot failed -- System halted\n");
>   while (1);
>  }
> +EXPORT_SYMBOL(machine_restart);

Should we make this EXPORT_SYMBOL_GPL? I suppose it's not for general
use by out-of-tree drivers, and it matches the other pm_power_off symbol
we export in this file.

Either way:

Acked-by: Catalin Marinas 



Re: [PATCH 2/3] trace: refactor TRACE_IRQFLAGS_SUPPORT in Kconfig

2021-08-02 Thread Catalin Marinas
On Sat, Jul 31, 2021 at 02:22:32PM +0900, Masahiro Yamada wrote:
> Make architectures select TRACE_IRQFLAGS_SUPPORT instead of
> having many defines.
> 
> Signed-off-by: Masahiro Yamada 

For arm64:

Acked-by: Catalin Marinas 



Re: [PATCH 0/6] mm: some config cleanups

2021-03-19 Thread Catalin Marinas
On Tue, Mar 09, 2021 at 02:03:04PM +0530, Anshuman Khandual wrote:
> This series contains config cleanup patches which reduce code duplication
> across platforms and also improves maintainability. There is no functional
> change intended with this series. This has been boot tested on arm64 but
> only build tested on some other platforms.
> 
> This applies on 5.12-rc2
> 
> Cc: x...@kernel.org
> Cc: linux-i...@vger.kernel.org
> Cc: linux-s...@vger.kernel.org
> Cc: linux-snps-arc@lists.infradead.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-m...@vger.kernel.org
> Cc: linux-par...@vger.kernel.org
> Cc: linuxppc-...@lists.ozlabs.org
> Cc: linux-ri...@lists.infradead.org
> Cc: linux...@vger.kernel.org
> Cc: linux-fsde...@vger.kernel.org
> Cc: linux...@kvack.org
> Cc: linux-ker...@vger.kernel.org
> 
> Anshuman Khandual (6):
>   mm: Generalize ARCH_HAS_CACHE_LINE_SIZE
>   mm: Generalize SYS_SUPPORTS_HUGETLBFS (rename as ARCH_SUPPORTS_HUGETLBFS)
>   mm: Generalize ARCH_ENABLE_MEMORY_[HOTPLUG|HOTREMOVE]
>   mm: Drop redundant ARCH_ENABLE_[HUGEPAGE|THP]_MIGRATION
>   mm: Drop redundant ARCH_ENABLE_SPLIT_PMD_PTLOCK
>   mm: Drop redundant HAVE_ARCH_TRANSPARENT_HUGEPAGE
> 
>  arch/arc/Kconfig   |  9 ++--
>  arch/arm/Kconfig   | 10 ++---
>  arch/arm64/Kconfig | 30 ++

For arm64:

Acked-by: Catalin Marinas 



Re: [PATCH v2 09/13] arm, arm64: move free_unused_memmap() to generic mm

2020-11-14 Thread Catalin Marinas
On Sun, Nov 01, 2020 at 07:04:50PM +0200, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> ARM and ARM64 free unused parts of the memory map just before the
> initialization of the page allocator. To allow holes in the memory map both
> architectures overload pfn_valid() and define HAVE_ARCH_PFN_VALID.
> 
> Allowing holes in the memory map for FLATMEM may be useful for small
> machines, such as ARC and m68k, and will enable those architectures to cease
> using DISCONTIGMEM and still support more than one memory bank.
> 
> Move the functions that free unused memory map to generic mm and enable
> them in case HAVE_ARCH_PFN_VALID=y.
> 
> Signed-off-by: Mike Rapoport 

For arm64:

Acked-by: Catalin Marinas 



Re: Flushing transparent hugepages

2020-08-28 Thread Catalin Marinas
On Tue, Aug 18, 2020 at 05:08:16PM +0100, Will Deacon wrote:
> On Tue, Aug 18, 2020 at 04:07:36PM +0100, Matthew Wilcox wrote:
> > For example, arm64 seems confused in this scenario:
> > 
> > void flush_dcache_page(struct page *page)
> > {
> > if (test_bit(PG_dcache_clean, &page->flags))
> > clear_bit(PG_dcache_clean, &page->flags);
> > }
> > 
> > ...
> > 
> > void __sync_icache_dcache(pte_t pte)
> > {
> > struct page *page = pte_page(pte);
> > 
> > if (!test_and_set_bit(PG_dcache_clean, &page->flags))
> > sync_icache_aliases(page_address(page), page_size(page));
> > }
> > 
> > So arm64 keeps track on a per-page basis which ones have been flushed.
> > page_size() will return PAGE_SIZE if called on a tail page or regular
> > page, but will return PAGE_SIZE << compound_order if called on a head
> > page.  So this will either over-flush, or it's missing the opportunity
> > to clear the bits on all the subpages which have now been flushed.
> 
> Hmm, that seems to go all the way back to 2014 as the result of a bug fix
> in 923b8f5044da ("arm64: mm: Make icache synchronisation logic huge page
> aware") which has a Reported-by Mark and a CC stable, suggesting something
> _was_ going wrong at the time :/ Was there a point where the tail pages
> could end up with PG_arch_1 uncleared on allocation?

In my experience, it's the other way around: you can end up with
PG_arch_1 cleared in a tail page when the head one was set (splitting
THP).

> > What would you _like_ to see?  Would you rather flush_dcache_page()
> > were called once for each subpage, or would you rather maintain
> > the page-needs-flushing state once per compound page?  We could also
> > introduce flush_dcache_thp() if some architectures would prefer it one
> > way and one the other, although that brings into question what to do
> > for hugetlbfs pages.
> 
> For arm64, we'd like to see PG_arch_1 preserved during huge page splitting
> [1], but there was a worry that it might break x86 and s390. It's also not
> clear to me that we can change __sync_icache_dcache() as it's called when
> we're installing the entry in the page-table, so why would it be called
> again for the tail pages?

Indeed, __sync_icache_dcache() is called from set_pte_at() on the head
page, though it could always iterate and flush the tail pages
individually (I think we could have done this in commit 923b8f5044da).
Currently I suspect it does some over-flushing if you use THP on
executable pages (it's a no-op on non-exec pages).
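
Something like the following would implement the per-subpage variant
mentioned above (an untested sketch, assuming the head page is passed
in as in the huge-page case):

void __sync_icache_dcache(pte_t pte)
{
	struct page *page = pte_page(pte);
	unsigned long i;

	/* walk each subpage so the tail pages' PG_dcache_clean stays
	 * accurate instead of flushing page_size() in one go */
	for (i = 0; i < compound_nr(page); i++, page++)
		if (!test_and_set_bit(PG_dcache_clean, &page->flags))
			sync_icache_aliases(page_address(page), PAGE_SIZE);
}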

With MTE (arm64 memory tagging) I'm introducing a PG_arch_2 flag and
losing this is more problematic as it can lead to clearing valid tags.
In the subsequent patch [2], mte_sync_tags() (also called from
set_pte_at()) checks the PG_arch_2 in each page of a compound one.

My preference would be to treat both PG_arch_1 and _2 similarly.

> [1] 
> https://lore.kernel.org/linux-arch/20200703153718.16973-8-catalin.mari...@arm.com/

[2] 
https://lore.kernel.org/linux-arch/20200703153718.16973-9-catalin.mari...@arm.com/

-- 
Catalin



Re: [PATCH v2 08/20] arm64: simplify detection of memory zone boundaries for UMA configs

2020-05-26 Thread Catalin Marinas
On Wed, Apr 29, 2020 at 03:11:14PM +0300, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> The free_area_init() function only requires the definition of the maximal
> PFN for each of the supported zones rather than the calculation of actual
> zone sizes and the sizes of the holes between the zones.
> 
> After removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP the free_area_init() is
> available to all architectures.
> 
> Using this function instead of free_area_init_node() simplifies the zone
> detection.
> 
> Signed-off-by: Mike Rapoport 

Acked-by: Catalin Marinas 

(BTW, none of my acks so far made it to the linux-arm-kernel list
because of the large number of people on cc)



Re: [PATCH v2 05/20] mm: use free_area_init() instead of free_area_init_nodes()

2020-05-26 Thread Catalin Marinas
On Wed, Apr 29, 2020 at 03:11:11PM +0300, Mike Rapoport wrote:
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index e42727e3568e..a650adb358ee 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -206,7 +206,7 @@ static void __init zone_sizes_init(unsigned long min, 
> unsigned long max)
>  #endif
>   max_zone_pfns[ZONE_NORMAL] = max;
>  
> - free_area_init_nodes(max_zone_pfns);
> + free_area_init(max_zone_pfns);
>  }

Acked-by: Catalin Marinas 



Re: [PATCH v2 03/20] mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP option

2020-05-26 Thread Catalin Marinas
On Wed, Apr 29, 2020 at 03:11:09PM +0300, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> The CONFIG_HAVE_MEMBLOCK_NODE_MAP is used to differentiate initialization
> of nodes and zones structures between the systems that have region to node
> mapping in memblock and those that don't.
> 
> Currently all the NUMA architectures enable this option and for the
> non-NUMA systems we can presume that all the memory belongs to node 0 and
> therefore the compile time configuration option is not required.
> 
> The remaining few architectures that use DISCONTIGMEM without NUMA are
> easily updated to use memblock_add_node() instead of memblock_add() and
> thus have proper correspondence of memblock regions to NUMA nodes.
> 
> Still, free_area_init_node() must have a backward compatible version
> because its semantics with and without CONFIG_HAVE_MEMBLOCK_NODE_MAP is
> different. Once all the architectures will use the new semantics, the
> entire compatibility layer can be dropped.
> 
> To avoid addition of extra run time memory to store node id for
> architectures that keep memblock but have only a single node, the node id
> field of the memblock_region is guarded by CONFIG_NEED_MULTIPLE_NODES and
> the corresponding accessors presume that in those cases it is always 0.
> 
> Signed-off-by: Mike Rapoport 
> ---
>  .../vm/numa-memblock/arch-support.txt |  34 --
>  arch/alpha/mm/numa.c  |   4 +-
>  arch/arm64/Kconfig|   1 -

For arm64:

Acked-by: Catalin Marinas 



Re: [PATCH V12] mm/debug: Add tests validating architecture page table helpers

2020-02-10 Thread Catalin Marinas
On Tue, Jan 28, 2020 at 06:57:53AM +0530, Anshuman Khandual wrote:
> This gets built and run when CONFIG_DEBUG_VM_PGTABLE is selected along with
> CONFIG_VM_DEBUG. Architectures willing to subscribe to this test also need to
> select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE, which for now is limited to x86 and
> arm64. Going forward, other architectures can enable this too after fixing
> build or runtime problems (if any) with their page table helpers.

It may be worth posting the next version to linux-arch to reach out to
other arch maintainers.

Also I've seen that you posted a v13 but it hasn't reached
linux-arm-kernel (likely held in moderation because of the large amount
of addresses cc'ed) and I don't normally follow LKML. I'm not cc'ed to
this patch either (which is fine as long as you post to a list that I
read).

Since I started the reply on v12 about a week ago, I'll follow up here.
When you post a v14, please trim the people on cc only to those strictly
necessary (e.g. arch maintainers, linux-mm, linux-arch and lkml).

> diff --git a/Documentation/features/debug/debug-vm-pgtable/arch-support.txt 
> b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt
> new file mode 100644
> index ..f3f8111edbe3
> --- /dev/null
> +++ b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt
> @@ -0,0 +1,35 @@
> +#
> +# Feature name:  debug-vm-pgtable
> +# Kconfig:   ARCH_HAS_DEBUG_VM_PGTABLE
> +# description:   arch supports pgtable tests for semantics compliance
> +#
> +---
> +| arch |status|
> +---
> +|   alpha: | TODO |
> +| arc: |  ok  |
> +| arm: | TODO |

I'm sure you can find some arm32 hardware around (or a VM) to give this
a try ;).

> diff --git a/arch/x86/include/asm/pgtable_64.h 
> b/arch/x86/include/asm/pgtable_64.h
> index 0b6c4042942a..fb0e76d254b3 100644
> --- a/arch/x86/include/asm/pgtable_64.h
> +++ b/arch/x86/include/asm/pgtable_64.h
[...]
> @@ -1197,6 +1197,7 @@ static noinline void __init kernel_init_freeable(void)
>   sched_init_smp();
>  
>   page_alloc_init_late();
> + debug_vm_pgtable();
>   /* Initialize page ext after all struct pages are initialized. */
>   page_ext_init();

I guess you could even make debug_vm_pgtable() an early_initcall(). I
don't have a strong opinion either way.

> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> new file mode 100644
> index ..0f37f32d15f1
> --- /dev/null
> +++ b/mm/debug_vm_pgtable.c
> @@ -0,0 +1,388 @@
[...]
> +/*
> + * Basic operations
> + *
> + * mkold(entry)  = An old and not a young entry
> + * mkyoung(entry)= A young and not an old entry
> + * mkdirty(entry)= A dirty and not a clean entry
> + * mkclean(entry)= A clean and not a dirty entry
> + * mkwrite(entry)= A write and not a write protected entry
> + * wrprotect(entry)  = A write protected and not a write entry
> + * pxx_bad(entry)= A mapped and non-table entry
> + * pxx_same(entry1, entry2)  = Both entries hold the exact same value
> + */
> +#define VMFLAGS  (VM_READ|VM_WRITE|VM_EXEC)
> +
> +/*
> + * On s390 platform, the lower 12 bits are used to identify given page table
> + * entry type and for other arch specific requirements. But these bits might
> + * affect the ability to clear entries with pxx_clear(). So while loading up
> + * the entries, skip all lower 12 bits in order to accommodate the s390
> + * platform. It does not affect any other platform.
> + */
> +#define RANDOM_ORVALUE   (0xfffffffffffff000UL)

I'd suggest you generate this mask with something like
GENMASK(BITS_PER_LONG - 1, PAGE_SHIFT).

> +#define RANDOM_NZVALUE   (0xff)
> +
> +static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
> +{
> + pte_t pte = pfn_pte(pfn, prot);
> +
> + WARN_ON(!pte_same(pte, pte));
> + WARN_ON(!pte_young(pte_mkyoung(pte)));
> + WARN_ON(!pte_dirty(pte_mkdirty(pte)));
> + WARN_ON(!pte_write(pte_mkwrite(pte)));
> + WARN_ON(pte_young(pte_mkold(pte)));
> + WARN_ON(pte_dirty(pte_mkclean(pte)));
> + WARN_ON(pte_write(pte_wrprotect(pte)));

Given that you start with rwx permissions set,
some of these ops would not have any effect. For example, on arm64 at
least, mkwrite clears a bit already cleared here. You could try with
multiple rwx combinations values (e.g. all set and all cleared) or maybe
something like below:

WARN_ON(!pte_write(pte_mkwrite(pte_wrprotect(pte))));

You could also try something like this:

WARN_ON(!pte_same(pte_wrprotect(pte), pte_wrprotect(pte_mkwrite(pte))));

though the above approach may not work for arm64 ptep_set_wrprotect() on
a dirty pte (if you extend these tests later).

> +}
> +
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
> +{
> + pmd_t pmd = pfn_pmd(pfn, prot);

Re: [PATCH V12] mm/debug: Add tests validating architecture page table helpers

2020-01-29 Thread Catalin Marinas
On Tue, Jan 28, 2020 at 02:07:10PM -0500, Qian Cai wrote:
> On Jan 28, 2020, at 12:47 PM, Catalin Marinas  wrote:
> > The primary goal here is not finding regressions but having clearly
> > defined semantics of the page table accessors across architectures. x86
> > and arm64 are a good starting point and other architectures will be
> > enabled as they are aligned to the same semantics.
> 
> This still does not answer the fundamental question. If this test is
> simply inefficient to find bugs,

Who said this is inefficient (other than you)?

> who wants to spend time to use it regularly? 

Arch maintainers, mm maintainers introducing new macros or assuming
certain new semantics of the existing macros.

> If this is just one off test that may get running once in a few years
> (when introducing a new arch), how does it justify the ongoing cost to
> maintain it?

You are really missing the point. It's not only for a new arch but
changes to existing arch code. And if the arch code churn in this area
is relatively small, I'd expect a similarly small cost of maintaining
this test.

If you only turn DEBUG_VM on once every few years, don't generalise this
to the rest of the kernel developers (as others pointed out, this test
is default y if DEBUG_VM).

Anyway, I think that's a pointless discussion, so not going to reply
further (unless you have technical content to add).

-- 
Catalin



Re: [PATCH V12] mm/debug: Add tests validating architecture page table helpers

2020-01-28 Thread Catalin Marinas
On Mon, Jan 27, 2020 at 09:11:53PM -0500, Qian Cai wrote:
> On Jan 27, 2020, at 8:28 PM, Anshuman Khandual  
> wrote:
> > This adds tests which will validate architecture page table helpers and
> > other accessors in their compliance with expected generic MM semantics.
> > This will help various architectures in validating changes to existing
> > page table helpers or addition of new ones.
[...]
> What’s the value of this block of new code? It only supports x86 and
> arm64 which are supposed to be good now. Did those tests ever find any
> regression or this is almost only useful for new architectures which
> only happened once in a few years?

The primary goal here is not finding regressions but having clearly
defined semantics of the page table accessors across architectures. x86
and arm64 are a good starting point and other architectures will be
enabled as they are aligned to the same semantics.

See for example this past discussion:

https://lore.kernel.org/linux-mm/20190628102003.ga56...@arrakis.emea.arm.com/

These tests should act as the 'contract' between the generic mm code and
the architecture port. Without clear semantics, some bugs may be a lot
subtler than a boot failure.

FTR, I fully support this patch (and I should get around to review it
properly; thanks for the reminder ;)).

-- 
Catalin



Re: add config symbols for arch_{setup,teardown}_dma_ops

2019-02-11 Thread Catalin Marinas
On Mon, Feb 11, 2019 at 02:21:56PM +0100, Christoph Hellwig wrote:
> Any chance to get a quick review on this small series?

For arm64:

Acked-by: Catalin Marinas 



Re: [PATCH v2 06/21] memblock: memblock_phys_alloc_try_nid(): don't panic

2019-01-25 Thread Catalin Marinas
On Mon, Jan 21, 2019 at 10:03:53AM +0200, Mike Rapoport wrote:
> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
> index ae34e3a..2c61ea4 100644
> --- a/arch/arm64/mm/numa.c
> +++ b/arch/arm64/mm/numa.c
> @@ -237,6 +237,10 @@ static void __init setup_node_data(int nid, u64 
> start_pfn, u64 end_pfn)
>   pr_info("Initmem setup node %d []\n", nid);
>  
>   nd_pa = memblock_phys_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid);
> + if (!nd_pa)
> + panic("Cannot allocate %zu bytes for node %d data\n",
> +   nd_size, nid);
> +
>   nd = __va(nd_pa);
>  
>   /* report and initialize */

Does it mean that memblock_phys_alloc_try_nid() never returns valid
physical memory starting at 0?

-- 
Catalin



Re: [REPOST PATCH v6 0/4] kgdb: Fix kgdb_roundup_cpus()

2018-12-10 Thread Catalin Marinas
Hi Doug,

On Fri, Dec 07, 2018 at 10:40:24AM -0800, Doug Anderson wrote:
> On Fri, Dec 7, 2018 at 9:42 AM Catalin Marinas  
> wrote:
> > On Tue, Dec 04, 2018 at 07:38:24PM -0800, Douglas Anderson wrote:
> > > Douglas Anderson (4):
> > >   kgdb: Remove irq flags from roundup
> > >   kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function()
> > >   kgdb: Don't round up a CPU that failed rounding up before
> > >   kdb: Don't back trace on a cpu that didn't round up
> >
> > FWIW, trying these on arm64 (ThunderX2) with CONFIG_KGDB_TESTS_ON_BOOT=y
> > on top of 4.20-rc5 doesn't boot. It looks like they leave interrupts
> > disabled when they shouldn't and it trips over the BUG at
> > mm/vmalloc.c:1380 (called via do_fork -> copy_process).
> >
> > Now, I don't think these patches make things worse on arm64 since prior
> > to them the kgdb boot tests on arm64 were stuck in a loop (RUN
> > singlestep).
> 
> Thanks for the report!  ...actually, I'd never tried CONFIG_KGDB_TESTS
> before.  ...so I tried them now:
> 
> A) chromeos-4.19 tree on qcom-sdm845 without this series: booted up OK
> B) chromeos-4.19 tree on qcom-sdm845 with this series: booted up OK
> C) v4.20-rc5-90-g30002dd008ed on rockchip-rk3399 (kevin) with this
> series: booted up OK
> 
> Example output from B) above:
> 
> localhost ~ # dmesg | grep kgdbts
> [2.139814] KGDB: Registered I/O driver kgdbts
> [2.144582] kgdbts:RUN plant and detach test
> [2.165333] kgdbts:RUN sw breakpoint test
> [2.172990] kgdbts:RUN bad memory access test
> [2.178640] kgdbts:RUN singlestep test 1000 iterations
> [2.187765] kgdbts:RUN singlestep [0/1000]
> [2.559596] kgdbts:RUN singlestep [100/1000]
> [2.931419] kgdbts:RUN singlestep [200/1000]
> [3.303474] kgdbts:RUN singlestep [300/1000]
> [3.675121] kgdbts:RUN singlestep [400/1000]
> [4.046867] kgdbts:RUN singlestep [500/1000]
> [4.418920] kgdbts:RUN singlestep [600/1000]
> [4.790824] kgdbts:RUN singlestep [700/1000]
> [5.162479] kgdbts:RUN singlestep [800/1000]
> [5.534103] kgdbts:RUN singlestep [900/1000]
> [5.902299] kgdbts:RUN do_fork for 100 breakpoints
> [8.463900] KGDB: Unregistered I/O driver kgdbts, debugger disabled
> 
> ...so I guess I'm a little confused.  Either I have a different config
> than you do or something is special about your machine?

I tried it now on a Juno board both as a host and a guest and boots
fine. It must be something that only triggers on ThunderX2. Ignore the
report for now, if I find anything interesting I'll let you know.

-- 
Catalin



Re: [REPOST PATCH v6 0/4] kgdb: Fix kgdb_roundup_cpus()

2018-12-07 Thread Catalin Marinas
On Tue, Dec 04, 2018 at 07:38:24PM -0800, Douglas Anderson wrote:
> Douglas Anderson (4):
>   kgdb: Remove irq flags from roundup
>   kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function()
>   kgdb: Don't round up a CPU that failed rounding up before
>   kdb: Don't back trace on a cpu that didn't round up

FWIW, trying these on arm64 (ThunderX2) with CONFIG_KGDB_TESTS_ON_BOOT=y
on top of 4.20-rc5 doesn't boot. It looks like they leave interrupts
disabled when they shouldn't and it trips over the BUG at
mm/vmalloc.c:1380 (called via do_fork -> copy_process).

Now, I don't think these patches make things worse on arm64 since prior
to them the kgdb boot tests on arm64 were stuck in a loop (RUN
singlestep).

-- 
Catalin



Re: [PATCH] arc: Implement arch-specific dma_map_ops.mmap

2016-11-02 Thread Catalin Marinas
On Wed, Nov 02, 2016 at 03:19:43PM +0300, Alexey Brodkin wrote:
> diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
> index 20afc65e22dc..034ec6a8a764 100644
> --- a/arch/arc/mm/dma.c
> +++ b/arch/arc/mm/dma.c
> @@ -105,6 +105,31 @@ static void arc_dma_free(struct device *dev, size_t 
> size, void *vaddr,
>   __free_pages(page, get_order(size));
>  }
>  
> +static int arc_dma_mmap(struct device *dev, struct vm_area_struct *vma,
> + void *cpu_addr, dma_addr_t dma_addr, size_t size,
> + unsigned long attrs)
> +{
> + unsigned long user_count = vma_pages(vma);
> + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> + unsigned long pfn = __phys_to_pfn(dma_addr);

I don't think that's correct in all situations. Better as (for arc):

unsigned long pfn = __phys_to_pfn(plat_dma_to_phys(dma_addr));

Other than that:

Reviewed-by: Catalin Marinas 



Re: [RFC] dma-mapping: fix dma_common_mmap() for ARC

2016-10-30 Thread Catalin Marinas
On Wed, Oct 26, 2016 at 10:22:44PM +0300, Alexey Brodkin wrote:
> >8---
> arc_dma_alloc()
>   ioremap_nocache() AKA ioremap()
> ioremap_prot()
>   get_vm_area() + ioremap_page_range() on obtained vaddr
> >8---
> 
> As a result we get TLB entry of the following kind:
> >8---
> vaddr = 0x7200_
> paddr = 0x8200_
> flags = _uncached_
> >8---
> 
> Kernel thinks frame buffer is located @ 0x7200_ and uses it
> perfectly fine.
> 
> But here comes a time for user-space application to request frame buffer
> to be mapped for it. That happens easily with the following call path:
> >8---
> fb_mmap()
>   drm_fb_cma_mmap()
> dma_mmap_writecombine() AKA dma_mmap_wc()
>   dma_mmap_attrs()
> dma_common_mmap() since we don't [yet] have dma_map_ops.mmap()
>   for ARC
> >8---
> 
> And in dma_common_mmap() we first calculate pfn of what we think is
> "physical page" and then do remap_pfn_range() to that "physical page".
> 
> Here we're getting to the interesting thing - how pfn is calculated.
> As of now this is done as simple as:
> >8---
> pfn = page_to_pfn(virt_to_page(cpu_addr));
> >8---

The virt_to_page() function here only works for addresses in the kernel
linear map. In your case, the DMA buffer is mapped out of the ioremap
space, so the cpu_addr passed in here yields an incorrect pfn
(as you've already noticed).

> Simplest fix for ARC is to use dma_addr instead because it matches the
> real physical memory address, and so the mapping we then get for
> user-space is this:
> >8---
> vaddr = 0x0200_
> paddr = 0x8200_
> flags = _uncached_
> >8---
> And it works perfectly fine.

But it breaks the other architectures where dma_addr is actually closer
to the phys_addr than the kernel linear map.

> diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
> index 8f8b68c80986..16307eed453f 100644
> --- a/drivers/base/dma-mapping.c
> +++ b/drivers/base/dma-mapping.c
> @@ -252,7 +252,7 @@ int dma_common_mmap(struct device *dev, struct 
> vm_area_struct *vma,
>  #if defined(CONFIG_MMU) && !defined(CONFIG_ARCH_NO_COHERENT_DMA_MMAP)
>   unsigned long user_count = vma_pages(vma);
>   unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> - unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
> + unsigned long pfn = page_to_pfn(virt_to_page(dma_addr));

As I said above, this is incorrect. I would suggest that you implement
an arc specific mmap operation. We do this for arm64 using
remap_pfn_range; see __swiotlb_mmap under arch/arm64/mm/dma-mapping.c
where the pfn is calculated using an arm64-specific dma_to_phys()
function.
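
An untested sketch of the shape this could take for arc, mirroring the
arm64 __swiotlb_mmap logic (and assuming plat_dma_to_phys() converts a
dma address back to a physical one, as suggested elsewhere in this
thread):

static int arc_dma_mmap(struct device *dev, struct vm_area_struct *vma,
			void *cpu_addr, dma_addr_t dma_addr, size_t size,
			unsigned long attrs)
{
	unsigned long user_count = vma_pages(vma);
	unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
	unsigned long pfn = __phys_to_pfn(plat_dma_to_phys(dma_addr));
	int ret = -ENXIO;

	/* refuse mappings that fall outside the allocated buffer */
	if (vma->vm_pgoff < count && user_count <= count - vma->vm_pgoff)
		ret = remap_pfn_range(vma, vma->vm_start,
				      pfn + vma->vm_pgoff,
				      user_count << PAGE_SHIFT,
				      vma->vm_page_prot);

	return ret;
}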

-- 
Catalin



Re: [patch V4 02/31] bitops: Include generic parity.h in some architectures' bitops.h

2016-05-11 Thread Catalin Marinas
On Wed, May 11, 2016 at 05:06:17PM +0800, zengzhao...@163.com wrote:
> From: Zhaoxiu Zeng 
> 
> Simply use the generic version.
> 
> Signed-off-by: Zhaoxiu Zeng 
> Acked-by: Hans-Christian Noren Egtvedt  [for avr32]
> ---
>  arch/arc/include/asm/bitops.h  | 1 +
>  arch/arm/include/asm/bitops.h  | 1 +
>  arch/arm64/include/asm/bitops.h| 1 +
>  arch/avr32/include/asm/bitops.h| 1 +
>  arch/c6x/include/asm/bitops.h  | 1 +
>  arch/cris/include/asm/bitops.h | 1 +
>  arch/frv/include/asm/bitops.h  | 1 +
>  arch/h8300/include/asm/bitops.h| 1 +
>  arch/hexagon/include/asm/bitops.h  | 1 +
>  arch/m32r/include/asm/bitops.h | 1 +
>  arch/m68k/include/asm/bitops.h | 1 +
>  arch/metag/include/asm/bitops.h| 1 +
>  arch/mn10300/include/asm/bitops.h  | 1 +
>  arch/openrisc/include/asm/bitops.h | 1 +
>  arch/parisc/include/asm/bitops.h   | 1 +
>  arch/s390/include/asm/bitops.h | 1 +
>  arch/sh/include/asm/bitops.h   | 1 +
>  arch/xtensa/include/asm/bitops.h   | 1 +

For arm64:

Acked-by: Catalin Marinas 
