from:"Arnd Bergmann"

Re: [PATCH 5/5] [RFC] mm: Remove MAP_UNINITIALIZED support

2024-09-26 Thread Arnd Bergmann

On Thu, Sep 26, 2024, at 08:46, David Hildenbrand wrote:
> On 25.09.24 23:06, Arnd Bergmann wrote:
>
> The first, uncontroversial step could indeed be to make 
> MAP_UNINITIALIZED a nop, but still leave the definitions in mman.h etc 
> around.
>
> This is the same we did with MAP_DENYWRITE. There might be some weird 
> user out there, and carelessly reusing the bit could result in trouble. 
> (people might argue that they are not using it with MAP_HUGETLB, so it 
> would work)
>
> Going forward and removing MAP_UNINITIALIZED is a bit more 
> controversial, but maybe there really isn't any other user around. 
> Software that is not getting recompiled cannot be really identified by 
> letting it rest in -next only.
>
> My take would be to leave MAP_UNINITIALIZED in the headers in some form 
> for documentation purposes.

I don't think there is much point in doing this in multiple
steps: either we want to break it at compile time or leave
it silently doing nothing. There is also very little
difference in practice because applications almost always
use sys/mman.h instead of linux/mman.h.

FWIW, the main user appears to be the uClibc and uclibc-ng
malloc() implementation for NOMMU targets:

https://git.uclibc.org/uClibc/commit/libc/stdlib/malloc/malloc.c?id=00673f93826bf1f

Both of these also define the constant themselves as 0x4000000
for all architectures.

There are a few others that I could find with Debian codesearch:

https://sources.debian.org/src/monado/21.0.0+git2905.e26a272c1~dfsg1-2/src/external/tracy/client/tracy_rpmalloc.cpp/?hl=890#L889
https://sources.debian.org/src/systemtap/5.1-4/testsuite/systemtap.syscall/mmap.c/?hl=224#L224
https://sources.debian.org/src/fuzzel/1.11.1+ds-1/shm.c/?hl=488#L488
https://sources.debian.org/src/notcurses/3.0.7+dfsg.1-1/src/lib/fbuf.h/?hl=35#L35
https://sources.debian.org/src/lmms/1.2.2+dfsg1-6/src/3rdparty/rpmalloc/rpmalloc/rpmalloc/rpmalloc.c/?hl=1753#L1753

All of these will fall back to not passing MAP_UNINITIALIZED
if it's not defined, which is what happens on glibc and musl.
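
A minimal sketch of that fallback pattern, for illustration only. The helper name alloc_scratch() is hypothetical and not taken from any of the projects linked above, each of which has its own variant:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* expose MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * glibc and musl do not define MAP_UNINITIALIZED in sys/mman.h, so portable
 * code defines it to 0, which makes OR-ing it into the flags a no-op.
 */
#ifndef MAP_UNINITIALIZED
#define MAP_UNINITIALIZED 0
#endif

/* Hypothetical helper: anonymous private mapping that need not be zeroed. */
static void *alloc_scratch(size_t len)
{
	void *p;

	assert(len > 0);	/* mmap() rejects zero-length mappings */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_UNINITIALIZED, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}
```

Code written this way keeps building whether the kernel header drops the macro or redefines it to zero; only code that uses the macro without the #ifndef guard would notice its removal at compile time.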

   Arnd

Re: [PATCH 1/5] asm-generic: cosmetic updates to uapi/asm/mman.h

2024-09-26 Thread Arnd Bergmann

On Thu, Sep 26, 2024, at 09:21, Helge Deller wrote:
> On 9/25/24 23:06, Arnd Bergmann wrote:

>> -/* not used by linux, but here to make sure we don't clash with OSF/1 defines */
>> -#define _MAP_HASSEMAPHORE 0x0200
>> -#define _MAP_INHERIT      0x0400
>> -#define _MAP_UNALIGNED    0x0800
>
> I suggest to keep ^^ those. It's useful information which isn't
> easily visible otherwise.

Fair enough. I removed them in order to bring the differences
between files to an absolute minimum, but since at the end
of the series the files only contain the map values, there is
no real harm in keeping them, and they may help.

>> -/* not used by linux, but here to make sure we don't clash with ABI defines */
>> -#define MAP_RENAME      0x020   /* Assign page to file */
>> -#define MAP_AUTOGROW    0x040   /* File may grow by writing */
>> -#define MAP_LOCAL       0x080   /* Copy on fork/sproc */
>> -#define MAP_AUTORSRV    0x100   /* Logical swap reserved on demand */
>
> same here. I think they should be preserved.

Right.

>>   /* 0x01 - 0x03 are defined in linux/mman.h */
>> -#define MAP_TYPE        0x00f   /* Mask for type of mapping */
>> -#define MAP_FIXED       0x010   /* Interpret addr exactly */
>> +#define MAP_TYPE        0x0f    /* Mask for type of mapping */
>> +#define MAP_FIXED       0x10    /* Interpret addr exactly */
>>
>> -/* not used by linux, but here to make sure we don't clash with ABI defines */
>> -#define MAP_RENAME      0x020   /* Assign page to file */
>> -#define MAP_AUTOGROW    0x040   /* File may grow by writing */
>> -#define MAP_LOCAL       0x080   /* Copy on fork/sproc */
>> -#define MAP_AUTORSRV    0x100   /* Logical swap reserved on demand */
>
> If xtensa had those, those should be kept as well IMHO.

The thing with xtensa is that the file was blindly copied from
mips, so I'm sure it never had these, but there may be value
in keeping the two files in sync anyway. The only difference
at the moment is MAP_UNINITIALIZED, which is potentially
used on xtensa-nommu.

Let's see if Max Filippov has an opinion on this, otherwise I'd
keep it the same as mips.

  Arnd

[PATCH 5/5] [RFC] mm: Remove MAP_UNINITIALIZED support

2024-09-25 Thread Arnd Bergmann

From: Arnd Bergmann 

MAP_UNINITIALIZED was added back in 2009 for NOMMU kernels, specifically
for blackfin, which is long gone. MAP_HUGE_SHIFT/MAP_HUGE_MASK were
added in 2012 for architectures supporting hugepages, which at the time
did not overlap with the ones supporting NOMMU.

Adding the macro under an #ifdef was obviously a mistake, which
Christoph Hellwig tried to address by making it unconditionally defined
to 0x4000000 as part of the series to support RISC-V NOMMU kernels. At
this point linux/mman.h contained two conflicting definitions for bit 26,
though the two are still mutually exclusive at runtime in all supported
configurations.

According to the commit 854e9ed09ded ("mm: support madvise(MADV_FREE)")
description, it was previously used internally by facebook, which
would have resulted in MAP_HUGE_1MB turning into MAP_HUGE_2MB
with MAP_UNINITIALIZED enabled, and every other page size implying
MAP_UNINITIALIZED. I assume there are no remaining out of tree users
on MMU-enabled kernels today.
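
The aliasing can be checked with plain arithmetic. The sketch below assumes the flag encodings from the uapi headers (MAP_HUGE_SHIFT of 26, a 6-bit size field, MAP_UNINITIALIZED at bit 26); the EX_ prefix and the two helper functions are made up for this illustration so that nothing clashes with the real headers:

```c
#include <assert.h>

/* Values as in the uapi headers; the EX_ prefix avoids clashing with them. */
#define EX_MAP_UNINITIALIZED	0x4000000u	/* bit 26 */
#define EX_MAP_HUGE_SHIFT	26
#define EX_MAP_HUGE_MASK	0x3fu
#define EX_MAP_HUGE_1MB		(20u << EX_MAP_HUGE_SHIFT)	/* log2(1MB) = 20 */
#define EX_MAP_HUGE_2MB		(21u << EX_MAP_HUGE_SHIFT)	/* log2(2MB) = 21 */

/* MAP_UNINITIALIZED is exactly the lowest bit of the huge page size field. */
static_assert(EX_MAP_UNINITIALIZED == (1u << EX_MAP_HUGE_SHIFT),
	      "MAP_UNINITIALIZED overlaps bit 0 of the MAP_HUGE_* field");

/* Asking for 1MB pages plus MAP_UNINITIALIZED encodes a 2MB request. */
static_assert((EX_MAP_HUGE_1MB | EX_MAP_UNINITIALIZED) == EX_MAP_HUGE_2MB,
	      "1MB | MAP_UNINITIALIZED aliases 2MB");

/* log2 of the huge page size encoded in a set of mmap() flags. */
static unsigned int huge_page_shift(unsigned int flags)
{
	return (flags >> EX_MAP_HUGE_SHIFT) & EX_MAP_HUGE_MASK;
}

/* Nonzero if the flags would also be read as carrying MAP_UNINITIALIZED. */
static int implies_uninitialized(unsigned int flags)
{
	return (flags & EX_MAP_UNINITIALIZED) != 0;
}
```

Any page size with an odd log2 (2MB here) has bit 26 set, which is the "every other page size" case described above.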

I do not see any sensible way to redefine the macros for the ABI in
a way that avoids breaking something. The only ideas so far are:

 - do nothing, try to document the bug, hope for the best

 - remove the kernel implementation and redefine MAP_UNINITIALIZED to
   zero in the header to silently turn it off for everyone. There are
   few NOMMU users left, and the ones that do use NOMMU usually turn
   off MMAP_ALLOW_UNINITIALIZED, as it still has the potential to cause
   bugs and even security issues on systems with a memory protection
   unit.

 - remove both the implementation and the macro to force a build
   failure for anyone trying to use the feature. This way we can
   see who complains and whether we need to put it back in some
   form or change the userspace sources to no longer pass the flag.

Implement the third option here for the sake of discussion.

Link: https://git.uclibc.org/uClibc/commit/libc/stdlib/malloc/malloc.c?id=00673f93826bf1f
Link: https://lore.kernel.org/lkml/20190610221621.10938-4-...@lst.de/
Link: https://lore.kernel.org/lkml/1352157848-29473-1-git-send-email-a...@firstfloor.org/
Link: https://lore.kernel.org/lkml/1448865583-2446-2-git-send-email-minc...@kernel.org/
Cc: Christoph Hellwig 
Cc: Damien Le Moal 
Cc: Alexandre Torgue 
Cc: linux-st...@st-md-mailman.stormreply.com
Cc: Greg Ungerer 
Cc: Vladimir Murzin 
Cc: Max Filippov 
Signed-off-by: Arnd Bergmann 
---
 Documentation/admin-guide/mm/nommu-mmap.rst | 10 ++
 arch/alpha/include/uapi/asm/mman.h  |  2 --
 arch/mips/include/uapi/asm/mman.h   |  2 --
 arch/parisc/include/uapi/asm/mman.h |  2 --
 arch/powerpc/include/uapi/asm/mman.h|  5 -
 arch/sh/configs/rsk7264_defconfig   |  1 -
 arch/sparc/include/uapi/asm/mman.h  |  3 ---
 arch/xtensa/include/uapi/asm/mman.h |  3 ---
 fs/binfmt_elf_fdpic.c   |  3 +--
 include/linux/mman.h|  4 
 include/uapi/asm-generic/mman.h |  4 
 mm/Kconfig  | 22 -
 mm/nommu.c  |  4 +---
 13 files changed, 4 insertions(+), 61 deletions(-)

diff --git a/Documentation/admin-guide/mm/nommu-mmap.rst 
b/Documentation/admin-guide/mm/nommu-mmap.rst
index 530fed08de2c..9434c2fa99ae 100644
--- a/Documentation/admin-guide/mm/nommu-mmap.rst
+++ b/Documentation/admin-guide/mm/nommu-mmap.rst
@@ -135,14 +135,8 @@ Further notes on no-MMU MMAP
  significant delays during a userspace malloc() as the C library does an
  anonymous mapping and the kernel then does a memset for the entire map.
 
- However, for memory that isn't required to be precleared - such as that
- returned by malloc() - mmap() can take a MAP_UNINITIALIZED flag to
- indicate to the kernel that it shouldn't bother clearing the memory before
- returning it.  Note that CONFIG_MMAP_ALLOW_UNINITIALIZED must be enabled
- to permit this, otherwise the flag will be ignored.
-
- uClibc uses this to speed up malloc(), and the ELF-FDPIC binfmt uses this
- to allocate the brk and stack region.
+ Previously, Linux also supported a MAP_UNINITIALIZED flag to allocate
+ memory without clearing it; this is no longer supported.
 
  (#) A list of all the private copy and anonymous mappings on the system is
  visible through /proc/maps in no-MMU mode.
diff --git a/arch/alpha/include/uapi/asm/mman.h 
b/arch/alpha/include/uapi/asm/mman.h
index fc8b74aa3f89..1099b17a4003 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -21,8 +21,6 @@
 /* MAP_SYNC not supported */
 #define MAP_FIXED_NOREPLACE    0x200000 /* MAP_FIXED which doesn't unmap underlying mapping */
 
-/* MAP_UNINITIALIZED not supported */
-
 /*
  * Flags for mlockall
  */
diff --git a/arch/mips/include/uapi/asm/mman.h 
b/arch/mips/include/uapi/asm/mman.h
index 6deb62db90de..9463c90712

[PATCH 4/5] asm-generic: use asm-generic/mman-common.h on parisc and alpha

2024-09-25 Thread Arnd Bergmann

From: Arnd Bergmann 

These two architectures each have their own set of MAP_* flags, like
powerpc, mips and others do. In addition, the msync() flags are also
different, here both define the same flags but in a different order.
Finally, alpha also has a custom MADV_DONTNEED flag for madvise.

Make the generic MADV_DONTNEED and MS_* definitions conditional on
them not already being defined and then include the common
header from both architectures, to remove the bulk of the contents.
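
The override mechanism can be sketched in a single file. The numeric values are the ones visible in the diffs in this thread; the EX_ prefix and the exact placement of the #ifndef guards are assumptions made for illustration, not a copy of the patch:

```c
#include <assert.h>

/* --- stand-in for arch/alpha/include/uapi/asm/mman.h --- */
#define EX_MS_ASYNC		1	/* sync memory asynchronously */
#define EX_MS_SYNC		2	/* generic value is 4 */
#define EX_MS_INVALIDATE	4	/* generic value is 2 */
#define EX_MADV_DONTNEED	6	/* generic value is 4 */

/* --- stand-in for include/uapi/asm-generic/mman-common.h --- */
#ifndef EX_MS_ASYNC		/* skipped: the architecture got there first */
#define EX_MS_ASYNC		1
#define EX_MS_INVALIDATE	2
#define EX_MS_SYNC		4
#endif

#ifndef EX_MADV_DONTNEED	/* skipped for the same reason */
#define EX_MADV_DONTNEED	4
#endif

#define EX_MADV_WILLNEED	3	/* unconditional, same everywhere */

/* The architecture-specific values survive inclusion of the common part. */
static_assert(EX_MADV_DONTNEED == 6, "arch MADV_DONTNEED wins");
static_assert(EX_MS_SYNC == 2 && EX_MS_INVALIDATE == 4, "arch MS_* order wins");
```

An architecture without its own definitions simply skips the first part and picks up the generic values.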

Signed-off-by: Arnd Bergmann 
---
 arch/alpha/include/uapi/asm/mman.h | 68 +++---
 arch/parisc/include/uapi/asm/mman.h| 66 +
 include/uapi/asm-generic/mman-common.h |  5 ++
 3 files changed, 13 insertions(+), 126 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/mman.h 
b/arch/alpha/include/uapi/asm/mman.h
index 1f1c03c047ce..fc8b74aa3f89 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -2,18 +2,6 @@
 #ifndef __ALPHA_MMAN_H__
 #define __ALPHA_MMAN_H__
 
-#define PROT_READ  0x1 /* page can be read */
-#define PROT_WRITE 0x2 /* page can be written */
-#define PROT_EXEC  0x4 /* page can be executed */
-#ifndef PROT_SEM /* different on mips and xtensa */
-#define PROT_SEM   0x8 /* page may be used for atomic ops */
-#endif
-/* 0x10   reserved for arch-specific use */
-/* 0x20   reserved for arch-specific use */
-#define PROT_NONE  0x0 /* page can not be accessed */
-#define PROT_GROWSDOWN 0x01000000  /* mprotect flag: extend change to start of growsdown vma */
-#define PROT_GROWSUP   0x02000000  /* mprotect flag: extend change to end of growsup vma */
-
 /* 0x01 - 0x03 are defined in linux/mman.h */
 #define MAP_TYPE   0x0f/* Mask for type of mapping (OSF/1 is 
_wrong_) */
 #define MAP_FIXED  0x100   /* Interpret addr exactly */
@@ -43,62 +31,18 @@
 #define MCL_ONFAULT32768   /* lock all pages that are faulted in */
 
 /*
- * Flags for mlock
- */
-#define MLOCK_ONFAULT  0x01/* Lock pages in range after they are 
faulted in, do not prefault */
-
-/*
- * Flags for msync
+ * Flags for msync, order is different from all others
  */
 #define MS_ASYNC   1   /* sync memory asynchronously */
 #define MS_SYNC2   /* synchronous memory sync */
 #define MS_INVALIDATE  4   /* invalidate the caches */
 
-#define MADV_NORMAL0   /* no further special treatment */
-#define MADV_RANDOM1   /* expect random page references */
-#define MADV_SEQUENTIAL2   /* expect sequential page 
references */
-#define MADV_WILLNEED  3   /* will need these pages */
-#define MADV_DONTNEED  6   /* don't need these pages */
+/*
+ * Flags for madvise, 1 through 3 are normal
+ */
 /* originally MADV_SPACEAVAIL 5 */
+#define MADV_DONTNEED  6   /* don't need these pages */
 
-/* common parameters: try to keep these consistent across architectures */
-#define MADV_FREE  8   /* free pages only if memory pressure */
-#define MADV_REMOVE9   /* remove these pages & resources */
-#define MADV_DONTFORK  10  /* don't inherit across fork */
-#define MADV_DOFORK11  /* do inherit across fork */
-
-#define MADV_MERGEABLE   12/* KSM may merge identical pages */
-#define MADV_UNMERGEABLE 13/* KSM may not merge identical pages */
-
-#define MADV_HUGEPAGE  14  /* Worth backing with hugepages */
-#define MADV_NOHUGEPAGE15  /* Not worth backing with 
hugepages */
-
-#define MADV_DONTDUMP   16 /* Explicity exclude from the core dump,
-  overrides the coredump filter bits */
-#define MADV_DODUMP17  /* Clear the MADV_DONTDUMP flag */
-
-#define MADV_WIPEONFORK 18 /* Zero memory on fork, child only */
-#define MADV_KEEPONFORK 19 /* Undo MADV_WIPEONFORK */
-
-#define MADV_COLD  20  /* deactivate these pages */
-#define MADV_PAGEOUT   21  /* reclaim these pages */
-
-#define MADV_POPULATE_READ 22  /* populate (prefault) page tables 
readable */
-#define MADV_POPULATE_WRITE23  /* populate (prefault) page tables 
writable */
-
-#define MADV_DONTNEED_LOCKED   24  /* like DONTNEED, but drop locked pages 
too */
-
-#define MADV_COLLAPSE  25  /* Synchronous hugepage collapse */
-
-#define MADV_HWPOISON  100 /* poison a page for testing */
-#define MADV_SOFT_OFFLINE 101  /* soft offline page for testing */
-
-/* compatibility flags */
-#define MAP_FILE   0
-
-#define PKEY_DISABLE_ACCESS0x1
-#define PKEY_DISABLE_WRITE 0x2
-#defi

[PATCH 3/5] asm-generic: use asm-generic/mman-common.h on mips and xtensa

2024-09-25 Thread Arnd Bergmann

From: Arnd Bergmann 

mips and xtensa have almost the same asm/mman.h, aside from an
unintentional difference in MAP_UNINITIALIZED that has no effect in
practice.

Now that the MAP_* flags are moved out of asm-generic/mman-common.h,
the only difference between its contents and the mips/xtensa version
is the PROT_SEM definition that is one bit off from the rest.

Make the generic PROT_SEM definition conditional on it already being
defined and then include that header from both architectures, to
remove the bulk of the contents.

Signed-off-by: Arnd Bergmann 
---
 arch/alpha/include/uapi/asm/mman.h |  2 +
 arch/mips/include/uapi/asm/mman.h  | 65 +
 arch/parisc/include/uapi/asm/mman.h|  3 ++
 arch/xtensa/include/uapi/asm/mman.h| 66 +-
 include/uapi/asm-generic/mman-common.h |  2 +
 5 files changed, 9 insertions(+), 129 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/mman.h 
b/arch/alpha/include/uapi/asm/mman.h
index 8946a13ce858..1f1c03c047ce 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -5,7 +5,9 @@
 #define PROT_READ  0x1 /* page can be read */
 #define PROT_WRITE 0x2 /* page can be written */
 #define PROT_EXEC  0x4 /* page can be executed */
+#ifndef PROT_SEM /* different on mips and xtensa */
 #define PROT_SEM   0x8 /* page may be used for atomic ops */
+#endif
 /* 0x10   reserved for arch-specific use */
 /* 0x20   reserved for arch-specific use */
 #define PROT_NONE  0x0 /* page can not be accessed */
diff --git a/arch/mips/include/uapi/asm/mman.h 
b/arch/mips/include/uapi/asm/mman.h
index 399937cefaa6..6deb62db90de 100644
--- a/arch/mips/include/uapi/asm/mman.h
+++ b/arch/mips/include/uapi/asm/mman.h
@@ -9,15 +9,8 @@
 #ifndef _ASM_MMAN_H
 #define _ASM_MMAN_H
 
-#define PROT_READ  0x1 /* page can be read */
-#define PROT_WRITE 0x2 /* page can be written */
-#define PROT_EXEC  0x4 /* page can be executed */
 /* 0x8reserved for PROT_EXEC_NOFLUSH */
 #define PROT_SEM   0x10/* page may be used for atomic ops */
-/* 0x20   reserved for arch-specific use */
-#define PROT_NONE  0x0 /* page can not be accessed */
-#define PROT_GROWSDOWN 0x01000000  /* mprotect flag: extend change to start of growsdown vma */
-#define PROT_GROWSUP   0x02000000  /* mprotect flag: extend change to end of growsup vma */
 
 /* 0x01 - 0x03 are defined in linux/mman.h */
 #define MAP_TYPE   0x0f/* Mask for type of mapping */
@@ -47,62 +40,6 @@
 #define MCL_FUTURE 2   /* lock all future mappings */
 #define MCL_ONFAULT4   /* lock all pages that are faulted in */
 
-/*
- * Flags for mlock
- */
-#define MLOCK_ONFAULT  0x01/* Lock pages in range after they are 
faulted in, do not prefault */
-
-/*
- * Flags for msync
- */
-#define MS_ASYNC   1   /* sync memory asynchronously */
-#define MS_INVALIDATE  2   /* invalidate the caches */
-#define MS_SYNC4   /* synchronous memory sync */
-
-#define MADV_NORMAL0   /* no further special treatment */
-#define MADV_RANDOM1   /* expect random page references */
-#define MADV_SEQUENTIAL2   /* expect sequential page 
references */
-#define MADV_WILLNEED  3   /* will need these pages */
-#define MADV_DONTNEED  4   /* don't need these pages */
-
-/* common parameters: try to keep these consistent across architectures */
-#define MADV_FREE  8   /* free pages only if memory pressure */
-#define MADV_REMOVE9   /* remove these pages & resources */
-#define MADV_DONTFORK  10  /* don't inherit across fork */
-#define MADV_DOFORK11  /* do inherit across fork */
-
-#define MADV_MERGEABLE   12/* KSM may merge identical pages */
-#define MADV_UNMERGEABLE 13/* KSM may not merge identical pages */
-
-#define MADV_HUGEPAGE  14  /* Worth backing with hugepages */
-#define MADV_NOHUGEPAGE15  /* Not worth backing with 
hugepages */
-
-#define MADV_DONTDUMP   16 /* Explicity exclude from the core dump,
-  overrides the coredump filter bits */
-#define MADV_DODUMP17  /* Clear the MADV_DONTDUMP flag */
-
-#define MADV_WIPEONFORK 18 /* Zero memory on fork, child only */
-#define MADV_KEEPONFORK 19 /* Undo MADV_WIPEONFORK */
-
-#define MADV_COLD  20  /* deactivate these pages */
-#define MADV_PAGEOUT   21  /* reclaim these pages */
-
-#define MADV_POPULATE_READ 22

[PATCH 2/5] asm-generic: move MAP_* flags from mman-common.h to mman.h

2024-09-25 Thread Arnd Bergmann

From: Arnd Bergmann 

powerpc and sparc include asm-generic/mman-common.h to get the MAP_* flags
0x008000 through 0x4000000, but those flags are all different on alpha,
mips, parisc and xtensa.

Add duplicate definitions for these along with the MAP_* flags for 0x100
through 0x4000 that are already different on powerpc and sparc, as a
preparation for actually sharing mman-common.h with all architectures.

Signed-off-by: Arnd Bergmann 
---
 arch/powerpc/include/uapi/asm/mman.h   | 16 
 arch/sparc/include/uapi/asm/mman.h | 15 +++
 include/uapi/asm-generic/mman-common.h | 16 
 include/uapi/asm-generic/mman.h| 21 +
 include/uapi/linux/mman.h  |  5 +
 5 files changed, 57 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/mman.h 
b/arch/powerpc/include/uapi/asm/mman.h
index c0c737215b00..d57b347c37fe 100644
--- a/arch/powerpc/include/uapi/asm/mman.h
+++ b/arch/powerpc/include/uapi/asm/mman.h
@@ -13,6 +13,11 @@
 
 #define PROT_SAO   0x10/* Strong Access Ordering */
 
+/* 0x01 - 0x03 are defined in linux/mman.h */
+#define MAP_TYPE   0x0f/* Mask for type of mapping */
+#define MAP_FIXED  0x10/* Interpret addr exactly */
+#define MAP_ANONYMOUS  0x20/* don't use a file */
+
 #define MAP_RENAME  MAP_ANONYMOUS   /* In SunOS terminology */
 #define MAP_NORESERVE   0x40/* don't reserve swap pages */
 #define MAP_LOCKED 0x80
@@ -21,6 +26,17 @@
 #define MAP_DENYWRITE  0x0800  /* ETXTBSY */
 #define MAP_EXECUTABLE 0x1000  /* mark it as an executable */
 
+#define MAP_POPULATE           0x008000        /* populate (prefault) pagetables */
+#define MAP_NONBLOCK           0x010000        /* do not block on IO */
+#define MAP_STACK              0x020000        /* give out an address that is best suited for process/thread stacks */
+#define MAP_HUGETLB            0x040000        /* create a huge page mapping */
+#define MAP_SYNC               0x080000 /* perform synchronous page faults for the mapping */
+#define MAP_FIXED_NOREPLACE    0x100000        /* MAP_FIXED which doesn't unmap underlying mapping */
+
+#define MAP_UNINITIALIZED 0x4000000    /* For anonymous mmap, memory could be
+                                        * uninitialized */
+
+
 
 #define MCL_CURRENT 0x2000  /* lock all currently mapped pages */
 #define MCL_FUTURE  0x4000  /* lock all additions to address space 
*/
diff --git a/arch/sparc/include/uapi/asm/mman.h 
b/arch/sparc/include/uapi/asm/mman.h
index cec9f4109687..afb86698cdb1 100644
--- a/arch/sparc/include/uapi/asm/mman.h
+++ b/arch/sparc/include/uapi/asm/mman.h
@@ -8,6 +8,11 @@
 
 #define PROT_ADI   0x10/* ADI enabled */
 
+/* 0x01 - 0x03 are defined in linux/mman.h */
+#define MAP_TYPE   0x0f/* Mask for type of mapping */
+#define MAP_FIXED  0x10/* Interpret addr exactly */
+#define MAP_ANONYMOUS  0x20/* don't use a file */
+
 #define MAP_RENAME  MAP_ANONYMOUS   /* In SunOS terminology */
 #define MAP_NORESERVE   0x40/* don't reserve swap pages */
 #define MAP_INHERIT 0x80/* SunOS doesn't do this, but... */
@@ -18,6 +23,16 @@
 #define MAP_DENYWRITE  0x0800  /* ETXTBSY */
 #define MAP_EXECUTABLE 0x1000  /* mark it as an executable */
 
+#define MAP_POPULATE           0x008000        /* populate (prefault) pagetables */
+#define MAP_NONBLOCK           0x010000        /* do not block on IO */
+#define MAP_STACK              0x020000        /* give out an address that is best suited for process/thread stacks */
+#define MAP_HUGETLB            0x040000        /* create a huge page mapping */
+#define MAP_SYNC               0x080000 /* perform synchronous page faults for the mapping */
+#define MAP_FIXED_NOREPLACE    0x100000        /* MAP_FIXED which doesn't unmap underlying mapping */
+
+#define MAP_UNINITIALIZED 0x4000000    /* For anonymous mmap, memory could be
+                                        * uninitialized */
+
 #define MCL_CURRENT 0x2000  /* lock all currently mapped pages */
 #define MCL_FUTURE  0x4000  /* lock all additions to address space 
*/
 #define MCL_ONFAULT0x8000  /* lock all pages that are faulted in */
diff --git a/include/uapi/asm-generic/mman-common.h 
b/include/uapi/asm-generic/mman-common.h
index 792ad5599d9c..8d66d2dabaa8 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -17,22 +17,6 @@
 #define PROT_GROWSDOWN 0x01000000  /* mprotect flag: extend change to start of growsdown vma */
 #define PROT_GROWSUP   0x02000000  /* mprotect flag: extend change to end of growsup vma */
 
-/* 0x01 - 0x03 are defined in linux/mman.h */
-#define MAP_TYPE   0x0f/* Mask for type of mapp

[PATCH 1/5] asm-generic: cosmetic updates to uapi/asm/mman.h

2024-09-25 Thread Arnd Bergmann

From: Arnd Bergmann 

All but four architectures use asm-generic/mman-common.h, and the
differences between these are mostly accidental. Rearrange them
slightly to make it possible to 'vimdiff' them to see the actual
relevant differences:

 - Move MADV_HWPOISON/MADV_SOFT_OFFLINE to the end of the list
   and ensure that all architectures include definitions

 - Use the exact same amount of whitespace and leading digits
   in each architecture

 - Synchronize comments, replacing historic defines that were
   never used with appropriate comments

 - explicitly point out MAP_SYNC and MAP_UNINITIALIZED as
   unsupported

Signed-off-by: Arnd Bergmann 
---
 arch/alpha/include/uapi/asm/mman.h | 53 ---
 arch/mips/include/uapi/asm/mman.h  | 72 --
 arch/parisc/include/uapi/asm/mman.h| 50 +++---
 arch/xtensa/include/uapi/asm/mman.h| 61 ++
 include/uapi/asm-generic/mman-common.h |  8 ++-
 5 files changed, 129 insertions(+), 115 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/mman.h 
b/arch/alpha/include/uapi/asm/mman.h
index 763929e814e9..8946a13ce858 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -6,6 +6,8 @@
 #define PROT_WRITE 0x2 /* page can be written */
 #define PROT_EXEC  0x4 /* page can be executed */
 #define PROT_SEM   0x8 /* page may be used for atomic ops */
+/* 0x10   reserved for arch-specific use */
+/* 0x20   reserved for arch-specific use */
 #define PROT_NONE  0x0 /* page can not be accessed */
 #define PROT_GROWSDOWN 0x01000000  /* mprotect flag: extend change to start of growsdown vma */
 #define PROT_GROWSUP   0x02000000  /* mprotect flag: extend change to end of growsup vma */
@@ -15,41 +17,49 @@
 #define MAP_FIXED  0x100   /* Interpret addr exactly */
 #define MAP_ANONYMOUS  0x10/* don't use a file */
 
-/* not used by linux, but here to make sure we don't clash with OSF/1 defines */
-#define _MAP_HASSEMAPHORE 0x0200
-#define _MAP_INHERIT   0x0400
-#define _MAP_UNALIGNED 0x0800
-
-/* These are linux-specific */
-#define MAP_GROWSDOWN  0x01000 /* stack-like segment */
-#define MAP_DENYWRITE  0x02000 /* ETXTBSY */
-#define MAP_EXECUTABLE 0x04000 /* mark it as an executable */
-#define MAP_LOCKED     0x08000 /* lock the mapping */
+/* 0x200 through 0x800 originally for OSF-1 compat */
+#define MAP_GROWSDOWN  0x1000  /* stack-like segment */
+#define MAP_DENYWRITE  0x2000  /* ETXTBSY */
+#define MAP_EXECUTABLE 0x4000  /* mark it as an executable */
+#define MAP_LOCKED     0x8000  /* pages are locked */
 #define MAP_NORESERVE  0x10000 /* don't check for reservations */
-#define MAP_POPULATE   0x20000 /* populate (prefault) pagetables */
-#define MAP_NONBLOCK   0x40000 /* do not block on IO */
-#define MAP_STACK      0x80000 /* give out an address that is best suited for process/thread stacks */
-#define MAP_HUGETLB    0x100000        /* create a huge page mapping */
+
+#define MAP_POPULATE   0x020000        /* populate (prefault) pagetables */
+#define MAP_NONBLOCK   0x040000        /* do not block on IO */
+#define MAP_STACK      0x080000        /* give out an address that is best suited for process/thread stacks */
+#define MAP_HUGETLB    0x100000        /* create a huge page mapping */
+/* MAP_SYNC not supported */
 #define MAP_FIXED_NOREPLACE    0x200000 /* MAP_FIXED which doesn't unmap underlying mapping */
 
-#define MS_ASYNC   1   /* sync memory asynchronously */
-#define MS_SYNC2   /* synchronous memory sync */
-#define MS_INVALIDATE  4   /* invalidate the caches */
+/* MAP_UNINITIALIZED not supported */
 
+/*
+ * Flags for mlockall
+ */
 #define MCL_CURRENT 8192   /* lock all currently mapped pages */
 #define MCL_FUTURE 16384   /* lock all additions to address space 
*/
 #define MCL_ONFAULT32768   /* lock all pages that are faulted in */
 
+/*
+ * Flags for mlock
+ */
 #define MLOCK_ONFAULT  0x01/* Lock pages in range after they are 
faulted in, do not prefault */
 
+/*
+ * Flags for msync
+ */
+#define MS_ASYNC   1   /* sync memory asynchronously */
+#define MS_SYNC2   /* synchronous memory sync */
+#define MS_INVALIDATE  4   /* invalidate the caches */
+
 #define MADV_NORMAL0   /* no further special treatment */
 #define MADV_RANDOM1   /* expect random page references */
 #define MADV_SEQUENTIAL2   /* expect sequential page 
references */
 #define MADV_WILLNEED  3   /* will need these pages */
-#d

[PATCH 0/5] asm-generic: clean up asm/mman.h

2024-09-25 Thread Arnd Bergmann

From: Arnd Bergmann 

While thinking about the changes to linux/mman.h in
https://lore.kernel.org/all/20240923141943.133551-1-vincenzo.frasc...@arm.com/
I ended up trying to clean up the duplicate definitions in order to
better see what's in there, and then I found a clash between two MAP_* flags.

Here is my current state, lightly tested. Please have a look at
the last patch in particular.

 Arnd

Arnd Bergmann (5):
  asm-generic: cosmetic updates to uapi/asm/mman.h
  asm-generic: move MAP_* flags from mman-common.h to mman.h
  asm-generic: use asm-generic/mman-common.h on mips and xtensa
  asm-generic: use asm-generic/mman-common.h on parisc and alpha
  [RFC] mm: Remove MAP_UNINITIALIZED support

 Documentation/admin-guide/mm/nommu-mmap.rst | 10 +--
 arch/alpha/include/uapi/asm/mman.h  | 93 ++-
 arch/mips/include/uapi/asm/mman.h   | 95 +++-
 arch/parisc/include/uapi/asm/mman.h | 79 -
 arch/powerpc/include/uapi/asm/mman.h| 11 +++
 arch/sh/configs/rsk7264_defconfig   |  1 -
 arch/sparc/include/uapi/asm/mman.h  | 12 +++
 arch/xtensa/include/uapi/asm/mman.h | 98 +++--
 fs/binfmt_elf_fdpic.c   |  3 +-
 include/linux/mman.h|  4 -
 include/uapi/asm-generic/mman-common.h  | 31 +++
 include/uapi/asm-generic/mman.h | 17 
 include/uapi/linux/mman.h   |  5 ++
 mm/Kconfig  | 22 -
 mm/nommu.c  |  4 +-
 15 files changed, 125 insertions(+), 360 deletions(-)

-- 
2.39.2

Cc: "Jason A. Donenfeld" 
Cc: Alexander Viro 
Cc: Alexandre Torgue 
Cc: Andreas Larsson 
Cc: Andrew Morton 
Cc: Ard Biesheuvel 
Cc: Christian Brauner 
Cc: Christoph Hellwig 
Cc: Christophe Leroy 
Cc: Damien Le Moal 
Cc: David Hildenbrand 
Cc: Greg Ungerer 
Cc: Helge Deller 
Cc: Kees Cook 
Cc: Liam R. Howlett  
Cc: Lorenzo Stoakes 
Cc: Matt Turner 
Cc: Max Filippov 
Cc: Michael Ellerman 
Cc: Michal Hocko 
Cc: Nicholas Piggin 
Cc: Richard Henderson 
Cc: Thomas Bogendoerfer 
Cc: Vladimir Murzin 
Cc: Vlastimil Babka 
Cc: linux-st...@st-md-mailman.stormreply.com
Cc: linux-ker...@vger.kernel.org
Cc: linux-m...@vger.kernel.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux...@kvack.org
Cc: linux-a...@vger.kernel.org

Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits

2024-09-11 Thread Arnd Bergmann

On Wed, Sep 11, 2024, at 00:45, Charlie Jenkins wrote:
> On Tue, Sep 10, 2024 at 03:08:14PM -0400, Liam R. Howlett wrote:
>
> I responded to Arnd in the other thread, but I am still not convinced
> that the solution that x86 and arm64 have selected is the best solution.
> The solution of defaulting to 47 bits does allow applications the
> ability to get addresses that are below 47 bits. However, due to
> differences across architectures it doesn't seem possible to have all
> architectures default to the same value. Additionally, this flag will be
> able to help users avoid potential bugs where a hint address is passed
> that causes upper bits of a VA to be used.
>
> The other issue I have with this is that if there is not a hint address
> specified to be greater than 47 bits on x86, then mmap() may return an
> address that is greater than 47-bits. The documentation in
> Documentation/arch/x86/x86_64/5level-paging.rst says:
>
> "If hint address set above 47-bit, but MAP_FIXED is not specified, we try
> to look for unmapped area by specified address. If it's already
> occupied, we look for unmapped area in *full* address space, rather than
> from 47-bit window."

This is also in the commit message of b569bab78d8d ("x86/mm: Prepare
to expose larger address space to userspace"), which introduced it.
However, I don't actually see the fallback to the full address space,
instead the actual behavior seems to be the same as arm64.

Am I missing something in the x86 implementation, or do we just
need to update the documentation?

  Arnd

Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits

2024-09-10 Thread Arnd Bergmann

On Mon, Sep 9, 2024, at 23:22, Charlie Jenkins wrote:
> On Fri, Sep 06, 2024 at 10:52:34AM +0100, Lorenzo Stoakes wrote:
>> On Fri, Sep 06, 2024 at 09:14:08AM GMT, Arnd Bergmann wrote:
>> The intent is to optionally be able to run a process that keeps higher bits
>> free for tagging and to be sure no memory mapping in the process will
>> clobber these (correct me if I'm wrong Charlie! :)
>> 
>> So you really wouldn't want this if you are using tagged pointers, you'd
>> want to be sure literally nothing touches the higher bits.

My understanding was that the purpose of the existing design
is to allow applications to ask for a high address without having
to resort to the complexity of MAP_FIXED.

In particular, I'm sure there is precedent for applications that
want both tagged pointers (for most mappings) and untagged pointers
(for large mappings). With a per-mm_struct or per-task_struct
setting you can't do that.

> Various architectures handle the hint address differently, but it
> appears that the only case across any architecture where an address
> above 47 bits will be returned is if the application had a hint address
> with a value greater than 47 bits and was using the MAP_FIXED flag.
> MAP_FIXED bypasses all other checks so I was assuming that it would be
> logical for MAP_FIXED to bypass this as well. If MAP_FIXED is not set,
> then the intent is for no hint address to cause a value greater than 47
> bits to be returned.

I don't think the MAP_FIXED case is that interesting here because
it has to work in both fixed and non-fixed mappings.

>> This would be more consistent vs. other arches.
>
> Yes riscv is an outlier here. The reason I am pushing for something like
> a flag to restrict the address space rather than setting it to be the
> default is it seems like if applications are relying on upper bits to be
> free, then they should be explicitly asking the kernel to keep them free
> rather than assuming them to be free.

Let's see what the other architectures do and then come up with
a way that fixes the pointer tagging case first on those that are
broken. We can see if there needs to be an extra flag after that.
Here is what I found:

- x86_64 uses DEFAULT_MAP_WINDOW of BIT(47), uses a 57 bit
  address space when an addr hint is passed.
- arm64 uses DEFAULT_MAP_WINDOW of BIT(47) or BIT(48), returns
  higher 52-bit addresses when either a hint is passed or
  CONFIG_EXPERT and CONFIG_ARM64_FORCE_52BIT is set (this
  is a debugging option)
- ppc64 uses a DEFAULT_MAP_WINDOW of BIT(47) or BIT(48),
  returns 52 bit address when an addr hint is passed
- riscv uses a DEFAULT_MAP_WINDOW of BIT(47) but only uses
  it for allocating the stack below, ignoring it for normal
  mappings
- s390 has no DEFAULT_MAP_WINDOW but tries to allocate in
  the current number of pgtable levels and only upgrades to
  the next level (31, 42, 53, 64 bits) if a hint is passed or
  the current level is exhausted.
- loongarch64 has no DEFAULT_MAP_WINDOW, and a default VA
  space of 47 bits (16K pages, 3 levels), but can support
  a 55 bit space (64K pages, 3 levels).
- sparc has no DEFAULT_MAP_WINDOW and up to 52 bit VA space.
  It may allocate both positive and negative addresses in
  there. (?)
- mips64, parisc64 and alpha have no DEFAULT_MAP_WINDOW and
  at most 48, 41 or 39 address bits, respectively.

I would suggest these changes:

- make riscv enforce DEFAULT_MAP_WINDOW like x86_64, arm64
   and ppc64, leave it at 47

- add DEFAULT_MAP_WINDOW on loongarch64 (47/48 bits
  based on page size), sparc (48 bits) and s390 (unsure if
  42, 53, 47 or 48 bits)

- leave the rest unchanged.
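
For readers who want to see the "hint opts in" behaviour from userspace, here is a minimal sketch. It assumes a 64-bit Linux target; the helper names fits_in_bits() and map_with_hint() are made up for illustration. Passing a hint without MAP_FIXED only asks the kernel to consider that address, so the returned pointer still has to be checked by the caller.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Nonzero if no bit at or above position 'bits' is set in the pointer.
 * Assumes a 64-bit uintptr_t and bits < 64. */
static int fits_in_bits(const void *p, unsigned int bits)
{
	return ((uintptr_t)p >> bits) == 0;
}

/*
 * Request an anonymous mapping near 'hint' without MAP_FIXED.
 * Whether the kernel honours the hint, and whether a hint above the
 * default window unlocks higher addresses, depends on the architecture
 * as listed above.
 */
static void *map_with_hint(uintptr_t hint, size_t len)
{
	return mmap((void *)hint, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
```

An application that stores tags in the upper pointer bits could call fits_in_bits(p, 47) on every mapping it gets back and fall back to an untagged path when the check fails.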

   Arnd

Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits

2024-09-06 Thread Arnd Bergmann

On Fri, Sep 6, 2024, at 09:14, Guo Ren wrote:
> On Fri, Sep 6, 2024 at 3:18 PM Arnd Bergmann  wrote:
>>
>> It's also unclear to me how we want this flag to interact with
>> the existing logic in arch_get_mmap_end(), which attempts to
>> limit the default mapping to a 47-bit address space already.
>
> To optimize RISC-V progress, I recommend:
>
> Step 1: Approve the patch.
> Step 2: Update Go and OpenJDK's RISC-V backend to utilize it.
> Step 3: Wait approximately several iterations for Go & OpenJDK
> Step 4: Remove the 47-bit constraint in arch_get_mmap_end()

I really want to first see a plausible explanation about why
RISC-V can't just implement this using a 47-bit DEFAULT_MAP_WINDOW
like all the other major architectures (x86, arm64, powerpc64),
e.g. something like the patch below (untested, probably slightly
wrong but should illustrate my point).

 Arnd

diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index 8702b8721a27..de9863be1efd 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -20,17 +20,8 @@
  * mmap_end < addr, being mmap_end the top of that address space.
  * See Documentation/arch/riscv/vm-layout.rst for more details.
  */
-#define arch_get_mmap_end(addr, len, flags)\
-({ \
-   unsigned long mmap_end; \
-   typeof(addr) _addr = (addr);\
-   if ((_addr) == 0 || is_compat_task() || \
-   ((_addr + len) > BIT(VA_BITS - 1))) \
-   mmap_end = STACK_TOP_MAX;   \
-   else\
-   mmap_end = (_addr + len);   \
-   mmap_end;   \
-})
+#define arch_get_mmap_end(addr, len, flags) \
+   (((addr) > DEFAULT_MAP_WINDOW) ? TASK_SIZE : DEFAULT_MAP_WINDOW)
 
 #define arch_get_mmap_base(addr, base) \
 ({ \
@@ -47,7 +38,7 @@
 })
 
 #ifdef CONFIG_64BIT
-#define DEFAULT_MAP_WINDOW (UL(1) << (MMAP_VA_BITS - 1))
+#define DEFAULT_MAP_WINDOW (is_compat_task() ? (UL(1) << (MMAP_VA_BITS - 1)) : TASK_SIZE_32)
 #define STACK_TOP_MAX  TASK_SIZE_64
 #else
 #define DEFAULT_MAP_WINDOW TASK_SIZE
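
The difference between the removed macro and its replacement can be modelled in plain C. This is only a sketch of the two selection rules as they appear in the hunk above, not kernel code: the SKETCH_* constants are illustrative assumptions (a 57-bit VA layout with a 47-bit default window), and the function names are invented.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values, not taken from any kernel header. */
#define SKETCH_VA_BITS       57
#define SKETCH_TOP           (UINT64_C(1) << (SKETCH_VA_BITS - 1))
#define SKETCH_STACK_TOP_MAX SKETCH_TOP
#define SKETCH_TASK_SIZE     SKETCH_TOP
#define SKETCH_MAP_WINDOW    (UINT64_C(1) << 47)

/* Shape of the removed macro: addr == 0 selects the maximum address space. */
static uint64_t old_mmap_end(uint64_t addr, uint64_t len, bool compat)
{
	if (addr == 0 || compat || addr + len > SKETCH_TOP)
		return SKETCH_STACK_TOP_MAX;
	return addr + len;
}

/* Shape of the replacement: only a hint above the window widens the limit. */
static uint64_t new_mmap_end(uint64_t addr)
{
	return (addr > SKETCH_MAP_WINDOW) ? SKETCH_TASK_SIZE : SKETCH_MAP_WINDOW;
}
```

With these rules, a call without a hint is bounded by the full address space in the old form and by the 47-bit window in the new form.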

Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits

2024-09-06 Thread Arnd Bergmann

On Fri, Sep 6, 2024, at 08:14, Lorenzo Stoakes wrote:
> On Fri, Sep 06, 2024 at 07:17:44AM GMT, Arnd Bergmann wrote:
>> On Thu, Sep 5, 2024, at 21:15, Charlie Jenkins wrote:
>> > Create a personality flag ADDR_LIMIT_47BIT to support applications
>> > that wish to transition from running in environments that support at
>> > most 47-bit VAs to environments that support larger VAs. This
>> > personality can be set to cause all allocations to be below the 47-bit
>> > boundary. Using MAP_FIXED with mmap() will bypass this restriction.
>> >
>> > Signed-off-by: Charlie Jenkins 
>>
>> I think having an architecture-independent mechanism to limit the size
>> of the 64-bit address space is useful in general, and we've discussed
>> the same thing for arm64 in the past, though we have not actually
>> reached an agreement on the ABI previously.
>
> The thread on the original proposals attests to this being rather a fraught
> topic, and I think the weight of opinion was more so in favour of opt-in
> rather than opt-out.

You mean opt-in to using the larger addresses like we do on arm64 and
powerpc, while "opt-out" means a limit as Charlie suggested?

>> > @@ -22,6 +22,7 @@ enum {
>> >WHOLE_SECONDS = 0x200,
>> >STICKY_TIMEOUTS =   0x400,
>> >ADDR_LIMIT_3GB =0x800,
>> > +  ADDR_LIMIT_47BIT =  0x1000,
>> > };
>>
>> I'm a bit worried about having this done specifically in the
>> personality flag bits, as they are rather limited. We obviously
>> don't want to add many more such flags when there could be
>> a way to just set the default limit.
>
> Since I'm the one who suggested it, I feel I should offer some kind of
> vague defence here :)
>
> We shouldn't let perfect be the enemy of the good. This is a relatively
> straightforward means of achieving the aim (assuming your concern about
> arch_get_mmap_end() below isn't a blocker) which has the least impact on
> existing code.
>
> Of course we can end up in absurdities where we start doing
> ADDR_LIMIT_xxBIT... but again - it's simple, shouldn't represent an
> egregious maintenance burden and is entirely opt-in so has things going for
> it.

I'm more confused now, I think most importantly we should try to
handle this consistently across all architectures. The proposed
implementation seems to completely block addresses above BIT(47)
even for applications that opt in by calling mmap(BIT(47), ...),
which seems to break the existing applications.

If we want this flag for RISC-V and also keep the behavior of
defaulting to >BIT(47) addresses for mmap(0, ...) how about
changing arch_get_mmap_end() to return the limit based on
ADDR_LIMIT_47BIT and then make this default to enabled on
arm64 and powerpc but disabled on riscv?

>> It's also unclear to me how we want this flag to interact with
>> the existing logic in arch_get_mmap_end(), which attempts to
>> limit the default mapping to a 47-bit address space already.
>
> How does ADDR_LIMIT_3GB presently interact with that?

That is x86 specific and only relevant to compat tasks, limiting
them to 3 instead of 4 GB. There is also ADDR_LIMIT_32BIT, which
on arm32 is always set in practice to allow 32-bit addressing 
as opposed to ARMv2 style 26-bit addressing (IIRC ARMv3 supported
both 26-bit and 32-bit addressing, while ARMv4 through ARMv7 are
32-bit only).
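
As a small userspace illustration of how the existing flags can be inspected, the sketch below uses personality(0xffffffff), which returns the current persona without changing it, and the ADDR_LIMIT_* constants from <sys/personality.h>. The helper names are invented here; the proposed ADDR_LIMIT_47BIT is deliberately not used because it is not part of any released header.

```c
#include <sys/personality.h>

/* Nonzero if 'flag' is set in the persona word. */
static int persona_has_flag(unsigned long persona, unsigned long flag)
{
	return (persona & flag) != 0;
}

/* Query the calling task's persona; 0xffffffff asks without modifying it.
 * Returns -1 on error, like personality() itself. */
static int current_persona(void)
{
	return personality(0xffffffff);
}
```

A caller would typically write persona_has_flag(current_persona(), ADDR_LIMIT_3GB) after checking that current_persona() did not return -1.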

  Arnd

Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits

2024-09-06 Thread Arnd Bergmann

On Thu, Sep 5, 2024, at 21:15, Charlie Jenkins wrote:
> Create a personality flag ADDR_LIMIT_47BIT to support applications
> that wish to transition from running in environments that support at
> most 47-bit VAs to environments that support larger VAs. This
> personality can be set to cause all allocations to be below the 47-bit
> boundary. Using MAP_FIXED with mmap() will bypass this restriction.
>
> Signed-off-by: Charlie Jenkins 

I think having an architecture-independent mechanism to limit the size
of the 64-bit address space is useful in general, and we've discussed
the same thing for arm64 in the past, though we have not actually
reached an agreement on the ABI previously.

> @@ -22,6 +22,7 @@ enum {
>   WHOLE_SECONDS = 0x200,
>   STICKY_TIMEOUTS =   0x400,
>   ADDR_LIMIT_3GB =0x800,
> + ADDR_LIMIT_47BIT =  0x1000,
> };

I'm a bit worried about having this done specifically in the
personality flag bits, as they are rather limited. We obviously
don't want to add many more such flags when there could be
a way to just set the default limit.

It's also unclear to me how we want this flag to interact with
the existing logic in arch_get_mmap_end(), which attempts to
limit the default mapping to a 47-bit address space already.

For some reason, it appears that the arch_get_mmap_end()
logic on RISC-V defaults to the maximum address
space for the 'addr==0' case which is inconsistent with
the other architectures, so we should probably fix that
part first, possibly moving more of that logic into a
shared implementation.

  Arnd

Re: [PATCH] soc: fsl: qe: ucc: Export ucc_mux_set_grant_tsa_bkpt

2024-09-05 Thread Arnd Bergmann

On Thu, Sep 5, 2024, at 07:31, Christophe Leroy wrote:
> Le 05/09/2024 à 09:22, Herve Codina a écrit :
>> When TSA is compiled as module the following error is reported:
>>"ucc_mux_set_grant_tsa_bkpt" [drivers/soc/fsl/qe/tsa.ko] undefined!
>> 
>> Indeed, the ucc_mux_set_grant_tsa_bkpt symbol is not exported.
>> 
>> Simply export ucc_mux_set_grant_tsa_bkpt.
>> 
>> Reported-by: kernel test robot 
>> Closes: 
>> https://lore.kernel.org/oe-kbuild-all/202409051409.fszn8reo-...@intel.com/
>> Signed-off-by: Herve Codina 
>
> Acked-by: Christophe Leroy 
>
> Arnd, is it ok for you to take this patch directly?

I've applied this one directly, but I'm not always paying attention
to patches flying by, so if you have more fixes like this in the future,
I recommend that you forward those to s...@kernel.org, either as a patch
or a pull request.

That way, I see them in patchwork and will apply them from there.

  Arnd

Re: [GIT PULL] SOC FSL for 6.12 (retry)

2024-09-03 Thread Arnd Bergmann

On Tue, Sep 3, 2024, at 06:36, Christophe Leroy wrote:
> Hi Arnd,
>
> Please pull the following Freescale Soc Drivers changes for 6.12
>
> There are no conflicts with latest linux-next tree.

Thanks, pulled now.

 Arnd

Re: [GIT PULL] SOC FSL for 6.12

2024-09-02 Thread Arnd Bergmann

On Wed, Aug 28, 2024, at 13:44, Christophe Leroy wrote:
> Hi Arnd,
>
> Please pull the following Freescale Soc Drivers changes for 6.12:
> - A series from Hervé Codina that bring support for the newer version of 
> QMC (QUICC Multi-channel Controller) and TSA (Time Slots Assigner) found 
> on MPC 83xx micro-controllers.
> - Misc changes for qbman freescale drivers
>
> There are no conflicts with latest linux-next tree.

Hi Christophe,

I've tried pulling this but ran into a few issues here, none of which
are related to the actual patches in your branch that look totally
fine to me:

> The following changes since commit 5be63fc19fcaa4c236b307420483578a56986a37:
>
>Linux 6.11-rc5 (2024-08-25 19:07:11 +1200)
>
> are available in the Git repository at:
>
>https://github.com/chleroy/linux.git tags/soc_fsl-6.12-1
>
> for you to fetch changes up to 1fe683bf6113da3cb694bc18ae655b2ee10ba393:
>
>Merge branch 'support-for-quicc-engine-tsa-and-qmc' (2024-08-25 
> 20:48:47 +0200)

- There is no tag description in here, which would give me an empty
  changelog text for the merge commit, or force me to summarize your
  contents myself. Please describe the contents of your branch in a couple
  of short paragraphs, in a way that helps me and future readers of
  the changelog understand what kind of work is being done. Don't
  repeat the oneline commit messages of the individual patches though,
  as they show up right under your summary anyway.

- You have not signed the tag, so there is no way for me to verify that
  you are actually the person that uploaded the branch. Ideally this
  should be signed with a gpg key that is on the kernel keyring, but
  even a brand new key is better than nothing because that way I can
  at least check that your next pull requests are signed by the same
  account as this one. Since you use a github.com account, this is
  even more important, as I can't easily see if you are the only person
  that is able to push to the github user 'chleroy'.
  Using a git tree on either git.kernel.org or your own domain would
  be ideal here, but github works if that is all you can easily do.

- My branch is based on 6.11-rc4, while your tag is on top of 6.11-rc5,
  so pulling it into my tree would require a backmerge that I try to
  avoid (it shows up when Linus pulls from me). Please rebase on
  an earlier -rc, ideally 6.11-rc1 unless you have a reason to need
  something later.

- Please add the linux-arm-kernel and powerpc mailing lists to cc
  for the pull request, so the PR gets properly archived. I saw this
  was missing because I could not apply it using

b4 pr dfafbd92-1e61-4e80-aa5c-2bfbe1def...@csgroup.eu

  This command failed as none of the mailing list archives have
  your message ID.

Please resend with all of the above changed.

   Arnd

Re: [PATCH v2 05/17] vdso: Avoid call to memset() by getrandom

2024-08-28 Thread Arnd Bergmann

On Wed, Aug 28, 2024, at 11:18, Jason A. Donenfeld wrote:
> On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
>> On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
>> > 
>> > Is there a compiler flag that could be used to disable the generation of 
>> > calls
>> > to memset?
>> 
>> -fno-tree-loop-distribute-patterns .  But, as always, read up on it, see
>> what it actually does (and how it avoids your problem, and mostly: learn
>> what the actual problem *was*!)
>
> This might help with various loops, but it doesn't help with the matter
> that this patch fixes, which is struct initialization. I just tried it
> with the arm64 patch to no avail.

Maybe -ffreestanding can help here? That should cause the vdso to be built
with the assumption that there is no libc, so it would neither add nor
remove standard library calls. Not sure if that causes other problems,
e.g. if the calling conventions are different.
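
To make the two cases in this subthread concrete, here is a hedged sketch of the code shapes involved. The struct and function names are hypothetical. Whether a given compiler actually emits a memset() call for either form depends on the compiler, its version, and the flags in use, so this shows the patterns only and makes no claim about generated code.

```c
#include <stddef.h>

/* Hypothetical parameter block with a trailing reserved area. */
struct sketch_params {
	unsigned int size_of_state;
	unsigned int prot;
	unsigned int flags;
	unsigned int reserved[13];
};

/* Struct initialization: the compiler may lower the implicit zeroing
 * of the remaining members to a memset() call. */
static void fill_by_initializer(struct sketch_params *p, unsigned int size)
{
	*p = (struct sketch_params){ .size_of_state = size };
}

/* Field-by-field stores. The zeroing loop is the kind of pattern that
 * loop distribution can also turn into memset(), which is what
 * -fno-tree-loop-distribute-patterns is meant to prevent. */
static void fill_by_fields(struct sketch_params *p, unsigned int size)
{
	size_t i;

	p->size_of_state = size;
	p->prot = 0;
	p->flags = 0;
	for (i = 0; i < sizeof(p->reserved) / sizeof(p->reserved[0]); i++)
		p->reserved[i] = 0;
}
```

Both functions leave the structure in the same state; they differ only in which compiler transformation might introduce a library call.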

   Arnd

Re: [PATCH] random: vDSO: Redefine PAGE_SIZE and PAGE_MASK

2024-08-27 Thread Arnd Bergmann

On Tue, Aug 27, 2024, at 10:40, Jason A. Donenfeld wrote:
> I don't love this, but it might be the lesser of evils, so sure, let's
> do it.
>
> I think I'll combine these header fixups so that the whole operation is
> a bit more clear. The commit is still pretty small. Something like
> below:
>
> From 0d9a3d68cd6222395a605abd0ac625c41d4cabfa Mon Sep 17 00:00:00 2001
> From: Christophe Leroy 
> Date: Tue, 27 Aug 2024 09:31:47 +0200
> Subject: [PATCH] random: vDSO: clean header inclusion in getrandom
>
> Depending on the architecture, building a 32-bit vDSO on a 64-bit kernel
> is problematic when some system headers are included.
>
> Minimise the amount of headers by moving needed items, such as
> __{get,put}_unaligned_t, into dedicated common headers and in general
> use more specific headers, similar to what was done in commit
> 8165b57bca21 ("linux/const.h: Extract common header for vDSO") and
> commit 8c59ab839f52 ("lib/vdso: Enable common headers").
>
> On some architectures this results in missing PAGE_SIZE, as was
> described by commit 8b3843ae3634 ("vdso/datapage: Quick fix - use
> asm/page-def.h for ARM64"), so define this if necessary, in the same way
> as done prior by commit cffaefd15a8f ("vdso: Use CONFIG_PAGE_SHIFT in
> vdso/datapage.h").
>
> Removing linux/time64.h leads to missing 'struct timespec64' in
> x86's asm/pvclock.h. Add a forward declaration of that struct in
> that file.
>
> Signed-off-by: Christophe Leroy 
> Signed-off-by: Jason A. Donenfeld 

This is clearly better, but there are still a couple of inaccuracies
that may end up biting us again later. Not sure whether it's worth
trying to fix it all at once or if we want to address them when that
happens:

>  #include 
> -#include 
> -#include 
> -#include 
> +#include 

These are still two headers outside of the vdso/ namespace. For arm64
we had concluded that this is never safe, and any vdso header should
only include other vdso headers so we never pull in anything that
e.g. depends on memory management headers that are in turn broken
for the compat vdso.

The array_size.h header is really small, so that one could
probably just be moved into the vdso/ namespace. The minmax.h
header is already rather complex, so it may be better to just
open-code the usage of MIN/MAX where needed?
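
Open-coding could be as small as the following sketch. The function name is made up, and it assumes both operands already have the same unsigned type, so none of the type checking done by the minmax.h macros is needed.

```c
#include <stddef.h>

/* Open-coded minimum for two size_t values; no header beyond stddef.h. */
static size_t sketch_min_size(size_t a, size_t b)
{
	return a < b ? a : b;
}
```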

>  #include 
>  #include 
> +#include 
>  #include 
> -#include 
> -#include 
>  #include 
> +#include 
> +
> +#undef PAGE_SIZE
> +#undef PAGE_MASK
> +#define PAGE_SIZE (1UL << CONFIG_PAGE_SHIFT)
> +#define PAGE_MASK (~(PAGE_SIZE - 1))

Since these are now the same across all architectures, maybe we
can just have the PAGE_SIZE definitions in a vdso header instead
and include that from asm/page.h.

Including uapi/linux/mman.h may still be problematic on
some architectures if they change it in a way that is
incompatible with compat vdso, but at least that can't
accidentally rely on CONFIG_64BIT or something else that
would be wrong there.
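
A shared definition derived purely from CONFIG_PAGE_SHIFT might look like the sketch below. The SKETCH_ prefix and the fallback value of 12 are assumptions added so the fragment compiles on its own; a real header would take CONFIG_PAGE_SHIFT from Kconfig and use the plain PAGE_SIZE and PAGE_MASK names.

```c
/* Assumption for this sketch only: default to 4 KiB pages when the
 * build system has not provided CONFIG_PAGE_SHIFT. */
#ifndef CONFIG_PAGE_SHIFT
#define CONFIG_PAGE_SHIFT 12
#endif

#define SKETCH_PAGE_SIZE (1UL << CONFIG_PAGE_SHIFT)
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Round an address down to the start of its page. */
static unsigned long sketch_page_align_down(unsigned long addr)
{
	return addr & SKETCH_PAGE_MASK;
}

/* Offset of an address within its page. */
static unsigned long sketch_page_offset(unsigned long addr)
{
	return addr & ~SKETCH_PAGE_MASK;
}
```

Nothing here depends on CONFIG_64BIT or on memory management headers, which is the property that matters for a compat vdso.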

 Arnd

Re: [PATCH v4 24/26] arch_numa: switch over to numa_memblks

2024-08-07 Thread Arnd Bergmann

On Wed, Aug 7, 2024, at 20:18, Mike Rapoport wrote:
> On Wed, Aug 07, 2024 at 08:58:37AM +0200, Arnd Bergmann wrote:
>> On Wed, Aug 7, 2024, at 08:41, Mike Rapoport wrote:
>> > 
>> >  void __init arch_numa_init(void);
>> >  int __init numa_add_memblk(int nodeid, u64 start, u64 end);
>> > -void __init numa_set_distance(int from, int to, int distance);
>> > -void __init numa_free_distance(void);
>> >  void __init early_map_cpu_to_node(unsigned int cpu, int nid);
>> >  int __init early_cpu_to_node(int cpu);
>> >  void numa_store_cpu_info(unsigned int cpu);
>> 
>> but is still declared as __init in the header, so it is
>> still put in that section and discarded after boot.
>
> I believe this should fix it

Yes, sorry I should have posted the patch as well, this is
what I tested with locally.

 Arnd

Re: [PATCH v3 0/8] PCI: Align small BARs

2024-08-07 Thread Arnd Bergmann

On Wed, Aug 7, 2024, at 17:17, Stewart Hildebrand wrote:
> In this context, "small" is defined as max(SZ_4K, PAGE_SIZE).
>
> This series sets the default minimum resource alignment to
> max(SZ_4K, PAGE_SIZE) for memory BARs. In preparation, it makes an
> optimization and addresses some corner cases observed when reallocating
> BARs. I consider the preparatory patches to be prerequisites to changing
> the default BAR alignment.

It's probably worth noting that Linux does not support any
architectures with software page sizes smaller than 4KB,
and it would likely break a lot of assumptions, so
max(SZ_4K, PAGE_SIZE) is really the same as PAGE_SIZE
in practice.
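
The arithmetic is simple enough to check directly. In this sketch the function name and the SKETCH_SZ_4K constant are illustrative; the page sizes to try are the common 4K, 16K and 64K configurations.

```c
#define SKETCH_SZ_4K 0x1000UL

/* max(SZ_4K, page_size), written out without any kernel macros. */
static unsigned long sketch_min_bar_align(unsigned long page_size)
{
	return page_size > SKETCH_SZ_4K ? page_size : SKETCH_SZ_4K;
}
```

For every page size of at least 4K the result equals the page size itself; only a hypothetical smaller page size would make the two differ.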

 Arnd

Re: [PATCH v4 24/26] arch_numa: switch over to numa_memblks

2024-08-07 Thread Arnd Bergmann

On Wed, Aug 7, 2024, at 08:41, Mike Rapoport wrote:
> From: "Mike Rapoport (Microsoft)" 
>
> Until now arch_numa was directly translating firmware NUMA information
> to memblock.

I get a link time warning from this:

WARNING: modpost: vmlinux: section mismatch in reference: numa_set_cpumask+0x24 (section: .text.unlikely) -> early_cpu_to_node (section: .init.text)

> @@ -142,7 +144,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int 
> nid)
>  unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
>  EXPORT_SYMBOL(__per_cpu_offset);
> 
> -int __init early_cpu_to_node(int cpu)
> +int early_cpu_to_node(int cpu)
>  {
>   return cpu_to_node_map[cpu];
>  }

early_cpu_to_node() can no longer be __init here

> +#endif /* CONFIG_NUMA_EMU */
> diff --git a/include/asm-generic/numa.h b/include/asm-generic/numa.h
> index c32e0cf23c90..c2b046d1fd82 100644
> --- a/include/asm-generic/numa.h
> +++ b/include/asm-generic/numa.h
> @@ -32,8 +32,6 @@ static inline const struct cpumask *cpumask_of_node(int 
> node)
> 
>  void __init arch_numa_init(void);
>  int __init numa_add_memblk(int nodeid, u64 start, u64 end);
> -void __init numa_set_distance(int from, int to, int distance);
> -void __init numa_free_distance(void);
>  void __init early_map_cpu_to_node(unsigned int cpu, int nid);
>  int __init early_cpu_to_node(int cpu);
>  void numa_store_cpu_info(unsigned int cpu);

but is still declared as __init in the header, so it is
still put in that section and discarded after boot.

I was confused by this at first, since the 'early' name
seems to imply that you shouldn't call it once the system
is up, but now you do.

 Arnd

Re: [PATCH] crypto: ppc/curve25519 - add missing MODULE_DESCRIPTION() macro

2024-08-02 Thread Arnd Bergmann

On Fri, Aug 2, 2024, at 16:27, Jeff Johnson wrote:
> On 8/2/2024 6:15 AM, Herbert Xu wrote:
>> On Thu, Jul 18, 2024 at 06:14:18PM -0700, Jeff Johnson wrote:
>>> Since commit 1fffe7a34c89 ("script: modpost: emit a warning when the
>>> description is missing"), a module without a MODULE_DESCRIPTION() will
>>> result in a warning with make W=1. The following warning is being
>>> observed when building ppc64le with CRYPTO_CURVE25519_PPC64=m:
>>>
>>> WARNING: modpost: missing MODULE_DESCRIPTION() in 
>>> arch/powerpc/crypto/curve25519-ppc64le.o
>>>
>>> Add the missing invocation of the MODULE_DESCRIPTION() macro.
>>>
>>> Signed-off-by: Jeff Johnson 
>>> ---
>>>  arch/powerpc/crypto/curve25519-ppc64le-core.c | 1 +
>>>  1 file changed, 1 insertion(+)
>> 
>> Patch applied.  Thanks.
>
> Great, that was the last of my MODULE_DESCRIPTION patches!!!
>
> There are a few more instances of the warning that Arnd has patches for,
> covering issues that appear in randconfigs that I didn't test.

Are all of your patches in linux-next now, or is there another
git tree that has them all?

I can send the ones I have left, but I want to avoid duplication.

Arnd

Re: Build regressions/improvements in v6.11-rc1

2024-07-29 Thread Arnd Bergmann

On Mon, Jul 29, 2024, at 11:35, Geert Uytterhoeven wrote:
>
>>  + /kisskb/src/kernel/fork.c: error: #warning clone3() entry point is 
>> missing, please fix [-Werror=cpp]:  => 3072:2
>
> sh4-gcc13/se{7619,7750}_defconfig
> sh4-gcc13/sh-all{mod,no,yes}config
> sh4-gcc13/sh-defconfig
> sparc64-gcc5/sparc-allnoconfig
> sparc64-gcc{5,13}/sparc32_defconfig
> sparc64-gcc{5,13}/sparc64-{allno,def}config
> sparc64-gcc13/sparc-all{mod,no}config
> sparc64-gcc13/sparc64-allmodconfig

Hexagon and NIOS2 as well, but this is expected. I really just
moved the warning into the actual implementation, the warning
is the same as before. hexagon and sh look like they should be
trivial, it's just that nobody seems to care. I'm sure the
patches were posted before and never applied.

sparc and nios2 do need some real work to write and test
the wrappers.

It does look like CONFIG_WERROR did not fail the build before
505d66d1abfb ("clone3: drop __ARCH_WANT_SYS_CLONE3 macro")
as it probably was intended.

  Arnd

Re: [PATCH 1/2] MAINTAINERS: Mark powerpc Cell as orphaned

2024-07-26 Thread Arnd Bergmann

On Fri, Jul 26, 2024, at 14:33, Michael Ellerman wrote:
> Arnd is no longer actively maintaining Cell, mark it as orphan.
>
> Also drop the dead developerworks link.
>
> Signed-off-by: Michael Ellerman 

Acked-by: Arnd Bergmann 

The platform contains two separate bits, so we need to
decide what to do with each one of them in the long run:

CONFIG_PPC_IBM_CELL_BLADE is clearly dead, they were sold
from 2006 to 2012 and were never that popular aside from a
handful of supercomputers that were all dismantled a
long time ago. Unless there is a user that wants to
keep maintaining these, we can probably remove all this
code soon, e.g. after the next LTS kernel.

CONFIG_SPU_FS is shared with the PS3 platform, which is
still used and maintained. The bit I don't know is how
common it is to actually still use spufs on the PS3.
Support for spu programs was removed in gcc-9.1 and
gdb-8.3, so none of the major distros even ship old
enough toolchains any more, but existing applications
and older toolchains should still work for people
that have them and want to run new kernels.

Geoff, are you using spufs on ps3, and if so, should
we move arch/powerpc/platforms/cell/spu* to the PS3
entry in the MAINTAINERS file? I don't think there
is any advantage in actually moving the files to
platforms/ps3 if we delete the cell blade support.

 Arnd

Re: [PATCH 2/2] MAINTAINERS: Mark powerpc spufs as orphaned

2024-07-26 Thread Arnd Bergmann

On Fri, Jul 26, 2024, at 14:33, Michael Ellerman wrote:
> Jeremy is no longer actively maintaining spufs, mark it as orphan.
>
> Also drop the dead developerworks link.
>
> Signed-off-by: Michael Ellerman 
> Acked-by: Jeremy Kerr 

Acked-by: Arnd Bergmann

Re: [PATCH] vmlinux.lds.h: catch .bss..L* sections into BSS")

2024-07-11 Thread Arnd Bergmann

On Fri, Jul 12, 2024, at 07:51, Christophe Leroy wrote:
> Commit 9a427556fb8e ("vmlinux.lds.h: catch compound literals into
> data and BSS") added catches for .data..L* and .rodata..L* but missed
> .bss..L*
>
> Since commit 5431fdd2c181 ("ptrace: Convert ptrace_attach() to use
> lock guards") the following appears at build:
>
>   LD  .tmp_vmlinux.kallsyms1
> powerpc64-linux-ld: warning: orphan section `.bss..Lubsan_data33' from 
> `kernel/ptrace.o' being placed in section `.bss..Lubsan_data33'
>   NM  .tmp_vmlinux.kallsyms1.syms
>   KSYMS   .tmp_vmlinux.kallsyms1.S
>   AS  .tmp_vmlinux.kallsyms1.S
>   LD  .tmp_vmlinux.kallsyms2
> powerpc64-linux-ld: warning: orphan section `.bss..Lubsan_data33' from 
> `kernel/ptrace.o' being placed in section `.bss..Lubsan_data33'
>   NM  .tmp_vmlinux.kallsyms2.syms
>   KSYMS   .tmp_vmlinux.kallsyms2.S
>   AS  .tmp_vmlinux.kallsyms2.S
>   LD  vmlinux
> powerpc64-linux-ld: warning: orphan section `.bss..Lubsan_data33' from 
> `kernel/ptrace.o' being placed in section `.bss..Lubsan_data33'
>
> Let's add .bss..L* to BSS_MAIN macro to catch those sections into BSS.
>
> Fixes: 9a427556fb8e ("vmlinux.lds.h: catch compound literals into data 
> and BSS")
> Signed-off-by: Christophe Leroy 
> Reported-by: kernel test robot 
> Closes: 
> https://lore.kernel.org/oe-kbuild-all/202404031349.nmkhyuug-...@intel.com/

Applied to the asm-generic tree, thanks!

 Arnd

Re: [PATCH 5/9] ARM: defconfig: convert to MTD_EEPROM_AT24

2024-07-10 Thread Arnd Bergmann

On Wed, Jul 10, 2024, at 14:59, Bartosz Golaszewski wrote:
> On Wed, Jul 10, 2024 at 2:49 PM Arnd Bergmann  wrote:
>>
>> On Mon, Jul 1, 2024, at 15:53, Marco Felsch wrote:
>> > The EEPROM_AT24 Kconfig symbol is marked as deprecated. Make use of the
>> > new Kconfig symbol to select the I2C EEPROM driver support.
>> >
>> > Signed-off-by: Marco Felsch 
>> > ---
>> >  arch/arm/configs/aspeed_g4_defconfig   | 2 +-
>> >  arch/arm/configs/aspeed_g5_defconfig   | 2 +-
>> >  arch/arm/configs/at91_dt_defconfig | 2 +-
>> >  arch/arm/configs/axm55xx_defconfig | 2 +-
>> >  arch/arm/configs/davinci_all_defconfig | 2 +-
>> >  arch/arm/configs/imx_v4_v5_defconfig   | 2 +-
>> >  arch/arm/configs/imx_v6_v7_defconfig   | 2 +-
>> >  arch/arm/configs/ixp4xx_defconfig  | 2 +-
>> >  arch/arm/configs/keystone_defconfig| 2 +-
>> >  arch/arm/configs/lpc18xx_defconfig | 2 +-
>>
>> Applied to soc/defconfig, thanks
>
> No! Why? This is still being discussed and it's not clear it will even
> make it upstream.

Ok, dropped again, thanks for catching this.

 Arnd

Re: [PATCH 5/9] ARM: defconfig: convert to MTD_EEPROM_AT24

2024-07-10 Thread Arnd Bergmann

On Mon, Jul 1, 2024, at 15:53, Marco Felsch wrote:
> The EEPROM_AT24 Kconfig symbol is marked as deprecated. Make use of the
> new Kconfig symbol to select the I2C EEPROM driver support.
>
> Signed-off-by: Marco Felsch 
> ---
>  arch/arm/configs/aspeed_g4_defconfig   | 2 +-
>  arch/arm/configs/aspeed_g5_defconfig   | 2 +-
>  arch/arm/configs/at91_dt_defconfig | 2 +-
>  arch/arm/configs/axm55xx_defconfig | 2 +-
>  arch/arm/configs/davinci_all_defconfig | 2 +-
>  arch/arm/configs/imx_v4_v5_defconfig   | 2 +-
>  arch/arm/configs/imx_v6_v7_defconfig   | 2 +-
>  arch/arm/configs/ixp4xx_defconfig  | 2 +-
>  arch/arm/configs/keystone_defconfig| 2 +-
>  arch/arm/configs/lpc18xx_defconfig | 2 +-

Applied to soc/defconfig, thanks

   Arnd

Re: [PATCH v2 06/13] parisc: use generic sys_fanotify_mark implementation

2024-06-29 Thread Arnd Bergmann

On Sat, Jun 29, 2024, at 19:46, Guenter Roeck wrote:

> Building parisc:allmodconfig ... failed
> --
> Error log:
> In file included from fs/notify/fanotify/fanotify_user.c:14:
> include/linux/syscalls.h:248:25: error: conflicting types for 
> 'sys_fanotify_mark'; have 'long int(int,  unsigned int,  u32,  u32,  
> int,  const char *)' {aka 'long int(int,  unsigned int,  unsigned int,  
> unsigned int,  int,  const char *)'}
>   248 | asmlinkage long 
> sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))   \
>   | ^~~
> include/linux/syscalls.h:234:9: note: in expansion of macro 
> '__SYSCALL_DEFINEx'
>   234 | __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>   | ^

Thanks for the report, this has escaped my build testing
since I had fanotify disabled on the parisc build.

Sent a fix now and queued it as a fix in the asm-generic
tree:

https://lore.kernel.org/lkml/20240629210359.94426-1-a...@kernel.org/T/#u

 Arnd

Re: powerpc: nvram_64.c:75:13: error: 'oops_to_nvram' used but never defined [-Werror]

2024-06-27 Thread Arnd Bergmann

On Thu, Jun 27, 2024, at 14:49, Naresh Kamboju wrote:
> The powerpc builds failed on Linux next-20240626 tag due to following 
> 
> arch/powerpc/kernel/nvram_64.c:79:17: error: initialization of 'void
> (*)(struct kmsg_dumper *, enum kmsg_dump_reason,  const char *)' from
> incompatible pointer type 'void (*)(struct kmsg_dumper *, enum
> kmsg_dump_reason)' [-Werror=incompatible-pointer-types]
>79 | .dump = oops_to_nvram
>   | ^
> arch/powerpc/kernel/nvram_64.c:79:17: note: (near initialization for
> 'nvram_kmsg_dumper.dump')
> arch/powerpc/kernel/nvram_64.c:645:13: error: conflicting types for
> 'oops_to_nvram'; have 'void(struct kmsg_dumper *, enum
> kmsg_dump_reason,  const char *)'
>   645 | static void oops_to_nvram(struct kmsg_dumper *dumper,
>   | ^
> arch/powerpc/kernel/nvram_64.c:75:13: note: previous declaration of
> 'oops_to_nvram' with type 'void(struct kmsg_dumper *, enum
> kmsg_dump_reason)'
>75 | static void oops_to_nvram(struct kmsg_dumper *dumper,
>   | ^
> arch/powerpc/kernel/nvram_64.c:75:13: error: 'oops_to_nvram' used but
> never defined [-Werror]
> arch/powerpc/kernel/nvram_64.c:645:13: error: 'oops_to_nvram' defined
> but not used [-Werror=unused-function]
>   645 | static void oops_to_nvram(struct kmsg_dumper *dumper,
>   | ^
> cc1: all warnings being treated as error

The problem is the forward declaration that was not changed
as part of commit 7e72bb7504d1 ("printk: add a short
description string to kmsg_dump()"). This should fix it:

diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
index e385d3164648..a9da83c4243a 100644
--- a/arch/powerpc/kernel/nvram_64.c
+++ b/arch/powerpc/kernel/nvram_64.c
@@ -73,7 +73,8 @@ static const char *nvram_os_partitions[] = {
 };
 
 static void oops_to_nvram(struct kmsg_dumper *dumper,
- enum kmsg_dump_reason reason);
+ enum kmsg_dump_reason reason,
+ const char *desc);
 
 static struct kmsg_dumper nvram_kmsg_dumper = {
.dump = oops_to_nvram
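
The same class of error can be reproduced outside the kernel. The sketch below uses invented sketch_* names and a simplified callback type; it shows the corrected pattern, where the forward declaration, the function pointer type, and the definition all list the same three parameters.

```c
#include <stddef.h>

enum sketch_reason { SKETCH_REASON_OOPS = 1 };

struct sketch_dumper;
typedef void (*sketch_dump_fn)(struct sketch_dumper *dumper,
			       enum sketch_reason reason,
			       const char *desc);

struct sketch_dumper {
	sketch_dump_fn dump;
	const char *last_desc;
};

/* Forward declaration: must match the definition below exactly.
 * Dropping 'desc' here yields the "conflicting types" and
 * "used but never defined" diagnostics quoted above. */
static void sketch_oops_dump(struct sketch_dumper *dumper,
			     enum sketch_reason reason,
			     const char *desc);

static struct sketch_dumper sketch_nvram_dumper = {
	.dump = sketch_oops_dump
};

static void sketch_oops_dump(struct sketch_dumper *dumper,
			     enum sketch_reason reason,
			     const char *desc)
{
	(void)reason;
	dumper->last_desc = desc;	/* remember the description string */
}
```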


  Arnd

[PATCH v2 09/13] csky, hexagon: fix broken sys_sync_file_range

2024-06-24 Thread Arnd Bergmann

From: Arnd Bergmann 

Both of these architectures require u64 function arguments to be
passed in even/odd pairs of registers or stack slots, which in case of
sync_file_range would result in a seven-argument system call that is
not currently possible. The system call is therefore incompatible with
all existing binaries.

While it would be possible to implement support for seven arguments
like on mips, it seems better to use a six-argument version, either
with the normal argument order but misaligned as on most architectures
or with the reordered sync_file_range2() calling conventions as on
arm and powerpc.
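
The argument counts mentioned above can be checked with a small userspace model. This is only a sketch of the even/odd pairing rule as described in this changelog, with hypothetical helper names, and not code from either architecture:

```c
#include <assert.h>
#include <stddef.h>

/* Width of one syscall argument, in 32-bit register slots. */
enum arg_width { ARG32 = 1, ARG64 = 2 };

/*
 * Count the slots consumed when every 64-bit argument has to start on
 * an even slot, as described above for csky and hexagon.
 */
static unsigned int count_slots(const enum arg_width *args, size_t nargs)
{
	unsigned int slot = 0;

	assert(args != NULL || nargs == 0);
	for (size_t i = 0; i < nargs; i++) {
		if (args[i] == ARG64 && (slot & 1))
			slot++;		/* skip the odd slot as padding */
		slot += args[i];
	}
	return slot;
}

/* sync_file_range(fd, offset, nbytes, flags) */
static const enum arg_width sync_file_range_args[] = {
	ARG32, ARG64, ARG64, ARG32,
};

/* sync_file_range2(fd, flags, offset, nbytes) */
static const enum arg_width sync_file_range2_args[] = {
	ARG32, ARG32, ARG64, ARG64,
};
```

With this model, the normal argument order needs seven slots because of the padding after 'fd', while the sync_file_range2() order fits into six.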

Cc: sta...@vger.kernel.org
Acked-by: Guo Ren 
Signed-off-by: Arnd Bergmann 
---
 arch/csky/include/uapi/asm/unistd.h| 1 +
 arch/hexagon/include/uapi/asm/unistd.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/csky/include/uapi/asm/unistd.h 
b/arch/csky/include/uapi/asm/unistd.h
index 7ff6a2466af1..e0594b6370a6 100644
--- a/arch/csky/include/uapi/asm/unistd.h
+++ b/arch/csky/include/uapi/asm/unistd.h
@@ -6,6 +6,7 @@
 #define __ARCH_WANT_SYS_CLONE3
 #define __ARCH_WANT_SET_GET_RLIMIT
 #define __ARCH_WANT_TIME32_SYSCALLS
+#define __ARCH_WANT_SYNC_FILE_RANGE2
 #include 
 
 #define __NR_set_thread_area   (__NR_arch_specific_syscall + 0)
diff --git a/arch/hexagon/include/uapi/asm/unistd.h 
b/arch/hexagon/include/uapi/asm/unistd.h
index 432c4db1b623..21ae22306b5d 100644
--- a/arch/hexagon/include/uapi/asm/unistd.h
+++ b/arch/hexagon/include/uapi/asm/unistd.h
@@ -36,5 +36,6 @@
 #define __ARCH_WANT_SYS_VFORK
 #define __ARCH_WANT_SYS_FORK
 #define __ARCH_WANT_TIME32_SYSCALLS
+#define __ARCH_WANT_SYNC_FILE_RANGE2
 
 #include 
-- 
2.39.2

[PATCH v2 08/13] sh: rework sync_file_range ABI

2024-06-24 Thread Arnd Bergmann

From: Arnd Bergmann 

The unusual function calling conventions on SuperH ended up causing
sync_file_range to have the wrong argument order, with the 'flags'
argument getting sorted before 'nbytes' by the compiler.

In userspace, I found that musl, glibc, uclibc and strace all expect the
normal calling conventions with 'nbytes' last, so changing the kernel
to match them should make all of those work.

In order to be able to also fix libc implementations to work with existing
kernels, they need to be able to tell which ABI is used. An easy way
to do this is to add yet another system call using the sync_file_range2
ABI that works the same on all architectures.

Old user binaries can now work on new kernels, and new binaries can
try the new sync_file_range2() to work with new kernels or fall back
to the old sync_file_range() version if that doesn't exist.

Cc: sta...@vger.kernel.org
Fixes: 75c92acdd5b1 ("sh: Wire up new syscalls.")
Signed-off-by: Arnd Bergmann 
---
 arch/sh/kernel/sys_sh32.c   | 11 +++
 arch/sh/kernel/syscalls/syscall.tbl |  3 ++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/sh/kernel/sys_sh32.c b/arch/sh/kernel/sys_sh32.c
index 9dca568509a5..d6f4afcb0e87 100644
--- a/arch/sh/kernel/sys_sh32.c
+++ b/arch/sh/kernel/sys_sh32.c
@@ -59,3 +59,14 @@ asmlinkage int sys_fadvise64_64_wrapper(int fd, u32 offset0, 
u32 offset1,
 (u64)len0 << 32 | len1, advice);
 #endif
 }
+
+/*
+ * swap the arguments the way that libc wants them instead of
+ * moving flags ahead of the 64-bit nbytes argument
+ */
+SYSCALL_DEFINE6(sh_sync_file_range6, int, fd, SC_ARG64(offset),
+SC_ARG64(nbytes), unsigned int, flags)
+{
+return ksys_sync_file_range(fd, SC_VAL64(loff_t, offset),
+SC_VAL64(loff_t, nbytes), flags);
+}
diff --git a/arch/sh/kernel/syscalls/syscall.tbl 
b/arch/sh/kernel/syscalls/syscall.tbl
index bbf83a2db986..c55fd7696d40 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -321,7 +321,7 @@
 311common  set_robust_list sys_set_robust_list
 312common  get_robust_list sys_get_robust_list
 313common  splice  sys_splice
-314common  sync_file_range sys_sync_file_range
+314common  sync_file_range sys_sh_sync_file_range6
 315common  tee sys_tee
 316common  vmsplicesys_vmsplice
 317common  move_pages  sys_move_pages
@@ -395,6 +395,7 @@
 385common  pkey_alloc  sys_pkey_alloc
 386common  pkey_free   sys_pkey_free
 387common  rseqsys_rseq
+388common  sync_file_range2sys_sync_file_range2
 # room for arch specific syscalls
 393common  semget  sys_semget
 394common  semctl  sys_semctl
-- 
2.39.2

[PATCH v2 07/13] powerpc: restore some missing spu syscalls

2024-06-24 Thread Arnd Bergmann

From: Arnd Bergmann 

A couple of system calls were inadvertently removed from the table during
a bugfix for 32-bit powerpc entry. Restore the original behavior.

Fixes: e23750623835 ("powerpc/32: fix syscall wrappers with 64-bit arguments of 
unaligned register-pairs")
Acked-by: Michael Ellerman 
Signed-off-by: Arnd Bergmann 
---
 arch/powerpc/kernel/syscalls/syscall.tbl | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl 
b/arch/powerpc/kernel/syscalls/syscall.tbl
index c6b0546b284d..ebae8415dfbb 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -230,8 +230,10 @@
 178nospu   rt_sigsuspend   sys_rt_sigsuspend   
compat_sys_rt_sigsuspend
 17932  pread64 sys_ppc_pread64 
compat_sys_ppc_pread64
 17964  pread64 sys_pread64
+179spu pread64 sys_pread64
 18032  pwrite64sys_ppc_pwrite64
compat_sys_ppc_pwrite64
 18064  pwrite64sys_pwrite64
+180spu pwrite64sys_pwrite64
 181common  chown   sys_chown
 182common  getcwd  sys_getcwd
 183common  capget  sys_capget
@@ -246,6 +248,7 @@
 190common  ugetrlimit  sys_getrlimit   
compat_sys_getrlimit
 19132  readahead   sys_ppc_readahead   
compat_sys_ppc_readahead
 19164  readahead   sys_readahead
+191spu readahead   sys_readahead
 19232  mmap2   sys_mmap2   
compat_sys_mmap2
 19332  truncate64  sys_ppc_truncate64  
compat_sys_ppc_truncate64
 19432  ftruncate64 sys_ppc_ftruncate64 
compat_sys_ppc_ftruncate64
@@ -293,6 +296,7 @@
 232nospu   set_tid_address sys_set_tid_address
 23332  fadvise64   sys_ppc32_fadvise64 
compat_sys_ppc32_fadvise64
 23364  fadvise64   sys_fadvise64
+233spu fadvise64   sys_fadvise64
 234nospu   exit_group  sys_exit_group
 235nospu   lookup_dcookie  sys_ni_syscall
 236common  epoll_createsys_epoll_create
-- 
2.39.2

[PATCH v2 06/13] parisc: use generic sys_fanotify_mark implementation

2024-06-24 Thread Arnd Bergmann

From: Arnd Bergmann 

The sys_fanotify_mark() syscall on parisc uses the reverse word order
for the two halves of the 64-bit argument compared to all syscalls on
all 32-bit architectures. As far as I can tell, the problem is that
the function arguments on parisc are sorted backwards (26, 25, 24, 23,
...) compared to everyone else, so the calling conventions of using an
even/odd register pair in native word order result in the lower word
coming first in function arguments, matching the expected behavior
on little-endian architectures. The system call conventions however
ended up matching what the other 32-bit architectures do.
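
To illustrate the two word orders: the first helper below mirrors the expression in the sys32_fanotify_mark() wrapper removed by this patch, while the second assumes that the generic compat path takes the high word first on a big-endian target. The helper names are made up for this sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Low word passed first, as the removed sys32_fanotify_mark() assumed. */
static uint64_t join_low_first(uint32_t word0, uint32_t word1)
{
	return ((uint64_t)word1 << 32) | word0;
}

/* High word passed first: assumed order for a big-endian 32-bit ABI. */
static uint64_t join_high_first(uint32_t word0, uint32_t word1)
{
	return ((uint64_t)word0 << 32) | word1;
}

/* The same two register values give different masks in each scheme. */
static void demo_word_order(void)
{
	assert(join_low_first(0x11111111u, 0x22222222u) ==
	       0x2222222211111111ULL);
	assert(join_high_first(0x11111111u, 0x22222222u) ==
	       0x1111111122222222ULL);
}
```

A caller and a kernel that disagree about the order end up with the two halves of the mask swapped, unless both halves happen to be equal.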

A glibc cleanup in 2020 changed the userspace behavior in a way that
handles all architectures consistently, but this inadvertently broke
parisc32 by changing to the same method as everyone else.

The change made it into glibc-2.35 and subsequently into Debian 12
(bookworm), which is the latest stable release. This means we
need to choose between reverting the glibc change or changing the
kernel to match it again, but either change will leave some systems
broken.

Pick the option that is more likely to help current and future
users and change the kernel to match current glibc. This also
means the behavior is now consistent across architectures, but
it breaks running new kernels with old glibc builds before 2.35.

Link: https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=d150181d73d9
Link: 
https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/arch/parisc/kernel/sys_parisc.c?h=57b1dfbd5b4a39d
Cc: Adhemerval Zanella 
Tested-by: Helge Deller 
Acked-by: Helge Deller 
Signed-off-by: Arnd Bergmann 
---
I found this through code inspection, please double-check to make
sure I got the bug and the fix right.

The alternative is to fix this by reverting glibc back to the
unusual behavior.
---
 arch/parisc/Kconfig | 1 +
 arch/parisc/kernel/sys_parisc32.c   | 9 -
 arch/parisc/kernel/syscalls/syscall.tbl | 2 +-
 3 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index daafeb20f993..dc9b902de8ea 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -16,6 +16,7 @@ config PARISC
select ARCH_HAS_UBSAN
select ARCH_HAS_PTE_SPECIAL
select ARCH_NO_SG_CHAIN
+   select ARCH_SPLIT_ARG64 if !64BIT
select ARCH_SUPPORTS_HUGETLBFS if PA20
select ARCH_SUPPORTS_MEMORY_FAILURE
select ARCH_STACKWALK
diff --git a/arch/parisc/kernel/sys_parisc32.c 
b/arch/parisc/kernel/sys_parisc32.c
index 2a12a547b447..826c8e51b585 100644
--- a/arch/parisc/kernel/sys_parisc32.c
+++ b/arch/parisc/kernel/sys_parisc32.c
@@ -23,12 +23,3 @@ asmlinkage long sys32_unimplemented(int r26, int r25, int 
r24, int r23,
current->comm, current->pid, r20);
 return -ENOSYS;
 }
-
-asmlinkage long sys32_fanotify_mark(compat_int_t fanotify_fd, compat_uint_t 
flags,
-   compat_uint_t mask0, compat_uint_t mask1, compat_int_t dfd,
-   const char  __user * pathname)
-{
-   return sys_fanotify_mark(fanotify_fd, flags,
-   ((__u64)mask1 << 32) | mask0,
-dfd, pathname);
-}
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl 
b/arch/parisc/kernel/syscalls/syscall.tbl
index 39e67fab7515..66dc406b12e4 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -364,7 +364,7 @@
 320common  accept4 sys_accept4
 321common  prlimit64   sys_prlimit64
 322common  fanotify_init   sys_fanotify_init
-323common  fanotify_mark   sys_fanotify_mark   
sys32_fanotify_mark
+323common  fanotify_mark   sys_fanotify_mark   
compat_sys_fanotify_mark
 32432  clock_adjtime   sys_clock_adjtime32
 32464  clock_adjtime   sys_clock_adjtime
 325common  name_to_handle_at   sys_name_to_handle_at
-- 
2.39.2

[PATCH v2 05/13] parisc: use correct compat recv/recvfrom syscalls

2024-06-24 Thread Arnd Bergmann

From: Arnd Bergmann 

Johannes missed parisc back when he introduced the compat version
of these syscalls, so receiving cmsg messages that require a compat
conversion is still broken.

Use the correct calls like the other architectures do.

Fixes: 1dacc76d0014 ("net/compat/wext: send different messages to compat tasks")
Acked-by: Helge Deller 
Signed-off-by: Arnd Bergmann 
---
 arch/parisc/kernel/syscalls/syscall.tbl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/parisc/kernel/syscalls/syscall.tbl 
b/arch/parisc/kernel/syscalls/syscall.tbl
index b13c21373974..39e67fab7515 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -108,7 +108,7 @@
 95 common  fchown  sys_fchown
 96 common  getpriority sys_getpriority
 97 common  setpriority sys_setpriority
-98 common  recvsys_recv
+98 common  recvsys_recv
compat_sys_recv
 99 common  statfs  sys_statfs  
compat_sys_statfs
 100common  fstatfs sys_fstatfs 
compat_sys_fstatfs
 101common  stat64  sys_stat64
@@ -135,7 +135,7 @@
 120common  clone   sys_clone_wrapper
 121common  setdomainname   sys_setdomainname
 122common  sendfilesys_sendfile
compat_sys_sendfile
-123common  recvfromsys_recvfrom
+123common  recvfromsys_recvfrom
compat_sys_recvfrom
 12432  adjtimexsys_adjtimex_time32
 12464  adjtimexsys_adjtimex
 125common  mprotectsys_mprotect
-- 
2.39.2

[PATCH v2 04/13] sparc: fix compat recv/recvfrom syscalls

2024-06-24 Thread Arnd Bergmann

From: Arnd Bergmann 

sparc has the wrong compat version of recv() and recvfrom() for both the
direct syscalls and socketcall().

The direct syscalls just need to use the compat version. For socketcall,
the same thing could be done, but it seems better to completely remove
the custom assembler code for it and just use the same implementation that
everyone else has.

Fixes: 1dacc76d0014 ("net/compat/wext: send different messages to compat tasks")
Signed-off-by: Arnd Bergmann 
---
 arch/sparc/kernel/sys32.S  | 221 -
 arch/sparc/kernel/syscalls/syscall.tbl |   4 +-
 2 files changed, 2 insertions(+), 223 deletions(-)

diff --git a/arch/sparc/kernel/sys32.S b/arch/sparc/kernel/sys32.S
index a45f0f31fe51..a3d308f2043e 100644
--- a/arch/sparc/kernel/sys32.S
+++ b/arch/sparc/kernel/sys32.S
@@ -18,224 +18,3 @@ sys32_mmap2:
sethi   %hi(sys_mmap), %g1
jmpl%g1 + %lo(sys_mmap), %g0
 sllx   %o5, 12, %o5
-
-   .align  32
-   .globl  sys32_socketcall
-sys32_socketcall:  /* %o0=call, %o1=args */
-   cmp %o0, 1
-   bl,pn   %xcc, do_einval
-cmp%o0, 18
-   bg,pn   %xcc, do_einval
-sub%o0, 1, %o0
-   sllx%o0, 5, %o0
-   sethi   %hi(__socketcall_table_begin), %g2
-   or  %g2, %lo(__socketcall_table_begin), %g2
-   jmpl%g2 + %o0, %g0
-nop
-do_einval:
-   retl
-mov-EINVAL, %o0
-
-   .align  32
-__socketcall_table_begin:
-
-   /* Each entry is exactly 32 bytes. */
-do_sys_socket: /* sys_socket(int, int, int) */
-1: ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_socket), %g1
-2: ldswa   [%o1 + 0x8] %asi, %o2
-   jmpl%g1 + %lo(sys_socket), %g0
-3:  ldswa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-   nop
-do_sys_bind: /* sys_bind(int fd, struct sockaddr *, int) */
-4: ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_bind), %g1
-5: ldswa   [%o1 + 0x8] %asi, %o2
-   jmpl%g1 + %lo(sys_bind), %g0
-6:  lduwa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-   nop
-do_sys_connect: /* sys_connect(int, struct sockaddr *, int) */
-7: ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_connect), %g1
-8: ldswa   [%o1 + 0x8] %asi, %o2
-   jmpl%g1 + %lo(sys_connect), %g0
-9:  lduwa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-   nop
-do_sys_listen: /* sys_listen(int, int) */
-10:ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_listen), %g1
-   jmpl%g1 + %lo(sys_listen), %g0
-11: ldswa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-   nop
-   nop
-do_sys_accept: /* sys_accept(int, struct sockaddr *, int *) */
-12:ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_accept), %g1
-13:lduwa   [%o1 + 0x8] %asi, %o2
-   jmpl%g1 + %lo(sys_accept), %g0
-14: lduwa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-   nop
-do_sys_getsockname: /* sys_getsockname(int, struct sockaddr *, int *) */
-15:ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_getsockname), %g1
-16:lduwa   [%o1 + 0x8] %asi, %o2
-   jmpl%g1 + %lo(sys_getsockname), %g0
-17: lduwa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-   nop
-do_sys_getpeername: /* sys_getpeername(int, struct sockaddr *, int *) */
-18:ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_getpeername), %g1
-19:lduwa   [%o1 + 0x8] %asi, %o2
-   jmpl%g1 + %lo(sys_getpeername), %g0
-20: lduwa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-   nop
-do_sys_socketpair: /* sys_socketpair(int, int, int, int *) */
-21:ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_socketpair), %g1
-22:ldswa   [%o1 + 0x8] %asi, %o2
-23:lduwa   [%o1 + 0xc] %asi, %o3
-   jmpl%g1 + %lo(sys_socketpair), %g0
-24: ldswa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-do_sys_send: /* sys_send(int, void *, size_t, unsigned int) */
-25:ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_send), %g1
-26:lduwa   [%o1 + 0x8] %asi, %o2
-27:lduwa   [%o1 + 0xc] %asi, %o3
-   jmpl%g1 + %lo(sys_send), %g0
-28: lduwa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-do_sys_recv: /* sys_recv(int, void *, size_t, unsigned int) */
-29:ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_recv), %g1
-30:lduwa   [%o1 + 0x8] %asi, %o2
-31:lduwa   [%o1 + 0xc] %asi, %o3
-   jmpl%g1 + %l

[PATCH v2 03/13] sparc: fix old compat_sys_select()

2024-06-24 Thread Arnd Bergmann

From: Arnd Bergmann 

sparc has two identical select syscalls at numbers 93 and 230, respectively.
During the conversion to the modern syscall.tbl format, the older one of the
two broke in compat mode, and now refers to the native 64-bit syscall.

Restore the correct behavior. This has very little effect, as glibc has
been using the newer number anyway.

Fixes: 6ff645dd683a ("sparc: add system call table generation support")
Signed-off-by: Arnd Bergmann 
---
 arch/sparc/kernel/syscalls/syscall.tbl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/sparc/kernel/syscalls/syscall.tbl 
b/arch/sparc/kernel/syscalls/syscall.tbl
index b354139b40be..5e55f73f9880 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -117,7 +117,7 @@
 90 common  dup2sys_dup2
 91 32  setfsuid32  sys_setfsuid
 92 common  fcntl   sys_fcntl   
compat_sys_fcntl
-93 common  select  sys_select
+93 common  select  sys_select  
compat_sys_select
 94 32  setfsgid32  sys_setfsgid
 95 common  fsync   sys_fsync
 96 common  setpriority sys_setpriority
-- 
2.39.2

[PATCH v2 02/13] syscalls: fix compat_sys_io_pgetevents_time64 usage

2024-06-24 Thread Arnd Bergmann

From: Arnd Bergmann 

Using sys_io_pgetevents() as the entry point for compat mode tasks
works almost correctly, but misses the sign extension for the min_nr
and nr arguments.

This was addressed on parisc by switching to
compat_sys_io_pgetevents_time64() in commit 6431e92fc827 ("parisc:
io_pgetevents_time64() needs compat syscall in 32-bit compat mode"),
as well as by using more sophisticated system call wrappers on x86 and
s390. However, arm64, mips, powerpc, sparc and riscv still have the
same bug.

Change all of them over to use compat_sys_io_pgetevents_time64()
like parisc already does. This was clearly the intention when the
function was originally added, but it got hooked up incorrectly in
the tables.

Cc: sta...@vger.kernel.org
Fixes: 48166e6ea47d ("y2038: add 64-bit time_t syscalls to all 32-bit 
architectures")
Acked-by: Heiko Carstens  # s390
Signed-off-by: Arnd Bergmann 
---
v2: fix kernel/sys_ni.c which was previously broken only on
parisc. found by kernel build bot.
---
 arch/arm64/include/asm/unistd32.h | 2 +-
 arch/mips/kernel/syscalls/syscall_n32.tbl | 2 +-
 arch/mips/kernel/syscalls/syscall_o32.tbl | 2 +-
 arch/powerpc/kernel/syscalls/syscall.tbl  | 2 +-
 arch/s390/kernel/syscalls/syscall.tbl | 2 +-
 arch/sparc/kernel/syscalls/syscall.tbl| 2 +-
 arch/x86/entry/syscalls/syscall_32.tbl| 2 +-
 include/uapi/asm-generic/unistd.h | 2 +-
 kernel/sys_ni.c   | 2 +-
 9 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/unistd32.h 
b/arch/arm64/include/asm/unistd32.h
index 266b96acc014..1386e8e751f2 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -840,7 +840,7 @@ __SYSCALL(__NR_pselect6_time64, compat_sys_pselect6_time64)
 #define __NR_ppoll_time64 414
 __SYSCALL(__NR_ppoll_time64, compat_sys_ppoll_time64)
 #define __NR_io_pgetevents_time64 416
-__SYSCALL(__NR_io_pgetevents_time64, sys_io_pgetevents)
+__SYSCALL(__NR_io_pgetevents_time64, compat_sys_io_pgetevents_time64)
 #define __NR_recvmmsg_time64 417
 __SYSCALL(__NR_recvmmsg_time64, compat_sys_recvmmsg_time64)
 #define __NR_mq_timedsend_time64 418
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl 
b/arch/mips/kernel/syscalls/syscall_n32.tbl
index cc869f5d5693..953f5b7dc723 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -354,7 +354,7 @@
 412n32 utimensat_time64sys_utimensat
 413n32 pselect6_time64 compat_sys_pselect6_time64
 414n32 ppoll_time64compat_sys_ppoll_time64
-416n32 io_pgetevents_time64sys_io_pgetevents
+416n32 io_pgetevents_time64compat_sys_io_pgetevents_time64
 417n32 recvmmsg_time64 compat_sys_recvmmsg_time64
 418n32 mq_timedsend_time64 sys_mq_timedsend
 419n32 mq_timedreceive_time64  sys_mq_timedreceive
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl 
b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 008ebe60263e..85751c9b9cdb 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -403,7 +403,7 @@
 412o32 utimensat_time64sys_utimensat   
sys_utimensat
 413o32 pselect6_time64 sys_pselect6
compat_sys_pselect6_time64
 414o32 ppoll_time64sys_ppoll   
compat_sys_ppoll_time64
-416o32 io_pgetevents_time64sys_io_pgetevents   
sys_io_pgetevents
+416o32 io_pgetevents_time64sys_io_pgetevents   
compat_sys_io_pgetevents_time64
 417o32 recvmmsg_time64 sys_recvmmsg
compat_sys_recvmmsg_time64
 418o32 mq_timedsend_time64 sys_mq_timedsend
sys_mq_timedsend
 419o32 mq_timedreceive_time64  sys_mq_timedreceive 
sys_mq_timedreceive
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl 
b/arch/powerpc/kernel/syscalls/syscall.tbl
index 3656f1ca7a21..c6b0546b284d 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -502,7 +502,7 @@
 41232  utimensat_time64sys_utimensat   
sys_utimensat
 41332  pselect6_time64 sys_pselect6
compat_sys_pselect6_time64
 41432  ppoll_time64sys_ppoll   
compat_sys_ppoll_time64
-41632  io_pgetevents_time64sys_io_pgetevents   
sys_io_pgetevents
+41632  io_pgetevents_time64sys_io_pgetevents   
compat_sys_io_pgetevents_time64
 41732  recvmmsg_time64 sys_recvmmsg
compat_sys_recvmmsg_time64
 41832

[PATCH v2 01/13] ftruncate: pass a signed offset

2024-06-24 Thread Arnd Bergmann

From: Arnd Bergmann 

The old ftruncate() syscall, using the 32-bit off_t, misses a sign
extension when called in compat mode on 64-bit architectures.  As a
result, passing a negative length accidentally succeeds in truncating
to a file size between 2GiB and 4GiB.

Changing the type of the compat syscall to the signed compat_off_t
changes the behavior so it instead returns -EINVAL.
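
The effect of the type change can be modelled in userspace. This is a sketch with hypothetical helper names; it assumes the usual two's-complement conversion for the cast to int32_t:

```c
#include <assert.h>
#include <stdint.h>

/* Old compat behaviour: the 32-bit length is treated as unsigned. */
static int64_t length_zero_extended(uint32_t raw)
{
	return (int64_t)raw;
}

/*
 * New compat behaviour: the 32-bit length is treated as signed.
 * The conversion to int32_t wraps on two's-complement targets.
 */
static int64_t length_sign_extended(uint32_t raw)
{
	return (int64_t)(int32_t)raw;
}

/* A 32-bit -1 is either just under 4GiB or a negative length. */
static void demo_extension(void)
{
	assert(length_zero_extended(0xFFFFFFFFu) == 4294967295LL);
	assert(length_sign_extended(0xFFFFFFFFu) == -1);
}
```

Every raw value with the top bit set lands between 2GiB and 4GiB when zero-extended, and below zero when sign-extended, which is the case that now results in -EINVAL.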

The native entry point, the truncate() syscall and the corresponding
loff_t based variants are all correct already and do not suffer
from this mistake.

Fixes: 3f6d078d4acc ("fix compat truncate/ftruncate")
Reviewed-by: Christian Brauner 
Cc: sta...@vger.kernel.org
Signed-off-by: Arnd Bergmann 
---
 fs/open.c| 4 ++--
 include/linux/compat.h   | 2 +-
 include/linux/syscalls.h | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index 89cafb572061..50e45bc7c4d8 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -202,13 +202,13 @@ long do_sys_ftruncate(unsigned int fd, loff_t length, int 
small)
return error;
 }
 
-SYSCALL_DEFINE2(ftruncate, unsigned int, fd, unsigned long, length)
+SYSCALL_DEFINE2(ftruncate, unsigned int, fd, off_t, length)
 {
return do_sys_ftruncate(fd, length, 1);
 }
 
 #ifdef CONFIG_COMPAT
-COMPAT_SYSCALL_DEFINE2(ftruncate, unsigned int, fd, compat_ulong_t, length)
+COMPAT_SYSCALL_DEFINE2(ftruncate, unsigned int, fd, compat_off_t, length)
 {
return do_sys_ftruncate(fd, length, 1);
 }
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 233f61ec8afc..56cebaff0c91 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -608,7 +608,7 @@ asmlinkage long compat_sys_fstatfs(unsigned int fd,
 asmlinkage long compat_sys_fstatfs64(unsigned int fd, compat_size_t sz,
 struct compat_statfs64 __user *buf);
 asmlinkage long compat_sys_truncate(const char __user *, compat_off_t);
-asmlinkage long compat_sys_ftruncate(unsigned int, compat_ulong_t);
+asmlinkage long compat_sys_ftruncate(unsigned int, compat_off_t);
 /* No generic prototype for truncate64, ftruncate64, fallocate */
 asmlinkage long compat_sys_openat(int dfd, const char __user *filename,
  int flags, umode_t mode);
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 9104952d323d..ba9337709878 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -418,7 +418,7 @@ asmlinkage long sys_listmount(const struct mnt_id_req 
__user *req,
  u64 __user *mnt_ids, size_t nr_mnt_ids,
  unsigned int flags);
 asmlinkage long sys_truncate(const char __user *path, long length);
-asmlinkage long sys_ftruncate(unsigned int fd, unsigned long length);
+asmlinkage long sys_ftruncate(unsigned int fd, off_t length);
 #if BITS_PER_LONG == 32
 asmlinkage long sys_truncate64(const char __user *path, loff_t length);
 asmlinkage long sys_ftruncate64(unsigned int fd, loff_t length);
-- 
2.39.2

[PATCH v2 00/13] linux system call fixes

2024-06-24 Thread Arnd Bergmann

From: Arnd Bergmann 

This is a minor update to v1 of this series. If there are no new
concerns, I would like to send this as a pull request for v6.10-rc6,
which is a little late, but these are all bug fixes. Changes since
v1 are:

 - collect acks
 - minor fixes to the changelog text
 - drop mips patch that was already merged
 - drop the time32 patch that caused build failures
 - fix a kernel/sys_ni.c stub bug that was exposed by
   the compat_sys_io_pgetevents_time64 change

 Arnd
 
--- 
Original series description:

I'm working on a cleanup series for Linux system call handling, trying to
unify some of the architecture specific code there among other things.

In the process, I came across a number of bugs that are ABI relevant,
so I'm trying to merge these first. I found all of these by inspection,
not by running the code, so any extra review would help. I assume some
of the issues were already caught by existing LTP tests, while for others
we could add a test. Again, I did not check what is already there.

The sync_file_range and fadvise64_64 changes on sh, csky and hexagon
are likely to also require changes in the libc implementation.

Once the patches are reviewed, I plan to merge my changes as bugfixes
through the asm-generic tree, but architecture maintainers can also
pick them up directly to speed up the bugfix.

Cc: linux-a...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: Thomas Bogendoerfer 
Cc: linux-m...@vger.kernel.org
Cc: Helge Deller 
Cc: linux-par...@vger.kernel.org
Cc: "David S. Miller" 
Cc: Andreas Larsson 
Cc: sparcli...@vger.kernel.org
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Christophe Leroy 
Cc: Naveen N. Rao 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Brian Cain 
Cc: linux-hexa...@vger.kernel.org
Cc: Guo Ren 
Cc: linux-c...@vger.kernel.org
Cc: Heiko Carstens 
Cc: linux-s...@vger.kernel.org
Cc: Rich Felker 
Cc: John Paul Adrian Glaubitz 
Cc: linux...@vger.kernel.org
Cc: "H. Peter Anvin" 
Cc: Alexander Viro 
Cc: Christian Brauner 
Cc: linux-fsde...@vger.kernel.org
Cc: libc-al...@sourceware.org
Cc: m...@lists.openwall.com

Arnd Bergmann (13):
  ftruncate: pass a signed offset
  syscalls: fix compat_sys_io_pgetevents_time64 usage
  sparc: fix old compat_sys_select()
  sparc: fix compat recv/recvfrom syscalls
  parisc: use correct compat recv/recvfrom syscalls
  parisc: use generic sys_fanotify_mark implementation
  powerpc: restore some missing spu syscalls
  sh: rework sync_file_range ABI
  csky, hexagon: fix broken sys_sync_file_range
  hexagon: fix fadvise64_64 calling conventions
  s390: remove native mmap2() syscall
  syscalls: mmap(): use unsigned offset type consistently
  linux/syscalls.h: add missing __user annotations

 arch/arm64/include/asm/unistd32.h |   2 +-
 arch/csky/include/uapi/asm/unistd.h   |   1 +
 arch/csky/kernel/syscall.c|   2 +-
 arch/hexagon/include/asm/syscalls.h   |   6 +
 arch/hexagon/include/uapi/asm/unistd.h|   1 +
 arch/hexagon/kernel/syscalltab.c  |   7 +
 arch/loongarch/kernel/syscall.c   |   2 +-
 arch/microblaze/kernel/sys_microblaze.c   |   2 +-
 arch/mips/kernel/syscalls/syscall_n32.tbl |   2 +-
 arch/mips/kernel/syscalls/syscall_o32.tbl |   2 +-
 arch/parisc/Kconfig   |   1 +
 arch/parisc/kernel/sys_parisc32.c |   9 -
 arch/parisc/kernel/syscalls/syscall.tbl   |   6 +-
 arch/powerpc/kernel/syscalls/syscall.tbl  |   6 +-
 arch/riscv/kernel/sys_riscv.c |   4 +-
 arch/s390/kernel/syscall.c|  27 ---
 arch/s390/kernel/syscalls/syscall.tbl |   2 +-
 arch/sh/kernel/sys_sh32.c |  11 ++
 arch/sh/kernel/syscalls/syscall.tbl   |   3 +-
 arch/sparc/kernel/sys32.S | 221 --
 arch/sparc/kernel/syscalls/syscall.tbl|   8 +-
 arch/x86/entry/syscalls/syscall_32.tbl|   2 +-
 fs/open.c |   4 +-
 include/asm-generic/syscalls.h|   2 +-
 include/linux/compat.h|   2 +-
 include/linux/syscalls.h  |  20 +-
 include/uapi/asm-generic/unistd.h |   2 +-
 kernel/sys_ni.c   |   2 +-
 28 files changed, 67 insertions(+), 292 deletions(-)
 create mode 100644 arch/hexagon/include/asm/syscalls.h

-- 
2.39.2

Re: [PATCH 02/15] syscalls: fix compat_sys_io_pgetevents_time64 usage

2024-06-24 Thread Arnd Bergmann

On Thu, Jun 20, 2024, at 18:23, Arnd Bergmann wrote:
> From: Arnd Bergmann 
>
> Using sys_io_pgetevents() as the entry point for compat mode tasks
> works almost correctly, but misses the sign extension for the min_nr
> and nr arguments.
>
> This was addressed on parisc by switching to
> compat_sys_io_pgetevents_time64() in commit 6431e92fc827 ("parisc:
> io_pgetevents_time64() needs compat syscall in 32-bit compat mode"),
> as well as by using more sophisticated system call wrappers on x86 and
> s390. However, arm64, mips, powerpc, sparc and riscv still have the
> same bug.
>
> Changes all of them over to use compat_sys_io_pgetevents_time64()
> like parisc already does. This was clearly the intention when the
> function was originally added, but it got hooked up incorrectly in
> the tables.
>
> Cc: sta...@vger.kernel.org
> Fixes: 48166e6ea47d ("y2038: add 64-bit time_t syscalls to all 32-bit 
> architectures")
> Signed-off-by: Arnd Bergmann 
> ---
>  arch/arm64/include/asm/unistd32.h | 2 +-
>  arch/mips/kernel/syscalls/syscall_n32.tbl | 2 +-
>  arch/mips/kernel/syscalls/syscall_o32.tbl | 2 +-
>  arch/powerpc/kernel/syscalls/syscall.tbl  | 2 +-
>  arch/s390/kernel/syscalls/syscall.tbl | 2 +-
>  arch/sparc/kernel/syscalls/syscall.tbl| 2 +-
>  arch/x86/entry/syscalls/syscall_32.tbl| 2 +-
>  include/uapi/asm-generic/unistd.h | 2 +-
>  8 files changed, 8 insertions(+), 8 deletions(-)

The build bot reported a randconfig regression with this
patch, which I've now fixed up like this:

diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index d7eee421d4bc..b696b85ac63e 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -46,8 +46,8 @@ COND_SYSCALL(io_getevents_time32);
 COND_SYSCALL(io_getevents);
 COND_SYSCALL(io_pgetevents_time32);
 COND_SYSCALL(io_pgetevents);
-COND_SYSCALL_COMPAT(io_pgetevents_time32);
 COND_SYSCALL_COMPAT(io_pgetevents);
+COND_SYSCALL_COMPAT(io_pgetevents_time64);
 COND_SYSCALL(io_uring_setup);
 COND_SYSCALL(io_uring_enter);
 COND_SYSCALL(io_uring_register);

This was already broken on parisc the same way, but the
mistake in sys_ni.c turned into a link failure for every
compat architecture after my patch.
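
As background on why a missing stub becomes a link failure: the stubs in sys_ni.c are weak definitions that stand in whenever the real implementation is configured out. The sketch below is a simplified userspace model using the GCC/Clang weak attribute; DEMO_COND_SYSCALL and the demo_* names are hypothetical and not the kernel's actual macros:

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for the kernel's "not implemented" handler. */
static long demo_ni_syscall(void)
{
	return -ENOSYS;
}

/*
 * Hypothetical simplified stub macro: emits a weak definition that the
 * linker replaces whenever a normal (strong) definition of the same
 * symbol is linked in.
 */
#define DEMO_COND_SYSCALL(name)					\
	__attribute__((weak)) long demo_sys_##name(void)	\
	{							\
		return demo_ni_syscall();			\
	}

/* Without this line, a table entry naming the symbol would not link. */
DEMO_COND_SYSCALL(io_pgetevents_time64)
```

In this model, a syscall table that references demo_sys_io_pgetevents_time64 links in every configuration, and calling the stub reports -ENOSYS when nothing stronger is provided.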

  Arnd

Re: [PATCH 09/15] sh: rework sync_file_range ABI

2024-06-24 Thread Arnd Bergmann

On Mon, Jun 24, 2024, at 08:14, John Paul Adrian Glaubitz wrote:
> On Fri, 2024-06-21 at 11:41 +0200, Arnd Bergmann wrote:
>> On Fri, Jun 21, 2024, at 10:44, John Paul Adrian Glaubitz wrote:
>> > Did you also check what order libc uses? I would expect libc on SuperH 
>> > misordering the
>> > arguments as well unless I am missing something. Or do we know that the 
>> > code is actually
>> > currently broken?
>> 
>> Yes, I checked glibc, musl and uclibc-ng for all the cases in
>> which the ABI made no sense, as well as to check that my analysis
>> of the kernel sources matches the expectations of the libc.
>
> OK, awesome.
>
> Will you send a v2 so I can ack the updated version of the patch?
>
> I'm also fine with the patch going through your tree, as I would
> like to start with the changes for v6.11 this week.

I should be able to get a v2 out today and apply that to my
asm-generic tree to have in linux-next before I send the
pull request.

   Arnd

Re: [PATCH 14/15] asm-generic: unistd: fix time32 compat syscall handling

2024-06-24 Thread Arnd Bergmann

On Thu, Jun 20, 2024, at 18:23, Arnd Bergmann wrote:
> From: Arnd Bergmann 
>
> arch/riscv/ appears to have accidentally enabled the compat time32
> syscalls in 64-bit kernels even though the native 32-bit ABI does
> not expose those.
>
> Address this by adding another level of indirection, checking for both
> the target ABI (32 or 64) and the __ARCH_WANT_TIME32_SYSCALLS macro.
>
> The macro arguments are meant to follow the syscall.tbl format, the idea
> here is that by the end of the series, all other syscalls are changed
> to the same format to make it possible to move all architectures over
> to generating the system call table consistently.
> Only this patch needs to be backported though.
>
> Cc: sta...@vger.kernel.org # v5.19+
> Fixes: 7eb6369d7acf ("RISC-V: Add support for rv32 userspace via COMPAT")
> Signed-off-by: Arnd Bergmann 

I had pulled this in from my longer series, but as the kernel
build bot reported, this produced build time regressions, so
I'll drop it from the v6.10 fixes and will integrate it back
as part of the cleanup series.

 Arnd

Re: [PATCH 07/15] parisc: use generic sys_fanotify_mark implementation

2024-06-21 Thread Arnd Bergmann

On Fri, Jun 21, 2024, at 11:03, John Paul Adrian Glaubitz wrote:
> On Fri, 2024-06-21 at 10:56 +0200, Arnd Bergmann wrote:
>> Feel free to pick up the sh patch directly, I'll just merge whatever
>> is left in the end. I mainly want to ensure we can get all the bugfixes
>> done for v6.10 so I can build my longer cleanup series on top of it
>> for 6.11.
>
> This series is still for 6.10?

Yes, these are all the bugfixes that I think we want to backport
to stable kernels, so it makes sense to merge them as quickly as
possible. The actual stuff I'm working on will come as soon as
I have it in a state for public review and won't need to be
backported.

 Arnd

Re: [PATCH 09/15] sh: rework sync_file_range ABI

2024-06-21 Thread Arnd Bergmann

On Fri, Jun 21, 2024, at 10:44, John Paul Adrian Glaubitz wrote:
> On Thu, 2024-06-20 at 18:23 +0200, Arnd Bergmann wrote:
>> From: Arnd Bergmann 
>> 
>> The unusual function calling conventions on superh ended up causing
>   ^^
> It's spelled SuperH

Fixed now.

>> diff --git a/arch/sh/kernel/sys_sh32.c b/arch/sh/kernel/sys_sh32.c
>> index 9dca568509a5..d5a4f7c697d8 100644
>> --- a/arch/sh/kernel/sys_sh32.c
>> +++ b/arch/sh/kernel/sys_sh32.c
>> @@ -59,3 +59,14 @@ asmlinkage int sys_fadvise64_64_wrapper(int fd, u32 
>> offset0, u32 offset1,
>>   (u64)len0 << 32 | len1, advice);
>>  #endif
>>  }
>> +
>> +/*
>> + * swap the arguments the way that libc wants it instead of
>
> I think "swap the arguments to the order that libc wants them" would
> be easier to understand here.

Done

>> diff --git a/arch/sh/kernel/syscalls/syscall.tbl 
>> b/arch/sh/kernel/syscalls/syscall.tbl
>> index bbf83a2db986..c55fd7696d40 100644
>> --- a/arch/sh/kernel/syscalls/syscall.tbl
>> +++ b/arch/sh/kernel/syscalls/syscall.tbl
>> @@ -321,7 +321,7 @@
>>  311 common  set_robust_list sys_set_robust_list
>>  312 common  get_robust_list sys_get_robust_list
>>  313 common  splice  sys_splice
>> -314 common  sync_file_range sys_sync_file_range
>> +314 common  sync_file_range sys_sh_sync_file_range6
>  ^^ 
> Why the suffix 6 here?

In a later part of my cleanup, I'm consolidating all the
copies of this function (arm64, mips, parisc, powerpc,
s390, sh, sparc, x86) and picked the name
sys_sync_file_range6() for the common implementation.

I end up with four entry points here, so the naming is a bit
confusing:

- sys_sync_file_range() is only used on 64-bit architectures,
  on x32 and on mips-n32. This uses four arguments, including
  two 64-bit wide ones.

- sys_sync_file_range2() continues to be used on arm, powerpc,
  xtensa and now on sh, hexagon and csky. I change the
  implementation to take six 32-bit arguments, but the ABI
  remains the same as before, with the flags before offset.

- sys_sync_file_range6() is used for most other 32-bit ABIs:
  arc, m68k, microblaze, nios2, openrisc, parisc, s390, sh, sparc
  and x86. This also has six 32-bit arguments but in the
  default order (fd, offset, nbytes, flags).

- sys_sync_file_range7() is exclusive to mips-o32, this one
  has an unused argument and is otherwise the same as
  sys_sync_file_range6().

My plan is to then have some infrastructure to ensure
userspace tools (libc, strace, qemu, rust, ...) use the
same calling conventions as the kernel. I'm doing the
same thing for all other syscalls that have architecture
specific calling conventions, so far I'm using

fadvise64_64_7
fanotify_mark6
truncate3
truncate4
ftruncate3
ftruncate4
fallocate6
pread5
pread6
pwrite5
pwrite6
preadv5
preadv6
pwritev5
pwritev6
sync_file_range6
fadvise64_64_2
fadvise64_64_6
fadvise64_5
fadvise64_6
readahead4
readahead5

The last number here is usually the number of 32-bit
arguments, except for fadvise64_64_2 that uses the
same argument reordering trick as sync_file_range2.

I'm not too happy with the naming but couldn't come up with
anything clearer either, so let me know if you have any
ideas there.
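
To make the difference between the "6" and "2" layouts above concrete,
here is a minimal userspace sketch. All demo_* names and the explicit
hi/lo parameters are made up for this example; the kernel has its own
SC_ARG64()/SC_VAL64() helpers, and which half comes first depends on
the endianness of the target.

```c
#include <stdint.h>

/* Decoded call; stands in for what the common implementation receives. */
struct demo_range {
	int fd;
	uint64_t offset;
	uint64_t nbytes;
	unsigned int flags;
};

/* Join two 32-bit halves; hi-first order is an assumption of this sketch. */
static uint64_t demo_join(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}

/* "6" layout: default order (fd, offset, nbytes, flags). */
static struct demo_range demo_sync_file_range6(int fd, uint32_t off_hi,
		uint32_t off_lo, uint32_t n_hi, uint32_t n_lo,
		unsigned int flags)
{
	struct demo_range r = { fd, demo_join(off_hi, off_lo),
				demo_join(n_hi, n_lo), flags };
	return r;
}

/* "2" layout: flags moved ahead of the two 64-bit values. */
static struct demo_range demo_sync_file_range2(int fd, unsigned int flags,
		uint32_t off_hi, uint32_t off_lo, uint32_t n_hi,
		uint32_t n_lo)
{
	struct demo_range r = { fd, demo_join(off_hi, off_lo),
				demo_join(n_hi, n_lo), flags };
	return r;
}
```

Both entry points decode to the same values; only the position of
the flags argument in the register sequence differs.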

>>  315 common  tee sys_tee
>>  316 common  vmsplice sys_vmsplice
>>  317 common  move_pages  sys_move_pages
>> @@ -395,6 +395,7 @@
>>  385 common  pkey_alloc  sys_pkey_alloc
>>  386 common  pkey_free   sys_pkey_free
>>  387 common  rseq sys_rseq
>> +388 common  sync_file_range2 sys_sync_file_range2
>>  # room for arch specific syscalls
>>  393 common  semget  sys_semget
>>  394 common  semctl  sys_semctl
>
> I wonder how you discovered this bug. Did you look up the calling 
> convention on SuperH
> and compare the argument order for the sys_sync_file_range system call 
> documented there
> with the order in the kernel?

I had to categorize all architectures based on their calling
conventions to see if 64-bit arguments need aligned pairs or
not, so I wrote a set of simple C files that I compiled for
all architectures to see in which cases they insert unused
arguments or swap the order of the upper and lower halves.
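
A probe file of this kind could look roughly like the sketch below.
This is a guess at the general shape, not one of the actual files;
the demo_* names and constants are invented. The idea is to compile
it with something like "$CC -O2 -S" for each target and read where
the compiler places each value.

```c
/* Keep the callee out of line so the call site shows the real ABI. */
#if defined(__GNUC__)
#define DEMO_NOINLINE __attribute__((noinline))
#else
#define DEMO_NOINLINE
#endif

/* Same shape as sync_file_range(): int, two 64-bit values, then an int. */
DEMO_NOINLINE long long demo_probe(int fd, long long offset,
				   long long nbytes, unsigned int flags)
{
	/* Use every argument so none of them is optimized away. */
	return offset + nbytes + fd + flags;
}

/* Distinct constants make each register or stack slot easy to spot. */
long long demo_caller(void)
{
	return demo_probe(1, 0x2222222233333333LL, 0x4444444455555555LL, 6);
}
```

In the generated assembly for demo_caller(), a skipped register
between the first and second argument indicates aligned pairs, and
the order of the 0x22222222/0x33333333 words shows which half of a
64-bit value comes first.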

SuperH, parisc and s390 are each slightly different from all the
others here, so I ended up reading the ELF psABI docs and/or
the

Re: [PATCH 07/15] parisc: use generic sys_fanotify_mark implementation

2024-06-21 Thread Arnd Bergmann

On Fri, Jun 21, 2024, at 10:52, John Paul Adrian Glaubitz wrote:
> Hi Helge and Arnd,
>
> On Thu, 2024-06-20 at 23:21 +0200, Helge Deller wrote:
>> The patch looks good at first sight.
>> I'll pick it up in my parisc git tree and will do some testing the
>> next few days and then push forward for 6.11 when it opens
>
> Isn't this supposed to go in as one series or can arch maintainers actually
> pick the patches for their architecture and merge them individually?
>
> If yes, I would prefer to do that for the SuperH patch as well as I usually
> prefer merging SuperH patches in my own tree.

The patches are all independent of one another, except for a couple
of context changes where multiple patches touch the same lines.

Feel free to pick up the sh patch directly, I'll just merge whatever
is left in the end. I mainly want to ensure we can get all the bugfixes
done for v6.10 so I can build my longer cleanup series on top of it
for 6.11.

   Arnd

Re: [PATCH 07/15] parisc: use generic sys_fanotify_mark implementation

2024-06-20 Thread Arnd Bergmann

On Fri, Jun 21, 2024, at 07:26, LEROY Christophe wrote:
> On 20/06/2024 at 23:21, Helge Deller wrote:
>>
>> On 6/20/24 18:23, Arnd Bergmann wrote:
>>> From: Arnd Bergmann 
>>>
>>> The sys_fanotify_mark() syscall on parisc uses the reverse word order
>>> for the two halves of the 64-bit argument compared to all syscalls on
>>> all 32-bit architectures. As far as I can tell, the problem is that
>>> the function arguments on parisc are sorted backwards (26, 25, 24, 23,
>>> ...) compared to everyone else,
>>
>> r26 is arg0, r25 is arg1, and so on.
>> I'm not sure I would call this "sorted backwards".
>> I think the reason is simply that hppa is the only 32-bit big-endian
>> arch left...
>
> powerpc/32 is big-endian: r3 is arg0, r4 is arg1, ... r10 is arg7.

Right, I'm pretty sure the ordering is the same on arm, mips,
s390, m68k, openrisc, sh and sparc when running 32-bit big-endian
code.

It's more likely to be related to the upward growing stack.
I checked the gcc sources and found that out of the 50 supported
architectures, ARGS_GROW_DOWNWARD is set on everything except
for gcn, stormy16 and 32-bit parisc. The other two are
little-endian though. STACK_GROWS_DOWNWARD in turn is set on
everything other than parisc (both 32-bit and 64-bit).

  Arnd

[PATCH 15/15] linux/syscalls.h: add missing __user annotations

2024-06-20 Thread Arnd Bergmann

From: Arnd Bergmann 

A couple of declarations in linux/syscalls.h are missing __user
annotations on their pointers, which can lead to warnings from
sparse because these don't match the implementations that have
the correct address space annotations.

Signed-off-by: Arnd Bergmann 
---
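For readers unfamiliar with the annotation, here is a minimal userspace
sketch of why the declaration and the definition have to agree. The
DEMO_USER and DEMO_FORCE macros mirror the kernel's __user and __force
but are defined locally for this example, and demo_how, demo_copy_in()
and demo_openat2() are invented names. Under a regular compiler the
macros expand to nothing; only sparse (which defines __CHECKER__)
treats them as a separate address space.

```c
#include <stddef.h>
#include <string.h>

#ifdef __CHECKER__	/* defined when sparse processes the file */
# define DEMO_USER  __attribute__((noderef, address_space(1)))
# define DEMO_FORCE __attribute__((force))
#else
# define DEMO_USER
# define DEMO_FORCE
#endif

struct demo_how {
	unsigned long long flags;
};

/* Stand-in for copy_from_user(): the only place that drops the annotation. */
static int demo_copy_in(void *dst, const void DEMO_USER *src, size_t len)
{
	memcpy(dst, (DEMO_FORCE const void *)src, len);
	return 0;
}

/* Declaration and definition both carry DEMO_USER, so the types match. */
long demo_openat2(const struct demo_how DEMO_USER *how, size_t size);

long demo_openat2(const struct demo_how DEMO_USER *how, size_t size)
{
	struct demo_how tmp;

	if (size < sizeof(tmp) || demo_copy_in(&tmp, how, sizeof(tmp)))
		return -1;
	return (long)tmp.flags;
}
```

Dropping DEMO_USER from only the declaration would still compile, but
sparse would then see two different pointer types for the same symbol.
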
 include/linux/syscalls.h | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index ba9337709878..63424af87bba 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -322,13 +322,13 @@ asmlinkage long sys_io_pgetevents(aio_context_t ctx_id,
long nr,
struct io_event __user *events,
struct __kernel_timespec __user *timeout,
-   const struct __aio_sigset *sig);
+   const struct __aio_sigset __user *sig);
 asmlinkage long sys_io_pgetevents_time32(aio_context_t ctx_id,
long min_nr,
long nr,
struct io_event __user *events,
struct old_timespec32 __user *timeout,
-   const struct __aio_sigset *sig);
+   const struct __aio_sigset __user *sig);
 asmlinkage long sys_io_uring_setup(u32 entries,
struct io_uring_params __user *p);
 asmlinkage long sys_io_uring_enter(unsigned int fd, u32 to_submit,
@@ -441,7 +441,7 @@ asmlinkage long sys_fchown(unsigned int fd, uid_t user, 
gid_t group);
 asmlinkage long sys_openat(int dfd, const char __user *filename, int flags,
   umode_t mode);
 asmlinkage long sys_openat2(int dfd, const char __user *filename,
-   struct open_how *how, size_t size);
+   struct open_how __user *how, size_t size);
 asmlinkage long sys_close(unsigned int fd);
 asmlinkage long sys_close_range(unsigned int fd, unsigned int max_fd,
unsigned int flags);
@@ -555,7 +555,7 @@ asmlinkage long sys_get_robust_list(int pid,
 asmlinkage long sys_set_robust_list(struct robust_list_head __user *head,
size_t len);
 
-asmlinkage long sys_futex_waitv(struct futex_waitv *waiters,
+asmlinkage long sys_futex_waitv(struct futex_waitv __user *waiters,
unsigned int nr_futexes, unsigned int flags,
struct __kernel_timespec __user *timeout, 
clockid_t clockid);
 
@@ -907,7 +907,7 @@ asmlinkage long sys_seccomp(unsigned int op, unsigned int 
flags,
 asmlinkage long sys_getrandom(char __user *buf, size_t count,
  unsigned int flags);
 asmlinkage long sys_memfd_create(const char __user *uname_ptr, unsigned int 
flags);
-asmlinkage long sys_bpf(int cmd, union bpf_attr *attr, unsigned int size);
+asmlinkage long sys_bpf(int cmd, union bpf_attr __user *attr, unsigned int 
size);
 asmlinkage long sys_execveat(int dfd, const char __user *filename,
const char __user *const __user *argv,
const char __user *const __user *envp, int flags);
@@ -960,11 +960,11 @@ asmlinkage long sys_cachestat(unsigned int fd,
struct cachestat_range __user *cstat_range,
struct cachestat __user *cstat, unsigned int flags);
 asmlinkage long sys_map_shadow_stack(unsigned long addr, unsigned long size, 
unsigned int flags);
-asmlinkage long sys_lsm_get_self_attr(unsigned int attr, struct lsm_ctx *ctx,
- u32 *size, u32 flags);
-asmlinkage long sys_lsm_set_self_attr(unsigned int attr, struct lsm_ctx *ctx,
+asmlinkage long sys_lsm_get_self_attr(unsigned int attr, struct lsm_ctx __user 
*ctx,
+ u32 __user *size, u32 flags);
+asmlinkage long sys_lsm_set_self_attr(unsigned int attr, struct lsm_ctx __user 
*ctx,
  u32 size, u32 flags);
-asmlinkage long sys_lsm_list_modules(u64 *ids, u32 *size, u32 flags);
+asmlinkage long sys_lsm_list_modules(u64 __user *ids, u32 __user *size, u32 
flags);
 
 /*
  * Architecture-specific system calls
-- 
2.39.2

[PATCH 14/15] asm-generic: unistd: fix time32 compat syscall handling

2024-06-20 Thread Arnd Bergmann

From: Arnd Bergmann 

arch/riscv/ appears to have accidentally enabled the compat time32
syscalls in 64-bit kernels even though the native 32-bit ABI does
not expose those.

Address this by adding another level of indirection, checking for both
the target ABI (32 or 64) and the __ARCH_WANT_TIME32_SYSCALLS macro.

The macro arguments are meant to follow the syscall.tbl format, the idea
here is that by the end of the series, all other syscalls are changed
to the same format to make it possible to move all architectures over
to generating the system call table consistently.
Only this patch needs to be backported though.

Cc: sta...@vger.kernel.org # v5.19+
Fixes: 7eb6369d7acf ("RISC-V: Add support for rv32 userspace via COMPAT")
Signed-off-by: Arnd Bergmann 
---
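The extra level of indirection can be illustrated with a small
standalone sketch. The DEMO_* macros below are simplified stand-ins
for __SC(), __SYSCALL_32(), __SYSCALL_64() and __SYSCALL_time32(), and
the configuration (64-bit, no time32) is chosen arbitrarily for the
example; it is not the patch itself.

```c
#include <stddef.h>

#define DEMO_BITS_PER_LONG 64	/* assumption for this example */
/* #define DEMO_WANT_TIME32 */	/* left undefined on purpose */

/* Level 1: pick a per-condition macro by token pasting. */
#define DEMO_SC(_cond, _nr, _sys) DEMO_SYSCALL_ ## _cond(_nr, _sys)

/* Level 2: each condition either emits a table entry or nothing. */
#if DEMO_BITS_PER_LONG == 32
#define DEMO_SYSCALL_32(_nr, _sys) DEMO_SYSCALL(_nr, _sys)
#define DEMO_SYSCALL_64(_nr, _sys)
#else
#define DEMO_SYSCALL_32(_nr, _sys)
#define DEMO_SYSCALL_64(_nr, _sys) DEMO_SYSCALL(_nr, _sys)
#endif

/* time32 entries need a 32-bit ABI and the opt-in macro. */
#ifdef DEMO_WANT_TIME32
#define DEMO_SYSCALL_time32(_nr, _sys) DEMO_SYSCALL_32(_nr, _sys)
#else
#define DEMO_SYSCALL_time32(_nr, _sys)
#endif

/* The table stores handler names as strings to keep the sketch small. */
#define DEMO_SYSCALL(_nr, _sys) [_nr] = #_sys,

static const char *const demo_table[8] = {
	DEMO_SC(time32, 4, sys_io_getevents_time32)
	DEMO_SC(64, 4, sys_io_getevents)
};

static const char *demo_handler(unsigned int nr)
{
	return nr < 8 ? demo_table[nr] : NULL;
}
```

With this configuration only the second line produces an entry, so
slot 4 holds "sys_io_getevents" and the time32 variant is never
referenced.
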
 include/uapi/asm-generic/unistd.h | 146 +++---
 1 file changed, 94 insertions(+), 52 deletions(-)

diff --git a/include/uapi/asm-generic/unistd.h 
b/include/uapi/asm-generic/unistd.h
index 3fdaa573d661..e47c966557d0 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -16,10 +16,32 @@
 #define __SYSCALL(x, y)
 #endif
 
+#ifndef __SC
+#define __SC(_cond, _nr, _sys) __SYSCALL_ ## _cond (_nr, _sys)
+#endif
+
+#ifndef __SCC
+#ifdef __SYSCALL_COMPAT
+#define __SCC(_cond, _nr, _sys, _comp) __SC(_cond, _nr, _comp)
+#else
+#define __SCC(_cond, _nr, _sys, _comp) __SC(_cond, _nr, _sys)
+#endif
+#endif
+
 #if __BITS_PER_LONG == 32 || defined(__SYSCALL_COMPAT)
 #define __SC_3264(_nr, _32, _64) __SYSCALL(_nr, _32)
+#define __SYSCALL_32(_nr, _sys)__SYSCALL(__NR_ ## _nr, _sys)
+#define __SYSCALL_64(_nr, _sys)
 #else
 #define __SC_3264(_nr, _32, _64) __SYSCALL(_nr, _64)
+#define __SYSCALL_32(_nr, _sys)
+#define __SYSCALL_64(_nr, _sys)__SYSCALL(__NR_ ## _nr, _sys)
+#endif
+
+#if defined(__ARCH_WANT_TIME32_SYSCALLS)
+#define __SYSCALL_time32(_nr, _sys)__SYSCALL_32(__NR_ ## _nr, _sys)
+#else
+#define __SYSCALL_time32(_nr, _sys)
 #endif
 
 #ifdef __SYSCALL_COMPAT
@@ -41,7 +63,8 @@ __SYSCALL(__NR_io_cancel, sys_io_cancel)
 
 #if defined(__ARCH_WANT_TIME32_SYSCALLS) || __BITS_PER_LONG != 32
 #define __NR_io_getevents 4
-__SC_3264(__NR_io_getevents, sys_io_getevents_time32, sys_io_getevents)
+__SC(time32, io_getevents, sys_io_getevents_time32)
+__SC(64, io_getevents, sys_io_getevents)
 #endif
 
 #define __NR_setxattr 5
@@ -190,9 +213,11 @@ __SYSCALL(__NR3264_sendfile, sys_sendfile64)
 
 #if defined(__ARCH_WANT_TIME32_SYSCALLS) || __BITS_PER_LONG != 32
 #define __NR_pselect6 72
-__SC_COMP_3264(__NR_pselect6, sys_pselect6_time32, sys_pselect6, 
compat_sys_pselect6_time32)
+__SCC(time32, pselect6, sys_pselect6_time32, compat_sys_pselect6_time32)
+__SC(64, pselect6, sys_pselect6)
 #define __NR_ppoll 73
-__SC_COMP_3264(__NR_ppoll, sys_ppoll_time32, sys_ppoll, 
compat_sys_ppoll_time32)
+__SCC(time32, ppoll, sys_ppoll_time32, compat_sys_ppoll_time32)
+__SC(64, ppoll, sys_ppoll)
 #endif
 
 #define __NR_signalfd4 74
@@ -235,16 +260,17 @@ __SYSCALL(__NR_timerfd_create, sys_timerfd_create)
 
 #if defined(__ARCH_WANT_TIME32_SYSCALLS) || __BITS_PER_LONG != 32
 #define __NR_timerfd_settime 86
-__SC_3264(__NR_timerfd_settime, sys_timerfd_settime32, \
- sys_timerfd_settime)
+__SC(time32, timerfd_settime, sys_timerfd_settime32)
+__SC(64, timerfd_settime, sys_timerfd_settime)
 #define __NR_timerfd_gettime 87
-__SC_3264(__NR_timerfd_gettime, sys_timerfd_gettime32, \
- sys_timerfd_gettime)
+__SC(time32, timerfd_gettime, sys_timerfd_gettime32)
+__SC(64, timerfd_gettime, sys_timerfd_gettime)
 #endif
 
 #if defined(__ARCH_WANT_TIME32_SYSCALLS) || __BITS_PER_LONG != 32
 #define __NR_utimensat 88
-__SC_3264(__NR_utimensat, sys_utimensat_time32, sys_utimensat)
+__SC(time32, utimensat, sys_utimensat_time32)
+__SC(64, utimensat, sys_utimensat)
 #endif
 
 #define __NR_acct 89
@@ -268,7 +294,8 @@ __SYSCALL(__NR_unshare, sys_unshare)
 
 #if defined(__ARCH_WANT_TIME32_SYSCALLS) || __BITS_PER_LONG != 32
 #define __NR_futex 98
-__SC_3264(__NR_futex, sys_futex_time32, sys_futex)
+__SC(time32, futex, sys_futex_time32)
+__SC(64, futex, sys_futex)
 #endif
 
 #define __NR_set_robust_list 99
@@ -280,7 +307,8 @@ __SC_COMP(__NR_get_robust_list, sys_get_robust_list, \
 
 #if defined(__ARCH_WANT_TIME32_SYSCALLS) || __BITS_PER_LONG != 32
 #define __NR_nanosleep 101
-__SC_3264(__NR_nanosleep, sys_nanosleep_time32, sys_nanosleep)
+__SC(time32, nanosleep, sys_nanosleep_time32)
+__SC(64, nanosleep, sys_nanosleep)
 #endif
 
 #define __NR_getitimer 102
@@ -298,7 +326,8 @@ __SC_COMP(__NR_timer_create, sys_timer_create, 
compat_sys_timer_create)
 
 #if defined(__ARCH_WANT_TIME32_SYSCALLS) || __BITS_PER_LONG != 32
 #define __NR_timer_gettime 108
-__SC_3264(__NR_timer_gettime, sys_timer_gettime32, sys_timer_gettime)
+__SC(time32, timer_gettime, sys_timer_gettime32)
+__SC(64, timer_gettime, sys_timer_gettime)
 #endif
 
 #define __NR_timer_getoverrun 109
@@ -306,7 +335,8 @@ __SYSCALL(__NR_timer_

[PATCH 13/15] syscalls: mmap(): use unsigned offset type consistently

2024-06-20 Thread Arnd Bergmann

From: Arnd Bergmann 

Most architectures that implement the old-style mmap() with byte offset
use 'unsigned long' as the type for that offset, but microblaze and
riscv have the off_t type that is shared with userspace, matching the
prototype in include/asm-generic/syscalls.h.

Make this consistent by using an unsigned argument everywhere. This
changes the behavior slightly, as the argument is shifted to a page
number, and a user input with the top bit set would result in a
negative page offset rather than a large one as we use elsewhere.

For riscv, the 32-bit sys_mmap2() definition actually used a custom
type that is different from the global declaration, but this was
missed due to an incorrect type check.

Signed-off-by: Arnd Bergmann 
---
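The behavior change can be seen in a small standalone sketch. Here
int32_t and uint32_t stand in for a 32-bit off_t and unsigned long,
and the demo_* names and DEMO_PAGE_SHIFT are made up for the example.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12

/* off_t-like signed offset on a 32-bit ABI. */
static int32_t demo_pgoff_signed(int32_t offset)
{
	/*
	 * Right-shifting a negative value is implementation-defined in C;
	 * mainstream compilers use an arithmetic shift and keep the sign.
	 */
	return offset >> DEMO_PAGE_SHIFT;
}

/* Unsigned offset: the top bit is just part of a large value. */
static uint32_t demo_pgoff_unsigned(uint32_t offset)
{
	return offset >> DEMO_PAGE_SHIFT;
}
```

For an input of 0x80000000 the unsigned version yields page 0x80000,
while the signed version (assuming an arithmetic shift) yields a
negative page number.
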
 arch/csky/kernel/syscall.c  | 2 +-
 arch/loongarch/kernel/syscall.c | 2 +-
 arch/microblaze/kernel/sys_microblaze.c | 2 +-
 arch/riscv/kernel/sys_riscv.c   | 4 ++--
 include/asm-generic/syscalls.h  | 2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/csky/kernel/syscall.c b/arch/csky/kernel/syscall.c
index 3d30e58a45d2..4540a271ee39 100644
--- a/arch/csky/kernel/syscall.c
+++ b/arch/csky/kernel/syscall.c
@@ -20,7 +20,7 @@ SYSCALL_DEFINE6(mmap2,
unsigned long, prot,
unsigned long, flags,
unsigned long, fd,
-   off_t, offset)
+   unsigned long, offset)
 {
if (unlikely(offset & (~PAGE_MASK >> 12)))
return -EINVAL;
diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c
index b4c5acd7aa3b..8801611143ab 100644
--- a/arch/loongarch/kernel/syscall.c
+++ b/arch/loongarch/kernel/syscall.c
@@ -22,7 +22,7 @@
 #define __SYSCALL(nr, call)[nr] = (call),
 
 SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len, unsigned long,
-   prot, unsigned long, flags, unsigned long, fd, off_t, offset)
+   prot, unsigned long, flags, unsigned long, fd, unsigned long, 
offset)
 {
if (offset & ~PAGE_MASK)
return -EINVAL;
diff --git a/arch/microblaze/kernel/sys_microblaze.c 
b/arch/microblaze/kernel/sys_microblaze.c
index ed9f34da1a2a..0850b099f300 100644
--- a/arch/microblaze/kernel/sys_microblaze.c
+++ b/arch/microblaze/kernel/sys_microblaze.c
@@ -35,7 +35,7 @@
 
 SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
unsigned long, prot, unsigned long, flags, unsigned long, fd,
-   off_t, pgoff)
+   unsigned long, pgoff)
 {
if (pgoff & ~PAGE_MASK)
return -EINVAL;
diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
index 64155323cc92..d77afe05578f 100644
--- a/arch/riscv/kernel/sys_riscv.c
+++ b/arch/riscv/kernel/sys_riscv.c
@@ -23,7 +23,7 @@ static long riscv_sys_mmap(unsigned long addr, unsigned long 
len,
 #ifdef CONFIG_64BIT
 SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
unsigned long, prot, unsigned long, flags,
-   unsigned long, fd, off_t, offset)
+   unsigned long, fd, unsigned long, offset)
 {
return riscv_sys_mmap(addr, len, prot, flags, fd, offset, 0);
 }
@@ -32,7 +32,7 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
 #if defined(CONFIG_32BIT) || defined(CONFIG_COMPAT)
 SYSCALL_DEFINE6(mmap2, unsigned long, addr, unsigned long, len,
unsigned long, prot, unsigned long, flags,
-   unsigned long, fd, off_t, offset)
+   unsigned long, fd, unsigned long, offset)
 {
/*
 * Note that the shift for mmap2 is constant (12),
diff --git a/include/asm-generic/syscalls.h b/include/asm-generic/syscalls.h
index 933ca6581aba..fabcefe8a80a 100644
--- a/include/asm-generic/syscalls.h
+++ b/include/asm-generic/syscalls.h
@@ -19,7 +19,7 @@ asmlinkage long sys_mmap2(unsigned long addr, unsigned long 
len,
 #ifndef sys_mmap
 asmlinkage long sys_mmap(unsigned long addr, unsigned long len,
unsigned long prot, unsigned long flags,
-   unsigned long fd, off_t pgoff);
+   unsigned long fd, unsigned long off);
 #endif
 
 #ifndef sys_rt_sigreturn
-- 
2.39.2

[PATCH 12/15] s390: remove native mmap2() syscall

2024-06-20 Thread Arnd Bergmann

From: Arnd Bergmann 

The mmap2() syscall has never been used on 64-bit s390x and should
have been removed as part of 5a79859ae0f3 ("s390: remove 31 bit
support").

Remove it now.

Signed-off-by: Arnd Bergmann 
---
 arch/s390/kernel/syscall.c | 27 ---
 1 file changed, 27 deletions(-)

diff --git a/arch/s390/kernel/syscall.c b/arch/s390/kernel/syscall.c
index dc2355c623d6..50cbcbbaa03d 100644
--- a/arch/s390/kernel/syscall.c
+++ b/arch/s390/kernel/syscall.c
@@ -38,33 +38,6 @@
 
 #include "entry.h"
 
-/*
- * Perform the mmap() system call. Linux for S/390 isn't able to handle more
- * than 5 system call parameters, so this system call uses a memory block
- * for parameter passing.
- */
-
-struct s390_mmap_arg_struct {
-   unsigned long addr;
-   unsigned long len;
-   unsigned long prot;
-   unsigned long flags;
-   unsigned long fd;
-   unsigned long offset;
-};
-
-SYSCALL_DEFINE1(mmap2, struct s390_mmap_arg_struct __user *, arg)
-{
-   struct s390_mmap_arg_struct a;
-   int error = -EFAULT;
-
-   if (copy_from_user(&a, arg, sizeof(a)))
-   goto out;
-   error = ksys_mmap_pgoff(a.addr, a.len, a.prot, a.flags, a.fd, a.offset);
-out:
-   return error;
-}
-
 #ifdef CONFIG_SYSVIPC
 /*
  * sys_ipc() is the de-multiplexer for the SysV IPC calls.
-- 
2.39.2

[PATCH 11/15] hexagon: fix fadvise64_64 calling conventions

2024-06-20 Thread Arnd Bergmann

From: Arnd Bergmann 

fadvise64_64() has two 64-bit arguments at the wrong alignment
for hexagon, which turns them into a 7-argument syscall that is
not supported by Linux.

The downstream musl port for hexagon actually asks for a 6-argument
version the same way we do it on arm, csky, powerpc, so make the
kernel do it the same way to avoid having to change both.

Link: https://github.com/quic/musl/blob/hexagon/arch/hexagon/syscall_arch.h#L78
Cc: sta...@vger.kernel.org
Signed-off-by: Arnd Bergmann 
---
 arch/hexagon/include/asm/syscalls.h | 6 ++
 arch/hexagon/kernel/syscalltab.c| 7 +++
 2 files changed, 13 insertions(+)
 create mode 100644 arch/hexagon/include/asm/syscalls.h

diff --git a/arch/hexagon/include/asm/syscalls.h 
b/arch/hexagon/include/asm/syscalls.h
new file mode 100644
index ..40f2d08bec92
--- /dev/null
+++ b/arch/hexagon/include/asm/syscalls.h
@@ -0,0 +1,6 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include 
+
+asmlinkage long sys_hexagon_fadvise64_64(int fd, int advice,
+ u32 a2, u32 a3, u32 a4, u32 a5);
diff --git a/arch/hexagon/kernel/syscalltab.c b/arch/hexagon/kernel/syscalltab.c
index 0fadd582cfc7..5d98bdc494ec 100644
--- a/arch/hexagon/kernel/syscalltab.c
+++ b/arch/hexagon/kernel/syscalltab.c
@@ -14,6 +14,13 @@
 #undef __SYSCALL
 #define __SYSCALL(nr, call) [nr] = (call),
 
+SYSCALL_DEFINE6(hexagon_fadvise64_64, int, fd, int, advice,
+   SC_ARG64(offset), SC_ARG64(len))
+{
+   return ksys_fadvise64_64(fd, SC_VAL64(loff_t, offset), SC_VAL64(loff_t, 
len), advice);
+}
+#define sys_fadvise64_64 sys_hexagon_fadvise64_64
+
 void *sys_call_table[__NR_syscalls] = {
 #include 
 };
-- 
2.39.2

[PATCH 10/15] csky, hexagon: fix broken sys_sync_file_range

2024-06-20 Thread Arnd Bergmann

From: Arnd Bergmann 

Both of these architectures require u64 function arguments to be
passed in even/odd pairs of registers or stack slots, which in case of
sync_file_range would result in a seven-argument system call that is
not currently possible. The system call is therefore incompatible with
all existing binaries.

While it would be possible to implement support for seven arguments
like on mips, it seems better to use a six-argument version, either
with the normal argument order but misaligned as on most architectures
or with the reordered sync_file_range2() calling conventions as on
arm and powerpc.

Cc: sta...@vger.kernel.org
Signed-off-by: Arnd Bergmann 
---
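The argument counting behind this can be checked with a small sketch
that models each argument as one or two 32-bit slots. The function
and array names are invented for this example and the model ignores
everything except pair alignment.

```c
#include <stddef.h>

/* Argument widths in 32-bit words, in calling order. */
static const unsigned char demo_sync_file_range[] = { 1, 2, 2, 1 };
/* fd, offset, nbytes, flags */
static const unsigned char demo_sync_file_range2[] = { 1, 1, 2, 2 };
/* fd, flags, offset, nbytes */

/* Count the slots used, optionally aligning 64-bit values to even slots. */
static unsigned int demo_count_slots(const unsigned char *words, size_t n,
				     int align_pairs)
{
	unsigned int slot = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Skip one slot so a 64-bit value starts on an even slot. */
		if (words[i] == 2 && align_pairs && (slot & 1))
			slot++;
		slot += words[i];
	}
	return slot;
}
```

With pair alignment the normal order needs seven slots because of the
padding after fd, while the sync_file_range2() order fits in six.
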
 arch/csky/include/uapi/asm/unistd.h| 1 +
 arch/hexagon/include/uapi/asm/unistd.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/csky/include/uapi/asm/unistd.h 
b/arch/csky/include/uapi/asm/unistd.h
index 7ff6a2466af1..e0594b6370a6 100644
--- a/arch/csky/include/uapi/asm/unistd.h
+++ b/arch/csky/include/uapi/asm/unistd.h
@@ -6,6 +6,7 @@
 #define __ARCH_WANT_SYS_CLONE3
 #define __ARCH_WANT_SET_GET_RLIMIT
 #define __ARCH_WANT_TIME32_SYSCALLS
+#define __ARCH_WANT_SYNC_FILE_RANGE2
 #include 
 
 #define __NR_set_thread_area   (__NR_arch_specific_syscall + 0)
diff --git a/arch/hexagon/include/uapi/asm/unistd.h 
b/arch/hexagon/include/uapi/asm/unistd.h
index 432c4db1b623..21ae22306b5d 100644
--- a/arch/hexagon/include/uapi/asm/unistd.h
+++ b/arch/hexagon/include/uapi/asm/unistd.h
@@ -36,5 +36,6 @@
 #define __ARCH_WANT_SYS_VFORK
 #define __ARCH_WANT_SYS_FORK
 #define __ARCH_WANT_TIME32_SYSCALLS
+#define __ARCH_WANT_SYNC_FILE_RANGE2
 
 #include 
-- 
2.39.2

[PATCH 09/15] sh: rework sync_file_range ABI

2024-06-20 Thread Arnd Bergmann

From: Arnd Bergmann 

The unusual function calling conventions on SuperH ended up causing
sync_file_range to have the wrong argument order, with the 'flags'
argument getting sorted before 'nbytes' by the compiler.

In userspace, I found that musl, glibc, uclibc and strace all expect the
normal calling conventions with 'nbytes' last, so changing the kernel
to match them should make all of those work.

In order to be able to also fix libc implementations to work with existing
kernels, they need to be able to tell which ABI is used. An easy way
to do this is to add yet another system call using the sync_file_range2
ABI that works the same on all architectures.

Old user binaries can now work on new kernels, and new binaries can
try the new sync_file_range2() to work with new kernels or fall back
to the old sync_file_range() version if that doesn't exist.

Cc: sta...@vger.kernel.org
Fixes: 75c92acdd5b1 ("sh: Wire up new syscalls.")
Signed-off-by: Arnd Bergmann 
---
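A libc-side fallback along the lines described above might be
structured like this sketch. The two syscall entry points are passed
in as function pointers so that the example stays independent of any
particular architecture; all demo_* names are hypothetical and the two
stubs only simulate a kernel with and without the new call.

```c
#include <errno.h>

typedef int (*demo_range_fn)(int fd, long long offset, long long nbytes,
			     unsigned int flags);

/* Try the unambiguous call first, fall back if the kernel lacks it. */
static int demo_sync_range(demo_range_fn newer, demo_range_fn older,
			   int fd, long long offset, long long nbytes,
			   unsigned int flags)
{
	int ret = newer(fd, offset, nbytes, flags);

	if (ret == -1 && errno == ENOSYS)
		ret = older(fd, offset, nbytes, flags);
	return ret;
}

/* Stub: behaves like a syscall number the kernel does not know. */
static int demo_missing(int fd, long long offset, long long nbytes,
			unsigned int flags)
{
	(void)fd; (void)offset; (void)nbytes; (void)flags;
	errno = ENOSYS;
	return -1;
}

/* Stub: behaves like a syscall that succeeded. */
static int demo_present(int fd, long long offset, long long nbytes,
			unsigned int flags)
{
	(void)fd; (void)offset; (void)nbytes; (void)flags;
	return 0;
}
```

Any error other than ENOSYS is returned to the caller unchanged, so a
real failure of the new call is not masked by the fallback.
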
 arch/sh/kernel/sys_sh32.c   | 11 +++
 arch/sh/kernel/syscalls/syscall.tbl |  3 ++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/sh/kernel/sys_sh32.c b/arch/sh/kernel/sys_sh32.c
index 9dca568509a5..d5a4f7c697d8 100644
--- a/arch/sh/kernel/sys_sh32.c
+++ b/arch/sh/kernel/sys_sh32.c
@@ -59,3 +59,14 @@ asmlinkage int sys_fadvise64_64_wrapper(int fd, u32 offset0, 
u32 offset1,
 (u64)len0 << 32 | len1, advice);
 #endif
 }
+
+/*
+ * swap the arguments the way that libc wants it instead of
+ * moving flags ahead of the 64-bit nbytes argument
+ */
+SYSCALL_DEFINE6(sh_sync_file_range6, int, fd, SC_ARG64(offset),
+SC_ARG64(nbytes), unsigned int, flags)
+{
+return ksys_sync_file_range(fd, SC_VAL64(loff_t, offset),
+SC_VAL64(loff_t, nbytes), flags);
+}
diff --git a/arch/sh/kernel/syscalls/syscall.tbl 
b/arch/sh/kernel/syscalls/syscall.tbl
index bbf83a2db986..c55fd7696d40 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -321,7 +321,7 @@
 311  common  set_robust_list   sys_set_robust_list
 312  common  get_robust_list   sys_get_robust_list
 313  common  splice            sys_splice
-314  common  sync_file_range   sys_sync_file_range
+314  common  sync_file_range   sys_sh_sync_file_range6
 315  common  tee               sys_tee
 316  common  vmsplice          sys_vmsplice
 317  common  move_pages        sys_move_pages
@@ -395,6 +395,7 @@
 385  common  pkey_alloc        sys_pkey_alloc
 386  common  pkey_free         sys_pkey_free
 387  common  rseq              sys_rseq
+388  common  sync_file_range2  sys_sync_file_range2
 # room for arch specific syscalls
 393  common  semget            sys_semget
 394  common  semctl            sys_semctl
-- 
2.39.2

[PATCH 08/15] powerpc: restore some missing spu syscalls

2024-06-20 Thread Arnd Bergmann

From: Arnd Bergmann 

A couple of system calls were inadvertently removed from the table during
a bugfix for 32-bit powerpc entry. Restore the original behavior.

Fixes: e23750623835 ("powerpc/32: fix syscall wrappers with 64-bit arguments of unaligned register-pairs")
Signed-off-by: Arnd Bergmann 
---
 arch/powerpc/kernel/syscalls/syscall.tbl | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl 
b/arch/powerpc/kernel/syscalls/syscall.tbl
index c6b0546b284d..ebae8415dfbb 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -230,8 +230,10 @@
 178  nospu   rt_sigsuspend   sys_rt_sigsuspend     compat_sys_rt_sigsuspend
 179  32      pread64         sys_ppc_pread64       compat_sys_ppc_pread64
 179  64      pread64         sys_pread64
+179  spu     pread64         sys_pread64
 180  32      pwrite64        sys_ppc_pwrite64      compat_sys_ppc_pwrite64
 180  64      pwrite64        sys_pwrite64
+180  spu     pwrite64        sys_pwrite64
 181  common  chown           sys_chown
 182  common  getcwd          sys_getcwd
 183  common  capget          sys_capget
@@ -246,6 +248,7 @@
 190  common  ugetrlimit      sys_getrlimit         compat_sys_getrlimit
 191  32      readahead       sys_ppc_readahead     compat_sys_ppc_readahead
 191  64      readahead       sys_readahead
+191  spu     readahead       sys_readahead
 192  32      mmap2           sys_mmap2             compat_sys_mmap2
 193  32      truncate64      sys_ppc_truncate64    compat_sys_ppc_truncate64
 194  32      ftruncate64     sys_ppc_ftruncate64   compat_sys_ppc_ftruncate64
@@ -293,6 +296,7 @@
 232  nospu   set_tid_address sys_set_tid_address
 233  32      fadvise64       sys_ppc32_fadvise64   compat_sys_ppc32_fadvise64
 233  64      fadvise64       sys_fadvise64
+233  spu     fadvise64       sys_fadvise64
 234  nospu   exit_group      sys_exit_group
 235  nospu   lookup_dcookie  sys_ni_syscall
 236  common  epoll_create    sys_epoll_create
-- 
2.39.2

[PATCH 07/15] parisc: use generic sys_fanotify_mark implementation

2024-06-20 Thread Arnd Bergmann

From: Arnd Bergmann 

The sys_fanotify_mark() syscall on parisc uses the reverse word order
for the two halves of the 64-bit argument compared to all syscalls on
all 32-bit architectures. As far as I can tell, the problem is that
the function arguments on parisc are sorted backwards (26, 25, 24, 23,
...) compared to everyone else, so the calling conventions of using an
even/odd register pair in native word order result in the lower word
coming first in function arguments, matching the expected behavior
on little-endian architectures. The system call conventions however
ended up matching what the other 32-bit architectures do.

A glibc cleanup in 2020 changed the userspace behavior in a way that
handles all architectures consistently, but this inadvertently broke
parisc32 by changing to the same method as everyone else.

The change made it into glibc-2.35 and subsequently into debian 12
(bookworm), which is the latest stable release. This means we
need to choose between reverting the glibc change or changing the
kernel to match it again, but either change will leave some systems
broken.

Pick the option that is more likely to help current and future
users and change the kernel to match current glibc. This also
means the behavior is now consistent across architectures, but
it breaks running new kernels with old glibc builds before 2.35.

Link: https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=d150181d73d9
Link: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/arch/parisc/kernel/sys_parisc.c?h=57b1dfbd5b4a39d
Cc: Adhemerval Zanella 
Signed-off-by: Arnd Bergmann 
---
I found this through code inspection, please double-check to make
sure I got the bug and the fix right.

The alternative is to fix this by reverting glibc back to the
unusual behavior.
---
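To show what the word-order difference means in practice, here is a
minimal sketch. demo_mask_lo_first() follows the expression used by
the sys32_fanotify_mark() wrapper that this patch removes;
demo_mask_hi_first() is my reading of what the other 32-bit big-endian
ABIs do and is an assumption of this example, as are the demo_* names.

```c
#include <stdint.h>

/* Removed parisc wrapper: the first register holds the low word. */
static uint64_t demo_mask_lo_first(uint32_t mask0, uint32_t mask1)
{
	return ((uint64_t)mask1 << 32) | mask0;
}

/* Other big-endian 32-bit ABIs: the first register holds the high word. */
static uint64_t demo_mask_hi_first(uint32_t mask0, uint32_t mask1)
{
	return ((uint64_t)mask0 << 32) | mask1;
}
```

The same register pair therefore decodes to two different masks, and
a libc built for one convention passes the wrong event bits to a
kernel using the other.
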
 arch/parisc/Kconfig | 1 +
 arch/parisc/kernel/sys_parisc32.c   | 9 -
 arch/parisc/kernel/syscalls/syscall.tbl | 2 +-
 3 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index daafeb20f993..dc9b902de8ea 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -16,6 +16,7 @@ config PARISC
select ARCH_HAS_UBSAN
select ARCH_HAS_PTE_SPECIAL
select ARCH_NO_SG_CHAIN
+   select ARCH_SPLIT_ARG64 if !64BIT
select ARCH_SUPPORTS_HUGETLBFS if PA20
select ARCH_SUPPORTS_MEMORY_FAILURE
select ARCH_STACKWALK
diff --git a/arch/parisc/kernel/sys_parisc32.c 
b/arch/parisc/kernel/sys_parisc32.c
index 2a12a547b447..826c8e51b585 100644
--- a/arch/parisc/kernel/sys_parisc32.c
+++ b/arch/parisc/kernel/sys_parisc32.c
@@ -23,12 +23,3 @@ asmlinkage long sys32_unimplemented(int r26, int r25, int 
r24, int r23,
current->comm, current->pid, r20);
 return -ENOSYS;
 }
-
-asmlinkage long sys32_fanotify_mark(compat_int_t fanotify_fd, compat_uint_t 
flags,
-   compat_uint_t mask0, compat_uint_t mask1, compat_int_t dfd,
-   const char  __user * pathname)
-{
-   return sys_fanotify_mark(fanotify_fd, flags,
-   ((__u64)mask1 << 32) | mask0,
-dfd, pathname);
-}
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl 
b/arch/parisc/kernel/syscalls/syscall.tbl
index 39e67fab7515..66dc406b12e4 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -364,7 +364,7 @@
 320common  accept4 sys_accept4
 321common  prlimit64   sys_prlimit64
 322common  fanotify_init   sys_fanotify_init
-323common  fanotify_mark   sys_fanotify_mark   
sys32_fanotify_mark
+323common  fanotify_mark   sys_fanotify_mark   
compat_sys_fanotify_mark
 32432  clock_adjtime   sys_clock_adjtime32
 32464  clock_adjtime   sys_clock_adjtime
 325common  name_to_handle_at   sys_name_to_handle_at
-- 
2.39.2

[PATCH 06/15] parisc: use correct compat recv/recvfrom syscalls

2024-06-20 Thread Arnd Bergmann

From: Arnd Bergmann 

Johannes missed parisc back when he introduced the compat version
of these syscalls, so receiving cmsg messages that require a compat
conversion is still broken.

Use the correct calls like the other architectures do.

Fixes: 1dacc76d0014 ("net/compat/wext: send different messages to compat tasks")
Signed-off-by: Arnd Bergmann 
---
 arch/parisc/kernel/syscalls/syscall.tbl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/parisc/kernel/syscalls/syscall.tbl 
b/arch/parisc/kernel/syscalls/syscall.tbl
index b13c21373974..39e67fab7515 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -108,7 +108,7 @@
 95 common  fchown  sys_fchown
 96 common  getpriority sys_getpriority
 97 common  setpriority sys_setpriority
-98 common  recvsys_recv
+98 common  recvsys_recv
compat_sys_recv
 99 common  statfs  sys_statfs  
compat_sys_statfs
 100common  fstatfs sys_fstatfs 
compat_sys_fstatfs
 101common  stat64  sys_stat64
@@ -135,7 +135,7 @@
 120common  clone   sys_clone_wrapper
 121common  setdomainname   sys_setdomainname
 122common  sendfilesys_sendfile
compat_sys_sendfile
-123common  recvfromsys_recvfrom
+123common  recvfromsys_recvfrom
compat_sys_recvfrom
 12432  adjtimexsys_adjtimex_time32
 12464  adjtimexsys_adjtimex
 125common  mprotectsys_mprotect
-- 
2.39.2

[PATCH 05/15] sparc: fix compat recv/recvfrom syscalls

2024-06-20 Thread Arnd Bergmann

From: Arnd Bergmann 

sparc has the wrong compat version of recv() and recvfrom() for both the
direct syscalls and socketcall().

The direct syscalls just need to use the compat version. For socketcall,
the same thing could be done, but it seems better to completely remove
the custom assembler code for it and just use the same implementation that
everyone else has.

Fixes: 1dacc76d0014 ("net/compat/wext: send different messages to compat tasks")
Signed-off-by: Arnd Bergmann 
---
 arch/sparc/kernel/sys32.S  | 221 -
 arch/sparc/kernel/syscalls/syscall.tbl |   4 +-
 2 files changed, 2 insertions(+), 223 deletions(-)

diff --git a/arch/sparc/kernel/sys32.S b/arch/sparc/kernel/sys32.S
index a45f0f31fe51..a3d308f2043e 100644
--- a/arch/sparc/kernel/sys32.S
+++ b/arch/sparc/kernel/sys32.S
@@ -18,224 +18,3 @@ sys32_mmap2:
sethi   %hi(sys_mmap), %g1
jmpl%g1 + %lo(sys_mmap), %g0
 sllx   %o5, 12, %o5
-
-   .align  32
-   .globl  sys32_socketcall
-sys32_socketcall:  /* %o0=call, %o1=args */
-   cmp %o0, 1
-   bl,pn   %xcc, do_einval
-cmp%o0, 18
-   bg,pn   %xcc, do_einval
-sub%o0, 1, %o0
-   sllx%o0, 5, %o0
-   sethi   %hi(__socketcall_table_begin), %g2
-   or  %g2, %lo(__socketcall_table_begin), %g2
-   jmpl%g2 + %o0, %g0
-nop
-do_einval:
-   retl
-mov-EINVAL, %o0
-
-   .align  32
-__socketcall_table_begin:
-
-   /* Each entry is exactly 32 bytes. */
-do_sys_socket: /* sys_socket(int, int, int) */
-1: ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_socket), %g1
-2: ldswa   [%o1 + 0x8] %asi, %o2
-   jmpl%g1 + %lo(sys_socket), %g0
-3:  ldswa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-   nop
-do_sys_bind: /* sys_bind(int fd, struct sockaddr *, int) */
-4: ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_bind), %g1
-5: ldswa   [%o1 + 0x8] %asi, %o2
-   jmpl%g1 + %lo(sys_bind), %g0
-6:  lduwa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-   nop
-do_sys_connect: /* sys_connect(int, struct sockaddr *, int) */
-7: ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_connect), %g1
-8: ldswa   [%o1 + 0x8] %asi, %o2
-   jmpl%g1 + %lo(sys_connect), %g0
-9:  lduwa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-   nop
-do_sys_listen: /* sys_listen(int, int) */
-10:ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_listen), %g1
-   jmpl%g1 + %lo(sys_listen), %g0
-11: ldswa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-   nop
-   nop
-do_sys_accept: /* sys_accept(int, struct sockaddr *, int *) */
-12:ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_accept), %g1
-13:lduwa   [%o1 + 0x8] %asi, %o2
-   jmpl%g1 + %lo(sys_accept), %g0
-14: lduwa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-   nop
-do_sys_getsockname: /* sys_getsockname(int, struct sockaddr *, int *) */
-15:ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_getsockname), %g1
-16:lduwa   [%o1 + 0x8] %asi, %o2
-   jmpl%g1 + %lo(sys_getsockname), %g0
-17: lduwa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-   nop
-do_sys_getpeername: /* sys_getpeername(int, struct sockaddr *, int *) */
-18:ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_getpeername), %g1
-19:lduwa   [%o1 + 0x8] %asi, %o2
-   jmpl%g1 + %lo(sys_getpeername), %g0
-20: lduwa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-   nop
-do_sys_socketpair: /* sys_socketpair(int, int, int, int *) */
-21:ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_socketpair), %g1
-22:ldswa   [%o1 + 0x8] %asi, %o2
-23:lduwa   [%o1 + 0xc] %asi, %o3
-   jmpl%g1 + %lo(sys_socketpair), %g0
-24: ldswa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-do_sys_send: /* sys_send(int, void *, size_t, unsigned int) */
-25:ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_send), %g1
-26:lduwa   [%o1 + 0x8] %asi, %o2
-27:lduwa   [%o1 + 0xc] %asi, %o3
-   jmpl%g1 + %lo(sys_send), %g0
-28: lduwa  [%o1 + 0x4] %asi, %o1
-   nop
-   nop
-do_sys_recv: /* sys_recv(int, void *, size_t, unsigned int) */
-29:ldswa   [%o1 + 0x0] %asi, %o0
-   sethi   %hi(sys_recv), %g1
-30:lduwa   [%o1 + 0x8] %asi, %o2
-31:lduwa   [%o1 + 0xc] %asi, %o3
-   jmpl%g1 + %l

[PATCH 04/15] sparc: fix old compat_sys_select()

2024-06-20 Thread Arnd Bergmann

From: Arnd Bergmann 

sparc has two identical select syscalls at numbers 93 and 230, respectively.
During the conversion to the modern syscall.tbl format, the older one of the
two broke in compat mode, and now refers to the native 64-bit syscall.

Restore the correct behavior. This has very little effect, as glibc has
been using the newer number anyway.

Fixes: 6ff645dd683a ("sparc: add system call table generation support")
Signed-off-by: Arnd Bergmann 
---
 arch/sparc/kernel/syscalls/syscall.tbl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/sparc/kernel/syscalls/syscall.tbl 
b/arch/sparc/kernel/syscalls/syscall.tbl
index b354139b40be..5e55f73f9880 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -117,7 +117,7 @@
 90 common  dup2sys_dup2
 91 32  setfsuid32  sys_setfsuid
 92 common  fcntl   sys_fcntl   
compat_sys_fcntl
-93 common  select  sys_select
+93 common  select  sys_select  
compat_sys_select
 94 32  setfsgid32  sys_setfsgid
 95 common  fsync   sys_fsync
 96 common  setpriority sys_setpriority
-- 
2.39.2

[PATCH 03/15] mips: fix compat_sys_lseek syscall

2024-06-20 Thread Arnd Bergmann

From: Arnd Bergmann 

This is almost compatible, but passing a negative offset should result
in an EINVAL error, whereas mips o32 compat mode would instead seek to
a large 32-bit byte offset.

Use compat_sys_lseek() to correctly sign-extend the argument.

Signed-off-by: Arnd Bergmann 
---
 arch/mips/kernel/syscalls/syscall_o32.tbl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl 
b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 85751c9b9cdb..2439a2491cff 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -27,7 +27,7 @@
 17 o32 break   sys_ni_syscall
 # 18 was sys_stat
 18 o32 unused18sys_ni_syscall
-19 o32 lseek   sys_lseek
+19 o32 lseek   sys_lseek   
compat_sys_lseek
 20 o32 getpid  sys_getpid
 21 o32 mount   sys_mount
 22 o32 umount  sys_oldumount
-- 
2.39.2

[PATCH 02/15] syscalls: fix compat_sys_io_pgetevents_time64 usage

2024-06-20 Thread Arnd Bergmann

From: Arnd Bergmann 

Using sys_io_pgetevents() as the entry point for compat mode tasks
works almost correctly, but misses the sign extension for the min_nr
and nr arguments.

This was addressed on parisc by switching to
compat_sys_io_pgetevents_time64() in commit 6431e92fc827 ("parisc:
io_pgetevents_time64() needs compat syscall in 32-bit compat mode"),
as well as by using more sophisticated system call wrappers on x86 and
s390. However, arm64, mips, powerpc, sparc and riscv still have the
same bug.

Change all of them over to use compat_sys_io_pgetevents_time64()
like parisc already does. This was clearly the intention when the
function was originally added, but it got hooked up incorrectly in
the tables.

Cc: sta...@vger.kernel.org
Fixes: 48166e6ea47d ("y2038: add 64-bit time_t syscalls to all 32-bit 
architectures")
Signed-off-by: Arnd Bergmann 
---
 arch/arm64/include/asm/unistd32.h | 2 +-
 arch/mips/kernel/syscalls/syscall_n32.tbl | 2 +-
 arch/mips/kernel/syscalls/syscall_o32.tbl | 2 +-
 arch/powerpc/kernel/syscalls/syscall.tbl  | 2 +-
 arch/s390/kernel/syscalls/syscall.tbl | 2 +-
 arch/sparc/kernel/syscalls/syscall.tbl| 2 +-
 arch/x86/entry/syscalls/syscall_32.tbl| 2 +-
 include/uapi/asm-generic/unistd.h | 2 +-
 8 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/unistd32.h 
b/arch/arm64/include/asm/unistd32.h
index 266b96acc014..1386e8e751f2 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -840,7 +840,7 @@ __SYSCALL(__NR_pselect6_time64, compat_sys_pselect6_time64)
 #define __NR_ppoll_time64 414
 __SYSCALL(__NR_ppoll_time64, compat_sys_ppoll_time64)
 #define __NR_io_pgetevents_time64 416
-__SYSCALL(__NR_io_pgetevents_time64, sys_io_pgetevents)
+__SYSCALL(__NR_io_pgetevents_time64, compat_sys_io_pgetevents_time64)
 #define __NR_recvmmsg_time64 417
 __SYSCALL(__NR_recvmmsg_time64, compat_sys_recvmmsg_time64)
 #define __NR_mq_timedsend_time64 418
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl 
b/arch/mips/kernel/syscalls/syscall_n32.tbl
index cc869f5d5693..953f5b7dc723 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -354,7 +354,7 @@
 412n32 utimensat_time64sys_utimensat
 413n32 pselect6_time64 compat_sys_pselect6_time64
 414n32 ppoll_time64compat_sys_ppoll_time64
-416n32 io_pgetevents_time64sys_io_pgetevents
+416n32 io_pgetevents_time64compat_sys_io_pgetevents_time64
 417n32 recvmmsg_time64 compat_sys_recvmmsg_time64
 418n32 mq_timedsend_time64 sys_mq_timedsend
 419n32 mq_timedreceive_time64  sys_mq_timedreceive
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl 
b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 008ebe60263e..85751c9b9cdb 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -403,7 +403,7 @@
 412o32 utimensat_time64sys_utimensat   
sys_utimensat
 413o32 pselect6_time64 sys_pselect6
compat_sys_pselect6_time64
 414o32 ppoll_time64sys_ppoll   
compat_sys_ppoll_time64
-416o32 io_pgetevents_time64sys_io_pgetevents   
sys_io_pgetevents
+416o32 io_pgetevents_time64sys_io_pgetevents   
compat_sys_io_pgetevents_time64
 417o32 recvmmsg_time64 sys_recvmmsg
compat_sys_recvmmsg_time64
 418o32 mq_timedsend_time64 sys_mq_timedsend
sys_mq_timedsend
 419o32 mq_timedreceive_time64  sys_mq_timedreceive 
sys_mq_timedreceive
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl 
b/arch/powerpc/kernel/syscalls/syscall.tbl
index 3656f1ca7a21..c6b0546b284d 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -502,7 +502,7 @@
 41232  utimensat_time64sys_utimensat   
sys_utimensat
 41332  pselect6_time64 sys_pselect6
compat_sys_pselect6_time64
 41432  ppoll_time64sys_ppoll   
compat_sys_ppoll_time64
-41632  io_pgetevents_time64sys_io_pgetevents   
sys_io_pgetevents
+41632  io_pgetevents_time64sys_io_pgetevents   
compat_sys_io_pgetevents_time64
 41732  recvmmsg_time64 sys_recvmmsg
compat_sys_recvmmsg_time64
 41832  mq_timedsend_time64 sys_mq_timedsend
sys_mq_timedsend
 41932  mq_timedreceive_time64  sys_mq_timedreceive 
sys_mq_timedreceive
di

[PATCH 01/15] ftruncate: pass a signed offset

2024-06-20 Thread Arnd Bergmann

From: Arnd Bergmann 

The old ftruncate() syscall, using the 32-bit off_t, misses a sign
extension when called in compat mode on 64-bit architectures.  As a
result, passing a negative length accidentally succeeds in truncating
the file to a size between 2GiB and 4GiB.

Changing the type of the compat syscall to the signed compat_off_t
changes the behavior so it instead returns -EINVAL.

The native entry point, the truncate() syscall and the corresponding
loff_t based variants are all correct already and do not suffer
from this mistake.
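
To illustrate the difference, here is a minimal userspace sketch (not
kernel code; the function names are hypothetical) of how a 32-bit
length of -1 is widened to 64 bits under the old unsigned prototype
and under the new signed one, followed by a stand-in for the
negative-length check:

```c
#include <errno.h>
#include <stdint.h>

/* Old compat prototype: unsigned 32-bit length, zero-extended. */
static int64_t widen_unsigned(uint32_t length)
{
	return (int64_t)length;
}

/* New compat prototype: signed 32-bit length, sign-extended. */
static int64_t widen_signed(int32_t length)
{
	return (int64_t)length;
}

/* Stand-in for the check that rejects negative lengths. */
static int check_length(int64_t length)
{
	return length < 0 ? -EINVAL : 0;
}
```

With the unsigned type, -1 turns into 4294967295 and passes the check;
with the signed type it stays -1 and is rejected.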

Fixes: 3f6d078d4acc ("fix compat truncate/ftruncate")
Cc: sta...@vger.kernel.org
Signed-off-by: Arnd Bergmann 
---
 fs/open.c| 4 ++--
 include/linux/compat.h   | 2 +-
 include/linux/syscalls.h | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index 89cafb572061..50e45bc7c4d8 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -202,13 +202,13 @@ long do_sys_ftruncate(unsigned int fd, loff_t length, int 
small)
return error;
 }
 
-SYSCALL_DEFINE2(ftruncate, unsigned int, fd, unsigned long, length)
+SYSCALL_DEFINE2(ftruncate, unsigned int, fd, off_t, length)
 {
return do_sys_ftruncate(fd, length, 1);
 }
 
 #ifdef CONFIG_COMPAT
-COMPAT_SYSCALL_DEFINE2(ftruncate, unsigned int, fd, compat_ulong_t, length)
+COMPAT_SYSCALL_DEFINE2(ftruncate, unsigned int, fd, compat_off_t, length)
 {
return do_sys_ftruncate(fd, length, 1);
 }
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 233f61ec8afc..56cebaff0c91 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -608,7 +608,7 @@ asmlinkage long compat_sys_fstatfs(unsigned int fd,
 asmlinkage long compat_sys_fstatfs64(unsigned int fd, compat_size_t sz,
 struct compat_statfs64 __user *buf);
 asmlinkage long compat_sys_truncate(const char __user *, compat_off_t);
-asmlinkage long compat_sys_ftruncate(unsigned int, compat_ulong_t);
+asmlinkage long compat_sys_ftruncate(unsigned int, compat_off_t);
 /* No generic prototype for truncate64, ftruncate64, fallocate */
 asmlinkage long compat_sys_openat(int dfd, const char __user *filename,
  int flags, umode_t mode);
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 9104952d323d..ba9337709878 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -418,7 +418,7 @@ asmlinkage long sys_listmount(const struct mnt_id_req 
__user *req,
  u64 __user *mnt_ids, size_t nr_mnt_ids,
  unsigned int flags);
 asmlinkage long sys_truncate(const char __user *path, long length);
-asmlinkage long sys_ftruncate(unsigned int fd, unsigned long length);
+asmlinkage long sys_ftruncate(unsigned int fd, off_t length);
 #if BITS_PER_LONG == 32
 asmlinkage long sys_truncate64(const char __user *path, loff_t length);
 asmlinkage long sys_ftruncate64(unsigned int fd, loff_t length);
-- 
2.39.2

[PATCH 00/15] linux system call fixes

2024-06-20 Thread Arnd Bergmann

From: Arnd Bergmann 

I'm working on a cleanup series for Linux system call handling, trying to
unify some of the architecture specific code there among other things.

In the process, I came across a number of bugs that are ABI relevant,
so I'm trying to merge these first. I found all of these by inspection,
not by running the code, so any extra review would help. I assume some
of the issues were already caught by existing LTP tests, while for others
we could add a test. Again, I did not check what is already there.

The sync_file_range and fadvise64_64 changes on sh, csky and hexagon
are likely to also require changes in the libc implementation.

Once the patches are reviewed, I plan to merge my changes as bugfixes
through the asm-generic tree, but architecture maintainers can also
pick them up directly to speed up the bugfix.

 Arnd

Cc: linux-a...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: Thomas Bogendoerfer 
Cc: linux-m...@vger.kernel.org
Cc: Helge Deller 
Cc: linux-par...@vger.kernel.org
Cc: "David S. Miller" 
Cc: Andreas Larsson 
Cc: sparcli...@vger.kernel.org
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Christophe Leroy 
Cc: Naveen N. Rao 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Brian Cain 
Cc: linux-hexa...@vger.kernel.org
Cc: Guo Ren 
Cc: linux-c...@vger.kernel.org
Cc: Heiko Carstens 
Cc: linux-s...@vger.kernel.org
Cc: Rich Felker 
Cc: John Paul Adrian Glaubitz 
Cc: linux...@vger.kernel.org
Cc: "H. Peter Anvin" 
Cc: Alexander Viro 
Cc: Christian Brauner 
Cc: linux-fsde...@vger.kernel.org
Cc: libc-al...@sourceware.org
Cc: m...@lists.openwall.com
Cc: l...@lists.linux.it

Arnd Bergmann (15):
  ftruncate: pass a signed offset
  syscalls: fix compat_sys_io_pgetevents_time64 usage
  mips: fix compat_sys_lseek syscall
  sparc: fix old compat_sys_select()
  sparc: fix compat recv/recvfrom syscalls
  parisc: use correct compat recv/recvfrom syscalls
  parisc: use generic sys_fanotify_mark implementation
  powerpc: restore some missing spu syscalls
  sh: rework sync_file_range ABI
  csky, hexagon: fix broken sys_sync_file_range
  hexagon: fix fadvise64_64 calling conventions
  s390: remove native mmap2() syscall
  syscalls: mmap(): use unsigned offset type consistently
  asm-generic: unistd: fix time32 compat syscall handling
  linux/syscalls.h: add missing __user annotations

 arch/arm64/include/asm/unistd32.h |   2 +-
 arch/csky/include/uapi/asm/unistd.h   |   1 +
 arch/csky/kernel/syscall.c|   2 +-
 arch/hexagon/include/asm/syscalls.h   |   6 +
 arch/hexagon/include/uapi/asm/unistd.h|   1 +
 arch/hexagon/kernel/syscalltab.c  |   7 +
 arch/loongarch/kernel/syscall.c   |   2 +-
 arch/microblaze/kernel/sys_microblaze.c   |   2 +-
 arch/mips/kernel/syscalls/syscall_n32.tbl |   2 +-
 arch/mips/kernel/syscalls/syscall_o32.tbl |   4 +-
 arch/parisc/Kconfig   |   1 +
 arch/parisc/kernel/sys_parisc32.c |   9 -
 arch/parisc/kernel/syscalls/syscall.tbl   |   6 +-
 arch/powerpc/kernel/syscalls/syscall.tbl  |   6 +-
 arch/riscv/kernel/sys_riscv.c |   4 +-
 arch/s390/kernel/syscall.c|  27 ---
 arch/s390/kernel/syscalls/syscall.tbl |   2 +-
 arch/sh/kernel/sys_sh32.c |  11 ++
 arch/sh/kernel/syscalls/syscall.tbl   |   3 +-
 arch/sparc/kernel/sys32.S | 221 --
 arch/sparc/kernel/syscalls/syscall.tbl|   8 +-
 arch/x86/entry/syscalls/syscall_32.tbl|   2 +-
 fs/open.c |   4 +-
 include/asm-generic/syscalls.h|   2 +-
 include/linux/compat.h|   2 +-
 include/linux/syscalls.h  |  20 +-
 include/uapi/asm-generic/unistd.h | 146 +-
 27 files changed, 160 insertions(+), 343 deletions(-)
 create mode 100644 arch/hexagon/include/asm/syscalls.h

-- 
2.39.2

Re: [PATCH] powerpc: vdso: fix building with wrong-endian toolchain

2024-06-07 Thread Arnd Bergmann

On Fri, Jun 7, 2024, at 14:42, Michael Ellerman wrote:
> Arnd Bergmann  writes:
>>
>> Signed-off-by: Arnd Bergmann 
>> ---
>> I'm fairly sure this worked in the past, but I did not try to bisect the
>> issue.
>
> It still works for me.
>
> I use the korg toolchains every day, and kisskb uses them too.
>
> What commit / defconfig are you seeing the errors with?
>
> Is it just the 12.3.0 toolchain or all of them? I just tested 12.3.0
> here and it built OK.
>
> I guess you're building on x86 or arm64? I build on ppc64le, I wonder if
> that makes a difference.
>
> The patch is probably OK regardless, but I'd rather understand what the
> actual problem is.

I tested again and found that the problem is actually part of my
local build setup, which overrides the 'CPP' variable in the
top-level makefile that I use for building multiple kernels
concurrently.

This ends up clashing with this other line that only
powerpc sets:

arch/powerpc/Makefile:CPP   = $(CC) -E $(KBUILD_CFLAGS)

It's rare that someone overrides CPP, so quite possibly I'm
the only one that has seen this so far, but it also seems like
it should be possible to do that.

This patch seems to work as well for me, and is a little
more logical, but it's also more invasive and has a
higher regression risk:

8<-
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 65261cbe5bfd..9ad4ca318e34 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -62,14 +62,14 @@ KBUILD_LDFLAGS_MODULE += arch/powerpc/lib/crtsavres.o
 endif
 
 ifdef CONFIG_CPU_LITTLE_ENDIAN
-KBUILD_CFLAGS  += -mlittle-endian
+KBUILD_CPPFLAGS+= -mlittle-endian
 KBUILD_LDFLAGS += -EL
 LDEMULATION:= lppc
 GNUTARGET  := powerpcle
 MULTIPLEWORD   := -mno-multiple
 KBUILD_CFLAGS_MODULE += $(call cc-option,-mno-save-toc-indirect)
 else
-KBUILD_CFLAGS += $(call cc-option,-mbig-endian)
+KBUILD_CPPFLAGS += $(call cc-option,-mbig-endian)
 KBUILD_LDFLAGS += -EB
 LDEMULATION:= ppc
 GNUTARGET  := powerpc
@@ -95,7 +95,7 @@ aflags-$(CONFIG_CPU_BIG_ENDIAN)   += $(call 
cc-option,-mbig-endian)
 aflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mlittle-endian
 
 ifeq ($(HAS_BIARCH),y)
-KBUILD_CFLAGS  += -m$(BITS)
+KBUILD_CPPFLAGS+= -m$(BITS)
 KBUILD_AFLAGS  += -m$(BITS)
 KBUILD_LDFLAGS += -m elf$(BITS)$(LDEMULATION)
 endif
@@ -176,7 +176,6 @@ KBUILD_CPPFLAGS += -I $(srctree)/arch/powerpc $(asinstr)
 KBUILD_AFLAGS  += $(AFLAGS-y)
 KBUILD_CFLAGS  += $(call cc-option,-msoft-float)
 KBUILD_CFLAGS  += $(CFLAGS-y)
-CPP= $(CC) -E $(KBUILD_CFLAGS)
 
 CHECKFLAGS += -m$(BITS) -D__powerpc__ -D__powerpc$(BITS)__
 ifdef CONFIG_CPU_BIG_ENDIAN
diff --git a/arch/powerpc/kernel/vdso/Makefile 
b/arch/powerpc/kernel/vdso/Makefile
index 1b93655c2857..3516e71926e5 100644
--- a/arch/powerpc/kernel/vdso/Makefile
+++ b/arch/powerpc/kernel/vdso/Makefile
@@ -59,7 +59,7 @@ ldflags-$(CONFIG_LD_IS_LLD) += $(call 
cc-option,--ld-path=$(LD),-fuse-ld=lld)
 ldflags-$(CONFIG_LD_ORPHAN_WARN) += 
-Wl,--orphan-handling=$(CONFIG_LD_ORPHAN_WARN_LEVEL)
 
 # Filter flags that clang will warn are unused for linking
-ldflags-y += $(filter-out $(CC_AUTO_VAR_INIT_ZERO_ENABLER) $(CC_FLAGS_FTRACE) 
-Wa$(comma)%, $(KBUILD_CFLAGS))
+ldflags-y += $(filter-out $(CC_AUTO_VAR_INIT_ZERO_ENABLER) $(CC_FLAGS_FTRACE) 
-Wa$(comma)%, $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS))
 
 CC32FLAGS := -m32
 LD32FLAGS := -Wl,-soname=linux-vdso32.so.1
->8

 Arnd

[PATCH] powerpc: vdso: fix building with wrong-endian toolchain

2024-06-06 Thread Arnd Bergmann

From: Arnd Bergmann 

Building powerpc64le kernels with the kernel.org crosstool toolchains
no longer works as the linker attempts to build a big-endian vdso:

powerpc-linux/lib/gcc/powerpc-linux/12.3.0/../../../../powerpc-linux/bin/ld: 
arch/powerpc/kernel/vdso/sigtramp32-32.o: compiled for a little endian system 
and target is big endian
powerpc-linux/lib/gcc/powerpc-linux/12.3.0/../../../../powerpc-linux/bin/ld: 
failed to merge target specific data of file 
arch/powerpc/kernel/vdso/sigtramp32-32.o

Apparently creating the vdso.lds files from the lds.S files fails to
pass the -mlittle-endian argument here, so the output format gets set
wrong. Changing the conditional to check for CONFIG_CPU_LITTLE_ENDIAN
instead still works, as the kernel configuration definitions are visible.

Signed-off-by: Arnd Bergmann 
---
I'm fairly sure this worked in the past, but I did not try to bisect the
issue.
---
 arch/powerpc/kernel/vdso/vdso32.lds.S | 2 +-
 arch/powerpc/kernel/vdso/vdso64.lds.S | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/vdso/vdso32.lds.S 
b/arch/powerpc/kernel/vdso/vdso32.lds.S
index 426e1ccc6971..5845ea2d1cba 100644
--- a/arch/powerpc/kernel/vdso/vdso32.lds.S
+++ b/arch/powerpc/kernel/vdso/vdso32.lds.S
@@ -7,7 +7,7 @@
 #include 
 #include 
 
-#ifdef __LITTLE_ENDIAN__
+#ifdef CONFIG_CPU_LITTLE_ENDIAN
 OUTPUT_FORMAT("elf32-powerpcle", "elf32-powerpcle", "elf32-powerpcle")
 #else
 OUTPUT_FORMAT("elf32-powerpc", "elf32-powerpc", "elf32-powerpc")
diff --git a/arch/powerpc/kernel/vdso/vdso64.lds.S 
b/arch/powerpc/kernel/vdso/vdso64.lds.S
index bda6c8cdd459..82c418b18cce 100644
--- a/arch/powerpc/kernel/vdso/vdso64.lds.S
+++ b/arch/powerpc/kernel/vdso/vdso64.lds.S
@@ -7,7 +7,7 @@
 #include 
 #include 
 
-#ifdef __LITTLE_ENDIAN__
+#ifdef CONFIG_CPU_LITTLE_ENDIAN
 OUTPUT_FORMAT("elf64-powerpcle", "elf64-powerpcle", "elf64-powerpcle")
 #else
 OUTPUT_FORMAT("elf64-powerpc", "elf64-powerpc", "elf64-powerpc")
-- 
2.39.2

Re: [PATCH v3 0/3] arch: Remove fbdev dependency from video helpers

2024-05-03 Thread Arnd Bergmann

On Fri, Apr 5, 2024, at 11:04, Thomas Zimmermann wrote:
> Hi,
>
> if there are no further comments, can this series be merged through 
> asm-generic?

Sorry for the delay, I've merged these for asm-generic now.

  Arnd

Re: [PATCH v3 2/2] fs/xattr: add *at family syscalls

2024-04-26 Thread Arnd Bergmann

On Fri, Apr 26, 2024, at 18:20, Christian Göttsche wrote:
> From: Christian Göttsche 
>
> Add the four syscalls setxattrat(), getxattrat(), listxattrat() and
> removexattrat().  Those can be used to operate on extended attributes,
> especially security related ones, either relative to a pinned directory
> or on a file descriptor without read access, avoiding a
> /proc//fd/ detour, requiring a mounted procfs.
>
> One use case will be setfiles(8) setting SELinux file contexts
> ("security.selinux") without race conditions and without a file
> descriptor opened with read access requiring SELinux read permission.
>
> Use the do_{name}at() pattern from fs/open.c.
>
> Pass the value of the extended attribute, its length, and for
> setxattrat(2) the command (XATTR_CREATE or XATTR_REPLACE) via an added
> struct xattr_args to not exceed six syscall arguments and not
> merging the AT_* and XATTR_* flags.
>
> Signed-off-by: Christian Göttsche 
> CC: x...@kernel.org
> CC: linux-al...@vger.kernel.org
> CC: linux-ker...@vger.kernel.org
> CC: linux-arm-ker...@lists.infradead.org
> CC: linux-i...@vger.kernel.org
> CC: linux-m...@lists.linux-m68k.org
> CC: linux-m...@vger.kernel.org
> CC: linux-par...@vger.kernel.org
> CC: linuxppc-dev@lists.ozlabs.org
> CC: linux-s...@vger.kernel.org
> CC: linux...@vger.kernel.org
> CC: sparcli...@vger.kernel.org
> CC: linux-fsde...@vger.kernel.org
> CC: au...@vger.kernel.org
> CC: linux-a...@vger.kernel.org
> CC: linux-...@vger.kernel.org
> CC: linux-security-mod...@vger.kernel.org
> CC: seli...@vger.kernel.org

I checked that the syscalls are all well-formed regarding
argument types, number of arguments and (absence of)
compat handling, and that they are wired up correctly
across architectures.

I did not look at the actual implementation in detail.

Reviewed-by: Arnd Bergmann

Re: [PATCH] powerpc: drop port I/O helpers for CONFIG_HAS_IOPORT=n

2024-04-18 Thread Arnd Bergmann

On Fri, Apr 19, 2024, at 07:12, Michael Ellerman wrote:
> Michael Ellerman  writes:
>> "Arnd Bergmann"  writes:
>>>
>>> I had included this at first, but then I still ran into
>>> the same warnings because it ends up pulling in the
>>> generic outsb() etc from include/asm-generic/io.h
>>> that relies on setting a non-NULL PCI_IOBASE.
>>
>> Yes you're right. The above fixes the gcc build, but not clang.
>>
>> So I think I'll just cherry pick f0a816fb12da ("/dev/port: don't compile
>> file operations without CONFIG_DEVPORT") into my next and then apply
>> this. But will see if there's any other build failures over night.
>
> That didn't work. Still lots of drivers in my tree (based on rc2) which
> use inb/outb etc, and barf on the empty #define inb.

Right, the patches from Niklas only went into linux-next so far,
and a few are missing (including the 8250 one I think), so -rc2
at the moment regresses, but that doesn't have the warning either.

The idea of my patch was to both fix the current linux-next
build regression and have something that works in the long
run, I didn't expect it to work by itself. Sorry that wasn't
clear from my description.

> So I think this patch needs to wait until all the CONFIG_HAS_IOPORT
> checks have been merged for various drivers.
>
> For now the below fixes the clang warning. AFAICS it's safe because any
> code using inb() etc. with CONFIG_PCI=n is currently just doing a plain
> load from virtual address ~zero which should fault anyway.

If the port number is high enough, the current code might end
up referencing a user space address, depending on mmap_min_addr,
which defaults to 4096.

Using POISON_POINTER_DELTA is clearly an improvement over that.
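
A rough sketch of that arithmetic, assuming the accessor simply adds
the port number to the I/O base and that user mappings can start at
mmap_min_addr (the helper names are made up for illustration, this is
not the actual powerpc code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Default mmap_min_addr mentioned above. */
#define MMAP_MIN_ADDR_DEFAULT 4096UL

/* Hypothetical: address touched by a port access at a given base. */
static uintptr_t pio_address(uintptr_t io_base, unsigned long port)
{
	return io_base + port;
}

/* With a zero I/O base, can this port number land in user space? */
static bool port_reaches_user_range(unsigned long port)
{
	return pio_address(0, port) >= MMAP_MIN_ADDR_DEFAULT;
}
```

So a low port such as 0x3f8 would stay inside the unmapped first page,
while anything from 0x1000 upwards could point at a user mapping.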

> diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
> index 08c550ed49be..1cd6eb6c8101 100644
> --- a/arch/powerpc/include/asm/io.h
> +++ b/arch/powerpc/include/asm/io.h
> @@ -37,7 +37,7 @@ extern struct pci_dev *isa_bridge_pcidev;
>   * define properly based on the platform
>   */
>  #ifndef CONFIG_PCI
> -#define _IO_BASE   0
> +#define _IO_BASE   POISON_POINTER_DELTA
>  #define _ISA_MEM_BASE  0
>  #define PCI_DRAM_OFFSET 0
>  #elif defined(CONFIG_PPC32)

You may need to double-check, but I think for ppc64 we
can just unconditionally set _IO_BASE to ISA_IO_BASE
regardless of CONFIG_PCI. 3d5134ee8341 ("[POWERPC]
Rewrite IO allocation & mapping on powerpc64") ensured
that the I/O space is only ever mapped to this virtual
address, and the same method is used with the
asm-generic/io.h implementation on arm/arm64/loongarch/
m68k/riscv/xtensa. Using this would be both safer and
more efficient than the current version.
It should also not cause any regressions ;-)

Unfortunately, ppc32 never got that cleanup, so
POISON_POINTER_DELTA is probably still best until Niklas's
series is merged. You could set _ISA_MEM_BASE to the same 
here for good measure.

[another side note: the non-zero PCI_DRAM_OFFSET looks
unnecessary as well now, apparently this was meant for
ibm cpc710 and ppc440 platforms that are no longer
supported.]

 Arnd

Re: [PATCH] powerpc: drop port I/O helpers for CONFIG_HAS_IOPORT=n

2024-04-18 Thread Arnd Bergmann

On Thu, Apr 18, 2024, at 08:26, Michael Ellerman wrote:
> Arnd Bergmann  writes:

> @@ -692,6 +692,7 @@ static inline void name at  
> \
>  #define writesw writesw
>  #define writesl writesl
>
> +#ifdef CONFIG_HAS_IOPORT
>  #define inb inb
>  #define inw inw
>  #define inl inl
> @@ -704,6 +705,8 @@ static inline void name at  
> \
>  #define outsb outsb
>  #define outsw outsw
>  #define outsl outsl
> +#endif // CONFIG_HAS_IOPORT
> +
>  #ifdef __powerpc64__
>  #define readq  readq
>  #define writeq writeq

I had included this at first, but then I still ran into
the same warnings because it ends up pulling in the
generic outsb() etc from include/asm-generic/io.h
that relies on setting a non-NULL PCI_IOBASE.

  Arnd

[PATCH] powerpc: drop port I/O helpers for CONFIG_HAS_IOPORT=n

2024-04-16 Thread Arnd Bergmann

From: Arnd Bergmann 

Calling inb()/outb() on powerpc when CONFIG_PCI is disabled causes
a NULL pointer dereference, which is bad for a number of reasons.

After my patch to turn on -Werror in linux-next, this caused a
compile-time warning with clang:

In file included from arch/powerpc/include/asm/io.h:672:
arch/powerpc/include/asm/io-defs.h:43:1: error: performing pointer
arithmetic on a null pointer has undefined behavior
[-Werror,-Wnull-pointer-arithmetic]
   43 | DEF_PCI_AC_NORET(insb, (unsigned long p, void *b, unsigned long c),
  | ^~~
   44 |  (p, b, c), pio, p)
  |  ~~

In this configuration, CONFIG_HAS_IOPORT is already disabled, and all
drivers that use inb()/outb() should now depend on that (some patches are
still in the process of getting merged).

Hide all references to inb()/outb() in the powerpc code and the definitions
when HAS_IOPORT is disabled to remove the possible NULL pointer access.
The same should happen in asm-generic in the near future, but for now
the empty inb() macros are still defined to ensure the generic version
does not get pulled in.

Signed-off-by: Arnd Bergmann 
Reported-by: Naresh Kamboju 
--

Cc: linux-ker...@vger.kernel.org
Cc: linuxppc-dev 
Cc: Aneesh Kumar K.V 
Cc: Anders Roxell 
Cc: Kees Cook 
Cc: Niklas Schnelle 
Cc: clang-built-linux 
Cc: Nick Desaulniers 
Cc: Nathan Chancellor 
Cc: Jeff Xu 
Cc: Naveen N. Rao 
Cc: Dan Carpenter 
---
 arch/powerpc/include/asm/dma.h | 12 
 arch/powerpc/include/asm/io-defs.h |  4 
 arch/powerpc/include/asm/io.h  | 19 +++
 arch/powerpc/kernel/iomap.c|  4 
 arch/powerpc/kernel/traps.c|  2 +-
 5 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/dma.h b/arch/powerpc/include/asm/dma.h
index d97c66d9ae34..004a868f82c9 100644
--- a/arch/powerpc/include/asm/dma.h
+++ b/arch/powerpc/include/asm/dma.h
@@ -3,6 +3,12 @@
 #define _ASM_POWERPC_DMA_H
 #ifdef __KERNEL__
 
+/* The maximum address that we can perform a DMA transfer to on this platform 
*/
+/* Doesn't really apply... */
+#define MAX_DMA_ADDRESS(~0UL)
+
+#ifdef CONFIG_HAS_IOPORT
+
 /*
  * Defines for using and allocating dma channels.
  * Written by Hennus Bergman, 1992.
@@ -26,10 +32,6 @@
 #define MAX_DMA_CHANNELS   8
 #endif
 
-/* The maximum address that we can perform a DMA transfer to on this platform 
*/
-/* Doesn't really apply... */
-#define MAX_DMA_ADDRESS(~0UL)
-
 #ifdef HAVE_REALLY_SLOW_DMA_CONTROLLER
 #define dma_outb   outb_p
 #else
@@ -340,5 +342,7 @@ extern int request_dma(unsigned int dmanr, const char 
*device_id);
 /* release it again */
 extern void free_dma(unsigned int dmanr);
 
+#endif
+
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_DMA_H */
diff --git a/arch/powerpc/include/asm/io-defs.h 
b/arch/powerpc/include/asm/io-defs.h
index faf8617cc574..8d2209af7759 100644
--- a/arch/powerpc/include/asm/io-defs.h
+++ b/arch/powerpc/include/asm/io-defs.h
@@ -20,12 +20,14 @@ DEF_PCI_AC_NORET(writeq, (u64 val, PCI_IO_ADDR addr), (val, 
addr), mem, addr)
 DEF_PCI_AC_NORET(writeq_be, (u64 val, PCI_IO_ADDR addr), (val, addr), mem, 
addr)
 #endif /* __powerpc64__ */
 
+#ifdef CONFIG_HAS_IOPORT
 DEF_PCI_AC_RET(inb, u8, (unsigned long port), (port), pio, port)
 DEF_PCI_AC_RET(inw, u16, (unsigned long port), (port), pio, port)
 DEF_PCI_AC_RET(inl, u32, (unsigned long port), (port), pio, port)
 DEF_PCI_AC_NORET(outb, (u8 val, unsigned long port), (val, port), pio, port)
 DEF_PCI_AC_NORET(outw, (u16 val, unsigned long port), (val, port), pio, port)
 DEF_PCI_AC_NORET(outl, (u32 val, unsigned long port), (val, port), pio, port)
+#endif
 
 DEF_PCI_AC_NORET(readsb, (const PCI_IO_ADDR a, void *b, unsigned long c),
 (a, b, c), mem, a)
@@ -40,6 +42,7 @@ DEF_PCI_AC_NORET(writesw, (PCI_IO_ADDR a, const void *b, 
unsigned long c),
 DEF_PCI_AC_NORET(writesl, (PCI_IO_ADDR a, const void *b, unsigned long c),
 (a, b, c), mem, a)
 
+#ifdef CONFIG_HAS_IOPORT
 DEF_PCI_AC_NORET(insb, (unsigned long p, void *b, unsigned long c),
 (p, b, c), pio, p)
 DEF_PCI_AC_NORET(insw, (unsigned long p, void *b, unsigned long c),
@@ -52,6 +55,7 @@ DEF_PCI_AC_NORET(outsw, (unsigned long p, const void *b, 
unsigned long c),
 (p, b, c), pio, p)
 DEF_PCI_AC_NORET(outsl, (unsigned long p, const void *b, unsigned long c),
 (p, b, c), pio, p)
+#endif
 
 DEF_PCI_AC_NORET(memset_io, (PCI_IO_ADDR a, int c, unsigned long n),
 (a, c, n), mem, a)
diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
index 08c550ed49be..86c212fcbc0c 100644
--- a/arch/powerpc/include/asm/io.h
+++ b/arch/powerpc/include/asm/io.h
@@ -37,7 +37,6 @@ extern struct pci_dev *isa_bridge_pcidev;
  * define properly based on the platform
  */
 #ifndef CONFIG_PCI
-#de

Re: powerpc: io-defs.h:43:1: error: performing pointer arithmetic on a null pointer has undefined behavior [-Werror,-Wnull-pointer-arithmetic]

2024-04-16 Thread Arnd Bergmann

On Tue, Apr 16, 2024, at 14:42, Dan Carpenter wrote:
> On Tue, Apr 16, 2024 at 01:55:57PM +0200, Arnd Bergmann wrote:
>> On Tue, Apr 16, 2024, at 13:01, Arnd Bergmann wrote:
>> > On Tue, Apr 16, 2024, at 11:32, Naresh Kamboju wrote:
>> >> The Powerpc clang builds failed due to following warnings / errors on the
>> >> Linux next-20240416 tag.
>> >>
>> >> Powerpc:
>> >>  - tqm8xx_defconfig + clang-17 - Failed
>> >>  - allnoconfig + clang-17 - Failed
>> >>  - tinyconfig + clang-17 - Failed
>> >>
>> >> Reported-by: Linux Kernel Functional Testing 
>> >
>> > I'm not sure why this showed up now, but there is a series from
>> > in progress that will avoid this in the future, as the same
>> > issue is present on a couple of other architectures.
>> >
>> 
>> I see now, it was introduced by my patch to turn on -Wextra
>> by default. I had tested that patch on all architectures
>> with allmodconfig and defconfig, but I did not test any
>> powerpc configs with PCI disabled.
>
> I think this warning is clang specific as well...  (Maybe clang was
> included in all architectures but I'm not sure).

Yes, I did test with both gcc and clang where supported.

 Arnd

Re: powerpc: io-defs.h:43:1: error: performing pointer arithmetic on a null pointer has undefined behavior [-Werror,-Wnull-pointer-arithmetic]

2024-04-16 Thread Arnd Bergmann

On Tue, Apr 16, 2024, at 13:01, Arnd Bergmann wrote:
> On Tue, Apr 16, 2024, at 11:32, Naresh Kamboju wrote:
>> The Powerpc clang builds failed due to following warnings / errors on the
>> Linux next-20240416 tag.
>>
>> Powerpc:
>>  - tqm8xx_defconfig + clang-17 - Failed
>>  - allnoconfig + clang-17 - Failed
>>  - tinyconfig + clang-17 - Failed
>>
>> Reported-by: Linux Kernel Functional Testing 
>
> I'm not sure why this showed up now, but there is a series from
> in progress that will avoid this in the future, as the same
> issue is present on a couple of other architectures.
>

I see now, it was introduced by my patch to turn on -Wextra
by default. I had tested that patch on all architectures
with allmodconfig and defconfig, but I did not test any
powerpc configs with PCI disabled.

 Arnd

Re: powerpc: io-defs.h:43:1: error: performing pointer arithmetic on a null pointer has undefined behavior [-Werror,-Wnull-pointer-arithmetic]

2024-04-16 Thread Arnd Bergmann

On Tue, Apr 16, 2024, at 11:32, Naresh Kamboju wrote:
> The Powerpc clang builds failed due to following warnings / errors on the
> Linux next-20240416 tag.
>
> Powerpc:
>  - tqm8xx_defconfig + clang-17 - Failed
>  - allnoconfig + clang-17 - Failed
>  - tinyconfig + clang-17 - Failed
>
> Reported-by: Linux Kernel Functional Testing 

I'm not sure why this showed up now, but there is a series from
in progress that will avoid this in the future, as the same
issue is present on a couple of other architectures.

The broken definitions are in the !CONFIG_PCI path of

#ifndef CONFIG_PCI
#define _IO_BASE0
#define _ISA_MEM_BASE   0
#define PCI_DRAM_OFFSET 0
#elif defined(CONFIG_PPC32)
#define _IO_BASEisa_io_base
#define _ISA_MEM_BASE   isa_mem_base
#define PCI_DRAM_OFFSET pci_dram_offset
#else
#define _IO_BASEpci_io_base
#define _ISA_MEM_BASE   isa_mem_base
#define PCI_DRAM_OFFSET 0
#endif

Once the series is merged, the !PCI case can disable
CONFIG_HAS_IOPORT and move all references to it into #ifdef
sections, something like the (incomplete) patch below.

It looks like regardless of this, powerpc can also just set
_IO_BASE to ISA_IO_BASE unconditionally, but I could be missing
something there.
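
A small userspace sketch of why a non-NULL base avoids the warning. It assumes the port accessors compute "base + port" on a pointer type. demo_io_window and the demo_* helpers are hypothetical stand-ins for a mapped I/O window, not the powerpc implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Stands in for a region of mapped I/O space. */
static uint8_t demo_io_window[16];

/* With a base of 0 the addition below would be arithmetic on a null
 * pointer; pointing the base at a real object keeps it well-defined. */
#define DEMO_IO_BASE ((uint8_t *)demo_io_window)

static inline void demo_outb(uint8_t val, unsigned long port)
{
	*(volatile uint8_t *)(DEMO_IO_BASE + port) = val;
}

static inline uint8_t demo_inb(unsigned long port)
{
	return *(volatile uint8_t *)(DEMO_IO_BASE + port);
}
```

Writing 0x5a with demo_outb(0x5a, 3) and reading it back with demo_inb(3) returns the same value, and the compiler never sees pointer arithmetic on a constant null base.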

 Arnd

---
diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
index 08c550ed49be..29e002b9316c 100644
--- a/arch/powerpc/include/asm/io.h
+++ b/arch/powerpc/include/asm/io.h
@@ -36,11 +36,8 @@ extern struct pci_dev *isa_bridge_pcidev;
  * bases. Most of this file only uses _IO_BASE though which we
  * define properly based on the platform
  */
-#ifndef CONFIG_PCI
-#define _IO_BASE   0
-#define _ISA_MEM_BASE  0
-#define PCI_DRAM_OFFSET 0
-#elif defined(CONFIG_PPC32)
+#ifdef CONFIG_HAS_IOPORT
+#ifdef CONFIG_PPC32
 #define _IO_BASE   isa_io_base
 #define _ISA_MEM_BASE  isa_mem_base
 #define PCI_DRAM_OFFSETpci_dram_offset
@@ -486,8 +483,7 @@ static inline u64 __raw_rm_readq(volatile void __iomem 
*paddr)
  * to port it over
  */
 
-#ifdef CONFIG_PPC32
-
+#if defined(CONFIG_PPC32) && defined(CONFIG_HAS_IOPORT)
 #define __do_in_asm(name, op)  \
 static inline unsigned int name(unsigned int port) \
 {  \
@@ -534,7 +530,7 @@ __do_out_asm(_rec_outb, "stbx")
 __do_out_asm(_rec_outw, "sthbrx")
 __do_out_asm(_rec_outl, "stwbrx")
 
-#endif /* CONFIG_PPC32 */
+#endif /* CONFIG_PPC32 && CONFIG_HAS_IOPORT */
 
 /* The "__do_*" operations below provide the actual "base" implementation
  * for each of the defined accessors. Some of them use the out_* functions
@@ -577,6 +573,7 @@ __do_out_asm(_rec_outl, "stwbrx")
 #define __do_readq_be(addr)in_be64(PCI_FIX_ADDR(addr))
 #endif /* !defined(CONFIG_EEH) */
 
+#ifdef CONFIG_HAS_IOPORT
 #ifdef CONFIG_PPC32
 #define __do_outb(val, port)   _rec_outb(val, port)
 #define __do_outw(val, port)   _rec_outw(val, port)
@@ -592,6 +589,7 @@ __do_out_asm(_rec_outl, "stwbrx")
 #define __do_inw(port) readw((PCI_IO_ADDR)_IO_BASE + port);
 #define __do_inl(port) readl((PCI_IO_ADDR)_IO_BASE + port);
 #endif /* !CONFIG_PPC32 */
+#endif
 
 #ifdef CONFIG_EEH
 #define __do_readsb(a, b, n)   eeh_readsb(PCI_FIX_ADDR(a), (b), (n))
@@ -606,12 +604,14 @@ __do_out_asm(_rec_outl, "stwbrx")
 #define __do_writesw(a, b, n)  _outsw(PCI_FIX_ADDR(a),(b),(n))
 #define __do_writesl(a, b, n)  _outsl(PCI_FIX_ADDR(a),(b),(n))
 
+#ifdef CONFIG_HAS_IOPORT
 #define __do_insb(p, b, n) readsb((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
 #define __do_insw(p, b, n) readsw((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
 #define __do_insl(p, b, n) readsl((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
 #define __do_outsb(p, b, n)writesb((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
 #define __do_outsw(p, b, n)writesw((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
 #define __do_outsl(p, b, n)writesl((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
+#endif
 
 #define __do_memset_io(addr, c, n) \
_memset_io(PCI_FIX_ADDR(addr), c, n)
@@ -689,6 +689,8 @@ static inline void name at  
\
 #define writesb writesb
 #define writesw writesw
 #define writesl writesl
+
+#ifdef CONFIG_HAS_IOPORT
 #define inb inb
 #define inw inw
 #define inl inl
@@ -701,6 +703,8 @@ static inline void name at  
\
 #define outsb outsb
 #define outsw outsw
 #define outsl outsl
+#endif
+
 #ifdef __powerpc64__
 #define readq  readq
 #define writeq writeq

Re: [PATCH] bug: Fix no-return-statement warning with !CONFIG_BUG

2024-04-15 Thread Arnd Bergmann

On Mon, Apr 15, 2024, at 19:07, Christophe Leroy wrote:
> Le 15/04/2024 à 17:35, Arnd Bergmann a écrit :
>> 
>> I haven't seen a good solution here. Ideally we'd just define
>> the functions unconditionally and have IS_ENABLED() take care
>> of letting the compiler drop them silently, but that doesn't
>> build because of missing struct members.
>> 
>> I won't object to either an 'extern' declaration or the
>> 'BUILD_BUG_ON()' if you and others prefer that, both are better
>> than BUG() here. I still think my suggestion would be a little
>> simpler.
>
> The advantage of the BUILD_BUG() against the extern is that the error 
> gets detected at buildtime. With the extern it gets detected only at 
> link-time.
>
> But agree with you, the missing struct members defeats the advantages of 
> IS_ENABLED().
>
> At the end, how many instances of struct timekeeper do we have in the 
> system ? With a quick look I see only two instances: tkcore.timekeeper 
> and shadow_timekeeper. If I'm correct, wouldn't it just be simpler to 
> have the three debug struct members defined at all time ?

Sure, this version looks fine to me, and passes a simple build
test without CONFIG_DEBUG_TIMEKEEPING.

Arnd

diff --git a/include/linux/timekeeper_internal.h 
b/include/linux/timekeeper_internal.h
index 84ff2844df2a..485677a98b0b 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -124,7 +124,6 @@ struct timekeeper {
u32 ntp_err_mult;
/* Flag used to avoid updating NTP twice with same second */
u32 skip_second_overflow;
-#ifdef CONFIG_DEBUG_TIMEKEEPING
longlast_warning;
/*
 * These simple flag variables are managed
@@ -135,7 +134,6 @@ struct timekeeper {
 */
int underflow_seen;
int overflow_seen;
-#endif
 };
 
 #ifdef CONFIG_GENERIC_TIME_VSYSCALL
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 4e18db1819f8..17f7aed807e1 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -195,7 +195,6 @@ static inline u64 tk_clock_read(const struct tk_read_base 
*tkr)
return clock->read(clock);
 }
 
-#ifdef CONFIG_DEBUG_TIMEKEEPING
 #define WARNING_FREQ (HZ*300) /* 5 minute rate-limiting */
 
 static void timekeeping_check_update(struct timekeeper *tk, u64 offset)
@@ -276,15 +275,6 @@ static inline u64 timekeeping_debug_get_ns(const struct 
tk_read_base *tkr)
/* timekeeping_cycles_to_ns() handles both under and overflow */
return timekeeping_cycles_to_ns(tkr, now);
 }
-#else
-static inline void timekeeping_check_update(struct timekeeper *tk, u64 offset)
-{
-}
-static inline u64 timekeeping_debug_get_ns(const struct tk_read_base *tkr)
-{
-   BUG();
-}
-#endif
 
 /**
  * tk_setup_internals - Set up internals to use clocksource clock.
@@ -2173,7 +2163,8 @@ static bool timekeeping_advance(enum timekeeping_adv_mode 
mode)
goto out;
 
/* Do some additional sanity checking */
-   timekeeping_check_update(tk, offset);
+   if (IS_ENABLED(CONFIG_DEBUG_TIMEKEEPING))
+   timekeeping_check_update(tk, offset);
 
/*
 * With NO_HZ we may have to accumulate many cycle_intervals

Re: [PATCH] bug: Fix no-return-statement warning with !CONFIG_BUG

2024-04-15 Thread Arnd Bergmann

On Mon, Apr 15, 2024, at 04:19, Michael Ellerman wrote:
> "Arnd Bergmann"  writes:
>> On Thu, Apr 11, 2024, at 11:27, Adrian Hunter wrote:
>>> On 11/04/24 11:22, Christophe Leroy wrote:
>>>
>>> That is fragile because it depends on defined(__OPTIMIZE__),
>>> so it should still be:
>>
>> If there is a function that is defined but that must never be
>> called, I think we are doing something wrong.
>
> It's a pretty inevitable result of using IS_ENABLED(), which the docs
> encourage people to use.

Using IS_ENABLED() is usually a good idea, as it helps avoid
adding extra #ifdef checks and just drops static functions as
dead code, or lets you call extern functions that are conditionally
defined in a different file.

The thing is that here it does not do either of those and
adds more complexity than it avoids.
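
For reference, a userspace sketch of the usual case where IS_ENABLED() does help. The macro chain is modelled on the trick in include/linux/kconfig.h but renamed, and CONFIG_DEMO_DEBUG and CONFIG_DEMO_FAST_PATH are hypothetical options, so treat this as an illustration rather than the kernel's definition.

```c
#include <assert.h>

/* Expands to 1 if the option is defined as 1, and to 0 if it is undefined. */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define demo_take_second_arg(ignored, val, ...) val
#define demo_is_defined3(arg1_or_junk) \
	demo_take_second_arg(arg1_or_junk 1, 0, 0)
#define demo_is_defined2(val) demo_is_defined3(DEMO_ARG_PLACEHOLDER_##val)
#define demo_is_defined(x) demo_is_defined2(x)
#define DEMO_IS_ENABLED(option) demo_is_defined(option)

#define CONFIG_DEMO_FAST_PATH 1
/* CONFIG_DEMO_DEBUG is deliberately left undefined. */

/* Always compiled and type-checked, even when the option is off. */
static int demo_debug_check(int x)
{
	return x < 0 ? -1 : x;
}

static int demo_read(int x)
{
	/* Constant condition: the branch is parsed, then dropped as dead code. */
	if (DEMO_IS_ENABLED(CONFIG_DEMO_DEBUG))
		return demo_debug_check(x);
	return x * 2;
}
```

The static helper needs no #ifdef and produces no unused-function warning. This only works when everything inside the dead branch still compiles, which is what fails once the branch touches struct members that exist only under an #ifdef.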

> In this case it could easily be turned into a build error by just making
> it an extern rather than a static inline.
>
> But I think Christophe's solution is actually better, because it's more
> explicit, ie. this function should not be called and if it is that's a
> build time error.

I haven't seen a good solution here. Ideally we'd just define
the functions unconditionally and have IS_ENABLED() take care
of letting the compiler drop them silently, but that doesn't
build because of missing struct members.

I won't object to either an 'extern' declaration or the
'BUILD_BUG_ON()' if you and others prefer that, both are better
than BUG() here. I still think my suggestion would be a little
simpler.

 Arnd

Re: [RESEND PATCH net v4 1/2] soc: fsl: qbman: Always disable interrupts when taking cgr_lock

2024-04-11 Thread Arnd Bergmann

On Wed, Apr 10, 2024, at 06:54, Christophe Leroy wrote:
> Le 19/02/2024 à 16:30, Vladimir Oltean a écrit :
>> On Thu, Feb 15, 2024 at 11:23:26AM -0500, Sean Anderson wrote:
>>> smp_call_function_single disables IRQs when executing the callback. To
>>> prevent deadlocks, we must disable IRQs when taking cgr_lock elsewhere.
>>> This is already done by qman_update_cgr and qman_delete_cgr; fix the
>>> other lockers.
>>>
>>> Fixes: 96f413f47677 ("soc/fsl/qbman: fix issue in qman_delete_cgr_safe()")
>>> CC: sta...@vger.kernel.org
>>> Signed-off-by: Sean Anderson 
>>> Reviewed-by: Camelia Groza 
>>> Tested-by: Vladimir Oltean 
>>> ---
>>> I got no response the first time I sent this, so I am resending to net.
>>> This issue was introduced in a series which went through net, so I hope
>>> it makes sense to take it via net.
>>>
>>> [1] 
>>> https://lore.kernel.org/linux-arm-kernel/20240108161904.2865093-1-sean.ander...@seco.com/
>>>
>>> (no changes since v3)
>>>
>>> Changes in v3:
>>> - Change blamed commit to something more appropriate
>>>
>>> Changes in v2:
>>> - Fix one additional call to spin_unlock
>> 
>> Leo Li (Li Yang) is no longer with NXP. Until we figure out within NXP
>> how to continue with the maintainership of drivers/soc/fsl/, yes, please
>> continue to submit this series to 'net'. I would also like to point
>> out to Arnd that this is the case.
>> 
>> Arnd, a large portion of drivers/soc/fsl/ is networking-related
>> (dpio, qbman). Would it make sense to transfer the maintainership
>> of these under the respective networking drivers, to simplify the
>> procedures?

If there are parts that are only used by networking, I'm definitely
fine with moving those out of drivers/soc into the respective users,
but as far as I can tell, all the code there is shared by multiple
subsystems (crypto, dma, usb, ...), so that would likely require
at least a reorganization.

> I see FREESCALE QUICC ENGINE LIBRARY (drivers/soc/fsl/qe/) is maintained 
> by Qiang Zhao  but I can't find any mail from him in 
> the past 4 years in linuxppc-dev list, and everytime I wanted to submit 
> something I only got responses from Leo Ly.
>
> The last commit he reviewed is 661ea25e5319 ("soc: fsl: qe: Replace 
> one-element array and use struct_size() helper"), it was in May 2020.
>
> Is he still working at NXP and actively maintaining that library ? 
> Keeping this part maintained is vital for me as this SOC is embedded in 
> the two powerpc platform I maintain (8xx and 83xx).
>
> If Qiang Zhao is not able to activaly maintain that SOC anymore, I 
> volonteer to maintain it.

Thanks, much appreciated. The QE driver is also used on
arm64/ls1043a, but I have not seen any email or pull requests
from Qiang Zhao for that driver either.

The previous setup was that Li Yang picked up patches for
anything under drivers/soc/fsl/ and forwarded them to
s...@kernel.org for me to pick up.

I would very much like to get back to the state of having
one or two maintainers for all of drivers/soc/fsl/ and
not have to worry about individual drivers under it when
they are all maintained by different people.

Shawn Guo is already maintaining the arm64 side of
Layerscape in addition to the i.MX code. Herve Codina in
turn has taken responsibility for qe/qmc.c and qe/tsa.c.

Maybe you can pick one or more maintainers for
drivers/soc/fsl/ between the three of you to collect
patches into a git branch and send pull requests to
s...@kernel.org?

  Arnd

Re: [PATCH] bug: Fix no-return-statement warning with !CONFIG_BUG

2024-04-11 Thread Arnd Bergmann

On Thu, Apr 11, 2024, at 11:27, Adrian Hunter wrote:
> On 11/04/24 11:22, Christophe Leroy wrote:
>> Le 11/04/2024 à 10:12, Christophe Leroy a écrit :
>>>
>>> Looking at the report, I think the correct fix should be to use 
>>> BUILD_BUG() instead of BUG()
>> 
>> I confirm the error goes away with the following change to next-20240411 
>> on powerpc tinyconfig with gcc 13.2
>> 
>> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
>> index 4e18db1819f8..3d5ac0cdd721 100644
>> --- a/kernel/time/timekeeping.c
>> +++ b/kernel/time/timekeeping.c
>> @@ -282,7 +282,7 @@ static inline void timekeeping_check_update(struct 
>> timekeeper *tk, u64 offset)
>>   }
>>   static inline u64 timekeeping_debug_get_ns(const struct tk_read_base *tkr)
>>   {
>> -BUG();
>> +BUILD_BUG();
>>   }
>>   #endif
>> 
>
> That is fragile because it depends on defined(__OPTIMIZE__),
> so it should still be:

If there is a function that is defined but that must never be
called, I think we are doing something wrong. Before
e8e9d21a5df6 ("timekeeping: Refactor timekeeping helpers"),
the #ifdef made some sense, but now the #else is not really
that useful.

Ideally we would make timekeeping_debug_get_delta() and
timekeeping_check_update() just return in case of
!IS_ENABLED(CONFIG_DEBUG_TIMEKEEPING), but unfortunately
the code uses some struct members that are undefined then.

The patch below moves the #ifdef check into these functions,
which is not great, but it avoids defining useless
functions. Maybe there is a better way here. How about
just removing the BUG()?

 Arnd

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 4e18db1819f8..16c6dba64dd6 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -195,12 +195,11 @@ static inline u64 tk_clock_read(const struct tk_read_base 
*tkr)
return clock->read(clock);
 }
 
-#ifdef CONFIG_DEBUG_TIMEKEEPING
 #define WARNING_FREQ (HZ*300) /* 5 minute rate-limiting */
 
 static void timekeeping_check_update(struct timekeeper *tk, u64 offset)
 {
-
+#ifdef CONFIG_DEBUG_TIMEKEEPING
u64 max_cycles = tk->tkr_mono.clock->max_cycles;
const char *name = tk->tkr_mono.clock->name;
 
@@ -235,12 +234,19 @@ static void timekeeping_check_update(struct timekeeper 
*tk, u64 offset)
}
tk->overflow_seen = 0;
}
+#endif
 }
 
 static inline u64 timekeeping_cycles_to_ns(const struct tk_read_base *tkr, u64 
cycles);
 
-static inline u64 timekeeping_debug_get_ns(const struct tk_read_base *tkr)
+static u64 __timekeeping_get_ns(const struct tk_read_base *tkr)
+{
+   return timekeeping_cycles_to_ns(tkr, tk_clock_read(tkr));
+}
+
+static inline u64 timekeeping_get_ns(const struct tk_read_base *tkr)
 {
+#ifdef CONFIG_DEBUG_TIMEKEEPING
struct timekeeper *tk = &tk_core.timekeeper;
u64 now, last, mask, max, delta;
unsigned int seq;
@@ -275,16 +281,10 @@ static inline u64 timekeeping_debug_get_ns(const struct 
tk_read_base *tkr)
 
/* timekeeping_cycles_to_ns() handles both under and overflow */
return timekeeping_cycles_to_ns(tkr, now);
-}
 #else
-static inline void timekeeping_check_update(struct timekeeper *tk, u64 offset)
-{
-}
-static inline u64 timekeeping_debug_get_ns(const struct tk_read_base *tkr)
-{
-   BUG();
-}
+   return __timekeeping_get_ns(tkr);
 #endif
+}
 
 /**
  * tk_setup_internals - Set up internals to use clocksource clock.
@@ -390,19 +390,6 @@ static inline u64 timekeeping_cycles_to_ns(const struct 
tk_read_base *tkr, u64 c
return ((delta * tkr->mult) + tkr->xtime_nsec) >> tkr->shift;
 }
 
-static __always_inline u64 __timekeeping_get_ns(const struct tk_read_base *tkr)
-{
-   return timekeeping_cycles_to_ns(tkr, tk_clock_read(tkr));
-}
-
-static inline u64 timekeeping_get_ns(const struct tk_read_base *tkr)
-{
-   if (IS_ENABLED(CONFIG_DEBUG_TIMEKEEPING))
-   return timekeeping_debug_get_ns(tkr);
-
-   return __timekeeping_get_ns(tkr);
-}
-
 /**
  * update_fast_timekeeper - Update the fast and NMI safe monotonic timekeeper.
  * @tkr: Timekeeping readout base from which we take the update

Re: [PATCH] bug: Fix no-return-statement warning with !CONFIG_BUG

2024-04-11 Thread Arnd Bergmann

On Thu, Apr 11, 2024, at 09:16, Adrian Hunter wrote:
> On 11/04/24 10:04, Arnd Bergmann wrote:
>> On Wed, Apr 10, 2024, at 17:32, Adrian Hunter wrote:
>>> BUG() does not return, and arch implementations of BUG() use unreachable()
>>> or other non-returning code. However with !CONFIG_BUG, the default
>>> implementation is often used instead, and that does not do that. x86 always
>>> uses its own implementation, but powerpc with !CONFIG_BUG gives a build
>>> error:
>>>
>>>   kernel/time/timekeeping.c: In function ‘timekeeping_debug_get_ns’:
>>>   kernel/time/timekeeping.c:286:1: error: no return statement in function
>>>   returning non-void [-Werror=return-type]
>>>
>>> Add unreachable() to default !CONFIG_BUG BUG() implementation.
>> 
>> I'm a bit worried about this patch, since we have had problems
>> with unreachable() inside of BUG() in the past, and as far as I
>> can remember, the current version was the only one that
>> actually did the right thing on all compilers.
>> 
>> One problem with an unreachable() annotation here is that if
>> a compiler misanalyses the endless loop, it can decide to
>> throw out the entire code path leading up to it and just
>> run into undefined behavior instead of printing a BUG()
>> message.
>> 
>> Do you know which compiler version show the warning above?
>
> Original report has a list
>

It looks like it's all versions of gcc, though no versions
of clang show the warnings. I did a few more tests and could
not find any differences on actual code generation, but
I'd still feel more comfortable changing the caller than
the BUG() macro. It's trivial to add a 'return 0' there.
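
A sketch of what changing the caller could look like, assuming a BUG() fallback that is just an endless loop with no annotation telling the compiler it does not return. DEMO_BUG() and the demo_* functions are hypothetical stand-ins, not the timekeeping code.

```c
#include <assert.h>

/* Hypothetical stand-in for a BUG() fallback that is only an endless
 * loop and carries no "noreturn" information. */
#define DEMO_BUG() do {} while (1)

/* Stub for the configuration where the debug path is compiled out;
 * it must never actually be called. */
static inline unsigned long long demo_debug_get_ns(void)
{
	DEMO_BUG();
	return 0;	/* never reached, but gives the function a return statement */
}

static inline unsigned long long demo_get_ns(int debug_enabled,
					     unsigned long long cycles)
{
	if (debug_enabled)
		return demo_debug_get_ns();
	return cycles * 2;	/* arbitrary conversion for the demo */
}
```

The extra return statement costs nothing at run time and leaves the BUG() stand-in untouched, so no unreachable() annotation is needed in the macro itself.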

Another interesting observation is that clang-11 and earlier
versions end up skipping the endless loop, both with and
without the __builtin_unreachable, see
https://godbolt.org/z/aqa9zqz8x

clang-12 and above do work like gcc, so I guess that is good.

 Arnd

Re: [PATCH v4 13/15] drm/amd/display: Use ARCH_HAS_KERNEL_FPU_SUPPORT

2024-04-11 Thread Arnd Bergmann

On Thu, Apr 11, 2024, at 09:15, Ard Biesheuvel wrote:
> On Thu, 11 Apr 2024 at 03:11, Samuel Holland  
> wrote:
>> On 2024-04-10 8:02 PM, Thiago Jung Bauermann wrote:
>> > Samuel Holland  writes:
>>
>> >> The short-term fix would be to drop the `select 
>> >> ARCH_HAS_KERNEL_FPU_SUPPORT` for
>> >> 32-bit arm until we can provide these runtime library functions.
>> >
>> > Does this mean that patch 2 in this series:
>> >
>> > [PATCH v4 02/15] ARM: Implement ARCH_HAS_KERNEL_FPU_SUPPORT
>> >
>> > will be dropped?
>>
>> No, because later patches in the series (3, 6) depend on the definition of
>> CC_FLAGS_FPU from that patch. I will need to send a fixup patch unless I can
>> find a GPL-2 compatible implementation of the runtime library functions.
>>
>
> Is there really a point to doing that? Do 32-bit ARM systems even have
> enough address space to the map the BARs of the AMD GPUs that need
> this support?
>
> Given that this was not enabled before, I don't think the upshot of
> this series should be that we enable support for something on 32-bit
> ARM that may cause headaches down the road without any benefit.
>
> So I'd prefer a fixup patch that opts ARM out of this over adding
> support code for 64-bit conversions.

I have not found any dts file for a 32-bit platform with support
for a 64-bit prefetchable BAR, and there are very few that even
have a pcie slot (as opposed to on-board devices) you could
plug a card into.

That said, I also don't think we should encourage the use of
floating-point code in random device drivers. There is really
no excuse for the amdgpu driver to use floating point math
here, and we should get AMD to fix their driver instead.

 Arnd

Re: [PATCH] bug: Fix no-return-statement warning with !CONFIG_BUG

2024-04-11 Thread Arnd Bergmann

On Wed, Apr 10, 2024, at 17:32, Adrian Hunter wrote:
> BUG() does not return, and arch implementations of BUG() use unreachable()
> or other non-returning code. However with !CONFIG_BUG, the default
> implementation is often used instead, and that does not do that. x86 always
> uses its own implementation, but powerpc with !CONFIG_BUG gives a build
> error:
>
>   kernel/time/timekeeping.c: In function ‘timekeeping_debug_get_ns’:
>   kernel/time/timekeeping.c:286:1: error: no return statement in function
>   returning non-void [-Werror=return-type]
>
> Add unreachable() to default !CONFIG_BUG BUG() implementation.

I'm a bit worried about this patch, since we have had problems
with unreachable() inside of BUG() in the past, and as far as I
can remember, the current version was the only one that
actually did the right thing on all compilers.

One problem with an unreachable() annotation here is that if
a compiler misanalyses the endless loop, it can decide to
throw out the entire code path leading up to it and just
run into undefined behavior instead of printing a BUG()
message.

Do you know which compiler versions show the warning above?

 Arnd

[PATCH 00/34] address all -Wunused-const warnings

2024-04-03 Thread Arnd Bergmann

From: Arnd Bergmann 

Compilers traditionally warn for unused 'static' variables, but not
if they are constant. The reason here is a custom for C++ programmers
to define named constants as 'static const' variables in header files
instead of using macros or enums.

In W=1 builds, we get warnings only for static const variables in C
files, but not in headers, which is a good compromise, but this still
produces warning output in at least 30 files. These warnings are
almost all harmless, but also trivial to fix, and there is no
good reason to warn only about the non-const variables being unused.
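
As an illustration of the two kinds of fix used for these warnings, here is a userspace sketch. DEMO_HAS_PARAVIRT and the demo_* names are hypothetical, and __attribute__((unused)) stands in for the kernel's __maybe_unused.

```c
#include <assert.h>

struct demo_feature {
	int type;
};

/* DEMO_HAS_PARAVIRT is left undefined to model the configuration
 * in which the table below would otherwise be unused. */
#ifdef DEMO_HAS_PARAVIRT
/* Fix 1: hide the definition under the same #ifdef as its only user. */
static const struct demo_feature demo_vmpic_feature = { .type = 2 };
#endif

static const struct demo_feature demo_mpic_feature = { .type = 1 };

/* Fix 2: keep the definition visible and annotate it instead. */
static const int demo_ids[] __attribute__((unused)) = { 1, 2, 3 };

static const struct demo_feature *demo_pick_feature(int want_paravirt)
{
#ifdef DEMO_HAS_PARAVIRT
	if (want_paravirt)
		return &demo_vmpic_feature;
#else
	(void)want_paravirt;
#endif
	return &demo_mpic_feature;
}
```

Built with -Wunused-const-variable, neither table should trigger the warning in this configuration: the first is not compiled at all, and the second is explicitly marked as possibly unused.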

I've gone through all the files that I found using randconfig and
allmodconfig builds and created patches to avoid these warnings,
with the goal of retaining a clean build once the option is enabled
by default.

Unfortunately, there is one fairly large patch ("drivers: remove
incorrect of_match_ptr/ACPI_PTR annotations") that touches
34 individual drivers that all need the same one-line change.
If necessary, I can split it up by driver or by subsystem,
but at least for reviewing I would keep it as one piece for
the moment.

Please merge the individual patches through subsystem trees.
I expect that some of these will have to go through multiple
revisions before they are picked up, so anything that gets
applied early saves me from resending.

Arnd

Arnd Bergmann (31):
  powerpc/fsl-soc: hide unused const variable
  ubsan: fix unused variable warning in test module
  platform: goldfish: remove ACPI_PTR() annotations
  i2c: pxa: hide unused icr_bits[] variable
  3c515: remove unused 'mtu' variable
  tracing: hide unused ftrace_event_id_fops
  Input: synaptics: hide unused smbus_pnp_ids[] array
  power: rt9455: hide unused rt9455_boost_voltage_values
  efi: sysfb: don't build when EFI is disabled
  clk: ti: dpll: fix incorrect #ifdef checks
  apm-emulation: hide an unused variable
  sisfb: hide unused variables
  dma/congiguous: avoid warning about unused size_bytes
  leds: apu: remove duplicate DMI lookup data
  iio: ad5755: hook up of_device_id lookup to platform driver
  greybus: arche-ctrl: move device table to its right location
  lib: checksum: hide unused expected_csum_ipv6_magic[]
  sunrpc: suppress warnings for unused procfs functions
  comedi: ni_atmio: avoid warning for unused device_ids[] table
  iwlegacy: don't warn for unused variables with DEBUG_FS=n
  drm/komeda: don't warn for unused debugfs files
  firmware: qcom_scm: mark qcom_scm_qseecom_allowlist as __maybe_unused
  crypto: ccp - drop platform ifdef checks
  usb: gadget: omap_udc: remove unused variable
  isdn: kcapi: don't build unused procfs code
  cpufreq: intel_pstate: hide unused intel_pstate_cpu_oob_ids[]
  net: xgbe: remove extraneous #ifdef checks
  Input: imagis - remove incorrect ifdef checks
  sata: mv: drop unnecessary #ifdef checks
  ASoC: remove incorrect of_match_ptr/ACPI_PTR annotations
  spi: remove incorrect of_match_ptr annotations
  drivers: remove incorrect of_match_ptr/ACPI_PTR annotations
  kbuild: always enable -Wunused-const-variable

Krzysztof Kozlowski (1):
  Input: stmpe-ts - mark OF related data as maybe unused

 arch/powerpc/sysdev/fsl_msi.c |  2 +
 drivers/ata/sata_mv.c | 64 +--
 drivers/char/apm-emulation.c  |  5 +-
 drivers/char/ipmi/ipmb_dev_int.c  |  2 +-
 drivers/char/tpm/tpm_ftpm_tee.c   |  2 +-
 drivers/clk/ti/dpll.c | 10 ++-
 drivers/comedi/drivers/ni_atmio.c |  2 +-
 drivers/cpufreq/intel_pstate.c|  2 +
 drivers/crypto/ccp/sp-platform.c  | 14 +---
 drivers/dma/img-mdc-dma.c |  2 +-
 drivers/firmware/efi/Makefile |  3 +-
 drivers/firmware/efi/sysfb_efi.c  |  2 -
 drivers/firmware/qcom/qcom_scm.c  |  2 +-
 drivers/fpga/versal-fpga.c|  2 +-
 .../gpu/drm/arm/display/komeda/komeda_dev.c   |  8 ---
 drivers/hid/hid-google-hammer.c   |  6 +-
 drivers/i2c/busses/i2c-pxa.c  |  2 +-
 drivers/i2c/muxes/i2c-mux-ltc4306.c   |  2 +-
 drivers/i2c/muxes/i2c-mux-reg.c   |  2 +-
 drivers/iio/dac/ad5755.c  |  1 +
 drivers/input/mouse/synaptics.c   |  2 +
 drivers/input/touchscreen/imagis.c|  4 +-
 drivers/input/touchscreen/stmpe-ts.c  |  2 +-
 drivers/input/touchscreen/wdt87xx_i2c.c   |  2 +-
 drivers/isdn/capi/Makefile|  3 +-
 drivers/isdn/capi/kcapi.c |  7 +-
 drivers/leds/leds-apu.c   |  3 +-
 drivers/mux/adg792a.c |  2 +-
 drivers/net/ethernet/3com/3c515.c |  3 -
 drivers/net/ethernet/amd/xgbe/xgbe-platform.c |  8 ---
 drivers/net/ethernet/apm/xgene-v2/main.c  |  2 +-
 drivers/net/ethernet/hisilicon/hns_mdio.c |  2 +-

[PATCH 01/34] powerpc/fsl-soc: hide unused const variable

2024-04-03 Thread Arnd Bergmann

From: Arnd Bergmann 

vmpic_msi_feature is only used conditionally, which triggers a rare
-Werror=unused-const-variable= warning with gcc:

arch/powerpc/sysdev/fsl_msi.c:567:37: error: 'vmpic_msi_feature' defined but not used [-Werror=unused-const-variable=]
  567 | static const struct fsl_msi_feature vmpic_msi_feature =

Hide this one in the same #ifdef as the reference so we can turn on
the warning by default.

Fixes: 305bcf26128e ("powerpc/fsl-soc: use CONFIG_EPAPR_PARAVIRT for hcalls")
Signed-off-by: Arnd Bergmann 
---
 arch/powerpc/sysdev/fsl_msi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/sysdev/fsl_msi.c b/arch/powerpc/sysdev/fsl_msi.c
index 8e6c84df4ca1..e205135ae1fe 100644
--- a/arch/powerpc/sysdev/fsl_msi.c
+++ b/arch/powerpc/sysdev/fsl_msi.c
@@ -564,10 +564,12 @@ static const struct fsl_msi_feature ipic_msi_feature = {
.msiir_offset = 0x38,
 };
 
+#ifdef CONFIG_EPAPR_PARAVIRT
 static const struct fsl_msi_feature vmpic_msi_feature = {
.fsl_pic_ip = FSL_PIC_IP_VMPIC,
.msiir_offset = 0,
 };
+#endif
 
 static const struct of_device_id fsl_of_msi_ids[] = {
{
-- 
2.39.2
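
As an illustration of the pattern, here is a minimal userspace sketch
rather than the fsl_msi code itself. The names struct feature,
HAVE_PARAVIRT and lookup_feature() are made up for this example and
stand in for struct fsl_msi_feature, CONFIG_EPAPR_PARAVIRT and the
of_device_id match table; the field values are arbitrary. It shows a
const object whose only reference sits under an #ifdef, with the
definition placed under the same guard so that nothing is left unused
when the option is off.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct fsl_msi_feature. */
struct feature {
	unsigned int ip;
	unsigned int msiir_offset;
};

static const struct feature ipic_feature = {
	.ip = 1,
	.msiir_offset = 0x38,
};

/*
 * Only referenced under HAVE_PARAVIRT below, so the definition sits
 * under the same guard. Without the guard, gcc with
 * -Wunused-const-variable warns whenever HAVE_PARAVIRT is undefined.
 */
#ifdef HAVE_PARAVIRT
static const struct feature vmpic_feature = {
	.ip = 2,
	.msiir_offset = 0,
};
#endif

/* Return the feature matching a name, or NULL if there is none. */
static const struct feature *lookup_feature(const char *name)
{
	if (!strcmp(name, "ipic"))
		return &ipic_feature;
#ifdef HAVE_PARAVIRT
	if (!strcmp(name, "vmpic"))
		return &vmpic_feature;
#endif
	return NULL;
}
```

Built with HAVE_PARAVIRT undefined, the "vmpic" lookup simply finds
nothing, and removing the first #ifdef/#endif pair should bring the
unused-const-variable warning back.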

Re: [PATCH 2/9] dma: Convert from tasklet to BH workqueue

2024-03-28 Thread Arnd Bergmann

On Thu, Mar 28, 2024, at 20:39, Allen wrote:
>> >
>> > Since almost every driver associates the tasklet with the
>> > dma_chan, we could go one step further and add the
>> > work_queue structure directly into struct dma_chan,
>> > with the wrapper operating on the dma_chan rather than
>> > the work_queue.
>>
>> I think that is very great idea. having this wrapped in dma_chan would
>> be very good way as well
>>
>> Am not sure if Allen is up for it :-)
>
>  Thanks Arnd, I know we did speak about this at LPC. I did start
> working on using completion. I dropped it as I thought it would
> be easier to move to workqueues.

It's definitely easier to do the workqueue conversion as a first
step, and I agree adding support for the completion right away is
probably too much. Moving the work_struct into the dma_chan
is probably not too hard though, if you leave your current
approach for the cases where the tasklet is part of the
dma_dev rather than the dma_chan.

  Arnd

Re: [PATCH 2/9] dma: Convert from tasklet to BH workqueue

2024-03-28 Thread Arnd Bergmann

On Thu, Mar 28, 2024, at 06:55, Vinod Koul wrote:
> On 27-03-24, 16:03, Allen Pais wrote:
>> The only generic interface to execute asynchronously in the BH context is
>> tasklet; however, it's marked deprecated and has some design flaws. To
>> replace tasklets, BH workqueue support was recently added. A BH workqueue
>> behaves similarly to regular workqueues except that the queued work items
>> are executed in the BH context.
>
> Thanks for conversion, am happy with BH alternative as it helps in
> dmaengine where we need shortest possible time between tasklet and
> interrupt handling to maximize dma performance

I still feel that we want something different for dmaengine,
at least in the long run. As we have discussed in the past,
the tasklet context in these drivers is what the callbacks
from the dma client device are run in, and a lot of these probably
want something other than tasklet context, e.g. just call
complete() on a client-provided completion structure.

Instead of open-coding the use of the system_bh_wq in each
dmaengine driver, how about we start with a custom WQ_BH workqueue
specifically for the dmaengine subsystem and wrap it
inside of another interface?

Since almost every driver associates the tasklet with the
dma_chan, we could go one step further and add the
work_queue structure directly into struct dma_chan,
with the wrapper operating on the dma_chan rather than
the work_queue.

  Arnd
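
To make the shape of that suggestion concrete, below is a userspace
sketch under heavy assumptions. It is not dmaengine code: struct chan,
struct work_item, chan_init_work() and chan_schedule() are invented
names, and the "queue" simply runs the work immediately. The only
point is that when the work item is embedded in the channel, the
wrapper can take the channel and recover it from the work pointer with
container_of(), so drivers never touch the work item directly.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's work item type. */
struct work_item {
	void (*func)(struct work_item *work);
};

/* Same idea as the kernel macro, reduced to plain C. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical channel with the work item embedded in it. */
struct chan {
	int id;
	int completed;
	void (*callback)(struct chan *chan);
	struct work_item work;
};

/* Common handler: map the work item back to its channel. */
static void chan_work_fn(struct work_item *work)
{
	struct chan *chan = container_of(work, struct chan, work);

	chan->callback(chan);
}

/* Wrapper operating on the channel rather than on the work item. */
static void chan_init_work(struct chan *chan,
			   void (*callback)(struct chan *chan))
{
	chan->callback = callback;
	chan->work.func = chan_work_fn;
}

/* Stand-in for queueing: a real version would defer to BH context. */
static void chan_schedule(struct chan *chan)
{
	chan->work.func(&chan->work);
}

/* Example client callback that just counts completions. */
static void mark_completed(struct chan *chan)
{
	chan->completed++;
}
```

In a kernel version the stand-ins would presumably become a struct
work_struct set up with INIT_WORK() and queued with queue_work() on a
WQ_BH workqueue, but none of that is shown or tested here.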

Re: [PATCH v2 3/3] arch: Rename fbdev header and source files

2024-03-28 Thread Arnd Bergmann

On Thu, Mar 28, 2024, at 13:46, Helge Deller wrote:
> On 3/27/24 21:41, Thomas Zimmermann wrote:

>> +++ b/arch/arc/include/asm/video.h
>> @@ -0,0 +1,8 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +
>> +#ifndef _ASM_VIDEO_H_
>> +#define _ASM_VIDEO_H_
>> +
>> +#include 
>> +
>> +#endif /* _ASM_VIDEO_H_ */
>
> I wonder, since that file simply #includes the generic version,
> wasn't there a possibility that kbuild could symlink
> the generic version for us?
> Does it need to be mandatory in include/asm-generic/Kbuild ?
> Same applies to a few other files below.

It should be enough to just remove the files entirely,
as kbuild will generate the same wrappers for mandatory files.

 Arnd

Re: [PATCH RFC 0/3] mm/gup: consistently call it GUP-fast

2024-03-27 Thread Arnd Bergmann

On Thu, Mar 28, 2024, at 06:51, Vineet Gupta wrote:
> On 3/27/24 09:22, Arnd Bergmann wrote:
>> On Wed, Mar 27, 2024, at 16:39, David Hildenbrand wrote:
>>> On 27.03.24 16:21, Peter Xu wrote:
>>>> On Wed, Mar 27, 2024 at 02:05:35PM +0100, David Hildenbrand wrote:
>>>>
>>>> I'm not sure what config you tried there; as I am doing some build tests
>>>> recently, I found turning off CONFIG_SAMPLES + CONFIG_GCC_PLUGINS could
>>>> avoid a lot of issues, I think it's due to libc missing.  But maybe not the
>>>> case there.
>>> CCin Arnd; I use some of his compiler chains, others from Fedora directly.
>>> For example for alpha and arc, the Fedora gcc is "13.2.1".
>>> But there is other stuff like (arc):
>>>
>>> ./arch/arc/include/asm/mmu-arcv2.h: In function 'mmu_setup_asid':
>>> ./arch/arc/include/asm/mmu-arcv2.h:82:9: error: implicit declaration of function 'write_aux_reg' [-Werror=implicit-function-declaration]
>>> 82 | write_aux_reg(ARC_REG_PID, asid | MMU_ENABLE);
>>>| ^
>> Seems to be missing an #include of soc/arc/aux.h, but I can't
>> tell when this first broke without bisecting.
>
> Weird I don't see this one but I only have gcc 12 handy ATM.
>
>     gcc version 12.2.1 20230306 (ARC HS GNU/Linux glibc toolchain -
> build 1360)
>
> I even tried W=1 (which according to scripts/Makefile.extrawarn) should
> include -Werror=implicit-function-declaration but don't see this still.
>
> Tomorrow I'll try building a gcc 13.2.1 for ARC.

David reported them with the toolchains I built at
https://mirrors.edge.kernel.org/pub/tools/crosstool/
I'm fairly sure the problem is specific to the .config
and tree, not the toolchain though.

>>> or (alpha)
>>>
>>> WARNING: modpost: "saved_config" [vmlinux] is COMMON symbol
>>> ERROR: modpost: "memcpy" [fs/reiserfs/reiserfs.ko] undefined!
>>> ERROR: modpost: "memcpy" [fs/nfs/nfs.ko] undefined!
>>> ERROR: modpost: "memcpy" [fs/nfs/nfsv3.ko] undefined!
>>> ERROR: modpost: "memcpy" [fs/nfsd/nfsd.ko] undefined!
>>> ERROR: modpost: "memcpy" [fs/lockd/lockd.ko] undefined!
>>> ERROR: modpost: "memcpy" [crypto/crypto.ko] undefined!
>>> ERROR: modpost: "memcpy" [crypto/crypto_algapi.ko] undefined!
>>> ERROR: modpost: "memcpy" [crypto/aead.ko] undefined!
>>> ERROR: modpost: "memcpy" [crypto/crypto_skcipher.ko] undefined!
>>> ERROR: modpost: "memcpy" [crypto/seqiv.ko] undefined!
>
> Are these from ARC build or otherwise ?

This was arch/alpha.

  Arnd

Re: [PATCH RFC 0/3] mm/gup: consistently call it GUP-fast

2024-03-27 Thread Arnd Bergmann

On Wed, Mar 27, 2024, at 16:39, David Hildenbrand wrote:
> On 27.03.24 16:21, Peter Xu wrote:
>> On Wed, Mar 27, 2024 at 02:05:35PM +0100, David Hildenbrand wrote:
>> 
>> I'm not sure what config you tried there; as I am doing some build tests
>> recently, I found turning off CONFIG_SAMPLES + CONFIG_GCC_PLUGINS could
>> avoid a lot of issues, I think it's due to libc missing.  But maybe not the
>> case there.
>
> CCin Arnd; I use some of his compiler chains, others from Fedora directly. For
> example for alpha and arc, the Fedora gcc is "13.2.1".

>
> But there is other stuff like (arc):
>
> ./arch/arc/include/asm/mmu-arcv2.h: In function 'mmu_setup_asid':
> ./arch/arc/include/asm/mmu-arcv2.h:82:9: error: implicit declaration of function 'write_aux_reg' [-Werror=implicit-function-declaration]
> 82 | write_aux_reg(ARC_REG_PID, asid | MMU_ENABLE);
>| ^

Seems to be missing an #include of soc/arc/aux.h, but I can't
tell when this first broke without bisecting.

> or (alpha)
>
> WARNING: modpost: "saved_config" [vmlinux] is COMMON symbol
> ERROR: modpost: "memcpy" [fs/reiserfs/reiserfs.ko] undefined!
> ERROR: modpost: "memcpy" [fs/nfs/nfs.ko] undefined!
> ERROR: modpost: "memcpy" [fs/nfs/nfsv3.ko] undefined!
> ERROR: modpost: "memcpy" [fs/nfsd/nfsd.ko] undefined!
> ERROR: modpost: "memcpy" [fs/lockd/lockd.ko] undefined!
> ERROR: modpost: "memcpy" [crypto/crypto.ko] undefined!
> ERROR: modpost: "memcpy" [crypto/crypto_algapi.ko] undefined!
> ERROR: modpost: "memcpy" [crypto/aead.ko] undefined!
> ERROR: modpost: "memcpy" [crypto/crypto_skcipher.ko] undefined!
> ERROR: modpost: "memcpy" [crypto/seqiv.ko] undefined!

Al did a series to fix various build problems on alpha, see
https://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git/log/?h=work.alpha
Not sure if he still has to send them to Matt, or if Matt
just needs to apply them.

I also have some alpha patches that I should send upstream.

 Arnd

[PATCH 8/9] ALSA: aoa: avoid false-positive format truncation warning

2024-03-26 Thread Arnd Bergmann

From: Arnd Bergmann 

clang warns about what it interprets as a truncated snprintf:

sound/aoa/soundbus/i2sbus/core.c:171:6: error: 'snprintf' will always be truncated; specified size is 6, but format string expands to at least 7 [-Werror,-Wformat-truncation-non-kprintf]

The actual problem here is that it does not understand the special
%pOFn format string and assumes that it is a pointer followed by
the string "OFn", which would indeed not fit.

Slightly increasing the size of the buffer to its natural alignment
avoids the warning, as it is now long enough for the correct and
the incorrect interpretations.

Fixes: b917d58dcfaa ("ALSA: aoa: Convert to using %pOFn instead of device_node.name")
Signed-off-by: Arnd Bergmann 
---
 sound/aoa/soundbus/i2sbus/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/aoa/soundbus/i2sbus/core.c b/sound/aoa/soundbus/i2sbus/core.c
index b8ff5cccd0c8..5431d2c49421 100644
--- a/sound/aoa/soundbus/i2sbus/core.c
+++ b/sound/aoa/soundbus/i2sbus/core.c
@@ -158,7 +158,7 @@ static int i2sbus_add_dev(struct macio_dev *macio,
struct device_node *child, *sound = NULL;
struct resource *r;
int i, layout = 0, rlen, ok = force;
-   char node_name[6];
+   char node_name[8];
static const char *rnames[] = { "i2sbus: %pOFn (control)",
"i2sbus: %pOFn (tx)",
"i2sbus: %pOFn (rx)" };
-- 
2.39.2
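
The size arithmetic behind the warning can be tried in userspace. The
sketch below is only an analogy: %pOFn is a kernel-only format, so it
uses "%dOFn" with the value 100 to get a string of six characters,
which needs seven bytes including the terminating NUL. The helper name
format_truncated() is made up for this example. snprintf() returns the
length the complete output would have had, so a return value greater
than or equal to the buffer size means the output was cut short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format "<value>OFn" into buf and report whether it was truncated.
 * A return value >= size from snprintf() means the full string,
 * plus its NUL terminator, did not fit.
 */
static bool format_truncated(char *buf, size_t size, int value)
{
	int len = snprintf(buf, size, "%dOFn", value);

	return len < 0 || (size_t)len >= size;
}
```

With a six-byte buffer the string "100OFn" is cut to "100OF", while
an eight-byte buffer holds it completely, which mirrors the change
from node_name[6] to node_name[8] in the patch.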

[PATCH 0/9] enabled -Wformat-truncation for clang

2024-03-26 Thread Arnd Bergmann

From: Arnd Bergmann 

With randconfig build testing, I found only eight files that produce
warnings with clang when -Wformat-truncation is enabled. This means
we can just turn it on by default rather than only enabling it for
"make W=1".

Unfortunately, gcc produces a lot more warnings when the option
is enabled, so it's not yet possible to turn it on for both
compilers.

I hope that the patches can get picked up by platform maintainers
directly, so the final patch can go in later on.

 Arnd

Arnd Bergmann (9):
  fbdev: shmobile: fix snprintf truncation
  enetc: avoid truncating error message
  qed: avoid truncating work queue length
  mlx5: avoid truncating error message
  surface3_power: avoid format string truncation warning
  Input: IMS: fix printf string overflow
  scsi: mylex: fix sysfs buffer lengths
  ALSA: aoa: avoid false-positive format truncation warning
  kbuild: enable -Wformat-truncation on clang

 drivers/input/misc/ims-pcu.c  |  4 ++--
 drivers/net/ethernet/freescale/enetc/enetc.c  |  2 +-
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  |  2 +-
 drivers/net/ethernet/qlogic/qed/qed_main.c|  9 ---
 drivers/platform/surface/surface3_power.c |  2 +-
 drivers/scsi/myrb.c   | 20 
 drivers/scsi/myrs.c   | 24 +--
 drivers/video/fbdev/sh_mobile_lcdcfb.c|  2 +-
 scripts/Makefile.extrawarn|  2 ++
 sound/aoa/soundbus/i2sbus/core.c  |  2 +-
 10 files changed, 35 insertions(+), 34 deletions(-)

-- 
2.39.2

Cc: Dmitry Torokhov 
Cc: Claudiu Manoil 
Cc: Vladimir Oltean 
Cc: Jakub Kicinski 
Cc: Saeed Mahameed 
Cc: Leon Romanovsky 
Cc: Ariel Elior 
Cc: Manish Chopra 
Cc: Hans de Goede 
Cc: "Ilpo Järvinen" 
Cc: Maximilian Luz 
Cc: Hannes Reinecke 
Cc: "Martin K. Petersen" 
Cc: Helge Deller 
Cc: Masahiro Yamada 
Cc: Nathan Chancellor 
Cc: Nicolas Schier 
Cc: Johannes Berg 
Cc: Jaroslav Kysela 
Cc: Takashi Iwai 
Cc: Nick Desaulniers 
Cc: Bill Wendling 
Cc: Justin Stitt 
Cc: linux-in...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: net...@vger.kernel.org
Cc: linux-r...@vger.kernel.org
Cc: platform-driver-...@vger.kernel.org
Cc: linux-s...@vger.kernel.org
Cc: linux-fb...@vger.kernel.org
Cc: dri-de...@lists.freedesktop.org
Cc: linux-kbu...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: alsa-de...@alsa-project.org
Cc: linux-so...@vger.kernel.org
Cc: l...@lists.linux.dev

Re: [PATCH V2 00/19] timekeeping: Handle potential multiplication overflow

2024-03-25 Thread Arnd Bergmann

On Mon, Mar 25, 2024, at 07:40, Adrian Hunter wrote:
>
> Extend the facility also to VDSO, dependent on new config option
> GENERIC_VDSO_OVERFLOW_PROTECT which is selected by x86 only, so other
> architectures are not affected. The result is a calculation that has
> similar performance as before. Most machines showed performance benefit,
> except Skylake-based hardware such as Intel Kaby Lake which was seen <1%
> worse.

I've read through the series, and this pretty much all makes sense,
nice work!

There are a few patches that just rearrange the local variable
declarations to save a few lines, and I don't see those as an
improvement, but they also don't hurt aside from distracting
slightly from the real changes.

 Arnd

Re: [PATCH] powerpc: ps3: mark ps3_notification_device static for stack usage

2024-03-22 Thread Arnd Bergmann

On Fri, Mar 22, 2024, at 09:34, Geoff Levand wrote:
> On 3/21/24 17:32, Geert Uytterhoeven wrote:
>>> --- a/arch/powerpc/platforms/ps3/device-init.c
>>> +++ b/arch/powerpc/platforms/ps3/device-init.c
>>> @@ -770,7 +770,7 @@ static struct task_struct *probe_task;
>>>
>>>  static int ps3_probe_thread(void *data)
>>>  {
>>> -   struct ps3_notification_device dev;
>>> +   static struct ps3_notification_device dev;
>>> int res;
>>> unsigned int irq;
>>> u64 lpar;
>> 
>> Making it static increases kernel size for everyone.  So I'd rather
>> allocate it dynamically. The thread already allocates a buffer, which
>> can be replaced at no cost by allocating a structure containing both
>> the ps3_notification_device and the buffer.

I didn't think it mattered much, given that you would rarely
have a kernel with PS3 support along with other platforms.

I suppose it does increase the size for a PS3-only kernel
as well, while your version makes it smaller.

> Here's what I came up with.  It builds for me without warnings.
> I haven't tested it yet.  A review would be appreciated.

It seems a little complicated but looks all correct to
me and reduces both stack and .data size, so

Acked-by: Arnd Bergmann
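
For reference, the approach Geert describes (one allocation holding
both the device and the buffer) could look roughly like the userspace
sketch below. This is not Geoff's patch, which is not quoted here: the
struct layouts, the 512-byte buffer size and the names probe_local,
probe_local_alloc() and probe_local_free() are all assumptions made
for the example, and calloc()/free() stand in for what would be
kzalloc()/kfree() in the kernel.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the large struct ps3_notification_device. */
struct notification_device {
	uint64_t tag;
	char state[2000];
};

/* Device and buffer bundled so a single allocation covers both. */
struct probe_local {
	struct notification_device dev;
	unsigned char buf[512];	/* assumed size, for illustration only */
};

/* Allocate the zeroed bundle on the heap instead of the stack. */
static struct probe_local *probe_local_alloc(void)
{
	return calloc(1, sizeof(struct probe_local));
}

/* Release the bundle; the device and the buffer go away together. */
static void probe_local_free(struct probe_local *local)
{
	free(local);
}
```

Neither the stack frame nor .data grows with this layout, since the
large object only exists while the thread holds the allocation.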

[PATCH] powerpc: ps3: mark ps3_notification_device static for stack usage

2024-03-20 Thread Arnd Bergmann

From: Arnd Bergmann 

The device is way too large to be on the stack, causing a warning for
an allmodconfig build with clang:

arch/powerpc/platforms/ps3/device-init.c:771:12: error: stack frame size (2064) exceeds limit (2048) in 'ps3_probe_thread' [-Werror,-Wframe-larger-than]
  771 | static int ps3_probe_thread(void *data)

There is only a single thread that ever accesses this, so it's fine to
make it static, which avoids the warning.

Fixes: b4cb2941f855 ("[POWERPC] PS3: Use the HVs storage device notification mechanism properly")
Signed-off-by: Arnd Bergmann 
---
 arch/powerpc/platforms/ps3/device-init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/ps3/device-init.c b/arch/powerpc/platforms/ps3/device-init.c
index 878bc160246e..ce99f06698a9 100644
--- a/arch/powerpc/platforms/ps3/device-init.c
+++ b/arch/powerpc/platforms/ps3/device-init.c
@@ -770,7 +770,7 @@ static struct task_struct *probe_task;
 
 static int ps3_probe_thread(void *data)
 {
-   struct ps3_notification_device dev;
+   static struct ps3_notification_device dev;
int res;
unsigned int irq;
u64 lpar;
-- 
2.39.2

[PATCH] vdso: use CONFIG_PAGE_SHIFT in vdso/datapage.h

2024-03-20 Thread Arnd Bergmann

From: Arnd Bergmann 

Both the vdso rework and the CONFIG_PAGE_SHIFT changes were merged during
the v6.9 merge window, so it is now possible to use CONFIG_PAGE_SHIFT
instead of including asm/page.h in the vdso.

This avoids the workaround for arm64 and addresses a build warning
for powerpc64:

In file included from :4:
In file included from /home/arnd/arm-soc/arm-soc/lib/vdso/gettimeofday.c:5:
In file included from ../include/vdso/datapage.h:25:
arch/powerpc/include/asm/page.h:230:9: error: result of comparison of constant 13835058055282163712 with expression of type 'unsigned long' is always true [-Werror,-Wtautological-constant-out-of-range-compare]
  230 | return __pa(kaddr) >> PAGE_SHIFT;
  |^~~
arch/powerpc/include/asm/page.h:217:37: note: expanded from macro '__pa'
  217 | VIRTUAL_WARN_ON((unsigned long)(x) < PAGE_OFFSET);  
\
  | ~~~^~
arch/powerpc/include/asm/page.h:202:73: note: expanded from macro 'VIRTUAL_WARN_ON'
  202 | #define VIRTUAL_WARN_ON(x)  WARN_ON(IS_ENABLED(CONFIG_DEBUG_VIRTUAL) && (x))
  | ~^~~
arch/powerpc/include/asm/bug.h:88:25: note: expanded from macro 'WARN_ON'
   88 | int __ret_warn_on = !!(x);  \
  |^

Cc: Michael Ellerman 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Andy Lutomirski 
Cc: Thomas Gleixner 
Cc: Vincenzo Frascino 
Cc: Anna-Maria Behnsen 
See-also: 8b3843ae3634 ("vdso/datapage: Quick fix - use asm/page-def.h for ARM64")
Signed-off-by: Arnd Bergmann 
---
 arch/powerpc/include/asm/vdso/gettimeofday.h | 3 +--
 include/vdso/datapage.h  | 8 +---
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/vdso/gettimeofday.h b/arch/powerpc/include/asm/vdso/gettimeofday.h
index f0a4cf01e85c..78302f6c2580 100644
--- a/arch/powerpc/include/asm/vdso/gettimeofday.h
+++ b/arch/powerpc/include/asm/vdso/gettimeofday.h
@@ -4,7 +4,6 @@
 
 #ifndef __ASSEMBLY__
 
-#include 
 #include 
 #include 
 #include 
@@ -95,7 +94,7 @@ const struct vdso_data *__arch_get_vdso_data(void);
 static __always_inline
 const struct vdso_data *__arch_get_timens_vdso_data(const struct vdso_data *vd)
 {
-   return (void *)vd + PAGE_SIZE;
+   return (void *)vd + (1U << CONFIG_PAGE_SHIFT);
 }
 #endif
 
diff --git a/include/vdso/datapage.h b/include/vdso/datapage.h
index 5d5c0b8efff2..c71ddb6d4691 100644
--- a/include/vdso/datapage.h
+++ b/include/vdso/datapage.h
@@ -19,12 +19,6 @@
 #include 
 #include 
 
-#ifdef CONFIG_ARM64
-#include 
-#else
-#include 
-#endif
-
 #ifdef CONFIG_ARCH_HAS_VDSO_DATA
 #include 
 #else
@@ -132,7 +126,7 @@ extern struct vdso_data _timens_data[CS_BASES] 
__attribute__((visibility("hidden
  */
 union vdso_data_store {
struct vdso_datadata[CS_BASES];
-   u8  page[PAGE_SIZE];
+   u8  page[1U << CONFIG_PAGE_SHIFT];
 };
 
 /*
-- 
2.39.2
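
The effect of the two replacements can be checked with a small
userspace sketch. Everything in it is an assumption made for
illustration: CONFIG_PAGE_SHIFT is hardcoded to 12 (a kernel build
gets it from Kconfig), CS_BASES is set to 2, struct vdso_data_stub is
a placeholder for the real struct vdso_data, and data_store and
timens_data() are invented names mirroring union vdso_data_store and
__arch_get_timens_vdso_data().

```c
#include <stdint.h>

/* Assumption: 4KiB pages; a kernel build takes this from Kconfig. */
#ifndef CONFIG_PAGE_SHIFT
#define CONFIG_PAGE_SHIFT 12
#endif

/* Assumed value for the sketch. */
#define CS_BASES 2

/* Placeholder for struct vdso_data; only its size matters here. */
struct vdso_data_stub {
	uint64_t words[32];
};

/* The page[] member pads the union out to exactly one page. */
union data_store {
	struct vdso_data_stub data[CS_BASES];
	uint8_t page[1U << CONFIG_PAGE_SHIFT];
};

/* The time namespace data is expected one page after the vdso data. */
static const void *timens_data(const void *vd)
{
	return (const char *)vd + (1U << CONFIG_PAGE_SHIFT);
}
```

Because the page size is derived from a plain integer constant, no
architecture header such as asm/page.h is needed to size the union or
to compute the offset.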

[PATCH v2 3/3] arch: define CONFIG_PAGE_SIZE_*KB on all architectures

2024-03-06 Thread Arnd Bergmann

From: Arnd Bergmann 

Most architectures only support a single hardcoded page size. In order
to ensure that each one of these sets the corresponding Kconfig symbols,
change over the PAGE_SHIFT definition to the common one and allow
only the hardware page size to be selected.

Acked-by: Guo Ren 
Acked-by: Heiko Carstens 
Acked-by: Stafford Horne 
Acked-by: Johannes Berg 
Signed-off-by: Arnd Bergmann 
---
No changes from v1

 arch/alpha/Kconfig | 1 +
 arch/alpha/include/asm/page.h  | 2 +-
 arch/arm/Kconfig   | 1 +
 arch/arm/include/asm/page.h| 2 +-
 arch/csky/Kconfig  | 1 +
 arch/csky/include/asm/page.h   | 2 +-
 arch/m68k/Kconfig  | 3 +++
 arch/m68k/Kconfig.cpu  | 2 ++
 arch/m68k/include/asm/page.h   | 6 +-
 arch/microblaze/Kconfig| 1 +
 arch/microblaze/include/asm/page.h | 2 +-
 arch/nios2/Kconfig | 1 +
 arch/nios2/include/asm/page.h  | 2 +-
 arch/openrisc/Kconfig  | 1 +
 arch/openrisc/include/asm/page.h   | 2 +-
 arch/riscv/Kconfig | 1 +
 arch/riscv/include/asm/page.h  | 2 +-
 arch/s390/Kconfig  | 1 +
 arch/s390/include/asm/page.h   | 2 +-
 arch/sparc/Kconfig | 2 ++
 arch/sparc/include/asm/page_32.h   | 2 +-
 arch/sparc/include/asm/page_64.h   | 3 +--
 arch/um/Kconfig| 1 +
 arch/um/include/asm/page.h | 2 +-
 arch/x86/Kconfig   | 1 +
 arch/x86/include/asm/page_types.h  | 2 +-
 arch/xtensa/Kconfig| 1 +
 arch/xtensa/include/asm/page.h | 2 +-
 28 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig
index d6968d090d49..4f490250d323 100644
--- a/arch/alpha/Kconfig
+++ b/arch/alpha/Kconfig
@@ -14,6 +14,7 @@ config ALPHA
select PCI_DOMAINS if PCI
select PCI_SYSCALL if PCI
select HAVE_ASM_MODVERSIONS
+   select HAVE_PAGE_SIZE_8KB
select HAVE_PCSPKR_PLATFORM
select HAVE_PERF_EVENTS
select NEED_DMA_MAP_STATE
diff --git a/arch/alpha/include/asm/page.h b/arch/alpha/include/asm/page.h
index 4db1ebc0ed99..70419e6be1a3 100644
--- a/arch/alpha/include/asm/page.h
+++ b/arch/alpha/include/asm/page.h
@@ -6,7 +6,7 @@
 #include 
 
 /* PAGE_SHIFT determines the page size */
-#define PAGE_SHIFT 13
+#define PAGE_SHIFT CONFIG_PAGE_SHIFT
 #define PAGE_SIZE  (_AC(1,UL) << PAGE_SHIFT)
 #define PAGE_MASK  (~(PAGE_SIZE-1))
 
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 0af6709570d1..9d52ba3a8ad1 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -116,6 +116,7 @@ config ARM
select HAVE_MOD_ARCH_SPECIFIC
select HAVE_NMI
select HAVE_OPTPROBES if !THUMB2_KERNEL
+   select HAVE_PAGE_SIZE_4KB
select HAVE_PCI if MMU
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index 119aa85d1feb..62af9f7f9e96 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -8,7 +8,7 @@
 #define _ASMARM_PAGE_H
 
 /* PAGE_SHIFT determines the page size */
-#define PAGE_SHIFT 12
+#define PAGE_SHIFT CONFIG_PAGE_SHIFT
 #define PAGE_SIZE  (_AC(1,UL) << PAGE_SHIFT)
 #define PAGE_MASK  (~((1 << PAGE_SHIFT) - 1))
 
diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
index cf2a6fd7dff8..9c2723ab1c94 100644
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -89,6 +89,7 @@ config CSKY
select HAVE_KPROBES if !CPU_CK610
select HAVE_KPROBES_ON_FTRACE if !CPU_CK610
select HAVE_KRETPROBES if !CPU_CK610
+   select HAVE_PAGE_SIZE_4KB
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
diff --git a/arch/csky/include/asm/page.h b/arch/csky/include/asm/page.h
index 866855e1ab43..0ca6c408c07f 100644
--- a/arch/csky/include/asm/page.h
+++ b/arch/csky/include/asm/page.h
@@ -10,7 +10,7 @@
 /*
  * PAGE_SHIFT determines the page size: 4KB
  */
-#define PAGE_SHIFT 12
+#define PAGE_SHIFT CONFIG_PAGE_SHIFT
 #define PAGE_SIZE  (_AC(1, UL) << PAGE_SHIFT)
 #define PAGE_MASK  (~(PAGE_SIZE - 1))
 #define THREAD_SIZE(PAGE_SIZE * 2)
diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 4b3e93cac723..7b709453d5e7 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -84,12 +84,15 @@ config MMU
 
 config MMU_MOTOROLA
bool
+   select HAVE_PAGE_SIZE_4KB
 
 config MMU_COLDFIRE
+   select HAVE_PAGE_SIZE_8KB
bool
 
 config MMU_SUN3
bool
+   select HAVE_PAGE_SIZE_8KB
depends on MMU && !MMU_MOTOROLA && !MMU_COLDFIRE
 
 config ARCH_SUPPORTS_KEXEC
diff --git a/arch/m68k/Kconfig.cpu b/arch/m68k/Kconfig.cpu
index 9dcf245c9cbf..c777a129768a 100644
--- a/arch/m68k/Kconfig.cpu
+++ b/arch/m68k/Kconfig.cpu
@@ -30,6 +30,7 @@ config COLDFIRE
se

[PATCH v2 2/3] arch: simplify architecture specific page size configuration

2024-03-06 Thread Arnd Bergmann

From: Arnd Bergmann 

arc, arm64, parisc and powerpc all have their own Kconfig symbols
in place of the common CONFIG_PAGE_SIZE_4KB symbols. Change these
so the common symbols are the ones that are actually used, while
leaving the architecture specific ones as the user visible
place for configuring it, to avoid breaking user configs.

Reviewed-by: Christophe Leroy  (powerpc32)
Acked-by: Catalin Marinas 
Acked-by: Helge Deller  # parisc
Signed-off-by: Arnd Bergmann 
---
No changes from v1

 arch/arc/Kconfig  |  3 +++
 arch/arc/include/uapi/asm/page.h  |  6 ++
 arch/arm64/Kconfig| 29 +
 arch/arm64/include/asm/page-def.h |  2 +-
 arch/parisc/Kconfig   |  3 +++
 arch/parisc/include/asm/page.h| 10 +-
 arch/powerpc/Kconfig  | 31 ++-
 arch/powerpc/include/asm/page.h   |  2 +-
 scripts/gdb/linux/constants.py.in |  2 +-
 scripts/gdb/linux/mm.py   |  2 +-
 10 files changed, 32 insertions(+), 58 deletions(-)

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index 1b0483c51cc1..4092bec198be 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -284,14 +284,17 @@ choice
 
 config ARC_PAGE_SIZE_8K
bool "8KB"
+   select HAVE_PAGE_SIZE_8KB
help
  Choose between 8k vs 16k
 
 config ARC_PAGE_SIZE_16K
+   select HAVE_PAGE_SIZE_16KB
bool "16KB"
 
 config ARC_PAGE_SIZE_4K
bool "4KB"
+   select HAVE_PAGE_SIZE_4KB
depends on ARC_MMU_V3 || ARC_MMU_V4
 
 endchoice
diff --git a/arch/arc/include/uapi/asm/page.h b/arch/arc/include/uapi/asm/page.h
index 2a4ad619abfb..7fd9e741b527 100644
--- a/arch/arc/include/uapi/asm/page.h
+++ b/arch/arc/include/uapi/asm/page.h
@@ -13,10 +13,8 @@
 #include 
 
 /* PAGE_SHIFT determines the page size */
-#if defined(CONFIG_ARC_PAGE_SIZE_16K)
-#define PAGE_SHIFT 14
-#elif defined(CONFIG_ARC_PAGE_SIZE_4K)
-#define PAGE_SHIFT 12
+#ifdef __KERNEL__
+#define PAGE_SHIFT CONFIG_PAGE_SHIFT
 #else
 /*
  * Default 8k
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index aa7c1d435139..29290b8cb36d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -277,27 +277,21 @@ config 64BIT
 config MMU
def_bool y
 
-config ARM64_PAGE_SHIFT
-   int
-   default 16 if ARM64_64K_PAGES
-   default 14 if ARM64_16K_PAGES
-   default 12
-
 config ARM64_CONT_PTE_SHIFT
int
-   default 5 if ARM64_64K_PAGES
-   default 7 if ARM64_16K_PAGES
+   default 5 if PAGE_SIZE_64KB
+   default 7 if PAGE_SIZE_16KB
default 4
 
 config ARM64_CONT_PMD_SHIFT
int
-   default 5 if ARM64_64K_PAGES
-   default 5 if ARM64_16K_PAGES
+   default 5 if PAGE_SIZE_64KB
+   default 5 if PAGE_SIZE_16KB
default 4
 
 config ARCH_MMAP_RND_BITS_MIN
-   default 14 if ARM64_64K_PAGES
-   default 16 if ARM64_16K_PAGES
+   default 14 if PAGE_SIZE_64KB
+   default 16 if PAGE_SIZE_16KB
default 18
 
 # max bits determined by the following formula:
@@ -1259,11 +1253,13 @@ choice
 
 config ARM64_4K_PAGES
bool "4KB"
+   select HAVE_PAGE_SIZE_4KB
help
  This feature enables 4KB pages support.
 
 config ARM64_16K_PAGES
bool "16KB"
+   select HAVE_PAGE_SIZE_16KB
help
  The system will use 16KB pages support. AArch32 emulation
  requires applications compiled with 16K (or a multiple of 16K)
@@ -1271,6 +1267,7 @@ config ARM64_16K_PAGES
 
 config ARM64_64K_PAGES
bool "64KB"
+   select HAVE_PAGE_SIZE_64KB
help
  This feature enables 64KB pages support (4KB by default)
  allowing only two levels of page tables and faster TLB
@@ -1291,19 +1288,19 @@ choice
 
 config ARM64_VA_BITS_36
bool "36-bit" if EXPERT
-   depends on ARM64_16K_PAGES
+   depends on PAGE_SIZE_16KB
 
 config ARM64_VA_BITS_39
bool "39-bit"
-   depends on ARM64_4K_PAGES
+   depends on PAGE_SIZE_4KB
 
 config ARM64_VA_BITS_42
bool "42-bit"
-   depends on ARM64_64K_PAGES
+   depends on PAGE_SIZE_64KB
 
 config ARM64_VA_BITS_47
bool "47-bit"
-   depends on ARM64_16K_PAGES
+   depends on PAGE_SIZE_16KB
 
 config ARM64_VA_BITS_48
bool "48-bit"
diff --git a/arch/arm64/include/asm/page-def.h b/arch/arm64/include/asm/page-def.h
index 2403f7b4cdbf..792e9fe881dc 100644
--- a/arch/arm64/include/asm/page-def.h
+++ b/arch/arm64/include/asm/page-def.h
@@ -11,7 +11,7 @@
 #include 
 
 /* PAGE_SHIFT determines the page size */
-#define PAGE_SHIFT CONFIG_ARM64_PAGE_SHIFT
+#define PAGE_SHIFT CONFIG_PAGE_SHIFT
 #define PAGE_SIZE  (_AC(1, UL) << PAGE_SHIFT)
 #define PAGE_MASK  (~(PAGE_SIZE-1))
 
diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index 5c845e8d59d9..b180e684fa0d

[PATCH v2 1/3] arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions

2024-03-06 Thread Arnd Bergmann

From: Arnd Bergmann 

These four architectures define the same Kconfig symbols for configuring
the page size. Move the logic into a common place where it can be shared
with all other architectures.

Signed-off-by: Arnd Bergmann 
---
Changes from v1:
 - improve Kconfig help texts
 - fix Hexagon Kconfig

 arch/Kconfig  | 92 ++-
 arch/hexagon/Kconfig  | 24 ++--
 arch/hexagon/include/asm/page.h   |  6 +-
 arch/loongarch/Kconfig| 21 ++-
 arch/loongarch/include/asm/page.h | 10 +---
 arch/mips/Kconfig | 58 ++-
 arch/mips/include/asm/page.h  | 16 +-
 arch/sh/include/asm/page.h| 13 +
 arch/sh/mm/Kconfig| 42 --
 9 files changed, 121 insertions(+), 161 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index a5af0edd3eb8..c63034e092d0 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1078,17 +1078,105 @@ config HAVE_ARCH_COMPAT_MMAP_BASES
  and vice-versa 32-bit applications to call 64-bit mmap().
  Required for applications doing different bitness syscalls.
 
+config HAVE_PAGE_SIZE_4KB
+   bool
+
+config HAVE_PAGE_SIZE_8KB
+   bool
+
+config HAVE_PAGE_SIZE_16KB
+   bool
+
+config HAVE_PAGE_SIZE_32KB
+   bool
+
+config HAVE_PAGE_SIZE_64KB
+   bool
+
+config HAVE_PAGE_SIZE_256KB
+   bool
+
+choice
+   prompt "MMU page size"
+
+config PAGE_SIZE_4KB
+   bool "4KiB pages"
+   depends on HAVE_PAGE_SIZE_4KB
+   help
+ This option select the standard 4KiB Linux page size and the only
+ available option on many architectures. Using 4KiB page size will
+ minimize memory consumption and is therefore recommended for low
+ memory systems.
+ Some software that is written for x86 systems makes incorrect
+ assumptions about the page size and only runs on 4KiB pages.
+
+config PAGE_SIZE_8KB
+   bool "8KiB pages"
+   depends on HAVE_PAGE_SIZE_8KB
+   help
+ This option is the only supported page size on a few older
+ processors, and can be slightly faster than 4KiB pages.
+
+config PAGE_SIZE_16KB
+   bool "16KiB pages"
+   depends on HAVE_PAGE_SIZE_16KB
+   help
+ This option is usually a good compromise between memory
+ consumption and performance for typical desktop and server
+ workloads, often saving a level of page table lookups compared
+ to 4KB pages as well as reducing TLB pressure and overhead of
+ per-page operations in the kernel at the expense of a larger
+ page cache.
+
+config PAGE_SIZE_32KB
+   bool "32KiB pages"
+   depends on HAVE_PAGE_SIZE_32KB
+ Using 32KiB page size will result in slightly higher performance
+ kernel at the price of higher memory consumption compared to
+ 16KiB pages.  This option is available only on cnMIPS cores.
+ Note that you will need a suitable Linux distribution to
+ support this.
+
+config PAGE_SIZE_64KB
+   bool "64KiB pages"
+   depends on HAVE_PAGE_SIZE_64KB
+ Using 64KiB page size will result in slightly higher performance
+ kernel at the price of much higher memory consumption compared to
+ 4KiB or 16KiB pages.
+ This is not suitable for general-purpose workloads but the
+ better performance may be worth the cost for certain types of
+ supercomputing or database applications that work mostly with
+ large in-memory data rather than small files.
+
+config PAGE_SIZE_256KB
+   bool "256KiB pages"
+   depends on HAVE_PAGE_SIZE_256KB
+   help
+ 256KiB pages have little practical value due to their extreme
+ memory usage.  The kernel will only be able to run applications
+ that have been compiled with '-zmax-page-size' set to 256KiB
+ (the default is 64KiB or 4KiB on most architectures).
+
+endchoice
+
 config PAGE_SIZE_LESS_THAN_64KB
def_bool y
-   depends on !ARM64_64K_PAGES
depends on !PAGE_SIZE_64KB
-   depends on !PARISC_PAGE_SIZE_64KB
depends on PAGE_SIZE_LESS_THAN_256KB
 
 config PAGE_SIZE_LESS_THAN_256KB
def_bool y
depends on !PAGE_SIZE_256KB
 
+config PAGE_SHIFT
+   int
+   default 12 if PAGE_SIZE_4KB
+   default 13 if PAGE_SIZE_8KB
+   default 14 if PAGE_SIZE_16KB
+   default 15 if PAGE_SIZE_32KB
+   default 16 if PAGE_SIZE_64KB
+   default 18 if PAGE_SIZE_256KB
+
 # This allows to use a set of generic functions to determine mmap base
 # address by giving priority to top-down scheme only if the process
 # is not in legacy mode (compat task, unlimited stack size or
diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
index a880ee067d2e..1414052e7d6b 100644
--- a/arch/hexagon/Kconfig
+++ b/arch/hexagon/Kconfig
@@ -8,6 +8

[v2 PATCH 0/3] arch: mm, vdso: consolidate PAGE_SIZE definition

2024-03-06 Thread Arnd Bergmann

From: Arnd Bergmann 

Naresh noticed that the newly added usage of the PAGE_SIZE macro in
include/vdso/datapage.h introduced a build regression. I had an older
patch that I revived to have this defined through Kconfig rather than
through including asm/page.h, which is not allowed in vdso code.

The vdso patch series now has a temporary workaround, but I still want to
get this into v6.9 so we can replace the hack with CONFIG_PAGE_SIZE
in the vdso.
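
As a rough illustration only (not part of the series): the PAGE_SHIFT
symbol added to arch/Kconfig picks its default from whichever
PAGE_SIZE_*KB choice is enabled. The C sketch below mirrors that table;
the struct and the helper name page_shift_for_kib() are hypothetical and
exist only for this example.

```c
#include <stddef.h>

/* Mirrors the "default N if PAGE_SIZE_*KB" lines of the PAGE_SHIFT symbol. */
struct page_size_option {
	unsigned int kib;   /* page size in KiB, as in PAGE_SIZE_<kib>KB */
	unsigned int shift; /* value CONFIG_PAGE_SHIFT would take */
};

static const struct page_size_option page_size_options[] = {
	{ 4, 12 }, { 8, 13 }, { 16, 14 }, { 32, 15 }, { 64, 16 }, { 256, 18 },
};

/*
 * Hypothetical helper: return the shift for a page size given in KiB,
 * or -1 if no PAGE_SIZE_*KB option exists for that size.
 */
static int page_shift_for_kib(unsigned int kib)
{
	size_t i;
	size_t n = sizeof(page_size_options) / sizeof(page_size_options[0]);

	for (i = 0; i < n; i++)
		if (page_size_options[i].kib == kib)
			return (int)page_size_options[i].shift;
	return -1;
}
```

Code that cannot include asm/page.h only needs the resulting integer,
since the page size in bytes is 1UL << shift.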

I've applied this to the asm-generic tree already, please let me know if
there are still remaining issues. It's really close to the merge window
already, so I'd probably give this a few more days before I send a pull
request, or defer it to v6.10 if anything goes wrong.

Sorry for the delay, I was still waiting to resolve the m68k question,
but there were no further replies in the end, so I kept my original
version.

Changes from v1:

 - improve Kconfig help texts
 - remove an extraneous line in hexagon

  Arnd

Link: https://lore.kernel.org/lkml/ca+g9fytrxxm_ko9fnpz3xarxhv7ud_yqp-teupqrnrhu+_0...@mail.gmail.com/
Link: https://lore.kernel.org/all/65dc6c14.170a0220.f4a3f.9...@mx.google.com/
Link: https://lore.kernel.org/lkml/20240226161414.2316610-1-a...@kernel.org/

Arnd Bergmann (3):
  arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions
  arch: simplify architecture specific page size configuration
  arch: define CONFIG_PAGE_SIZE_*KB on all architectures

 arch/Kconfig   | 92 +-
 arch/alpha/Kconfig |  1 +
 arch/alpha/include/asm/page.h  |  2 +-
 arch/arc/Kconfig   |  3 +
 arch/arc/include/uapi/asm/page.h   |  6 +-
 arch/arm/Kconfig   |  1 +
 arch/arm/include/asm/page.h|  2 +-
 arch/arm64/Kconfig | 29 +-
 arch/arm64/include/asm/page-def.h  |  2 +-
 arch/csky/Kconfig  |  1 +
 arch/csky/include/asm/page.h   |  2 +-
 arch/hexagon/Kconfig   | 24 ++--
 arch/hexagon/include/asm/page.h|  6 +-
 arch/loongarch/Kconfig | 21 ++-
 arch/loongarch/include/asm/page.h  | 10 +---
 arch/m68k/Kconfig  |  3 +
 arch/m68k/Kconfig.cpu  |  2 +
 arch/m68k/include/asm/page.h   |  6 +-
 arch/microblaze/Kconfig|  1 +
 arch/microblaze/include/asm/page.h |  2 +-
 arch/mips/Kconfig  | 58 ++-
 arch/mips/include/asm/page.h   | 16 +-
 arch/nios2/Kconfig |  1 +
 arch/nios2/include/asm/page.h  |  2 +-
 arch/openrisc/Kconfig  |  1 +
 arch/openrisc/include/asm/page.h   |  2 +-
 arch/parisc/Kconfig|  3 +
 arch/parisc/include/asm/page.h | 10 +---
 arch/powerpc/Kconfig   | 31 ++
 arch/powerpc/include/asm/page.h|  2 +-
 arch/riscv/Kconfig |  1 +
 arch/riscv/include/asm/page.h  |  2 +-
 arch/s390/Kconfig  |  1 +
 arch/s390/include/asm/page.h   |  2 +-
 arch/sh/include/asm/page.h | 13 +
 arch/sh/mm/Kconfig | 42 --
 arch/sparc/Kconfig |  2 +
 arch/sparc/include/asm/page_32.h   |  2 +-
 arch/sparc/include/asm/page_64.h   |  3 +-
 arch/um/Kconfig|  1 +
 arch/um/include/asm/page.h |  2 +-
 arch/x86/Kconfig   |  1 +
 arch/x86/include/asm/page_types.h  |  2 +-
 arch/xtensa/Kconfig|  1 +
 arch/xtensa/include/asm/page.h |  2 +-
 scripts/gdb/linux/constants.py.in  |  2 +-
 scripts/gdb/linux/mm.py|  2 +-
 47 files changed, 185 insertions(+), 238 deletions(-)

-- 
2.39.2

To: Thomas Gleixner 
To: Vincenzo Frascino 
To: Kees Cook 
To: Anna-Maria Behnsen 
Cc: Matt Turner 
Cc: Vineet Gupta 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Guo Ren 
Cc: Brian Cain 
Cc: Huacai Chen 
Cc: Geert Uytterhoeven 
Cc: Michal Simek 
Cc: Thomas Bogendoerfer 
Cc: Helge Deller 
Cc: Michael Ellerman 
Cc: Christophe Leroy 
Cc: Palmer Dabbelt 
Cc: John Paul Adrian Glaubitz 
Cc: Andreas Larsson 
Cc: Richard Weinberger 
Cc: x...@kernel.org
Cc: Max Filippov 
Cc: Andy Lutomirski 
Cc: Vincenzo Frascino 
Cc: Jan Kiszka 
Cc: Kieran Bingham 
Cc: Andrew Morton 
Cc: Arnd Bergmann 
Cc: linux-ker...@vger.kernel.org
Cc: linux-al...@vger.kernel.org
Cc: linux-snps-...@lists.infradead.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c...@vger.kernel.org
Cc: linux-hexa...@vger.kernel.org
Cc: loonga...@lists.linux.dev
Cc: linux-m...@lists.linux-m68k.org
Cc: linux-m...@vger.kernel.org
Cc: linux-openr...@vger.kernel.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-ri...@lists.infradead.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: linux...@lists.infradead.org

Re: [PATCH 1/4] arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions

2024-02-27 Thread Arnd Bergmann

On Tue, Feb 27, 2024, at 16:44, Christophe Leroy wrote:
> Le 27/02/2024 à 16:40, Arnd Bergmann a écrit :
>> On Mon, Feb 26, 2024, at 17:55, Samuel Holland wrote:
>
>
> For 256K pages, powerpc has the following help. I think you should have 
> it too:
>
> The kernel will only be able to run applications that have been
> compiled with '-zmax-page-size' set to 256K (the default is 64K) using
> binutils later than 2.17.50.0.3, or by patching the ELF_MAXPAGESIZE
> definition from 0x10000 to 0x40000 in older versions.

I don't think we need to mention pre-2.18 binutils any more, but the
rest seems useful, changed the text now to

config PAGE_SIZE_256KB
bool "256KiB pages"
depends on HAVE_PAGE_SIZE_256KB
help
  256KiB pages have little practical value due to their extreme
  memory usage.  The kernel will only be able to run applications
  that have been compiled with '-zmax-page-size' set to 256KiB
  (the default is 64KiB or 4KiB on most architectures).

  Arnd
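
To make the '-zmax-page-size' remark concrete, here is a simplified
sketch, assuming the only requirement is that the value a binary was
linked with is a multiple of, and at least as large as, the kernel page
size. The function name max_page_size_ok() is made up for this example,
and the real ELF loader checks more than this.

```c
#include <stdbool.h>

/*
 * Simplified assumption: a binary linked with a given -zmax-page-size
 * can be mapped when that value is a multiple of the kernel page size
 * (and therefore at least as large as it).
 */
static bool max_page_size_ok(unsigned long max_page_size,
			     unsigned int page_shift)
{
	unsigned long page_size = 1UL << page_shift;

	return max_page_size >= page_size &&
	       (max_page_size % page_size) == 0;
}
```

With a 64KiB (0x10000) maximum, max_page_size_ok(0x10000, 18) is false
for a 256KiB-page kernel, while a binary linked with 0x40000 passes for
every page size offered by the Kconfig choice.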

Re: [PATCH 1/4] arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions

2024-02-27 Thread Arnd Bergmann

On Tue, Feb 27, 2024, at 09:45, Geert Uytterhoeven wrote:
>
>> +config PAGE_SIZE_4KB
>> +   bool "4KB pages"
>
> Now you got rid of the 4000-byte ("4kB") pages and friends, please
> do not replace these by Kelvin-bytes, and use the official binary
> prefixes => "4 KiB".
>

Done, thanks.

Arnd

Re: [PATCH 1/4] arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions

2024-02-27 Thread Arnd Bergmann

On Mon, Feb 26, 2024, at 20:02, Christophe Leroy wrote:
> Le 26/02/2024 à 17:14, Arnd Bergmann a écrit :
>> From: Arnd Bergmann 
>
> That's a nice re-factor.
>
> The only drawback I see is that we are losing several interesting 
> arch-specific comments/help text. Don't know if there could be an easy 
> way to keep them.

This is what I have now, trying to write it as generically as
possible while still giving useful advice:

config PAGE_SIZE_4KB
bool "4KiB pages"
depends on HAVE_PAGE_SIZE_4KB
help
  This option selects the standard 4KiB Linux page size, which is the only
  available option on many architectures. Using 4KiB page size will
  minimize memory consumption and is therefore recommended for low
  memory systems.
  Some software that is written for x86 systems makes incorrect
  assumptions about the page size and only runs on 4KiB pages.

config PAGE_SIZE_8KB
bool "8KiB pages"
depends on HAVE_PAGE_SIZE_8KB
help
  This option is the only supported page size on a few older
  processors, and can be slightly faster than 4KiB pages.

config PAGE_SIZE_16KB
bool "16KiB pages"
depends on HAVE_PAGE_SIZE_16KB
help
  This option is usually a good compromise between memory
  consumption and performance for typical desktop and server
  workloads, often saving a level of page table lookups compared
  to 4KiB pages as well as reducing TLB pressure and overhead of
  per-page operations in the kernel at the expense of a larger
  page cache.

config PAGE_SIZE_32KB
bool "32KiB pages"
depends on HAVE_PAGE_SIZE_32KB
help
  Using 32KiB page size will result in slightly higher performance
  kernel at the price of higher memory consumption compared to
  16KiB pages.  This option is available only on cnMIPS cores.
  Note that you will need a suitable Linux distribution to
  support this.

config PAGE_SIZE_64KB
bool "64KiB pages"
depends on HAVE_PAGE_SIZE_64KB
help
  Using 64KiB page size will result in slightly higher performance
  kernel at the price of much higher memory consumption compared to
  4KiB or 16KiB pages.
  This is not suitable for general-purpose workloads but the
  better performance may be worth the cost for certain types of
  supercomputing or database applications that work mostly with
  large in-memory data rather than small files.

config PAGE_SIZE_256KB
bool "256KiB pages"
depends on HAVE_PAGE_SIZE_256KB
help
  256KiB pages have little practical value due to their extreme
  memory usage.

Let me know if you think some of this should be adapted further.

>>   
>> +#define PAGE_SHIFT CONFIG_PAGE_SHIFT
>>   #define PAGE_SIZE  (1UL << PAGE_SHIFT)
>>   #define PAGE_MASK  (~((1 << PAGE_SHIFT) - 1))
>>   
>
> Could we move PAGE_SIZE and PAGE_MASK in a generic/core header instead 
> of having it duplicated for each arch ?

Yes, but I'm leaving this for a follow-up series, since I had
to stop somewhere and there is always room for cleaning up headers
further ;-)

  Arnd
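
For illustration, a minimal sketch of what such arch-independent
definitions compute, assuming the shift is passed in as a parameter that
stands in for CONFIG_PAGE_SHIFT. The helper names are hypothetical and
this is not the follow-up series itself. Unlike the quoted PAGE_MASK,
the mask here is built from an unsigned long so no sign extension is
involved.

```c
/* page_shift stands in for CONFIG_PAGE_SHIFT (12, 13, 14, 15, 16 or 18). */
static unsigned long page_size(unsigned int page_shift)
{
	return 1UL << page_shift;
}

/* Mask that clears the offset-within-page bits of an address. */
static unsigned long page_mask(unsigned int page_shift)
{
	return ~(page_size(page_shift) - 1);
}

/* Round an address down to the start of its page. */
static unsigned long page_align_down(unsigned long addr,
				     unsigned int page_shift)
{
	return addr & page_mask(page_shift);
}

/* Round an address up to the next page boundary (no-op if aligned). */
static unsigned long page_align_up(unsigned long addr,
				   unsigned int page_shift)
{
	return (addr + page_size(page_shift) - 1) & page_mask(page_shift);
}
```

Because everything follows from the one integer, a generic header would
only need CONFIG_PAGE_SHIFT from Kconfig.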
