Re: [PATCH net-next 00/10] net: mdio: Continue separating C22 and C45

2023-01-13 Thread patchwork-bot+netdevbpf
Hello:

This series was applied to netdev/net-next.git (master)
by Jakub Kicinski :

On Thu, 12 Jan 2023 16:15:07 +0100 you wrote:
> I've picked this older series from Andrew up and rebased it onto
> the latest net-next.
> 
> This is the second patch set in the series which separates the C22
> and C45 MDIO bus transactions at the API level to the MDIO bus drivers.
> 
> Signed-off-by: Michael Walle 
> 
> [...]

Here is the summary with links:
  - [net-next,01/10] net: mdio: cavium: Separate C22 and C45 transactions
https://git.kernel.org/netdev/net-next/c/93641ecbaa1f
  - [net-next,02/10] net: mdio: i2c: Separate C22 and C45 transactions
https://git.kernel.org/netdev/net-next/c/87e3bee0f247
  - [net-next,03/10] net: mdio: mux-bcm-iproc: Separate C22 and C45 transactions
https://git.kernel.org/netdev/net-next/c/d544a25930a7
  - [net-next,04/10] net: mdio: aspeed: Separate C22 and C45 transactions
https://git.kernel.org/netdev/net-next/c/c3c497eb8b24
  - [net-next,05/10] net: mdio: ipq4019: Separate C22 and C45 transactions
https://git.kernel.org/netdev/net-next/c/c58e39942adf
  - [net-next,06/10] net: ethernet: mtk_eth_soc: Separate C22 and C45 
transactions
https://git.kernel.org/netdev/net-next/c/900888374e73
  - [net-next,07/10] net: lan743x: Separate C22 and C45 transactions
https://git.kernel.org/netdev/net-next/c/3d90c03cb416
  - [net-next,08/10] net: stmmac: Separate C22 and C45 transactions for xgmac2
https://git.kernel.org/netdev/net-next/c/5b0a447efff5
  - [net-next,09/10] net: stmmac: Separate C22 and C45 transactions for xgmac
https://git.kernel.org/netdev/net-next/c/3c7826d0b106
  - [net-next,10/10] enetc: Separate C22 and C45 transactions
https://git.kernel.org/netdev/net-next/c/80e87442e69b

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




Re: [PATCH net v5] net: wan: Add checks for NULL for utdm in undo_uhdlc_init and unmap_si_regs

2023-01-13 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to netdev/net.git (master)
by Jakub Kicinski :

On Thu, 12 Jan 2023 10:47:03 +0300 you wrote:
> If uhdlc_priv_tsa != 1 then utdm is not initialized.
> And if ret != NULL then goto undo_uhdlc_init, where
> utdm is dereferenced. Same if dev == NULL.
> 
> Found by Astra Linux on behalf of Linux Verification Center
> (linuxtesting.org) with SVACE.
> 
> [...]

Here is the summary with links:
  - [net,v5] net: wan: Add checks for NULL for utdm in undo_uhdlc_init and 
unmap_si_regs
https://git.kernel.org/netdev/net/c/488e0bf7f34a

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




Re: [PATCH] lockref: stop doing cpu_relax in the cmpxchg loop

2023-01-13 Thread Linus Torvalds
On Fri, Jan 13, 2023 at 3:47 PM Luck, Tony  wrote:
>
> The computer necrophiliacs at Debian and Gentoo seem determined
> to keep ia64 alive.
>
> So perhaps this should s/cpu_relax/soemt_relax/ where soemt_relax
> is a no-op everywhere except ia64, which can define it as cpu_relax.

Heh. I already took your earlier "$ git rm -r arch/ia64" comment as an
ack for not really caring about ia64.

I suspect nobody will notice, and if ia64 is the only reason to do
this, I really don't think it would be worth it.

  Linus


Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-13 Thread Ard Biesheuvel
On Fri, 13 Jan 2023 at 22:06, John Paul Adrian Glaubitz
 wrote:
>
> Hello Ard!
>
> > Can I take that as an ack on [0]? The EFI subsystem has evolved
> > substantially over the years, and there is really no way to do any
> > IA64 testing beyond build testing, so from that perspective, dropping
> > it entirely would be welcomed.
>
> ia64 is regularly tested in Debian and Gentoo [1][2].
>
> Debian's ia64 porterbox yttrium runs a recent kernel without issues:
>
> root@yttrium:~# uname -a
> Linux yttrium 5.19.0-2-mckinley #1 SMP Debian 5.19.11-1 (2022-09-24) ia64 
> GNU/Linux
> root@yttrium:~#
>
> root@yttrium:~# journalctl -b|head -n10
> Nov 14 14:46:10 yttrium kernel: Linux version 5.19.0-2-mckinley 
> (debian-ker...@lists.debian.org) (gcc-11 (Debian 11.3.0-6) 11.3.0, GNU ld 
> (GNU Binutils for Debian) 2.39) #1 SMP Debian 5.19.11-1 (2022-09-24)
> Nov 14 14:46:10 yttrium kernel: efi: EFI v2.10 by HP
> Nov 14 14:46:10 yttrium kernel: efi: SALsystab=0xdfdd63a18 ESI=0xdfdd63f18 
> ACPI 2.0=0x3d3c4014 HCDP=0xd8798 SMBIOS=0x3d368000
> Nov 14 14:46:10 yttrium kernel: PCDP: v3 at 0xd8798
> Nov 14 14:46:10 yttrium kernel: earlycon: uart8250 at I/O port 0x4000 
> (options '115200n8')
> Nov 14 14:46:10 yttrium kernel: printk: bootconsole [uart8250] enabled
> Nov 14 14:46:10 yttrium kernel: ACPI: Early table checksum verification 
> disabled
> Nov 14 14:46:10 yttrium kernel: ACPI: RSDP 0x3D3C4014 24 (v02 HP  
>   )
> Nov 14 14:46:10 yttrium kernel: ACPI: XSDT 0x3D3C4580 000124 (v01 HP  
>RX2800-2 0001  0113)
> Nov 14 14:46:10 yttrium kernel: ACPI: FACP 0x3D3BE000 F4 (v03 HP  
>RX2800-2 0001 HP   0001)
> root@yttrium:~#
>
> Same applies to the buildds:
>
> root@lifshitz:~# uname -a
> Linux lifshitz 6.0.0-4-mckinley #1 SMP Debian 6.0.8-1 (2022-11-11) ia64 
> GNU/Linux
> root@lifshitz:~#
>
> root@lenz:~# uname -a
> Linux lenz 6.0.0-4-mckinley #1 SMP Debian 6.0.8-1 (2022-11-11) ia64 GNU/Linux
> root@lenz:~#
>
> EFI works fine as well using the latest version of GRUB2.
>
> Thanks,
> Adrian
>
> > [1] https://cdimage.debian.org/cdimage/ports/snapshots/
> > [2] https://mirror.yandex.ru/gentoo-distfiles//releases/ia64/autobuilds/

Thanks for reporting back. I (mis)read the debian ports page [3],
which mentions Debian 7 as the highest Debian version that supports
IA64, and so I assumed that support had been dropped from Debian.

However, if only a handful of people want to keep this port alive for
reasons of nostalgia, it is obviously obsolete, and we should ask
ourselves whether it is reasonable to expect Linux contributors to
keep spending time on this.

Does the Debian ia64 port have any users? Or is the system that builds
the packages the only one that consumes them?


[3] https://www.debian.org/ports/ia64/


Re: [PATCH] kallsyms: Fix scheduling with interrupts disabled in self-test

2023-01-13 Thread Luis Chamberlain
On Fri, Jan 13, 2023 at 09:44:35AM +0100, Petr Mladek wrote:
> On Thu 2023-01-12 10:24:43, Luis Chamberlain wrote:
> > On Thu, Jan 12, 2023 at 08:54:26PM +1000, Nicholas Piggin wrote:
> > > kallsyms_on_each* may schedule so must not be called with interrupts
> > > disabled. The iteration function could disable interrupts, but this
> > > also changes lookup_symbol() to match the change to the other timing
> > > code.
> > > 
> > > Reported-by: Erhard F. 
> > > Link: 
> > > https://lore.kernel.org/all/bug-216902-206...@https.bugzilla.kernel.org%2F/
> > > Reported-by: kernel test robot 
> > > Link: 
> > > https://lore.kernel.org/oe-lkp/202212251728.8d0872ff-oliver.s...@intel.com
> > > Fixes: 30f3bb09778d ("kallsyms: Add self-test facility")
> > > Signed-off-by: Nicholas Piggin 
> > > ---
> > 
> > Thanks Nicholas!
> > 
> > Petr had just suggested removing this aspect of the selftests (the
> > performance test), as it's specific to the config, it doesn't run many
> > times to get an average, and odd things on a system can create
> > different metrics. Zhen Lei had given up on fixing it and has a patch
> > to instead remove this part of the selftest.
> > 
> > I still find value in keeping it, but Petr, would like your opinion on
> > this fix, if we were to keep it.
> 
> I am fine with this fix.

Merged the fix. I'll push to Linus for 6.2-rc4

  Luis


[PATCH] lockref: stop doing cpu_relax in the cmpxchg loop

2023-01-13 Thread Mateusz Guzik
On the x86-64 architecture even a failing cmpxchg grants exclusive
access to the cacheline, making it preferable to retry the failed op
immediately instead of stalling with the pause instruction.
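
To make the pattern concrete, here is a minimal userspace sketch of the
kind of loop in question (an illustration with made-up names, not the
kernel's CMPXCHG_LOOP macro):

#include <stdatomic.h>
#include <stdint.h>

/*
 * Retry a failed compare-and-swap immediately: on x86-64 the failing
 * cmpxchg has already acquired the cacheline exclusively, so issuing
 * a pause before the retry only adds latency.
 */
static int inc_if_possible(_Atomic uint64_t *lock_count, int retry)
{
	uint64_t old = atomic_load_explicit(lock_count, memory_order_relaxed);

	while (retry--) {
		/* On failure, 'old' is refreshed with the observed value. */
		if (atomic_compare_exchange_weak(lock_count, &old, old + 1))
			return 1;	/* update succeeded */
		/* No cpu_relax()/pause here: go straight to the next attempt. */
	}
	return 0;	/* give up; the caller falls back to the spinlock */
}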

To illustrate the impact, below are benchmark results obtained by
running various will-it-scale tests on top of the 6.2-rc3 kernel
and Cascade Lake (2 sockets * 24 cores * 2 threads) CPU.

All results in ops/s. Note there is some variance in re-runs, but
the code is consistently faster when contention is present.

open3 ("Same file open/close"):
proc  stock   no-pause
1     805603  814942  (+1%)
2   1054980 1054781 (-0%)
8   1544802 1822858 (+18%)
24  1191064 2199665 (+84%)
48  851582  1469860 (+72%)
96  609481  1427170 (+134%)

fstat2 ("Same file fstat"):
proc  stock   no-pause
1   3013872 3047636 (+1%)
2   4284687 4400421 (+2%)
8   3257721 5530156 (+69%)
24  2239819 5466127 (+144%)
48  1701072 5256609 (+209%)
96  1269157 6649326 (+423%)

Additionally, a kernel with a private patch to help access() scalability:
access2 ("Same file access"):
proc  stock   patched patched+nopause
24  2378041 2005501 5370335 (-15% / +125%)

That is, fixing the problems in access() itself *reduces* scalability: once
the cacheline ping-pong happens only in lockref, the pause instruction in
its cmpxchg loop becomes the bottleneck.

Note that fstat and access benchmarks are not currently integrated into
will-it-scale, but interested parties can find them in pull requests to
said project.

Code at hand has a rather tortured history. First modification showed up
in d472d9d98b463dd7 ("lockref: Relax in cmpxchg loop"), written with
Itanium in mind. Later it got patched up to use an arch-dependent macro
to stop doing it on s390 where it caused a significant regression. Said
macro had undergone revisions and was ultimately eliminated later, going
back to cpu_relax.

While I intended to only remove cpu_relax for x86-64, I got the
following comment from Linus:
> I would actually prefer just removing it entirely and see if somebody
> else hollers. You have the numbers to prove it hurts on real hardware,
> and I don't think we have any numbers to the contrary.

> So I think it's better to trust the numbers and remove it as a
> failure, than say "let's just remove it on x86-64 and leave everybody
>else with the potentially broken code"

Additionally, Will Deacon (maintainer of the arm64 port, one of the
architectures previously benchmarked):
> So, from the arm64 side of the fence, I'm perfectly happy just removing
> the cpu_relax() calls from lockref.

As such, come back full circle in history and whack it altogether.

Signed-off-by: Mateusz Guzik 
---
 lib/lockref.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/lib/lockref.c b/lib/lockref.c
index 45e93ece8ba0..2afe4c5d8919 100644
--- a/lib/lockref.c
+++ b/lib/lockref.c
@@ -23,7 +23,6 @@
 		}						\
 		if (!--retry)					\
 			break;					\
-		cpu_relax();					\
 	}							\
 } while (0)
 
-- 
2.39.0



Re: [PATCH v3 00/51] cpuidle,rcu: Clean up the mess

2023-01-13 Thread Paul E. McKenney
On Thu, Jan 12, 2023 at 08:43:14PM +0100, Peter Zijlstra wrote:
> Hi All!
> 
> The (hopefully) final respin of cpuidle vs rcu cleanup patches. Barring any
> objections I'll be queueing these patches in tip/sched/core in the next few
> days.
> 
> v2: https://lkml.kernel.org/r/20220919095939.761690...@infradead.org
> 
> These here patches clean up the mess that is cpuidle vs rcuidle.
> 
> At the end of the ride there's only one RCU_NONIDLE user left:
> 
>   arch/arm64/kernel/suspend.c:RCU_NONIDLE(__cpu_suspend_exit());
> 
> And I know Mark has been prodding that with something sharp.
> 
> The last version was tested by a number of people and I'm hoping to not have
> broken anything in the meantime ;-)
> 
> 
> Changes since v2:

150 rcutorture hours on each of the default scenarios passed.  This
is qemu/KVM on x86:

Tested-by: Paul E. McKenney 

>  - rebased to v6.2-rc3; as available at:
>  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/idle
> 
>  - folded: 
> https://lkml.kernel.org/r/y3ubwyny15etu...@hirez.programming.kicks-ass.net
>which makes the ARM cpuidle index 0 consistently not use
>CPUIDLE_FLAG_RCU_IDLE, as requested by Ulf.
> 
>  - added a few more __always_inline to empty stub functions as found by the
>robot.
> 
>  - Used _RET_IP_ instead of _THIS_IP_ in a few places because of:
>https://github.com/ClangBuiltLinux/linux/issues/263
> 
>  - Added new patches to address various robot reports:
> 
>  #35:  trace,hardirq: No moar _rcuidle() tracing
>  #47:  cpuidle: Ensure ct_cpuidle_enter() is always called from 
> noinstr/__cpuidle
>  #48:  cpuidle,arch: Mark all ct_cpuidle_enter() callers __cpuidle
>  #49:  cpuidle,arch: Mark all regular cpuidle_state::enter methods 
> __cpuidle
>  #50:  cpuidle: Comments about noinstr/__cpuidle
>  #51:  context_tracking: Fix noinstr vs KASAN
> 
> 
> ---
>  arch/alpha/kernel/process.c   |  1 -
>  arch/alpha/kernel/vmlinux.lds.S   |  1 -
>  arch/arc/kernel/process.c |  3 ++
>  arch/arc/kernel/vmlinux.lds.S |  1 -
>  arch/arm/include/asm/vmlinux.lds.h|  1 -
>  arch/arm/kernel/cpuidle.c |  4 +-
>  arch/arm/kernel/process.c |  1 -
>  arch/arm/kernel/smp.c |  6 +--
>  arch/arm/mach-davinci/cpuidle.c   |  4 +-
>  arch/arm/mach-gemini/board-dt.c   |  3 +-
>  arch/arm/mach-imx/cpuidle-imx5.c  |  4 +-
>  arch/arm/mach-imx/cpuidle-imx6q.c |  8 ++--
>  arch/arm/mach-imx/cpuidle-imx6sl.c|  4 +-
>  arch/arm/mach-imx/cpuidle-imx6sx.c|  9 ++--
>  arch/arm/mach-imx/cpuidle-imx7ulp.c   |  4 +-
>  arch/arm/mach-omap2/common.h  |  6 ++-
>  arch/arm/mach-omap2/cpuidle34xx.c | 16 ++-
>  arch/arm/mach-omap2/cpuidle44xx.c | 29 +++--
>  arch/arm/mach-omap2/omap-mpuss-lowpower.c | 12 +-
>  arch/arm/mach-omap2/pm.h  |  2 +-
>  arch/arm/mach-omap2/pm24xx.c  | 51 +-
>  arch/arm/mach-omap2/pm34xx.c  | 14 +--
>  arch/arm/mach-omap2/pm44xx.c  |  2 +-
>  arch/arm/mach-omap2/powerdomain.c | 10 ++---
>  arch/arm/mach-s3c/cpuidle-s3c64xx.c   |  5 +--
>  arch/arm64/kernel/cpuidle.c   |  2 +-
>  arch/arm64/kernel/idle.c  |  1 -
>  arch/arm64/kernel/smp.c   |  4 +-
>  arch/arm64/kernel/vmlinux.lds.S   |  1 -
>  arch/csky/kernel/process.c|  1 -
>  arch/csky/kernel/smp.c|  2 +-
>  arch/csky/kernel/vmlinux.lds.S|  1 -
>  arch/hexagon/kernel/process.c |  1 -
>  arch/hexagon/kernel/vmlinux.lds.S |  1 -
>  arch/ia64/kernel/process.c|  1 +
>  arch/ia64/kernel/vmlinux.lds.S|  1 -
>  arch/loongarch/kernel/idle.c  |  1 +
>  arch/loongarch/kernel/vmlinux.lds.S   |  1 -
>  arch/m68k/kernel/vmlinux-nommu.lds|  1 -
>  arch/m68k/kernel/vmlinux-std.lds  |  1 -
>  arch/m68k/kernel/vmlinux-sun3.lds |  1 -
>  arch/microblaze/kernel/process.c  |  1 -
>  arch/microblaze/kernel/vmlinux.lds.S  |  1 -
>  arch/mips/kernel/idle.c   | 14 +++
>  arch/mips/kernel/vmlinux.lds.S|  1 -
>  arch/nios2/kernel/process.c   |  1 -
>  arch/nios2/kernel/vmlinux.lds.S   |  1 -
>  arch/openrisc/kernel/process.c|  1 +
>  arch/openrisc/kernel/vmlinux.lds.S|  1 -
>  arch/parisc/kernel/process.c  |  2 -
>  arch/parisc/kernel/vmlinux.lds.S  |  1 -
>  arch/powerpc/kernel/idle.c|  5 +--
>  arch/powerpc/kernel/vmlinux.lds.S |  1 -
>  arch/riscv/kernel/process.c   |  1 -
>  arch/riscv/kernel/vmlinux-xip.lds.S   |  1 -
>  arch/riscv/kernel/vmlinux.lds.S   |  1 -
>  arch/s390/kernel/idle.c   |  1 -
>  arch/s390/kernel/vmlinux.lds.S

[PATCH mm-unstable v1 00/26] mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap PTEs

2023-01-13 Thread David Hildenbrand
This is the follow-up on [1]:
[PATCH v2 0/8] mm: COW fixes part 3: reliable GUP R/W FOLL_GET of
anonymous pages

After we implemented __HAVE_ARCH_PTE_SWP_EXCLUSIVE on most prominent
enterprise architectures, implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all
remaining architectures that support swap PTEs.

This makes sure that exclusive anonymous pages will stay exclusive, even
after they were swapped out -- for example, making GUP R/W FOLL_GET of
anonymous pages reliable. Details can be found in [1].

This primarily fixes remaining known O_DIRECT memory corruptions that can
happen on concurrent swapout, whereby we can lose DMA reads to a page
(modifying the user page by writing to it).

To verify, there are two test cases (requiring swap space, obviously):
(1) The O_DIRECT+swapout test case [2] from Andrea. This test case tries
triggering a race condition.
(2) My vmsplice() test case [3] that tries to detect if the exclusive
marker was lost during swapout, not relying on a race condition.


For example, on 32bit x86 (with and without PAE), my test case fails
without these patches:
$ ./test_swp_exclusive
FAIL: page was replaced during COW
But succeeds with these patches:
$ ./test_swp_exclusive
PASS: page was not replaced during COW


Why implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE for all architectures, even
the ones where swap support might be in a questionable state? This is the
first step towards removing "readable_exclusive" migration entries, and
instead using pte_swp_exclusive() also with (readable) migration entries
instead (as suggested by Peter). The only missing piece for that is
supporting pmd_swp_exclusive() on relevant architectures with THP
migration support.

As all relevant architectures now implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE,
we can drop __HAVE_ARCH_PTE_SWP_EXCLUSIVE in the last patch.

I tried cross-compiling all relevant setups and tested on x86 and sparc64
so far.

CCing arch maintainers only on this cover letter and on the respective
patch(es).

[1] https://lkml.kernel.org/r/20220329164329.208407-1-da...@redhat.com
[2] 
https://gitlab.com/aarcange/kernel-testcases-for-v5.11/-/blob/main/page_count_do_wp_page-swap.c
[3] 
https://gitlab.com/davidhildenbrand/scratchspace/-/blob/main/test_swp_exclusive.c


RFC -> v1:
* Some smaller comment+patch description changes
* "powerpc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit book3s"
 -> Fixup swap PTE description


David Hildenbrand (26):
  mm/debug_vm_pgtable: more pte_swp_exclusive() sanity checks
  alpha/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  arc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  arm/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  csky/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  hexagon/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  ia64/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  loongarch/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  m68k/mm: remove dummy __swp definitions for nommu
  m68k/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  microblaze/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  mips/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  nios2/mm: refactor swap PTE layout
  nios2/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  openrisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  parisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  powerpc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit book3s
  powerpc/nohash/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  riscv/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  sh/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit
  sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 64bit
  um/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  x86/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also on 32bit
  xtensa/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  mm: remove __HAVE_ARCH_PTE_SWP_EXCLUSIVE

 arch/alpha/include/asm/pgtable.h  | 40 -
 arch/arc/include/asm/pgtable-bits-arcv2.h | 26 +-
 arch/arm/include/asm/pgtable-2level.h |  3 +
 arch/arm/include/asm/pgtable-3level.h |  3 +
 arch/arm/include/asm/pgtable.h| 34 +--
 arch/arm64/include/asm/pgtable.h  |  1 -
 arch/csky/abiv1/inc/abi/pgtable-bits.h| 13 ++-
 arch/csky/abiv2/inc/abi/pgtable-bits.h| 19 ++--
 arch/csky/include/asm/pgtable.h   | 17 
 arch/hexagon/include/asm/pgtable.h| 36 ++--
 arch/ia64/include/asm/pgtable.h   | 31 ++-
 arch/loongarch/include/asm/pgtable-bits.h |  4 +
 arch/loongarch/include/asm/pgtable.h  | 38 +++-
 arch/m68k/include/asm/mcf_pgtable.h   | 35 +++-
 arch/m68k/include/asm/motorola_pgtable.h  | 37 +++-
 arch/m68k/include/asm/pgtable_no.h|  6 --
 arch/m68k/include/asm/sun3_pgtable.h  | 38 +++-
 arch/microblaze/include/asm/pgtable.h | 44 +++---
 arch/mips/include/asm/pgtable-32.h| 88 ---
 

Re: lockref scalability on x86-64 vs cpu_relax

2023-01-13 Thread Mateusz Guzik
On 1/13/23, Linus Torvalds  wrote:
> Side note on your access() changes - if it turns out that you can
> remove all the cred games, we should possibly then revert my old
> commit d7852fbd0f04 ("access: avoid the RCU grace period for the
> temporary subjective credentials") which avoided the biggest issue
> with the unnecessary cred switching.
>
> I *think* access() is the only user of that special 'non_rcu' thing,
> but it is possible that the whole 'non_rcu' thing ends up mattering
> for cases where the cred actually does change because euid != uid (ie
> suid programs), so this would need a bit more effort to do performance
> testing on.
>

I don't think the games are avoidable. For one I found non-root
processes with non-empty cap_effective even on my laptop, albeit I did
not check how often something like this is doing access().

Discussion for another time.

> On Thu, Jan 12, 2023 at 5:36 PM Mateusz Guzik  wrote:
>> All that said, I think the thing to do here is to replace cpu_relax
>> with a dedicated arch-dependent macro, akin to the following:
>
> I would actually prefer just removing it entirely and see if somebody
> else hollers. You have the numbers to prove it hurts on real hardware,
> and I don't think we have any numbers to the contrary.
>
> So I think it's better to trust the numbers and remove it as a
> failure, than say "let's just remove it on x86-64 and leave everybody
> else with the potentially broken code"
>
[snip]
> Then other architectures can try to run their numbers, and only *if*
> it then turns out that they have a reason to do something else should
> we make this conditional and different on different architectures.
>
> Let's try to keep the code as common as possibly until we have hard
> evidence for special cases, in other words.
>

I did not want to make such a change without redoing the ThunderX2
benchmark, or at least something else arm64-y. I may be able to bench it
tomorrow on whatever arm-y stuff can be found on Amazon's EC2, assuming
no arm64 people show up with their results.

Even then IMHO the safest route is to patch it out on x86-64 and give
other people time to bench their archs as they get around to it, and
ultimately whack the thing if it turns out nobody benefits from it.
I would say beats backpedaling on the removal, but I'm not going to
fight for it.

That said, does waiting for arm64 numbers and/or producing them for the
removal commit message sound like a plan? If so, I'll post soon(tm).

-- 
Mateusz Guzik 


Re: [PATCH] modpost: support arbitrary symbol length in modversion

2023-01-13 Thread Lucas De Marchi

On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote:

Currently modversion uses a fixed size array of size (64 - sizeof(long))
to store symbol names, thus placing a hard limit on the length of symbols.
Rust symbols (which encode crate and module names) can be quite a bit
longer. The length limit in kallsyms is increased to 512 for this reason.

It's a waste of space to simply expand the fixed array size to 512 in
modversion info entries. I therefore make it variably sized, with offset
to the next entry indicated by the initial "next" field.

In addition to supporting longer-than-56/60 byte symbols, this patch also
reduces the size for short symbols by getting rid of excessive zero padding.
Some zero padding remains, to ensure the "next" and "crc" fields are
properly aligned.
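
To illustrate the new layout, here is a self-contained userspace sketch of
the walk it implies (field names mirror the patch; the helper itself is
made up for illustration, and assumes each entry's "next" is non-zero and
aligned, which the padding is there to guarantee):

#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct modversion_info {
	uint32_t next;	/* byte offset from this entry to the next one */
	uint32_t crc;
	char name[];	/* NUL-terminated, zero-padded */
};

/* Scan a __versions section of 'size' bytes for 'symname'. */
static const struct modversion_info *
find_version(const void *sec, size_t size, const char *symname)
{
	const char *p = sec, *end = p + size;

	while (p < end) {
		const struct modversion_info *v = (const void *)p;

		if (strcmp(v->name, symname) == 0)
			return v;
		p += v->next;	/* hop to the next variable-length entry */
	}
	return NULL;
}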

This patch does have a tiny drawback: it makes the generated ".mod.c" files
a bit harder to read, as code like

"\x08\x00\x00\x00\x78\x56\x34\x12"
"symbol\0\0"

is generated as opposed to

{ 0x12345678, "symbol" },

because the structure is now variable-length. But hopefully nobody reads
the generated file :)

Link: b8a94bfb3395 ("kallsyms: increase maximum kernel symbol length to 512")
Link: https://github.com/Rust-for-Linux/linux/pull/379

Signed-off-by: Gary Guo 
---
arch/powerpc/kernel/module_64.c |  3 ++-
include/linux/module.h  |  6 --
kernel/module/version.c | 21 +
scripts/export_report.pl|  9 +
scripts/mod/modpost.c   | 33 +++--
5 files changed, 43 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
index ff045644f13f..eac23c11d579 100644
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -236,10 +236,11 @@ static void dedotify_versions(struct modversion_info 
*vers,
{
struct modversion_info *end;

-   for (end = (void *)vers + size; vers < end; vers++)
+   for (end = (void *)vers + size; vers < end; vers = (void *)vers + vers->next) {
if (vers->name[0] == '.') {
memmove(vers->name, vers->name+1, strlen(vers->name));
}
+   }
}

/*
diff --git a/include/linux/module.h b/include/linux/module.h
index 8c5909c0076c..37cb25af9099 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -34,8 +34,10 @@
#define MODULE_NAME_LEN MAX_PARAM_PREFIX_LEN

struct modversion_info {
-   unsigned long crc;
-   char name[MODULE_NAME_LEN];
+   /* Offset of the next modversion entry in relation to this one. */
+   u32 next;
+   u32 crc;
+   char name[0];


although not really exported as uapi, this will break userspace as this is
used in the  elf file generated for the modules. I think
this change must be made in a backward compatible way and kmod updated
to deal with the variable name length:

kmod $ git grep "\[64"
libkmod/libkmod-elf.c:  char name[64 - sizeof(uint32_t)];
libkmod/libkmod-elf.c:  char name[64 - sizeof(uint64_t)];

in kmod we have both 32 and 64 because a 64-bit kmod can read both 32
and 64 bit module, and vice versa.
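
For reference, the fixed 64-byte records kmod parses today boil down to
these two layouts (a sketch inferred from the grep above; the struct names
are invented here, and the crc width follows the word size of the module
being read):

#include <stdint.h>

struct modversion_info32 {
	uint32_t crc;
	char name[64 - sizeof(uint32_t)];
};

struct modversion_info64 {
	uint64_t crc;
	char name[64 - sizeof(uint64_t)];
};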

Lucas De Marchi


};

struct module;
diff --git a/kernel/module/version.c b/kernel/module/version.c
index 53f43ac5a73e..af7478dcc158 100644
--- a/kernel/module/version.c
+++ b/kernel/module/version.c
@@ -17,32 +17,29 @@ int check_version(const struct load_info *info,
{
Elf_Shdr *sechdrs = info->sechdrs;
unsigned int versindex = info->index.vers;
-   unsigned int i, num_versions;
-   struct modversion_info *versions;
+   struct modversion_info *versions, *end;
+   u32 crcval;

/* Exporting module didn't supply crcs?  OK, we're already tainted. */
if (!crc)
return 1;
+   crcval = *crc;

/* No versions at all?  modprobe --force does this. */
if (versindex == 0)
return try_to_force_load(mod, symname) == 0;

versions = (void *)sechdrs[versindex].sh_addr;
-   num_versions = sechdrs[versindex].sh_size
-   / sizeof(struct modversion_info);
+   end = (void *)versions + sechdrs[versindex].sh_size;

-   for (i = 0; i < num_versions; i++) {
-   u32 crcval;
-
-   if (strcmp(versions[i].name, symname) != 0)
+   for (; versions < end; versions = (void *)versions + versions->next) {
+   if (strcmp(versions->name, symname) != 0)
continue;

-   crcval = *crc;
-   if (versions[i].crc == crcval)
+   if (versions->crc == crcval)
return 1;
-   pr_debug("Found checksum %X vs module %lX\n",
-crcval, versions[i].crc);
+   pr_debug("Found checksum %X vs module %X\n",
+crcval, versions->crc);
goto bad_version;
}

diff --git a/scripts/export_report.pl b/scripts/export_report.pl
index 

RE: [PATCH] lockref: stop doing cpu_relax in the cmpxchg loop

2023-01-13 Thread Luck, Tony
> diff --git a/lib/lockref.c b/lib/lockref.c
> index 45e93ece8ba0..2afe4c5d8919 100644
> --- a/lib/lockref.c
> +++ b/lib/lockref.c
> @@ -23,7 +23,6 @@
>  		}						\
>  		if (!--retry)					\
>  			break;					\
> -		cpu_relax();					\
>  	}							\
>  } while (0)

The computer necrophiliacs at Debian and Gentoo seem determined
to keep ia64 alive.

So perhaps this should s/cpu_relax/soemt_relax/ where soemt_relax
is a no-op everywhere except ia64, which can define it as cpu_relax.

The ia64 case is quite painful if one thread on a core is spinning in that
loop, while the other thread on the same core is the one that needs to
run to update that value so that the cmpxchg can succeed.
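
A sketch of what that could look like (hypothetical, reusing the name from
above; no such macro exists in the tree today):

#ifdef CONFIG_IA64
/* Give the SMT sibling on the same ia64 core a chance to run. */
# define soemt_relax()	cpu_relax()
#else
/* Everywhere else: no pause/yield hint, retry immediately. */
# define soemt_relax()	do { } while (0)
#endif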

-Tony


Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-13 Thread John Paul Adrian Glaubitz

Hello Ard!


Can I take that as an ack on [0]? The EFI subsystem has evolved
substantially over the years, and there is really no way to do any
IA64 testing beyond build testing, so from that perspective, dropping
it entirely would be welcomed.


ia64 is regularly tested in Debian and Gentoo [1][2].

Debian's ia64 porterbox yttrium runs a recent kernel without issues:

root@yttrium:~# uname -a
Linux yttrium 5.19.0-2-mckinley #1 SMP Debian 5.19.11-1 (2022-09-24) ia64 
GNU/Linux
root@yttrium:~#

root@yttrium:~# journalctl -b|head -n10
Nov 14 14:46:10 yttrium kernel: Linux version 5.19.0-2-mckinley 
(debian-ker...@lists.debian.org) (gcc-11 (Debian 11.3.0-6) 11.3.0, GNU ld (GNU 
Binutils for Debian) 2.39) #1 SMP Debian 5.19.11-1 (2022-09-24)
Nov 14 14:46:10 yttrium kernel: efi: EFI v2.10 by HP
Nov 14 14:46:10 yttrium kernel: efi: SALsystab=0xdfdd63a18 ESI=0xdfdd63f18 ACPI 
2.0=0x3d3c4014 HCDP=0xd8798 SMBIOS=0x3d368000
Nov 14 14:46:10 yttrium kernel: PCDP: v3 at 0xd8798
Nov 14 14:46:10 yttrium kernel: earlycon: uart8250 at I/O port 0x4000 (options 
'115200n8')
Nov 14 14:46:10 yttrium kernel: printk: bootconsole [uart8250] enabled
Nov 14 14:46:10 yttrium kernel: ACPI: Early table checksum verification disabled
Nov 14 14:46:10 yttrium kernel: ACPI: RSDP 0x3D3C4014 24 (v02 HP
)
Nov 14 14:46:10 yttrium kernel: ACPI: XSDT 0x3D3C4580 000124 (v01 HP
 RX2800-2 0001  0113)
Nov 14 14:46:10 yttrium kernel: ACPI: FACP 0x3D3BE000 F4 (v03 HP
 RX2800-2 0001 HP   0001)
root@yttrium:~#

Same applies to the buildds:

root@lifshitz:~# uname -a
Linux lifshitz 6.0.0-4-mckinley #1 SMP Debian 6.0.8-1 (2022-11-11) ia64 
GNU/Linux
root@lifshitz:~#

root@lenz:~# uname -a
Linux lenz 6.0.0-4-mckinley #1 SMP Debian 6.0.8-1 (2022-11-11) ia64 GNU/Linux
root@lenz:~#

EFI works fine as well using the latest version of GRUB2.

Thanks,
Adrian


[1] https://cdimage.debian.org/cdimage/ports/snapshots/
[2] https://mirror.yandex.ru/gentoo-distfiles//releases/ia64/autobuilds/


Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-13 Thread Jessica Clarke
On 13 Jan 2023, at 21:03, Luck, Tony  wrote:
> 
>> For what it's worth, Debian and Gentoo both have ia64 ports with active
>> users (6.1 looks like it currently fails to build in Debian due to a
>> minor packaging issue, but various versions of 6.0 were built and
>> published, and one of those is running on the one ia64 Debian builder I
>> personally have access to).
> 
> Jess,
> 
> So dropping ia64 from the upstream kernel won't just save time of kernel
> developers. It will also save time for the folks keeping Debian and Gentoo
> ports up and running.
> 
> Are there people actually running production systems on ia64 that also
> update to v6.x kernels?
> 
> If so, why? Just scrap the machine and replace with almost anything else.
> You'll cover the cost of the new machine in short order with the savings on
> your power bill.

Hobbyists, same as alpha, hppa, m68k, sh and any other such
architectures that have no real use in this day and age.

Jess



RE: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-13 Thread Luck, Tony
> For what it's worth, Debian and Gentoo both have ia64 ports with active
> users (6.1 looks like it currently fails to build in Debian due to a
> minor packaging issue, but various versions of 6.0 were built and
> published, and one of those is running on the one ia64 Debian builder I
> personally have access to).

Jess,

So dropping ia64 from the upstream kernel won't just save time of kernel
developers. It will also save time for the folks keeping Debian and Gentoo
ports up and running.

Are there people actually running production systems on ia64 that also
update to v6.x kernels?

If so, why? Just scrap the machine and replace with almost anything else.
You'll cover the cost of the new machine in short order with the savings on
your power bill.

-Tony


Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-13 Thread Jessica Clarke
On Fri, Jan 13, 2023 at 08:55:41AM +0100, Ard Biesheuvel wrote:
> On Fri, 13 Jan 2023 at 01:31, Luck, Tony  wrote:
> >
> > > Yeah, if it was ia64-only, it's a non-issue these days. It's dead and
> > > in pure maintenance mode from a kernel perspective (if even that).
> >
> > There's not much "simultaneous" in the SMT on ia64. One thread in a
> > spin loop will hog the core until the h/w switches to the other thread some
> > number of cycles (hundreds, thousands? I really can't remember). So I
> > was pretty generous with dropping cpu_relax() into any kind of spin loop.
> >
> > Is it time yet for:
> >
> > $ git rm -r arch/ia64
> >
> 
> Hi Tony,
> 
> Can I take that as an ack on [0]? The EFI subsystem has evolved
> substantially over the years, and there is really no way to do any
> IA64 testing beyond build testing, so from that perspective, dropping
> it entirely would be welcomed.

For what it's worth, Debian and Gentoo both have ia64 ports with active
users (6.1 looks like it currently fails to build in Debian due to a
minor packaging issue, but various versions of 6.0 were built and
published, and one of those is running on the one ia64 Debian builder I
personally have access to).

Jess


Re: [PATCH] modpost: support arbitrary symbol length in modversion

2023-01-13 Thread Gary Guo
On Thu, 12 Jan 2023 14:40:59 -0700
Lucas De Marchi  wrote:

> On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote:
> >
> > struct modversion_info {
> >-unsigned long crc;
> >-char name[MODULE_NAME_LEN];
> >+/* Offset of the next modversion entry in relation to this one. */
> >+u32 next;
> >+u32 crc;
> >+char name[0];  
> 
> although not really exported as uapi, this will break userspace as this is
> used in the  elf file generated for the modules. I think
> this change must be made in a backward compatible way and kmod updated
> to deal with the variable name length:
> 
> kmod $ git grep "\[64"
> libkmod/libkmod-elf.c:  char name[64 - sizeof(uint32_t)];
> libkmod/libkmod-elf.c:  char name[64 - sizeof(uint64_t)];
> 
> in kmod we have both 32 and 64 because a 64-bit kmod can read both 32
> and 64 bit module, and vice versa.
> 

Hi Lucas,

Thanks for the information.

The change can't be "truly" backward compatible, in a sense that
regardless of the new format we choose, kmod would not be able to decode
symbols longer than "64 - sizeof(long)" bytes. So the list it retrieves
is going to be incomplete, isn't it?

What kind of backward compatibility should be expected? It could be:
* short symbols can still be found by old versions of kmod, but not
  long symbols;
* or, no symbols are found by old versions of kmod, but it does not
  fail;
* or, old versions of kmod would fail gracefully for not able to
  recognise the format of __versions section, but it didn't do anything
  crazy (e.g. decode it as old format).

Also, do you think the current modversion format should stick forever
or would we be able to migrate away from it eventually and fail old
versions of modprobe given enough time?

Best,
Gary


Re: [PATCH mm-unstable v1 04/26] arm/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread Russell King (Oracle)
On Fri, Jan 13, 2023 at 06:10:04PM +0100, David Hildenbrand wrote:
> Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the
> offset. This reduces the maximum swap space per file to 64 GiB (was 128
> GiB).
> 
> While at it drop the PTE_TYPE_FAULT from __swp_entry_to_pte() which is
> defined to be 0 and is rather confusing because we should be dealing
> with "Linux PTEs" not "hardware PTEs". Also, properly mask the type in
> __swp_entry().
> 
> Cc: Russell King 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Russell King (Oracle) 

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


[PATCH mm-unstable v1 24/26] x86/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also on 32bit

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE just like we already do on
x86-64. After deciphering the PTE layout it becomes clear that there are
still unused bits for 2-level and 3-level page tables that we should be
able to use. Reusing a bit avoids stealing one bit from the swap offset.

While at it, mask the type in __swp_entry(); use some helper definitions
to make the macros easier to grasp.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: "H. Peter Anvin" 
Signed-off-by: David Hildenbrand 
---
 arch/x86/include/asm/pgtable-2level.h | 26 +-
 arch/x86/include/asm/pgtable-3level.h | 26 +++---
 arch/x86/include/asm/pgtable.h|  2 --
 3 files changed, 44 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/pgtable-2level.h 
b/arch/x86/include/asm/pgtable-2level.h
index 60d0f9015317..e9482a11ac52 100644
--- a/arch/x86/include/asm/pgtable-2level.h
+++ b/arch/x86/include/asm/pgtable-2level.h
@@ -80,21 +80,37 @@ static inline unsigned long pte_bitop(unsigned long value, 
unsigned int rightshi
return ((value >> rightshift) & mask) << leftshift;
 }
 
-/* Encode and de-code a swap entry */
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <----------------- offset ------------------> 0 E <- type --> 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ */
 #define SWP_TYPE_BITS 5
+#define _SWP_TYPE_MASK ((1U << SWP_TYPE_BITS) - 1)
+#define _SWP_TYPE_SHIFT (_PAGE_BIT_PRESENT + 1)
 #define SWP_OFFSET_SHIFT (_PAGE_BIT_PROTNONE + 1)
 
-#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS)
+#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > 5)
 
-#define __swp_type(x)  (((x).val >> (_PAGE_BIT_PRESENT + 1)) \
-& ((1U << SWP_TYPE_BITS) - 1))
+#define __swp_type(x)  (((x).val >> _SWP_TYPE_SHIFT) \
+& _SWP_TYPE_MASK)
 #define __swp_offset(x)((x).val >> SWP_OFFSET_SHIFT)
 #define __swp_entry(type, offset)  ((swp_entry_t) { \
-((type) << (_PAGE_BIT_PRESENT + 1)) \
+(((type) & _SWP_TYPE_MASK) << _SWP_TYPE_SHIFT) \
 | ((offset) << SWP_OFFSET_SHIFT) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { (pte).pte_low 
})
 #define __swp_entry_to_pte(x)  ((pte_t) { .pte = (x).val })
 
+/* We borrow bit 7 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE    _PAGE_PSE
+
 /* No inverted PFNs on 2 level page tables */
 
 static inline u64 protnone_mask(u64 val)
diff --git a/arch/x86/include/asm/pgtable-3level.h 
b/arch/x86/include/asm/pgtable-3level.h
index 967b135fa2c0..9e7c0b719c3c 100644
--- a/arch/x86/include/asm/pgtable-3level.h
+++ b/arch/x86/include/asm/pgtable-3level.h
@@ -145,8 +145,24 @@ static inline pmd_t pmdp_establish(struct vm_area_struct 
*vma,
 }
 #endif
 
-/* Encode and de-code a swap entry */
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3
+ *   3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2
+ *   < type -> <---------------------- offset ----------------------
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   --------------------------------------------> 0 E 0 0 0 0 0 0 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ */
 #define SWP_TYPE_BITS  5
+#define _SWP_TYPE_MASK ((1U << SWP_TYPE_BITS) - 1)
 
 #define SWP_OFFSET_FIRST_BIT   (_PAGE_BIT_PROTNONE + 1)
 
@@ -154,9 +170,10 @@ static inline pmd_t pmdp_establish(struct vm_area_struct 
*vma,
 #define SWP_OFFSET_SHIFT   (SWP_OFFSET_FIRST_BIT + SWP_TYPE_BITS)
 
 #define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS)
-#define __swp_type(x)  (((x).val) & ((1UL << SWP_TYPE_BITS) - 1))
+#define __swp_type(x)  (((x).val) & _SWP_TYPE_MASK)
 #define __swp_offset(x)((x).val >> SWP_TYPE_BITS)
-#define __swp_entry(type, offset)  ((swp_entry_t){(type) | (offset) << SWP_TYPE_BITS})
+#define __swp_entry(type, offset)  ((swp_entry_t){((type) & _SWP_TYPE_MASK) \
+   | (offset) << SWP_TYPE_BITS})
 
 /*
  * Normally, __swp_entry() converts from arch-independent swp_entry_t to
@@ -184,6 +201,9 @@ static inline pmd_t pmdp_establish(struct vm_area_struct 
*vma,
 #define 

[PATCH mm-unstable v1 26/26] mm: remove __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
__HAVE_ARCH_PTE_SWP_EXCLUSIVE is now supported by all architectures that
support swp PTEs, so let's drop it.

Signed-off-by: David Hildenbrand 
---
 arch/alpha/include/asm/pgtable.h |  1 -
 arch/arc/include/asm/pgtable-bits-arcv2.h|  1 -
 arch/arm/include/asm/pgtable.h   |  1 -
 arch/arm64/include/asm/pgtable.h |  1 -
 arch/csky/include/asm/pgtable.h  |  1 -
 arch/hexagon/include/asm/pgtable.h   |  1 -
 arch/ia64/include/asm/pgtable.h  |  1 -
 arch/loongarch/include/asm/pgtable.h |  1 -
 arch/m68k/include/asm/mcf_pgtable.h  |  1 -
 arch/m68k/include/asm/motorola_pgtable.h |  1 -
 arch/m68k/include/asm/sun3_pgtable.h |  1 -
 arch/microblaze/include/asm/pgtable.h|  1 -
 arch/mips/include/asm/pgtable.h  |  1 -
 arch/nios2/include/asm/pgtable.h |  1 -
 arch/openrisc/include/asm/pgtable.h  |  1 -
 arch/parisc/include/asm/pgtable.h|  1 -
 arch/powerpc/include/asm/book3s/32/pgtable.h |  1 -
 arch/powerpc/include/asm/book3s/64/pgtable.h |  1 -
 arch/powerpc/include/asm/nohash/pgtable.h|  1 -
 arch/riscv/include/asm/pgtable.h |  1 -
 arch/s390/include/asm/pgtable.h  |  1 -
 arch/sh/include/asm/pgtable_32.h |  1 -
 arch/sparc/include/asm/pgtable_32.h  |  1 -
 arch/sparc/include/asm/pgtable_64.h  |  1 -
 arch/um/include/asm/pgtable.h|  1 -
 arch/x86/include/asm/pgtable.h   |  1 -
 arch/xtensa/include/asm/pgtable.h|  1 -
 include/linux/pgtable.h  | 29 
 mm/debug_vm_pgtable.c|  2 --
 mm/memory.c  |  4 ---
 mm/rmap.c| 11 
 31 files changed, 73 deletions(-)

diff --git a/arch/alpha/include/asm/pgtable.h b/arch/alpha/include/asm/pgtable.h
index 970abf511b13..ba43cb841d19 100644
--- a/arch/alpha/include/asm/pgtable.h
+++ b/arch/alpha/include/asm/pgtable.h
@@ -328,7 +328,6 @@ extern inline pte_t mk_swap_pte(unsigned long type, 
unsigned long offset)
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
-#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
 static inline int pte_swp_exclusive(pte_t pte)
 {
return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
diff --git a/arch/arc/include/asm/pgtable-bits-arcv2.h 
b/arch/arc/include/asm/pgtable-bits-arcv2.h
index 611f412713b9..6e9f8ca6d6a1 100644
--- a/arch/arc/include/asm/pgtable-bits-arcv2.h
+++ b/arch/arc/include/asm/pgtable-bits-arcv2.h
@@ -132,7 +132,6 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned 
long address,
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
-#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
 static inline int pte_swp_exclusive(pte_t pte)
 {
return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 886c275995a2..2e626e6da9a3 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -298,7 +298,6 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(swp)__pte((swp).val)
 
-#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
 static inline int pte_swp_exclusive(pte_t pte)
 {
return pte_isset(pte, L_PTE_SWP_EXCLUSIVE);
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index b4bbeed80fb6..e0c19f5e3413 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -417,7 +417,6 @@ static inline pgprot_t mk_pmd_sect_prot(pgprot_t prot)
return __pgprot((pgprot_val(prot) & ~PMD_TABLE_BIT) | PMD_TYPE_SECT);
 }
 
-#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
 static inline pte_t pte_swp_mkexclusive(pte_t pte)
 {
return set_pte_bit(pte, __pgprot(PTE_SWP_EXCLUSIVE));
diff --git a/arch/csky/include/asm/pgtable.h b/arch/csky/include/asm/pgtable.h
index 574c97b9ecca..d4042495febc 100644
--- a/arch/csky/include/asm/pgtable.h
+++ b/arch/csky/include/asm/pgtable.h
@@ -200,7 +200,6 @@ static inline pte_t pte_mkyoung(pte_t pte)
return pte;
 }
 
-#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
 static inline int pte_swp_exclusive(pte_t pte)
 {
return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
diff --git a/arch/hexagon/include/asm/pgtable.h 
b/arch/hexagon/include/asm/pgtable.h
index 7eb008e477c8..59393613d086 100644
--- a/arch/hexagon/include/asm/pgtable.h
+++ b/arch/hexagon/include/asm/pgtable.h
@@ -397,7 +397,6 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
(((type & 0x1f) << 1) | \
 ((offset & 0x38) << 10) | ((offset & 0x7) << 7)) })
 
-#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
 static inline int pte_swp_exclusive(pte_t 

[PATCH mm-unstable v1 25/26] xtensa/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 1. This
bit should be safe to use for our usecase.

Most importantly, we can still distinguish swap PTEs from PAGE_NONE PTEs
(see pte_present()) and don't use one of the two reserved attribute
masks (1101 and 1111). Attribute masks 1100 and 1110 now identify swap PTEs.

While at it, remove SWP_TYPE_BITS (not really helpful as it's not used in
the actual swap macros) and mask the type in __swp_entry().

Cc: Chris Zankel 
Cc: Max Filippov 
Signed-off-by: David Hildenbrand 
---
 arch/xtensa/include/asm/pgtable.h | 32 ++-
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/arch/xtensa/include/asm/pgtable.h 
b/arch/xtensa/include/asm/pgtable.h
index 5b5484d707b2..1025e2dc292b 100644
--- a/arch/xtensa/include/asm/pgtable.h
+++ b/arch/xtensa/include/asm/pgtable.h
@@ -96,7 +96,7 @@
  * +- - - - - - - - - - - - - - - - - - - - -+
  *   (PAGE_NONE)|PPN| 0 | 00 | ADW | 01 | 11 | 11 |
  * +-+
- *   swap  | index |   type   | 01 | 11 | 00 |
+ *   swap  | index |   type   | 01 | 11 | e0 |
  * +-+
  *
  * For T1050 hardware and earlier the layout differs for present and 
(PAGE_NONE)
@@ -112,6 +112,7 @@
  *   RI ring (0=privileged, 1=user, 2 and 3 are unused)
  *   CAcache attribute: 00 bypass, 01 writeback, 10 
writethrough
  * (11 is invalid and used to mark pages that are not present)
+ *   e exclusive marker in swap PTEs
  *   w page is writable (hw)
  *   x page is executable (hw)
  *   index  swap offset / PAGE_SIZE (bit 11-31: 21 bits -> 8 GB)
@@ -158,6 +159,9 @@
 #define _PAGE_DIRTY    (1<<7)  /* software: page dirty */
 #define _PAGE_ACCESSED (1<<8)  /* software: page accessed (read) */
 
+/* We borrow bit 1 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE    (1<<1)
+
 #ifdef CONFIG_MMU
 
 #define _PAGE_CHG_MASK(PAGE_MASK | _PAGE_ACCESSED | _PAGE_DIRTY)
@@ -343,19 +347,37 @@ ptep_set_wrprotect(struct mm_struct *mm, unsigned long 
addr, pte_t *ptep)
 }
 
 /*
- * Encode and decode a swap and file entry.
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
  */
-#define SWP_TYPE_BITS  5
-#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS)
+#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > 5)
 
 #define __swp_type(entry)  (((entry).val >> 6) & 0x1f)
 #define __swp_offset(entry)((entry).val >> 11)
 #define __swp_entry(type,offs) \
-   ((swp_entry_t){((type) << 6) | ((offs) << 11) | \
+   ((swp_entry_t){(((type) & 0x1f) << 6) | ((offs) << 11) | \
 _PAGE_CA_INVALID | _PAGE_USER})
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 #endif /*  !defined (__ASSEMBLY__) */
 
 
-- 
2.39.0



[PATCH mm-unstable v1 23/26] um/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 10, which is
yet unused for swap PTEs.

The pte_mkuptodate() is a bit weird in __pte_to_swp_entry() for a swap PTE
... but it only messes with bits 1 and 2 and there is a comment in
set_pte(), so leave these bits alone.

While at it, mask the type in __swp_entry().

Cc: Richard Weinberger 
Cc: Anton Ivanov 
Cc: Johannes Berg 
Signed-off-by: David Hildenbrand 
---
 arch/um/include/asm/pgtable.h | 37 +--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/um/include/asm/pgtable.h b/arch/um/include/asm/pgtable.h
index 4e3052f2671a..cedc5fd451ce 100644
--- a/arch/um/include/asm/pgtable.h
+++ b/arch/um/include/asm/pgtable.h
@@ -21,6 +21,9 @@
 #define _PAGE_PROTNONE 0x010   /* if the user mapped it with PROT_NONE;
   pte_present gives true */
 
+/* We borrow bit 10 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE    0x400
+
 #ifdef CONFIG_3_LEVEL_PGTABLES
 #include 
 #else
@@ -288,16 +291,46 @@ extern pte_t *virt_to_pte(struct mm_struct *mm, unsigned 
long addr);
 
 #define update_mmu_cache(vma,address,ptep) do {} while (0)
 
-/* Encode and de-code a swap entry */
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <---------------- offset ----------------> E < type -> 0 0 0 1 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   _PAGE_NEWPAGE (bit 1) is always set to 1 in set_pte().
+ */
 #define __swp_type(x)  (((x).val >> 5) & 0x1f)
 #define __swp_offset(x)((x).val >> 11)
 
 #define __swp_entry(type, offset) \
-   ((swp_entry_t) { ((type) << 5) | ((offset) << 11) })
+   ((swp_entry_t) { (((type) & 0x1f) << 5) | ((offset) << 11) })
 #define __pte_to_swp_entry(pte) \
((swp_entry_t) { pte_val(pte_mkuptodate(pte)) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_get_bits(pte, _PAGE_SWP_EXCLUSIVE);
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_set_bits(pte, _PAGE_SWP_EXCLUSIVE);
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_clear_bits(pte, _PAGE_SWP_EXCLUSIVE);
+   return pte;
+}
+
 /* Clear a kernel PTE and flush it from the TLB */
 #define kpte_clear_flush(ptep, vaddr)  \
 do {   \
-- 
2.39.0



[PATCH mm-unstable v1 22/26] sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 64bit

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit
from the type. Generic MM currently only uses 5 bits for the type
(MAX_SWAPFILES_SHIFT), so the stolen bit was effectively unused.

While at it, mask the type in __swp_entry().

Cc: "David S. Miller" 
Signed-off-by: David Hildenbrand 
---
 arch/sparc/include/asm/pgtable_64.h | 38 ++---
 1 file changed, 35 insertions(+), 3 deletions(-)

diff --git a/arch/sparc/include/asm/pgtable_64.h 
b/arch/sparc/include/asm/pgtable_64.h
index 3bc9736bddb1..a1658eebd036 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -187,6 +187,9 @@ bool kern_addr_valid(unsigned long addr);
 #define _PAGE_SZHUGE_4U    _PAGE_SZ4MB_4U
 #define _PAGE_SZHUGE_4V    _PAGE_SZ4MB_4V
 
+/* We borrow bit 20 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE    _AC(0x0000000000100000, UL)
+
 #ifndef __ASSEMBLY__
 
 pte_t mk_pte_io(unsigned long, pgprot_t, int, unsigned long);
@@ -961,18 +964,47 @@ void pgtable_trans_huge_deposit(struct mm_struct *mm, 
pmd_t *pmdp,
 pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
 #endif
 
-/* Encode and de-code a swap entry */
-#define __swp_type(entry)  (((entry).val >> PAGE_SHIFT) & 0xffUL)
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3
+ *   3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2
+ *   <--------------------------- offset ---------------------------
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   ---------------------> E <-- type ---> <-------- zeroes -------->
+ */
+#define __swp_type(entry)  (((entry).val >> PAGE_SHIFT) & 0x7fUL)
 #define __swp_offset(entry)((entry).val >> (PAGE_SHIFT + 8UL))
 #define __swp_entry(type, offset)  \
( (swp_entry_t) \
  { \
-   (((long)(type) << PAGE_SHIFT) | \
+   ((((long)(type) & 0x7fUL) << PAGE_SHIFT) | \
  ((long)(offset) << (PAGE_SHIFT + 8UL))) \
  } )
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   return __pte(pte_val(pte) | _PAGE_SWP_EXCLUSIVE);
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   return __pte(pte_val(pte) & ~_PAGE_SWP_EXCLUSIVE);
+}
+
 int page_in_phys_avail(unsigned long paddr);
 
 /*
-- 
2.39.0



[PATCH mm-unstable v1 21/26] sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by reusing the SRMMU_DIRTY
bit as that seems to be safe to reuse inside a swap PTE. This avoids
having to steal one bit from the swap offset.

While at it, relocate the swap PTE layout documentation and use the same
style now used for most other archs. Note that the old documentation was
wrong: we use 20 bits for the offset, and the reserved bits were 8 instead
of the 7 shown in the ascii art.

Cc: "David S. Miller" 
Signed-off-by: David Hildenbrand 
---
 arch/sparc/include/asm/pgtable_32.h | 27 ++-
 arch/sparc/include/asm/pgtsrmmu.h   | 14 +++---
 2 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/arch/sparc/include/asm/pgtable_32.h 
b/arch/sparc/include/asm/pgtable_32.h
index 5acc05b572e6..abf7a2601209 100644
--- a/arch/sparc/include/asm/pgtable_32.h
+++ b/arch/sparc/include/asm/pgtable_32.h
@@ -323,7 +323,16 @@ void srmmu_mapiorange(unsigned int bus, unsigned long xpa,
   unsigned long xva, unsigned int len);
 void srmmu_unmapiorange(unsigned long virt_addr, unsigned int len);
 
-/* Encode and de-code a swap entry */
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <--------------- offset ---------------> < type -> E 0 0 0 0 0 0
+ */
 static inline unsigned long __swp_type(swp_entry_t entry)
 {
return (entry.val >> SRMMU_SWP_TYPE_SHIFT) & SRMMU_SWP_TYPE_MASK;
@@ -344,6 +353,22 @@ static inline swp_entry_t __swp_entry(unsigned long type, 
unsigned long offset)
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & SRMMU_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   return __pte(pte_val(pte) | SRMMU_SWP_EXCLUSIVE);
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   return __pte(pte_val(pte) & ~SRMMU_SWP_EXCLUSIVE);
+}
+
 static inline unsigned long
 __get_phys (unsigned long addr)
 {
diff --git a/arch/sparc/include/asm/pgtsrmmu.h 
b/arch/sparc/include/asm/pgtsrmmu.h
index 6067925972d9..18e68d43f036 100644
--- a/arch/sparc/include/asm/pgtsrmmu.h
+++ b/arch/sparc/include/asm/pgtsrmmu.h
@@ -53,21 +53,13 @@
 
 #define SRMMU_CHG_MASK    (0xffffff00 | SRMMU_REF | SRMMU_DIRTY)
 
-/* SRMMU swap entry encoding
- *
- * We use 5 bits for the type and 19 for the offset.  This gives us
- * 32 swapfiles of 4GB each.  Encoding looks like:
- *
- * ooot
- * fedcba9876543210fedcba9876543210
- *
- * The bottom 7 bits are reserved for protection and status bits, especially
- * PRESENT.
- */
+/* SRMMU swap entry encoding */
 #define SRMMU_SWP_TYPE_MASK    0x1f
 #define SRMMU_SWP_TYPE_SHIFT   7
 #define SRMMU_SWP_OFF_MASK     0xfffff
 #define SRMMU_SWP_OFF_SHIFT    (SRMMU_SWP_TYPE_SHIFT + 5)
+/* We borrow bit 6 to store the exclusive marker in swap PTEs. */
+#define SRMMU_SWP_EXCLUSIVE    SRMMU_DIRTY
 
 /* Some day I will implement true fine grained access bits for
  * user pages because the SRMMU gives us the capabilities to
-- 
2.39.0



[PATCH mm-unstable v1 20/26] sh/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 6 in the PTE,
reducing the swap type in the !CONFIG_X2TLB case to 5 bits. Generic MM
currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the
stolen bit is effectively unused.

Interestingly, the swap type in the !CONFIG_X2TLB case could currently
overlap with the _PAGE_PRESENT bit, because there is a sneaky shift by 1 in
__pte_to_swp_entry() and __swp_entry_to_pte(). Bit 0-7 in the architecture
specific swap PTE would get shifted to bit 1-8 in the PTE. As generic MM
uses 5 bits only, this didn't matter so far.

While at it, mask the type in __swp_entry().
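
The overlap is easy to demonstrate outside the kernel. A hypothetical
sketch of the !CONFIG_X2TLB conversion (the shift by 1 and the bit
positions mirror the patch; the harness itself is made up): with the old
8-bit type, type bit 7 lands exactly on _PAGE_PRESENT after the shift,
while the new 5-bit mask makes that impossible.

#include <stdio.h>

#define _PAGE_PRESENT (1UL << 8)

/* mirrors __swp_entry_to_pte() for !CONFIG_X2TLB: pte = swp.val << 1 */
static unsigned long swp_to_pte(unsigned long type, unsigned long off,
                                unsigned long type_bits)
{
        unsigned long swp = (type & ((1UL << type_bits) - 1)) | (off << 10);

        return swp << 1;
}

int main(void)
{
        /* old layout, 8 type bits: a type with bit 7 set hits _PAGE_PRESENT */
        printf("old, type 0x80: present=%d\n",
               !!(swp_to_pte(0x80, 1, 8) & _PAGE_PRESENT));     /* 1: bad */
        /* new layout, 5 type bits: the offending bit can no longer be set */
        printf("new, type 0x80: present=%d\n",
               !!(swp_to_pte(0x80, 1, 5) & _PAGE_PRESENT));     /* 0: ok */
        return 0;
}

In practice generic MM never hands out a type above 5 bits, which is why
this "didn't matter so far", as noted above.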

Cc: Yoshinori Sato 
Cc: Rich Felker 
Signed-off-by: David Hildenbrand 
---
 arch/sh/include/asm/pgtable_32.h | 54 +---
 1 file changed, 42 insertions(+), 12 deletions(-)

diff --git a/arch/sh/include/asm/pgtable_32.h b/arch/sh/include/asm/pgtable_32.h
index d0240decacca..c34aa795a9d2 100644
--- a/arch/sh/include/asm/pgtable_32.h
+++ b/arch/sh/include/asm/pgtable_32.h
@@ -423,40 +423,70 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 #endif
 
 /*
- * Encode and de-code a swap entry
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
  *
  * Constraints:
  * _PAGE_PRESENT at bit 8
  * _PAGE_PROTNONE at bit 9
  *
- * For the normal case, we encode the swap type into bits 0:7 and the
- * swap offset into bits 10:30. For the 64-bit PTE case, we keep the
- * preserved bits in the low 32-bits and use the upper 32 as the swap
- * offset (along with a 5-bit type), following the same approach as x86
- * PAE. This keeps the logic quite simple.
+ * For the normal case, we encode the swap type and offset into the swap PTE
+ * such that bits 8 and 9 stay zero. For the 64-bit PTE case, we use the
+ * upper 32 for the swap offset and swap type, following the same approach as
+ * x86 PAE. This keeps the logic quite simple.
  *
  * As is evident by the Alpha code, if we ever get a 64-bit unsigned
  * long (swp_entry_t) to match up with the 64-bit PTEs, this all becomes
  * much cleaner..
- *
- * NOTE: We should set ZEROs at the position of _PAGE_PRESENT
- *   and _PAGE_PROTNONE bits
  */
+
 #ifdef CONFIG_X2TLB
+/*
+ * Format of swap PTEs:
+ *
+ *   6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3
+ *   3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2
+ *   <- offset --> < type ->
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <--- zeroes > E 0 0 0 0 0 0
+ */
 #define __swp_type(x)  ((x).val & 0x1f)
 #define __swp_offset(x)((x).val >> 5)
-#define __swp_entry(type, offset)  ((swp_entry_t){ (type) | (offset) << 5})
+#define __swp_entry(type, offset)  ((swp_entry_t){ ((type) & 0x1f) | (offset) << 5})
#define __pte_to_swp_entry(pte)((swp_entry_t){ (pte).pte_high })
 #define __swp_entry_to_pte(x)  ((pte_t){ 0, (x).val })
 
 #else
-#define __swp_type(x)  ((x).val & 0xff)
+/*
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <--- offset > 0 0 0 0 E < type -> 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ */
+#define __swp_type(x)  ((x).val & 0x1f)
 #define __swp_offset(x)((x).val >> 10)
-#define __swp_entry(type, offset)  ((swp_entry_t){(type) | (offset) <<10})
+#define __swp_entry(type, offset)  ((swp_entry_t){((type) & 0x1f) | (offset) << 10})
 
#define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) >> 1 })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val << 1 })
 #endif
 
+/* In both cases, we borrow bit 6 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE_PAGE_USER
+
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte.pte_low & _PAGE_SWP_EXCLUSIVE;
+}
+
+PTE_BIT_FUNC(low, swp_mkexclusive, |= _PAGE_SWP_EXCLUSIVE);
+PTE_BIT_FUNC(low, swp_clear_exclusive, &= ~_PAGE_SWP_EXCLUSIVE);
+
 #endif /* __ASSEMBLY__ */
 #endif /* __ASM_SH_PGTABLE_32_H */
-- 
2.39.0



[PATCH mm-unstable v1 19/26] riscv/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit
from the offset. This reduces the maximum swap space per file: on 32bit
to 16 GiB (was 32 GiB).

Note that this bit does not conflict with swap PMDs and could also be used
in swap PMD context later.

While at it, mask the type in __swp_entry().
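
A quick hypothetical sanity check of the new constants (userspace, not
part of the patch): the borrowed _PAGE_ACCESSED bit must alias neither
_PAGE_PRESENT, _PAGE_PROT_NONE nor the relocated type field.

#include <assert.h>

#define _PAGE_PRESENT           (1UL << 0)
#define _PAGE_PROT_NONE         (1UL << 5)      /* _PAGE_GLOBAL */
#define _PAGE_SWP_EXCLUSIVE     (1UL << 6)      /* _PAGE_ACCESSED */
#define __SWP_TYPE_SHIFT        7
#define __SWP_TYPE_BITS         5
#define __SWP_TYPE_MASK         ((1UL << __SWP_TYPE_BITS) - 1)
#define __SWP_OFFSET_SHIFT      (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT)

int main(void)
{
        unsigned long type_field = __SWP_TYPE_MASK << __SWP_TYPE_SHIFT;

        assert(!(_PAGE_SWP_EXCLUSIVE & (_PAGE_PRESENT | _PAGE_PROT_NONE)));
        assert(!(_PAGE_SWP_EXCLUSIVE & type_field));
        assert(__SWP_OFFSET_SHIFT == 12);   /* offset starts above the type */
        return 0;
}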

Cc: Paul Walmsley 
Cc: Palmer Dabbelt 
Cc: Albert Ou 
Signed-off-by: David Hildenbrand 
---
 arch/riscv/include/asm/pgtable-bits.h |  3 +++
 arch/riscv/include/asm/pgtable.h  | 29 ++-
 2 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
index b9e13a8fe2b7..f896708e8331 100644
--- a/arch/riscv/include/asm/pgtable-bits.h
+++ b/arch/riscv/include/asm/pgtable-bits.h
@@ -27,6 +27,9 @@
  */
 #define _PAGE_PROT_NONE _PAGE_GLOBAL
 
+/* Used for swap PTEs only. */
+#define _PAGE_SWP_EXCLUSIVE _PAGE_ACCESSED
+
 #define _PAGE_PFN_SHIFT 10
 
 /*
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 4eba9a98d0e3..03a4728db039 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -724,16 +724,18 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
 /*
- * Encode and decode a swap entry
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
  *
  * Format of swap PTE:
  * bit0:   _PAGE_PRESENT (zero)
  * bit   1 to 3:   _PAGE_LEAF (zero)
  * bit5:   _PAGE_PROT_NONE (zero)
- * bits  6 to 10:  swap type
- * bits 10 to XLEN-1:  swap offset
+ * bit6:   exclusive marker
+ * bits  7 to 11:  swap type
+ * bits 11 to XLEN-1:  swap offset
  */
-#define __SWP_TYPE_SHIFT   6
+#define __SWP_TYPE_SHIFT   7
 #define __SWP_TYPE_BITS5
 #define __SWP_TYPE_MASK((1UL << __SWP_TYPE_BITS) - 1)
 #define __SWP_OFFSET_SHIFT (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT)
@@ -744,11 +746,28 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
 #define __swp_type(x)  (((x).val >> __SWP_TYPE_SHIFT) & __SWP_TYPE_MASK)
 #define __swp_offset(x)((x).val >> __SWP_OFFSET_SHIFT)
 #define __swp_entry(type, offset) ((swp_entry_t) \
-   { ((type) << __SWP_TYPE_SHIFT) | ((offset) << __SWP_OFFSET_SHIFT) })
+   { (((type) & __SWP_TYPE_MASK) << __SWP_TYPE_SHIFT) | \
+ ((offset) << __SWP_OFFSET_SHIFT) })
 
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   return __pte(pte_val(pte) | _PAGE_SWP_EXCLUSIVE);
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   return __pte(pte_val(pte) & ~_PAGE_SWP_EXCLUSIVE);
+}
+
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
 #define __pmd_to_swp_entry(pmd) ((swp_entry_t) { pmd_val(pmd) })
 #define __swp_entry_to_pmd(swp) __pmd((swp).val)
-- 
2.39.0



[PATCH mm-unstable v1 18/26] powerpc/nohash/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit and 64bit.

On 64bit, let's use MSB 56 (LSB 7), located right next to the page type.
On 32bit, let's use LSB 2 to avoid stealing one bit from the swap offset.

There seems to be no real reason why these bits cannot be used for swap
PTEs. The important part is that _PAGE_PRESENT and _PAGE_HASHPTE remain
0.

While at it, mask the type in __swp_entry() and remove
_PAGE_BIT_SWAP_TYPE from pte-e500.h: while it was used in 64bit code it was
ignored in 32bit code.

Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Christophe Leroy 
Signed-off-by: David Hildenbrand 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h  | 22 +
 arch/powerpc/include/asm/nohash/32/pte-40x.h  |  6 ++---
 arch/powerpc/include/asm/nohash/32/pte-44x.h  | 18 --
 arch/powerpc/include/asm/nohash/32/pte-85xx.h |  4 ++--
 arch/powerpc/include/asm/nohash/64/pgtable.h  | 24 ---
 arch/powerpc/include/asm/nohash/pgtable.h | 16 +
 arch/powerpc/include/asm/nohash/pte-e500.h|  1 -
 7 files changed, 63 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 70edad44dff6..fec56d965f00 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -360,18 +360,30 @@ static inline int pte_young(pte_t pte)
 #endif
 
 #define pmd_page(pmd)  pfn_to_page(pmd_pfn(pmd))
+
 /*
- * Encode and decode a swap entry.
- * Note that the bits we use in a PTE for representing a swap entry
- * must not include the _PAGE_PRESENT bit.
- *   -- paulus
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs (32bit PTEs):
+ *
+ * 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
+ *   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ *   <-- offset ---> < type -> E 0 0
+ *
+ * E is the exclusive marker that is not stored in swap entries.
+ *
+ * For 64bit PTEs, the offset is extended by 32bit.
  */
 #define __swp_type(entry)  ((entry).val & 0x1f)
 #define __swp_offset(entry)((entry).val >> 5)
-#define __swp_entry(type, offset)  ((swp_entry_t) { (type) | ((offset) << 5) })
+#define __swp_entry(type, offset)  ((swp_entry_t) { ((type) & 0x1f) | ((offset) << 5) })
#define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) >> 3 })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val << 3 })
 
+/* We borrow LSB 2 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE0x04
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_POWERPC_NOHASH_32_PGTABLE_H */
diff --git a/arch/powerpc/include/asm/nohash/32/pte-40x.h b/arch/powerpc/include/asm/nohash/32/pte-40x.h
index 2d3153cfc0d7..6fe46e754556 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-40x.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-40x.h
@@ -27,9 +27,9 @@
  *   of the 16 available.  Bit 24-26 of the TLB are cleared in the TLB
  *   miss handler.  Bit 27 is PAGE_USER, thus selecting the correct
  *   zone.
- * - PRESENT *must* be in the bottom two bits because swap cache
- *   entries use the top 30 bits.  Because 40x doesn't support SMP
- *   anyway, M is irrelevant so we borrow it for PAGE_PRESENT.  Bit 30
+ * - PRESENT *must* be in the bottom two bits because swap PTEs
+ *   use the top 30 bits.  Because 40x doesn't support SMP anyway, M is
+ *   irrelevant so we borrow it for PAGE_PRESENT.  Bit 30
  *   is cleared in the TLB miss handler before the TLB entry is loaded.
  * - All other bits of the PTE are loaded into TLBLO without
  *   modification, leaving us only the bits 20, 21, 24, 25, 26, 30 for
diff --git a/arch/powerpc/include/asm/nohash/32/pte-44x.h b/arch/powerpc/include/asm/nohash/32/pte-44x.h
index 78bc304f750e..b7ed13cee137 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-44x.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-44x.h
@@ -56,20 +56,10 @@
  * above bits.  Note that the bit values are CPU specific, not architecture
  * specific.
  *
- * The kernel PTE entry holds an arch-dependent swp_entry structure under
- * certain situations. In other words, in such situations some portion of
- * the PTE bits are used as a swp_entry. In the PPC implementation, the
- * 3-24th LSB are shared with swp_entry, however the 0-2nd three LSB still
- * hold protection values. That means the three protection bits are
- * reserved for both PTE and SWAP entry at the most significant three
- * LSBs.
- *
- * There are three protection bits available for SWAP entry:
- * _PAGE_PRESENT
- * _PAGE_HASHPTE (if HW has)
- *
- * So those three bits have to be inside of 0-2nd LSB of PTE.
- *
+ * The kernel PTE entry can be an ordinary PTE mapping a page or a special swap
+ * PTE. In case of a swap PTE, LSB 2-24 are used to store information 

[PATCH mm-unstable v1 17/26] powerpc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit book3s

2023-01-13 Thread David Hildenbrand
We already implemented support for 64bit book3s in commit bff9beaa2e80
("powerpc/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE for book3s")

Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also in 32bit by reusing yet
unused LSB 2 / MSB 29. There seems to be no real reason why that bit cannot
be used, and reusing it avoids having to steal one bit from the swap
offset.

While at it, mask the type in __swp_entry().
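
To illustrate why LSB 2 works here, a small hypothetical demo: the "<< 3"
conversion from swap entry to PTE keeps the three low PTE bits free, so
the marker can be set and cleared at the PTE level without ever reaching
the arch-independent swap entry.

#include <assert.h>

#define _PAGE_PRESENT           0x001
#define _PAGE_HASHPTE           0x002
#define _PAGE_SWP_EXCLUSIVE     0x004   /* the borrowed _PAGE_USER bit */

int main(void)
{
        unsigned long type = 0x1f, off = 0x123456;
        /* mirrors __swp_entry() and __swp_entry_to_pte() */
        unsigned long swp = (type & 0x1f) | (off << 5);
        unsigned long pte = swp << 3;

        assert(!(pte & (_PAGE_PRESENT | _PAGE_HASHPTE)));
        pte |= _PAGE_SWP_EXCLUSIVE;             /* pte_swp_mkexclusive() */
        assert((pte >> 3) == swp);              /* E is invisible to the entry */
        return 0;
}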

Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Christophe Leroy 
Signed-off-by: David Hildenbrand 
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 38 +---
 1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 75823f39e042..0ecb3a58f23f 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -42,6 +42,9 @@
 #define _PMD_PRESENT_MASK (PAGE_MASK)
 #define _PMD_BAD   (~PAGE_MASK)
 
+/* We borrow the _PAGE_USER bit to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE_PAGE_USER
+
 /* And here we include common definitions */
 
 #define _PAGE_KERNEL_RO0
@@ -363,17 +366,42 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma,
 #define pmd_page(pmd)  pfn_to_page(pmd_pfn(pmd))
 
 /*
- * Encode and decode a swap entry.
- * Note that the bits we use in a PTE for representing a swap entry
- * must not include the _PAGE_PRESENT bit or the _PAGE_HASHPTE bit (if used).
- *   -- paulus
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs (32bit PTEs):
+ *
+ * 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
+ *   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ *   <- offset > < type -> E H P
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   _PAGE_PRESENT (P) and __PAGE_HASHPTE (H) must be 0.
+ *
+ * For 64bit PTEs, the offset is extended by 32bit.
  */
 #define __swp_type(entry)  ((entry).val & 0x1f)
 #define __swp_offset(entry)((entry).val >> 5)
-#define __swp_entry(type, offset)  ((swp_entry_t) { (type) | ((offset) << 5) })
+#define __swp_entry(type, offset)  ((swp_entry_t) { ((type) & 0x1f) | ((offset) << 5) })
#define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) >> 3 })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val << 3 })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   return __pte(pte_val(pte) | _PAGE_SWP_EXCLUSIVE);
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   return __pte(pte_val(pte) & ~_PAGE_SWP_EXCLUSIVE);
+}
+
 /* Generic accessors to PTE bits */
static inline int pte_write(pte_t pte) { return !!(pte_val(pte) & _PAGE_RW);}
 static inline int pte_read(pte_t pte)  { return 1; }
-- 
2.39.0



[PATCH mm-unstable v1 16/26] parisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using the yet-unused
_PAGE_ACCESSED location in the swap PTE. Looking at pte_present()
and pte_none() checks, there seems to be no actual reason why we cannot
use it: we only have to make sure we're not using _PAGE_PRESENT.

Reusing this bit avoids having to steal one bit from the swap offset.
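
The split offset is the only subtle part of this layout. A hypothetical
userspace round-trip of the (unchanged) split, with the type masking the
patch adds:

#include <assert.h>

/* mirrors __swp_entry(): offset bits 2:0 sit low, the rest sit higher up */
static unsigned long enc(unsigned long type, unsigned long off)
{
        return (type & 0x1f) | ((off & 0x7) << 6) | ((off & ~0x7UL) << 8);
}

/* mirrors __swp_offset() */
static unsigned long dec_off(unsigned long v)
{
        return ((v >> 6) & 0x7) | ((v >> 8) & ~0x7UL);
}

int main(void)
{
        unsigned long off = 0x123456;
        unsigned long v = enc(0x1f, off);

        assert(dec_off(v) == off);      /* the two halves reassemble */
        assert((v & 0x1f) == 0x1f);     /* the type is independent of them */
        return 0;
}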

Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
Signed-off-by: David Hildenbrand 
---
 arch/parisc/include/asm/pgtable.h | 41 ---
 1 file changed, 38 insertions(+), 3 deletions(-)

diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
index ea357430aafe..3033bb88df34 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -218,6 +218,9 @@ extern void __update_cache(pte_t pte);
 #define _PAGE_KERNEL_RWX   (_PAGE_KERNEL_EXEC | _PAGE_WRITE)
 #define _PAGE_KERNEL   (_PAGE_KERNEL_RO | _PAGE_WRITE)
 
+/* We borrow bit 23 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE_PAGE_ACCESSED
+
 /* The pgd/pmd contains a ptr (in phys addr space); since all pgds/pmds
  * are page-aligned, we don't care about the PAGE_OFFSET bits, except
  * for a few meta-information bits, so we shift the address to be
@@ -394,17 +397,49 @@ extern void paging_init (void);
 
 #define update_mmu_cache(vms,addr,ptep) __update_cache(*ptep)
 
-/* Encode and de-code a swap entry */
-
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs (32bit):
+ *
+ * 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
+ *   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ *   <---------------- offset -----------------> P E <ofs> < type ->
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   _PAGE_PRESENT (P) must be 0.
+ *
+ *   For the 64bit version, the offset is extended by 32bit.
+ */
 #define __swp_type(x) ((x).val & 0x1f)
 #define __swp_offset(x)   ( (((x).val >> 6) &  0x7) | \
  (((x).val >> 8) & ~0x7) )
-#define __swp_entry(type, offset) ((swp_entry_t) { (type) | \
+#define __swp_entry(type, offset) ((swp_entry_t) { \
+   ((type) & 0x1f) | \
((offset &  0x7) << 6) | \
((offset & ~0x7) << 8) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
 {
pte_t pte;
-- 
2.39.0



[PATCH mm-unstable v1 15/26] openrisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit
from the type. Generic MM currently only uses 5 bits for the type
(MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused.

While at it, mask the type in __swp_entry().

Cc: Stefan Kristiansson 
Cc: Stafford Horne 
Signed-off-by: David Hildenbrand 
---
 arch/openrisc/include/asm/pgtable.h | 41 +
 1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/arch/openrisc/include/asm/pgtable.h b/arch/openrisc/include/asm/pgtable.h
index 6477c17b3062..903b32d662ab 100644
--- a/arch/openrisc/include/asm/pgtable.h
+++ b/arch/openrisc/include/asm/pgtable.h
@@ -154,6 +154,9 @@ extern void paging_init(void);
 #define _KERNPG_TABLE \
(_PAGE_BASE | _PAGE_SRE | _PAGE_SWE | _PAGE_ACCESSED | _PAGE_DIRTY)
 
+/* We borrow bit 11 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE_PAGE_U_SHARED
+
 #define PAGE_NONE   __pgprot(_PAGE_ALL)
 #define PAGE_READONLY   __pgprot(_PAGE_ALL | _PAGE_URE | _PAGE_SRE)
#define PAGE_READONLY_X __pgprot(_PAGE_ALL | _PAGE_URE | _PAGE_SRE | _PAGE_EXEC)
@@ -385,16 +388,44 @@ static inline void update_mmu_cache(struct vm_area_struct *vma,
 
 /* __PHX__ FIXME, SWAP, this probably doesn't work */
 
-/* Encode and de-code a swap entry (must be !pte_none(e) && !pte_present(e)) */
-/* Since the PAGE_PRESENT bit is bit 4, we can use the bits above */
-
-#define __swp_type(x)  (((x).val >> 5) & 0x7f)
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <-- offset ---> E <- type --> 0 0 0 0 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   The zero'ed bits include _PAGE_PRESENT.
+ */
+#define __swp_type(x)  (((x).val >> 5) & 0x3f)
 #define __swp_offset(x)((x).val >> 12)
 #define __swp_entry(type, offset) \
-   ((swp_entry_t) { ((type) << 5) | ((offset) << 12) })
+   ((swp_entry_t) { (((type) & 0x3f) << 5) | ((offset) << 12) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 typedef pte_t *pte_addr_t;
 
 #endif /* __ASSEMBLY__ */
-- 
2.39.0



[PATCH mm-unstable v1 14/26] nios2/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using the yet-unused bit
31.

Cc: Thomas Bogendoerfer 
Signed-off-by: David Hildenbrand 
---
 arch/nios2/include/asm/pgtable-bits.h |  3 +++
 arch/nios2/include/asm/pgtable.h  | 22 +-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/nios2/include/asm/pgtable-bits.h b/arch/nios2/include/asm/pgtable-bits.h
index bfddff383e89..724f9b08b1d1 100644
--- a/arch/nios2/include/asm/pgtable-bits.h
+++ b/arch/nios2/include/asm/pgtable-bits.h
@@ -31,4 +31,7 @@
 #define _PAGE_ACCESSED (1<<26) /* page referenced */
 #define _PAGE_DIRTY(1<<27) /* dirty page */
 
+/* We borrow bit 31 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE(1<<31)
+
 #endif /* _ASM_NIOS2_PGTABLE_BITS_H */
diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2/include/asm/pgtable.h
index d1e5c9eb4643..05999da01731 100644
--- a/arch/nios2/include/asm/pgtable.h
+++ b/arch/nios2/include/asm/pgtable.h
@@ -239,7 +239,9 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
  *
  *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
  *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
- *   0 < type -> 0 0 0 0 0 0 <-- offset --->
+ *   E < type -> 0 0 0 0 0 0 <-- offset --->
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
  *
  * Note that the offset field is always non-zero if the swap type is 0, thus
  * !pte_none() is always true.
@@ -251,6 +253,24 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 #define __swp_entry_to_pte(swp)((pte_t) { (swp).val })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 extern void __init paging_init(void);
 extern void __init mmu_init(void);
 
-- 
2.39.0



[PATCH mm-unstable v1 13/26] nios2/mm: refactor swap PTE layout

2023-01-13 Thread David Hildenbrand
nios2 disables swap for a good reason: it doesn't even provide
sufficient type bits as required by core MM. However, swap entries are
nowadays also used for other purposes (migration entries,
PTE markers, HWPoison, ...), and accidental use could be problematic.

Let's properly use 5 bits for the swap type and document the layout.
Bits 26--31 should get ignored by hardware completely, so they can be
used.
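
A hypothetical userspace mirror of the new encoding, checking the
properties the comment relies on:

#include <assert.h>

/* type at bits 26..30, offset at bits 0..19, as in the hunk below */
static unsigned long swp_entry(unsigned long type, unsigned long off)
{
        return ((type & 0x1f) << 26) | (off & 0xfffff);
}

int main(void)
{
        /* 5 type bits now cover what generic MM needs (MAX_SWAPFILES_SHIFT) */
        assert(((swp_entry(0x1f, 0) >> 26) & 0x1f) == 0x1f);
        /* a type-0 entry stays !pte_none() because its offset is non-zero */
        assert(swp_entry(0, 1) != 0);
        /* bits 20..25 stay zero, matching "ignored by hardware" above */
        assert(!(swp_entry(0x1f, 0xfffff) & 0x03f00000));
        return 0;
}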

Cc: Dinh Nguyen 
Signed-off-by: David Hildenbrand 
---
 arch/nios2/include/asm/pgtable.h | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2/include/asm/pgtable.h
index ab793bc517f5..d1e5c9eb4643 100644
--- a/arch/nios2/include/asm/pgtable.h
+++ b/arch/nios2/include/asm/pgtable.h
@@ -232,19 +232,21 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
__FILE__, __LINE__, pgd_val(e))
 
 /*
- * Encode and decode a swap entry (must be !pte_none(pte) && !pte_present(pte):
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
  *
- * 31 30 29 28 27 26 25 24 23 22 21 20 19 18 ...  1  0
- *  0  0  0  0 type.  0  0  0  0  0  0 offset.
+ * Format of swap PTEs:
  *
- * This gives us up to 2**2 = 4 swap files and 2**20 * 4K = 4G per swap file.
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   0 < type -> 0 0 0 0 0 0 <-- offset --->
  *
- * Note that the offset field is always non-zero, thus !pte_none(pte) is always
- * true.
+ * Note that the offset field is always non-zero if the swap type is 0, thus
+ * !pte_none() is always true.
  */
-#define __swp_type(swp)(((swp).val >> 26) & 0x3)
+#define __swp_type(swp)(((swp).val >> 26) & 0x1f)
#define __swp_offset(swp)  ((swp).val & 0xfffff)
-#define __swp_entry(type, off) ((swp_entry_t) { (((type) & 0x3) << 26) \
-                                | ((off) & 0xfffff) })
+#define __swp_entry(type, off) ((swp_entry_t) { (((type) & 0x1f) << 26) \
+                                | ((off) & 0xfffff) })
 #define __swp_entry_to_pte(swp)((pte_t) { (swp).val })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
-- 
2.39.0



[PATCH mm-unstable v1 12/26] mips/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE.

On 64bit, steal one bit from the type. Generic MM currently only uses 5
bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively
unused.

On 32bit we're able to locate unused bits. As the PTE layout for 32 bit
is very confusing, document it a bit better.

While at it, mask the type in __swp_entry()/mk_swap_pte().
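
For the XPA case, a hypothetical sketch of how the two 32-bit halves
interact (struct layout and constants mirrored from the hunks below): the
swap entry lives entirely in pte_high, so the marker in pte_low can be
toggled without disturbing type or offset.

#include <assert.h>

struct pte {
        unsigned int pte_low;
        unsigned int pte_high;
};

#define _PAGE_SWP_EXCLUSIVE (1u << 25)  /* bit 57 of the 64-bit PTE */

int main(void)
{
        unsigned int type = 0x1f, off = 0x12345;
        struct pte pte = {
                /* mirrors __swp_entry_to_pte(): the low word stays zero */
                .pte_low  = 0,
                .pte_high = ((type & 0x1f) << 4) | (off << 9),
        };

        pte.pte_low |= _PAGE_SWP_EXCLUSIVE;     /* pte_swp_mkexclusive() */
        /* __pte_to_swp_entry() only reads pte_high, so nothing leaks */
        assert(((pte.pte_high >> 4) & 0x1f) == type);
        assert((pte.pte_high >> 9) == off);
        return 0;
}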

Cc: Thomas Bogendoerfer 
Signed-off-by: David Hildenbrand 
---
 arch/mips/include/asm/pgtable-32.h | 88 ++
 arch/mips/include/asm/pgtable-64.h | 23 ++--
 arch/mips/include/asm/pgtable.h| 36 
 3 files changed, 131 insertions(+), 16 deletions(-)

diff --git a/arch/mips/include/asm/pgtable-32.h b/arch/mips/include/asm/pgtable-32.h
index b40a0e69fccc..ba0016709a1a 100644
--- a/arch/mips/include/asm/pgtable-32.h
+++ b/arch/mips/include/asm/pgtable-32.h
@@ -191,49 +191,113 @@ static inline pte_t pfn_pte(unsigned long pfn, pgprot_t prot)
 
 #define pte_page(x)pfn_to_page(pte_pfn(x))
 
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ */
 #if defined(CONFIG_CPU_R3K_TLB)
 
-/* Swap entries must have VALID bit cleared. */
+/*
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <--- offset > < type -> V G E 0 0 0 0 0 0 P
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   _PAGE_PRESENT (P), _PAGE_VALID (V) and _PAGE_GLOBAL (G) have to remain
+ *   unused.
+ */
 #define __swp_type(x)  (((x).val >> 10) & 0x1f)
 #define __swp_offset(x)((x).val >> 15)
-#define __swp_entry(type,offset)   ((swp_entry_t) { ((type) << 10) | ((offset) << 15) })
+#define __swp_entry(type, offset)  ((swp_entry_t) { (((type) & 0x1f) << 10) | ((offset) << 15) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+/* We borrow bit 7 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE(1 << 7)
+
 #else
 
 #if defined(CONFIG_XPA)
 
-/* Swap entries must have VALID and GLOBAL bits cleared. */
+/*
+ * Format of swap PTEs:
+ *
+ *   6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3
+ *   3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2
+ *   0 0 0 0 0 0 E P <-- zeroes --->
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <- offset --> < type -> V G 0 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   _PAGE_PRESENT (P), _PAGE_VALID (V) and _PAGE_GLOBAL (G) have to remain
+ *   unused.
+ */
 #define __swp_type(x)  (((x).val >> 4) & 0x1f)
 #define __swp_offset(x) ((x).val >> 9)
-#define __swp_entry(type,offset)   ((swp_entry_t)  { ((type) << 4) | ((offset) << 9) })
+#define __swp_entry(type, offset)  ((swp_entry_t)  { (((type) & 0x1f) << 4) | ((offset) << 9) })
#define __pte_to_swp_entry(pte)((swp_entry_t) { (pte).pte_high })
 #define __swp_entry_to_pte(x)  ((pte_t) { 0, (x).val })
 
+/*
+ * We borrow bit 57 (bit 25 in the low PTE) to store the exclusive marker in
+ * swap PTEs.
+ */
+#define _PAGE_SWP_EXCLUSIVE(1 << 25)
+
 #elif defined(CONFIG_PHYS_ADDR_T_64BIT) && defined(CONFIG_CPU_MIPS32)
 
-/* Swap entries must have VALID and GLOBAL bits cleared. */
+/*
+ * Format of swap PTEs:
+ *
+ *   6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3
+ *   3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2
+ *   <-- zeroes ---> E P 0 0 0 0 0 0
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <--- offset > < type -> V G
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   _PAGE_PRESENT (P), _PAGE_VALID (V) and _PAGE_GLOBAL (G) have to remain
+ *   unused.
+ */
 #define __swp_type(x)  (((x).val >> 2) & 0x1f)
 #define __swp_offset(x) ((x).val >> 7)
-#define __swp_entry(type, offset)  ((swp_entry_t)  { ((type) << 2) | ((offset) << 7) })
+#define __swp_entry(type, offset)  ((swp_entry_t)  { (((type) & 0x1f) << 2) | ((offset) << 7) })
#define __pte_to_swp_entry(pte)((swp_entry_t) { (pte).pte_high })
 #define __swp_entry_to_pte(x)  ((pte_t) { 0, (x).val })
 
+/*
+ * We borrow bit 39 (bit 7 in the low PTE) to store the exclusive marker in swap
+ * PTEs.
+ */
+#define _PAGE_SWP_EXCLUSIVE(1 << 7)
+
 #else
 /*
- * Constraints:
- *  _PAGE_PRESENT at bit 0
- *  _PAGE_MODIFIED at bit 4
- *  

[PATCH mm-unstable v1 11/26] microblaze/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit
from the type. Generic MM currently only uses 5 bits for the type
(MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused.

The shift by 2 when converting between PTE and arch-specific swap entry
makes the swap PTE layout a little bit harder to decipher.

While at it, drop the comment from paulus---copy-and-paste leftover
from powerpc where we actually have _PAGE_HASHPTE---and mask the type in
__swp_entry_to_pte() as well.

Cc: Michal Simek 
Signed-off-by: David Hildenbrand 
---
 arch/m68k/include/asm/mcf_pgtable.h   |  4 +--
 arch/microblaze/include/asm/pgtable.h | 45 +--
 2 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/arch/m68k/include/asm/mcf_pgtable.h b/arch/m68k/include/asm/mcf_pgtable.h
index 3f8f4d0e66dd..e573d7b649f7 100644
--- a/arch/m68k/include/asm/mcf_pgtable.h
+++ b/arch/m68k/include/asm/mcf_pgtable.h
@@ -46,8 +46,8 @@
 #define _CACHEMASK040  (~0x060)
#define _PAGE_GLOBAL040        0x400   /* 68040 global bit, used for kva descs */
 
-/* We borrow bit 7 to store the exclusive marker in swap PTEs. */
-#define _PAGE_SWP_EXCLUSIVE0x080
+/* We borrow bit 24 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVECF_PAGE_NOCACHE
 
 /*
  * Externally used page protection values.
diff --git a/arch/microblaze/include/asm/pgtable.h b/arch/microblaze/include/asm/pgtable.h
index 42f5988e998b..7e3de54bf426 100644
--- a/arch/microblaze/include/asm/pgtable.h
+++ b/arch/microblaze/include/asm/pgtable.h
@@ -131,10 +131,10 @@ extern pte_t *va_to_pte(unsigned long address);
  * of the 16 available.  Bit 24-26 of the TLB are cleared in the TLB
  * miss handler.  Bit 27 is PAGE_USER, thus selecting the correct
  * zone.
- * - PRESENT *must* be in the bottom two bits because swap cache
- * entries use the top 30 bits.  Because 4xx doesn't support SMP
- * anyway, M is irrelevant so we borrow it for PAGE_PRESENT.  Bit 30
- * is cleared in the TLB miss handler before the TLB entry is loaded.
+ * - PRESENT *must* be in the bottom two bits because swap PTEs use the top
+ * 30 bits.  Because 4xx doesn't support SMP anyway, M is irrelevant so we
+ * borrow it for PAGE_PRESENT.  Bit 30 is cleared in the TLB miss handler
+ * before the TLB entry is loaded.
  * - All other bits of the PTE are loaded into TLBLO without
  *  * modification, leaving us only the bits 20, 21, 24, 25, 26, 30 for
  * software PTE bits.  We actually use bits 21, 24, 25, and
@@ -155,6 +155,9 @@ extern pte_t *va_to_pte(unsigned long address);
 #define _PAGE_ACCESSED 0x400   /* software: R: page referenced */
 #define _PMD_PRESENT   PAGE_MASK
 
+/* We borrow bit 24 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE_PAGE_DIRTY
+
 /*
  * Some bits are unused...
  */
@@ -393,18 +396,40 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
 
 /*
- * Encode and decode a swap entry.
- * Note that the bits we use in a PTE for representing a swap entry
- * must not include the _PAGE_PRESENT bit, or the _PAGE_HASHPTE bit
- * (if used).  -- paulus
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
+ *   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ *   <-- offset ---> E < type -> 0 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
  */
-#define __swp_type(entry)  ((entry).val & 0x3f)
+#define __swp_type(entry)  ((entry).val & 0x1f)
 #define __swp_offset(entry)((entry).val >> 6)
 #define __swp_entry(type, offset) \
-   ((swp_entry_t) { (type) | ((offset) << 6) })
+   ((swp_entry_t) { ((type) & 0x1f) | ((offset) << 6) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) >> 2 })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val << 2 })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 extern unsigned long iopa(unsigned long addr);
 
 /* Values for nocacheflag and cmode */
-- 
2.39.0



[PATCH mm-unstable v1 10/26] m68k/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit
from the type. Generic MM currently only uses 5 bits for the type
(MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused.

While at it, make sure for sun3 that the valid bit never gets set by
properly masking it off and mask the type in __swp_entry().

Cc: Geert Uytterhoeven 
Cc: Greg Ungerer 
Signed-off-by: David Hildenbrand 
---
 arch/m68k/include/asm/mcf_pgtable.h  | 36 --
 arch/m68k/include/asm/motorola_pgtable.h | 38 +--
 arch/m68k/include/asm/sun3_pgtable.h | 39 ++--
 3 files changed, 104 insertions(+), 9 deletions(-)

diff --git a/arch/m68k/include/asm/mcf_pgtable.h b/arch/m68k/include/asm/mcf_pgtable.h
index b619b22823f8..3f8f4d0e66dd 100644
--- a/arch/m68k/include/asm/mcf_pgtable.h
+++ b/arch/m68k/include/asm/mcf_pgtable.h
@@ -46,6 +46,9 @@
 #define _CACHEMASK040  (~0x060)
#define _PAGE_GLOBAL040        0x400   /* 68040 global bit, used for kva descs */
 
+/* We borrow bit 7 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE0x080
+
 /*
  * Externally used page protection values.
  */
@@ -254,15 +257,42 @@ static inline pte_t pte_mkcache(pte_t pte)
 extern pgd_t kernel_pg_dir[PTRS_PER_PGD];
 
 /*
- * Encode and de-code a swap entry (must be !pte_none(e) && !pte_present(e))
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <-- offset -> 0 0 0 E <-- type --->
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
  */
-#define __swp_type(x)  ((x).val & 0xFF)
+#define __swp_type(x)  ((x).val & 0x7f)
 #define __swp_offset(x)((x).val >> 11)
-#define __swp_entry(typ, off)  ((swp_entry_t) { (typ) | \
-                               (off << 11) })
+#define __swp_entry(typ, off)  ((swp_entry_t) { ((typ) & 0x7f) | \
+                               (off << 11) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  (__pte((x).val))
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 #define pmd_pfn(pmd)   (pmd_val(pmd) >> PAGE_SHIFT)
 #define pmd_page(pmd)  (pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT))
 
diff --git a/arch/m68k/include/asm/motorola_pgtable.h b/arch/m68k/include/asm/motorola_pgtable.h
index 562b54e09850..c1782563e793 100644
--- a/arch/m68k/include/asm/motorola_pgtable.h
+++ b/arch/m68k/include/asm/motorola_pgtable.h
@@ -41,6 +41,9 @@
 
 #define _PAGE_PROTNONE 0x004
 
+/* We borrow bit 11 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE0x800
+
 #ifndef __ASSEMBLY__
 
 /* This is the cache mode to be used for pages containing page descriptors for
@@ -169,12 +172,41 @@ static inline pte_t pte_mkcache(pte_t pte)
 #define swapper_pg_dir kernel_pg_dir
 extern pgd_t kernel_pg_dir[128];
 
-/* Encode and de-code a swap entry (must be !pte_none(e) && !pte_present(e)) */
-#define __swp_type(x)  (((x).val >> 4) & 0xff)
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <- offset > E <-- type ---> 0 0 0 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ */
+#define __swp_type(x)  (((x).val >> 4) & 0x7f)
 #define __swp_offset(x)((x).val >> 12)
-#define __swp_entry(type, offset) ((swp_entry_t) { ((type) << 4) | ((offset) << 12) })
+#define __swp_entry(type, offset) ((swp_entry_t) { (((type) & 0x7f) << 4) | ((offset) << 12) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 #endif /* !__ASSEMBLY__ */
 #endif /* _MOTOROLA_PGTABLE_H */
diff --git a/arch/m68k/include/asm/sun3_pgtable.h b/arch/m68k/include/asm/sun3_pgtable.h
index 

[PATCH mm-unstable v1 09/26] m68k/mm: remove dummy __swp definitions for nommu

2023-01-13 Thread David Hildenbrand
The definitions are not required, let's remove them.

Cc: Geert Uytterhoeven 
Cc: Greg Ungerer 
Signed-off-by: David Hildenbrand 
---
 arch/m68k/include/asm/pgtable_no.h | 6 --
 1 file changed, 6 deletions(-)

diff --git a/arch/m68k/include/asm/pgtable_no.h b/arch/m68k/include/asm/pgtable_no.h
index fed58da3a6b6..fc044df52b96 100644
--- a/arch/m68k/include/asm/pgtable_no.h
+++ b/arch/m68k/include/asm/pgtable_no.h
@@ -31,12 +31,6 @@
 extern void paging_init(void);
 #define swapper_pg_dir ((pgd_t *) 0)
 
-#define __swp_type(x)  (0)
-#define __swp_offset(x)(0)
-#define __swp_entry(typ,off)   ((swp_entry_t) { ((typ) | ((off) << 7)) })
-#define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
-#define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
-
 /*
  * ZERO_PAGE is a global shared page that is always zero: used
  * for zero-mapped memory areas etc..
-- 
2.39.0



[PATCH mm-unstable v1 08/26] loongarch/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit
from the type. Generic MM currently only uses 5 bits for the type
(MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused.

While at it, also mask the type in mk_swap_pte().

Note that this bit does not conflict with swap PMDs and could also be used
in swap PMD context later.
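
Why the new masking in mk_swap_pte() matters can be shown in a few lines
(a hypothetical userspace mirror): an unmasked type with bit 7 set would
land exactly on the bit that now serves as the marker.

#include <assert.h>

#define _PAGE_SWP_EXCLUSIVE (1UL << 23)

/* old encoding: unmasked type at bits 16..23 */
static unsigned long mk_swap_pte_old(unsigned long type, unsigned long off)
{
        return (type << 16) | (off << 24);
}

/* new encoding: 7-bit type at bits 16..22, bit 23 is the marker */
static unsigned long mk_swap_pte_new(unsigned long type, unsigned long off)
{
        return ((type & 0x7f) << 16) | (off << 24);
}

int main(void)
{
        /* a (theoretical) type 0x80 would have collided with the marker... */
        assert(mk_swap_pte_old(0x80, 0) & _PAGE_SWP_EXCLUSIVE);
        /* ...which the masking makes impossible */
        assert(!(mk_swap_pte_new(0x80, 0) & _PAGE_SWP_EXCLUSIVE));
        return 0;
}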

Cc: Huacai Chen 
Cc: WANG Xuerui 
Signed-off-by: David Hildenbrand 
---
 arch/loongarch/include/asm/pgtable-bits.h |  4 +++
 arch/loongarch/include/asm/pgtable.h  | 39 ---
 2 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/arch/loongarch/include/asm/pgtable-bits.h b/arch/loongarch/include/asm/pgtable-bits.h
index 3d1e0a69975a..8b98d22a145b 100644
--- a/arch/loongarch/include/asm/pgtable-bits.h
+++ b/arch/loongarch/include/asm/pgtable-bits.h
@@ -20,6 +20,7 @@
 #define_PAGE_SPECIAL_SHIFT 11
 #define_PAGE_HGLOBAL_SHIFT 12 /* HGlobal is a PMD bit */
 #define_PAGE_PFN_SHIFT 12
+#define_PAGE_SWP_EXCLUSIVE_SHIFT 23
 #define_PAGE_PFN_END_SHIFT 48
 #define_PAGE_NO_READ_SHIFT 61
 #define_PAGE_NO_EXEC_SHIFT 62
@@ -33,6 +34,9 @@
 #define _PAGE_PROTNONE (_ULCAST_(1) << _PAGE_PROTNONE_SHIFT)
 #define _PAGE_SPECIAL  (_ULCAST_(1) << _PAGE_SPECIAL_SHIFT)
 
+/* We borrow bit 23 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE(_ULCAST_(1) << _PAGE_SWP_EXCLUSIVE_SHIFT)
+
 /* Used by TLB hardware (placed in EntryLo*) */
 #define _PAGE_VALID(_ULCAST_(1) << _PAGE_VALID_SHIFT)
 #define _PAGE_DIRTY(_ULCAST_(1) << _PAGE_DIRTY_SHIFT)
diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
index 7a34e900d8c1..c6b8fe7ac43c 100644
--- a/arch/loongarch/include/asm/pgtable.h
+++ b/arch/loongarch/include/asm/pgtable.h
@@ -249,13 +249,26 @@ extern void pud_init(void *addr);
 extern void pmd_init(void *addr);
 
 /*
- * Non-present pages:  high 40 bits are offset, next 8 bits type,
- * low 16 bits zero.
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3
+ *   3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2
+ *   <--- offset ---
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   --> E <--- type ---> <-- zeroes -->
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   The zero'ed bits include _PAGE_PRESENT and _PAGE_PROTNONE.
  */
 static inline pte_t mk_swap_pte(unsigned long type, unsigned long offset)
-{ pte_t pte; pte_val(pte) = (type << 16) | (offset << 24); return pte; }
+{ pte_t pte; pte_val(pte) = ((type & 0x7f) << 16) | (offset << 24); return pte; }
 
-#define __swp_type(x)  (((x).val >> 16) & 0xff)
+#define __swp_type(x)  (((x).val >> 16) & 0x7f)
 #define __swp_offset(x)((x).val >> 24)
#define __swp_entry(type, offset) ((swp_entry_t) { pte_val(mk_swap_pte((type), (offset))) })
 #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) })
@@ -263,6 +276,24 @@ static inline pte_t mk_swap_pte(unsigned long type, unsigned long offset)
 #define __pmd_to_swp_entry(pmd) ((swp_entry_t) { pmd_val(pmd) })
 #define __swp_entry_to_pmd(x)  ((pmd_t) { (x).val | _PAGE_HUGE })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 extern void paging_init(void);
 
 #define pte_none(pte)  (!(pte_val(pte) & ~_PAGE_GLOBAL))
-- 
2.39.0



[PATCH mm-unstable v1 07/26] ia64/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit
from the type. Generic MM currently only uses 5 bits for the type
(MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused.

While at it, also mask the type in __swp_entry().
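
A hypothetical userspace mirror of the new macros; note how bit 7 stays
clear for every type/offset combination, so the marker aliases neither
field:

#include <assert.h>

#define _PAGE_SWP_EXCLUSIVE (1ULL << 7)

int main(void)
{
        unsigned long long type = 0x3f, off = 0x123456789abULL;
        /* mirrors __swp_entry(): type at bits 1..6, offset from bit 8 */
        unsigned long long v = ((type & 0x3f) << 1) | (off << 8);

        assert(((v >> 1) & 0x3f) == type);      /* __swp_type() */
        assert(((v << 1) >> 9) == off);         /* __swp_offset() */
        assert(!(v & 1));                       /* present bit stays zero */
        assert(!(v & _PAGE_SWP_EXCLUSIVE));     /* bit 7 is free for E */
        return 0;
}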

Signed-off-by: David Hildenbrand 
---
 arch/ia64/include/asm/pgtable.h | 32 +---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/arch/ia64/include/asm/pgtable.h b/arch/ia64/include/asm/pgtable.h
index 01517a5e6778..e4b8ab931399 100644
--- a/arch/ia64/include/asm/pgtable.h
+++ b/arch/ia64/include/asm/pgtable.h
@@ -58,6 +58,9 @@
 #define _PAGE_ED   (__IA64_UL(1) << 52)/* exception deferral */
 #define _PAGE_PROTNONE (__IA64_UL(1) << 63)
 
+/* We borrow bit 7 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE(1 << 7)
+
 #define _PFN_MASK  _PAGE_PPN_MASK
/* Mask of bits which may be changed by pte_modify(); the odd bits are there for _PAGE_PROTNONE */
#define _PAGE_CHG_MASK (_PAGE_P | _PAGE_PROTNONE | _PAGE_PL_MASK | _PAGE_AR_MASK | _PAGE_ED)
@@ -399,6 +402,9 @@ extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
 extern void paging_init (void);
 
 /*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
 * Note: The macros below rely on the fact that MAX_SWAPFILES_SHIFT <= number of
  *  bits in the swap-type field of the swap pte.  It would be nice to
 *  enforce that, but we can't easily include <linux/swap.h> here.
@@ -406,16 +412,36 @@ extern void paging_init (void);
  *
  * Format of swap pte:
  * bit   0   : present bit (must be zero)
- * bits  1- 7: swap-type
+ * bits  1- 6: swap type
+ * bit   7   : exclusive marker
  * bits  8-62: swap offset
  * bit  63   : _PAGE_PROTNONE bit
  */
-#define __swp_type(entry)  (((entry).val >> 1) & 0x7f)
+#define __swp_type(entry)  (((entry).val >> 1) & 0x3f)
 #define __swp_offset(entry)(((entry).val << 1) >> 9)
-#define __swp_entry(type,offset)   ((swp_entry_t) { ((type) << 1) | ((long) (offset) << 8) })
+#define __swp_entry(type, offset)  ((swp_entry_t) { ((type & 0x3f) << 1) | \
+                                                    ((long) (offset) << 8) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 /*
  * ZERO_PAGE is a global shared page that is always zero: used
  * for zero-mapped memory areas etc..
-- 
2.39.0



[PATCH mm-unstable v1 06/26] hexagon/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the
offset. This reduces the maximum swap space per file to 16 GiB (was 32
GiB).

While at it, mask the type in __swp_entry().
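
A hypothetical round-trip of the new split-offset encoding, using the
same masks as the hunk below:

#include <assert.h>

static unsigned long swp_entry(unsigned long type, unsigned long off)
{
        return ((type & 0x1f) << 1) |
               ((off & 0x3ffff8) << 10) | ((off & 0x7) << 7);
}

static unsigned long swp_offset(unsigned long v)
{
        return ((v >> 7) & 0x7) | ((v >> 10) & 0x3ffff8);
}

int main(void)
{
        unsigned long off = 0x1fffff;           /* 21 offset bits now */

        assert(swp_offset(swp_entry(0x1f, off)) == off);
        /* bits 10..12 (effectively _PAGE_PROTNONE) stay zero */
        assert(!(swp_entry(0x1f, off) & 0x1c00));
        return 0;
}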

Cc: Brian Cain 
Signed-off-by: David Hildenbrand 
---
 arch/hexagon/include/asm/pgtable.h | 37 +-
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/arch/hexagon/include/asm/pgtable.h b/arch/hexagon/include/asm/pgtable.h
index f7048c18b6f9..7eb008e477c8 100644
--- a/arch/hexagon/include/asm/pgtable.h
+++ b/arch/hexagon/include/asm/pgtable.h
@@ -61,6 +61,9 @@ extern unsigned long empty_zero_page;
  * So we'll put up with a bit of inefficiency for now...
  */
 
+/* We borrow bit 6 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE(1<<6)
+
 /*
  * Top "FOURTH" level (pgd), which for the Hexagon VM is really
  * only the second from the bottom, pgd and pud both being collapsed.
@@ -359,9 +362,12 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 #define ZERO_PAGE(vaddr) (virt_to_page(_zero_page))
 
 /*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
  * Swap/file PTE definitions.  If _PAGE_PRESENT is zero, the rest of the PTE is
  * interpreted as swap information.  The remaining free bits are interpreted as
- * swap type/offset tuple.  Rather than have the TLB fill handler test
+ * listed below.  Rather than have the TLB fill handler test
  * _PAGE_PRESENT, we're going to reserve the permissions bits and set them to
  * all zeros for swap entries, which speeds up the miss handler at the cost of
  * 3 bits of offset.  That trade-off can be revisited if necessary, but Hexagon
@@ -371,9 +377,10 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
  * Format of swap PTE:
  * bit 0:  Present (zero)
  * bits1-5:swap type (arch independent layer uses 5 bits max)
- * bits6-9:bits 3:0 of offset
+ * bit 6:  exclusive marker
+ * bits7-9:bits 2:0 of offset
  * bits10-12:  effectively _PAGE_PROTNONE (all zero)
- * bits13-31:  bits 22:4 of swap offset
+ * bits13-31:  bits 21:3 of swap offset
  *
  * The split offset makes some of the following macros a little gnarly,
  * but there's plenty of precedent for this sort of thing.
@@ -383,11 +390,29 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 #define __swp_type(swp_pte)(((swp_pte).val >> 1) & 0x1f)
 
 #define __swp_offset(swp_pte) \
-   ((((swp_pte).val >> 6) & 0xf) | (((swp_pte).val >> 9) & 0x7ffff0))
+   ((((swp_pte).val >> 7) & 0x7) | (((swp_pte).val >> 10) & 0x3ffff8))
 
 #define __swp_entry(type, offset) \
((swp_entry_t)  { \
-   ((type << 1) | \
-    ((offset & 0x7ffff0) << 9) | ((offset & 0xf) << 6)) })
+   (((type & 0x1f) << 1) | \
+    ((offset & 0x3ffff8) << 10) | ((offset & 0x7) << 7)) })
+
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
 
 #endif
-- 
2.39.0



[PATCH mm-unstable v1 05/26] csky/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the
offset. This reduces the maximum swap space per file to 16 GiB (was 32
GiB).

We might actually be able to reuse one of the other software bits
(_PAGE_READ / _PAGE_WRITE) instead, because we only have to keep
pte_present(), pte_none() and HW happy. For now, let's keep it simple
because there might be something non-obvious.
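
The abiv1 type field is itself split around the hardware bits. A
hypothetical userspace round-trip that exercises all 32 types:

#include <assert.h>

/* mirrors __swp_type()/__swp_entry(): type[3:0] at bits 2..5, type[4] at
 * bit 8 */
static unsigned long enc_type(unsigned long type)
{
        return ((type & 0xf) << 2) | ((type & 0x10) << 4);
}

static unsigned long dec_type(unsigned long v)
{
        return ((v >> 2) & 0xf) | ((v >> 4) & 0x10);
}

int main(void)
{
        unsigned long t;

        for (t = 0; t < 32; t++)
                assert(dec_type(enc_type(t)) == t);
        /* bits 6 and 7 (_PAGE_GLOBAL/_PAGE_VALID on abiv1) stay clear */
        assert(!(enc_type(0x1f) & 0xc0));
        return 0;
}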

Cc: Guo Ren 
Signed-off-by: David Hildenbrand 
---
 arch/csky/abiv1/inc/abi/pgtable-bits.h | 13 +
 arch/csky/abiv2/inc/abi/pgtable-bits.h | 19 ---
 arch/csky/include/asm/pgtable.h| 18 ++
 3 files changed, 39 insertions(+), 11 deletions(-)

diff --git a/arch/csky/abiv1/inc/abi/pgtable-bits.h b/arch/csky/abiv1/inc/abi/pgtable-bits.h
index 752c8b3f9194..ae7a2f76dd42 100644
--- a/arch/csky/abiv1/inc/abi/pgtable-bits.h
+++ b/arch/csky/abiv1/inc/abi/pgtable-bits.h
@@ -10,6 +10,9 @@
 #define _PAGE_ACCESSED (1<<3)
 #define _PAGE_MODIFIED (1<<4)
 
+/* We borrow bit 9 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE(1<<9)
+
 /* implemented in hardware */
 #define _PAGE_GLOBAL   (1<<6)
 #define _PAGE_VALID(1<<7)
@@ -26,7 +29,8 @@
 #define _PAGE_PROT_NONE_PAGE_READ
 
 /*
- * Encode and decode a swap entry
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
  *
  * Format of swap PTE:
  * bit  0:_PAGE_PRESENT (zero)
@@ -35,15 +39,16 @@
  * bit  6:_PAGE_GLOBAL (zero)
  * bit  7:_PAGE_VALID (zero)
  * bit  8:swap type[4]
- * bit 9 - 31:swap offset
+ * bit  9:exclusive marker
+ * bit10 - 31:swap offset
  */
#define __swp_type(x)  ((((x).val >> 2) & 0xf) | \
(((x).val >> 4) & 0x10))
-#define __swp_offset(x)((x).val >> 9)
+#define __swp_offset(x)((x).val >> 10)
 #define __swp_entry(type, offset)  ((swp_entry_t) { \
((type & 0xf) << 2) | \
((type & 0x10) << 4) | \
-   ((offset) << 9)})
+   ((offset) << 10)})
 
 #define HAVE_ARCH_UNMAPPED_AREA
 
diff --git a/arch/csky/abiv2/inc/abi/pgtable-bits.h b/arch/csky/abiv2/inc/abi/pgtable-bits.h
index 7e7f389f546f..526152bd2156 100644
--- a/arch/csky/abiv2/inc/abi/pgtable-bits.h
+++ b/arch/csky/abiv2/inc/abi/pgtable-bits.h
@@ -10,6 +10,9 @@
 #define _PAGE_PRESENT  (1<<10)
 #define _PAGE_MODIFIED (1<<11)
 
+/* We borrow bit 7 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE(1<<7)
+
 /* implemented in hardware */
 #define _PAGE_GLOBAL   (1<<0)
 #define _PAGE_VALID(1<<1)
@@ -26,23 +29,25 @@
 #define _PAGE_PROT_NONE_PAGE_WRITE
 
 /*
- * Encode and decode a swap entry
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
  *
  * Format of swap PTE:
  * bit  0:_PAGE_GLOBAL (zero)
  * bit  1:_PAGE_VALID (zero)
  * bit  2 - 6:swap type
- * bit  7 - 8:swap offset[0 - 1]
+ * bit  7:exclusive marker
+ * bit  8:swap offset[0]
  * bit  9:_PAGE_WRITE (zero)
  * bit 10:_PAGE_PRESENT (zero)
- * bit11 - 31:swap offset[2 - 22]
+ * bit11 - 31:swap offset[1 - 21]
  */
 #define __swp_type(x)  (((x).val >> 2) & 0x1f)
-#define __swp_offset(x)((((x).val >> 7) & 0x3) | \
-   (((x).val >> 9) & 0x7ffffc))
+#define __swp_offset(x)((((x).val >> 8) & 0x1) | \
+   (((x).val >> 10) & 0x3ffffe))
 #define __swp_entry(type, offset)  ((swp_entry_t) { \
((type & 0x1f) << 2) | \
-   ((offset & 0x3) << 7) | \
-   ((offset & 0x7ffffc) << 9)})
+   ((offset & 0x1) << 8) | \
+   ((offset & 0x3ffffe) << 10)})
 
 #endif /* __ASM_CSKY_PGTABLE_BITS_H */
diff --git a/arch/csky/include/asm/pgtable.h b/arch/csky/include/asm/pgtable.h
index 77bc6caff2d2..574c97b9ecca 100644
--- a/arch/csky/include/asm/pgtable.h
+++ b/arch/csky/include/asm/pgtable.h
@@ -200,6 +200,24 @@ static inline pte_t pte_mkyoung(pte_t pte)
return pte;
 }
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return 

[PATCH mm-unstable v1 04/26] arm/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the
offset. This reduces the maximum swap space per file to 64 GiB (was 128
GiB).

While at it, drop the PTE_TYPE_FAULT from __swp_entry_to_pte() which is
defined to be 0 and is rather confusing because we should be dealing
with "Linux PTEs" not "hardware PTEs". Also, properly mask the type in
__swp_entry().
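
The new numbers in the comment follow directly from the extra shift; a
hypothetical back-of-the-envelope check:

#include <stdio.h>

#define __SWP_TYPE_BITS         5
#define __SWP_TYPE_SHIFT        2
/* one bit higher now that bit 7 holds the exclusive marker */
#define __SWP_OFFSET_SHIFT      (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT + 1)

int main(void)
{
        unsigned long long offset_bits = 32 - __SWP_OFFSET_SHIFT;
        unsigned long long max_bytes = (1ULL << offset_bits) * 4096;

        printf("swap files: %d\n", (1 << __SWP_TYPE_BITS) - 1);   /* 31 */
        printf("max per file: %llu GiB\n", max_bytes >> 30);      /* 64 */
        return 0;
}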

Cc: Russell King 
Signed-off-by: David Hildenbrand 
---
 arch/arm/include/asm/pgtable-2level.h |  3 +++
 arch/arm/include/asm/pgtable-3level.h |  3 +++
 arch/arm/include/asm/pgtable.h| 35 +--
 3 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index 92abd4cd8ca2..ce543cd9380c 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -126,6 +126,9 @@
#define L_PTE_SHARED   (_AT(pteval_t, 1) << 10)/* shared(v6), coherent(xsc3) */
 #define L_PTE_NONE (_AT(pteval_t, 1) << 11)
 
+/* We borrow bit 7 to store the exclusive marker in swap PTEs. */
+#define L_PTE_SWP_EXCLUSIVEL_PTE_RDONLY
+
 /*
  * These are the memory types, defined to be compatible with
  * pre-ARMv6 CPUs cacheable and bufferable bits: n/a,n/a,C,B
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index eabe72ff7381..106049791500 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -76,6 +76,9 @@
 #define L_PTE_NONE (_AT(pteval_t, 1) << 57)/* PROT_NONE */
 #define L_PTE_RDONLY   (_AT(pteval_t, 1) << 58)/* READ ONLY */
 
+/* We borrow bit 7 to store the exclusive marker in swap PTEs. */
+#define L_PTE_SWP_EXCLUSIVE(_AT(pteval_t, 1) << 7)
+
 #define L_PMD_SECT_VALID   (_AT(pmdval_t, 1) << 0)
 #define L_PMD_SECT_DIRTY   (_AT(pmdval_t, 1) << 55)
 #define L_PMD_SECT_NONE(_AT(pmdval_t, 1) << 57)
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index f049072b2e85..886c275995a2 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -271,27 +271,48 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 }
 
 /*
- * Encode and decode a swap entry.  Swap entries are stored in the Linux
- * page tables as follows:
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
  *
  *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
  *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
- *   <--- offset > < type -> 0 0
+ *   <--- offset --> E < type -> 0 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
  *
- * This gives us up to 31 swap files and 128GB per swap file.  Note that
+ * This gives us up to 31 swap files and 64GB per swap file.  Note that
  * the offset field is always non-zero.
  */
 #define __SWP_TYPE_SHIFT   2
 #define __SWP_TYPE_BITS   5
 #define __SWP_TYPE_MASK   ((1 << __SWP_TYPE_BITS) - 1)
-#define __SWP_OFFSET_SHIFT (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT)
+#define __SWP_OFFSET_SHIFT (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT + 1)
 
 #define __swp_type(x)  (((x).val >> __SWP_TYPE_SHIFT) & 
__SWP_TYPE_MASK)
 #define __swp_offset(x)((x).val >> __SWP_OFFSET_SHIFT)
-#define __swp_entry(type,offset) ((swp_entry_t) { ((type) << __SWP_TYPE_SHIFT) 
| ((offset) << __SWP_OFFSET_SHIFT) })
+#define __swp_entry(type, offset) ((swp_entry_t) { (((type) & __SWP_TYPE_MASK) 
<< __SWP_TYPE_SHIFT) | \
+  ((offset) << 
__SWP_OFFSET_SHIFT) })
 
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
-#define __swp_entry_to_pte(swp)__pte((swp).val | PTE_TYPE_FAULT)
+#define __swp_entry_to_pte(swp)__pte((swp).val)
+
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_isset(pte, L_PTE_SWP_EXCLUSIVE);
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   return set_pte_bit(pte, __pgprot(L_PTE_SWP_EXCLUSIVE));
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   return clear_pte_bit(pte, __pgprot(L_PTE_SWP_EXCLUSIVE));
+}
 
 /*
  * It is an error for the kernel to have more swap files than we can
-- 
2.39.0



[PATCH mm-unstable v1 03/26] arc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 5, which is yet
unused. The only important parts seems to be to not use _PAGE_PRESENT
(bit 9).

Cc: Vineet Gupta 
Signed-off-by: David Hildenbrand 
---
 arch/arc/include/asm/pgtable-bits-arcv2.h | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/arch/arc/include/asm/pgtable-bits-arcv2.h 
b/arch/arc/include/asm/pgtable-bits-arcv2.h
index 515e82db519f..611f412713b9 100644
--- a/arch/arc/include/asm/pgtable-bits-arcv2.h
+++ b/arch/arc/include/asm/pgtable-bits-arcv2.h
@@ -26,6 +26,9 @@
 #define _PAGE_GLOBAL   (1 << 8)  /* ASID agnostic (H) */
 #define _PAGE_PRESENT  (1 << 9)  /* PTE/TLB Valid (H) */
 
+/* We borrow bit 5 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE   _PAGE_DIRTY
+
 #ifdef CONFIG_ARC_MMU_V4
 #define _PAGE_HW_SZ   (1 << 10)  /* Normal/super (H) */
 #else
@@ -106,9 +109,18 @@ static inline void set_pte_at(struct mm_struct *mm, 
unsigned long addr,
 void update_mmu_cache(struct vm_area_struct *vma, unsigned long address,
  pte_t *ptep);
 
-/* Encode swap {type,off} tuple into PTE
- * We reserve 13 bits for 5-bit @type, keeping bits 12-5 zero, ensuring that
- * PAGE_PRESENT is zero in a PTE holding swap "identifier"
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <-- offset -> <--- zero --> E < type ->
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   The zero'ed bits include _PAGE_PRESENT.
  */
 #define __swp_entry(type, off) ((swp_entry_t) \
{ ((type) & 0x1f) | ((off) << 13) })
@@ -120,6 +132,15 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned 
long address,
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+PTE_BIT_FUNC(swp_mkexclusive, |= (_PAGE_SWP_EXCLUSIVE));
+PTE_BIT_FUNC(swp_clear_exclusive, &= ~(_PAGE_SWP_EXCLUSIVE));
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #include <asm/hugepage.h>
 #endif
-- 
2.39.0



[PATCH mm-unstable v1 02/26] alpha/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit
from the type. Generic MM currently only uses 5 bits for the type
(MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused.
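
(For the record -- my arithmetic, not part of the patch: the type field
shrinks from 8 bits (& 0xff) to 7 bits (& 0x7f) below, and with
MAX_SWAPFILES_SHIFT == 5 only 2^5 = 32 swap types are ever encoded, so
two spare bits still remain.)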

While at it, mask the type in mk_swap_pte() as well.

Cc: Richard Henderson 
Cc: Ivan Kokshaysky 
Cc: Matt Turner 
Signed-off-by: David Hildenbrand 
---
 arch/alpha/include/asm/pgtable.h | 41 
 1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/arch/alpha/include/asm/pgtable.h b/arch/alpha/include/asm/pgtable.h
index 9e45f6735d5d..970abf511b13 100644
--- a/arch/alpha/include/asm/pgtable.h
+++ b/arch/alpha/include/asm/pgtable.h
@@ -74,6 +74,9 @@ struct vm_area_struct;
 #define _PAGE_DIRTY0x2
 #define _PAGE_ACCESSED 0x4
 
+/* We borrow bit 39 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE   0x8000000000UL
+
 /*
  * NOTE! The "accessed" bit isn't necessarily exact:  it can be kept exactly
  * by software (use the KRE/URE/KWE/UWE bits appropriately), but I'll fake it.
@@ -301,18 +304,48 @@ extern inline void update_mmu_cache(struct vm_area_struct 
* vma,
 }
 
 /*
- * Non-present pages:  high 24 bits are offset, next 8 bits type,
- * low 32 bits zero.
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3
+ *   3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2
+ *   <--- offset --> E <--- type -->
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <--- zeroes -->
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
  */
 extern inline pte_t mk_swap_pte(unsigned long type, unsigned long offset)
-{ pte_t pte; pte_val(pte) = (type << 32) | (offset << 40); return pte; }
+{ pte_t pte; pte_val(pte) = ((type & 0x7f) << 32) | (offset << 40); return 
pte; }
 
-#define __swp_type(x)  (((x).val >> 32) & 0xff)
+#define __swp_type(x)  (((x).val >> 32) & 0x7f)
 #define __swp_offset(x)((x).val >> 40)
 #define __swp_entry(type, off) ((swp_entry_t) { pte_val(mk_swap_pte((type), 
(off))) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 #define pte_ERROR(e) \
printk("%s:%d: bad pte %016lx.\n", __FILE__, __LINE__, pte_val(e))
 #define pmd_ERROR(e) \
-- 
2.39.0



[PATCH mm-unstable v1 01/26] mm/debug_vm_pgtable: more pte_swp_exclusive() sanity checks

2023-01-13 Thread David Hildenbrand
We want to implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures.
Let's extend our sanity checks, especially testing that our PTE bit
does not affect:
* is_swap_pte() -> pte_present() and pte_none()
* the swap entry + type
* pte_swp_soft_dirty()

Especially, the pfn_pte() is dodgy when the swap PTE layout differs
heavily from ordinary PTEs. Let's properly construct a swap PTE from
swap type+offset.

Signed-off-by: David Hildenbrand 
---
 mm/debug_vm_pgtable.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index bb3328f46126..a0730beffd78 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -811,13 +811,34 @@ static void __init pmd_swap_soft_dirty_tests(struct 
pgtable_debug_args *args) {
 static void __init pte_swap_exclusive_tests(struct pgtable_debug_args *args)
 {
 #ifdef __HAVE_ARCH_PTE_SWP_EXCLUSIVE
-   pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot);
+   unsigned long max_swapfile_size = generic_max_swapfile_size();
+   swp_entry_t entry, entry2;
+   pte_t pte;
 
pr_debug("Validating PTE swap exclusive\n");
+
+   /* Create a swp entry with all possible bits set */
+   entry = swp_entry((1 << MAX_SWAPFILES_SHIFT) - 1,
+ max_swapfile_size - 1);
+
+   pte = swp_entry_to_pte(entry);
+   WARN_ON(pte_swp_exclusive(pte));
+   WARN_ON(!is_swap_pte(pte));
+   entry2 = pte_to_swp_entry(pte);
+   WARN_ON(memcmp(&entry, &entry2, sizeof(entry)));
+
pte = pte_swp_mkexclusive(pte);
WARN_ON(!pte_swp_exclusive(pte));
+   WARN_ON(!is_swap_pte(pte));
+   WARN_ON(pte_swp_soft_dirty(pte));
+   entry2 = pte_to_swp_entry(pte);
+   WARN_ON(memcmp(&entry, &entry2, sizeof(entry)));
+
pte = pte_swp_clear_exclusive(pte);
WARN_ON(pte_swp_exclusive(pte));
+   WARN_ON(!is_swap_pte(pte));
+   entry2 = pte_to_swp_entry(pte);
+   WARN_ON(memcmp(&entry, &entry2, sizeof(entry)));
 #endif /* __HAVE_ARCH_PTE_SWP_EXCLUSIVE */
 }
 
-- 
2.39.0



RE: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-13 Thread Luck, Tony
>> Is it time yet for:
>>
>> $ git rm -r arch/ia64
>>

> Can I take that as an ack on [0]? The EFI subsystem has evolved
> substantially over the years, and there is really no way to do any
> IA64 testing beyond build testing, so from that perspective, dropping
> it entirely would be welcomed.
>
> Thanks,
> Ard.
>
>
>
> [0] 
> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/commit/?h=remove-ia64

Yes. EFI isn't the only issue. A bunch of folks[1] have spent time fixing ia64
for (in most cases) some tree-wide patches that they needed. Their time might
have been more productively spent fixing things that actually matter in modern 
times.

Acked-by: Tony Luck 

-Tony

[1] git log --no-merges --since=2year -- arch/ia64 | grep Author: | sort | uniq 
-c | sort -rn
 19 Author: Masahiro Yamada 
 11 Author: Sergei Trofimovich 
  9 Author: Eric W. Biederman 
  8 Author: Arnd Bergmann 
  6 Author: Randy Dunlap 
  6 Author: Kefeng Wang 
  6 Author: Anshuman Khandual 
  5 Author: Masami Hiramatsu 
  5 Author: Al Viro 
  4 Author: Peter Zijlstra 
  4 Author: Mike Rapoport 
  4 Author: David Hildenbrand 
  4 Author: Christophe Leroy 
  3 Author: Yury Norov 
  3 Author: Michal Hocko 
  3 Author: Geert Uytterhoeven 
  3 Author: Bhaskar Chowdhury 
  3 Author: Baoquan He 
  3 Author: Ard Biesheuvel 
  3 Author: Aneesh Kumar K.V 
  2 Author: Yang Guang 
  2 Author: Will Deacon 
  2 Author: Viresh Kumar 
  2 Author: Valentin Schneider 
  2 Author: Stafford Horne 
  2 Author: Sebastian Andrzej Siewior 
  2 Author: Richard Guy Briggs 
  2 Author: Peter Xu 
  2 Author: Peter Collingbourne 
  2 Author: Mark Rutland 
  2 Author: Lukas Bulwahn 
  2 Author: Julia Lawall 
  2 Author: Jens Axboe 
  2 Author: Jason Wang 
  2 Author: Jan Kara 
  2 Author: Christoph Hellwig 
  2 Author: Bjorn Helgaas 
  2 Author: Alexander Lobakin 
  1 Author: Zi Yan 
  1 Author: Zhang Yunkai 
  1 Author: ye xingchen 
  1 Author: xu xin 
  1 Author: Wolfram Sang 
  1 Author: Weizhao Ouyang 
  1 Author: Suren Baghdasaryan 
  1 Author: Souptick Joarder (HPE) 
  1 Author: Sergey Shtylyov 
  1 Author: Sergei Trofimovich 
  1 Author: Sascha Hauer 
  1 Author: Samuel Holland 
  1 Author: Qi Zheng 
  1 Author: Peng Liu 
  1 Author: Naveen N. Rao 
  1 Author: Muchun Song 
  1 Author: Mikulas Patocka 
  1 Author: Mike Kravetz 
  1 Author: Mickaël Salaün 
  1 Author: Matthew Wilcox (Oracle) 
  1 Author: Martin Oliveira 
  1 Author: Luc Van Oostenryck 
  1 Author: Kees Cook 
  1 Author: Jason A. Donenfeld 
  1 Author: Ilpo Järvinen 
  1 Author: Haowen Bai 
  1 Author: Gustavo A. R. Silva 
  1 Author: Greg Kroah-Hartman 
  1 Author: Gaosheng Cui 
  1 Author: Dmitry Osipenko 
  1 Author: Dawei Li 
  1 Author: Chuck Lever 
  1 Author: Christian Brauner 
  1 Author: Chris Down 
  1 Author: Chen Li 
  1 Author: Catalin Marinas 
  1 Author: Benjamin Stürz 
  1 Author: Ben Dooks 
  1 Author: Baolin Wang 
  1 Author: Andy Shevchenko 
  1 Author: André Almeida 


Re: [PATCH v3 07/13] tty: Convert ->dtr_rts() to take bool argument

2023-01-13 Thread Ulf Hansson
On Wed, 11 Jan 2023 at 15:24, Ilpo Järvinen
 wrote:
>
> Convert the raise/on parameter in ->dtr_rts() to bool through the
> callchain. The parameter is used like bool. In USB serial, there
> remains a few implicit bool -> larger type conversions because some
> devices use u8 in their control messages.
>
> In moxa_tiocmget(), dtr variable was reused for line status which
> requires int so use a separate variable for status.
>
> Reviewed-by: Jiri Slaby 
> Signed-off-by: Ilpo Järvinen 

Acked-by: Ulf Hansson  # For MMC

Kind regards
Uffe


> ---
>  drivers/char/pcmcia/synclink_cs.c|  4 +--
>  drivers/mmc/core/sdio_uart.c |  4 +--
>  drivers/staging/greybus/uart.c   |  2 +-
>  drivers/tty/amiserial.c  |  2 +-
>  drivers/tty/hvc/hvc_console.c|  4 +--
>  drivers/tty/hvc/hvc_console.h|  2 +-
>  drivers/tty/hvc/hvc_iucv.c   |  4 +--
>  drivers/tty/moxa.c   | 54 ++--
>  drivers/tty/mxser.c  |  2 +-
>  drivers/tty/n_gsm.c  |  2 +-
>  drivers/tty/serial/serial_core.c |  8 ++---
>  drivers/tty/synclink_gt.c|  2 +-
>  drivers/tty/tty_port.c   |  4 +--
>  drivers/usb/class/cdc-acm.c  |  2 +-
>  drivers/usb/serial/ch341.c   |  2 +-
>  drivers/usb/serial/cp210x.c  |  4 +--
>  drivers/usb/serial/cypress_m8.c  |  6 ++--
>  drivers/usb/serial/digi_acceleport.c |  6 ++--
>  drivers/usb/serial/f81232.c  |  2 +-
>  drivers/usb/serial/f81534.c  |  2 +-
>  drivers/usb/serial/ftdi_sio.c|  2 +-
>  drivers/usb/serial/ipw.c |  2 +-
>  drivers/usb/serial/keyspan.c |  2 +-
>  drivers/usb/serial/keyspan_pda.c |  2 +-
>  drivers/usb/serial/mct_u232.c|  4 +--
>  drivers/usb/serial/mxuport.c |  2 +-
>  drivers/usb/serial/pl2303.c  |  2 +-
>  drivers/usb/serial/quatech2.c|  2 +-
>  drivers/usb/serial/sierra.c  |  2 +-
>  drivers/usb/serial/spcp8x5.c |  2 +-
>  drivers/usb/serial/ssu100.c  |  2 +-
>  drivers/usb/serial/upd78f0730.c  |  6 ++--
>  drivers/usb/serial/usb-serial.c  |  2 +-
>  drivers/usb/serial/usb-wwan.h|  2 +-
>  drivers/usb/serial/usb_wwan.c|  2 +-
>  drivers/usb/serial/xr_serial.c   |  6 ++--
>  include/linux/tty_port.h |  4 +--
>  include/linux/usb/serial.h   |  2 +-
>  38 files changed, 84 insertions(+), 82 deletions(-)
>
> diff --git a/drivers/char/pcmcia/synclink_cs.c 
> b/drivers/char/pcmcia/synclink_cs.c
> index 4391138e1b8a..46a0b586d234 100644
> --- a/drivers/char/pcmcia/synclink_cs.c
> +++ b/drivers/char/pcmcia/synclink_cs.c
> @@ -378,7 +378,7 @@ static void async_mode(MGSLPC_INFO *info);
>  static void tx_timeout(struct timer_list *t);
>
>  static bool carrier_raised(struct tty_port *port);
> -static void dtr_rts(struct tty_port *port, int onoff);
> +static void dtr_rts(struct tty_port *port, bool onoff);
>
>  #if SYNCLINK_GENERIC_HDLC
>  #define dev_to_port(D) (dev_to_hdlc(D)->priv)
> @@ -2442,7 +2442,7 @@ static bool carrier_raised(struct tty_port *port)
> return info->serial_signals & SerialSignal_DCD;
>  }
>
> -static void dtr_rts(struct tty_port *port, int onoff)
> +static void dtr_rts(struct tty_port *port, bool onoff)
>  {
> MGSLPC_INFO *info = container_of(port, MGSLPC_INFO, port);
> unsigned long flags;
> diff --git a/drivers/mmc/core/sdio_uart.c b/drivers/mmc/core/sdio_uart.c
> index 47f58258d8ff..c6b4b2b2a4b2 100644
> --- a/drivers/mmc/core/sdio_uart.c
> +++ b/drivers/mmc/core/sdio_uart.c
> @@ -548,14 +548,14 @@ static bool uart_carrier_raised(struct tty_port *tport)
>   * adjusted during an open, close and hangup.
>   */
>
> -static void uart_dtr_rts(struct tty_port *tport, int onoff)
> +static void uart_dtr_rts(struct tty_port *tport, bool onoff)
>  {
> struct sdio_uart_port *port =
> container_of(tport, struct sdio_uart_port, port);
> int ret = sdio_uart_claim_func(port);
> if (ret)
> return;
> -   if (onoff == 0)
> +   if (!onoff)
> sdio_uart_clear_mctrl(port, TIOCM_DTR | TIOCM_RTS);
> else
> sdio_uart_set_mctrl(port, TIOCM_DTR | TIOCM_RTS);
> diff --git a/drivers/staging/greybus/uart.c b/drivers/staging/greybus/uart.c
> index 90ff07f2cbf7..92d49740d5a4 100644
> --- a/drivers/staging/greybus/uart.c
> +++ b/drivers/staging/greybus/uart.c
> @@ -701,7 +701,7 @@ static int gb_tty_ioctl(struct tty_struct *tty, unsigned 
> int cmd,
> return -ENOIOCTLCMD;
>  }
>
> -static void gb_tty_dtr_rts(struct tty_port *port, int on)
> +static void gb_tty_dtr_rts(struct tty_port *port, bool on)
>  {
> struct gb_tty *gb_tty;
> u8 newctrl;
> diff --git a/drivers/tty/amiserial.c b/drivers/tty/amiserial.c
> index 01c4fd3ce7c8..29d4c554f6b8 100644
> --- a/drivers/tty/amiserial.c
> +++ b/drivers/tty/amiserial.c
> @@ -1459,7 +1459,7 @@ 

Re: [PATCH v3 11/13] tty/serial: Call ->dtr_rts() parameter active consistently

2023-01-13 Thread Ulf Hansson
On Wed, 11 Jan 2023 at 15:24, Ilpo Järvinen
 wrote:
>
> Convert various parameter names for ->dtr_rts() and related functions
> from onoff, on, and raise to active.
>
> Reviewed-by: Jiri Slaby 
> Signed-off-by: Ilpo Järvinen 

Acked-by: Ulf Hansson  # For MMC

Kind regards
Uffe

> ---
>  drivers/char/pcmcia/synclink_cs.c | 6 +++---
>  drivers/mmc/core/sdio_uart.c  | 6 +++---
>  drivers/staging/greybus/uart.c| 4 ++--
>  drivers/tty/amiserial.c   | 4 ++--
>  drivers/tty/hvc/hvc_console.h | 2 +-
>  drivers/tty/hvc/hvc_iucv.c| 6 +++---
>  drivers/tty/mxser.c   | 4 ++--
>  drivers/tty/n_gsm.c   | 4 ++--
>  drivers/tty/serial/serial_core.c  | 8 
>  drivers/tty/synclink_gt.c | 4 ++--
>  include/linux/tty_port.h  | 4 ++--
>  include/linux/usb/serial.h| 2 +-
>  12 files changed, 27 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/char/pcmcia/synclink_cs.c 
> b/drivers/char/pcmcia/synclink_cs.c
> index 46a0b586d234..1577eba6fe0e 100644
> --- a/drivers/char/pcmcia/synclink_cs.c
> +++ b/drivers/char/pcmcia/synclink_cs.c
> @@ -378,7 +378,7 @@ static void async_mode(MGSLPC_INFO *info);
>  static void tx_timeout(struct timer_list *t);
>
>  static bool carrier_raised(struct tty_port *port);
> -static void dtr_rts(struct tty_port *port, bool onoff);
> +static void dtr_rts(struct tty_port *port, bool active);
>
>  #if SYNCLINK_GENERIC_HDLC
>  #define dev_to_port(D) (dev_to_hdlc(D)->priv)
> @@ -2442,13 +2442,13 @@ static bool carrier_raised(struct tty_port *port)
> return info->serial_signals & SerialSignal_DCD;
>  }
>
> -static void dtr_rts(struct tty_port *port, bool onoff)
> +static void dtr_rts(struct tty_port *port, bool active)
>  {
> MGSLPC_INFO *info = container_of(port, MGSLPC_INFO, port);
> unsigned long flags;
>
> spin_lock_irqsave(&info->lock, flags);
> -   if (onoff)
> +   if (active)
> info->serial_signals |= SerialSignal_RTS | SerialSignal_DTR;
> else
> info->serial_signals &= ~(SerialSignal_RTS | 
> SerialSignal_DTR);
> diff --git a/drivers/mmc/core/sdio_uart.c b/drivers/mmc/core/sdio_uart.c
> index c6b4b2b2a4b2..50536fe59f1a 100644
> --- a/drivers/mmc/core/sdio_uart.c
> +++ b/drivers/mmc/core/sdio_uart.c
> @@ -542,20 +542,20 @@ static bool uart_carrier_raised(struct tty_port *tport)
>  /**
>   * uart_dtr_rts-port helper to set uart signals
>   * @tport: tty port to be updated
> - * @onoff: set to turn on DTR/RTS
> + * @active: set to turn on DTR/RTS
>   *
>   * Called by the tty port helpers when the modem signals need to be
>   * adjusted during an open, close and hangup.
>   */
>
> -static void uart_dtr_rts(struct tty_port *tport, bool onoff)
> +static void uart_dtr_rts(struct tty_port *tport, bool active)
>  {
> struct sdio_uart_port *port =
> container_of(tport, struct sdio_uart_port, port);
> int ret = sdio_uart_claim_func(port);
> if (ret)
> return;
> -   if (!onoff)
> +   if (!active)
> sdio_uart_clear_mctrl(port, TIOCM_DTR | TIOCM_RTS);
> else
> sdio_uart_set_mctrl(port, TIOCM_DTR | TIOCM_RTS);
> diff --git a/drivers/staging/greybus/uart.c b/drivers/staging/greybus/uart.c
> index 92d49740d5a4..20a34599859f 100644
> --- a/drivers/staging/greybus/uart.c
> +++ b/drivers/staging/greybus/uart.c
> @@ -701,7 +701,7 @@ static int gb_tty_ioctl(struct tty_struct *tty, unsigned 
> int cmd,
> return -ENOIOCTLCMD;
>  }
>
> -static void gb_tty_dtr_rts(struct tty_port *port, bool on)
> +static void gb_tty_dtr_rts(struct tty_port *port, bool active)
>  {
> struct gb_tty *gb_tty;
> u8 newctrl;
> @@ -709,7 +709,7 @@ static void gb_tty_dtr_rts(struct tty_port *port, bool on)
> gb_tty = container_of(port, struct gb_tty, port);
> newctrl = gb_tty->ctrlout;
>
> -   if (on)
> +   if (active)
> newctrl |= (GB_UART_CTRL_DTR | GB_UART_CTRL_RTS);
> else
> newctrl &= ~(GB_UART_CTRL_DTR | GB_UART_CTRL_RTS);
> diff --git a/drivers/tty/amiserial.c b/drivers/tty/amiserial.c
> index 29d4c554f6b8..d7515d61659e 100644
> --- a/drivers/tty/amiserial.c
> +++ b/drivers/tty/amiserial.c
> @@ -1459,13 +1459,13 @@ static bool amiga_carrier_raised(struct tty_port 
> *port)
> return !(ciab.pra & SER_DCD);
>  }
>
> -static void amiga_dtr_rts(struct tty_port *port, bool raise)
> +static void amiga_dtr_rts(struct tty_port *port, bool active)
>  {
> struct serial_state *info = container_of(port, struct serial_state,
> tport);
> unsigned long flags;
>
> -   if (raise)
> +   if (active)
> info->MCR |= SER_DTR|SER_RTS;
> else
> info->MCR &= ~(SER_DTR|SER_RTS);
> diff --git a/drivers/tty/hvc/hvc_console.h b/drivers/tty/hvc/hvc_console.h
> index 

[PATCH v3 10/10] MAINTAINERS: add the Freescale QMC audio entry

2023-01-13 Thread Herve Codina
After contributing the component, add myself as the maintainer
for the Freescale QMC audio ASoC component.

Signed-off-by: Herve Codina 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 9a574892b9b1..9dcfadec5aa3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8440,6 +8440,14 @@ F:   sound/soc/fsl/fsl*
 F: sound/soc/fsl/imx*
 F: sound/soc/fsl/mpc8610_hpcd.c
 
+FREESCALE SOC SOUND QMC DRIVER
+M: Herve Codina 
+L: alsa-de...@alsa-project.org (moderated for non-subscribers)
+L: linuxppc-dev@lists.ozlabs.org
+S: Maintained
+F: Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml
+F: sound/soc/fsl/fsl_qmc_audio.c
+
 FREESCALE USB PERIPHERAL DRIVERS
 M: Li Yang 
 L: linux-...@vger.kernel.org
-- 
2.38.1



[PATCH v3 09/10] ASoC: fsl: Add support for QMC audio

2023-01-13 Thread Herve Codina
The QMC audio is an ASoC component which provides DAIs
that use the QMC (QUICC Multichannel Controller) to transfer
the audio data.

It provides as many DAIs as the number of QMC channels it
references.

Signed-off-by: Herve Codina 
---
 sound/soc/fsl/Kconfig |   9 +
 sound/soc/fsl/Makefile|   2 +
 sound/soc/fsl/fsl_qmc_audio.c | 732 ++
 3 files changed, 743 insertions(+)
 create mode 100644 sound/soc/fsl/fsl_qmc_audio.c

diff --git a/sound/soc/fsl/Kconfig b/sound/soc/fsl/Kconfig
index 614eceda6b9e..17db29c25d96 100644
--- a/sound/soc/fsl/Kconfig
+++ b/sound/soc/fsl/Kconfig
@@ -172,6 +172,15 @@ config SND_MPC52xx_DMA
 config SND_SOC_POWERPC_DMA
tristate
 
+config SND_SOC_POWERPC_QMC_AUDIO
+   tristate "QMC ALSA SoC support"
+   depends on CPM_QMC
+   help
+ ALSA SoC Audio support using the Freescale QUICC Multichannel
+ Controller (QMC).
+ Say Y or M if you want to add support for SoC audio using Freescale
+ QMC.
+
 comment "SoC Audio support for Freescale PPC boards:"
 
 config SND_SOC_MPC8610_HPCD
diff --git a/sound/soc/fsl/Makefile b/sound/soc/fsl/Makefile
index b54beb1a66fa..8db7e97d0bd5 100644
--- a/sound/soc/fsl/Makefile
+++ b/sound/soc/fsl/Makefile
@@ -28,6 +28,7 @@ snd-soc-fsl-easrc-objs := fsl_easrc.o
 snd-soc-fsl-xcvr-objs := fsl_xcvr.o
 snd-soc-fsl-aud2htx-objs := fsl_aud2htx.o
 snd-soc-fsl-rpmsg-objs := fsl_rpmsg.o
+snd-soc-fsl-qmc-audio-objs := fsl_qmc_audio.o
 
 obj-$(CONFIG_SND_SOC_FSL_AUDMIX) += snd-soc-fsl-audmix.o
 obj-$(CONFIG_SND_SOC_FSL_ASOC_CARD) += snd-soc-fsl-asoc-card.o
@@ -44,6 +45,7 @@ obj-$(CONFIG_SND_SOC_POWERPC_DMA) += snd-soc-fsl-dma.o
 obj-$(CONFIG_SND_SOC_FSL_XCVR) += snd-soc-fsl-xcvr.o
 obj-$(CONFIG_SND_SOC_FSL_AUD2HTX) += snd-soc-fsl-aud2htx.o
 obj-$(CONFIG_SND_SOC_FSL_RPMSG) += snd-soc-fsl-rpmsg.o
+obj-$(CONFIG_SND_SOC_POWERPC_QMC_AUDIO) += snd-soc-fsl-qmc-audio.o
 
 # MPC5200 Platform Support
 obj-$(CONFIG_SND_MPC52xx_DMA) += mpc5200_dma.o
diff --git a/sound/soc/fsl/fsl_qmc_audio.c b/sound/soc/fsl/fsl_qmc_audio.c
new file mode 100644
index ..c36b493da96a
--- /dev/null
+++ b/sound/soc/fsl/fsl_qmc_audio.c
@@ -0,0 +1,732 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ALSA SoC using the QUICC Multichannel Controller (QMC)
+ *
+ * Copyright 2022 CS GROUP France
+ *
+ * Author: Herve Codina 
+ */
+
+#include <linux/dma-mapping.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_platform.h>
+#include <linux/slab.h>
+#include <sound/pcm.h>
+#include <sound/pcm_params.h>
+#include <sound/soc.h>
+#include <soc/fsl/qe/qmc.h>
+
+struct qmc_dai {
+   char *name;
+   int id;
+   struct device *dev;
+   struct qmc_chan *qmc_chan;
+   unsigned int nb_tx_ts;
+   unsigned int nb_rx_ts;
+};
+
+struct qmc_audio {
+   struct device *dev;
+   unsigned int num_dais;
+   struct qmc_dai *dais;
+   struct snd_soc_dai_driver *dai_drivers;
+};
+
+struct qmc_dai_prtd {
+   struct qmc_dai *qmc_dai;
+   dma_addr_t dma_buffer_start;
+   dma_addr_t period_ptr_submitted;
+   dma_addr_t period_ptr_ended;
+   dma_addr_t dma_buffer_end;
+   size_t period_size;
+   struct snd_pcm_substream *substream;
+};
+
+static int qmc_audio_pcm_construct(struct snd_soc_component *component,
+  struct snd_soc_pcm_runtime *rtd)
+{
+   struct snd_card *card = rtd->card->snd_card;
+   int ret;
+
+   ret = dma_coerce_mask_and_coherent(card->dev, DMA_BIT_MASK(32));
+   if (ret)
+   return ret;
+
+   snd_pcm_set_managed_buffer_all(rtd->pcm, SNDRV_DMA_TYPE_DEV, card->dev,
+  64*1024, 64*1024);
+   return 0;
+}
+
+static int qmc_audio_pcm_hw_params(struct snd_soc_component *component,
+  struct snd_pcm_substream *substream,
+  struct snd_pcm_hw_params *params)
+{
+   struct snd_pcm_runtime *runtime = substream->runtime;
+   struct qmc_dai_prtd *prtd = substream->runtime->private_data;
+
+   prtd->dma_buffer_start = runtime->dma_addr;
+   prtd->dma_buffer_end = runtime->dma_addr + params_buffer_bytes(params);
+   prtd->period_size = params_period_bytes(params);
+   prtd->period_ptr_submitted = prtd->dma_buffer_start;
+   prtd->period_ptr_ended = prtd->dma_buffer_start;
+   prtd->substream = substream;
+
+   return 0;
+}
+
+static void qmc_audio_pcm_write_complete(void *context)
+{
+   struct qmc_dai_prtd *prtd = context;
+   int ret;
+
+   prtd->period_ptr_ended += prtd->period_size;
+   if (prtd->period_ptr_ended >= prtd->dma_buffer_end)
+   prtd->period_ptr_ended = prtd->dma_buffer_start;
+
+   prtd->period_ptr_submitted += prtd->period_size;
+   if (prtd->period_ptr_submitted >= prtd->dma_buffer_end)
+   prtd->period_ptr_submitted = prtd->dma_buffer_start;
+
+   ret = qmc_chan_write_submit(prtd->qmc_dai->qmc_chan,
+   prtd->period_ptr_submitted, prtd->period_size,
+   qmc_audio_pcm_write_complete, 

[PATCH v3 08/10] dt-bindings: sound: Add support for QMC audio

2023-01-13 Thread Herve Codina
The QMC (QUICC multichannel controller) is a controller
present in some PowerQUICC SoC such as MPC885.
The QMC audio is an ASoC component that uses the QMC
controller to transfer the audio data.

Signed-off-by: Herve Codina 
---
 .../bindings/sound/fsl,qmc-audio.yaml | 117 ++
 1 file changed, 117 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml

diff --git a/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml 
b/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml
new file mode 100644
index ..ff5cd9241941
--- /dev/null
+++ b/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml
@@ -0,0 +1,117 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/sound/fsl,qmc-audio.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: QMC audio
+
+maintainers:
+  - Herve Codina 
+
+description: |
+  The QMC audio is an ASoC component which uses QMC (QUICC Multichannel
+  Controller) channels to transfer the audio data.
+  It provides as many DAIs as the number of QMC channels used.
+
+allOf:
+  - $ref: dai-common.yaml#
+
+properties:
+  compatible:
+const: fsl,qmc-audio
+
+  '#address-cells':
+const: 1
+  '#size-cells':
+const: 0
+  '#sound-dai-cells':
+const: 1
+
+patternProperties:
+  '^dai@([0-9]|[1-5][0-9]|6[0-3])$':
+description:
+  A DAI managed by this controller
+type: object
+
+properties:
+  reg:
+minimum: 0
+maximum: 63
+description:
+  The DAI number
+
+  fsl,qmc-chan:
+$ref: /schemas/types.yaml#/definitions/phandle-array
+items:
+  - items:
+  - description: phandle to QMC node
+  - description: Channel number
+description:
+  Should be a phandle/number pair. The phandle to QMC node and the QMC
+  channel to use for this DAI.
+
+required:
+  - reg
+  - fsl,qmc-chan
+
+required:
+  - compatible
+  - '#address-cells'
+  - '#size-cells'
+  - '#sound-dai-cells'
+
+additionalProperties: false
+
+examples:
+  - |
+audio_controller: audio-controller {
+compatible = "fsl,qmc-audio";
+#address-cells = <1>;
+#size-cells = <0>;
+#sound-dai-cells = <1>;
+dai@16 {
+reg = <16>;
+fsl,qmc-chan = <&qmc 16>;
+};
+dai@17 {
+reg = <17>;
+fsl,qmc-chan = <&qmc 17>;
+};
+};
+
+sound {
+compatible = "simple-audio-card";
+#address-cells = <1>;
+#size-cells = <0>;
+simple-audio-card,dai-link@0 {
+reg = <0>;
+format = "dsp_b";
+cpu {
+sound-dai = <&audio_controller 16>;
+};
+codec {
+sound-dai = <&codec1>;
+dai-tdm-slot-num = <4>;
+dai-tdm-slot-width = <8>;
+/* TS 3, 5, 7, 9 */
+dai-tdm-slot-tx-mask = <0 0 0 1 0 1 0 1 0 1>;
+dai-tdm-slot-rx-mask = <0 0 0 1 0 1 0 1 0 1>;
+};
+};
+simple-audio-card,dai-link@1 {
+reg = <1>;
+format = "dsp_b";
+cpu {
+sound-dai = <&audio_controller 17>;
+};
+codec {
+sound-dai = <&codec2>;
+dai-tdm-slot-num = <4>;
+dai-tdm-slot-width = <8>;
+/* TS 2, 4, 6, 8 */
+dai-tdm-slot-tx-mask = <0 0 1 0 1 0 1 0 1>;
+dai-tdm-slot-rx-mask = <0 0 1 0 1 0 1 0 1>;
+};
+};
+};
-- 
2.38.1



[PATCH v3 07/10] MAINTAINERS: add the Freescale QMC controller entry

2023-01-13 Thread Herve Codina
After contributing the driver, add myself as the maintainer
for the Freescale QMC controller.

Signed-off-by: Herve Codina 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 6a0605ebf8a0..9a574892b9b1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8372,6 +8372,14 @@ S:   Maintained
 F: drivers/soc/fsl/qe/
 F: include/soc/fsl/qe/
 
+FREESCALE QUICC ENGINE QMC DRIVER
+M: Herve Codina 
+L: linuxppc-dev@lists.ozlabs.org
+S: Maintained
+F: Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,qmc.yaml
+F: drivers/soc/fsl/qe/qmc.c
+F: include/soc/fsl/qe/qmc.h
+
 FREESCALE QUICC ENGINE TSA DRIVER
 M: Herve Codina 
 L: linuxppc-dev@lists.ozlabs.org
-- 
2.38.1



[PATCH v3 06/10] soc: fsl: cpm1: Add support for QMC

2023-01-13 Thread Herve Codina
The QMC (QUICC Multichannel Controller) emulates up to 64
channels within one serial controller using the same TDM
physical interface routed from the TSA.

It is available in some PowerQUICC SoC such as the
MPC885 or MPC866.

It is also available on some QUICC Engine SoCs.
This current version supports CPM1 SoCs only and some
enhancements are needed to support QUICC Engine SoCs.

Signed-off-by: Herve Codina 
---
 drivers/soc/fsl/qe/Kconfig  |   12 +
 drivers/soc/fsl/qe/Makefile |1 +
 drivers/soc/fsl/qe/qmc.c| 1531 +++
 include/soc/fsl/qe/qmc.h|   71 ++
 4 files changed, 1615 insertions(+)
 create mode 100644 drivers/soc/fsl/qe/qmc.c
 create mode 100644 include/soc/fsl/qe/qmc.h

diff --git a/drivers/soc/fsl/qe/Kconfig b/drivers/soc/fsl/qe/Kconfig
index 60ec11c9f4d9..25b218351ae3 100644
--- a/drivers/soc/fsl/qe/Kconfig
+++ b/drivers/soc/fsl/qe/Kconfig
@@ -44,6 +44,18 @@ config CPM_TSA
  This option enables support for this
  controller
 
+config CPM_QMC
+   tristate "CPM QMC support"
+   depends on OF && HAS_IOMEM
+   depends on CPM1 || (PPC && COMPILE_TEST)
+   depends on CPM_TSA
+   help
+ Freescale CPM QUICC Multichannel Controller
+ (QMC)
+
+ This option enables support for this
+ controller
+
 config QE_TDM
bool
default y if FSL_UCC_HDLC
diff --git a/drivers/soc/fsl/qe/Makefile b/drivers/soc/fsl/qe/Makefile
index 45c961acc81b..ec8506e13113 100644
--- a/drivers/soc/fsl/qe/Makefile
+++ b/drivers/soc/fsl/qe/Makefile
@@ -5,6 +5,7 @@
 obj-$(CONFIG_QUICC_ENGINE)+= qe.o qe_common.o qe_ic.o qe_io.o
 obj-$(CONFIG_CPM)  += qe_common.o
 obj-$(CONFIG_CPM_TSA)  += tsa.o
+obj-$(CONFIG_CPM_QMC)  += qmc.o
 obj-$(CONFIG_UCC)  += ucc.o
 obj-$(CONFIG_UCC_SLOW) += ucc_slow.o
 obj-$(CONFIG_UCC_FAST) += ucc_fast.o
diff --git a/drivers/soc/fsl/qe/qmc.c b/drivers/soc/fsl/qe/qmc.c
new file mode 100644
index ..85a99538bde7
--- /dev/null
+++ b/drivers/soc/fsl/qe/qmc.c
@@ -0,0 +1,1531 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * QMC driver
+ *
+ * Copyright 2022 CS GROUP France
+ *
+ * Author: Herve Codina 
+ */
+
+#include <linux/dma-mapping.h>
+#include <linux/hdlc.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_platform.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <soc/fsl/cpm.h>
+#include <soc/fsl/qe/qmc.h>
+#include "tsa.h"
+
+/* SCC general mode register low (32 bits) */
+#define SCC_GSMRL  0x00
+#define SCC_GSMRL_ENR  (1 << 5)
+#define SCC_GSMRL_ENT  (1 << 4)
+#define SCC_GSMRL_MODE_QMC (0x0A << 0)
+
+/* SCC general mode register high (32 bits) */
+#define SCC_GSMRH  0x04
+#define   SCC_GSMRH_CTSS   (1 << 7)
+#define   SCC_GSMRH_CDS(1 << 8)
+#define   SCC_GSMRH_CTSP   (1 << 9)
+#define   SCC_GSMRH_CDP(1 << 10)
+
+/* SCC event register (16 bits) */
+#define SCC_SCCE   0x10
+#define   SCC_SCCE_IQOV(1 << 3)
+#define   SCC_SCCE_GINT(1 << 2)
+#define   SCC_SCCE_GUN (1 << 1)
+#define   SCC_SCCE_GOV (1 << 0)
+
+/* SCC mask register (16 bits) */
+#define SCC_SCCM   0x14
+/* Multichannel base pointer (32 bits) */
+#define QMC_GBL_MCBASE 0x00
+/* Multichannel controller state (16 bits) */
+#define QMC_GBL_QMCSTATE   0x04
+/* Maximum receive buffer length (16 bits) */
+#define QMC_GBL_MRBLR  0x06
+/* Tx time-slot assignment table pointer (16 bits) */
+#define QMC_GBL_TX_S_PTR   0x08
+/* Rx pointer (16 bits) */
+#define QMC_GBL_RXPTR  0x0A
+/* Global receive frame threshold (16 bits) */
+#define QMC_GBL_GRFTHR 0x0C
+/* Global receive frame count (16 bits) */
+#define QMC_GBL_GRFCNT 0x0E
+/* Multichannel interrupt base address (32 bits) */
+#define QMC_GBL_INTBASE0x10
+/* Multichannel interrupt pointer (32 bits) */
+#define QMC_GBL_INTPTR 0x14
+/* Rx time-slot assignment table pointer (16 bits) */
+#define QMC_GBL_RX_S_PTR   0x18
+/* Tx pointer (16 bits) */
+#define QMC_GBL_TXPTR  0x1A
+/* CRC constant (32 bits) */
+#define QMC_GBL_C_MASK32   0x1C
+/* Time slot assignment table Rx (32 x 16 bits) */
+#define QMC_GBL_TSATRX 0x20
+/* Time slot assignment table Tx (32 x 16 bits) */
+#define QMC_GBL_TSATTX 0x60
+/* CRC constant (16 bits) */
+#define QMC_GBL_C_MASK16   0xA0
+
+/* TSA entry (16bit entry in TSATRX and TSATTX) */
+#define QMC_TSA_VALID  (1 << 15)
+#define QMC_TSA_WRAP   (1 << 14)
+#define QMC_TSA_MASK   (0x303F)
+#define QMC_TSA_CHANNEL(x) ((x) << 6)
+
+/* Tx buffer descriptor base address (16 bits, offset from MCBASE) */
+#define QMC_SPE_TBASE  0x00
+
+/* Channel mode register (16 bits) */
+#define QMC_SPE_CHAMR  0x02
+#define   QMC_SPE_CHAMR_MODE_HDLC  (1 << 15)
+#define   QMC_SPE_CHAMR_MODE_TRANSP   ((0 << 15) | (1 << 13))
+#define   QMC_SPE_CHAMR_ENT(1 << 12)
+#define   QMC_SPE_CHAMR_POL(1 << 8)
+#define   QMC_SPE_CHAMR_HDLC_IDLM  (1 << 13)
+#define   

[PATCH v3 05/10] dt-bindings: soc: fsl: cpm_qe: Add QMC controller

2023-01-13 Thread Herve Codina
Add support for the QMC (QUICC Multichannel Controller)
available in some PowerQUICC SoC such as MPC885 or MPC866.

Signed-off-by: Herve Codina 
---
 .../bindings/soc/fsl/cpm_qe/fsl,qmc.yaml  | 164 ++
 1 file changed, 164 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,qmc.yaml

diff --git a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,qmc.yaml 
b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,qmc.yaml
new file mode 100644
index ..3ec52f1635c8
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,qmc.yaml
@@ -0,0 +1,164 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/soc/fsl/cpm_qe/fsl,qmc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: PowerQUICC CPM QUICC Multichannel Controller (QMC)
+
+maintainers:
+  - Herve Codina 
+
+description: |
+  The QMC (QUICC Multichannel Controller) emulates up to 64 channels within
+  one serial controller using the same TDM physical interface routed from
+  TSA.
+
+properties:
+  compatible:
+items:
+  - enum:
+  - fsl,mpc885-scc-qmc
+  - fsl,mpc866-scc-qmc
+  - const: fsl,cpm1-scc-qmc
+
+  reg:
+items:
+  - description: SCC (Serial communication controller) register base
+  - description: SCC parameter ram base
+  - description: Dual port ram base
+
+  reg-names:
+items:
+  - const: scc_regs
+  - const: scc_pram
+  - const: dpram
+
+  interrupts:
+maxItems: 1
+description: SCC interrupt line in the CPM interrupt controller
+
+  fsl,tsa:
+$ref: /schemas/types.yaml#/definitions/phandle
+description: phandle to the TSA
+
+  fsl,tsa-cell-id:
+$ref: /schemas/types.yaml#/definitions/uint32
+enum: [1, 2, 3]
+description: |
+  TSA cell ID (dt-bindings/soc/fsl,tsa.h defines these values)
+   - 1: SCC2
+   - 2: SCC3
+   - 3: SCC4
+
+  '#address-cells':
+const: 1
+
+  '#size-cells':
+const: 0
+
+  '#chan-cells':
+const: 1
+
+patternProperties:
+  '^channel@([0-9]|[1-5][0-9]|6[0-3])$':
+description:
+  A channel managed by this controller
+type: object
+
+properties:
+  reg:
+minimum: 0
+maximum: 63
+description:
+  The channel number
+
+  fsl,mode:
+$ref: /schemas/types.yaml#/definitions/string
+enum: [transparent, hdlc]
+default: transparent
+description: Operational mode
+
+  fsl,reverse-data:
+$ref: /schemas/types.yaml#/definitions/flag
+description:
+  The bit order as seen on the channels is reversed,
+  transmitting/receiving the MSB of each octet first.
+  This flag is used only in 'transparent' mode.
+
+  fsl,tx-ts-mask:
+$ref: /schemas/types.yaml#/definitions/uint64
+description:
+  Channel assigned Tx time-slots within the Tx time-slots routed
+  by the TSA to this cell.
+
+  fsl,rx-ts-mask:
+$ref: /schemas/types.yaml#/definitions/uint64
+description:
+  Channel assigned Rx time-slots within the Rx time-slots routed
+  by the TSA to this cell.
+
+required:
+  - reg
+  - fsl,tx-ts-mask
+  - fsl,rx-ts-mask
+
+required:
+  - compatible
+  - reg
+  - reg-names
+  - interrupts
+  - fsl,tsa
+  - fsl,tsa-cell-id
+  - '#address-cells'
+  - '#size-cells'
+  - '#chan-cells'
+
+additionalProperties: false
+
+examples:
+  - |
+#include <dt-bindings/soc/fsl,tsa.h>
+
+qmc@a60 {
+compatible = "fsl,mpc885-scc-qmc", "fsl,cpm1-scc-qmc";
+reg = <0xa60 0x20>,
+  <0x3f00 0xc0>,
+  <0x2000 0x1000>;
+reg-names = "scc_regs", "scc_pram", "dpram";
+interrupts = <27>;
+interrupt-parent = <&CPM_PIC>;
+
+#address-cells = <1>;
+#size-cells = <0>;
+#chan-cells = <1>;
+
+fsl,tsa = <&tsa>;
+fsl,tsa-cell-id = <FSL_CPM_TSA_SCC4>;
+
+channel@16 {
+/* Ch16 : First 4 even TS from all routed from TSA */
+reg = <16>;
+fsl,mode = "transparent";
+fsl,reverse-data;
+fsl,tx-ts-mask = <0x00000000 0x000000aa>;
+fsl,rx-ts-mask = <0x00000000 0x000000aa>;
+};
+
+channel@17 {
+/* Ch17 : First 4 odd TS from all routed from TSA */
+reg = <17>;
+fsl,mode = "transparent";
+fsl,reverse-data;
+fsl,tx-ts-mask = <0x00000000 0x00000055>;
+fsl,rx-ts-mask = <0x00000000 0x00000055>;
+};
+
+channel@19 {
+/* Ch19 : 8 TS (TS 8..15) from all routed from TSA */
+reg = <19>;
+fsl,mode = "hdlc";
+fsl,tx-ts-mask = <0x00000000 0x0000ff00>;
+fsl,rx-ts-mask = <0x00000000 0x0000ff00>;
+};
+};
-- 
2.38.1



[PATCH v3 04/10] powerpc/8xx: Use a larger CPM1 command check mask

2023-01-13 Thread Herve Codina
The CPM1 command mask is defined for use with the standard
CPM1 command register as described in the user's manual:
  0  | 1  3 | 4  7 | 8  11 | 12  14 | 15
  RST|  -   |OPCODE| CH_NUM|   -    |FLG

In the QMC extension the CPM1 command register is redefined
(QMC supplement user's manual) with the following mapping:
  0  | 1        3 | 4  7 | 8           13 | 14 | 15
  RST| QMC OPCODE | 1110 | CHANNEL_NUMBER | -  |FLG

Extend the check command mask in order to support both the
standard CH_NUM field and the QMC extension CHANNEL_NUMBER
field.
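
(To spell out what the masks accept -- my illustration, not from the
patch:

	old: (command & 0xffffff0f) == 0  =>  only bits 7..4 may be set (4-bit CH_NUM)
	new: (command & 0xffffff03) == 0  =>  bits 7..2 may be set (6-bit CHANNEL_NUMBER)

i.e. the widened check lets the two extra QMC channel-number bits through
while still rejecting FLG, the reserved bits and anything in the upper
bytes.)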

Signed-off-by: Herve Codina 
Acked-by: Christophe Leroy 
---
 arch/powerpc/platforms/8xx/cpm1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/8xx/cpm1.c 
b/arch/powerpc/platforms/8xx/cpm1.c
index 8ef1f4392086..6b828b9f90d9 100644
--- a/arch/powerpc/platforms/8xx/cpm1.c
+++ b/arch/powerpc/platforms/8xx/cpm1.c
@@ -100,7 +100,7 @@ int cpm_command(u32 command, u8 opcode)
int i, ret;
unsigned long flags;
 
-   if (command & 0xffffff0f)
+   if (command & 0xffffff03)
return -EINVAL;
 
spin_lock_irqsave(_lock, flags);
-- 
2.38.1



[PATCH v3 03/10] MAINTAINERS: add the Freescale TSA controller entry

2023-01-13 Thread Herve Codina
After contributing the driver, add myself as the maintainer
for the Freescale TSA controller.

Signed-off-by: Herve Codina 
---
 MAINTAINERS | 9 +
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 7f86d02cb427..6a0605ebf8a0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8372,6 +8372,15 @@ S:   Maintained
 F: drivers/soc/fsl/qe/
 F: include/soc/fsl/qe/
 
+FREESCALE QUICC ENGINE TSA DRIVER
+M: Herve Codina 
+L: linuxppc-dev@lists.ozlabs.org
+S: Maintained
+F: Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,tsa.yaml
+F: drivers/soc/fsl/qe/tsa.c
+F: drivers/soc/fsl/qe/tsa.h
+F: include/dt-bindings/soc/fsl,tsa.h
+
 FREESCALE QUICC ENGINE UCC ETHERNET DRIVER
 M: Li Yang 
 L: net...@vger.kernel.org
-- 
2.38.1



[PATCH v3 01/10] dt-bindings: soc: fsl: cpm_qe: Add TSA controller

2023-01-13 Thread Herve Codina
Add support for the time slot assigner (TSA)
available in some PowerQUICC SoC such as MPC885
or MPC866.

Signed-off-by: Herve Codina 
---
 .../bindings/soc/fsl/cpm_qe/fsl,tsa.yaml  | 260 ++
 include/dt-bindings/soc/fsl,tsa.h |  13 +
 2 files changed, 273 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,tsa.yaml
 create mode 100644 include/dt-bindings/soc/fsl,tsa.h

diff --git a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,tsa.yaml 
b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,tsa.yaml
new file mode 100644
index ..eb17b6119abd
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,tsa.yaml
@@ -0,0 +1,260 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/soc/fsl/cpm_qe/fsl,tsa.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: PowerQUICC CPM Time-slot assigner (TSA) controller
+
+maintainers:
+  - Herve Codina 
+
+description: |
+  The TSA is the time-slot assigner that can be found on some
+  PowerQUICC SoC.
+  Its purpose is to route some TDM time-slots to other internal
+  serial controllers.
+
+properties:
+  compatible:
+items:
+  - enum:
+  - fsl,mpc885-tsa
+  - fsl,mpc866-tsa
+  - const: fsl,cpm1-tsa
+
+  reg:
+items:
+  - description: SI (Serial Interface) register base
+  - description: SI RAM base
+
+  reg-names:
+items:
+  - const: si_regs
+  - const: si_ram
+
+  '#address-cells':
+const: 1
+
+  '#size-cells':
+const: 0
+
+patternProperties:
+  '^tdm@[0-1]$':
+description:
+  The TDM managed by this controller
+type: object
+
+properties:
+  reg:
+minimum: 0
+maximum: 1
+description:
+  The TDM number for this TDM, 0 for TDMa and 1 for TDMb
+
+  fsl,common-rxtx-pins:
+$ref: /schemas/types.yaml#/definitions/flag
+description:
+  The hardware can use four dedicated pins for Tx clock,
+  Tx sync, Rx clock and Rx sync or use only two pins,
+  Tx/Rx clock and Tx/Rx sync.
+  Without the 'fsl,common-rxtx-pins' property, the four
+  pins are used. With the 'fsl,common-rxtx-pins' property,
+  two pins are used.
+
+  clocks:
+minItems: 2
+maxItems: 4
+
+  clock-names:
+minItems: 2
+maxItems: 4
+
+  fsl,mode:
+$ref: /schemas/types.yaml#/definitions/string
+enum: [normal, echo, internal-loopback, control-loopback]
+default: normal
+description: |
+  Operational mode:
+- normal:
+Normal operation
+- echo:
+Automatic echo. Rx data is resent on Tx
+- internal-loopback:
+The TDM transmitter is connected to the receiver.
+Data appears on Tx pin.
+- control-loopback:
+The TDM transmitter is connected to the receiver.
+The Tx pin is disconnected.
+
+  fsl,rx-frame-sync-delay-bits:
+enum: [0, 1, 2, 3]
+default: 0
+description: |
+  Receive frame sync delay in number of bits.
+  Indicates the delay between the Rx sync and the first bit of the
+  Rx frame. 0 for no bit delay. 1, 2 or 3 for 1, 2 or 3 bits delay.
+
+  fsl,tx-frame-sync-delay-bits:
+enum: [0, 1, 2, 3]
+default: 0
+description: |
+  Transmit frame sync delay in number of bits.
+  Indicates the delay between the Tx sync and the first bit of the
+  Tx frame. 0 for no bit delay. 1, 2 or 3 for 1, 2 or 3 bits delay.
+
+  fsl,clock-falling-edge:
+$ref: /schemas/types.yaml#/definitions/flag
+description: |
+  Data is sent on falling edge of the clock (and received on the
+  rising edge).
+  If 'clock-falling-edge' is not present, data is sent on the
+  rising edge (and received on the falling edge).
+
+  fsl,fsync-rising-edge:
+$ref: /schemas/types.yaml#/definitions/flag
+description:
+  Frame sync pulses are sampled with the rising edge of the channel
+  clock. If 'fsync-rising-edge' is not present, pulses are sampled
+  with the falling edge.
+
+  fsl,double-speed-clock:
+$ref: /schemas/types.yaml#/definitions/flag
+description:
+  The channel clock is twice the data rate.
+
+  fsl,tx-ts-routes:
+$ref: /schemas/types.yaml#/definitions/uint32-matrix
+description: |
+  A list of tuples that indicates the Tx time-slot routes.
+tx_ts_routes =
+   < 2 0 >, /* The first 2 time slots are not used */
+   < 3 1 >, /* The next 3 ones are route to SCC2 */
+   < 4 0 >, /* The next 4 ones are not used */
+   < 2 2 >; /* The nest 2 ones are route to SCC3 */
+

[PATCH v3 02/10] soc: fsl: cpm1: Add support for TSA

2023-01-13 Thread Herve Codina
The TSA (Time Slot Assigner) purpose is to route some
TDM time-slots to other internal serial controllers.

It is available in some PowerQUICC SoC such as the
MPC885 or MPC866.

It is also available on some QUICC Engine SoCs.
This current version supports CPM1 SoCs only and some
enhancements are needed to support QUICC Engine SoCs.

Signed-off-by: Herve Codina 
---
 drivers/soc/fsl/qe/Kconfig  |  11 +
 drivers/soc/fsl/qe/Makefile |   1 +
 drivers/soc/fsl/qe/tsa.c| 810 
 drivers/soc/fsl/qe/tsa.h|  43 ++
 4 files changed, 865 insertions(+)
 create mode 100644 drivers/soc/fsl/qe/tsa.c
 create mode 100644 drivers/soc/fsl/qe/tsa.h

diff --git a/drivers/soc/fsl/qe/Kconfig b/drivers/soc/fsl/qe/Kconfig
index 357c5800b112..60ec11c9f4d9 100644
--- a/drivers/soc/fsl/qe/Kconfig
+++ b/drivers/soc/fsl/qe/Kconfig
@@ -33,6 +33,17 @@ config UCC
bool
default y if UCC_FAST || UCC_SLOW
 
+config CPM_TSA
+   tristate "CPM TSA support"
+   depends on OF && HAS_IOMEM
+   depends on CPM1 || (PPC && COMPILE_TEST)
+   help
+ Freescale CPM Time Slot Assigner (TSA)
+ controller.
+
+ This option enables support for this
+ controller
+
 config QE_TDM
bool
default y if FSL_UCC_HDLC
diff --git a/drivers/soc/fsl/qe/Makefile b/drivers/soc/fsl/qe/Makefile
index 55a555304f3a..45c961acc81b 100644
--- a/drivers/soc/fsl/qe/Makefile
+++ b/drivers/soc/fsl/qe/Makefile
@@ -4,6 +4,7 @@
 #
 obj-$(CONFIG_QUICC_ENGINE)+= qe.o qe_common.o qe_ic.o qe_io.o
 obj-$(CONFIG_CPM)  += qe_common.o
+obj-$(CONFIG_CPM_TSA)  += tsa.o
 obj-$(CONFIG_UCC)  += ucc.o
 obj-$(CONFIG_UCC_SLOW) += ucc_slow.o
 obj-$(CONFIG_UCC_FAST) += ucc_fast.o
diff --git a/drivers/soc/fsl/qe/tsa.c b/drivers/soc/fsl/qe/tsa.c
new file mode 100644
index ..58b574cf37e2
--- /dev/null
+++ b/drivers/soc/fsl/qe/tsa.c
@@ -0,0 +1,810 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * TSA driver
+ *
+ * Copyright 2022 CS GROUP France
+ *
+ * Author: Herve Codina 
+ */
+
+#include "tsa.h"
+#include <dt-bindings/soc/fsl,tsa.h>
+#include <linux/clk.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_platform.h>
+#include <linux/platform_device.h>
+
+/* TSA SI RAM routing tables entry */
+#define TSA_SIRAM_ENTRY_LAST   (1 << 16)
+#define TSA_SIRAM_ENTRY_BYTE   (1 << 17)
+#define TSA_SIRAM_ENTRY_CNT(x) (((x) & 0x0f) << 18)
+#define TSA_SIRAM_ENTRY_CSEL_MASK  (0x7 << 22)
+#define TSA_SIRAM_ENTRY_CSEL_NU(0x0 << 22)
+#define TSA_SIRAM_ENTRY_CSEL_SCC2  (0x2 << 22)
+#define TSA_SIRAM_ENTRY_CSEL_SCC3  (0x3 << 22)
+#define TSA_SIRAM_ENTRY_CSEL_SCC4  (0x4 << 22)
+#define TSA_SIRAM_ENTRY_CSEL_SMC1  (0x5 << 22)
+#define TSA_SIRAM_ENTRY_CSEL_SMC2  (0x6 << 22)
+
+/* SI mode register (32 bits) */
+#define TSA_SIMODE 0x00
+#define   TSA_SIMODE_SMC2  0x80000000
+#define   TSA_SIMODE_SMC1  0x00008000
+#define   TSA_SIMODE_TDMA(x)   ((x) << 0)
+#define   TSA_SIMODE_TDMB(x)   ((x) << 16)
+#define TSA_SIMODE_TDM_MASK0x0fff
+#define TSA_SIMODE_TDM_SDM_MASK0x0c00
+#define   TSA_SIMODE_TDM_SDM_NORM  0x00000000
+#define   TSA_SIMODE_TDM_SDM_ECHO  0x0400
+#define   TSA_SIMODE_TDM_SDM_INTL_LOOP 0x0800
+#define   TSA_SIMODE_TDM_SDM_LOOP_CTRL 0x0c00
+#define TSA_SIMODE_TDM_RFSD(x) ((x) << 8)
+#define TSA_SIMODE_TDM_DSC 0x0080
+#define TSA_SIMODE_TDM_CRT 0x0040
+#define TSA_SIMODE_TDM_STZ 0x0020
+#define TSA_SIMODE_TDM_CE  0x0010
+#define TSA_SIMODE_TDM_FE  0x0008
+#define TSA_SIMODE_TDM_GM  0x0004
+#define TSA_SIMODE_TDM_TFSD(x) ((x) << 0)
+
+/* SI global mode register (8 bits) */
+#define TSA_SIGMR  0x04
+#define TSA_SIGMR_ENB  (1<<3)
+#define TSA_SIGMR_ENA  (1<<2)
+#define TSA_SIGMR_RDM_MASK 0x03
+#define   TSA_SIGMR_RDM_STATIC_TDMA0x00
+#define   TSA_SIGMR_RDM_DYN_TDMA   0x01
+#define   TSA_SIGMR_RDM_STATIC_TDMAB   0x02
+#define   TSA_SIGMR_RDM_DYN_TDMAB  0x03
+
+/* SI status register (8 bits) */
+#define TSA_SISTR  0x06
+
+/* SI command register (8 bits) */
+#define TSA_SICMR  0x07
+
+/* SI clock route register (32 bits) */
+#define TSA_SICR   0x0C
+#define   TSA_SICR_SCC2(x) ((x) << 8)
+#define   TSA_SICR_SCC3(x) ((x) << 16)
+#define   TSA_SICR_SCC4(x) ((x) << 24)
+#define TSA_SICR_SCC_MASK  0x0ff
+#define TSA_SICR_SCC_GRX   (1 << 7)
+#define TSA_SICR_SCC_SCX_TSA   (1 << 6)
+#define TSA_SICR_SCC_RXCS_MASK (0x7 << 3)
+#define   TSA_SICR_SCC_RXCS_BRG1   (0x0 << 3)
+#define   TSA_SICR_SCC_RXCS_BRG2   (0x1 << 3)
+#define   TSA_SICR_SCC_RXCS_BRG3   (0x2 << 3)
+#define   TSA_SICR_SCC_RXCS_BRG4   (0x3 << 3)
+#define   TSA_SICR_SCC_RXCS_CLK15  (0x4 

[PATCH v3 00/10] Add the PowerQUICC audio support using the QMC

2023-01-13 Thread Herve Codina
Hi,

This series adds support for audio using the QMC controller
available in some Freescale PowerQUICC SoCs.

This series contains three parts in order to show the different
blocks hierarchy and their usage in this support.

The first one is related to TSA (Time Slot Assigner).
The TSA handles the data present at the pin level (TDM with up
to 64 time slots) and dispatches them to one or more serial
controllers (SCC).

The second is related to QMC (QUICC Multichannel Controller).
The QMC handles the data at the serial controller (SCC) level
and splits the data again to create virtual channels.

The last one is related to the audio component (QMC audio).
It is the glue between the QMC controller and the ASoC
component. It handles one or more QMC virtual channels and
creates one DAI per QMC virtual channel handled.
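
A rough sketch of the resulting data path, as I read it (my drawing, not
part of the original cover letter):

  TDM pins (up to 64 TS) --> TSA --> SCC --> QMC (virtual channels)
  --> QMC audio (one DAI per virtual channel) --> ASoC card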

Compared to the v2 series, this v3 series mainly:
  - adds modification in the DT bindings,
  - uses generic io{read,write}be{16,32} for register
accesses instead of the specific PowerPC ones.
  - updates some commit subjects and logs (CPM1 SoCs supports).

Best regards,
Herve Codina

Changes v2 -> v3
  - All bindings
Rename fsl-tsa.h to fsl,tsa.h
Add missing vendor prefix
Various fixes (quotes, node names, upper/lower case)

  - patches 1 and 2 (TSA binding specific)
Remove 'reserved' values in the routing tables
Remove fsl,grant-mode
Add a better description for 'fsl,common-rxtx-pins'
Fix clocks/clocks-name handling against fsl,common-rxtx-pins
Add information related to the delays unit
Removed FSL_CPM_TSA_NBCELL
Fix license in binding header file fsl,tsa.h

  - patches 5 and 6 (QMC binding specific)
Remove fsl,cpm-command property
Add interrupt property constraint

  - patches 8 and 9 (QMC audio binding specific)
Remove 'items' in compatible property definition
Add missing 'dai-common.yaml' reference
Fix the qmc_chan phandle definition

  - patch 2 and 6
Use io{read,write}be{32,16}
Change commit subjects and logs

  - patch 4
Add 'Acked-by: Christophe Leroy '

Changes v1 -> v2:
  - patch 2 and 6
Fix kernel test robot errors

  - other patches
No changes

Herve Codina (10):
  dt-bindings: soc: fsl: cpm_qe: Add TSA controller
  soc: fsl: cpm1: Add support for TSA
  MAINTAINERS: add the Freescale TSA controller entry
  powerpc/8xx: Use a larger CPM1 command check mask
  dt-bindings: soc: fsl: cpm_qe: Add QMC controller
  soc: fsl: cpm1: Add support for QMC
  MAINTAINERS: add the Freescale QMC controller entry
  dt-bindings: sound: Add support for QMC audio
  ASoC: fsl: Add support for QMC audio
  MAINTAINERS: add the Freescale QMC audio entry

 .../bindings/soc/fsl/cpm_qe/fsl,qmc.yaml  |  164 ++
 .../bindings/soc/fsl/cpm_qe/fsl,tsa.yaml  |  260 +++
 .../bindings/sound/fsl,qmc-audio.yaml |  117 ++
 MAINTAINERS   |   25 +
 arch/powerpc/platforms/8xx/cpm1.c |2 +-
 drivers/soc/fsl/qe/Kconfig|   23 +
 drivers/soc/fsl/qe/Makefile   |2 +
 drivers/soc/fsl/qe/qmc.c  | 1531 +
 drivers/soc/fsl/qe/tsa.c  |  810 +
 drivers/soc/fsl/qe/tsa.h  |   43 +
 include/dt-bindings/soc/fsl,tsa.h |   13 +
 include/soc/fsl/qe/qmc.h  |   71 +
 sound/soc/fsl/Kconfig |9 +
 sound/soc/fsl/Makefile|2 +
 sound/soc/fsl/fsl_qmc_audio.c |  732 
 15 files changed, 3803 insertions(+), 1 deletion(-)
 create mode 100644 
Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,qmc.yaml
 create mode 100644 
Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,tsa.yaml
 create mode 100644 Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml
 create mode 100644 drivers/soc/fsl/qe/qmc.c
 create mode 100644 drivers/soc/fsl/qe/tsa.c
 create mode 100644 drivers/soc/fsl/qe/tsa.h
 create mode 100644 include/dt-bindings/soc/fsl,tsa.h
 create mode 100644 include/soc/fsl/qe/qmc.h
 create mode 100644 sound/soc/fsl/fsl_qmc_audio.c

-- 
2.38.1



Re: lockref scalability on x86-64 vs cpu_relax

2023-01-13 Thread Peter Zijlstra
On Thu, Jan 12, 2023 at 06:13:16PM -0600, Linus Torvalds wrote:
> On Thu, Jan 12, 2023 at 5:36 PM Mateusz Guzik  wrote:
> >
> > To my understanding on said architecture failed cmpxchg still grants you
> > exclusive access to the cacheline, making immediate retry preferable
> > when trying to inc/dec unless a certain value is found.
> 
> I actually suspect that is _always_ the case - this is not like a
> contended spinlock where we want to pause because we're waiting for
> the value to change and become unlocked, this cmpxchg loop is likely
> always better off just retrying with the new value.
> 
> That said, the "likely always better off" is purely about performance.
> 
> So I have this suspicion that the reason Tony added the cpu_relax()
> was simply not about performance, but about other issues, like
> fairness in SMT situations.
> 
> That said, evern from a fairness perspective the cpu_relax() sounds a
> bit odd and unlikely - we're literally yielding when we lost a race,
> so it hurts the _loser_, not the winner, and thus might make fairness
> worse too.

I've been writing cmpxchg loops that have strict termination conditions
without cpu_relax() in them for a while now.

For example, the x86 atomic_fetch_and() implementation looks like so:

static __always_inline int arch_atomic_fetch_and(int i, atomic_t *v)
{
int val = arch_atomic_read(v);

do { } while (!arch_atomic_try_cmpxchg(v, &val, val & i));

return val;
}

And I did that because of the exact same argument you had above, it
needs to do the op anyway, waiting between failed attempts will only
increase the chance it will fail again.
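
The same shape works for conditional updates too. A minimal sketch (not
actual kernel code, just the generic atomic API -- essentially
open-coding atomic_inc_not_zero()):

	/*
	 * Increment a refcount unless it is zero, retrying immediately on
	 * cmpxchg failure: no cpu_relax(), since atomic_try_cmpxchg()
	 * already reloaded the current value into 'old' for us.
	 */
	static bool get_unless_zero(atomic_t *v)
	{
		int old = atomic_read(v);

		do {
			if (!old)
				return false;
		} while (!atomic_try_cmpxchg(v, &old, old + 1));

		return true;
	}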


Re: lockref scalability on x86-64 vs cpu_relax

2023-01-13 Thread Will Deacon
On Fri, Jan 13, 2023 at 02:12:50AM +0100, Mateusz Guzik wrote:
> On 1/13/23, Linus Torvalds  wrote:
> > Side note on your access() changes - if it turns out that you can
> > remove all the cred games, we should possibly then revert my old
> > commit d7852fbd0f04 ("access: avoid the RCU grace period for the
> > temporary subjective credentials") which avoided the biggest issue
> > with the unnecessary cred switching.
> >
> > I *think* access() is the only user of that special 'non_rcu' thing,
> > but it is possible that the whole 'non_rcu' thing ends up mattering
> > for cases where the cred actually does change because euid != uid (ie
> > suid programs), so this would need a bit more effort to do performance
> > testing on.
> >
> 
> I don't think the games are avoidable. For one, I found non-root
> processes with non-empty cap_effective even on my laptop, though I did
> not check how often something like this is doing access().
> 
> Discussion for another time.
> 
> > On Thu, Jan 12, 2023 at 5:36 PM Mateusz Guzik  wrote:
> >> All that said, I think the thing to do here is to replace cpu_relax
> >> with a dedicated arch-dependent macro, akin to the following:
> >
> > I would actually prefer just removing it entirely and see if somebody
> > else hollers. You have the numbers to prove it hurts on real hardware,
> > and I don't think we have any numbers to the contrary.
> >
> > So I think it's better to trust the numbers and remove it as a
> > failure, than say "let's just remove it on x86-64 and leave everybody
> > else with the potentially broken code"
> >
> [snip]
> > Then other architectures can try to run their numbers, and only *if*
> > it then turns out that they have a reason to do something else should
> > we make this conditional and different on different architectures.
> >
> > Let's try to keep the code as common as possibly until we have hard
> > evidence for special cases, in other words.
> >
> 
> I did not want to make such a change without redoing the ThunderX2
> benchmark, or at least something else arm64-y. I may be able to bench it
> tomorrow on whatever arm-y stuff can be found on Amazon's EC2, assuming
> no arm64 people show up with their results.
> 
> Even then IMHO the safest route is to patch it out on x86-64 and give
> other people time to bench their archs as they get around to it, and
> ultimately whack the thing if it turns out nobody benefits from it.
> I would say that beats backpedaling on the removal, but I'm not going to
> fight for it.
> 
> That said, does waiting for arm64 numbers and/or producing them for the
> removal commit message sound like a plan? If so, I'll post soon(tm).

Honestly, I wouldn't worry about us (arm64) here. I don't think any real
hardware implements the YIELD instruction (i.e. it behaves as a NOP in
practice). The only place I'm aware of where it _does_ something is in
QEMU, which was actually the motivation behind having it in cpu_relax() to
start with (see 1baa82f48030 ("arm64: Implement cpu_relax as yield")).

So, from the arm64 side of the fence, I'm perfectly happy just removing
the cpu_relax() calls from lockref.

Will
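
(For context, the arch-dependent hook Mateusz floated upthread, whose
snippet was trimmed from the quote above, would amount to something like
the sketch below. The macro name is invented here for illustration; it
is not the actual proposal.)

#ifndef cmpxchg_loop_relax
/*
 * Default: a failed cmpxchg already owns the cacheline, so just
 * retry immediately.
 */
#define cmpxchg_loop_relax()	do { } while (0)
#endif

/*
 * An architecture where spinning burns shared SMT resources could
 * override it, e.g.:
 *
 * #define cmpxchg_loop_relax()	cpu_relax()
 */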


Re: [PATCH] kallsyms: Fix scheduling with interrupts disabled in self-test

2023-01-13 Thread Petr Mladek
On Thu 2023-01-12 10:24:43, Luis Chamberlain wrote:
> On Thu, Jan 12, 2023 at 08:54:26PM +1000, Nicholas Piggin wrote:
> > kallsyms_on_each* may schedule so must not be called with interrupts
> > disabled. The iteration function could disable interrupts, but this
> > also changes lookup_symbol() to match the change to the other timing
> > code.
> > 
> > Reported-by: Erhard F. 
> > Link: 
> > https://lore.kernel.org/all/bug-216902-206...@https.bugzilla.kernel.org%2F/
> > Reported-by: kernel test robot 
> > Link: 
> > https://lore.kernel.org/oe-lkp/202212251728.8d0872ff-oliver.s...@intel.com
> > Fixes: 30f3bb09778d ("kallsyms: Add self-test facility")
> > Signed-off-by: Nicholas Piggin 
> > ---
> 
> Thanks Nicholas!
> 
> Petr had just suggested removing this aspect of the selftests, the
> performance test, as it's specific to the config: it doesn't run many
> times to get an average, and odd things on a system can create
> different metrics. Zhen Lei had given up on fixing it and has a patch
> to instead remove this part of the selftest.
> 
> I still find value in keeping it, but Petr, would like your opinion on
> this fix, if we were to keep it.

I am fine with this fix.

It increases the risk of inaccuracy in the measured time: it would
also count time spent on unrelated interrupts and any eventual
rescheduling.

Anyway, it is safe at least. I was against the previous attempts to
fix this problem because they might have caused problems for
the rest of the system.

Best Regards,
Petr
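
(For reference, the interrupt-safe timing shape being discussed reduces
to something like the sketch below; the callback and helper names are
invented for illustration, and this is not the actual patch.)

/* Count symbols while timing the walk, without disabling interrupts. */
static int count_symbol_cb(void *data, const char *name, unsigned long addr)
{
	(*(unsigned long *)data)++;
	return 0;			/* non-zero would stop the walk */
}

static void time_kallsyms_walk(void)
{
	unsigned long count = 0;
	u64 t0, t1;

	t0 = ktime_get_ns();		/* no local_irq_disable() here */
	kallsyms_on_each_symbol(count_symbol_cb, &count);
	t1 = ktime_get_ns();

	/* Upper bound: includes time lost to IRQs and rescheduling. */
	pr_info("walked %lu symbols in %llu ns\n", count, t1 - t0);
}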