Re: [PATCH net-next 00/10] net: mdio: Continue separating C22 and C45
Hello: This series was applied to netdev/net-next.git (master) by Jakub Kicinski : On Thu, 12 Jan 2023 16:15:07 +0100 you wrote: > I've picked this older series from Andrew up and rebased it onto > the latest net-next. > > This is the second patch set in the series which separates the C22 > and C45 MDIO bus transactions at the API level to the MDIO bus drivers. > > Signed-off-by: Michael Walle > > [...] Here is the summary with links: - [net-next,01/10] net: mdio: cavium: Separate C22 and C45 transactions https://git.kernel.org/netdev/net-next/c/93641ecbaa1f - [net-next,02/10] net: mdio: i2c: Separate C22 and C45 transactions https://git.kernel.org/netdev/net-next/c/87e3bee0f247 - [net-next,03/10] net: mdio: mux-bcm-iproc: Separate C22 and C45 transactions https://git.kernel.org/netdev/net-next/c/d544a25930a7 - [net-next,04/10] net: mdio: aspeed: Separate C22 and C45 transactions https://git.kernel.org/netdev/net-next/c/c3c497eb8b24 - [net-next,05/10] net: mdio: ipq4019: Separate C22 and C45 transactions https://git.kernel.org/netdev/net-next/c/c58e39942adf - [net-next,06/10] net: ethernet: mtk_eth_soc: Separate C22 and C45 transactions https://git.kernel.org/netdev/net-next/c/900888374e73 - [net-next,07/10] net: lan743x: Separate C22 and C45 transactions https://git.kernel.org/netdev/net-next/c/3d90c03cb416 - [net-next,08/10] net: stmmac: Separate C22 and C45 transactions for xgmac2 https://git.kernel.org/netdev/net-next/c/5b0a447efff5 - [net-next,09/10] net: stmmac: Separate C22 and C45 transactions for xgmac https://git.kernel.org/netdev/net-next/c/3c7826d0b106 - [net-next,10/10] enetc: Separate C22 and C45 transactions https://git.kernel.org/netdev/net-next/c/80e87442e69b You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
Re: [PATCH net v5] net: wan: Add checks for NULL for utdm in undo_uhdlc_init and unmap_si_regs
Hello: This patch was applied to netdev/net.git (master) by Jakub Kicinski : On Thu, 12 Jan 2023 10:47:03 +0300 you wrote: > If uhdlc_priv_tsa != 1 then utdm is not initialized. > And if ret != NULL then goto undo_uhdlc_init, where > utdm is dereferenced. Same if dev == NULL. > > Found by Astra Linux on behalf of Linux Verification Center > (linuxtesting.org) with SVACE. > > [...] Here is the summary with links: - [net,v5] net: wan: Add checks for NULL for utdm in undo_uhdlc_init and unmap_si_regs https://git.kernel.org/netdev/net/c/488e0bf7f34a You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
Re: [PATCH] lockref: stop doing cpu_relax in the cmpxchg loop
On Fri, Jan 13, 2023 at 3:47 PM Luck, Tony wrote: > > The computer necrophiliacs at Debian and Gentoo seem determined > to keep ia64 alive. > > So perhaps this should s/cpu_relax/soemt_relax/ where soemt_relax > is a no-op everywhere except ia64, which can define it as cpu_relax. Heh. I already took your earlier "$ git rm -r arch/ia64" comment as an ack for not really caring about ia64. I suspect nobody will notice, and if ia64 is the only reason to do this, I really don't think it would be worth it. Linus
Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)
On Fri, 13 Jan 2023 at 22:06, John Paul Adrian Glaubitz wrote: > > Hello Ard! > > > Can I take that as an ack on [0]? The EFI subsystem has evolved > > substantially over the years, and there is really no way to do any > > IA64 testing beyond build testing, so from that perspective, dropping > > it entirely would be welcomed. > > ia64 is regularly tested in Debian and Gentoo [1][2]. > > Debian's ia64 porterbox yttrium runs a recent kernel without issues: > > root@yttrium:~# uname -a > Linux yttrium 5.19.0-2-mckinley #1 SMP Debian 5.19.11-1 (2022-09-24) ia64 > GNU/Linux > root@yttrium:~# > > root@yttrium:~# journalctl -b|head -n10 > Nov 14 14:46:10 yttrium kernel: Linux version 5.19.0-2-mckinley > (debian-ker...@lists.debian.org) (gcc-11 (Debian 11.3.0-6) 11.3.0, GNU ld > (GNU Binutils for Debian) 2.39) #1 SMP Debian 5.19.11-1 (2022-09-24) > Nov 14 14:46:10 yttrium kernel: efi: EFI v2.10 by HP > Nov 14 14:46:10 yttrium kernel: efi: SALsystab=0xdfdd63a18 ESI=0xdfdd63f18 > ACPI 2.0=0x3d3c4014 HCDP=0xd8798 SMBIOS=0x3d368000 > Nov 14 14:46:10 yttrium kernel: PCDP: v3 at 0xd8798 > Nov 14 14:46:10 yttrium kernel: earlycon: uart8250 at I/O port 0x4000 > (options '115200n8') > Nov 14 14:46:10 yttrium kernel: printk: bootconsole [uart8250] enabled > Nov 14 14:46:10 yttrium kernel: ACPI: Early table checksum verification > disabled > Nov 14 14:46:10 yttrium kernel: ACPI: RSDP 0x3D3C4014 24 (v02 HP > ) > Nov 14 14:46:10 yttrium kernel: ACPI: XSDT 0x3D3C4580 000124 (v01 HP >RX2800-2 0001 0113) > Nov 14 14:46:10 yttrium kernel: ACPI: FACP 0x3D3BE000 F4 (v03 HP >RX2800-2 0001 HP 0001) > root@yttrium:~# > > Same applies to the buildds: > > root@lifshitz:~# uname -a > Linux lifshitz 6.0.0-4-mckinley #1 SMP Debian 6.0.8-1 (2022-11-11) ia64 > GNU/Linux > root@lifshitz:~# > > root@lenz:~# uname -a > Linux lenz 6.0.0-4-mckinley #1 SMP Debian 6.0.8-1 (2022-11-11) ia64 GNU/Linux > root@lenz:~# > > EFI works fine as well using the latest version of GRUB2. > > Thanks, > Adrian > > > [1] https://cdimage.debian.org/cdimage/ports/snapshots/ > > [2] https://mirror.yandex.ru/gentoo-distfiles//releases/ia64/autobuilds/ Thanks for reporting back. I (mis)read the debian ports page [3], which mentions Debian 7 as the highest Debian version that supports IA64, and so I assumed that support had been dropped from Debian. However, if only a handful of people want to keep this port alive for reasons of nostalgia, it is obviously obsolete, and we should ask ourselves whether it is reasonable to expect Linux contributors to keep spending time on this. Does the Debian ia64 port have any users? Or is the system that builds the packages the only one that consumes them? [3] https://www.debian.org/ports/ia64/
Re: [PATCH] kallsyms: Fix scheduling with interrupts disabled in self-test
On Fri, Jan 13, 2023 at 09:44:35AM +0100, Petr Mladek wrote: > On Thu 2023-01-12 10:24:43, Luis Chamberlain wrote: > > On Thu, Jan 12, 2023 at 08:54:26PM +1000, Nicholas Piggin wrote: > > > kallsyms_on_each* may schedule so must not be called with interrupts > > > disabled. The iteration function could disable interrupts, but this > > > also changes lookup_symbol() to match the change to the other timing > > > code. > > > > > > Reported-by: Erhard F. > > > Link: > > > https://lore.kernel.org/all/bug-216902-206...@https.bugzilla.kernel.org%2F/ > > > Reported-by: kernel test robot > > > Link: > > > https://lore.kernel.org/oe-lkp/202212251728.8d0872ff-oliver.s...@intel.com > > > Fixes: 30f3bb09778d ("kallsyms: Add self-test facility") > > > Signed-off-by: Nicholas Piggin > > > --- > > > > Thanks Nicholas! > > > > Petr had just suggested removing this aspect of the selftests, the > > performance > > test as its specific to the config, it doesn't run many times to get an > > average and odd things on a system can create different metrics. Zhen Lei > > had given up on fixing it and has a patch to instead remove this part of > > the selftest. > > > > I still find value in keeping it, but Petr, would like your opinion on > > this fix, if we were to keep it. > > I am fine with this fix. Merged the fix. I'll push to Linus for 6.2-rc4 Luis
[PATCH] lockref: stop doing cpu_relax in the cmpxchg loop
On the x86-64 architecture even a failing cmpxchg grants exclusive access to the cacheline, making it preferable to retry the failed op immediately instead of stalling with the pause instruction. To illustrate the impact, below are benchmark results obtained by running various will-it-scale tests on top of the 6.2-rc3 kernel and Cascade Lake (2 sockets * 24 cores * 2 threads) CPU. All results in ops/s. Note there is some variance in re-runs, but the code is consistently faster when contention is present. open3 ("Same file open/close"): procstock no-pause 1 805603 814942 (+%1) 2 1054980 1054781 (-0%) 8 1544802 1822858 (+18%) 24 1191064 2199665 (+84%) 48 851582 1469860 (+72%) 96 609481 1427170 (+134%) fstat2 ("Same file fstat"): procstock no-pause 1 3013872 3047636 (+1%) 2 4284687 4400421 (+2%) 8 3257721 5530156 (+69%) 24 2239819 5466127 (+144%) 48 1701072 5256609 (+209%) 96 1269157 6649326 (+423%) Additionally, a kernel with a private patch to help access() scalability: access2 ("Same file access"): procstock patched patched+nopause 24 2378041 2005501 5370335 (-15% / +125%) That is, fixing the problems in access itself *reduces* scalability after the cacheline ping-pong only happens in lockref with the pause instruction. Note that fstat and access benchmarks are not currently integrated into will-it-scale, but interested parties can find them in pull requests to said project. Code at hand has a rather tortured history. First modification showed up in d472d9d98b463dd7 ("lockref: Relax in cmpxchg loop"), written with Itanium in mind. Later it got patched up to use an arch-dependent macro to stop doing it on s390 where it caused a significant regression. Said macro had undergone revisions and was ultimately eliminated later, going back to cpu_relax. While I intended to only remove cpu_relax for x86-64, I got the following comment from Linus: > I would actually prefer just removing it entirely and see if somebody > else hollers. You have the numbers to prove it hurts on real hardware, > and I don't think we have any numbers to the contrary. > So I think it's better to trust the numbers and remove it as a > failure, than say "let's just remove it on x86-64 and leave everybody >else with the potentially broken code" Additionally, Will Deacon (maintainer of the arm64 port, one of the architectures previously benchmarked): > So, from the arm64 side of the fence, I'm perfectly happy just removing > the cpu_relax() calls from lockref. As such, come back full circle in history and whack it altogether. Signed-off-by: Mateusz Guzik --- lib/lockref.c | 1 - 1 file changed, 1 deletion(-) diff --git a/lib/lockref.c b/lib/lockref.c index 45e93ece8ba0..2afe4c5d8919 100644 --- a/lib/lockref.c +++ b/lib/lockref.c @@ -23,7 +23,6 @@ } \ if (!--retry) \ break; \ - cpu_relax(); \ } \ } while (0) -- 2.39.0
Re: [PATCH v3 00/51] cpuidle,rcu: Clean up the mess
On Thu, Jan 12, 2023 at 08:43:14PM +0100, Peter Zijlstra wrote: > Hi All! > > The (hopefully) final respin of cpuidle vs rcu cleanup patches. Barring any > objections I'll be queueing these patches in tip/sched/core in the next few > days. > > v2: https://lkml.kernel.org/r/20220919095939.761690...@infradead.org > > These here patches clean up the mess that is cpuidle vs rcuidle. > > At the end of the ride there's only on RCU_NONIDLE user left: > > arch/arm64/kernel/suspend.c:RCU_NONIDLE(__cpu_suspend_exit()); > > And I know Mark has been prodding that with something sharp. > > The last version was tested by a number of people and I'm hoping to not have > broken anything in the meantime ;-) > > > Changes since v2: 150 rcutorture hours on each of the default scenarios passed. This is qemu/KVM on x86: Tested-by: Paul E. McKenney > - rebased to v6.2-rc3; as available at: > git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/idle > > - folded: > https://lkml.kernel.org/r/y3ubwyny15etu...@hirez.programming.kicks-ass.net >which makes the ARM cpuidle index 0 consistently not use >CPUIDLE_FLAG_RCU_IDLE, as requested by Ulf. > > - added a few more __always_inline to empty stub functions as found by the >robot. > > - Used _RET_IP_ instead of _THIS_IP_ in a few placed because of: >https://github.com/ClangBuiltLinux/linux/issues/263 > > - Added new patches to address various robot reports: > > #35: trace,hardirq: No moar _rcuidle() tracing > #47: cpuidle: Ensure ct_cpuidle_enter() is always called from > noinstr/__cpuidle > #48: cpuidle,arch: Mark all ct_cpuidle_enter() callers __cpuidle > #49: cpuidle,arch: Mark all regular cpuidle_state::enter methods > __cpuidle > #50: cpuidle: Comments about noinstr/__cpuidle > #51: context_tracking: Fix noinstr vs KASAN > > > --- > arch/alpha/kernel/process.c | 1 - > arch/alpha/kernel/vmlinux.lds.S | 1 - > arch/arc/kernel/process.c | 3 ++ > arch/arc/kernel/vmlinux.lds.S | 1 - > arch/arm/include/asm/vmlinux.lds.h| 1 - > arch/arm/kernel/cpuidle.c | 4 +- > arch/arm/kernel/process.c | 1 - > arch/arm/kernel/smp.c | 6 +-- > arch/arm/mach-davinci/cpuidle.c | 4 +- > arch/arm/mach-gemini/board-dt.c | 3 +- > arch/arm/mach-imx/cpuidle-imx5.c | 4 +- > arch/arm/mach-imx/cpuidle-imx6q.c | 8 ++-- > arch/arm/mach-imx/cpuidle-imx6sl.c| 4 +- > arch/arm/mach-imx/cpuidle-imx6sx.c| 9 ++-- > arch/arm/mach-imx/cpuidle-imx7ulp.c | 4 +- > arch/arm/mach-omap2/common.h | 6 ++- > arch/arm/mach-omap2/cpuidle34xx.c | 16 ++- > arch/arm/mach-omap2/cpuidle44xx.c | 29 +++-- > arch/arm/mach-omap2/omap-mpuss-lowpower.c | 12 +- > arch/arm/mach-omap2/pm.h | 2 +- > arch/arm/mach-omap2/pm24xx.c | 51 +- > arch/arm/mach-omap2/pm34xx.c | 14 +-- > arch/arm/mach-omap2/pm44xx.c | 2 +- > arch/arm/mach-omap2/powerdomain.c | 10 ++--- > arch/arm/mach-s3c/cpuidle-s3c64xx.c | 5 +-- > arch/arm64/kernel/cpuidle.c | 2 +- > arch/arm64/kernel/idle.c | 1 - > arch/arm64/kernel/smp.c | 4 +- > arch/arm64/kernel/vmlinux.lds.S | 1 - > arch/csky/kernel/process.c| 1 - > arch/csky/kernel/smp.c| 2 +- > arch/csky/kernel/vmlinux.lds.S| 1 - > arch/hexagon/kernel/process.c | 1 - > arch/hexagon/kernel/vmlinux.lds.S | 1 - > arch/ia64/kernel/process.c| 1 + > arch/ia64/kernel/vmlinux.lds.S| 1 - > arch/loongarch/kernel/idle.c | 1 + > arch/loongarch/kernel/vmlinux.lds.S | 1 - > arch/m68k/kernel/vmlinux-nommu.lds| 1 - > arch/m68k/kernel/vmlinux-std.lds | 1 - > arch/m68k/kernel/vmlinux-sun3.lds | 1 - > arch/microblaze/kernel/process.c | 1 - > arch/microblaze/kernel/vmlinux.lds.S | 1 - > arch/mips/kernel/idle.c | 14 +++ > arch/mips/kernel/vmlinux.lds.S| 1 - > arch/nios2/kernel/process.c | 1 - > arch/nios2/kernel/vmlinux.lds.S | 1 - > arch/openrisc/kernel/process.c| 1 + > arch/openrisc/kernel/vmlinux.lds.S| 1 - > arch/parisc/kernel/process.c | 2 - > arch/parisc/kernel/vmlinux.lds.S | 1 - > arch/powerpc/kernel/idle.c| 5 +-- > arch/powerpc/kernel/vmlinux.lds.S | 1 - > arch/riscv/kernel/process.c | 1 - > arch/riscv/kernel/vmlinux-xip.lds.S | 1 - > arch/riscv/kernel/vmlinux.lds.S | 1 - > arch/s390/kernel/idle.c | 1 - > arch/s390/kernel/vmlinux.lds.S|
[PATCH mm-unstable v1 00/26] mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap PTEs
This is the follow-up on [1]: [PATCH v2 0/8] mm: COW fixes part 3: reliable GUP R/W FOLL_GET of anonymous pages After we implemented __HAVE_ARCH_PTE_SWP_EXCLUSIVE on most prominent enterprise architectures, implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all remaining architectures that support swap PTEs. This makes sure that exclusive anonymous pages will stay exclusive, even after they were swapped out -- for example, making GUP R/W FOLL_GET of anonymous pages reliable. Details can be found in [1]. This primarily fixes remaining known O_DIRECT memory corruptions that can happen on concurrent swapout, whereby we can lose DMA reads to a page (modifying the user page by writing to it). To verify, there are two test cases (requiring swap space, obviously): (1) The O_DIRECT+swapout test case [2] from Andrea. This test case tries triggering a race condition. (2) My vmsplice() test case [3] that tries to detect if the exclusive marker was lost during swapout, not relying on a race condition. For example, on 32bit x86 (with and without PAE), my test case fails without these patches: $ ./test_swp_exclusive FAIL: page was replaced during COW But succeeds with these patches: $ ./test_swp_exclusive PASS: page was not replaced during COW Why implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE for all architectures, even the ones where swap support might be in a questionable state? This is the first step towards removing "readable_exclusive" migration entries, and instead using pte_swp_exclusive() also with (readable) migration entries instead (as suggested by Peter). The only missing piece for that is supporting pmd_swp_exclusive() on relevant architectures with THP migration support. As all relevant architectures now implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE,, we can drop __HAVE_ARCH_PTE_SWP_EXCLUSIVE in the last patch. I tried cross-compiling all relevant setups and tested on x86 and sparc64 so far. CCing arch maintainers only on this cover letter and on the respective patch(es). [1] https://lkml.kernel.org/r/20220329164329.208407-1-da...@redhat.com [2] https://gitlab.com/aarcange/kernel-testcases-for-v5.11/-/blob/main/page_count_do_wp_page-swap.c [3] https://gitlab.com/davidhildenbrand/scratchspace/-/blob/main/test_swp_exclusive.c RFC -> v1: * Some smaller comment+patch description changes * "powerpc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit book3s" -> Fixup swap PTE description David Hildenbrand (26): mm/debug_vm_pgtable: more pte_swp_exclusive() sanity checks alpha/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE arc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE arm/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE csky/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE hexagon/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE ia64/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE loongarch/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE m68k/mm: remove dummy __swp definitions for nommu m68k/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE microblaze/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE mips/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE nios2/mm: refactor swap PTE layout nios2/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE openrisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE parisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE powerpc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit book3s powerpc/nohash/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE riscv/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE sh/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 64bit um/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE x86/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also on 32bit xtensa/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE mm: remove __HAVE_ARCH_PTE_SWP_EXCLUSIVE arch/alpha/include/asm/pgtable.h | 40 - arch/arc/include/asm/pgtable-bits-arcv2.h | 26 +- arch/arm/include/asm/pgtable-2level.h | 3 + arch/arm/include/asm/pgtable-3level.h | 3 + arch/arm/include/asm/pgtable.h| 34 +-- arch/arm64/include/asm/pgtable.h | 1 - arch/csky/abiv1/inc/abi/pgtable-bits.h| 13 ++- arch/csky/abiv2/inc/abi/pgtable-bits.h| 19 ++-- arch/csky/include/asm/pgtable.h | 17 arch/hexagon/include/asm/pgtable.h| 36 ++-- arch/ia64/include/asm/pgtable.h | 31 ++- arch/loongarch/include/asm/pgtable-bits.h | 4 + arch/loongarch/include/asm/pgtable.h | 38 +++- arch/m68k/include/asm/mcf_pgtable.h | 35 +++- arch/m68k/include/asm/motorola_pgtable.h | 37 +++- arch/m68k/include/asm/pgtable_no.h| 6 -- arch/m68k/include/asm/sun3_pgtable.h | 38 +++- arch/microblaze/include/asm/pgtable.h | 44 +++--- arch/mips/include/asm/pgtable-32.h| 88 --- arch/mips/include/a
Re: lockref scalability on x86-64 vs cpu_relax
On 1/13/23, Linus Torvalds wrote: > Side note on your access() changes - if it turns out that you can > remove all the cred games, we should possibly then revert my old > commit d7852fbd0f04 ("access: avoid the RCU grace period for the > temporary subjective credentials") which avoided the biggest issue > with the unnecessary cred switching. > > I *think* access() is the only user of that special 'non_rcu' thing, > but it is possible that the whole 'non_rcu' thing ends up mattering > for cases where the cred actually does change because euid != uid (ie > suid programs), so this would need a bit more effort to do performance > testing on. > I don't think the games are avoidable. For one I found non-root processes with non-empty cap_effective even on my laptop, albeit I did not check how often something like this is doing access(). Discussion for another time. > On Thu, Jan 12, 2023 at 5:36 PM Mateusz Guzik wrote: >> All that said, I think the thing to do here is to replace cpu_relax >> with a dedicated arch-dependent macro, akin to the following: > > I would actually prefer just removing it entirely and see if somebody > else hollers. You have the numbers to prove it hurts on real hardware, > and I don't think we have any numbers to the contrary. > > So I think it's better to trust the numbers and remove it as a > failure, than say "let's just remove it on x86-64 and leave everybody > else with the potentially broken code" > [snip] > Then other architectures can try to run their numbers, and only *if* > it then turns out that they have a reason to do something else should > we make this conditional and different on different architectures. > > Let's try to keep the code as common as possibly until we have hard > evidence for special cases, in other words. > I did not want to make such a change without redoing the ThunderX2 benchmark, or at least something else arm64-y. I may be able to bench it tomorrow on whatever arm-y stuff can be found on Amazon's EC2, assuming no arm64 people show up with their results. Even then IMHO the safest route is to patch it out on x86-64 and give other people time to bench their archs as they get around to it, and ultimately whack the thing if it turns out nobody benefits from it. I would say beats backpedaling on the removal, but I'm not going to fight for it. That said, does waiting for arm64 numbers and/or producing them for the removal commit message sound like a plan? If so, I'll post soon(tm). -- Mateusz Guzik
Re: [PATCH] modpost: support arbitrary symbol length in modversion
On Wed, Jan 11, 2023 at 04:11:51PM +, Gary Guo wrote: Currently modversion uses a fixed size array of size (64 - sizeof(long)) to store symbol names, thus placing a hard limit on length of symbols. Rust symbols (which encodes crate and module names) can be quite a bit longer. The length limit in kallsyms is increased to 512 for this reason. It's a waste of space to simply expand the fixed array size to 512 in modversion info entries. I therefore make it variably sized, with offset to the next entry indicated by the initial "next" field. In addition to supporting longer-than-56/60 byte symbols, this patch also reduce the size for short symbols by getting rid of excessive 0 paddings. There are still some zero paddings to ensure "next" and "crc" fields are properly aligned. This patch does have a tiny drawback that it makes ".mod.c" files generated a bit less easy to read, as code like "\x08\x00\x00\x00\x78\x56\x34\x12" "symbol\0\0" is generated as opposed to { 0x12345678, "symbol" }, because the structure is now variable-length. But hopefully nobody reads the generated file :) Link: b8a94bfb3395 ("kallsyms: increase maximum kernel symbol length to 512") Link: https://github.com/Rust-for-Linux/linux/pull/379 Signed-off-by: Gary Guo --- arch/powerpc/kernel/module_64.c | 3 ++- include/linux/module.h | 6 -- kernel/module/version.c | 21 + scripts/export_report.pl| 9 + scripts/mod/modpost.c | 33 +++-- 5 files changed, 43 insertions(+), 29 deletions(-) diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c index ff045644f13f..eac23c11d579 100644 --- a/arch/powerpc/kernel/module_64.c +++ b/arch/powerpc/kernel/module_64.c @@ -236,10 +236,11 @@ static void dedotify_versions(struct modversion_info *vers, { struct modversion_info *end; - for (end = (void *)vers + size; vers < end; vers++) + for (end = (void *)vers + size; vers < end; vers = (void *)vers + vers->next) { if (vers->name[0] == '.') { memmove(vers->name, vers->name+1, strlen(vers->name)); } + } } /* diff --git a/include/linux/module.h b/include/linux/module.h index 8c5909c0076c..37cb25af9099 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -34,8 +34,10 @@ #define MODULE_NAME_LEN MAX_PARAM_PREFIX_LEN struct modversion_info { - unsigned long crc; - char name[MODULE_NAME_LEN]; + /* Offset of the next modversion entry in relation to this one. */ + u32 next; + u32 crc; + char name[0]; although not really exported as uapi, this will break userspace as this is used in the elf file generated for the modules. I think this change must be made in a backward compatible way and kmod updated to deal with the variable name length: kmod $ git grep "\[64" libkmod/libkmod-elf.c: char name[64 - sizeof(uint32_t)]; libkmod/libkmod-elf.c: char name[64 - sizeof(uint64_t)]; in kmod we have both 32 and 64 because a 64-bit kmod can read both 32 and 64 bit module, and vice versa. Lucas De Marchi }; struct module; diff --git a/kernel/module/version.c b/kernel/module/version.c index 53f43ac5a73e..af7478dcc158 100644 --- a/kernel/module/version.c +++ b/kernel/module/version.c @@ -17,32 +17,29 @@ int check_version(const struct load_info *info, { Elf_Shdr *sechdrs = info->sechdrs; unsigned int versindex = info->index.vers; - unsigned int i, num_versions; - struct modversion_info *versions; + struct modversion_info *versions, *end; + u32 crcval; /* Exporting module didn't supply crcs? OK, we're already tainted. */ if (!crc) return 1; + crcval = *crc; /* No versions at all? modprobe --force does this. */ if (versindex == 0) return try_to_force_load(mod, symname) == 0; versions = (void *)sechdrs[versindex].sh_addr; - num_versions = sechdrs[versindex].sh_size - / sizeof(struct modversion_info); + end = (void *)versions + sechdrs[versindex].sh_size; - for (i = 0; i < num_versions; i++) { - u32 crcval; - - if (strcmp(versions[i].name, symname) != 0) + for (; versions < end; versions = (void *)versions + versions->next) { + if (strcmp(versions->name, symname) != 0) continue; - crcval = *crc; - if (versions[i].crc == crcval) + if (versions->crc == crcval) return 1; - pr_debug("Found checksum %X vs module %lX\n", -crcval, versions[i].crc); + pr_debug("Found checksum %X vs module %X\n", +crcval, versions->crc); goto bad_version; } diff --git a/scripts/export_report.pl b/scripts/export_report.pl index
RE: [PATCH] lockref: stop doing cpu_relax in the cmpxchg loop
> diff --git a/lib/lockref.c b/lib/lockref.c > index 45e93ece8ba0..2afe4c5d8919 100644 > --- a/lib/lockref.c > +++ b/lib/lockref.c > @@ -23,7 +23,6 @@ > } > \ > if (!--retry) > \ > break; > \ > - cpu_relax(); > \ > } > \ > } while (0) The computer necrophiliacs at Debian and Gentoo seem determined to keep ia64 alive. So perhaps this should s/cpu_relax/soemt_relax/ where soemt_relax is a no-op everywhere except ia64, which can define it as cpu_relax. The ia64 case is quite painful if one thread on a core is spinning in that loop, while the other thread on the same core is the one that needs to run to update that value so that the cmpxchg can succeed. -Tony
Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)
Hello Ard! Can I take that as an ack on [0]? The EFI subsystem has evolved substantially over the years, and there is really no way to do any IA64 testing beyond build testing, so from that perspective, dropping it entirely would be welcomed. ia64 is regularly tested in Debian and Gentoo [1][2]. Debian's ia64 porterbox yttrium runs a recent kernel without issues: root@yttrium:~# uname -a Linux yttrium 5.19.0-2-mckinley #1 SMP Debian 5.19.11-1 (2022-09-24) ia64 GNU/Linux root@yttrium:~# root@yttrium:~# journalctl -b|head -n10 Nov 14 14:46:10 yttrium kernel: Linux version 5.19.0-2-mckinley (debian-ker...@lists.debian.org) (gcc-11 (Debian 11.3.0-6) 11.3.0, GNU ld (GNU Binutils for Debian) 2.39) #1 SMP Debian 5.19.11-1 (2022-09-24) Nov 14 14:46:10 yttrium kernel: efi: EFI v2.10 by HP Nov 14 14:46:10 yttrium kernel: efi: SALsystab=0xdfdd63a18 ESI=0xdfdd63f18 ACPI 2.0=0x3d3c4014 HCDP=0xd8798 SMBIOS=0x3d368000 Nov 14 14:46:10 yttrium kernel: PCDP: v3 at 0xd8798 Nov 14 14:46:10 yttrium kernel: earlycon: uart8250 at I/O port 0x4000 (options '115200n8') Nov 14 14:46:10 yttrium kernel: printk: bootconsole [uart8250] enabled Nov 14 14:46:10 yttrium kernel: ACPI: Early table checksum verification disabled Nov 14 14:46:10 yttrium kernel: ACPI: RSDP 0x3D3C4014 24 (v02 HP ) Nov 14 14:46:10 yttrium kernel: ACPI: XSDT 0x3D3C4580 000124 (v01 HP RX2800-2 0001 0113) Nov 14 14:46:10 yttrium kernel: ACPI: FACP 0x3D3BE000 F4 (v03 HP RX2800-2 0001 HP 0001) root@yttrium:~# Same applies to the buildds: root@lifshitz:~# uname -a Linux lifshitz 6.0.0-4-mckinley #1 SMP Debian 6.0.8-1 (2022-11-11) ia64 GNU/Linux root@lifshitz:~# root@lenz:~# uname -a Linux lenz 6.0.0-4-mckinley #1 SMP Debian 6.0.8-1 (2022-11-11) ia64 GNU/Linux root@lenz:~# EFI works fine as well using the latest version of GRUB2. Thanks, Adrian [1] https://cdimage.debian.org/cdimage/ports/snapshots/ [2] https://mirror.yandex.ru/gentoo-distfiles//releases/ia64/autobuilds/
Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)
On 13 Jan 2023, at 21:03, Luck, Tony wrote: > >> For what it's worth, Debian and Gentoo both have ia64 ports with active >> users (6.1 looks like it currently fails to build in Debian due to a >> minor packaging issue, but various versions of 6.0 were built and >> published, and one of those is running on the one ia64 Debian builder I >> personally have access to). > > Jess, > > So dropping ia64 from the upstream kernel won't just save time of kernel > developers. It will also save time for the folks keeping Debian and Gentoo > ports up and running. > > Are there people actually running production systems on ia64 that also > update to v6.x kernels? > > If so, why? Just scrap the machine and replace with almost anything else. > You'll cover the cost of the new machine in short order with the savings on > your power bill. Hobbyists, same as alpha, hppa, m68k, sh and any other such architectures that have no real use in this day and age. Jess
RE: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)
> For what it's worth, Debian and Gentoo both have ia64 ports with active > users (6.1 looks like it currently fails to build in Debian due to a > minor packaging issue, but various versions of 6.0 were built and > published, and one of those is running on the one ia64 Debian builder I > personally have access to). Jess, So dropping ia64 from the upstream kernel won't just save time of kernel developers. It will also save time for the folks keeping Debian and Gentoo ports up and running. Are there people actually running production systems on ia64 that also update to v6.x kernels? If so, why? Just scrap the machine and replace with almost anything else. You'll cover the cost of the new machine in short order with the savings on your power bill. -Tony
Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)
On Fri, Jan 13, 2023 at 08:55:41AM +0100, Ard Biesheuvel wrote: > On Fri, 13 Jan 2023 at 01:31, Luck, Tony wrote: > > > > > Yeah, if it was ia64-only, it's a non-issue these days. It's dead and > > > in pure maintenance mode from a kernel perspective (if even that). > > > > There's not much "simultaneous" in the SMT on ia64. One thread in a > > spin loop will hog the core until the h/w switches to the other thread some > > number of cycles (hundreds, thousands? I really can remember). So I > > was pretty generous with dropping cpu_relax() into any kind of spin loop. > > > > Is it time yet for: > > > > $ git rm -r arch/ia64 > > > > Hi Tony, > > Can I take that as an ack on [0]? The EFI subsystem has evolved > substantially over the years, and there is really no way to do any > IA64 testing beyond build testing, so from that perspective, dropping > it entirely would be welcomed. For what it's worth, Debian and Gentoo both have ia64 ports with active users (6.1 looks like it currently fails to build in Debian due to a minor packaging issue, but various versions of 6.0 were built and published, and one of those is running on the one ia64 Debian builder I personally have access to). Jess
Re: [PATCH] modpost: support arbitrary symbol length in modversion
On Thu, 12 Jan 2023 14:40:59 -0700 Lucas De Marchi wrote: > On Wed, Jan 11, 2023 at 04:11:51PM +, Gary Guo wrote: > > > > struct modversion_info { > >-unsigned long crc; > >-char name[MODULE_NAME_LEN]; > >+/* Offset of the next modversion entry in relation to this one. */ > >+u32 next; > >+u32 crc; > >+char name[0]; > > although not really exported as uapi, this will break userspace as this is > used in the elf file generated for the modules. I think > this change must be made in a backward compatible way and kmod updated > to deal with the variable name length: > > kmod $ git grep "\[64" > libkmod/libkmod-elf.c: char name[64 - sizeof(uint32_t)]; > libkmod/libkmod-elf.c: char name[64 - sizeof(uint64_t)]; > > in kmod we have both 32 and 64 because a 64-bit kmod can read both 32 > and 64 bit module, and vice versa. > Hi Lucas, Thanks for the information. The change can't be "truly" backward compatible, in a sense that regardless of the new format we choose, kmod would not be able to decode symbols longer than "64 - sizeof(long)" bytes. So the list it retrieves is going to be incomplete, isn't it? What kind of backward compatibility should be expected? It could be: * short symbols can still be found by old versions of kmod, but not long symbols; * or, no symbols are found by old versions of kmod, but it does not fail; * or, old versions of kmod would fail gracefully for not able to recognise the format of __versions section, but it didn't do anything crazy (e.g. decode it as old format). Also, do you think the current modversion format should stick forever or would we be able to migrate away from it eventually and fail old versions of modprobe given enough time? Best, Gary
Re: [PATCH mm-unstable v1 04/26] arm/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
On Fri, Jan 13, 2023 at 06:10:04PM +0100, David Hildenbrand wrote: > Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the > offset. This reduces the maximum swap space per file to 64 GiB (was 128 > GiB). > > While at it drop the PTE_TYPE_FAULT from __swp_entry_to_pte() which is > defined to be 0 and is rather confusing because we should be dealing > with "Linux PTEs" not "hardware PTEs". Also, properly mask the type in > __swp_entry(). > > Cc: Russell King > Signed-off-by: David Hildenbrand Reviewed-by: Russell King (Oracle) Thanks. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
[PATCH mm-unstable v1 24/26] x86/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also on 32bit
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE just like we already do on x86-64. After deciphering the PTE layout it becomes clear that there are still unused bits for 2-level and 3-level page tables that we should be able to use. Reusing a bit avoids stealing one bit from the swap offset. While at it, mask the type in __swp_entry(); use some helper definitions to make the macros easier to grasp. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: "H. Peter Anvin" Signed-off-by: David Hildenbrand --- arch/x86/include/asm/pgtable-2level.h | 26 +- arch/x86/include/asm/pgtable-3level.h | 26 +++--- arch/x86/include/asm/pgtable.h| 2 -- 3 files changed, 44 insertions(+), 10 deletions(-) diff --git a/arch/x86/include/asm/pgtable-2level.h b/arch/x86/include/asm/pgtable-2level.h index 60d0f9015317..e9482a11ac52 100644 --- a/arch/x86/include/asm/pgtable-2level.h +++ b/arch/x86/include/asm/pgtable-2level.h @@ -80,21 +80,37 @@ static inline unsigned long pte_bitop(unsigned long value, unsigned int rightshi return ((value >> rightshift) & mask) << leftshift; } -/* Encode and de-code a swap entry */ +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <- offset --> 0 E <- type --> 0 + * + * E is the exclusive marker that is not stored in swap entries. + */ #define SWP_TYPE_BITS 5 +#define _SWP_TYPE_MASK ((1U << SWP_TYPE_BITS) - 1) +#define _SWP_TYPE_SHIFT (_PAGE_BIT_PRESENT + 1) #define SWP_OFFSET_SHIFT (_PAGE_BIT_PROTNONE + 1) -#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS) +#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > 5) -#define __swp_type(x) (((x).val >> (_PAGE_BIT_PRESENT + 1)) \ -& ((1U << SWP_TYPE_BITS) - 1)) +#define __swp_type(x) (((x).val >> _SWP_TYPE_SHIFT) \ +& _SWP_TYPE_MASK) #define __swp_offset(x)((x).val >> SWP_OFFSET_SHIFT) #define __swp_entry(type, offset) ((swp_entry_t) { \ -((type) << (_PAGE_BIT_PRESENT + 1)) \ +(((type) & _SWP_TYPE_MASK) << _SWP_TYPE_SHIFT) \ | ((offset) << SWP_OFFSET_SHIFT) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { (pte).pte_low }) #define __swp_entry_to_pte(x) ((pte_t) { .pte = (x).val }) +/* We borrow bit 7 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE_PAGE_PSE + /* No inverted PFNs on 2 level page tables */ static inline u64 protnone_mask(u64 val) diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h index 967b135fa2c0..9e7c0b719c3c 100644 --- a/arch/x86/include/asm/pgtable-3level.h +++ b/arch/x86/include/asm/pgtable-3level.h @@ -145,8 +145,24 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma, } #endif -/* Encode and de-code a swap entry */ +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 + * 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 + * < type -> <-- offset -- + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * > 0 E 0 0 0 0 0 0 0 + * + * E is the exclusive marker that is not stored in swap entries. + */ #define SWP_TYPE_BITS 5 +#define _SWP_TYPE_MASK ((1U << SWP_TYPE_BITS) - 1) #define SWP_OFFSET_FIRST_BIT (_PAGE_BIT_PROTNONE + 1) @@ -154,9 +170,10 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma, #define SWP_OFFSET_SHIFT (SWP_OFFSET_FIRST_BIT + SWP_TYPE_BITS) #define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS) -#define __swp_type(x) (((x).val) & ((1UL << SWP_TYPE_BITS) - 1)) +#define __swp_type(x) (((x).val) & _SWP_TYPE_MASK) #define __swp_offset(x)((x).val >> SWP_TYPE_BITS) -#define __swp_entry(type, offset) ((swp_entry_t){(type) | (offset) << SWP_TYPE_BITS}) +#define __swp_entry(type, offset) ((swp_entry_t){((type) & _SWP_TYPE_MASK) \ + | (offset) << SWP_TYPE_BITS}) /* * Normally, __swp_entry() converts from arch-independent swp_entry_t to @@ -184,6 +201,9 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma, #define __
[PATCH mm-unstable v1 26/26] mm: remove __HAVE_ARCH_PTE_SWP_EXCLUSIVE
__HAVE_ARCH_PTE_SWP_EXCLUSIVE is now supported by all architectures that support swp PTEs, so let's drop it. Signed-off-by: David Hildenbrand --- arch/alpha/include/asm/pgtable.h | 1 - arch/arc/include/asm/pgtable-bits-arcv2.h| 1 - arch/arm/include/asm/pgtable.h | 1 - arch/arm64/include/asm/pgtable.h | 1 - arch/csky/include/asm/pgtable.h | 1 - arch/hexagon/include/asm/pgtable.h | 1 - arch/ia64/include/asm/pgtable.h | 1 - arch/loongarch/include/asm/pgtable.h | 1 - arch/m68k/include/asm/mcf_pgtable.h | 1 - arch/m68k/include/asm/motorola_pgtable.h | 1 - arch/m68k/include/asm/sun3_pgtable.h | 1 - arch/microblaze/include/asm/pgtable.h| 1 - arch/mips/include/asm/pgtable.h | 1 - arch/nios2/include/asm/pgtable.h | 1 - arch/openrisc/include/asm/pgtable.h | 1 - arch/parisc/include/asm/pgtable.h| 1 - arch/powerpc/include/asm/book3s/32/pgtable.h | 1 - arch/powerpc/include/asm/book3s/64/pgtable.h | 1 - arch/powerpc/include/asm/nohash/pgtable.h| 1 - arch/riscv/include/asm/pgtable.h | 1 - arch/s390/include/asm/pgtable.h | 1 - arch/sh/include/asm/pgtable_32.h | 1 - arch/sparc/include/asm/pgtable_32.h | 1 - arch/sparc/include/asm/pgtable_64.h | 1 - arch/um/include/asm/pgtable.h| 1 - arch/x86/include/asm/pgtable.h | 1 - arch/xtensa/include/asm/pgtable.h| 1 - include/linux/pgtable.h | 29 mm/debug_vm_pgtable.c| 2 -- mm/memory.c | 4 --- mm/rmap.c| 11 31 files changed, 73 deletions(-) diff --git a/arch/alpha/include/asm/pgtable.h b/arch/alpha/include/asm/pgtable.h index 970abf511b13..ba43cb841d19 100644 --- a/arch/alpha/include/asm/pgtable.h +++ b/arch/alpha/include/asm/pgtable.h @@ -328,7 +328,6 @@ extern inline pte_t mk_swap_pte(unsigned long type, unsigned long offset) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) -#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE static inline int pte_swp_exclusive(pte_t pte) { return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; diff --git a/arch/arc/include/asm/pgtable-bits-arcv2.h b/arch/arc/include/asm/pgtable-bits-arcv2.h index 611f412713b9..6e9f8ca6d6a1 100644 --- a/arch/arc/include/asm/pgtable-bits-arcv2.h +++ b/arch/arc/include/asm/pgtable-bits-arcv2.h @@ -132,7 +132,6 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) -#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE static inline int pte_swp_exclusive(pte_t pte) { return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h index 886c275995a2..2e626e6da9a3 100644 --- a/arch/arm/include/asm/pgtable.h +++ b/arch/arm/include/asm/pgtable.h @@ -298,7 +298,6 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(swp)__pte((swp).val) -#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE static inline int pte_swp_exclusive(pte_t pte) { return pte_isset(pte, L_PTE_SWP_EXCLUSIVE); diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index b4bbeed80fb6..e0c19f5e3413 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -417,7 +417,6 @@ static inline pgprot_t mk_pmd_sect_prot(pgprot_t prot) return __pgprot((pgprot_val(prot) & ~PMD_TABLE_BIT) | PMD_TYPE_SECT); } -#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE static inline pte_t pte_swp_mkexclusive(pte_t pte) { return set_pte_bit(pte, __pgprot(PTE_SWP_EXCLUSIVE)); diff --git a/arch/csky/include/asm/pgtable.h b/arch/csky/include/asm/pgtable.h index 574c97b9ecca..d4042495febc 100644 --- a/arch/csky/include/asm/pgtable.h +++ b/arch/csky/include/asm/pgtable.h @@ -200,7 +200,6 @@ static inline pte_t pte_mkyoung(pte_t pte) return pte; } -#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE static inline int pte_swp_exclusive(pte_t pte) { return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; diff --git a/arch/hexagon/include/asm/pgtable.h b/arch/hexagon/include/asm/pgtable.h index 7eb008e477c8..59393613d086 100644 --- a/arch/hexagon/include/asm/pgtable.h +++ b/arch/hexagon/include/asm/pgtable.h @@ -397,7 +397,6 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) (((type & 0x1f) << 1) | \ ((offset & 0x38) << 10) | ((offset & 0x7) << 7)) }) -#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE static inline int pte_swp_exclusive(pte_t
[PATCH mm-unstable v1 25/26] xtensa/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 1. This bit should be safe to use for our usecase. Most importantly, we can still distinguish swap PTEs from PAGE_NONE PTEs (see pte_present()) and don't use one of the two reserved attribute masks (1101 and ). Attribute mask 1100 and 1110 now identify swap PTEs. While at it, remove SWP_TYPE_BITS (not really helpful as it's not used in the actual swap macros) and mask the type in __swp_entry(). Cc: Chris Zankel Cc: Max Filippov Signed-off-by: David Hildenbrand --- arch/xtensa/include/asm/pgtable.h | 32 ++- 1 file changed, 27 insertions(+), 5 deletions(-) diff --git a/arch/xtensa/include/asm/pgtable.h b/arch/xtensa/include/asm/pgtable.h index 5b5484d707b2..1025e2dc292b 100644 --- a/arch/xtensa/include/asm/pgtable.h +++ b/arch/xtensa/include/asm/pgtable.h @@ -96,7 +96,7 @@ * +- - - - - - - - - - - - - - - - - - - - -+ * (PAGE_NONE)|PPN| 0 | 00 | ADW | 01 | 11 | 11 | * +-+ - * swap | index | type | 01 | 11 | 00 | + * swap | index | type | 01 | 11 | e0 | * +-+ * * For T1050 hardware and earlier the layout differs for present and (PAGE_NONE) @@ -112,6 +112,7 @@ * RI ring (0=privileged, 1=user, 2 and 3 are unused) * CAcache attribute: 00 bypass, 01 writeback, 10 writethrough * (11 is invalid and used to mark pages that are not present) + * e exclusive marker in swap PTEs * w page is writable (hw) * x page is executable (hw) * index swap offset / PAGE_SIZE (bit 11-31: 21 bits -> 8 GB) @@ -158,6 +159,9 @@ #define _PAGE_DIRTY(1<<7) /* software: page dirty */ #define _PAGE_ACCESSED (1<<8) /* software: page accessed (read) */ +/* We borrow bit 1 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE(1<<1) + #ifdef CONFIG_MMU #define _PAGE_CHG_MASK(PAGE_MASK | _PAGE_ACCESSED | _PAGE_DIRTY) @@ -343,19 +347,37 @@ ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep) } /* - * Encode and decode a swap and file entry. + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). */ -#define SWP_TYPE_BITS 5 -#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS) +#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > 5) #define __swp_type(entry) (((entry).val >> 6) & 0x1f) #define __swp_offset(entry)((entry).val >> 11) #define __swp_entry(type,offs) \ - ((swp_entry_t){((type) << 6) | ((offs) << 11) | \ + ((swp_entry_t){(((type) & 0x1f) << 6) | ((offs) << 11) | \ _PAGE_CA_INVALID | _PAGE_USER}) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + #endif /* !defined (__ASSEMBLY__) */ -- 2.39.0
[PATCH mm-unstable v1 23/26] um/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 10, which is yet unused for swap PTEs. The pte_mkuptodate() is a bit weird in __pte_to_swp_entry() for a swap PTE ... but it only messes with bit 1 and 2 and there is a comment in set_pte(), so leave these bits alone. While at it, mask the type in __swp_entry(). Cc: Richard Weinberger Cc: Anton Ivanov Cc: Johannes Berg Signed-off-by: David Hildenbrand --- arch/um/include/asm/pgtable.h | 37 +-- 1 file changed, 35 insertions(+), 2 deletions(-) diff --git a/arch/um/include/asm/pgtable.h b/arch/um/include/asm/pgtable.h index 4e3052f2671a..cedc5fd451ce 100644 --- a/arch/um/include/asm/pgtable.h +++ b/arch/um/include/asm/pgtable.h @@ -21,6 +21,9 @@ #define _PAGE_PROTNONE 0x010 /* if the user mapped it with PROT_NONE; pte_present gives true */ +/* We borrow bit 10 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE0x400 + #ifdef CONFIG_3_LEVEL_PGTABLES #include #else @@ -288,16 +291,46 @@ extern pte_t *virt_to_pte(struct mm_struct *mm, unsigned long addr); #define update_mmu_cache(vma,address,ptep) do {} while (0) -/* Encode and de-code a swap entry */ +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <--- offset > E < type -> 0 0 0 1 0 + * + * E is the exclusive marker that is not stored in swap entries. + * _PAGE_NEWPAGE (bit 1) is always set to 1 in set_pte(). + */ #define __swp_type(x) (((x).val >> 5) & 0x1f) #define __swp_offset(x)((x).val >> 11) #define __swp_entry(type, offset) \ - ((swp_entry_t) { ((type) << 5) | ((offset) << 11) }) + ((swp_entry_t) { (((type) & 0x1f) << 5) | ((offset) << 11) }) #define __pte_to_swp_entry(pte) \ ((swp_entry_t) { pte_val(pte_mkuptodate(pte)) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_get_bits(pte, _PAGE_SWP_EXCLUSIVE); +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_set_bits(pte, _PAGE_SWP_EXCLUSIVE); + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_clear_bits(pte, _PAGE_SWP_EXCLUSIVE); + return pte; +} + /* Clear a kernel PTE and flush it from the TLB */ #define kpte_clear_flush(ptep, vaddr) \ do { \ -- 2.39.0
[PATCH mm-unstable v1 22/26] sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 64bit
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit was effectively unused. While at it, mask the type in __swp_entry(). Cc: "David S. Miller" Signed-off-by: David Hildenbrand --- arch/sparc/include/asm/pgtable_64.h | 38 ++--- 1 file changed, 35 insertions(+), 3 deletions(-) diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h index 3bc9736bddb1..a1658eebd036 100644 --- a/arch/sparc/include/asm/pgtable_64.h +++ b/arch/sparc/include/asm/pgtable_64.h @@ -187,6 +187,9 @@ bool kern_addr_valid(unsigned long addr); #define _PAGE_SZHUGE_4U_PAGE_SZ4MB_4U #define _PAGE_SZHUGE_4V_PAGE_SZ4MB_4V +/* We borrow bit 20 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE_AC(0x0010, UL) + #ifndef __ASSEMBLY__ pte_t mk_pte_io(unsigned long, pgprot_t, int, unsigned long); @@ -961,18 +964,47 @@ void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp, pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp); #endif -/* Encode and de-code a swap entry */ -#define __swp_type(entry) (((entry).val >> PAGE_SHIFT) & 0xffUL) +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 + * 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 + * <--- offset --- + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * > E <-- type ---> <--- zeroes > + */ +#define __swp_type(entry) (((entry).val >> PAGE_SHIFT) & 0x7fUL) #define __swp_offset(entry)((entry).val >> (PAGE_SHIFT + 8UL)) #define __swp_entry(type, offset) \ ( (swp_entry_t) \ { \ - (((long)(type) << PAGE_SHIFT) | \ + long)(type) & 0x7fUL) << PAGE_SHIFT) | \ ((long)(offset) << (PAGE_SHIFT + 8UL))) \ } ) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SWP_EXCLUSIVE); +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + return __pte(pte_val(pte) & ~_PAGE_SWP_EXCLUSIVE); +} + int page_in_phys_avail(unsigned long paddr); /* -- 2.39.0
[PATCH mm-unstable v1 21/26] sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by reusing the SRMMU_DIRTY bit as that seems to be safe to reuse inside a swap PTE. This avoids having to steal one bit from the swap offset. While at it, relocate the swap PTE layout documentation and use the same style now used for most other archs. Note that the old documentation was wrong: we use 20 bit for the offset and the reserved bits were 8 instead of 7 bits in the ascii art. Cc: "David S. Miller" Signed-off-by: David Hildenbrand --- arch/sparc/include/asm/pgtable_32.h | 27 ++- arch/sparc/include/asm/pgtsrmmu.h | 14 +++--- 2 files changed, 29 insertions(+), 12 deletions(-) diff --git a/arch/sparc/include/asm/pgtable_32.h b/arch/sparc/include/asm/pgtable_32.h index 5acc05b572e6..abf7a2601209 100644 --- a/arch/sparc/include/asm/pgtable_32.h +++ b/arch/sparc/include/asm/pgtable_32.h @@ -323,7 +323,16 @@ void srmmu_mapiorange(unsigned int bus, unsigned long xpa, unsigned long xva, unsigned int len); void srmmu_unmapiorange(unsigned long virt_addr, unsigned int len); -/* Encode and de-code a swap entry */ +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <-- offset ---> < type -> E 0 0 0 0 0 0 + */ static inline unsigned long __swp_type(swp_entry_t entry) { return (entry.val >> SRMMU_SWP_TYPE_SHIFT) & SRMMU_SWP_TYPE_MASK; @@ -344,6 +353,22 @@ static inline swp_entry_t __swp_entry(unsigned long type, unsigned long offset) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & SRMMU_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + return __pte(pte_val(pte) | SRMMU_SWP_EXCLUSIVE); +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + return __pte(pte_val(pte) & ~SRMMU_SWP_EXCLUSIVE); +} + static inline unsigned long __get_phys (unsigned long addr) { diff --git a/arch/sparc/include/asm/pgtsrmmu.h b/arch/sparc/include/asm/pgtsrmmu.h index 6067925972d9..18e68d43f036 100644 --- a/arch/sparc/include/asm/pgtsrmmu.h +++ b/arch/sparc/include/asm/pgtsrmmu.h @@ -53,21 +53,13 @@ #define SRMMU_CHG_MASK(0xff00 | SRMMU_REF | SRMMU_DIRTY) -/* SRMMU swap entry encoding - * - * We use 5 bits for the type and 19 for the offset. This gives us - * 32 swapfiles of 4GB each. Encoding looks like: - * - * ooot - * fedcba9876543210fedcba9876543210 - * - * The bottom 7 bits are reserved for protection and status bits, especially - * PRESENT. - */ +/* SRMMU swap entry encoding */ #define SRMMU_SWP_TYPE_MASK0x1f #define SRMMU_SWP_TYPE_SHIFT 7 #define SRMMU_SWP_OFF_MASK 0xf #define SRMMU_SWP_OFF_SHIFT(SRMMU_SWP_TYPE_SHIFT + 5) +/* We borrow bit 6 to store the exclusive marker in swap PTEs. */ +#define SRMMU_SWP_EXCLUSIVESRMMU_DIRTY /* Some day I will implement true fine grained access bits for * user pages because the SRMMU gives us the capabilities to -- 2.39.0
[PATCH mm-unstable v1 20/26] sh/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 6 in the PTE, reducing the swap type in the !CONFIG_X2TLB case to 5 bits. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. Interrestingly, the swap type in the !CONFIG_X2TLB case could currently overlap with the _PAGE_PRESENT bit, because there is a sneaky shift by 1 in __pte_to_swp_entry() and __swp_entry_to_pte(). Bit 0-7 in the architecture specific swap PTE would get shifted to bit 1-8 in the PTE. As generic MM uses 5 bits only, this didn't matter so far. While at it, mask the type in __swp_entry(). Cc: Yoshinori Sato Cc: Rich Felker Signed-off-by: David Hildenbrand --- arch/sh/include/asm/pgtable_32.h | 54 +--- 1 file changed, 42 insertions(+), 12 deletions(-) diff --git a/arch/sh/include/asm/pgtable_32.h b/arch/sh/include/asm/pgtable_32.h index d0240decacca..c34aa795a9d2 100644 --- a/arch/sh/include/asm/pgtable_32.h +++ b/arch/sh/include/asm/pgtable_32.h @@ -423,40 +423,70 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) #endif /* - * Encode and de-code a swap entry + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). * * Constraints: * _PAGE_PRESENT at bit 8 * _PAGE_PROTNONE at bit 9 * - * For the normal case, we encode the swap type into bits 0:7 and the - * swap offset into bits 10:30. For the 64-bit PTE case, we keep the - * preserved bits in the low 32-bits and use the upper 32 as the swap - * offset (along with a 5-bit type), following the same approach as x86 - * PAE. This keeps the logic quite simple. + * For the normal case, we encode the swap type and offset into the swap PTE + * such that bits 8 and 9 stay zero. For the 64-bit PTE case, we use the + * upper 32 for the swap offset and swap type, following the same approach as + * x86 PAE. This keeps the logic quite simple. * * As is evident by the Alpha code, if we ever get a 64-bit unsigned * long (swp_entry_t) to match up with the 64-bit PTEs, this all becomes * much cleaner.. - * - * NOTE: We should set ZEROs at the position of _PAGE_PRESENT - * and _PAGE_PROTNONE bits */ + #ifdef CONFIG_X2TLB +/* + * Format of swap PTEs: + * + * 6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 + * 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 + * <- offset --> < type -> + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <--- zeroes > E 0 0 0 0 0 0 + */ #define __swp_type(x) ((x).val & 0x1f) #define __swp_offset(x)((x).val >> 5) -#define __swp_entry(type, offset) ((swp_entry_t){ (type) | (offset) << 5}) +#define __swp_entry(type, offset) ((swp_entry_t){ ((type) & 0x1f) | (offset) << 5}) #define __pte_to_swp_entry(pte)((swp_entry_t){ (pte).pte_high }) #define __swp_entry_to_pte(x) ((pte_t){ 0, (x).val }) #else -#define __swp_type(x) ((x).val & 0xff) +/* + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <--- offset > 0 0 0 0 E < type -> 0 + * + * E is the exclusive marker that is not stored in swap entries. + */ +#define __swp_type(x) ((x).val & 0x1f) #define __swp_offset(x)((x).val >> 10) -#define __swp_entry(type, offset) ((swp_entry_t){(type) | (offset) <<10}) +#define __swp_entry(type, offset) ((swp_entry_t){((type) & 0x1f) | (offset) << 10}) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) >> 1 }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val << 1 }) #endif +/* In both cases, we borrow bit 6 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE_PAGE_USER + +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte.pte_low & _PAGE_SWP_EXCLUSIVE; +} + +PTE_BIT_FUNC(low, swp_mkexclusive, |= _PAGE_SWP_EXCLUSIVE); +PTE_BIT_FUNC(low, swp_clear_exclusive, &= ~_PAGE_SWP_EXCLUSIVE); + #endif /* __ASSEMBLY__ */ #endif /* __ASM_SH_PGTABLE_32_H */ -- 2.39.0
[PATCH mm-unstable v1 19/26] riscv/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the offset. This reduces the maximum swap space per file: on 32bit to 16 GiB (was 32 GiB). Note that this bit does not conflict with swap PMDs and could also be used in swap PMD context later. While at it, mask the type in __swp_entry(). Cc: Paul Walmsley Cc: Palmer Dabbelt Cc: Albert Ou Signed-off-by: David Hildenbrand --- arch/riscv/include/asm/pgtable-bits.h | 3 +++ arch/riscv/include/asm/pgtable.h | 29 ++- 2 files changed, 27 insertions(+), 5 deletions(-) diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h index b9e13a8fe2b7..f896708e8331 100644 --- a/arch/riscv/include/asm/pgtable-bits.h +++ b/arch/riscv/include/asm/pgtable-bits.h @@ -27,6 +27,9 @@ */ #define _PAGE_PROT_NONE _PAGE_GLOBAL +/* Used for swap PTEs only. */ +#define _PAGE_SWP_EXCLUSIVE _PAGE_ACCESSED + #define _PAGE_PFN_SHIFT 10 /* diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 4eba9a98d0e3..03a4728db039 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -724,16 +724,18 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma, #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ /* - * Encode and decode a swap entry + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). * * Format of swap PTE: * bit0: _PAGE_PRESENT (zero) * bit 1 to 3: _PAGE_LEAF (zero) * bit5: _PAGE_PROT_NONE (zero) - * bits 6 to 10: swap type - * bits 10 to XLEN-1: swap offset + * bit6: exclusive marker + * bits 7 to 11: swap type + * bits 11 to XLEN-1: swap offset */ -#define __SWP_TYPE_SHIFT 6 +#define __SWP_TYPE_SHIFT 7 #define __SWP_TYPE_BITS5 #define __SWP_TYPE_MASK((1UL << __SWP_TYPE_BITS) - 1) #define __SWP_OFFSET_SHIFT (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT) @@ -744,11 +746,28 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma, #define __swp_type(x) (((x).val >> __SWP_TYPE_SHIFT) & __SWP_TYPE_MASK) #define __swp_offset(x)((x).val >> __SWP_OFFSET_SHIFT) #define __swp_entry(type, offset) ((swp_entry_t) \ - { ((type) << __SWP_TYPE_SHIFT) | ((offset) << __SWP_OFFSET_SHIFT) }) + { (((type) & __SWP_TYPE_MASK) << __SWP_TYPE_SHIFT) | \ + ((offset) << __SWP_OFFSET_SHIFT) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SWP_EXCLUSIVE); +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + return __pte(pte_val(pte) & ~_PAGE_SWP_EXCLUSIVE); +} + #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION #define __pmd_to_swp_entry(pmd) ((swp_entry_t) { pmd_val(pmd) }) #define __swp_entry_to_pmd(swp) __pmd((swp).val) -- 2.39.0
[PATCH mm-unstable v1 18/26] powerpc/nohash/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit and 64bit. On 64bit, let's use MSB 56 (LSB 7), located right next to the page type. On 32bit, let's use LSB 2 to avoid stealing one bit from the swap offset. There seems to be no real reason why these bits cannot be used for swap PTEs. The important part is that _PAGE_PRESENT and _PAGE_HASHPTE remain 0. While at it, mask the type in __swp_entry() and remove _PAGE_BIT_SWAP_TYPE from pte-e500.h: while it was used in 64bit code it was ignored in 32bit code. Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Christophe Leroy Signed-off-by: David Hildenbrand --- arch/powerpc/include/asm/nohash/32/pgtable.h | 22 + arch/powerpc/include/asm/nohash/32/pte-40x.h | 6 ++--- arch/powerpc/include/asm/nohash/32/pte-44x.h | 18 -- arch/powerpc/include/asm/nohash/32/pte-85xx.h | 4 ++-- arch/powerpc/include/asm/nohash/64/pgtable.h | 24 --- arch/powerpc/include/asm/nohash/pgtable.h | 16 + arch/powerpc/include/asm/nohash/pte-e500.h| 1 - 7 files changed, 63 insertions(+), 28 deletions(-) diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h index 70edad44dff6..fec56d965f00 100644 --- a/arch/powerpc/include/asm/nohash/32/pgtable.h +++ b/arch/powerpc/include/asm/nohash/32/pgtable.h @@ -360,18 +360,30 @@ static inline int pte_young(pte_t pte) #endif #define pmd_page(pmd) pfn_to_page(pmd_pfn(pmd)) + /* - * Encode and decode a swap entry. - * Note that the bits we use in a PTE for representing a swap entry - * must not include the _PAGE_PRESENT bit. - * -- paulus + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs (32bit PTEs): + * + * 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 + * 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + * <-- offset ---> < type -> E 0 0 + * + * E is the exclusive marker that is not stored in swap entries. + * + * For 64bit PTEs, the offset is extended by 32bit. */ #define __swp_type(entry) ((entry).val & 0x1f) #define __swp_offset(entry)((entry).val >> 5) -#define __swp_entry(type, offset) ((swp_entry_t) { (type) | ((offset) << 5) }) +#define __swp_entry(type, offset) ((swp_entry_t) { ((type) & 0x1f) | ((offset) << 5) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) >> 3 }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val << 3 }) +/* We borrow LSB 2 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE0x04 + #endif /* !__ASSEMBLY__ */ #endif /* __ASM_POWERPC_NOHASH_32_PGTABLE_H */ diff --git a/arch/powerpc/include/asm/nohash/32/pte-40x.h b/arch/powerpc/include/asm/nohash/32/pte-40x.h index 2d3153cfc0d7..6fe46e754556 100644 --- a/arch/powerpc/include/asm/nohash/32/pte-40x.h +++ b/arch/powerpc/include/asm/nohash/32/pte-40x.h @@ -27,9 +27,9 @@ * of the 16 available. Bit 24-26 of the TLB are cleared in the TLB * miss handler. Bit 27 is PAGE_USER, thus selecting the correct * zone. - * - PRESENT *must* be in the bottom two bits because swap cache - * entries use the top 30 bits. Because 40x doesn't support SMP - * anyway, M is irrelevant so we borrow it for PAGE_PRESENT. Bit 30 + * - PRESENT *must* be in the bottom two bits because swap PTEs + * use the top 30 bits. Because 40x doesn't support SMP anyway, M is + * irrelevant so we borrow it for PAGE_PRESENT. Bit 30 * is cleared in the TLB miss handler before the TLB entry is loaded. * - All other bits of the PTE are loaded into TLBLO without * modification, leaving us only the bits 20, 21, 24, 25, 26, 30 for diff --git a/arch/powerpc/include/asm/nohash/32/pte-44x.h b/arch/powerpc/include/asm/nohash/32/pte-44x.h index 78bc304f750e..b7ed13cee137 100644 --- a/arch/powerpc/include/asm/nohash/32/pte-44x.h +++ b/arch/powerpc/include/asm/nohash/32/pte-44x.h @@ -56,20 +56,10 @@ * above bits. Note that the bit values are CPU specific, not architecture * specific. * - * The kernel PTE entry holds an arch-dependent swp_entry structure under - * certain situations. In other words, in such situations some portion of - * the PTE bits are used as a swp_entry. In the PPC implementation, the - * 3-24th LSB are shared with swp_entry, however the 0-2nd three LSB still - * hold protection values. That means the three protection bits are - * reserved for both PTE and SWAP entry at the most significant three - * LSBs. - * - * There are three protection bits available for SWAP entry: - * _PAGE_PRESENT - * _PAGE_HASHPTE (if HW has) - * - * So those three bits have to be inside of 0-2nd LSB of PTE. - * + * The kernel PTE entry can be an ordinary PTE mapping a page or a special swap + * PTE. In case of a swap PTE, LSB 2-24 are used to store information reg
[PATCH mm-unstable v1 17/26] powerpc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit book3s
We already implemented support for 64bit book3s in commit bff9beaa2e80 ("powerpc/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE for book3s") Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also in 32bit by reusing yet unused LSB 2 / MSB 29. There seems to be no real reason why that bit cannot be used, and reusing it avoids having to steal one bit from the swap offset. While at it, mask the type in __swp_entry(). Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Christophe Leroy Signed-off-by: David Hildenbrand --- arch/powerpc/include/asm/book3s/32/pgtable.h | 38 +--- 1 file changed, 33 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index 75823f39e042..0ecb3a58f23f 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -42,6 +42,9 @@ #define _PMD_PRESENT_MASK (PAGE_MASK) #define _PMD_BAD (~PAGE_MASK) +/* We borrow the _PAGE_USER bit to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE_PAGE_USER + /* And here we include common definitions */ #define _PAGE_KERNEL_RO0 @@ -363,17 +366,42 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma, #define pmd_page(pmd) pfn_to_page(pmd_pfn(pmd)) /* - * Encode and decode a swap entry. - * Note that the bits we use in a PTE for representing a swap entry - * must not include the _PAGE_PRESENT bit or the _PAGE_HASHPTE bit (if used). - * -- paulus + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs (32bit PTEs): + * + * 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 + * 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + * <- offset > < type -> E H P + * + * E is the exclusive marker that is not stored in swap entries. + * _PAGE_PRESENT (P) and __PAGE_HASHPTE (H) must be 0. + * + * For 64bit PTEs, the offset is extended by 32bit. */ #define __swp_type(entry) ((entry).val & 0x1f) #define __swp_offset(entry)((entry).val >> 5) -#define __swp_entry(type, offset) ((swp_entry_t) { (type) | ((offset) << 5) }) +#define __swp_entry(type, offset) ((swp_entry_t) { ((type) & 0x1f) | ((offset) << 5) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) >> 3 }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val << 3 }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SWP_EXCLUSIVE); +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + return __pte(pte_val(pte) & ~_PAGE_SWP_EXCLUSIVE); +} + /* Generic accessors to PTE bits */ static inline int pte_write(pte_t pte) { return !!(pte_val(pte) & _PAGE_RW);} static inline int pte_read(pte_t pte) { return 1; } -- 2.39.0
[PATCH mm-unstable v1 16/26] parisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using the yet-unused _PAGE_ACCESSED location in the swap PTE. Looking at pte_present() and pte_none() checks, there seems to be no actual reason why we cannot use it: we only have to make sure we're not using _PAGE_PRESENT. Reusing this bit avoids having to steal one bit from the swap offset. Cc: "James E.J. Bottomley" Cc: Helge Deller Signed-off-by: David Hildenbrand --- arch/parisc/include/asm/pgtable.h | 41 --- 1 file changed, 38 insertions(+), 3 deletions(-) diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h index ea357430aafe..3033bb88df34 100644 --- a/arch/parisc/include/asm/pgtable.h +++ b/arch/parisc/include/asm/pgtable.h @@ -218,6 +218,9 @@ extern void __update_cache(pte_t pte); #define _PAGE_KERNEL_RWX (_PAGE_KERNEL_EXEC | _PAGE_WRITE) #define _PAGE_KERNEL (_PAGE_KERNEL_RO | _PAGE_WRITE) +/* We borrow bit 23 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE_PAGE_ACCESSED + /* The pgd/pmd contains a ptr (in phys addr space); since all pgds/pmds * are page-aligned, we don't care about the PAGE_OFFSET bits, except * for a few meta-information bits, so we shift the address to be @@ -394,17 +397,49 @@ extern void paging_init (void); #define update_mmu_cache(vms,addr,ptep) __update_cache(*ptep) -/* Encode and de-code a swap entry */ - +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs (32bit): + * + * 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 + * 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + * < offset -> P E < type -> + * + * E is the exclusive marker that is not stored in swap entries. + * _PAGE_PRESENT (P) must be 0. + * + * For the 64bit version, the offset is extended by 32bit. + */ #define __swp_type(x) ((x).val & 0x1f) #define __swp_offset(x) ( (((x).val >> 6) & 0x7) | \ (((x).val >> 8) & ~0x7) ) -#define __swp_entry(type, offset) ((swp_entry_t) { (type) | \ +#define __swp_entry(type, offset) ((swp_entry_t) { \ + ((type) & 0x1f) | \ ((offset & 0x7) << 6) | \ ((offset & ~0x7) << 8) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep) { pte_t pte; -- 2.39.0
[PATCH mm-unstable v1 15/26] openrisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. While at it, mask the type in __swp_entry(). Cc: Stefan Kristiansson Cc: Stafford Horne Signed-off-by: David Hildenbrand --- arch/openrisc/include/asm/pgtable.h | 41 + 1 file changed, 36 insertions(+), 5 deletions(-) diff --git a/arch/openrisc/include/asm/pgtable.h b/arch/openrisc/include/asm/pgtable.h index 6477c17b3062..903b32d662ab 100644 --- a/arch/openrisc/include/asm/pgtable.h +++ b/arch/openrisc/include/asm/pgtable.h @@ -154,6 +154,9 @@ extern void paging_init(void); #define _KERNPG_TABLE \ (_PAGE_BASE | _PAGE_SRE | _PAGE_SWE | _PAGE_ACCESSED | _PAGE_DIRTY) +/* We borrow bit 11 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE_PAGE_U_SHARED + #define PAGE_NONE __pgprot(_PAGE_ALL) #define PAGE_READONLY __pgprot(_PAGE_ALL | _PAGE_URE | _PAGE_SRE) #define PAGE_READONLY_X __pgprot(_PAGE_ALL | _PAGE_URE | _PAGE_SRE | _PAGE_EXEC) @@ -385,16 +388,44 @@ static inline void update_mmu_cache(struct vm_area_struct *vma, /* __PHX__ FIXME, SWAP, this probably doesn't work */ -/* Encode and de-code a swap entry (must be !pte_none(e) && !pte_present(e)) */ -/* Since the PAGE_PRESENT bit is bit 4, we can use the bits above */ - -#define __swp_type(x) (((x).val >> 5) & 0x7f) +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <-- offset ---> E <- type --> 0 0 0 0 0 + * + * E is the exclusive marker that is not stored in swap entries. + * The zero'ed bits include _PAGE_PRESENT. + */ +#define __swp_type(x) (((x).val >> 5) & 0x3f) #define __swp_offset(x)((x).val >> 12) #define __swp_entry(type, offset) \ - ((swp_entry_t) { ((type) << 5) | ((offset) << 12) }) + ((swp_entry_t) { (((type) & 0x3f) << 5) | ((offset) << 12) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + typedef pte_t *pte_addr_t; #endif /* __ASSEMBLY__ */ -- 2.39.0
[PATCH mm-unstable v1 14/26] nios2/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using the yet-unused bit 31. Cc: Thomas Bogendoerfer Signed-off-by: David Hildenbrand --- arch/nios2/include/asm/pgtable-bits.h | 3 +++ arch/nios2/include/asm/pgtable.h | 22 +- 2 files changed, 24 insertions(+), 1 deletion(-) diff --git a/arch/nios2/include/asm/pgtable-bits.h b/arch/nios2/include/asm/pgtable-bits.h index bfddff383e89..724f9b08b1d1 100644 --- a/arch/nios2/include/asm/pgtable-bits.h +++ b/arch/nios2/include/asm/pgtable-bits.h @@ -31,4 +31,7 @@ #define _PAGE_ACCESSED (1<<26) /* page referenced */ #define _PAGE_DIRTY(1<<27) /* dirty page */ +/* We borrow bit 31 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE(1<<31) + #endif /* _ASM_NIOS2_PGTABLE_BITS_H */ diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2/include/asm/pgtable.h index d1e5c9eb4643..05999da01731 100644 --- a/arch/nios2/include/asm/pgtable.h +++ b/arch/nios2/include/asm/pgtable.h @@ -239,7 +239,9 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) * * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 - * 0 < type -> 0 0 0 0 0 0 <-- offset ---> + * E < type -> 0 0 0 0 0 0 <-- offset ---> + * + * E is the exclusive marker that is not stored in swap entries. * * Note that the offset field is always non-zero if the swap type is 0, thus * !pte_none() is always true. @@ -251,6 +253,24 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) #define __swp_entry_to_pte(swp)((pte_t) { (swp).val }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + extern void __init paging_init(void); extern void __init mmu_init(void); -- 2.39.0
[PATCH mm-unstable v1 13/26] nios2/mm: refactor swap PTE layout
nios2 disables swap for a good reason: it doesn't even provide sufficient type bits as required by core MM. However, swap entries are nowadays also used for other purposes (migration entries, PTE markers, HWPoison, ...), and accidential use could be problematic. Let's properly use 5 bits for the swap type and document the layout. Bits 26--31 should get ignored by hardware completely, so they can be used. Cc: Dinh Nguyen Signed-off-by: David Hildenbrand --- arch/nios2/include/asm/pgtable.h | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2/include/asm/pgtable.h index ab793bc517f5..d1e5c9eb4643 100644 --- a/arch/nios2/include/asm/pgtable.h +++ b/arch/nios2/include/asm/pgtable.h @@ -232,19 +232,21 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) __FILE__, __LINE__, pgd_val(e)) /* - * Encode and decode a swap entry (must be !pte_none(pte) && !pte_present(pte): + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). * - * 31 30 29 28 27 26 25 24 23 22 21 20 19 18 ... 1 0 - * 0 0 0 0 type. 0 0 0 0 0 0 offset. + * Format of swap PTEs: * - * This gives us up to 2**2 = 4 swap files and 2**20 * 4K = 4G per swap file. + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * 0 < type -> 0 0 0 0 0 0 <-- offset ---> * - * Note that the offset field is always non-zero, thus !pte_none(pte) is always - * true. + * Note that the offset field is always non-zero if the swap type is 0, thus + * !pte_none() is always true. */ -#define __swp_type(swp)(((swp).val >> 26) & 0x3) +#define __swp_type(swp)(((swp).val >> 26) & 0x1f) #define __swp_offset(swp) ((swp).val & 0xf) -#define __swp_entry(type, off) ((swp_entry_t) { (((type) & 0x3) << 26) \ +#define __swp_entry(type, off) ((swp_entry_t) { (((type) & 0x1f) << 26) \ | ((off) & 0xf) }) #define __swp_entry_to_pte(swp)((pte_t) { (swp).val }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) -- 2.39.0
[PATCH mm-unstable v1 12/26] mips/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE. On 64bit, steal one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. On 32bit we're able to locate unused bits. As the PTE layout for 32 bit is very confusing, document it a bit better. While at it, mask the type in __swp_entry()/mk_swap_pte(). Cc: Thomas Bogendoerfer Signed-off-by: David Hildenbrand --- arch/mips/include/asm/pgtable-32.h | 88 ++ arch/mips/include/asm/pgtable-64.h | 23 ++-- arch/mips/include/asm/pgtable.h| 36 3 files changed, 131 insertions(+), 16 deletions(-) diff --git a/arch/mips/include/asm/pgtable-32.h b/arch/mips/include/asm/pgtable-32.h index b40a0e69fccc..ba0016709a1a 100644 --- a/arch/mips/include/asm/pgtable-32.h +++ b/arch/mips/include/asm/pgtable-32.h @@ -191,49 +191,113 @@ static inline pte_t pfn_pte(unsigned long pfn, pgprot_t prot) #define pte_page(x)pfn_to_page(pte_pfn(x)) +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + */ #if defined(CONFIG_CPU_R3K_TLB) -/* Swap entries must have VALID bit cleared. */ +/* + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <--- offset > < type -> V G E 0 0 0 0 0 0 P + * + * E is the exclusive marker that is not stored in swap entries. + * _PAGE_PRESENT (P), _PAGE_VALID (V) and_PAGE_GLOBAL (G) have to remain + * unused. + */ #define __swp_type(x) (((x).val >> 10) & 0x1f) #define __swp_offset(x)((x).val >> 15) -#define __swp_entry(type,offset) ((swp_entry_t) { ((type) << 10) | ((offset) << 15) }) +#define __swp_entry(type, offset) ((swp_entry_t) { (((type) & 0x1f) << 10) | ((offset) << 15) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +/* We borrow bit 7 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE(1 << 7) + #else #if defined(CONFIG_XPA) -/* Swap entries must have VALID and GLOBAL bits cleared. */ +/* + * Format of swap PTEs: + * + * 6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 + * 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 + * 0 0 0 0 0 0 E P <-- zeroes ---> + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <- offset --> < type -> V G 0 0 + * + * E is the exclusive marker that is not stored in swap entries. + * _PAGE_PRESENT (P), _PAGE_VALID (V) and_PAGE_GLOBAL (G) have to remain + * unused. + */ #define __swp_type(x) (((x).val >> 4) & 0x1f) #define __swp_offset(x) ((x).val >> 9) -#define __swp_entry(type,offset) ((swp_entry_t) { ((type) << 4) | ((offset) << 9) }) +#define __swp_entry(type, offset) ((swp_entry_t) { (((type) & 0x1f) << 4) | ((offset) << 9) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { (pte).pte_high }) #define __swp_entry_to_pte(x) ((pte_t) { 0, (x).val }) +/* + * We borrow bit 57 (bit 25 in the low PTE) to store the exclusive marker in + * swap PTEs. + */ +#define _PAGE_SWP_EXCLUSIVE(1 << 25) + #elif defined(CONFIG_PHYS_ADDR_T_64BIT) && defined(CONFIG_CPU_MIPS32) -/* Swap entries must have VALID and GLOBAL bits cleared. */ +/* + * Format of swap PTEs: + * + * 6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 + * 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 + * <-- zeroes ---> E P 0 0 0 0 0 0 + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <--- offset > < type -> V G + * + * E is the exclusive marker that is not stored in swap entries. + * _PAGE_PRESENT (P), _PAGE_VALID (V) and_PAGE_GLOBAL (G) have to remain + * unused. + */ #define __swp_type(x) (((x).val >> 2) & 0x1f) #define __swp_offset(x) ((x).val >> 7) -#define __swp_entry(type, offset) ((swp_entry_t) { ((type) << 2) | ((offset) << 7) }) +#define __swp_entry(type, offset) ((swp_entry_t) { (((type) & 0x1f) << 2) | ((offset) << 7) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { (pte).pte_high }) #define __swp_entry_to_pte(x) ((pte_t) { 0, (x).val }) +/* + * We borrow bit 39 (bit 7 in the low PTE) to store the exclusive marker in swap + * PTEs. + */ +#define _PAGE_SWP_EXCLUSIVE(1 << 7) + #else /* - * Constraints: - * _PAGE_PRESENT at bit 0 - * _PAGE_MODIFIED at bit 4 - * _PAGE_GLOB
[PATCH mm-unstable v1 11/26] microblaze/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. The shift by 2 when converting between PTE and arch-specific swap entry makes the swap PTE layout a little bit harder to decipher. While at it, drop the comment from paulus---copy-and-paste leftover from powerpc where we actually have _PAGE_HASHPTE---and mask the type in __swp_entry_to_pte() as well. Cc: Michal Simek Signed-off-by: David Hildenbrand --- arch/m68k/include/asm/mcf_pgtable.h | 4 +-- arch/microblaze/include/asm/pgtable.h | 45 +-- 2 files changed, 37 insertions(+), 12 deletions(-) diff --git a/arch/m68k/include/asm/mcf_pgtable.h b/arch/m68k/include/asm/mcf_pgtable.h index 3f8f4d0e66dd..e573d7b649f7 100644 --- a/arch/m68k/include/asm/mcf_pgtable.h +++ b/arch/m68k/include/asm/mcf_pgtable.h @@ -46,8 +46,8 @@ #define _CACHEMASK040 (~0x060) #define _PAGE_GLOBAL0400x400 /* 68040 global bit, used for kva descs */ -/* We borrow bit 7 to store the exclusive marker in swap PTEs. */ -#define _PAGE_SWP_EXCLUSIVE0x080 +/* We borrow bit 24 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVECF_PAGE_NOCACHE /* * Externally used page protection values. diff --git a/arch/microblaze/include/asm/pgtable.h b/arch/microblaze/include/asm/pgtable.h index 42f5988e998b..7e3de54bf426 100644 --- a/arch/microblaze/include/asm/pgtable.h +++ b/arch/microblaze/include/asm/pgtable.h @@ -131,10 +131,10 @@ extern pte_t *va_to_pte(unsigned long address); * of the 16 available. Bit 24-26 of the TLB are cleared in the TLB * miss handler. Bit 27 is PAGE_USER, thus selecting the correct * zone. - * - PRESENT *must* be in the bottom two bits because swap cache - * entries use the top 30 bits. Because 4xx doesn't support SMP - * anyway, M is irrelevant so we borrow it for PAGE_PRESENT. Bit 30 - * is cleared in the TLB miss handler before the TLB entry is loaded. + * - PRESENT *must* be in the bottom two bits because swap PTEs use the top + * 30 bits. Because 4xx doesn't support SMP anyway, M is irrelevant so we + * borrow it for PAGE_PRESENT. Bit 30 is cleared in the TLB miss handler + * before the TLB entry is loaded. * - All other bits of the PTE are loaded into TLBLO without * * modification, leaving us only the bits 20, 21, 24, 25, 26, 30 for * software PTE bits. We actually use bits 21, 24, 25, and @@ -155,6 +155,9 @@ extern pte_t *va_to_pte(unsigned long address); #define _PAGE_ACCESSED 0x400 /* software: R: page referenced */ #define _PMD_PRESENT PAGE_MASK +/* We borrow bit 24 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE_PAGE_DIRTY + /* * Some bits are unused... */ @@ -393,18 +396,40 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) extern pgd_t swapper_pg_dir[PTRS_PER_PGD]; /* - * Encode and decode a swap entry. - * Note that the bits we use in a PTE for representing a swap entry - * must not include the _PAGE_PRESENT bit, or the _PAGE_HASHPTE bit - * (if used). -- paulus + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 + * 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + * <-- offset ---> E < type -> 0 0 + * + * E is the exclusive marker that is not stored in swap entries. */ -#define __swp_type(entry) ((entry).val & 0x3f) +#define __swp_type(entry) ((entry).val & 0x1f) #define __swp_offset(entry)((entry).val >> 6) #define __swp_entry(type, offset) \ - ((swp_entry_t) { (type) | ((offset) << 6) }) + ((swp_entry_t) { ((type) & 0x1f) | ((offset) << 6) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) >> 2 }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val << 2 }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + extern unsigned long iopa(unsigned long addr); /* Values for nocacheflag and cmode */ -- 2.39.0
[PATCH mm-unstable v1 10/26] m68k/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. While at it, make sure for sun3 that the valid bit never gets set by properly masking it off and mask the type in __swp_entry(). Cc: Geert Uytterhoeven Cc: Greg Ungerer Signed-off-by: David Hildenbrand --- arch/m68k/include/asm/mcf_pgtable.h | 36 -- arch/m68k/include/asm/motorola_pgtable.h | 38 +-- arch/m68k/include/asm/sun3_pgtable.h | 39 ++-- 3 files changed, 104 insertions(+), 9 deletions(-) diff --git a/arch/m68k/include/asm/mcf_pgtable.h b/arch/m68k/include/asm/mcf_pgtable.h index b619b22823f8..3f8f4d0e66dd 100644 --- a/arch/m68k/include/asm/mcf_pgtable.h +++ b/arch/m68k/include/asm/mcf_pgtable.h @@ -46,6 +46,9 @@ #define _CACHEMASK040 (~0x060) #define _PAGE_GLOBAL0400x400 /* 68040 global bit, used for kva descs */ +/* We borrow bit 7 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE0x080 + /* * Externally used page protection values. */ @@ -254,15 +257,42 @@ static inline pte_t pte_mkcache(pte_t pte) extern pgd_t kernel_pg_dir[PTRS_PER_PGD]; /* - * Encode and de-code a swap entry (must be !pte_none(e) && !pte_present(e)) + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <-- offset -> 0 0 0 E <-- type ---> + * + * E is the exclusive marker that is not stored in swap entries. */ -#define __swp_type(x) ((x).val & 0xFF) +#define __swp_type(x) ((x).val & 0x7f) #define __swp_offset(x)((x).val >> 11) -#define __swp_entry(typ, off) ((swp_entry_t) { (typ) | \ +#define __swp_entry(typ, off) ((swp_entry_t) { ((typ) & 0x7f) | \ (off << 11) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) (__pte((x).val)) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + #define pmd_pfn(pmd) (pmd_val(pmd) >> PAGE_SHIFT) #define pmd_page(pmd) (pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT)) diff --git a/arch/m68k/include/asm/motorola_pgtable.h b/arch/m68k/include/asm/motorola_pgtable.h index 562b54e09850..c1782563e793 100644 --- a/arch/m68k/include/asm/motorola_pgtable.h +++ b/arch/m68k/include/asm/motorola_pgtable.h @@ -41,6 +41,9 @@ #define _PAGE_PROTNONE 0x004 +/* We borrow bit 11 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE0x800 + #ifndef __ASSEMBLY__ /* This is the cache mode to be used for pages containing page descriptors for @@ -169,12 +172,41 @@ static inline pte_t pte_mkcache(pte_t pte) #define swapper_pg_dir kernel_pg_dir extern pgd_t kernel_pg_dir[128]; -/* Encode and de-code a swap entry (must be !pte_none(e) && !pte_present(e)) */ -#define __swp_type(x) (((x).val >> 4) & 0xff) +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <- offset > E <-- type ---> 0 0 0 0 + * + * E is the exclusive marker that is not stored in swap entries. + */ +#define __swp_type(x) (((x).val >> 4) & 0x7f) #define __swp_offset(x)((x).val >> 12) -#define __swp_entry(type, offset) ((swp_entry_t) { ((type) << 4) | ((offset) << 12) }) +#define __swp_entry(type, offset) ((swp_entry_t) { (((type) & 0x7f) << 4) | ((offset) << 12) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + #endif /* !__ASSEMBLY__ */ #endif /* _MOTOROLA_PGTABLE_H */ diff --git a/arch/m68k/include/asm/sun3_pgtable.h b/arch/m68k/include/asm/sun3_pgtable.h index 90d57e537eb1..dbfc9703b1
[PATCH mm-unstable v1 09/26] m68k/mm: remove dummy __swp definitions for nommu
The definitions are not required, let's remove them. Cc: Geert Uytterhoeven Cc: Greg Ungerer Signed-off-by: David Hildenbrand --- arch/m68k/include/asm/pgtable_no.h | 6 -- 1 file changed, 6 deletions(-) diff --git a/arch/m68k/include/asm/pgtable_no.h b/arch/m68k/include/asm/pgtable_no.h index fed58da3a6b6..fc044df52b96 100644 --- a/arch/m68k/include/asm/pgtable_no.h +++ b/arch/m68k/include/asm/pgtable_no.h @@ -31,12 +31,6 @@ extern void paging_init(void); #define swapper_pg_dir ((pgd_t *) 0) -#define __swp_type(x) (0) -#define __swp_offset(x)(0) -#define __swp_entry(typ,off) ((swp_entry_t) { ((typ) | ((off) << 7)) }) -#define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) -#define __swp_entry_to_pte(x) ((pte_t) { (x).val }) - /* * ZERO_PAGE is a global shared page that is always zero: used * for zero-mapped memory areas etc.. -- 2.39.0
[PATCH mm-unstable v1 08/26] loongarch/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. While at it, also mask the type in mk_swap_pte(). Note that this bit does not conflict with swap PMDs and could also be used in swap PMD context later. Cc: Huacai Chen Cc: WANG Xuerui Signed-off-by: David Hildenbrand --- arch/loongarch/include/asm/pgtable-bits.h | 4 +++ arch/loongarch/include/asm/pgtable.h | 39 --- 2 files changed, 39 insertions(+), 4 deletions(-) diff --git a/arch/loongarch/include/asm/pgtable-bits.h b/arch/loongarch/include/asm/pgtable-bits.h index 3d1e0a69975a..8b98d22a145b 100644 --- a/arch/loongarch/include/asm/pgtable-bits.h +++ b/arch/loongarch/include/asm/pgtable-bits.h @@ -20,6 +20,7 @@ #define_PAGE_SPECIAL_SHIFT 11 #define_PAGE_HGLOBAL_SHIFT 12 /* HGlobal is a PMD bit */ #define_PAGE_PFN_SHIFT 12 +#define_PAGE_SWP_EXCLUSIVE_SHIFT 23 #define_PAGE_PFN_END_SHIFT 48 #define_PAGE_NO_READ_SHIFT 61 #define_PAGE_NO_EXEC_SHIFT 62 @@ -33,6 +34,9 @@ #define _PAGE_PROTNONE (_ULCAST_(1) << _PAGE_PROTNONE_SHIFT) #define _PAGE_SPECIAL (_ULCAST_(1) << _PAGE_SPECIAL_SHIFT) +/* We borrow bit 23 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE(_ULCAST_(1) << _PAGE_SWP_EXCLUSIVE_SHIFT) + /* Used by TLB hardware (placed in EntryLo*) */ #define _PAGE_VALID(_ULCAST_(1) << _PAGE_VALID_SHIFT) #define _PAGE_DIRTY(_ULCAST_(1) << _PAGE_DIRTY_SHIFT) diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h index 7a34e900d8c1..c6b8fe7ac43c 100644 --- a/arch/loongarch/include/asm/pgtable.h +++ b/arch/loongarch/include/asm/pgtable.h @@ -249,13 +249,26 @@ extern void pud_init(void *addr); extern void pmd_init(void *addr); /* - * Non-present pages: high 40 bits are offset, next 8 bits type, - * low 16 bits zero. + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 + * 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 + * <--- offset --- + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * --> E <--- type ---> <-- zeroes --> + * + * E is the exclusive marker that is not stored in swap entries. + * The zero'ed bits include _PAGE_PRESENT and _PAGE_PROTNONE. */ static inline pte_t mk_swap_pte(unsigned long type, unsigned long offset) -{ pte_t pte; pte_val(pte) = (type << 16) | (offset << 24); return pte; } +{ pte_t pte; pte_val(pte) = ((type & 0x7f) << 16) | (offset << 24); return pte; } -#define __swp_type(x) (((x).val >> 16) & 0xff) +#define __swp_type(x) (((x).val >> 16) & 0x7f) #define __swp_offset(x)((x).val >> 24) #define __swp_entry(type, offset) ((swp_entry_t) { pte_val(mk_swap_pte((type), (offset))) }) #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) }) @@ -263,6 +276,24 @@ static inline pte_t mk_swap_pte(unsigned long type, unsigned long offset) #define __pmd_to_swp_entry(pmd) ((swp_entry_t) { pmd_val(pmd) }) #define __swp_entry_to_pmd(x) ((pmd_t) { (x).val | _PAGE_HUGE }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + extern void paging_init(void); #define pte_none(pte) (!(pte_val(pte) & ~_PAGE_GLOBAL)) -- 2.39.0
[PATCH mm-unstable v1 07/26] ia64/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. While at it, also mask the type in __swp_entry(). Signed-off-by: David Hildenbrand --- arch/ia64/include/asm/pgtable.h | 32 +--- 1 file changed, 29 insertions(+), 3 deletions(-) diff --git a/arch/ia64/include/asm/pgtable.h b/arch/ia64/include/asm/pgtable.h index 01517a5e6778..e4b8ab931399 100644 --- a/arch/ia64/include/asm/pgtable.h +++ b/arch/ia64/include/asm/pgtable.h @@ -58,6 +58,9 @@ #define _PAGE_ED (__IA64_UL(1) << 52)/* exception deferral */ #define _PAGE_PROTNONE (__IA64_UL(1) << 63) +/* We borrow bit 7 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE(1 << 7) + #define _PFN_MASK _PAGE_PPN_MASK /* Mask of bits which may be changed by pte_modify(); the odd bits are there for _PAGE_PROTNONE */ #define _PAGE_CHG_MASK (_PAGE_P | _PAGE_PROTNONE | _PAGE_PL_MASK | _PAGE_AR_MASK | _PAGE_ED) @@ -399,6 +402,9 @@ extern pgd_t swapper_pg_dir[PTRS_PER_PGD]; extern void paging_init (void); /* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * * Note: The macros below rely on the fact that MAX_SWAPFILES_SHIFT <= number of * bits in the swap-type field of the swap pte. It would be nice to * enforce that, but we can't easily include here. @@ -406,16 +412,36 @@ extern void paging_init (void); * * Format of swap pte: * bit 0 : present bit (must be zero) - * bits 1- 7: swap-type + * bits 1- 6: swap type + * bit 7 : exclusive marker * bits 8-62: swap offset * bit 63 : _PAGE_PROTNONE bit */ -#define __swp_type(entry) (((entry).val >> 1) & 0x7f) +#define __swp_type(entry) (((entry).val >> 1) & 0x3f) #define __swp_offset(entry)(((entry).val << 1) >> 9) -#define __swp_entry(type,offset) ((swp_entry_t) { ((type) << 1) | ((long) (offset) << 8) }) +#define __swp_entry(type, offset) ((swp_entry_t) { ((type & 0x3f) << 1) | \ +((long) (offset) << 8) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + /* * ZERO_PAGE is a global shared page that is always zero: used * for zero-mapped memory areas etc.. -- 2.39.0
[PATCH mm-unstable v1 06/26] hexagon/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the offset. This reduces the maximum swap space per file to 16 GiB (was 32 GiB). While at it, mask the type in __swp_entry(). Cc: Brian Cain Signed-off-by: David Hildenbrand --- arch/hexagon/include/asm/pgtable.h | 37 +- 1 file changed, 31 insertions(+), 6 deletions(-) diff --git a/arch/hexagon/include/asm/pgtable.h b/arch/hexagon/include/asm/pgtable.h index f7048c18b6f9..7eb008e477c8 100644 --- a/arch/hexagon/include/asm/pgtable.h +++ b/arch/hexagon/include/asm/pgtable.h @@ -61,6 +61,9 @@ extern unsigned long empty_zero_page; * So we'll put up with a bit of inefficiency for now... */ +/* We borrow bit 6 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE(1<<6) + /* * Top "FOURTH" level (pgd), which for the Hexagon VM is really * only the second from the bottom, pgd and pud both being collapsed. @@ -359,9 +362,12 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) #define ZERO_PAGE(vaddr) (virt_to_page(&empty_zero_page)) /* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * * Swap/file PTE definitions. If _PAGE_PRESENT is zero, the rest of the PTE is * interpreted as swap information. The remaining free bits are interpreted as - * swap type/offset tuple. Rather than have the TLB fill handler test + * listed below. Rather than have the TLB fill handler test * _PAGE_PRESENT, we're going to reserve the permissions bits and set them to * all zeros for swap entries, which speeds up the miss handler at the cost of * 3 bits of offset. That trade-off can be revisited if necessary, but Hexagon @@ -371,9 +377,10 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) * Format of swap PTE: * bit 0: Present (zero) * bits1-5:swap type (arch independent layer uses 5 bits max) - * bits6-9:bits 3:0 of offset + * bit 6: exclusive marker + * bits7-9:bits 2:0 of offset * bits10-12: effectively _PAGE_PROTNONE (all zero) - * bits13-31: bits 22:4 of swap offset + * bits13-31: bits 21:3 of swap offset * * The split offset makes some of the following macros a little gnarly, * but there's plenty of precedent for this sort of thing. @@ -383,11 +390,29 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) #define __swp_type(swp_pte)(((swp_pte).val >> 1) & 0x1f) #define __swp_offset(swp_pte) \ - swp_pte).val >> 6) & 0xf) | (((swp_pte).val >> 9) & 0x70)) + swp_pte).val >> 7) & 0x7) | (((swp_pte).val >> 10) & 0x38)) #define __swp_entry(type, offset) \ ((swp_entry_t) { \ - ((type << 1) | \ -((offset & 0x70) << 9) | ((offset & 0xf) << 6)) }) + (((type & 0x1f) << 1) | \ +((offset & 0x38) << 10) | ((offset & 0x7) << 7)) }) + +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} #endif -- 2.39.0
[PATCH mm-unstable v1 05/26] csky/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the offset. This reduces the maximum swap space per file to 16 GiB (was 32 GiB). We might actually be able to reuse one of the other software bits (_PAGE_READ / PAGE_WRITE) instead, because we only have to keep pte_present(), pte_none() and HW happy. For now, let's keep it simple because there might be something non-obvious. Cc: Guo Ren Signed-off-by: David Hildenbrand --- arch/csky/abiv1/inc/abi/pgtable-bits.h | 13 + arch/csky/abiv2/inc/abi/pgtable-bits.h | 19 --- arch/csky/include/asm/pgtable.h| 18 ++ 3 files changed, 39 insertions(+), 11 deletions(-) diff --git a/arch/csky/abiv1/inc/abi/pgtable-bits.h b/arch/csky/abiv1/inc/abi/pgtable-bits.h index 752c8b3f9194..ae7a2f76dd42 100644 --- a/arch/csky/abiv1/inc/abi/pgtable-bits.h +++ b/arch/csky/abiv1/inc/abi/pgtable-bits.h @@ -10,6 +10,9 @@ #define _PAGE_ACCESSED (1<<3) #define _PAGE_MODIFIED (1<<4) +/* We borrow bit 9 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE(1<<9) + /* implemented in hardware */ #define _PAGE_GLOBAL (1<<6) #define _PAGE_VALID(1<<7) @@ -26,7 +29,8 @@ #define _PAGE_PROT_NONE_PAGE_READ /* - * Encode and decode a swap entry + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). * * Format of swap PTE: * bit 0:_PAGE_PRESENT (zero) @@ -35,15 +39,16 @@ * bit 6:_PAGE_GLOBAL (zero) * bit 7:_PAGE_VALID (zero) * bit 8:swap type[4] - * bit 9 - 31:swap offset + * bit 9:exclusive marker + * bit10 - 31:swap offset */ #define __swp_type(x) x).val >> 2) & 0xf) | \ (((x).val >> 4) & 0x10)) -#define __swp_offset(x)((x).val >> 9) +#define __swp_offset(x)((x).val >> 10) #define __swp_entry(type, offset) ((swp_entry_t) { \ ((type & 0xf) << 2) | \ ((type & 0x10) << 4) | \ - ((offset) << 9)}) + ((offset) << 10)}) #define HAVE_ARCH_UNMAPPED_AREA diff --git a/arch/csky/abiv2/inc/abi/pgtable-bits.h b/arch/csky/abiv2/inc/abi/pgtable-bits.h index 7e7f389f546f..526152bd2156 100644 --- a/arch/csky/abiv2/inc/abi/pgtable-bits.h +++ b/arch/csky/abiv2/inc/abi/pgtable-bits.h @@ -10,6 +10,9 @@ #define _PAGE_PRESENT (1<<10) #define _PAGE_MODIFIED (1<<11) +/* We borrow bit 7 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE(1<<7) + /* implemented in hardware */ #define _PAGE_GLOBAL (1<<0) #define _PAGE_VALID(1<<1) @@ -26,23 +29,25 @@ #define _PAGE_PROT_NONE_PAGE_WRITE /* - * Encode and decode a swap entry + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). * * Format of swap PTE: * bit 0:_PAGE_GLOBAL (zero) * bit 1:_PAGE_VALID (zero) * bit 2 - 6:swap type - * bit 7 - 8:swap offset[0 - 1] + * bit 7:exclusive marker + * bit 8:swap offset[0] * bit 9:_PAGE_WRITE (zero) * bit 10:_PAGE_PRESENT (zero) - * bit11 - 31:swap offset[2 - 22] + * bit11 - 31:swap offset[1 - 21] */ #define __swp_type(x) (((x).val >> 2) & 0x1f) -#define __swp_offset(x)x).val >> 7) & 0x3) | \ - (((x).val >> 9) & 0x7c)) +#define __swp_offset(x)x).val >> 8) & 0x1) | \ + (((x).val >> 10) & 0x3e)) #define __swp_entry(type, offset) ((swp_entry_t) { \ ((type & 0x1f) << 2) | \ - ((offset & 0x3) << 7) | \ - ((offset & 0x7c) << 9)}) + ((offset & 0x1) << 8) | \ + ((offset & 0x3e) << 10)}) #endif /* __ASM_CSKY_PGTABLE_BITS_H */ diff --git a/arch/csky/include/asm/pgtable.h b/arch/csky/include/asm/pgtable.h index 77bc6caff2d2..574c97b9ecca 100644 --- a/arch/csky/include/asm/pgtable.h +++ b/arch/csky/include/asm/pgtable.h @@ -200,6 +200,24 @@ static inline pte_t pte_mkyoung(pte_t pte) return pte; } +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pt
[PATCH mm-unstable v1 04/26] arm/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the offset. This reduces the maximum swap space per file to 64 GiB (was 128 GiB). While at it drop the PTE_TYPE_FAULT from __swp_entry_to_pte() which is defined to be 0 and is rather confusing because we should be dealing with "Linux PTEs" not "hardware PTEs". Also, properly mask the type in __swp_entry(). Cc: Russell King Signed-off-by: David Hildenbrand --- arch/arm/include/asm/pgtable-2level.h | 3 +++ arch/arm/include/asm/pgtable-3level.h | 3 +++ arch/arm/include/asm/pgtable.h| 35 +-- 3 files changed, 34 insertions(+), 7 deletions(-) diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h index 92abd4cd8ca2..ce543cd9380c 100644 --- a/arch/arm/include/asm/pgtable-2level.h +++ b/arch/arm/include/asm/pgtable-2level.h @@ -126,6 +126,9 @@ #define L_PTE_SHARED (_AT(pteval_t, 1) << 10)/* shared(v6), coherent(xsc3) */ #define L_PTE_NONE (_AT(pteval_t, 1) << 11) +/* We borrow bit 7 to store the exclusive marker in swap PTEs. */ +#define L_PTE_SWP_EXCLUSIVEL_PTE_RDONLY + /* * These are the memory types, defined to be compatible with * pre-ARMv6 CPUs cacheable and bufferable bits: n/a,n/a,C,B diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h index eabe72ff7381..106049791500 100644 --- a/arch/arm/include/asm/pgtable-3level.h +++ b/arch/arm/include/asm/pgtable-3level.h @@ -76,6 +76,9 @@ #define L_PTE_NONE (_AT(pteval_t, 1) << 57)/* PROT_NONE */ #define L_PTE_RDONLY (_AT(pteval_t, 1) << 58)/* READ ONLY */ +/* We borrow bit 7 to store the exclusive marker in swap PTEs. */ +#define L_PTE_SWP_EXCLUSIVE(_AT(pteval_t, 1) << 7) + #define L_PMD_SECT_VALID (_AT(pmdval_t, 1) << 0) #define L_PMD_SECT_DIRTY (_AT(pmdval_t, 1) << 55) #define L_PMD_SECT_NONE(_AT(pmdval_t, 1) << 57) diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h index f049072b2e85..886c275995a2 100644 --- a/arch/arm/include/asm/pgtable.h +++ b/arch/arm/include/asm/pgtable.h @@ -271,27 +271,48 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot) } /* - * Encode and decode a swap entry. Swap entries are stored in the Linux - * page tables as follows: + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: * * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 - * <--- offset > < type -> 0 0 + * <--- offset --> E < type -> 0 0 + * + * E is the exclusive marker that is not stored in swap entries. * - * This gives us up to 31 swap files and 128GB per swap file. Note that + * This gives us up to 31 swap files and 64GB per swap file. Note that * the offset field is always non-zero. */ #define __SWP_TYPE_SHIFT 2 #define __SWP_TYPE_BITS5 #define __SWP_TYPE_MASK((1 << __SWP_TYPE_BITS) - 1) -#define __SWP_OFFSET_SHIFT (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT) +#define __SWP_OFFSET_SHIFT (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT + 1) #define __swp_type(x) (((x).val >> __SWP_TYPE_SHIFT) & __SWP_TYPE_MASK) #define __swp_offset(x)((x).val >> __SWP_OFFSET_SHIFT) -#define __swp_entry(type,offset) ((swp_entry_t) { ((type) << __SWP_TYPE_SHIFT) | ((offset) << __SWP_OFFSET_SHIFT) }) +#define __swp_entry(type, offset) ((swp_entry_t) { (((type) & __SWP_TYPE_BITS) << __SWP_TYPE_SHIFT) | \ + ((offset) << __SWP_OFFSET_SHIFT) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) -#define __swp_entry_to_pte(swp)__pte((swp).val | PTE_TYPE_FAULT) +#define __swp_entry_to_pte(swp)__pte((swp).val) + +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_isset(pte, L_PTE_SWP_EXCLUSIVE); +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + return set_pte_bit(pte, __pgprot(L_PTE_SWP_EXCLUSIVE)); +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + return clear_pte_bit(pte, __pgprot(L_PTE_SWP_EXCLUSIVE)); +} /* * It is an error for the kernel to have more swap files than we can -- 2.39.0
[PATCH mm-unstable v1 03/26] arc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 5, which is yet unused. The only important parts seems to be to not use _PAGE_PRESENT (bit 9). Cc: Vineet Gupta Signed-off-by: David Hildenbrand --- arch/arc/include/asm/pgtable-bits-arcv2.h | 27 --- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/arch/arc/include/asm/pgtable-bits-arcv2.h b/arch/arc/include/asm/pgtable-bits-arcv2.h index 515e82db519f..611f412713b9 100644 --- a/arch/arc/include/asm/pgtable-bits-arcv2.h +++ b/arch/arc/include/asm/pgtable-bits-arcv2.h @@ -26,6 +26,9 @@ #define _PAGE_GLOBAL (1 << 8) /* ASID agnostic (H) */ #define _PAGE_PRESENT (1 << 9) /* PTE/TLB Valid (H) */ +/* We borrow bit 5 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE_PAGE_DIRTY + #ifdef CONFIG_ARC_MMU_V4 #define _PAGE_HW_SZ(1 << 10) /* Normal/super (H) */ #else @@ -106,9 +109,18 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr, void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *ptep); -/* Encode swap {type,off} tuple into PTE - * We reserve 13 bits for 5-bit @type, keeping bits 12-5 zero, ensuring that - * PAGE_PRESENT is zero in a PTE holding swap "identifier" +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <-- offset -> <--- zero --> E < type -> + * + * E is the exclusive marker that is not stored in swap entries. + * The zero'ed bits include _PAGE_PRESENT. */ #define __swp_entry(type, off) ((swp_entry_t) \ { ((type) & 0x1f) | ((off) << 13) }) @@ -120,6 +132,15 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +PTE_BIT_FUNC(swp_mkexclusive, |= (_PAGE_SWP_EXCLUSIVE)); +PTE_BIT_FUNC(swp_clear_exclusive, &= ~(_PAGE_SWP_EXCLUSIVE)); + #ifdef CONFIG_TRANSPARENT_HUGEPAGE #include #endif -- 2.39.0
[PATCH mm-unstable v1 02/26] alpha/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. While at it, mask the type in mk_swap_pte() as well. Cc: Richard Henderson Cc: Ivan Kokshaysky Cc: Matt Turner Signed-off-by: David Hildenbrand --- arch/alpha/include/asm/pgtable.h | 41 1 file changed, 37 insertions(+), 4 deletions(-) diff --git a/arch/alpha/include/asm/pgtable.h b/arch/alpha/include/asm/pgtable.h index 9e45f6735d5d..970abf511b13 100644 --- a/arch/alpha/include/asm/pgtable.h +++ b/arch/alpha/include/asm/pgtable.h @@ -74,6 +74,9 @@ struct vm_area_struct; #define _PAGE_DIRTY0x2 #define _PAGE_ACCESSED 0x4 +/* We borrow bit 39 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE0x80UL + /* * NOTE! The "accessed" bit isn't necessarily exact: it can be kept exactly * by software (use the KRE/URE/KWE/UWE bits appropriately), but I'll fake it. @@ -301,18 +304,48 @@ extern inline void update_mmu_cache(struct vm_area_struct * vma, } /* - * Non-present pages: high 24 bits are offset, next 8 bits type, - * low 32 bits zero. + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 + * 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 + * <--- offset --> E <--- type --> + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <--- zeroes --> + * + * E is the exclusive marker that is not stored in swap entries. */ extern inline pte_t mk_swap_pte(unsigned long type, unsigned long offset) -{ pte_t pte; pte_val(pte) = (type << 32) | (offset << 40); return pte; } +{ pte_t pte; pte_val(pte) = ((type & 0x7f) << 32) | (offset << 40); return pte; } -#define __swp_type(x) (((x).val >> 32) & 0xff) +#define __swp_type(x) (((x).val >> 32) & 0x7f) #define __swp_offset(x)((x).val >> 40) #define __swp_entry(type, off) ((swp_entry_t) { pte_val(mk_swap_pte((type), (off))) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + #define pte_ERROR(e) \ printk("%s:%d: bad pte %016lx.\n", __FILE__, __LINE__, pte_val(e)) #define pmd_ERROR(e) \ -- 2.39.0
[PATCH mm-unstable v1 01/26] mm/debug_vm_pgtable: more pte_swp_exclusive() sanity checks
We want to implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures. Let's extend our sanity checks, especially testing that our PTE bit does not affect: * is_swap_pte() -> pte_present() and pte_none() * the swap entry + type * pte_swp_soft_dirty() Especially, the pfn_pte() is dodgy when the swap PTE layout differs heavily from ordinary PTEs. Let's properly construct a swap PTE from swap type+offset. Signed-off-by: David Hildenbrand --- mm/debug_vm_pgtable.c | 23 ++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c index bb3328f46126..a0730beffd78 100644 --- a/mm/debug_vm_pgtable.c +++ b/mm/debug_vm_pgtable.c @@ -811,13 +811,34 @@ static void __init pmd_swap_soft_dirty_tests(struct pgtable_debug_args *args) { static void __init pte_swap_exclusive_tests(struct pgtable_debug_args *args) { #ifdef __HAVE_ARCH_PTE_SWP_EXCLUSIVE - pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot); + unsigned long max_swapfile_size = generic_max_swapfile_size(); + swp_entry_t entry, entry2; + pte_t pte; pr_debug("Validating PTE swap exclusive\n"); + + /* Create a swp entry with all possible bits set */ + entry = swp_entry((1 << MAX_SWAPFILES_SHIFT) - 1, + max_swapfile_size - 1); + + pte = swp_entry_to_pte(entry); + WARN_ON(pte_swp_exclusive(pte)); + WARN_ON(!is_swap_pte(pte)); + entry2 = pte_to_swp_entry(pte); + WARN_ON(memcmp(&entry, &entry2, sizeof(entry))); + pte = pte_swp_mkexclusive(pte); WARN_ON(!pte_swp_exclusive(pte)); + WARN_ON(!is_swap_pte(pte)); + WARN_ON(pte_swp_soft_dirty(pte)); + entry2 = pte_to_swp_entry(pte); + WARN_ON(memcmp(&entry, &entry2, sizeof(entry))); + pte = pte_swp_clear_exclusive(pte); WARN_ON(pte_swp_exclusive(pte)); + WARN_ON(!is_swap_pte(pte)); + entry2 = pte_to_swp_entry(pte); + WARN_ON(memcmp(&entry, &entry2, sizeof(entry))); #endif /* __HAVE_ARCH_PTE_SWP_EXCLUSIVE */ } -- 2.39.0
RE: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)
>> Is it time yet for: >> >> $ git rm -r arch/ia64 >> > Can I take that as an ack on [0]? The EFI subsystem has evolved > substantially over the years, and there is really no way to do any > IA64 testing beyond build testing, so from that perspective, dropping > it entirely would be welcomed. > > Thanks, > Ard. > > > > [0] > https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/commit/?h=remove-ia64 Yes. EFI isn't the only issue. A bunch of folks[1] have spent time fixing ia64 for (in most cases) some tree-wide patches that they needed. Their time might have been more productively spent fixing things that actually matter in modern times. Acked-by: Tony Luck -Tony [1] git log --no-merges --since=2year -- arch/ia64 | grep Author: | sort | uniq -c | sort -rn 19 Author: Masahiro Yamada 11 Author: Sergei Trofimovich 9 Author: Eric W. Biederman 8 Author: Arnd Bergmann 6 Author: Randy Dunlap 6 Author: Kefeng Wang 6 Author: Anshuman Khandual 5 Author: Masami Hiramatsu 5 Author: Al Viro 4 Author: Peter Zijlstra 4 Author: Mike Rapoport 4 Author: David Hildenbrand 4 Author: Christophe Leroy 3 Author: Yury Norov 3 Author: Michal Hocko 3 Author: Geert Uytterhoeven 3 Author: Bhaskar Chowdhury 3 Author: Baoquan He 3 Author: Ard Biesheuvel 3 Author: Aneesh Kumar K.V 2 Author: Yang Guang 2 Author: Will Deacon 2 Author: Viresh Kumar 2 Author: Valentin Schneider 2 Author: Stafford Horne 2 Author: Sebastian Andrzej Siewior 2 Author: Richard Guy Briggs 2 Author: Peter Xu 2 Author: Peter Collingbourne 2 Author: Mark Rutland 2 Author: Lukas Bulwahn 2 Author: Julia Lawall 2 Author: Jens Axboe 2 Author: Jason Wang 2 Author: Jan Kara 2 Author: Christoph Hellwig 2 Author: Bjorn Helgaas 2 Author: Alexander Lobakin 1 Author: Zi Yan 1 Author: Zhang Yunkai 1 Author: ye xingchen 1 Author: xu xin 1 Author: Wolfram Sang 1 Author: Weizhao Ouyang 1 Author: Suren Baghdasaryan 1 Author: Souptick Joarder (HPE) 1 Author: Sergey Shtylyov 1 Author: Sergei Trofimovich 1 Author: Sascha Hauer 1 Author: Samuel Holland 1 Author: Qi Zheng 1 Author: Peng Liu 1 Author: Naveen N. Rao 1 Author: Muchun Song 1 Author: Mikulas Patocka 1 Author: Mike Kravetz 1 Author: Mickaël Salaün 1 Author: Matthew Wilcox (Oracle) 1 Author: Martin Oliveira 1 Author: Luc Van Oostenryck 1 Author: Kees Cook 1 Author: Jason A. Donenfeld 1 Author: Ilpo Järvinen 1 Author: Haowen Bai 1 Author: Gustavo A. R. Silva 1 Author: Greg Kroah-Hartman 1 Author: Gaosheng Cui 1 Author: Dmitry Osipenko 1 Author: Dawei Li 1 Author: Chuck Lever 1 Author: Christian Brauner 1 Author: Chris Down 1 Author: Chen Li 1 Author: Catalin Marinas 1 Author: Benjamin Stürz 1 Author: Ben Dooks 1 Author: Baolin Wang 1 Author: Andy Shevchenko 1 Author: André Almeida
Re: [PATCH v3 07/13] tty: Convert ->dtr_rts() to take bool argument
On Wed, 11 Jan 2023 at 15:24, Ilpo Järvinen wrote: > > Convert the raise/on parameter in ->dtr_rts() to bool through the > callchain. The parameter is used like bool. In USB serial, there > remains a few implicit bool -> larger type conversions because some > devices use u8 in their control messages. > > In moxa_tiocmget(), dtr variable was reused for line status which > requires int so use a separate variable for status. > > Reviewed-by: Jiri Slaby > Signed-off-by: Ilpo Järvinen Acked-by: Ulf Hansson # For MMC Kind regards Uffe > --- > drivers/char/pcmcia/synclink_cs.c| 4 +-- > drivers/mmc/core/sdio_uart.c | 4 +-- > drivers/staging/greybus/uart.c | 2 +- > drivers/tty/amiserial.c | 2 +- > drivers/tty/hvc/hvc_console.c| 4 +-- > drivers/tty/hvc/hvc_console.h| 2 +- > drivers/tty/hvc/hvc_iucv.c | 4 +-- > drivers/tty/moxa.c | 54 ++-- > drivers/tty/mxser.c | 2 +- > drivers/tty/n_gsm.c | 2 +- > drivers/tty/serial/serial_core.c | 8 ++--- > drivers/tty/synclink_gt.c| 2 +- > drivers/tty/tty_port.c | 4 +-- > drivers/usb/class/cdc-acm.c | 2 +- > drivers/usb/serial/ch341.c | 2 +- > drivers/usb/serial/cp210x.c | 4 +-- > drivers/usb/serial/cypress_m8.c | 6 ++-- > drivers/usb/serial/digi_acceleport.c | 6 ++-- > drivers/usb/serial/f81232.c | 2 +- > drivers/usb/serial/f81534.c | 2 +- > drivers/usb/serial/ftdi_sio.c| 2 +- > drivers/usb/serial/ipw.c | 2 +- > drivers/usb/serial/keyspan.c | 2 +- > drivers/usb/serial/keyspan_pda.c | 2 +- > drivers/usb/serial/mct_u232.c| 4 +-- > drivers/usb/serial/mxuport.c | 2 +- > drivers/usb/serial/pl2303.c | 2 +- > drivers/usb/serial/quatech2.c| 2 +- > drivers/usb/serial/sierra.c | 2 +- > drivers/usb/serial/spcp8x5.c | 2 +- > drivers/usb/serial/ssu100.c | 2 +- > drivers/usb/serial/upd78f0730.c | 6 ++-- > drivers/usb/serial/usb-serial.c | 2 +- > drivers/usb/serial/usb-wwan.h| 2 +- > drivers/usb/serial/usb_wwan.c| 2 +- > drivers/usb/serial/xr_serial.c | 6 ++-- > include/linux/tty_port.h | 4 +-- > include/linux/usb/serial.h | 2 +- > 38 files changed, 84 insertions(+), 82 deletions(-) > > diff --git a/drivers/char/pcmcia/synclink_cs.c > b/drivers/char/pcmcia/synclink_cs.c > index 4391138e1b8a..46a0b586d234 100644 > --- a/drivers/char/pcmcia/synclink_cs.c > +++ b/drivers/char/pcmcia/synclink_cs.c > @@ -378,7 +378,7 @@ static void async_mode(MGSLPC_INFO *info); > static void tx_timeout(struct timer_list *t); > > static bool carrier_raised(struct tty_port *port); > -static void dtr_rts(struct tty_port *port, int onoff); > +static void dtr_rts(struct tty_port *port, bool onoff); > > #if SYNCLINK_GENERIC_HDLC > #define dev_to_port(D) (dev_to_hdlc(D)->priv) > @@ -2442,7 +2442,7 @@ static bool carrier_raised(struct tty_port *port) > return info->serial_signals & SerialSignal_DCD; > } > > -static void dtr_rts(struct tty_port *port, int onoff) > +static void dtr_rts(struct tty_port *port, bool onoff) > { > MGSLPC_INFO *info = container_of(port, MGSLPC_INFO, port); > unsigned long flags; > diff --git a/drivers/mmc/core/sdio_uart.c b/drivers/mmc/core/sdio_uart.c > index 47f58258d8ff..c6b4b2b2a4b2 100644 > --- a/drivers/mmc/core/sdio_uart.c > +++ b/drivers/mmc/core/sdio_uart.c > @@ -548,14 +548,14 @@ static bool uart_carrier_raised(struct tty_port *tport) > * adjusted during an open, close and hangup. > */ > > -static void uart_dtr_rts(struct tty_port *tport, int onoff) > +static void uart_dtr_rts(struct tty_port *tport, bool onoff) > { > struct sdio_uart_port *port = > container_of(tport, struct sdio_uart_port, port); > int ret = sdio_uart_claim_func(port); > if (ret) > return; > - if (onoff == 0) > + if (!onoff) > sdio_uart_clear_mctrl(port, TIOCM_DTR | TIOCM_RTS); > else > sdio_uart_set_mctrl(port, TIOCM_DTR | TIOCM_RTS); > diff --git a/drivers/staging/greybus/uart.c b/drivers/staging/greybus/uart.c > index 90ff07f2cbf7..92d49740d5a4 100644 > --- a/drivers/staging/greybus/uart.c > +++ b/drivers/staging/greybus/uart.c > @@ -701,7 +701,7 @@ static int gb_tty_ioctl(struct tty_struct *tty, unsigned > int cmd, > return -ENOIOCTLCMD; > } > > -static void gb_tty_dtr_rts(struct tty_port *port, int on) > +static void gb_tty_dtr_rts(struct tty_port *port, bool on) > { > struct gb_tty *gb_tty; > u8 newctrl; > diff --git a/drivers/tty/amiserial.c b/drivers/tty/amiserial.c > index 01c4fd3ce7c8..29d4c554f6b8 100644 > --- a/drivers/tty/amiserial.c > +++ b/drivers/tty/amiserial.c > @@ -1459,7 +1459,7 @@ sta
Re: [PATCH v3 11/13] tty/serial: Call ->dtr_rts() parameter active consistently
On Wed, 11 Jan 2023 at 15:24, Ilpo Järvinen wrote: > > Convert various parameter names for ->dtr_rts() and related functions > from onoff, on, and raise to active. > > Reviewed-by: Jiri Slaby > Signed-off-by: Ilpo Järvinen Acked-by: Ulf Hansson # For MMC Kind regards Uffe > --- > drivers/char/pcmcia/synclink_cs.c | 6 +++--- > drivers/mmc/core/sdio_uart.c | 6 +++--- > drivers/staging/greybus/uart.c| 4 ++-- > drivers/tty/amiserial.c | 4 ++-- > drivers/tty/hvc/hvc_console.h | 2 +- > drivers/tty/hvc/hvc_iucv.c| 6 +++--- > drivers/tty/mxser.c | 4 ++-- > drivers/tty/n_gsm.c | 4 ++-- > drivers/tty/serial/serial_core.c | 8 > drivers/tty/synclink_gt.c | 4 ++-- > include/linux/tty_port.h | 4 ++-- > include/linux/usb/serial.h| 2 +- > 12 files changed, 27 insertions(+), 27 deletions(-) > > diff --git a/drivers/char/pcmcia/synclink_cs.c > b/drivers/char/pcmcia/synclink_cs.c > index 46a0b586d234..1577eba6fe0e 100644 > --- a/drivers/char/pcmcia/synclink_cs.c > +++ b/drivers/char/pcmcia/synclink_cs.c > @@ -378,7 +378,7 @@ static void async_mode(MGSLPC_INFO *info); > static void tx_timeout(struct timer_list *t); > > static bool carrier_raised(struct tty_port *port); > -static void dtr_rts(struct tty_port *port, bool onoff); > +static void dtr_rts(struct tty_port *port, bool active); > > #if SYNCLINK_GENERIC_HDLC > #define dev_to_port(D) (dev_to_hdlc(D)->priv) > @@ -2442,13 +2442,13 @@ static bool carrier_raised(struct tty_port *port) > return info->serial_signals & SerialSignal_DCD; > } > > -static void dtr_rts(struct tty_port *port, bool onoff) > +static void dtr_rts(struct tty_port *port, bool active) > { > MGSLPC_INFO *info = container_of(port, MGSLPC_INFO, port); > unsigned long flags; > > spin_lock_irqsave(&info->lock, flags); > - if (onoff) > + if (active) > info->serial_signals |= SerialSignal_RTS | SerialSignal_DTR; > else > info->serial_signals &= ~(SerialSignal_RTS | > SerialSignal_DTR); > diff --git a/drivers/mmc/core/sdio_uart.c b/drivers/mmc/core/sdio_uart.c > index c6b4b2b2a4b2..50536fe59f1a 100644 > --- a/drivers/mmc/core/sdio_uart.c > +++ b/drivers/mmc/core/sdio_uart.c > @@ -542,20 +542,20 @@ static bool uart_carrier_raised(struct tty_port *tport) > /** > * uart_dtr_rts-port helper to set uart signals > * @tport: tty port to be updated > - * @onoff: set to turn on DTR/RTS > + * @active: set to turn on DTR/RTS > * > * Called by the tty port helpers when the modem signals need to be > * adjusted during an open, close and hangup. > */ > > -static void uart_dtr_rts(struct tty_port *tport, bool onoff) > +static void uart_dtr_rts(struct tty_port *tport, bool active) > { > struct sdio_uart_port *port = > container_of(tport, struct sdio_uart_port, port); > int ret = sdio_uart_claim_func(port); > if (ret) > return; > - if (!onoff) > + if (!active) > sdio_uart_clear_mctrl(port, TIOCM_DTR | TIOCM_RTS); > else > sdio_uart_set_mctrl(port, TIOCM_DTR | TIOCM_RTS); > diff --git a/drivers/staging/greybus/uart.c b/drivers/staging/greybus/uart.c > index 92d49740d5a4..20a34599859f 100644 > --- a/drivers/staging/greybus/uart.c > +++ b/drivers/staging/greybus/uart.c > @@ -701,7 +701,7 @@ static int gb_tty_ioctl(struct tty_struct *tty, unsigned > int cmd, > return -ENOIOCTLCMD; > } > > -static void gb_tty_dtr_rts(struct tty_port *port, bool on) > +static void gb_tty_dtr_rts(struct tty_port *port, bool active) > { > struct gb_tty *gb_tty; > u8 newctrl; > @@ -709,7 +709,7 @@ static void gb_tty_dtr_rts(struct tty_port *port, bool on) > gb_tty = container_of(port, struct gb_tty, port); > newctrl = gb_tty->ctrlout; > > - if (on) > + if (active) > newctrl |= (GB_UART_CTRL_DTR | GB_UART_CTRL_RTS); > else > newctrl &= ~(GB_UART_CTRL_DTR | GB_UART_CTRL_RTS); > diff --git a/drivers/tty/amiserial.c b/drivers/tty/amiserial.c > index 29d4c554f6b8..d7515d61659e 100644 > --- a/drivers/tty/amiserial.c > +++ b/drivers/tty/amiserial.c > @@ -1459,13 +1459,13 @@ static bool amiga_carrier_raised(struct tty_port > *port) > return !(ciab.pra & SER_DCD); > } > > -static void amiga_dtr_rts(struct tty_port *port, bool raise) > +static void amiga_dtr_rts(struct tty_port *port, bool active) > { > struct serial_state *info = container_of(port, struct serial_state, > tport); > unsigned long flags; > > - if (raise) > + if (active) > info->MCR |= SER_DTR|SER_RTS; > else > info->MCR &= ~(SER_DTR|SER_RTS); > diff --git a/drivers/tty/hvc/hvc_console.h b/drivers/tty/hvc/hvc_console.h > index 6d3428
[PATCH v3 10/10] MAINTAINERS: add the Freescale QMC audio entry
After contributing the component, add myself as the maintainer for the Freescale QMC audio ASoC component. Signed-off-by: Herve Codina --- MAINTAINERS | 8 1 file changed, 8 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 9a574892b9b1..9dcfadec5aa3 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -8440,6 +8440,14 @@ F: sound/soc/fsl/fsl* F: sound/soc/fsl/imx* F: sound/soc/fsl/mpc8610_hpcd.c +FREESCALE SOC SOUND QMC DRIVER +M: Herve Codina +L: alsa-de...@alsa-project.org (moderated for non-subscribers) +L: linuxppc-dev@lists.ozlabs.org +S: Maintained +F: Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml +F: sound/soc/fsl/fsl_qmc_audio.c + FREESCALE USB PERIPHERAL DRIVERS M: Li Yang L: linux-...@vger.kernel.org -- 2.38.1
[PATCH v3 09/10] ASoC: fsl: Add support for QMC audio
The QMC audio is an ASoC component which provides DAIs that use the QMC (QUICC Multichannel Controller) to transfer the audio data. It provides as many DAIs as the number of QMC channels it references. Signed-off-by: Herve Codina --- sound/soc/fsl/Kconfig | 9 + sound/soc/fsl/Makefile| 2 + sound/soc/fsl/fsl_qmc_audio.c | 732 ++ 3 files changed, 743 insertions(+) create mode 100644 sound/soc/fsl/fsl_qmc_audio.c diff --git a/sound/soc/fsl/Kconfig b/sound/soc/fsl/Kconfig index 614eceda6b9e..17db29c25d96 100644 --- a/sound/soc/fsl/Kconfig +++ b/sound/soc/fsl/Kconfig @@ -172,6 +172,15 @@ config SND_MPC52xx_DMA config SND_SOC_POWERPC_DMA tristate +config SND_SOC_POWERPC_QMC_AUDIO + tristate "QMC ALSA SoC support" + depends on CPM_QMC + help + ALSA SoC Audio support using the Freescale QUICC Multichannel + Controller (QMC). + Say Y or M if you want to add support for SoC audio using Freescale + QMC. + comment "SoC Audio support for Freescale PPC boards:" config SND_SOC_MPC8610_HPCD diff --git a/sound/soc/fsl/Makefile b/sound/soc/fsl/Makefile index b54beb1a66fa..8db7e97d0bd5 100644 --- a/sound/soc/fsl/Makefile +++ b/sound/soc/fsl/Makefile @@ -28,6 +28,7 @@ snd-soc-fsl-easrc-objs := fsl_easrc.o snd-soc-fsl-xcvr-objs := fsl_xcvr.o snd-soc-fsl-aud2htx-objs := fsl_aud2htx.o snd-soc-fsl-rpmsg-objs := fsl_rpmsg.o +snd-soc-fsl-qmc-audio-objs := fsl_qmc_audio.o obj-$(CONFIG_SND_SOC_FSL_AUDMIX) += snd-soc-fsl-audmix.o obj-$(CONFIG_SND_SOC_FSL_ASOC_CARD) += snd-soc-fsl-asoc-card.o @@ -44,6 +45,7 @@ obj-$(CONFIG_SND_SOC_POWERPC_DMA) += snd-soc-fsl-dma.o obj-$(CONFIG_SND_SOC_FSL_XCVR) += snd-soc-fsl-xcvr.o obj-$(CONFIG_SND_SOC_FSL_AUD2HTX) += snd-soc-fsl-aud2htx.o obj-$(CONFIG_SND_SOC_FSL_RPMSG) += snd-soc-fsl-rpmsg.o +obj-$(CONFIG_SND_SOC_POWERPC_QMC_AUDIO) += snd-soc-fsl-qmc-audio.o # MPC5200 Platform Support obj-$(CONFIG_SND_MPC52xx_DMA) += mpc5200_dma.o diff --git a/sound/soc/fsl/fsl_qmc_audio.c b/sound/soc/fsl/fsl_qmc_audio.c new file mode 100644 index ..c36b493da96a --- /dev/null +++ b/sound/soc/fsl/fsl_qmc_audio.c @@ -0,0 +1,732 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * ALSA SoC using the QUICC Multichannel Controller (QMC) + * + * Copyright 2022 CS GROUP France + * + * Author: Herve Codina + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct qmc_dai { + char *name; + int id; + struct device *dev; + struct qmc_chan *qmc_chan; + unsigned int nb_tx_ts; + unsigned int nb_rx_ts; +}; + +struct qmc_audio { + struct device *dev; + unsigned int num_dais; + struct qmc_dai *dais; + struct snd_soc_dai_driver *dai_drivers; +}; + +struct qmc_dai_prtd { + struct qmc_dai *qmc_dai; + dma_addr_t dma_buffer_start; + dma_addr_t period_ptr_submitted; + dma_addr_t period_ptr_ended; + dma_addr_t dma_buffer_end; + size_t period_size; + struct snd_pcm_substream *substream; +}; + +static int qmc_audio_pcm_construct(struct snd_soc_component *component, + struct snd_soc_pcm_runtime *rtd) +{ + struct snd_card *card = rtd->card->snd_card; + int ret; + + ret = dma_coerce_mask_and_coherent(card->dev, DMA_BIT_MASK(32)); + if (ret) + return ret; + + snd_pcm_set_managed_buffer_all(rtd->pcm, SNDRV_DMA_TYPE_DEV, card->dev, + 64*1024, 64*1024); + return 0; +} + +static int qmc_audio_pcm_hw_params(struct snd_soc_component *component, + struct snd_pcm_substream *substream, + struct snd_pcm_hw_params *params) +{ + struct snd_pcm_runtime *runtime = substream->runtime; + struct qmc_dai_prtd *prtd = substream->runtime->private_data; + + prtd->dma_buffer_start = runtime->dma_addr; + prtd->dma_buffer_end = runtime->dma_addr + params_buffer_bytes(params); + prtd->period_size = params_period_bytes(params); + prtd->period_ptr_submitted = prtd->dma_buffer_start; + prtd->period_ptr_ended = prtd->dma_buffer_start; + prtd->substream = substream; + + return 0; +} + +static void qmc_audio_pcm_write_complete(void *context) +{ + struct qmc_dai_prtd *prtd = context; + int ret; + + prtd->period_ptr_ended += prtd->period_size; + if (prtd->period_ptr_ended >= prtd->dma_buffer_end) + prtd->period_ptr_ended = prtd->dma_buffer_start; + + prtd->period_ptr_submitted += prtd->period_size; + if (prtd->period_ptr_submitted >= prtd->dma_buffer_end) + prtd->period_ptr_submitted = prtd->dma_buffer_start; + + ret = qmc_chan_write_submit(prtd->qmc_dai->qmc_chan, + prtd->period_ptr_submitted, prtd->period_size, + qmc_audio_pcm_write_complete,
[PATCH v3 08/10] dt-bindings: sound: Add support for QMC audio
The QMC (QUICC mutichannel controller) is a controller present in some PowerQUICC SoC such as MPC885. The QMC audio is an ASoC component that uses the QMC controller to transfer the audio data. Signed-off-by: Herve Codina --- .../bindings/sound/fsl,qmc-audio.yaml | 117 ++ 1 file changed, 117 insertions(+) create mode 100644 Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml diff --git a/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml b/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml new file mode 100644 index ..ff5cd9241941 --- /dev/null +++ b/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml @@ -0,0 +1,117 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/sound/fsl,qmc-audio.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: QMC audio + +maintainers: + - Herve Codina + +description: | + The QMC audio is an ASoC component which uses QMC (QUICC Multichannel + Controller) channels to transfer the audio data. + It provides as many DAI as the number of QMC channel used. + +allOf: + - $ref: dai-common.yaml# + +properties: + compatible: +const: fsl,qmc-audio + + '#address-cells': +const: 1 + '#size-cells': +const: 0 + '#sound-dai-cells': +const: 1 + +patternProperties: + '^dai@([0-9]|[1-5][0-9]|6[0-3])$': +description: + A DAI managed by this controller +type: object + +properties: + reg: +minimum: 0 +maximum: 63 +description: + The DAI number + + fsl,qmc-chan: +$ref: /schemas/types.yaml#/definitions/phandle-array +items: + - items: + - description: phandle to QMC node + - description: Channel number +description: + Should be a phandle/number pair. The phandle to QMC node and the QMC + channel to use for this DAI. + +required: + - reg + - fsl,qmc-chan + +required: + - compatible + - '#address-cells' + - '#size-cells' + - '#sound-dai-cells' + +additionalProperties: false + +examples: + - | +audio_controller: audio-controller { +compatible = "fsl,qmc-audio"; +#address-cells = <1>; +#size-cells = <0>; +#sound-dai-cells = <1>; +dai@16 { +reg = <16>; +fsl,qmc-chan = <&qmc 16>; +}; +dai@17 { +reg = <17>; +fsl,qmc-chan = <&qmc 17>; +}; +}; + +sound { +compatible = "simple-audio-card"; +#address-cells = <1>; +#size-cells = <0>; +simple-audio-card,dai-link@0 { +reg = <0>; +format = "dsp_b"; +cpu { +sound-dai = <&audio_controller 16>; +}; +codec { +sound-dai = <&codec1>; +dai-tdm-slot-num = <4>; +dai-tdm-slot-width = <8>; +/* TS 3, 5, 7, 9 */ +dai-tdm-slot-tx-mask = <0 0 0 1 0 1 0 1 0 1>; +dai-tdm-slot-rx-mask = <0 0 0 1 0 1 0 1 0 1>; +}; +}; +simple-audio-card,dai-link@1 { +reg = <1>; +format = "dsp_b"; +cpu { +sound-dai = <&audio_controller 17>; +}; +codec { +sound-dai = <&codec2>; +dai-tdm-slot-num = <4>; +dai-tdm-slot-width = <8>; +/* TS 2, 4, 6, 8 */ +dai-tdm-slot-tx-mask = <0 0 1 0 1 0 1 0 1>; +dai-tdm-slot-rx-mask = <0 0 1 0 1 0 1 0 1>; +}; +}; +}; -- 2.38.1
[PATCH v3 07/10] MAINTAINERS: add the Freescale QMC controller entry
After contributing the driver, add myself as the maintainer for the Freescale QMC controller. Signed-off-by: Herve Codina --- MAINTAINERS | 8 1 file changed, 8 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 6a0605ebf8a0..9a574892b9b1 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -8372,6 +8372,14 @@ S: Maintained F: drivers/soc/fsl/qe/ F: include/soc/fsl/qe/ +FREESCALE QUICC ENGINE QMC DRIVER +M: Herve Codina +L: linuxppc-dev@lists.ozlabs.org +S: Maintained +F: Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,qmc.yaml +F: drivers/soc/fsl/qe/qmc.c +F: include/soc/fsl/qe/qmc.h + FREESCALE QUICC ENGINE TSA DRIVER M: Herve Codina L: linuxppc-dev@lists.ozlabs.org -- 2.38.1
[PATCH v3 06/10] soc: fsl: cmp1: Add support for QMC
The QMC (QUICC Multichannel Controller) emulates up to 64 channels within one serial controller using the same TDM physical interface routed from the TSA. It is available in some PowerQUICC SoC such as the MPC885 or MPC866. It is also available on some Quicc Engine SoCs. This current version support CPM1 SoCs only and some enhancement are needed to support Quicc Engine SoCs. Signed-off-by: Herve Codina --- drivers/soc/fsl/qe/Kconfig | 12 + drivers/soc/fsl/qe/Makefile |1 + drivers/soc/fsl/qe/qmc.c| 1531 +++ include/soc/fsl/qe/qmc.h| 71 ++ 4 files changed, 1615 insertions(+) create mode 100644 drivers/soc/fsl/qe/qmc.c create mode 100644 include/soc/fsl/qe/qmc.h diff --git a/drivers/soc/fsl/qe/Kconfig b/drivers/soc/fsl/qe/Kconfig index 60ec11c9f4d9..25b218351ae3 100644 --- a/drivers/soc/fsl/qe/Kconfig +++ b/drivers/soc/fsl/qe/Kconfig @@ -44,6 +44,18 @@ config CPM_TSA This option enables support for this controller +config CPM_QMC + tristate "CPM QMC support" + depends on OF && HAS_IOMEM + depends on CPM1 || (PPC && COMPILE_TEST) + depends on CPM_TSA + help + Freescale CPM QUICC Multichannel Controller + (QMC) + + This option enables support for this + controller + config QE_TDM bool default y if FSL_UCC_HDLC diff --git a/drivers/soc/fsl/qe/Makefile b/drivers/soc/fsl/qe/Makefile index 45c961acc81b..ec8506e13113 100644 --- a/drivers/soc/fsl/qe/Makefile +++ b/drivers/soc/fsl/qe/Makefile @@ -5,6 +5,7 @@ obj-$(CONFIG_QUICC_ENGINE)+= qe.o qe_common.o qe_ic.o qe_io.o obj-$(CONFIG_CPM) += qe_common.o obj-$(CONFIG_CPM_TSA) += tsa.o +obj-$(CONFIG_CPM_QMC) += qmc.o obj-$(CONFIG_UCC) += ucc.o obj-$(CONFIG_UCC_SLOW) += ucc_slow.o obj-$(CONFIG_UCC_FAST) += ucc_fast.o diff --git a/drivers/soc/fsl/qe/qmc.c b/drivers/soc/fsl/qe/qmc.c new file mode 100644 index ..85a99538bde7 --- /dev/null +++ b/drivers/soc/fsl/qe/qmc.c @@ -0,0 +1,1531 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * QMC driver + * + * Copyright 2022 CS GROUP France + * + * Author: Herve Codina + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "tsa.h" + +/* SCC general mode register high (32 bits) */ +#define SCC_GSMRL 0x00 +#define SCC_GSMRL_ENR (1 << 5) +#define SCC_GSMRL_ENT (1 << 4) +#define SCC_GSMRL_MODE_QMC (0x0A << 0) + +/* SCC general mode register low (32 bits) */ +#define SCC_GSMRH 0x04 +#define SCC_GSMRH_CTSS (1 << 7) +#define SCC_GSMRH_CDS(1 << 8) +#define SCC_GSMRH_CTSP (1 << 9) +#define SCC_GSMRH_CDP(1 << 10) + +/* SCC event register (16 bits) */ +#define SCC_SCCE 0x10 +#define SCC_SCCE_IQOV(1 << 3) +#define SCC_SCCE_GINT(1 << 2) +#define SCC_SCCE_GUN (1 << 1) +#define SCC_SCCE_GOV (1 << 0) + +/* SCC mask register (16 bits) */ +#define SCC_SCCM 0x14 +/* Multichannel base pointer (32 bits) */ +#define QMC_GBL_MCBASE 0x00 +/* Multichannel controller state (16 bits) */ +#define QMC_GBL_QMCSTATE 0x04 +/* Maximum receive buffer length (16 bits) */ +#define QMC_GBL_MRBLR 0x06 +/* Tx time-slot assignment table pointer (16 bits) */ +#define QMC_GBL_TX_S_PTR 0x08 +/* Rx pointer (16 bits) */ +#define QMC_GBL_RXPTR 0x0A +/* Global receive frame threshold (16 bits) */ +#define QMC_GBL_GRFTHR 0x0C +/* Global receive frame count (16 bits) */ +#define QMC_GBL_GRFCNT 0x0E +/* Multichannel interrupt base address (32 bits) */ +#define QMC_GBL_INTBASE0x10 +/* Multichannel interrupt pointer (32 bits) */ +#define QMC_GBL_INTPTR 0x14 +/* Rx time-slot assignment table pointer (16 bits) */ +#define QMC_GBL_RX_S_PTR 0x18 +/* Tx pointer (16 bits) */ +#define QMC_GBL_TXPTR 0x1A +/* CRC constant (32 bits) */ +#define QMC_GBL_C_MASK32 0x1C +/* Time slot assignment table Rx (32 x 16 bits) */ +#define QMC_GBL_TSATRX 0x20 +/* Time slot assignment table Tx (32 x 16 bits) */ +#define QMC_GBL_TSATTX 0x60 +/* CRC constant (16 bits) */ +#define QMC_GBL_C_MASK16 0xA0 + +/* TSA entry (16bit entry in TSATRX and TSATTX) */ +#define QMC_TSA_VALID (1 << 15) +#define QMC_TSA_WRAP (1 << 14) +#define QMC_TSA_MASK (0x303F) +#define QMC_TSA_CHANNEL(x) ((x) << 6) + +/* Tx buffer descriptor base address (16 bits, offset from MCBASE) */ +#define QMC_SPE_TBASE 0x00 + +/* Channel mode register (16 bits) */ +#define QMC_SPE_CHAMR 0x02 +#define QMC_SPE_CHAMR_MODE_HDLC (1 << 15) +#define QMC_SPE_CHAMR_MODE_TRANSP((0 << 15) | (1 << 13)) +#define QMC_SPE_CHAMR_ENT(1 << 12) +#define QMC_SPE_CHAMR_POL(1 << 8) +#define QMC_SPE_CHAMR_HDLC_IDLM (1 << 13) +#define QMC_SPE
[PATCH v3 05/10] dt-bindings: soc: fsl: cpm_qe: Add QMC controller
Add support for the QMC (QUICC Multichannel Controller) available in some PowerQUICC SoC such as MPC885 or MPC866. Signed-off-by: Herve Codina --- .../bindings/soc/fsl/cpm_qe/fsl,qmc.yaml | 164 ++ 1 file changed, 164 insertions(+) create mode 100644 Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,qmc.yaml diff --git a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,qmc.yaml b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,qmc.yaml new file mode 100644 index ..3ec52f1635c8 --- /dev/null +++ b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,qmc.yaml @@ -0,0 +1,164 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/soc/fsl/cpm_qe/fsl,qmc.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: PowerQUICC CPM QUICC Multichannel Controller (QMC) + +maintainers: + - Herve Codina + +description: | + The QMC (QUICC Multichannel Controller) emulates up to 64 channels within + one serial controller using the same TDM physical interface routed from + TSA. + +properties: + compatible: +items: + - enum: + - fsl,mpc885-scc-qmc + - fsl,mpc866-scc-qmc + - const: fsl,cpm1-scc-qmc + + reg: +items: + - description: SCC (Serial communication controller) register base + - description: SCC parameter ram base + - description: Dual port ram base + + reg-names: +items: + - const: scc_regs + - const: scc_pram + - const: dpram + + interrupts: +maxItems: 1 +description: SCC interrupt line in the CPM interrupt controller + + fsl,tsa: +$ref: /schemas/types.yaml#/definitions/phandle +description: phandle to the TSA + + fsl,tsa-cell-id: +$ref: /schemas/types.yaml#/definitions/uint32 +enum: [1, 2, 3] +description: | + TSA cell ID (dt-bindings/soc/fsl,tsa.h defines these values) + - 1: SCC2 + - 2: SCC3 + - 3: SCC4 + + '#address-cells': +const: 1 + + '#size-cells': +const: 0 + + '#chan-cells': +const: 1 + +patternProperties: + '^channel@([0-9]|[1-5][0-9]|6[0-3])$': +description: + A channel managed by this controller +type: object + +properties: + reg: +minimum: 0 +maximum: 63 +description: + The channel number + + fsl,mode: +$ref: /schemas/types.yaml#/definitions/string +enum: [transparent, hdlc] +default: transparent +description: Operational mode + + fsl,reverse-data: +$ref: /schemas/types.yaml#/definitions/flag +description: + The bit order as seen on the channels is reversed, + transmitting/receiving the MSB of each octet first. + This flag is used only in 'transparent' mode. + + fsl,tx-ts-mask: +$ref: /schemas/types.yaml#/definitions/uint64 +description: + Channel assigned Tx time-slots within the Tx time-slots routed + by the TSA to this cell. + + fsl,rx-ts-mask: +$ref: /schemas/types.yaml#/definitions/uint64 +description: + Channel assigned Rx time-slots within the Rx time-slots routed + by the TSA to this cell. + +required: + - reg + - fsl,tx-ts-mask + - fsl,rx-ts-mask + +required: + - compatible + - reg + - reg-names + - interrupts + - fsl,tsa + - fsl,tsa-cell-id + - '#address-cells' + - '#size-cells' + - '#chan-cells' + +additionalProperties: false + +examples: + - | +#include + +qmc@a60 { +compatible = "fsl,mpc885-scc-qmc", "fsl,cpm1-scc-qmc"; +reg = <0xa60 0x20>, + <0x3f00 0xc0>, + <0x2000 0x1000>; +reg-names = "scc_regs", "scc_pram", "dpram"; +interrupts = <27>; +interrupt-parent = <&CPM_PIC>; + +#address-cells = <1>; +#size-cells = <0>; +#chan-cells = <1>; + +fsl,tsa = <&tsa>; +fsl,tsa-cell-id = ; + +channel@16 { +/* Ch16 : First 4 even TS from all routed from TSA */ +reg = <16>; +fsl,mode = "transparent"; +fsl,reverse-data; +fsl,tx-ts-mask = <0x 0x00aa>; +fsl,rx-ts-mask = <0x 0x00aa>; +}; + +channel@17 { +/* Ch17 : First 4 odd TS from all routed from TSA */ +reg = <17>; +fsl,mode = "transparent"; +fsl,reverse-data; +fsl,tx-ts-mask = <0x 0x0055>; +fsl,rx-ts-mask = <0x 0x0055>; +}; + +channel@19 { +/* Ch19 : 8 TS (TS 8..15) from all routed from TSA */ +reg = <19>; +fsl,mode = "hdlc"; +fsl,tx-ts-mask = <0x 0xff00>; +fsl,rx-ts-mask = <0x 0xff00>; +}; +}; -- 2.38.1
[PATCH v3 04/10] powerpc/8xx: Use a larger CPM1 command check mask
The CPM1 command mask is defined for use with the standard CPM1 command register as described in the user's manual: 0 |13|47|8 11|12 14| 15| RST|- |OPCODE|CH_NUM| -|FLG| In the QMC extension the CPM1 command register is redefined (QMC supplement user's manuel) with the following mapping: 0 |13|47|8 13|14| 15| RST|QMC OPCODE| 1110|CHANNEL_NUMBER| -|FLG| Extend the check command mask in order to support both the standard CH_NUM field and the QMC extension CHANNEL_NUMBER field. Signed-off-by: Herve Codina Acked-by: Christophe Leroy --- arch/powerpc/platforms/8xx/cpm1.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/platforms/8xx/cpm1.c b/arch/powerpc/platforms/8xx/cpm1.c index 8ef1f4392086..6b828b9f90d9 100644 --- a/arch/powerpc/platforms/8xx/cpm1.c +++ b/arch/powerpc/platforms/8xx/cpm1.c @@ -100,7 +100,7 @@ int cpm_command(u32 command, u8 opcode) int i, ret; unsigned long flags; - if (command & 0xff0f) + if (command & 0xff03) return -EINVAL; spin_lock_irqsave(&cmd_lock, flags); -- 2.38.1
[PATCH v3 03/10] MAINTAINERS: add the Freescale TSA controller entry
After contributing the driver, add myself as the maintainer for the Freescale TSA controller. Signed-off-by: Herve Codina --- MAINTAINERS | 9 + 1 file changed, 9 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 7f86d02cb427..6a0605ebf8a0 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -8372,6 +8372,15 @@ S: Maintained F: drivers/soc/fsl/qe/ F: include/soc/fsl/qe/ +FREESCALE QUICC ENGINE TSA DRIVER +M: Herve Codina +L: linuxppc-dev@lists.ozlabs.org +S: Maintained +F: Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,tsa.yaml +F: drivers/soc/fsl/qe/tsa.c +F: drivers/soc/fsl/qe/tsa.h +F: include/dt-bindings/soc/fsl,tsa.h + FREESCALE QUICC ENGINE UCC ETHERNET DRIVER M: Li Yang L: net...@vger.kernel.org -- 2.38.1
[PATCH v3 01/10] dt-bindings: soc: fsl: cpm_qe: Add TSA controller
Add support for the time slot assigner (TSA) available in some PowerQUICC SoC such as MPC885 or MPC866. Signed-off-by: Herve Codina --- .../bindings/soc/fsl/cpm_qe/fsl,tsa.yaml | 260 ++ include/dt-bindings/soc/fsl,tsa.h | 13 + 2 files changed, 273 insertions(+) create mode 100644 Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,tsa.yaml create mode 100644 include/dt-bindings/soc/fsl,tsa.h diff --git a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,tsa.yaml b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,tsa.yaml new file mode 100644 index ..eb17b6119abd --- /dev/null +++ b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,tsa.yaml @@ -0,0 +1,260 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/soc/fsl/cpm_qe/fsl,tsa.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: PowerQUICC CPM Time-slot assigner (TSA) controller + +maintainers: + - Herve Codina + +description: | + The TSA is the time-slot assigner that can be found on some + PowerQUICC SoC. + Its purpose is to route some TDM time-slots to other internal + serial controllers. + +properties: + compatible: +items: + - enum: + - fsl,mpc885-tsa + - fsl,mpc866-tsa + - const: fsl,cpm1-tsa + + reg: +items: + - description: SI (Serial Interface) register base + - description: SI RAM base + + reg-names: +items: + - const: si_regs + - const: si_ram + + '#address-cells': +const: 1 + + '#size-cells': +const: 0 + +patternProperties: + '^tdm@[0-1]$': +description: + The TDM managed by this controller +type: object + +properties: + reg: +minimum: 0 +maximum: 1 +description: + The TDM number for this TDM, 0 for TDMa and 1 for TDMb + + fsl,common-rxtx-pins: +$ref: /schemas/types.yaml#/definitions/flag +description: + The hardware can use four dedicated pins for Tx clock, + Tx sync, Rx clock and Rx sync or use only two pins, + Tx/Rx clock and Rx/Rx sync. + Without the 'fsl,common-rxtx-pins' property, the four + pins are used. With the 'fsl,common-rxtx-pins' property, + two pins are used. + + clocks: +minItems: 2 +maxItems: 4 + + clock-names: +minItems: 2 +maxItems: 4 + + fsl,mode: +$ref: /schemas/types.yaml#/definitions/string +enum: [normal, echo, internal-loopback, control-loopback] +default: normal +description: | + Operational mode: +- normal: +Normal operation +- echo: +Automatic echo. Rx data is resent on Tx +- internal-loopback: +The TDM transmitter is connected to the receiver. +Data appears on Tx pin. +- control-loopback: +The TDM transmitter is connected to the receiver. +The Tx pin is disconnected. + + fsl,rx-frame-sync-delay-bits: +enum: [0, 1, 2, 3] +default: 0 +description: | + Receive frame sync delay in number of bits. + Indicates the delay between the Rx sync and the first bit of the + Rx frame. 0 for no bit delay. 1, 2 or 3 for 1, 2 or 3 bits delay. + + fsl,tx-frame-sync-delay-bits: +enum: [0, 1, 2, 3] +default: 0 +description: | + Transmit frame sync delay in number of bits. + Indicates the delay between the Tx sync and the first bit of the + Tx frame. 0 for no bit delay. 1, 2 or 3 for 1, 2 or 3 bits delay. + + fsl,clock-falling-edge: +$ref: /schemas/types.yaml#/definitions/flag +description: | + Data is sent on falling edge of the clock (and received on the + rising edge). + If 'clock-falling-edge' is not present, data is sent on the + rising edge (and received on the falling edge). + + fsl,fsync-rising-edge: +$ref: /schemas/types.yaml#/definitions/flag +description: + Frame sync pulses are sampled with the rising edge of the channel + clock. If 'fsync-rising-edge' is not present, pulses are sample + with e falling edge. + + fsl,double-speed-clock: +$ref: /schemas/types.yaml#/definitions/flag +description: + The channel clock is twice the data rate. + + fsl,tx-ts-routes: +$ref: /schemas/types.yaml#/definitions/uint32-matrix +description: | + A list of tupple that indicates the Tx time-slots routes. +tx_ts_routes = + < 2 0 >, /* The first 2 time slots are not used */ + < 3 1 >, /* The next 3 ones are route to SCC2 */ + < 4 0 >, /* The next 4 ones are not used */ + < 2 2 >; /* The nest 2 ones are route to SCC3 */ +
[PATCH v3 02/10] soc: fsl: cpm1: Add support for TSA
The TSA (Time Slot Assigner) purpose is to route some TDM time-slots to other internal serial controllers. It is available in some PowerQUICC SoC such as the MPC885 or MPC866. It is also available on some Quicc Engine SoCs. This current version support CPM1 SoCs only and some enhancement are needed to support Quicc Engine SoCs. Signed-off-by: Herve Codina --- drivers/soc/fsl/qe/Kconfig | 11 + drivers/soc/fsl/qe/Makefile | 1 + drivers/soc/fsl/qe/tsa.c| 810 drivers/soc/fsl/qe/tsa.h| 43 ++ 4 files changed, 865 insertions(+) create mode 100644 drivers/soc/fsl/qe/tsa.c create mode 100644 drivers/soc/fsl/qe/tsa.h diff --git a/drivers/soc/fsl/qe/Kconfig b/drivers/soc/fsl/qe/Kconfig index 357c5800b112..60ec11c9f4d9 100644 --- a/drivers/soc/fsl/qe/Kconfig +++ b/drivers/soc/fsl/qe/Kconfig @@ -33,6 +33,17 @@ config UCC bool default y if UCC_FAST || UCC_SLOW +config CPM_TSA + tristate "CPM TSA support" + depends on OF && HAS_IOMEM + depends on CPM1 || (PPC && COMPILE_TEST) + help + Freescale CPM Time Slot Assigner (TSA) + controller. + + This option enables support for this + controller + config QE_TDM bool default y if FSL_UCC_HDLC diff --git a/drivers/soc/fsl/qe/Makefile b/drivers/soc/fsl/qe/Makefile index 55a555304f3a..45c961acc81b 100644 --- a/drivers/soc/fsl/qe/Makefile +++ b/drivers/soc/fsl/qe/Makefile @@ -4,6 +4,7 @@ # obj-$(CONFIG_QUICC_ENGINE)+= qe.o qe_common.o qe_ic.o qe_io.o obj-$(CONFIG_CPM) += qe_common.o +obj-$(CONFIG_CPM_TSA) += tsa.o obj-$(CONFIG_UCC) += ucc.o obj-$(CONFIG_UCC_SLOW) += ucc_slow.o obj-$(CONFIG_UCC_FAST) += ucc_fast.o diff --git a/drivers/soc/fsl/qe/tsa.c b/drivers/soc/fsl/qe/tsa.c new file mode 100644 index ..58b574cf37e2 --- /dev/null +++ b/drivers/soc/fsl/qe/tsa.c @@ -0,0 +1,810 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * TSA driver + * + * Copyright 2022 CS GROUP France + * + * Author: Herve Codina + */ + +#include "tsa.h" +#include +#include +#include +#include +#include +#include +#include + +/* TSA SI RAM routing tables entry */ +#define TSA_SIRAM_ENTRY_LAST (1 << 16) +#define TSA_SIRAM_ENTRY_BYTE (1 << 17) +#define TSA_SIRAM_ENTRY_CNT(x) (((x) & 0x0f) << 18) +#define TSA_SIRAM_ENTRY_CSEL_MASK (0x7 << 22) +#define TSA_SIRAM_ENTRY_CSEL_NU(0x0 << 22) +#define TSA_SIRAM_ENTRY_CSEL_SCC2 (0x2 << 22) +#define TSA_SIRAM_ENTRY_CSEL_SCC3 (0x3 << 22) +#define TSA_SIRAM_ENTRY_CSEL_SCC4 (0x4 << 22) +#define TSA_SIRAM_ENTRY_CSEL_SMC1 (0x5 << 22) +#define TSA_SIRAM_ENTRY_CSEL_SMC2 (0x6 << 22) + +/* SI mode register (32 bits) */ +#define TSA_SIMODE 0x00 +#define TSA_SIMODE_SMC2 0x8000 +#define TSA_SIMODE_SMC1 0x8000 +#define TSA_SIMODE_TDMA(x) ((x) << 0) +#define TSA_SIMODE_TDMB(x) ((x) << 16) +#define TSA_SIMODE_TDM_MASK0x0fff +#define TSA_SIMODE_TDM_SDM_MASK0x0c00 +#define TSA_SIMODE_TDM_SDM_NORM 0x +#define TSA_SIMODE_TDM_SDM_ECHO 0x0400 +#define TSA_SIMODE_TDM_SDM_INTL_LOOP 0x0800 +#define TSA_SIMODE_TDM_SDM_LOOP_CTRL 0x0c00 +#define TSA_SIMODE_TDM_RFSD(x) ((x) << 8) +#define TSA_SIMODE_TDM_DSC 0x0080 +#define TSA_SIMODE_TDM_CRT 0x0040 +#define TSA_SIMODE_TDM_STZ 0x0020 +#define TSA_SIMODE_TDM_CE 0x0010 +#define TSA_SIMODE_TDM_FE 0x0008 +#define TSA_SIMODE_TDM_GM 0x0004 +#define TSA_SIMODE_TDM_TFSD(x) ((x) << 0) + +/* SI global mode register (8 bits) */ +#define TSA_SIGMR 0x04 +#define TSA_SIGMR_ENB (1<<3) +#define TSA_SIGMR_ENA (1<<2) +#define TSA_SIGMR_RDM_MASK 0x03 +#define TSA_SIGMR_RDM_STATIC_TDMA0x00 +#define TSA_SIGMR_RDM_DYN_TDMA 0x01 +#define TSA_SIGMR_RDM_STATIC_TDMAB 0x02 +#define TSA_SIGMR_RDM_DYN_TDMAB 0x03 + +/* SI status register (8 bits) */ +#define TSA_SISTR 0x06 + +/* SI command register (8 bits) */ +#define TSA_SICMR 0x07 + +/* SI clock route register (32 bits) */ +#define TSA_SICR 0x0C +#define TSA_SICR_SCC2(x) ((x) << 8) +#define TSA_SICR_SCC3(x) ((x) << 16) +#define TSA_SICR_SCC4(x) ((x) << 24) +#define TSA_SICR_SCC_MASK 0x0ff +#define TSA_SICR_SCC_GRX (1 << 7) +#define TSA_SICR_SCC_SCX_TSA (1 << 6) +#define TSA_SICR_SCC_RXCS_MASK (0x7 << 3) +#define TSA_SICR_SCC_RXCS_BRG1 (0x0 << 3) +#define TSA_SICR_SCC_RXCS_BRG2 (0x1 << 3) +#define TSA_SICR_SCC_RXCS_BRG3 (0x2 << 3) +#define TSA_SICR_SCC_RXCS_BRG4 (0x3 << 3) +#define TSA_SICR_SCC_RXCS_CLK15 (0x4
[PATCH v3 00/10] Add the PowerQUICC audio support using the QMC
Hi, This series adds support for audio using the QMC controller available in some Freescale PowerQUICC SoCs. This series contains three parts in order to show the different blocks hierarchy and their usage in this support. The first one is related to TSA (Time Slot Assigner). The TSA handles the data present at the pin level (TDM with up to 64 time slots) and dispatchs them to one or more serial controller (SCC). The second is related to QMC (QUICC Multichannel Controller). The QMC handles the data at the serial controller (SCC) level and splits again the data to creates some virtual channels. The last one is related to the audio component (QMC audio). It is the glue between the QMC controller and the ASoC component. It handles one or more QMC virtual channels and creates one DAI per QMC virtual channels handled. Compared to the v2 series, this v3 series mainly: - adds modification in the DT bindings, - uses generic io{read,write}be{16,32} for registers accesses instead of the specific PowerPC ones. - updates some commit subjects and logs (CPM1 SoCs supports). Best regards, Herve Codina Changes v2 -> v3 - All bindings Rename fsl-tsa.h to fsl,tsa.h Add missing vendor prefix Various fixes (quotes, node names, upper/lower case) - patches 1 and 2 (TSA binding specific) Remove 'reserved' values in the routing tables Remove fsl,grant-mode Add a better description for 'fsl,common-rxtx-pins' Fix clocks/clocks-name handling against fsl,common-rxtx-pins Add information related to the delays unit Removed FSL_CPM_TSA_NBCELL Fix license in binding header file fsl,tsa.h - patches 5 and 6 (QMC binding specific) Remove fsl,cpm-command property Add interrupt property constraint - patches 8 and 9 (QMC audio binding specific) Remove 'items' in compatible property definition Add missing 'dai-common.yaml' reference Fix the qmc_chan phandle definition - patch 2 and 6 Use io{read,write}be{32,16} Change commit subjects and logs - patch 4 Add 'Acked-by: Christophe Leroy ' Changes v1 -> v2: - patch 2 and 6 Fix kernel test robot errors - other patches No changes Herve Codina (10): dt-bindings: soc: fsl: cpm_qe: Add TSA controller soc: fsl: cpm1: Add support for TSA MAINTAINERS: add the Freescale TSA controller entry powerpc/8xx: Use a larger CPM1 command check mask dt-bindings: soc: fsl: cpm_qe: Add QMC controller soc: fsl: cmp1: Add support for QMC MAINTAINERS: add the Freescale QMC controller entry dt-bindings: sound: Add support for QMC audio ASoC: fsl: Add support for QMC audio MAINTAINERS: add the Freescale QMC audio entry .../bindings/soc/fsl/cpm_qe/fsl,qmc.yaml | 164 ++ .../bindings/soc/fsl/cpm_qe/fsl,tsa.yaml | 260 +++ .../bindings/sound/fsl,qmc-audio.yaml | 117 ++ MAINTAINERS | 25 + arch/powerpc/platforms/8xx/cpm1.c |2 +- drivers/soc/fsl/qe/Kconfig| 23 + drivers/soc/fsl/qe/Makefile |2 + drivers/soc/fsl/qe/qmc.c | 1531 + drivers/soc/fsl/qe/tsa.c | 810 + drivers/soc/fsl/qe/tsa.h | 43 + include/dt-bindings/soc/fsl,tsa.h | 13 + include/soc/fsl/qe/qmc.h | 71 + sound/soc/fsl/Kconfig |9 + sound/soc/fsl/Makefile|2 + sound/soc/fsl/fsl_qmc_audio.c | 732 15 files changed, 3803 insertions(+), 1 deletion(-) create mode 100644 Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,qmc.yaml create mode 100644 Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,tsa.yaml create mode 100644 Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml create mode 100644 drivers/soc/fsl/qe/qmc.c create mode 100644 drivers/soc/fsl/qe/tsa.c create mode 100644 drivers/soc/fsl/qe/tsa.h create mode 100644 include/dt-bindings/soc/fsl,tsa.h create mode 100644 include/soc/fsl/qe/qmc.h create mode 100644 sound/soc/fsl/fsl_qmc_audio.c -- 2.38.1
Re: lockref scalability on x86-64 vs cpu_relax
On Thu, Jan 12, 2023 at 06:13:16PM -0600, Linus Torvalds wrote: > On Thu, Jan 12, 2023 at 5:36 PM Mateusz Guzik wrote: > > > > To my understanding on said architecture failed cmpxchg still grants you > > exclusive access to the cacheline, making immediate retry preferable > > when trying to inc/dec unless a certain value is found. > > I actually suspect that is _always_ the case - this is not like a > contended spinlock where we want to pause because we're waiting for > the value to change and become unlocked, this cmpxchg loop is likely > always better off just retrying with the new value. > > That said, the "likely always better off" is purely about performance. > > So I have this suspicion that the reason Tony added the cpu_relax() > was simply not about performance, but about other issues, like > fairness in SMT situations. > > That said, evern from a fairness perspective the cpu_relax() sounds a > bit odd and unlikely - we're literally yielding when we lost a race, > so it hurts the _loser_, not the winner, and thus might make fairness > worse too. I've been writing cmpxchg loops that have strict termination conditions without cpu_relax() in them for a while now. For example, the x86 atomic_fetch_and() implementation looks like so: static __always_inline int arch_atomic_fetch_and(int i, atomic_t *v) { int val = arch_atomic_read(v); do { } while (!arch_atomic_try_cmpxchg(v, &val, val & i)); return val; } And I did that because of the exact same argument you had above, it needs to do the op anyway, waiting between failed attempts will only increase the chance it will fail again.
Re: lockref scalability on x86-64 vs cpu_relax
On Fri, Jan 13, 2023 at 02:12:50AM +0100, Mateusz Guzik wrote: > On 1/13/23, Linus Torvalds wrote: > > Side note on your access() changes - if it turns out that you can > > remove all the cred games, we should possibly then revert my old > > commit d7852fbd0f04 ("access: avoid the RCU grace period for the > > temporary subjective credentials") which avoided the biggest issue > > with the unnecessary cred switching. > > > > I *think* access() is the only user of that special 'non_rcu' thing, > > but it is possible that the whole 'non_rcu' thing ends up mattering > > for cases where the cred actually does change because euid != uid (ie > > suid programs), so this would need a bit more effort to do performance > > testing on. > > > > I don't think the games are avoidable. For one I found non-root > processes with non-empty cap_effective even on my laptop, albeit I did > not check how often something like this is doing access(). > > Discussion for another time. > > > On Thu, Jan 12, 2023 at 5:36 PM Mateusz Guzik wrote: > >> All that said, I think the thing to do here is to replace cpu_relax > >> with a dedicated arch-dependent macro, akin to the following: > > > > I would actually prefer just removing it entirely and see if somebody > > else hollers. You have the numbers to prove it hurts on real hardware, > > and I don't think we have any numbers to the contrary. > > > > So I think it's better to trust the numbers and remove it as a > > failure, than say "let's just remove it on x86-64 and leave everybody > > else with the potentially broken code" > > > [snip] > > Then other architectures can try to run their numbers, and only *if* > > it then turns out that they have a reason to do something else should > > we make this conditional and different on different architectures. > > > > Let's try to keep the code as common as possibly until we have hard > > evidence for special cases, in other words. > > > > I did not want to make such a change without redoing the ThunderX2 > benchmark, or at least something else arm64-y. I may be able to bench it > tomorrow on whatever arm-y stuff can be found on Amazon's EC2, assuming > no arm64 people show up with their results. > > Even then IMHO the safest route is to patch it out on x86-64 and give > other people time to bench their archs as they get around to it, and > ultimately whack the thing if it turns out nobody benefits from it. > I would say beats backpedaling on the removal, but I'm not going to > fight for it. > > That said, does waiting for arm64 numbers and/or producing them for the > removal commit message sound like a plan? If so, I'll post soon(tm). Honestly, I wouldn't worry about us (arm64) here. I don't think any real hardware implements the YIELD instruction (i.e. it behaves as a NOP in practice). The only place I'm aware of where it _does_ something is in QEMU, which was actually the motivation behind having it in cpu_relax() to start with (see 1baa82f48030 ("arm64: Implement cpu_relax as yield")). So, from the arm64 side of the fence, I'm perfectly happy just removing the cpu_relax() calls from lockref. Will
Re: [PATCH] kallsyms: Fix scheduling with interrupts disabled in self-test
On Thu 2023-01-12 10:24:43, Luis Chamberlain wrote: > On Thu, Jan 12, 2023 at 08:54:26PM +1000, Nicholas Piggin wrote: > > kallsyms_on_each* may schedule so must not be called with interrupts > > disabled. The iteration function could disable interrupts, but this > > also changes lookup_symbol() to match the change to the other timing > > code. > > > > Reported-by: Erhard F. > > Link: > > https://lore.kernel.org/all/bug-216902-206...@https.bugzilla.kernel.org%2F/ > > Reported-by: kernel test robot > > Link: > > https://lore.kernel.org/oe-lkp/202212251728.8d0872ff-oliver.s...@intel.com > > Fixes: 30f3bb09778d ("kallsyms: Add self-test facility") > > Signed-off-by: Nicholas Piggin > > --- > > Thanks Nicholas! > > Petr had just suggested removing this aspect of the selftests, the performance > test as its specific to the config, it doesn't run many times to get an > average and odd things on a system can create different metrics. Zhen Lei > had given up on fixing it and has a patch to instead remove this part of > the selftest. > > I still find value in keeping it, but Petr, would like your opinion on > this fix, if we were to keep it. I am fine with this fix. It increases a risk of possible inaccuracy of the measured time. It would count also time spent on unrelated interrupts and eventual rescheduling. Anyway, it is safe at least. I was against the previous attempts to fix this problem because they might have caused problems for the rest of the system. Best Regards, Petr