Re: [PATCH] powerpc/powernv/pci: remove dead code from !CONFIG_EEH
On Thu, Apr 22, 2021 at 6:13 PM Oliver O'Halloran wrote:
> On Fri, Apr 23, 2021 at 9:09 AM Daniel Axtens wrote:
> >
> > Hi Nick,
> >
> > > While looking at -Wundef warnings, the #if CONFIG_EEH stood out as a
> > > possible candidate to convert to #ifdef CONFIG_EEH, but it seems that
> > > based on Kconfig dependencies it's not possible to build this file
> > > without CONFIG_EEH enabled.
> >
> > This seemed odd to me, but I think you're right:
> >
> > arch/powerpc/platforms/Kconfig contains:
> >
> > config EEH
> >         bool
> >         depends on (PPC_POWERNV || PPC_PSERIES) && PCI
> >         default y
> >
> > It's not configurable from e.g. make menuconfig because there's no prompt.
> > You can attempt to explicitly disable it with e.g. `scripts/config -d EEH`
> > but then something like `make oldconfig` will silently re-enable it for
> > you.
> >
> > It's been forced on since commit e49f7a9997c6 ("powerpc/pseries: Rivet
> > CONFIG_EEH for pSeries platform") in 2012 which fixed it for
> > pseries. That moved out from pseries to pseries + powernv later on.
> >
> > There are other cleanups in the same vein that could be made, from the
> > Makefile (which has files only built with CONFIG_EEH) through to other
> > source files. It looks like there's one `#ifdef CONFIG_EEH` in
> > arch/powerpc/platforms/powernv/pci-ioda.c that could be pulled out, for
> > example.
> >
> > I think it's probably worth trying to rip out all of those in one patch?
>
> The change in commit e49f7a9997c6 ("powerpc/pseries: Rivet CONFIG_EEH
> for pSeries platform") never should have been made.

I'll change my patch to keep the conditionals, but use #ifdef instead of
#if then?

> There's no inherent reason why EEH needs to be enabled and forcing it
> on is (IMO) a large part of why EEH support is the byzantine
> clusterfuck that it is. One of the things I was working towards was
> allowing pseries and powernv to be built with !CONFIG_EEH since that
> would help define a clearer boundary between what is "eeh support" and
> what is required to support PCI on the platform. Pseries is
> particularly bad for this since PAPR says the RTAS calls needed to do
> a PCI bus reset are part of the EEH extension, but there's non-EEH
> reasons why you might want to use those RTAS calls. The PHB reset that
> we do when entering a kdump kernel is a good example since that uses
> the same RTAS calls, but it has nothing to do with the EEH recovery
> machinery enabled by CONFIG_EEH.
>
> I was looking into that largely because people were considering using
> OPAL for microwatt platforms. Breaking the assumption that
> powernv==EEH support is one of the few bits of work required to enable
> that, but even if you don't go down that road I think everyone would
> be better off if you kept a degree of separation between the two.

--
Thanks,
~Nick Desaulniers
Re: [PATCH 6/9] tty: hvc_console: Fix coding style issues of block comments
Hi Johan,

Thanks for reviewing this patch.

On 2021/5/17 22:15, Johan Hovold wrote:
> On Mon, May 17, 2021 at 02:37:10PM +0800, Xiaofei Tan wrote:
> > Fix coding style issues of block comments, reported by checkpatch.pl.
> > Besides, add a period at the end of the sentenses.
> >
> > Signed-off-by: Xiaofei Tan
> > ---
> >  drivers/tty/hvc/hvc_console.c | 15 ++-
> >  1 file changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
> > index 39018e5..a61cdf0 100644
> > --- a/drivers/tty/hvc/hvc_console.c
> > +++ b/drivers/tty/hvc/hvc_console.c
> > @@ -177,7 +177,8 @@ static void hvc_console_print(struct console *co, const char *b,
> >  			r = cons_ops[index]->put_chars(vtermnos[index], c, i);
> >  		if (r <= 0) {
> >  			/* throw away characters on error
> > -			 * but spin in case of -EAGAIN */
> > +			 * but spin in case of -EAGAIN.
> > +			 */
>
> How is this an improvement?
>
> First, the multi-line comment style is
>
> 	/*
> 	 * ...
> 	 */

Yes, mostly we use this style. I can follow it if a new version is needed.
BTW, how about adding the '/*' check into checkpatch.pl?

> Second, that sentence is not capitalised so why do you add a period?

How about capitalising the sentence, or just removing the period?

> Third, why are you sending checkpatch.pl cleanups for files outside of
> staging?

I'm sorry, is this a rule, or a kind of tradition? I've never heard of
this before.

> Unless doing some real changes to the files in question as well this is
> mostly just churn and noise that makes it harder to backport fixes and
> do code forensics for no real gain.

I'm not sure. But if cleanup patches have made it hard to backport fixes
and do code forensics, then the code quality may not be good enough.

> Greg may disagree, but I don't think we should be encouraging this kind
> of patches.
>
> Johan
Re: [PATCH v14 3/9] powerpc/modules: Make module_alloc() Strict Module RWX aware
Christophe Leroy writes:
> Le 17/05/2021 à 13:01, Michael Ellerman a écrit :
>> Jordan Niethe writes:
>>> On Mon, May 17, 2021 at 4:37 PM Christophe Leroy wrote:
>>>> Le 17/05/2021 à 05:28, Jordan Niethe a écrit :
>>>>> Make module_alloc() use PAGE_KERNEL protections instead of
>>>>> PAGE_KERNEL_EXEC if Strict Module RWX is enabled.
>>>>>
>>>>> Signed-off-by: Jordan Niethe
>>>>> ---
>>>>> v14: - Split out from powerpc: Set ARCH_HAS_STRICT_MODULE_RWX
>>>>>      - Add and use strict_module_rwx_enabled() helper
>>>>> ---
>>>>>  arch/powerpc/include/asm/mmu.h | 5 +
>>>>>  arch/powerpc/kernel/module.c   | 4 +++-
>>>>>  2 files changed, 8 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
>>>>> index 607168b1aef4..7710bf0cbf8a 100644
>>>>> --- a/arch/powerpc/include/asm/mmu.h
>>>>> +++ b/arch/powerpc/include/asm/mmu.h
>>>>> @@ -357,6 +357,11 @@ static inline bool strict_kernel_rwx_enabled(void)
>>>>>         return false;
>>>>>  }
>>>>>  #endif
>>>>> +
>>>>> +static inline bool strict_module_rwx_enabled(void)
>>>>> +{
>>>>> +       return IS_ENABLED(CONFIG_STRICT_MODULE_RWX) && strict_kernel_rwx_enabled();
>>>>> +}
>>>>
>>>> Looking at arch/Kconfig, I have the feeling that it is possible to
>>>> select CONFIG_STRICT_MODULE_RWX without selecting
>>>> CONFIG_STRICT_KERNEL_RWX. In that case, strict_kernel_rwx_enabled()
>>>> will return false.
>>>
>>> Ok, if someone did that currently it would break things, e.g. code
>>> patching. I think it should be made impossible to select
>>> CONFIG_STRICT_MODULE_RWX without CONFIG_STRICT_KERNEL_RWX?
>>
>> Yeah I don't see any reason to support that combination.
>>
>> We should be moving to a world where both are on by default, or in fact
>> are always enabled.
>
> Would it work if we add the following in arch/powerpc/Kconfig ? :
>
>     select STRICT_KERNEL_RWX if STRICT_MODULE_RWX
>
> There should be no dependency issue as powerpc only selects
> ARCH_HAS_STRICT_MODULE_RWX when ARCH_HAS_STRICT_KERNEL_RWX is also
> selected.

I think it will work. It's slightly rude to select things like that, but
I think it's OK for something like this.

Medium term we can possibly just have the generic STRICT_MODULE_RWX
depend on STRICT_KERNEL_RWX.

cheers
[PATCH AUTOSEL 5.10 1/3] powerpc/pseries: Fix hcall tracing recursion in pv queued spinlocks
From: Nicholas Piggin [ Upstream commit 2c8c89b95831f46a2fb31a8d0fef4601694023ce ] The paravit queued spinlock slow path adds itself to the queue then calls pv_wait to wait for the lock to become free. This is implemented by calling H_CONFER to donate cycles. When hcall tracing is enabled, this H_CONFER call can lead to a spin lock being taken in the tracing code, which will result in the lock to be taken again, which will also go to the slow path because it queues behind itself and so won't ever make progress. An example trace of a deadlock: __pv_queued_spin_lock_slowpath trace_clock_global ring_buffer_lock_reserve trace_event_buffer_lock_reserve trace_event_buffer_reserve trace_event_raw_event_hcall_exit __trace_hcall_exit plpar_hcall_norets_trace __pv_queued_spin_lock_slowpath trace_clock_global ring_buffer_lock_reserve trace_event_buffer_lock_reserve trace_event_buffer_reserve trace_event_raw_event_rcu_dyntick rcu_irq_exit irq_exit __do_irq call_do_irq do_IRQ hardware_interrupt_common_virt Fix this by introducing plpar_hcall_norets_notrace(), and using that to make SPLPAR virtual processor dispatching hcalls by the paravirt spinlock code. Signed-off-by: Nicholas Piggin Reviewed-by: Naveen N. 
Rao Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20210508101455.1578318-2-npig...@gmail.com Signed-off-by: Sasha Levin --- arch/powerpc/include/asm/hvcall.h | 3 +++ arch/powerpc/include/asm/paravirt.h | 22 +++--- arch/powerpc/platforms/pseries/hvCall.S | 10 ++ arch/powerpc/platforms/pseries/lpar.c | 3 +-- 4 files changed, 33 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h index c1fbccb04390..3e8e19f5746c 100644 --- a/arch/powerpc/include/asm/hvcall.h +++ b/arch/powerpc/include/asm/hvcall.h @@ -437,6 +437,9 @@ */ long plpar_hcall_norets(unsigned long opcode, ...); +/* Variant which does not do hcall tracing */ +long plpar_hcall_norets_notrace(unsigned long opcode, ...); + /** * plpar_hcall: - Make a pseries hypervisor call * @opcode: The hypervisor call to make. diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h index 9362c94fe3aa..588bfb9a0579 100644 --- a/arch/powerpc/include/asm/paravirt.h +++ b/arch/powerpc/include/asm/paravirt.h @@ -24,19 +24,35 @@ static inline u32 yield_count_of(int cpu) return be32_to_cpu(yield_count); } +/* + * Spinlock code confers and prods, so don't trace the hcalls because the + * tracing code takes spinlocks which can cause recursion deadlocks. + * + * These calls are made while the lock is not held: the lock slowpath yields if + * it can not acquire the lock, and unlock slow path might prod if a waiter has + * yielded). So this may not be a problem for simple spin locks because the + * tracing does not technically recurse on the lock, but we avoid it anyway. 
+ * + * However the queued spin lock contended path is more strictly ordered: the + * H_CONFER hcall is made after the task has queued itself on the lock, so then + * recursing on that lock will cause the task to then queue up again behind the + * first instance (or worse: queued spinlocks use tricks that assume a context + * never waits on more than one spinlock, so such recursion may cause random + * corruption in the lock code). + */ static inline void yield_to_preempted(int cpu, u32 yield_count) { - plpar_hcall_norets(H_CONFER, get_hard_smp_processor_id(cpu), yield_count); + plpar_hcall_norets_notrace(H_CONFER, get_hard_smp_processor_id(cpu), yield_count); } static inline void prod_cpu(int cpu) { - plpar_hcall_norets(H_PROD, get_hard_smp_processor_id(cpu)); + plpar_hcall_norets_notrace(H_PROD, get_hard_smp_processor_id(cpu)); } static inline void yield_to_any(void) { - plpar_hcall_norets(H_CONFER, -1, 0); + plpar_hcall_norets_notrace(H_CONFER, -1, 0); } #else static inline bool is_shared_processor(void) diff --git a/arch/powerpc/platforms/pseries/hvCall.S b/arch/powerpc/platforms/pseries/hvCall.S index 2136e42833af..8a2b8d64265b 100644 --- a/arch/powerpc/platforms/pseries/hvCall.S +++ b/arch/powerpc/platforms/pseries/hvCall.S @@ -102,6 +102,16 @@ END_FTR_SECTION(0, 1); \ #define HCALL_BRANCH(LABEL) #endif +_GLOBAL_TOC(plpar_hcall_norets_notrace) + HMT_MEDIUM + + mfcrr0 + stw r0,8(r1) + HVSC/* invoke the hypervisor */ + lwz r0,8(r1) + mtcrf 0xff,r0 + blr /* return r3 = status */ + _GLOBAL_TOC(plpar_hcall_norets) HMT_MEDIUM diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c index 764170fdb0f7..1c3ac0f66336 100644 --- a/arch/powerpc/platforms/pseries/lpar.c +++ b/arch/powerpc/platforms/pseries/lpar.c @@ -1827,8 +1827,7 @@ void
[PATCH AUTOSEL 5.11 1/3] powerpc/pseries: Fix hcall tracing recursion in pv queued spinlocks
From: Nicholas Piggin [ Upstream commit 2c8c89b95831f46a2fb31a8d0fef4601694023ce ] The paravit queued spinlock slow path adds itself to the queue then calls pv_wait to wait for the lock to become free. This is implemented by calling H_CONFER to donate cycles. When hcall tracing is enabled, this H_CONFER call can lead to a spin lock being taken in the tracing code, which will result in the lock to be taken again, which will also go to the slow path because it queues behind itself and so won't ever make progress. An example trace of a deadlock: __pv_queued_spin_lock_slowpath trace_clock_global ring_buffer_lock_reserve trace_event_buffer_lock_reserve trace_event_buffer_reserve trace_event_raw_event_hcall_exit __trace_hcall_exit plpar_hcall_norets_trace __pv_queued_spin_lock_slowpath trace_clock_global ring_buffer_lock_reserve trace_event_buffer_lock_reserve trace_event_buffer_reserve trace_event_raw_event_rcu_dyntick rcu_irq_exit irq_exit __do_irq call_do_irq do_IRQ hardware_interrupt_common_virt Fix this by introducing plpar_hcall_norets_notrace(), and using that to make SPLPAR virtual processor dispatching hcalls by the paravirt spinlock code. Signed-off-by: Nicholas Piggin Reviewed-by: Naveen N. 
Rao Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20210508101455.1578318-2-npig...@gmail.com Signed-off-by: Sasha Levin --- arch/powerpc/include/asm/hvcall.h | 3 +++ arch/powerpc/include/asm/paravirt.h | 22 +++--- arch/powerpc/platforms/pseries/hvCall.S | 10 ++ arch/powerpc/platforms/pseries/lpar.c | 3 +-- 4 files changed, 33 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h index c98f5141e3fc..dd89aa3aea8f 100644 --- a/arch/powerpc/include/asm/hvcall.h +++ b/arch/powerpc/include/asm/hvcall.h @@ -446,6 +446,9 @@ */ long plpar_hcall_norets(unsigned long opcode, ...); +/* Variant which does not do hcall tracing */ +long plpar_hcall_norets_notrace(unsigned long opcode, ...); + /** * plpar_hcall: - Make a pseries hypervisor call * @opcode: The hypervisor call to make. diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h index 5d1726bb28e7..bcb7b5f917be 100644 --- a/arch/powerpc/include/asm/paravirt.h +++ b/arch/powerpc/include/asm/paravirt.h @@ -28,19 +28,35 @@ static inline u32 yield_count_of(int cpu) return be32_to_cpu(yield_count); } +/* + * Spinlock code confers and prods, so don't trace the hcalls because the + * tracing code takes spinlocks which can cause recursion deadlocks. + * + * These calls are made while the lock is not held: the lock slowpath yields if + * it can not acquire the lock, and unlock slow path might prod if a waiter has + * yielded). So this may not be a problem for simple spin locks because the + * tracing does not technically recurse on the lock, but we avoid it anyway. 
+ * + * However the queued spin lock contended path is more strictly ordered: the + * H_CONFER hcall is made after the task has queued itself on the lock, so then + * recursing on that lock will cause the task to then queue up again behind the + * first instance (or worse: queued spinlocks use tricks that assume a context + * never waits on more than one spinlock, so such recursion may cause random + * corruption in the lock code). + */ static inline void yield_to_preempted(int cpu, u32 yield_count) { - plpar_hcall_norets(H_CONFER, get_hard_smp_processor_id(cpu), yield_count); + plpar_hcall_norets_notrace(H_CONFER, get_hard_smp_processor_id(cpu), yield_count); } static inline void prod_cpu(int cpu) { - plpar_hcall_norets(H_PROD, get_hard_smp_processor_id(cpu)); + plpar_hcall_norets_notrace(H_PROD, get_hard_smp_processor_id(cpu)); } static inline void yield_to_any(void) { - plpar_hcall_norets(H_CONFER, -1, 0); + plpar_hcall_norets_notrace(H_CONFER, -1, 0); } #else static inline bool is_shared_processor(void) diff --git a/arch/powerpc/platforms/pseries/hvCall.S b/arch/powerpc/platforms/pseries/hvCall.S index 2136e42833af..8a2b8d64265b 100644 --- a/arch/powerpc/platforms/pseries/hvCall.S +++ b/arch/powerpc/platforms/pseries/hvCall.S @@ -102,6 +102,16 @@ END_FTR_SECTION(0, 1); \ #define HCALL_BRANCH(LABEL) #endif +_GLOBAL_TOC(plpar_hcall_norets_notrace) + HMT_MEDIUM + + mfcrr0 + stw r0,8(r1) + HVSC/* invoke the hypervisor */ + lwz r0,8(r1) + mtcrf 0xff,r0 + blr /* return r3 = status */ + _GLOBAL_TOC(plpar_hcall_norets) HMT_MEDIUM diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c index cd38bd421f38..d4aa6a46e1fa 100644 --- a/arch/powerpc/platforms/pseries/lpar.c +++ b/arch/powerpc/platforms/pseries/lpar.c @@ -1830,8 +1830,7 @@ void
[PATCH AUTOSEL 5.12 2/5] powerpc/pseries: Fix hcall tracing recursion in pv queued spinlocks
From: Nicholas Piggin [ Upstream commit 2c8c89b95831f46a2fb31a8d0fef4601694023ce ] The paravit queued spinlock slow path adds itself to the queue then calls pv_wait to wait for the lock to become free. This is implemented by calling H_CONFER to donate cycles. When hcall tracing is enabled, this H_CONFER call can lead to a spin lock being taken in the tracing code, which will result in the lock to be taken again, which will also go to the slow path because it queues behind itself and so won't ever make progress. An example trace of a deadlock: __pv_queued_spin_lock_slowpath trace_clock_global ring_buffer_lock_reserve trace_event_buffer_lock_reserve trace_event_buffer_reserve trace_event_raw_event_hcall_exit __trace_hcall_exit plpar_hcall_norets_trace __pv_queued_spin_lock_slowpath trace_clock_global ring_buffer_lock_reserve trace_event_buffer_lock_reserve trace_event_buffer_reserve trace_event_raw_event_rcu_dyntick rcu_irq_exit irq_exit __do_irq call_do_irq do_IRQ hardware_interrupt_common_virt Fix this by introducing plpar_hcall_norets_notrace(), and using that to make SPLPAR virtual processor dispatching hcalls by the paravirt spinlock code. Signed-off-by: Nicholas Piggin Reviewed-by: Naveen N. 
Rao Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20210508101455.1578318-2-npig...@gmail.com Signed-off-by: Sasha Levin --- arch/powerpc/include/asm/hvcall.h | 3 +++ arch/powerpc/include/asm/paravirt.h | 22 +++--- arch/powerpc/platforms/pseries/hvCall.S | 10 ++ arch/powerpc/platforms/pseries/lpar.c | 3 +-- 4 files changed, 33 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h index ed6086d57b22..0c92b01a3c3c 100644 --- a/arch/powerpc/include/asm/hvcall.h +++ b/arch/powerpc/include/asm/hvcall.h @@ -446,6 +446,9 @@ */ long plpar_hcall_norets(unsigned long opcode, ...); +/* Variant which does not do hcall tracing */ +long plpar_hcall_norets_notrace(unsigned long opcode, ...); + /** * plpar_hcall: - Make a pseries hypervisor call * @opcode: The hypervisor call to make. diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h index 5d1726bb28e7..bcb7b5f917be 100644 --- a/arch/powerpc/include/asm/paravirt.h +++ b/arch/powerpc/include/asm/paravirt.h @@ -28,19 +28,35 @@ static inline u32 yield_count_of(int cpu) return be32_to_cpu(yield_count); } +/* + * Spinlock code confers and prods, so don't trace the hcalls because the + * tracing code takes spinlocks which can cause recursion deadlocks. + * + * These calls are made while the lock is not held: the lock slowpath yields if + * it can not acquire the lock, and unlock slow path might prod if a waiter has + * yielded). So this may not be a problem for simple spin locks because the + * tracing does not technically recurse on the lock, but we avoid it anyway. 
+ * + * However the queued spin lock contended path is more strictly ordered: the + * H_CONFER hcall is made after the task has queued itself on the lock, so then + * recursing on that lock will cause the task to then queue up again behind the + * first instance (or worse: queued spinlocks use tricks that assume a context + * never waits on more than one spinlock, so such recursion may cause random + * corruption in the lock code). + */ static inline void yield_to_preempted(int cpu, u32 yield_count) { - plpar_hcall_norets(H_CONFER, get_hard_smp_processor_id(cpu), yield_count); + plpar_hcall_norets_notrace(H_CONFER, get_hard_smp_processor_id(cpu), yield_count); } static inline void prod_cpu(int cpu) { - plpar_hcall_norets(H_PROD, get_hard_smp_processor_id(cpu)); + plpar_hcall_norets_notrace(H_PROD, get_hard_smp_processor_id(cpu)); } static inline void yield_to_any(void) { - plpar_hcall_norets(H_CONFER, -1, 0); + plpar_hcall_norets_notrace(H_CONFER, -1, 0); } #else static inline bool is_shared_processor(void) diff --git a/arch/powerpc/platforms/pseries/hvCall.S b/arch/powerpc/platforms/pseries/hvCall.S index 2136e42833af..8a2b8d64265b 100644 --- a/arch/powerpc/platforms/pseries/hvCall.S +++ b/arch/powerpc/platforms/pseries/hvCall.S @@ -102,6 +102,16 @@ END_FTR_SECTION(0, 1); \ #define HCALL_BRANCH(LABEL) #endif +_GLOBAL_TOC(plpar_hcall_norets_notrace) + HMT_MEDIUM + + mfcrr0 + stw r0,8(r1) + HVSC/* invoke the hypervisor */ + lwz r0,8(r1) + mtcrf 0xff,r0 + blr /* return r3 = status */ + _GLOBAL_TOC(plpar_hcall_norets) HMT_MEDIUM diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c index cd38bd421f38..d4aa6a46e1fa 100644 --- a/arch/powerpc/platforms/pseries/lpar.c +++ b/arch/powerpc/platforms/pseries/lpar.c @@ -1830,8 +1830,7 @@ void
Re: [PATCH kernel v3] powerpc/makefile: Do not redefine $(CPP) for preprocessor
Hi!

On Mon, May 17, 2021 at 01:23:11PM +1000, Alexey Kardashevskiy wrote:
> On 5/14/21 18:46, Segher Boessenkool wrote:
> > On Fri, May 14, 2021 at 11:42:32AM +0900, Masahiro Yamada wrote:
> > > In my best guess, the reason why powerpc adding the endian flag to CPP
> > > is this line in arch/powerpc/kernel/vdso64/vdso64.lds.S
> > >
> > > #ifdef __LITTLE_ENDIAN__
> > > OUTPUT_FORMAT("elf64-powerpcle", "elf64-powerpcle", "elf64-powerpcle")
> > > #else
> > > OUTPUT_FORMAT("elf64-powerpc", "elf64-powerpc", "elf64-powerpc")
> > > #endif
> >
> > Which is equivalent to
> >
> > #ifdef __LITTLE_ENDIAN__
> > OUTPUT_FORMAT("elf64-powerpcle")
> > #else
> > OUTPUT_FORMAT("elf64-powerpc")
> > #endif
> >
> > so please change that at the same time if you touch this :-)
>
> "If you touch this" approach did not work well with this patch so sorry
> but no ;)
>
> and for a separate patch, I'll have to dig since when it is equal, do
> you know?

Since 1994, when the three-arg version was introduced (the one-arg
version is from 1992).

> > > __LITTLE_ENDIAN__ is defined by powerpc gcc and clang.
> >
> > This predefined macro is required by the newer ABIs, but all older
>
> That's good so I'll stick to it.

Great.

> > You can just write -mbig and -mlittle btw. Those aren't available on
> > all targets, but neither are the long-winded -m{big,little}-endian
> > option names. Pet peeve, I know :-)
>
> I am looking for the same guarantees across modern enough gcc and clang
> and I am not sure all of the above is valid for clang 10.0.something (or
> whatever we say we support) ;)

-mbig/-mlittle is supported in GCC since times immemorial. Whether LLVM
supports it as well just depends on how good their emulation is, I have
no idea.


Segher
Re: [PATCH v2 01/14] PCI: Use sysfs_emit() and sysfs_emit_at() in "show" functions
Hi Logan,

> > The sysfs_emit() and sysfs_emit_at() functions were introduced to make
> > it less ambiguous which function is preferred when writing to the output
> > buffer in a device attribute's "show" callback [1].
> >
> > Convert the PCI sysfs object "show" functions from sprintf(), snprintf()
> > and scnprintf() to sysfs_emit() and sysfs_emit_at() accordingly, as the
> > latter is aware of the PAGE_SIZE buffer and correctly returns the number
> > of bytes written into the buffer.
> >
> > No functional change intended.
> >
> > [1] Documentation/filesystems/sysfs.rst
> >
> > Related to:
> > commit ad025f8e46f3 ("PCI/sysfs: Use sysfs_emit() and sysfs_emit_at() in
> > "show" functions")
>
> I re-reviewed the whole series. It still looks good to me.
>
> Very nice solution in patch 12 to the new line issue.
>
> Reviewed-by: Logan Gunthorpe
>
> Thanks,

Thank you! I will send v3 incorporating the style change as per Joe's
suggestion and carry over your "Reviewed-by", if you don't mind, as it
will be a trivial change.

Krzysztof
Re: [PATCH v2 01/14] PCI: Use sysfs_emit() and sysfs_emit_at() in "show" functions
On 2021-05-14 11:24 p.m., Krzysztof Wilczyński wrote:
> The sysfs_emit() and sysfs_emit_at() functions were introduced to make
> it less ambiguous which function is preferred when writing to the output
> buffer in a device attribute's "show" callback [1].
>
> Convert the PCI sysfs object "show" functions from sprintf(), snprintf()
> and scnprintf() to sysfs_emit() and sysfs_emit_at() accordingly, as the
> latter is aware of the PAGE_SIZE buffer and correctly returns the number
> of bytes written into the buffer.
>
> No functional change intended.
>
> [1] Documentation/filesystems/sysfs.rst
>
> Related to:
> commit ad025f8e46f3 ("PCI/sysfs: Use sysfs_emit() and sysfs_emit_at() in
> "show" functions")
>
> Signed-off-by: Krzysztof Wilczyński
> Reviewed-by: Logan Gunthorpe

I re-reviewed the whole series. It still looks good to me.

Very nice solution in patch 12 to the new line issue.

Reviewed-by: Logan Gunthorpe

Thanks,

Logan
Re: Fwd: [Bug 213069] New: kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:147! Oops: Exception in kernel mode, sig: 5 [#1]
Le 17/05/2021 à 15:12, Anshuman Khandual a écrit :
> On 5/17/21 6:29 PM, Christophe Leroy wrote:
>> Le 17/05/2021 à 14:49, Anshuman Khandual a écrit :
>>> On 5/17/21 11:25 AM, Aneesh Kumar K.V wrote:
>>>> On 5/17/21 11:17 AM, Christophe Leroy wrote:
>>>>> +aneesh
>>>>> +linuxppc-dev list
>>>>>
>>>>> Le 17/05/2021 à 07:44, Anshuman Khandual a écrit :
>>>>>> Hello Christophe,
>>>>>>
>>>>>> DEBUG_VM_PGTABLE has now been re-enabled on powerpc recently ? was
>>>>>> not aware about this. From the error log, it failed explicitly on
>>>>>> 4K page size hash config.
>>>>>>
>>>>>> static inline pmd_t hash__pmd_mkhuge(pmd_t pmd)
>>>>>> {
>>>>>>         BUG();  --> Failed
>>>>>>         return pmd;
>>>>>> }
>>>>>>
>>>>>> static inline pmd_t __pmd_mkhuge(pmd_t pmd)
>>>>>> {
>>>>>>         if (radix_enabled())
>>>>>>                 return radix__pmd_mkhuge(pmd);
>>>>>>         return hash__pmd_mkhuge(pmd);
>>>>>> }
>>>>>>
>>>>>> pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot)
>>>>>> {
>>>>>>         unsigned long pmdv;
>>>>>>
>>>>>>         pmdv = (pfn << PAGE_SHIFT) & PTE_RPN_MASK;
>>>>>>
>>>>>>         return __pmd_mkhuge(pmd_set_protbits(__pmd(pmdv), pgprot));
>>>>>> }
>>>>>>
>>>>>> It seems like on powerpc, where pfn_pmd() makes a huge page but which
>>>>>> is not supported on 4K hash config thus triggering the BUG(). But all
>>>>>> pfn_pmd() call sites inside the debug_vm_pgtable() test are protected
>>>>>> with CONFIG_TRANSPARENT_HUGEPAGE. IIUC unlike powerpc, pfn_pmd() does
>>>>>> not directly make a huge page on other platforms.
>>>>>>
>>>>>> Looking at arch/powerpc/include/asm/book3s/64/hash-4k.h, all relevant
>>>>>> THP helpers has BUG() or 0 which indicates THP might not be supported
>>>>>> on 4K page size hash config ?
>>>>>>
>>>>>> But looking at arch/powerpc/platforms/Kconfig.cputype, it seems like
>>>>>> HAVE_ARCH_TRANSPARENT_HUGEPAGE is invariably selected on PPC_BOOK3S_64
>>>>>> platforms which I assume includes 4K page size hash config as well.
>>>>>>
>>>>>> Is THP some how getting enabled on this 4K page size hash config where
>>>>>> it should not be (thus triggering the BUG) ? OR am I missing something
>>>>>> here.
>>>>
>>>> We should put those pfn_pmd() and pfn_pud() after
>>>>
>>>> if (!has_transparent_hugepage())
>>>>         return;
>>>
>>> The following patch has been lightly tested on arm64 and x86 platforms.
>>> Could you please verify if this solves the problem on powerpc 4K hash
>>> config ? Thank you.
>>
>> No need to update pmd_advanced_tests() and pud_advanced_tests() ?
>
> They already have has_transparent_hugepage() check and all pfn_pxx()
> instances are after that check. Do you see any other concern ?

As far as I can see, in 5.13-rc2 they are before the check:

https://elixir.bootlin.com/linux/v5.13-rc2/source/mm/debug_vm_pgtable.c#L183

Christophe
Re: [PATCH v5 5/9] powerpc/mm/book3s64: Update tlb flush routines to take a page walk cache flush argument
On 5/17/21 6:55 AM, Aneesh Kumar K.V wrote:
> Guenter Roeck writes:
>> On 5/17/21 1:40 AM, Aneesh Kumar K.V wrote:
>>> On 5/15/21 10:05 PM, Guenter Roeck wrote:
>>>> On Thu, Apr 22, 2021 at 11:13:19AM +0530, Aneesh Kumar K.V wrote:
>>>> ...
>>>>>   extern void radix__local_flush_all_mm(struct mm_struct *mm);
>>>>> diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h
>>>>> index 215973b4cb26..f9f8a3a264f7 100644
>>>>> --- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
>>>>> +++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
>>>>> @@ -45,13 +45,30 @@ static inline void tlbiel_all_lpid(bool radix)
>>>>>          hash__tlbiel_all(TLB_INVAL_SCOPE_LPID);
>>>>>   }
>>>>>
>>>>> +static inline void flush_pmd_tlb_pwc_range(struct vm_area_struct *vma,
>>>>> +                                          unsigned long start,
>>>>> +                                          unsigned long end,
>>>>> +                                          bool flush_pwc)
>>>>> +{
>>>>> +       if (radix_enabled())
>>>>> +               return radix__flush_pmd_tlb_range(vma, start, end, flush_pwc);
>>>>> +       return hash__flush_tlb_range(vma, start, end);
>>>>                 ^^
>>>>> +}
>>>
>>> In this specific case we won't have build errors because,
>>>
>>> static inline void hash__flush_tlb_range(struct vm_area_struct *vma,
>>>                                          unsigned long start, unsigned long end)
>>> {
>>
>> Sorry, you completely lost me.
>>
>> Building parisc:allnoconfig ... failed
>> --
>> Error log:
>> In file included from arch/parisc/include/asm/cacheflush.h:7,
>>                  from include/linux/highmem.h:12,
>>                  from include/linux/pagemap.h:11,
>>                  from include/linux/ksm.h:13,
>>                  from mm/mremap.c:14:
>> mm/mremap.c: In function 'flush_pte_tlb_pwc_range':
>> arch/parisc/include/asm/tlbflush.h:20:2: error: 'return' with a value, in function returning void
>
> As replied here
> https://lore.kernel.org/mm-commits/8eedb441-a612-1ec8-8bf7-b40184de9...@linux.ibm.com/
>
> That was the generic header change in the patch. I was commenting about
> the ppc64 specific change causing build failures.

Ah, sorry. I wasn't aware that the following is valid C code

void f1()
{
	return f2();
}

as long as f2() is void as well. Confusing, but we live and learn.

Guenter
Re: [PATCH 6/9] tty: hvc_console: Fix coding style issues of block comments
On Mon, May 17, 2021 at 02:37:10PM +0800, Xiaofei Tan wrote:
> Fix coding style issues of block comments, reported by checkpatch.pl.
> Besides, add a period at the end of the sentenses.
>
> Signed-off-by: Xiaofei Tan
> ---
>  drivers/tty/hvc/hvc_console.c | 15 ++-
>  1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
> index 39018e5..a61cdf0 100644
> --- a/drivers/tty/hvc/hvc_console.c
> +++ b/drivers/tty/hvc/hvc_console.c
> @@ -177,7 +177,8 @@ static void hvc_console_print(struct console *co, const char *b,
>  			r = cons_ops[index]->put_chars(vtermnos[index], c, i);
>  		if (r <= 0) {
>  			/* throw away characters on error
> -			 * but spin in case of -EAGAIN */
> +			 * but spin in case of -EAGAIN.
> +			 */

How is this an improvement?

First, the multi-line comment style is

	/*
	 * ...
	 */

Second, that sentence is not capitalised so why do you add a period?

Third, why are you sending checkpatch.pl cleanups for files outside of
staging?

Unless doing some real changes to the files in question as well this is
mostly just churn and noise that makes it harder to backport fixes and
do code forensics for no real gain.

Greg may disagree, but I don't think we should be encouraging this kind
of patches.

Johan
Re: Fwd: [Bug 213069] New: kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:147! Oops: Exception in kernel mode, sig: 5 [#1]
On 5/17/21 7:22 PM, Aneesh Kumar K.V wrote:
> Anshuman Khandual writes:
>
>> On 5/17/21 11:25 AM, Aneesh Kumar K.V wrote:
>>> On 5/17/21 11:17 AM, Christophe Leroy wrote:
>>>> +aneesh
>>>> +linuxppc-dev list
>>>>
>>>> Le 17/05/2021 à 07:44, Anshuman Khandual a écrit :
>>>>> Hello Christophe,
>>>>>
>>>>> DEBUG_VM_PGTABLE has now been re-enabled on powerpc recently ? was
>>>>> not aware about this. From the error log, it failed explicitly on
>>>>> 4K page size hash config.
>>>>>
>>>>> static inline pmd_t hash__pmd_mkhuge(pmd_t pmd)
>>>>> {
>>>>>         BUG();  --> Failed
>>>>>         return pmd;
>>>>> }
>>>>>
>>>>> static inline pmd_t __pmd_mkhuge(pmd_t pmd)
>>>>> {
>>>>>         if (radix_enabled())
>>>>>                 return radix__pmd_mkhuge(pmd);
>>>>>         return hash__pmd_mkhuge(pmd);
>>>>> }
>>>>>
>>>>> pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot)
>>>>> {
>>>>>         unsigned long pmdv;
>>>>>
>>>>>         pmdv = (pfn << PAGE_SHIFT) & PTE_RPN_MASK;
>>>>>
>>>>>         return __pmd_mkhuge(pmd_set_protbits(__pmd(pmdv), pgprot));
>>>>> }
>>>>>
>>>>> It seems like on powerpc, where pfn_pmd() makes a huge page but which
>>>>> is not supported on 4K hash config thus triggering the BUG(). But all
>>>>> pfn_pmd() call sites inside the debug_vm_pgtable() test are protected
>>>>> with CONFIG_TRANSPARENT_HUGEPAGE. IIUC unlike powerpc, pfn_pmd() does
>>>>> not directly make a huge page on other platforms.
>>>>>
>>>>> Looking at arch/powerpc/include/asm/book3s/64/hash-4k.h, all relevant
>>>>> THP helpers has BUG() or 0 which indicates THP might not be supported
>>>>> on 4K page size hash config ?
>>>>>
>>>>> But looking at arch/powerpc/platforms/Kconfig.cputype, it seems like
>>>>> HAVE_ARCH_TRANSPARENT_HUGEPAGE is invariably selected on PPC_BOOK3S_64
>>>>> platforms which I assume includes 4K page size hash config as well.
>>>>>
>>>>> Is THP some how getting enabled on this 4K page size hash config where
>>>>> it should not be (thus triggering the BUG) ? OR am I missing something
>>>>> here.
>>>
>>> We should put those pfn_pmd() and pfn_pud() after
>>>
>>> if (!has_transparent_hugepage())
>>>         return;
>>
>> The following patch has been lightly tested on arm64 and x86 platforms.
>> Could you please verify if this solves the problem on powerpc 4K hash
>> config ? Thank you.
>
> Tested the patch with the below additional change.
>
> modified mm/debug_vm_pgtable.c
> @@ -186,12 +186,13 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
>                                        unsigned long pfn, unsigned long vaddr,
>                                        pgprot_t prot, pgtable_t pgtable)
>  {
> -       pmd_t pmd = pfn_pmd(pfn, prot);
> +       pmd_t pmd;
>
>         if (!has_transparent_hugepage())
>                 return;
>
>         pr_debug("Validating PMD advanced\n");
> +       pmd = pfn_pmd(pfn, prot);
>         /* Align the address wrt HPAGE_PMD_SIZE */
>         vaddr = (vaddr & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE;
>
> @@ -334,12 +335,13 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
>                                        unsigned long pfn, unsigned long vaddr,
>                                        pgprot_t prot)
>  {
> -       pud_t pud = pfn_pud(pfn, prot);
> +       pud_t pud;
>
>         if (!has_transparent_hugepage())
>                 return;
>
>         pr_debug("Validating PUD advanced\n");
> +       pud = pfn_pud(pfn, prot);
>         /* Align the address wrt HPAGE_PUD_SIZE */
>         vaddr = (vaddr & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE;

Right. But this change is already on linux-next (20210514) via another
commit ae7920ce9e9bb ("mm/debug_vm_pgtable: remove redundant
pfn_{pmd/pte}() and fix one comment mistake") which got queued recently.

Anyways, now that we have verified the fix, will send it out against
linux-next soon.
Re: [PATCH 14/14] powerpc/64s: use the same default PPR for user and kernel
Le 15/03/2021 à 23:04, Nicholas Piggin a écrit : Change the default PPR to userspace to 4 (medium), matching the normal kernel PPR. This allows system calls and user interrupts to avoid setting PPR on entry and exit, providing a significant speedup. This is a change to the user environment. The problem with changing the kernel to match userspace at 3 (medium-low), is that userspace can then boost priority above the kernel which is also undesirable. glibc does not seem to change PPR anywhere, so the decision is to go with this. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/interrupt.h | 2 ++ arch/powerpc/include/asm/processor.h | 4 ++-- arch/powerpc/kernel/exceptions-64s.S | 3 --- arch/powerpc/kernel/interrupt.c | 33 arch/powerpc/kernel/interrupt_64.S | 17 -- 5 files changed, 37 insertions(+), 22 deletions(-) diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h index d6d54bbcba2f..293e6be9fd71 100644 --- a/arch/powerpc/include/asm/interrupt.h +++ b/arch/powerpc/include/asm/interrupt.h @@ -57,6 +57,8 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup #ifdef CONFIG_PPC64 bool trace_enable = false; + if (unlikely(regs->ppr != DEFAULT_PPR)) + mtspr(SPRN_PPR, DEFAULT_PPR); if (IS_ENABLED(CONFIG_TRACE_IRQFLAGS)) { if (irq_soft_mask_set_return(IRQS_DISABLED) == IRQS_ENABLED) trace_enable = true; diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h index cb1edf21a82e..5ff589042103 100644 --- a/arch/powerpc/include/asm/processor.h +++ b/arch/powerpc/include/asm/processor.h @@ -27,8 +27,8 @@ #endif #ifdef CONFIG_PPC64 -/* Default SMT priority is set to 3. Use 11- 13bits to save priority. */ -#define PPR_PRIORITY 3 +/* Default SMT priority is set to 4. Use 11- 13bits to save priority. 
*/ +#define PPR_PRIORITY 4 #ifdef __ASSEMBLY__ #define DEFAULT_PPR (PPR_PRIORITY << 50) #else diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 75cee7cdf887..0d40614d13e0 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -367,7 +367,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) BEGIN_FTR_SECTION mfspr r9,SPRN_PPR END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) - HMT_MEDIUM std r10,IAREA+EX_R10(r13) /* save r10 - r12 */ BEGIN_FTR_SECTION mfspr r10,SPRN_CFAR @@ -1962,8 +1961,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE) mfspr r11,SPRN_SRR0 mfspr r12,SPRN_SRR1 - HMT_MEDIUM - .if ! \virt __LOAD_HANDLER(r10, system_call_common_real) mtctr r10 diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c index 09cf699d0e2e..a6e0595da0dd 100644 --- a/arch/powerpc/kernel/interrupt.c +++ b/arch/powerpc/kernel/interrupt.c @@ -40,6 +40,11 @@ notrace long system_call_exception(long r3, long r4, long r5, regs->orig_gpr3 = r3; +#ifdef CONFIG_PPC_BOOK3S_64 + if (unlikely(regs->ppr != DEFAULT_PPR)) + mtspr(SPRN_PPR, DEFAULT_PPR); +#endif Can you have some helper functions to do this instead of those 4 lines #ifdefed blocks all over the place ? 
Something like #ifdef CONFIG_PPC_BOOK3S_64 static inline void set_ppr_regs(struct pt_regs *regs) { if (unlikely(regs->ppr != DEFAULT_PPR)) mtspr(SPRN_PPR, regs->ppr); } static inline void set_ppr_default(struct pt_regs *regs) { if (unlikely(regs->ppr != DEFAULT_PPR)) mtspr(SPRN_PPR, DEFAULT_PPR); } #else static inline void set_ppr_regs(struct pt_regs *regs) { } static inline void set_ppr_default(struct pt_regs *regs) { } #endif + if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG)) BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED); @@ -237,6 +242,11 @@ notrace unsigned long syscall_exit_prepare_main(unsigned long r3, account_cpu_user_exit(); +#ifdef CONFIG_PPC_BOOK3S_64 + if (unlikely(regs->ppr != DEFAULT_PPR)) + mtspr(SPRN_PPR, regs->ppr); +#endif + #ifndef CONFIG_PPC_BOOK3E_64 /* BOOK3E not using this */ /* * We do this at the end so that we do context switch with KERNEL AMR @@ -315,6 +325,11 @@ notrace unsigned long syscall_exit_restart(unsigned long r3, struct pt_regs *reg */ hard_irq_disable(); +#ifdef CONFIG_PPC_BOOK3S_64 + if (unlikely(regs->ppr != DEFAULT_PPR)) + mtspr(SPRN_PPR, DEFAULT_PPR); +#endif + trace_hardirqs_off(); user_exit_irqoff(); account_cpu_user_entry(); @@ -398,6 +413,11 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs) account_cpu_user_exit(); +#ifdef CONFIG_PPC_BOOK3S_64 + if (unlikely(regs->ppr != DEFAULT_PPR)) + mtspr(SPRN_PPR,
[PATCH] powerpc/powernv: Fix machine check reporting of async store errors
POWER9 and POWER10 asynchronous machine checks due to stores have their cause reported in SRR1 but SRR1[42] is set, which in other cases indicates DSISR cause. Check for these cases and clear SRR1[42], so the cause matching uses the i-side (SRR1) table. Cc: Mahesh Salgaonkar Fixes: 7b9f71f974 ("powerpc/64s: POWER9 machine check handler") Fixes: 201220bb0e ("powerpc/powernv: Machine check handler for POWER10") Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/mce_power.c | 48 +++-- 1 file changed, 40 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c index 667104d4c455..2fff886c549d 100644 --- a/arch/powerpc/kernel/mce_power.c +++ b/arch/powerpc/kernel/mce_power.c @@ -481,12 +481,11 @@ static int mce_find_instr_ea_and_phys(struct pt_regs *regs, uint64_t *addr, return -1; } -static int mce_handle_ierror(struct pt_regs *regs, +static int mce_handle_ierror(struct pt_regs *regs, unsigned long srr1, const struct mce_ierror_table table[], struct mce_error_info *mce_err, uint64_t *addr, uint64_t *phys_addr) { - uint64_t srr1 = regs->msr; int handled = 0; int i; @@ -695,19 +694,19 @@ static long mce_handle_ue_error(struct pt_regs *regs, } static long mce_handle_error(struct pt_regs *regs, + unsigned long srr1, const struct mce_derror_table dtable[], const struct mce_ierror_table itable[]) { struct mce_error_info mce_err = { 0 }; uint64_t addr, phys_addr = ULONG_MAX; - uint64_t srr1 = regs->msr; long handled; if (SRR1_MC_LOADSTORE(srr1)) handled = mce_handle_derror(regs, dtable, &mce_err, &addr, &phys_addr); else - handled = mce_handle_ierror(regs, itable, &mce_err, &addr, + handled = mce_handle_ierror(regs, srr1, itable, &mce_err, &addr, &phys_addr); if (!handled && mce_err.error_type == MCE_ERROR_TYPE_UE) @@ -723,16 +722,20 @@ long __machine_check_early_realmode_p7(struct pt_regs *regs) /* P7 DD1 leaves top bits of DSISR undefined */ regs->dsisr &= 0x; - return mce_handle_error(regs, mce_p7_derror_table, mce_p7_ierror_table); + return mce_handle_error(regs, regs->msr, + mce_p7_derror_table, mce_p7_ierror_table); } long __machine_check_early_realmode_p8(struct pt_regs *regs) { - return mce_handle_error(regs, mce_p8_derror_table, mce_p8_ierror_table); + return mce_handle_error(regs, regs->msr, + mce_p8_derror_table, mce_p8_ierror_table); } long __machine_check_early_realmode_p9(struct pt_regs *regs) { + unsigned long srr1 = regs->msr; + /* * On POWER9 DD2.1 and below, it's possible to get a machine check * caused by a paste instruction where only DSISR bit 25 is set. This @@ -746,10 +749,39 @@ long __machine_check_early_realmode_p9(struct pt_regs *regs) if (SRR1_MC_LOADSTORE(regs->msr) && regs->dsisr == 0x0200) return 1; - return mce_handle_error(regs, mce_p9_derror_table, mce_p9_ierror_table); + /* + * Async machine check due to bad real address from store or foreign + * link time out comes with the load/store bit (PPC bit 42) set in + * SRR1, but the cause comes in SRR1 not DSISR. Clear bit 42 so we're + * directed to the ierror table so it will find the cause (which + * describes it correctly as a store error). + */ + if (SRR1_MC_LOADSTORE(srr1) && + ((srr1 & 0x081c) == 0x0814 || + (srr1 & 0x081c) == 0x0818)) { + srr1 &= ~PPC_BIT(42); + } + + return mce_handle_error(regs, srr1, + mce_p9_derror_table, mce_p9_ierror_table); } long __machine_check_early_realmode_p10(struct pt_regs *regs) { + unsigned long srr1 = regs->msr; + + /* + * Async machine check due to bad real address from store comes with + * the load/store bit (PPC bit 42) set in SRR1, but the cause comes in + * SRR1 not DSISR. Clear bit 42 so we're directed to the ierror table + * so it will find the cause (which describes it correctly as a store + * error). + */ + if (SRR1_MC_LOADSTORE(srr1) && + (srr1 & 0x081c) == 0x0814) { + srr1 &= ~PPC_BIT(42); + } + + return mce_handle_error(regs, srr1, + mce_p10_derror_table, mce_p10_ierror_table); } -- 2.23.0
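For readers unfamiliar with the IBM bit-numbering convention used here, a small userspace sketch of the redirect logic may help review. The mask/match cause patterns below are placeholders, not the real SRR1 encodings; only the PPC_BIT() definition matches the kernel's (arch/powerpc/include/asm/bitops.h):

```c
#include <assert.h>
#include <stdint.h>

/* IBM bit numbering counts from the most significant bit, so bit 42 of
 * a 64-bit SPR is 1ULL << (63 - 42); this matches the kernel's
 * PPC_BIT() in arch/powerpc/include/asm/bitops.h. */
#define PPC_BIT(bit) (1ULL << (63 - (bit)))

/* Sketch of the redirect in the patch: when SRR1 has the load/store
 * indicator (bit 42) set but the cause pattern identifies an async
 * store error, clear SRR1[42] so the i-side (SRR1) table is used for
 * cause matching.  mask/match are illustrative placeholders. */
static uint64_t redirect_async_store(uint64_t srr1, uint64_t mask, uint64_t match)
{
	if ((srr1 & PPC_BIT(42)) && (srr1 & mask) == match)
		srr1 &= ~PPC_BIT(42);
	return srr1;
}
```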
Re: [PATCH 1/2] powerpc/interrupt: Refactor interrupt_exit_user_prepare() and syscall_exit_prepare()
Le 17/05/2021 à 09:44, Nicholas Piggin a écrit : Excerpts from Christophe Leroy's message of May 14, 2021 6:28 pm: Last part of interrupt_exit_user_prepare() and syscall_exit_prepare() are identical. Create a __interrupt_exit_user_prepare() function that is called by both. Note that it replaces a local_irq_save(flags) by local_irq_disable(). This is equivalent because the flags are never used. On ppc 8xx it is more efficient because it doesn't require reading the MSR. Can these cleanups go after my interrupt performance improvements? I posted them with the last series but they were dropped due to crashes, without time to resubmit. I'm working on them again now. Euh ... ok, why not, but for the time being interrupt_exit_user_prepare() and syscall_exit_prepare() are very similar. Which makes sense, because both of them return from kernel to user so they have to do the same preparation. If you are doing the same changes to both of them, maybe it is worth including this refactor at the beginning of your series. Or are you making them diverge with that series? Christophe
Re: [PATCH v5 5/9] powerpc/mm/book3s64: Update tlb flush routines to take a page walk cache flush argument
Guenter Roeck writes: > On 5/17/21 1:40 AM, Aneesh Kumar K.V wrote: >> On 5/15/21 10:05 PM, Guenter Roeck wrote: >>> On Thu, Apr 22, 2021 at 11:13:19AM +0530, Aneesh Kumar K.V wrote: ... >>> extern void radix__local_flush_all_mm(struct mm_struct *mm); diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h index 215973b4cb26..f9f8a3a264f7 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h @@ -45,13 +45,30 @@ static inline void tlbiel_all_lpid(bool radix) hash__tlbiel_all(TLB_INVAL_SCOPE_LPID); } +static inline void flush_pmd_tlb_pwc_range(struct vm_area_struct *vma, >>> + unsigned long start, + unsigned long end, + bool flush_pwc) +{ + if (radix_enabled()) + return radix__flush_pmd_tlb_range(vma, start, end, flush_pwc); + return hash__flush_tlb_range(vma, start, end); >>> ^^ >>> +} >> >> In this specific case we won't have build errors because, >> >> static inline void hash__flush_tlb_range(struct vm_area_struct *vma, >> unsigned long start, unsigned long end) >> { >> > > Sorry, you completely lost me. > > Building parisc:allnoconfig ... failed > -- > Error log: > In file included from arch/parisc/include/asm/cacheflush.h:7, > from include/linux/highmem.h:12, > from include/linux/pagemap.h:11, > from include/linux/ksm.h:13, > from mm/mremap.c:14: > mm/mremap.c: In function 'flush_pte_tlb_pwc_range': > arch/parisc/include/asm/tlbflush.h:20:2: error: 'return' with a value, in > function returning void As replied here https://lore.kernel.org/mm-commits/8eedb441-a612-1ec8-8bf7-b40184de9...@linux.ibm.com/ That was the generic header change in the patch. I was commenting about the ppc64 specific change causing build failures. -aneesh
Re: Fwd: [Bug 213069] New: kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:147! Oops: Exception in kernel mode, sig: 5 [#1]
Anshuman Khandual writes: > On 5/17/21 11:25 AM, Aneesh Kumar K.V wrote: >> On 5/17/21 11:17 AM, Christophe Leroy wrote: >>> +aneesh >>> +linuxppc-dev list >>> >>> Le 17/05/2021 à 07:44, Anshuman Khandual a écrit : Hello Christophe, DEBUG_VM_PGTABLE has now been re-enabled on powerpc recently ? was not aware about this. From the error log, it failed explicitly on 4K page size hash config. static inline pmd_t hash__pmd_mkhuge(pmd_t pmd) { BUG(); --> Failed return pmd; } static inline pmd_t __pmd_mkhuge(pmd_t pmd) { if (radix_enabled()) return radix__pmd_mkhuge(pmd); return hash__pmd_mkhuge(pmd); } pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot) { unsigned long pmdv; pmdv = (pfn << PAGE_SHIFT) & PTE_RPN_MASK; return __pmd_mkhuge(pmd_set_protbits(__pmd(pmdv), pgprot)); } It seems like on powerpc, where pfn_pmd() makes a huge page but which is not supported on 4K hash config thus triggering the BUG(). But all pfn_pmd() call sites inside the debug_vm_pgtable() test are protected with CONFIG_TRANSPARENT_HUGEPAGE. IIUC unlike powerpc, pfn_pmd() does not directly make a huge page on other platforms. Looking at arch/powerpc/include/asm/book3s/64/hash-4k.h, all relevant THP helpers has BUG() or 0 which indicates THP might not be supported on 4K page size hash config ? But looking at arch/powerpc/platforms/Kconfig.cputype, it seems like HAVE_ARCH_TRANSPARENT_HUGEPAGE is invariably selected on PPC_BOOK3S_64 platforms which I assume includes 4K page size hash config as well. Is THP some how getting enabled on this 4K page size hash config where it should not be (thus triggering the BUG) ? OR am I missing something here. >>> >> >> We should put those pfn_pmd() and pfn_pud() after >> >> if (!has_transparent_hugepage()) >> return; > > The following patch has been lightly tested on arm64 and x86 platforms. > Could you please verify if this solves the problem on powerpc 4K hash > config ? Thank you. Tested the patch with the below additional change. 
modified mm/debug_vm_pgtable.c @@ -186,12 +186,13 @@ static void __init pmd_advanced_tests(struct mm_struct *mm, unsigned long pfn, unsigned long vaddr, pgprot_t prot, pgtable_t pgtable) { - pmd_t pmd = pfn_pmd(pfn, prot); + pmd_t pmd; if (!has_transparent_hugepage()) return; pr_debug("Validating PMD advanced\n"); + pmd = pfn_pmd(pfn, prot); /* Align the address wrt HPAGE_PMD_SIZE */ vaddr = (vaddr & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE; @@ -334,12 +335,13 @@ static void __init pud_advanced_tests(struct mm_struct *mm, unsigned long pfn, unsigned long vaddr, pgprot_t prot) { - pud_t pud = pfn_pud(pfn, prot); + pud_t pud; if (!has_transparent_hugepage()) return; pr_debug("Validating PUD advanced\n"); + pud = pfn_pud(pfn, prot); /* Align the address wrt HPAGE_PUD_SIZE */ vaddr = (vaddr & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE; -aneesh
Re: [PATCH 01/14] powerpc: remove interrupt exit helpers unused argument
Le 15/03/2021 à 23:03, Nicholas Piggin a écrit : The msr argument is not used, remove it. And why not use it instead of re-reading regs->msr ? Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/asm-prototypes.h | 4 ++-- arch/powerpc/kernel/interrupt.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h index 1c7b75834e04..95492655462e 100644 --- a/arch/powerpc/include/asm/asm-prototypes.h +++ b/arch/powerpc/include/asm/asm-prototypes.h @@ -71,8 +71,8 @@ void __init machine_init(u64 dt_ptr); #endif long system_call_exception(long r3, long r4, long r5, long r6, long r7, long r8, unsigned long r0, struct pt_regs *regs); notrace unsigned long syscall_exit_prepare(unsigned long r3, struct pt_regs *regs, long scv); -notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned long msr); -notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs, unsigned long msr); +notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs); +notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs); long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, u32 len_high, u32 len_low); diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c index 96ca27ef68ae..efeeefe6ee8f 100644 --- a/arch/powerpc/kernel/interrupt.c +++ b/arch/powerpc/kernel/interrupt.c @@ -359,7 +359,7 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3, return ret; } -notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned long msr) +notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs) { unsigned long ti_flags; unsigned long flags; @@ -443,7 +443,7 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned void preempt_schedule_irq(void); -notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs, unsigned long msr) +notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs) { unsigned long flags; unsigned long ret = 0;
Re: [PATCH] watchdog: Remove MV64x60 watchdog driver
On 5/17/21 4:17 AM, Michael Ellerman wrote: Guenter Roeck writes: On 3/18/21 10:25 AM, Christophe Leroy wrote: Commit 92c8c16f3457 ("powerpc/embedded6xx: Remove C2K board support") removed the last selector of CONFIG_MV64X60. Therefore CONFIG_MV64X60_WDT cannot be selected anymore and can be removed. Signed-off-by: Christophe Leroy Reviewed-by: Guenter Roeck --- drivers/watchdog/Kconfig | 4 - drivers/watchdog/Makefile | 1 - drivers/watchdog/mv64x60_wdt.c | 324 - include/linux/mv643xx.h| 8 - 4 files changed, 337 deletions(-) delete mode 100644 drivers/watchdog/mv64x60_wdt.c I assumed this would go via the watchdog tree, but seems like I misinterpreted. Wim didn't send a pull request this time around. Guenter Should I take this via the powerpc tree for v5.14 ? cheers
Re: [PATCH v5 5/9] powerpc/mm/book3s64: Update tlb flush routines to take a page walk cache flush argument
On 5/17/21 1:40 AM, Aneesh Kumar K.V wrote: On 5/15/21 10:05 PM, Guenter Roeck wrote: On Thu, Apr 22, 2021 at 11:13:19AM +0530, Aneesh Kumar K.V wrote: No functional change in this patch Signed-off-by: Aneesh Kumar K.V --- .../include/asm/book3s/64/tlbflush-radix.h | 19 +++- arch/powerpc/include/asm/book3s/64/tlbflush.h | 23 --- arch/powerpc/mm/book3s64/radix_hugetlbpage.c | 4 +-- arch/powerpc/mm/book3s64/radix_tlb.c | 29 +++ 4 files changed, 42 insertions(+), 33 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h index 8b33601cdb9d..171441a43b35 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h @@ -56,15 +56,18 @@ static inline void radix__flush_all_lpid_guest(unsigned int lpid) } #endif -extern void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma, - unsigned long start, unsigned long end); -extern void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long start, - unsigned long end, int psize); -extern void radix__flush_pmd_tlb_range(struct vm_area_struct *vma, - unsigned long start, unsigned long end); -extern void radix__flush_tlb_range(struct vm_area_struct *vma, unsigned long start, +void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma, + unsigned long start, unsigned long end, + bool flush_pwc); +void radix__flush_pmd_tlb_range(struct vm_area_struct *vma, + unsigned long start, unsigned long end, + bool flush_pwc); +void radix__flush_tlb_pwc_range_psize(struct mm_struct *mm, unsigned long start, + unsigned long end, int psize, bool flush_pwc); +void radix__flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end); -extern void radix__flush_tlb_kernel_range(unsigned long start, unsigned long end); +void radix__flush_tlb_kernel_range(unsigned long start, unsigned long end); + extern void radix__local_flush_tlb_mm(struct mm_struct *mm); extern void 
radix__local_flush_all_mm(struct mm_struct *mm); diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h index 215973b4cb26..f9f8a3a264f7 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h @@ -45,13 +45,30 @@ static inline void tlbiel_all_lpid(bool radix) hash__tlbiel_all(TLB_INVAL_SCOPE_LPID); } +static inline void flush_pmd_tlb_pwc_range(struct vm_area_struct *vma, + unsigned long start, + unsigned long end, + bool flush_pwc) +{ + if (radix_enabled()) + return radix__flush_pmd_tlb_range(vma, start, end, flush_pwc); + return hash__flush_tlb_range(vma, start, end); ^^ +} In this specific case we won't have build errors because, static inline void hash__flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end) { Sorry, you completely lost me. Building parisc:allnoconfig ... failed -- Error log: In file included from arch/parisc/include/asm/cacheflush.h:7, from include/linux/highmem.h:12, from include/linux/pagemap.h:11, from include/linux/ksm.h:13, from mm/mremap.c:14: mm/mremap.c: In function 'flush_pte_tlb_pwc_range': arch/parisc/include/asm/tlbflush.h:20:2: error: 'return' with a value, in function returning void Guenter But I agree the below is better to read. static inline void flush_pmd_tlb_pwc_range(struct vm_area_struct *vma, unsigned long start, unsigned long end, bool flush_pwc) { if (radix_enabled()) radix__flush_pmd_tlb_range(vma, start, end, flush_pwc); else hash__flush_tlb_range(vma, start, end); return } #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE static inline void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end) +{ + return flush_pmd_tlb_pwc_range(vma, start, end, false); ^^ Doesn't that cause build warnings/errors all over the place ? Guenter -aneesh
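The parisc failure quoted above comes from a `return` carrying a value out of a function declared `void`, which ISO C forbids. A minimal userspace sketch of the corrected shape — the function names and counter are illustrative stand-ins, not the kernel helpers:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the two void flush paths; the counter exists only so
 * the sketch is observable. */
static int flush_count;

static void radix_flush(void) { flush_count += 2; }
static void hash_flush(void)  { flush_count += 1; }

/* In ISO C a void function must not return a value, so
 * `return radix_flush();` draws "'return' with a value, in function
 * returning void".  Plain statements (or a bare `return;`) are the
 * idiomatic fix. */
static void flush_sketch(bool radix_enabled)
{
	if (radix_enabled)
		radix_flush();
	else
		hash_flush();
}
```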
Re: [PATCH v2 2/2] powerpc/sstep: Add tests for setb instruction
Sathvika Vasireddy wrote: This adds selftests for setb instruction. Signed-off-by: Sathvika Vasireddy --- arch/powerpc/include/asm/ppc-opcode.h | 1 + arch/powerpc/lib/test_emulate_step.c | 29 + 2 files changed, 30 insertions(+) Tested-by: Naveen N. Rao diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h index ac41776661e9..927551dd870b 100644 --- a/arch/powerpc/include/asm/ppc-opcode.h +++ b/arch/powerpc/include/asm/ppc-opcode.h @@ -245,6 +245,7 @@ #define PPC_INST_STRING0x7c00042a #define PPC_INST_STRING_MASK 0xfc0007fe #define PPC_INST_STRING_GEN_MASK 0xfc00067e +#define PPC_INST_SETB 0x7c000100 #define PPC_INST_STSWI 0x7c0005aa #define PPC_INST_STSWX 0x7c00052a #define PPC_INST_TRECHKPT 0x7c0007dd diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c index 783d1b85ecfe..a0a52fe5e979 100644 --- a/arch/powerpc/lib/test_emulate_step.c +++ b/arch/powerpc/lib/test_emulate_step.c @@ -53,6 +53,8 @@ ppc_inst_prefix(PPC_PREFIX_MLS | __PPC_PRFX_R(pr) | IMM_H(i), \ PPC_RAW_ADDI(t, a, i)) +#define TEST_SETB(t, bfa) ppc_inst(PPC_INST_SETB | ___PPC_RT(t) | ___PPC_RA((bfa & 0x7) << 2)) + static void __init init_pt_regs(struct pt_regs *regs) { @@ -929,6 +931,33 @@ static struct compute_test compute_tests[] = { } } }, + { + .mnemonic = "setb", + .cpu_feature = CPU_FTR_ARCH_300, + .subtests = { + { + .descr = "BFA = 1, CR = GT", + .instr = TEST_SETB(20, 1), + .regs = { + .ccr = 0x400, + } + }, + { + .descr = "BFA = 4, CR = LT", + .instr = TEST_SETB(20, 4), + .regs = { + .ccr = 0x8000, + } + }, + { + .descr = "BFA = 5, CR = EQ", + .instr = TEST_SETB(20, 5), + .regs = { + .ccr = 0x200, + } + } + } + }, { .mnemonic = "add", .subtests = { -- 2.16.4
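The TEST_SETB() encoding in the patch can be modelled in userspace: RT occupies bits 21-25 of the instruction word, RA bits 16-20, and the 3-bit BFA is carried in the top three bits of the RA field (hence the `<< 2`). A sketch for review, not the kernel macro itself:

```c
#include <assert.h>
#include <stdint.h>

#define PPC_INST_SETB 0x7c000100

/* Model of TEST_SETB(t, bfa): ___PPC_RT places RT at bit 21,
 * ___PPC_RA places RA at bit 16, and BFA sits in RA's top three
 * bits, so it is shifted left by 2 before the RA placement. */
static uint32_t encode_setb(uint32_t rt, uint32_t bfa)
{
	return PPC_INST_SETB | ((rt & 0x1f) << 21) | (((bfa & 0x7) << 2) << 16);
}
```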
Re: [PATCH v2 1/2] powerpc/sstep: Add emulation support for ‘setb’ instruction
Sathvika Vasireddy wrote: This adds emulation support for the following instruction: * Set Boolean (setb) Signed-off-by: Sathvika Vasireddy --- arch/powerpc/lib/sstep.c | 22 ++ 1 file changed, 22 insertions(+) Tested-by: Naveen N. Rao diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c index 45bda2520755..aee42bcc775b 100644 --- a/arch/powerpc/lib/sstep.c +++ b/arch/powerpc/lib/sstep.c @@ -1700,6 +1700,28 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs, op->val = regs->ccr & imm; goto compute_done; + case 128: /* setb */ + if (!cpu_has_feature(CPU_FTR_ARCH_300)) + goto unknown_opcode; + /* +* 'ra' encodes the CR field number (bfa) in the top 3 bits. +* Since each CR field is 4 bits, +* we can simply mask off the bottom two bits (bfa * 4) +* to yield the first bit in the CR field. +*/ + ra = ra & ~0x3; + /* 'val' stores bits of the CR field (bfa) */ + val = regs->ccr >> (CR0_SHIFT - ra); + /* checks if the LT bit of CR field (bfa) is set */ + if (val & 8) + op->val = -1; + /* checks if the GT bit of CR field (bfa) is set */ + else if (val & 4) + op->val = 1; + else + op->val = 0; + goto compute_done; + case 144: /* mtcrf */ op->type = COMPUTE + SETCC; imm = 0xf000UL; -- 2.16.4
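A userspace model of the semantics being emulated may help review: CR field `bfa` (0..7) sits 4*bfa bits down from the MSB of the 32-bit CR, so a right shift of 28 - 4*bfa exposes it, with LT=8, GT=4, EQ=2, SO=1. This is a sketch of the behaviour, not the kernel code:

```c
#include <assert.h>
#include <stdint.h>

/* Model of setb: inspect CR field `bfa` of the condition register and
 * return -1 if LT is set, 1 if GT is set, otherwise 0. */
static int64_t model_setb(uint32_t ccr, unsigned int bfa)
{
	uint32_t field = (ccr >> (28 - 4 * bfa)) & 0xF;

	if (field & 8)		/* LT: result is -1 (all ones) */
		return -1;
	if (field & 4)		/* GT: result is 1 */
		return 1;
	return 0;		/* neither LT nor GT: result is 0 */
}
```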
Re: Fwd: [Bug 213069] New: kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:147! Oops: Exception in kernel mode, sig: 5 [#1]
On 5/17/21 6:29 PM, Christophe Leroy wrote: > > > Le 17/05/2021 à 14:49, Anshuman Khandual a écrit : >> >> >> On 5/17/21 11:25 AM, Aneesh Kumar K.V wrote: >>> On 5/17/21 11:17 AM, Christophe Leroy wrote: +aneesh +linuxppc-dev list Le 17/05/2021 à 07:44, Anshuman Khandual a écrit : > Hello Christophe, > > DEBUG_VM_PGTABLE has now been re-enabled on powerpc recently ? was not > aware about this. From the error log, it failed explicitly on 4K page > size hash config. > > static inline pmd_t hash__pmd_mkhuge(pmd_t pmd) > { > BUG(); --> Failed > return pmd; > } > > static inline pmd_t __pmd_mkhuge(pmd_t pmd) > { > if (radix_enabled()) > return radix__pmd_mkhuge(pmd); > return hash__pmd_mkhuge(pmd); > } > > pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot) > { > unsigned long pmdv; > > pmdv = (pfn << PAGE_SHIFT) & PTE_RPN_MASK; > > return __pmd_mkhuge(pmd_set_protbits(__pmd(pmdv), pgprot)); > } > > It seems like on powerpc, where pfn_pmd() makes a huge page but which > is not supported on 4K hash config thus triggering the BUG(). But all > pfn_pmd() call sites inside the debug_vm_pgtable() test are protected > with CONFIG_TRANSPARENT_HUGEPAGE. IIUC unlike powerpc, pfn_pmd() does > not directly make a huge page on other platforms. > > Looking at arch/powerpc/include/asm/book3s/64/hash-4k.h, all relevant > THP helpers has BUG() or 0 which indicates THP might not be supported > on 4K page size hash config ? > > But looking at arch/powerpc/platforms/Kconfig.cputype, it seems like > HAVE_ARCH_TRANSPARENT_HUGEPAGE is invariably selected on PPC_BOOK3S_64 > platforms which I assume includes 4K page size hash config as well. > > Is THP some how getting enabled on this 4K page size hash config where > it should not be (thus triggering the BUG) ? OR am I missing something > here. > >>> >>> We should put those pfn_pmd() and pfn_pud() after >>> >>> if (!has_transparent_hugepage()) >>> return; >> >> The following patch has been lightly tested on arm64 and x86 platforms. 
>> Could you please verify if this solves the problem on powerpc 4K hash >> config ? Thank you. > > No need to update pmd_advanced_tests() and pud_advanced_tests() ? They already have has_transparent_hugepage() check and all pfn_pxx() instances are after that check. Do you see any other concern ? > > Christophe
Re: Fwd: [Bug 213069] New: kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:147! Oops: Exception in kernel mode, sig: 5 [#1]
Le 17/05/2021 à 14:49, Anshuman Khandual a écrit : On 5/17/21 11:25 AM, Aneesh Kumar K.V wrote: On 5/17/21 11:17 AM, Christophe Leroy wrote: +aneesh +linuxppc-dev list Le 17/05/2021 à 07:44, Anshuman Khandual a écrit : Hello Christophe, DEBUG_VM_PGTABLE has now been re-enabled on powerpc recently ? was not aware about this. From the error log, it failed explicitly on 4K page size hash config. static inline pmd_t hash__pmd_mkhuge(pmd_t pmd) { BUG(); --> Failed return pmd; } static inline pmd_t __pmd_mkhuge(pmd_t pmd) { if (radix_enabled()) return radix__pmd_mkhuge(pmd); return hash__pmd_mkhuge(pmd); } pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot) { unsigned long pmdv; pmdv = (pfn << PAGE_SHIFT) & PTE_RPN_MASK; return __pmd_mkhuge(pmd_set_protbits(__pmd(pmdv), pgprot)); } It seems like on powerpc, where pfn_pmd() makes a huge page but which is not supported on 4K hash config thus triggering the BUG(). But all pfn_pmd() call sites inside the debug_vm_pgtable() test are protected with CONFIG_TRANSPARENT_HUGEPAGE. IIUC unlike powerpc, pfn_pmd() does not directly make a huge page on other platforms. Looking at arch/powerpc/include/asm/book3s/64/hash-4k.h, all relevant THP helpers has BUG() or 0 which indicates THP might not be supported on 4K page size hash config ? But looking at arch/powerpc/platforms/Kconfig.cputype, it seems like HAVE_ARCH_TRANSPARENT_HUGEPAGE is invariably selected on PPC_BOOK3S_64 platforms which I assume includes 4K page size hash config as well. Is THP some how getting enabled on this 4K page size hash config where it should not be (thus triggering the BUG) ? OR am I missing something here. We should put those pfn_pmd() and pfn_pud() after if (!has_transparent_hugepage()) return; The following patch has been lightly tested on arm64 and x86 platforms. Could you please verify if this solves the problem on powerpc 4K hash config ? Thank you. No need to update pmd_advanced_tests() and pud_advanced_tests() ? Christophe
[Bug 213069] kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:147! Oops: Exception in kernel mode, sig: 5 [#1]
https://bugzilla.kernel.org/show_bug.cgi?id=213069 --- Comment #3 from Christophe Leroy (christophe.le...@csgroup.eu) --- Patch proposed by Anshuman, to be tested: https://lore.kernel.org/linuxppc-dev/bug-213069-206...@https.bugzilla.kernel.org%2F/T/#m82f2974b813c4d3a085b0cd2ac3d5b732a362e68
Re: Fwd: [Bug 213069] New: kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:147! Oops: Exception in kernel mode, sig: 5 [#1]
On 5/17/21 11:25 AM, Aneesh Kumar K.V wrote: > On 5/17/21 11:17 AM, Christophe Leroy wrote: >> +aneesh >> +linuxppc-dev list >> >> Le 17/05/2021 à 07:44, Anshuman Khandual a écrit : >>> Hello Christophe, >>> >>> DEBUG_VM_PGTABLE has now been re-enabled on powerpc recently ? was not >>> aware about this. From the error log, it failed explicitly on 4K page >>> size hash config. >>> >>> static inline pmd_t hash__pmd_mkhuge(pmd_t pmd) >>> { >>> BUG(); --> Failed >>> return pmd; >>> } >>> >>> static inline pmd_t __pmd_mkhuge(pmd_t pmd) >>> { >>> if (radix_enabled()) >>> return radix__pmd_mkhuge(pmd); >>> return hash__pmd_mkhuge(pmd); >>> } >>> >>> pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot) >>> { >>> unsigned long pmdv; >>> >>> pmdv = (pfn << PAGE_SHIFT) & PTE_RPN_MASK; >>> >>> return __pmd_mkhuge(pmd_set_protbits(__pmd(pmdv), pgprot)); >>> } >>> >>> It seems like on powerpc, where pfn_pmd() makes a huge page but which >>> is not supported on 4K hash config thus triggering the BUG(). But all >>> pfn_pmd() call sites inside the debug_vm_pgtable() test are protected >>> with CONFIG_TRANSPARENT_HUGEPAGE. IIUC unlike powerpc, pfn_pmd() does >>> not directly make a huge page on other platforms. >>> >>> Looking at arch/powerpc/include/asm/book3s/64/hash-4k.h, all relevant >>> THP helpers has BUG() or 0 which indicates THP might not be supported >>> on 4K page size hash config ? >>> >>> But looking at arch/powerpc/platforms/Kconfig.cputype, it seems like >>> HAVE_ARCH_TRANSPARENT_HUGEPAGE is invariably selected on PPC_BOOK3S_64 >>> platforms which I assume includes 4K page size hash config as well. >>> >>> Is THP some how getting enabled on this 4K page size hash config where >>> it should not be (thus triggering the BUG) ? OR am I missing something >>> here. >>> >> > > We should put those pfn_pmd() and pfn_pud() after > > if (!has_transparent_hugepage()) > return; The following patch has been lightly tested on arm64 and x86 platforms. 
Could you please verify if this solves the problem on powerpc 4K hash config ? Thank you. - Anshuman mm/debug_vm_pgtable.c | 58 +++ 1 file changed, 48 insertions(+), 10 deletions(-) diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c index e2f35db8ba69..6ff92c8b0a00 100644 --- a/mm/debug_vm_pgtable.c +++ b/mm/debug_vm_pgtable.c @@ -146,13 +146,14 @@ static void __init pte_savedwrite_tests(unsigned long pfn, pgprot_t prot) static void __init pmd_basic_tests(unsigned long pfn, int idx) { pgprot_t prot = protection_map[idx]; - pmd_t pmd = pfn_pmd(pfn, prot); unsigned long val = idx, *ptr = &val; + pmd_t pmd; if (!has_transparent_hugepage()) return; pr_debug("Validating PMD basic (%pGv)\n", ptr); + pmd = pfn_pmd(pfn, prot); /* * This test needs to be executed after the given page table entry @@ -232,9 +233,14 @@ static void __init pmd_advanced_tests(struct mm_struct *mm, static void __init pmd_leaf_tests(unsigned long pfn, pgprot_t prot) { - pmd_t pmd = pfn_pmd(pfn, prot); + pmd_t pmd; + + if (!has_transparent_hugepage()) + return; pr_debug("Validating PMD leaf\n"); + pmd = pfn_pmd(pfn, prot); + /* * PMD based THP is a leaf entry.
*/ @@ -244,12 +250,16 @@ static void __init pmd_leaf_tests(unsigned long pfn, pgprot_t prot) static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot) { - pmd_t pmd = pfn_pmd(pfn, prot); + pmd_t pmd; if (!IS_ENABLED(CONFIG_NUMA_BALANCING)) return; + if (!has_transparent_hugepage()) + return; + pr_debug("Validating PMD saved write\n"); + pmd = pfn_pmd(pfn, prot); WARN_ON(!pmd_savedwrite(pmd_mk_savedwrite(pmd_clear_savedwrite(pmd)))); WARN_ON(pmd_savedwrite(pmd_clear_savedwrite(pmd_mk_savedwrite(pmd)))); } @@ -258,13 +268,14 @@ static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot) static void __init pud_basic_tests(struct mm_struct *mm, unsigned long pfn, int idx) { pgprot_t prot = protection_map[idx]; - pud_t pud = pfn_pud(pfn, prot); unsigned long val = idx, *ptr = &val; + pud_t pud; if (!has_transparent_hugepage()) return; pr_debug("Validating PUD basic (%pGv)\n", ptr); + pud = pfn_pud(pfn, prot); /* * This test needs to be executed after the given page table entry @@ -348,9 +359,13 @@ static void __init pud_advanced_tests(struct mm_struct *mm, static void __init pud_leaf_tests(unsigned long pfn, pgprot_t prot) { - pud_t pud = pfn_pud(pfn, prot); + pud_t pud; + + if (!has_transparent_hugepage()) + return; pr_debug("Validating PUD leaf\n"); + pud = pfn_pud(pfn, prot);
[PATCH 1/1] selftests/powerpc: Remove duplicated header file inclusion
The header file is already included above and can be removed here. Signed-off-by: Zhen Lei --- tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c | 1 - 1 file changed, 1 deletion(-) diff --git a/tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c b/tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c index e2a0c07e8362..9ef37a9836ac 100644 --- a/tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c +++ b/tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c @@ -17,7 +17,6 @@ #include #include #include -#include #include "tm.h" #include "utils.h" -- 2.25.1
Re: [PATCH] watchdog: Remove MV64x60 watchdog driver
Guenter Roeck writes: > On 3/18/21 10:25 AM, Christophe Leroy wrote: >> Commit 92c8c16f3457 ("powerpc/embedded6xx: Remove C2K board support") >> removed the last selector of CONFIG_MV64X60. >> >> Therefore CONFIG_MV64X60_WDT cannot be selected anymore and >> can be removed. >> >> Signed-off-by: Christophe Leroy > > Reviewed-by: Guenter Roeck > >> --- >> drivers/watchdog/Kconfig | 4 - >> drivers/watchdog/Makefile | 1 - >> drivers/watchdog/mv64x60_wdt.c | 324 - >> include/linux/mv643xx.h| 8 - >> 4 files changed, 337 deletions(-) >> delete mode 100644 drivers/watchdog/mv64x60_wdt.c I assumed this would go via the watchdog tree, but seems like I misinterpreted. Should I take this via the powerpc tree for v5.14 ? cheers
Re: Fwd: [Bug 213069] New: kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:147! Oops: Exception in kernel mode, sig: 5 [#1]
On 5/17/21 4:23 PM, Anshuman Khandual wrote: On 5/17/21 1:14 PM, Aneesh Kumar K.V wrote: On 5/17/21 12:55 PM, Anshuman Khandual wrote: On 5/17/21 11:25 AM, Aneesh Kumar K.V wrote: On 5/17/21 11:17 AM, Christophe Leroy wrote: +aneesh +linuxppc-dev list Le 17/05/2021 à 07:44, Anshuman Khandual a écrit : Hello Christophe, DEBUG_VM_PGTABLE has now been re-enabled on powerpc recently ? was not aware about this. From the error log, it failed explicitly on 4K page size hash config. static inline pmd_t hash__pmd_mkhuge(pmd_t pmd) { BUG(); --> Failed return pmd; } static inline pmd_t __pmd_mkhuge(pmd_t pmd) { if (radix_enabled()) return radix__pmd_mkhuge(pmd); return hash__pmd_mkhuge(pmd); } pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot) { unsigned long pmdv; pmdv = (pfn << PAGE_SHIFT) & PTE_RPN_MASK; return __pmd_mkhuge(pmd_set_protbits(__pmd(pmdv), pgprot)); } It seems like on powerpc, where pfn_pmd() makes a huge page but which is not supported on 4K hash config thus triggering the BUG(). But all pfn_pmd() call sites inside the debug_vm_pgtable() test are protected with CONFIG_TRANSPARENT_HUGEPAGE. IIUC unlike powerpc, pfn_pmd() does not directly make a huge page on other platforms. Looking at arch/powerpc/include/asm/book3s/64/hash-4k.h, all relevant THP helpers has BUG() or 0 which indicates THP might not be supported on 4K page size hash config ? But looking at arch/powerpc/platforms/Kconfig.cputype, it seems like HAVE_ARCH_TRANSPARENT_HUGEPAGE is invariably selected on PPC_BOOK3S_64 platforms which I assume includes 4K page size hash config as well. Is THP some how getting enabled on this 4K page size hash config where it should not be (thus triggering the BUG) ? OR am I missing something here. We should put those pfn_pmd() and pfn_pud() after if (!has_transparent_hugepage()) return; On hash with 4K page size, we can't support leaf page table entry and PMD and PUD level. Hence we don't support THP for them. 
But could CONFIG_TRANSPARENT_HUGEPAGE be enabled on such configs ? Otherwise, wondering how the BUG() just got triggered there. yes. We do support THP with radix translation and 4K page size. I was actually wondering about whether 4K page size in hash config supports THP and could enable CONFIG_TRANSPARENT_HUGEPAGE ? But as there is a BUG() trigger on hash__pmd_mkhuge(), it seems like THP config should not have been enabled for such platforms but it does for some reason. Could we not disable it ? Hash is a translation mode which is not selected at build time. Rather it is selected at runtime. The same kernel can boot into either hash or radix translation. We don't support THP with a 4K page size with hash translation. But the same kernel, when booted in radix translation mode, will support THP. -aneesh
Re: [PATCH v14 3/9] powerpc/modules: Make module_alloc() Strict Module RWX aware
Le 17/05/2021 à 13:01, Michael Ellerman a écrit : Jordan Niethe writes: On Mon, May 17, 2021 at 4:37 PM Christophe Leroy wrote: Le 17/05/2021 à 05:28, Jordan Niethe a écrit : Make module_alloc() use PAGE_KERNEL protections instead of PAGE_KERNEL_EXEC if Strict Module RWX is enabled. Signed-off-by: Jordan Niethe --- v14: - Split out from powerpc: Set ARCH_HAS_STRICT_MODULE_RWX - Add and use strict_module_rwx_enabled() helper --- arch/powerpc/include/asm/mmu.h | 5 + arch/powerpc/kernel/module.c | 4 +++- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h index 607168b1aef4..7710bf0cbf8a 100644 --- a/arch/powerpc/include/asm/mmu.h +++ b/arch/powerpc/include/asm/mmu.h @@ -357,6 +357,11 @@ static inline bool strict_kernel_rwx_enabled(void) return false; } #endif + +static inline bool strict_module_rwx_enabled(void) +{ + return IS_ENABLED(CONFIG_STRICT_MODULE_RWX) && strict_kernel_rwx_enabled(); +} Looking at arch/Kconfig, I have the feeling that it is possible to select CONFIG_STRICT_MODULE_RWX without selecting CONFIG_STRICT_KERNEL_RWX. In that case, strict_kernel_rwx_enabled() will return false. Ok, if someone did that currently it would break things, e.g. code patching. I think it should be made impossible to enable CONFIG_STRICT_MODULE_RWX without CONFIG_STRICT_KERNEL_RWX? Yeah I don't see any reason to support that combination. We should be moving to a world where both are on by default, or in fact are always enabled. Would it work if we add the following in arch/powerpc/Kconfig ? : select STRICT_KERNEL_RWX if STRICT_MODULE_RWX There should be no dependency issue as powerpc only selects ARCH_HAS_STRICT_MODULE_RWX when ARCH_HAS_STRICT_KERNEL_RWX is also selected. Christophe
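Christophe's `select` suggestion can be written down as a Kconfig fragment. This is a hedged sketch of the two obvious ways to force the pairing, not the actual arch/powerpc/Kconfig text; the option bodies are abbreviated:

```kconfig
# Option 1 (Christophe's suggestion): turning on module RWX pulls in
# kernel RWX automatically via the architecture entry.
config PPC
	select STRICT_KERNEL_RWX if STRICT_MODULE_RWX

# Option 2 (generic alternative): make the module option unavailable
# unless kernel RWX is already enabled.
config STRICT_MODULE_RWX
	bool "Set loadable kernel module data as NX and text as RO"
	depends on STRICT_KERNEL_RWX
```

The difference is visible in menuconfig: with `select`, STRICT_KERNEL_RWX is forced on silently; with `depends on`, STRICT_MODULE_RWX simply cannot be chosen first.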
Re: [PATCH v14 3/9] powerpc/modules: Make module_alloc() Strict Module RWX aware
Jordan Niethe writes: > On Mon, May 17, 2021 at 4:37 PM Christophe Leroy > wrote: >> Le 17/05/2021 à 05:28, Jordan Niethe a écrit : >> > Make module_alloc() use PAGE_KERNEL protections instead of >> > PAGE_KERNEL_EXEC if Strict Module RWX is enabled. >> > >> > Signed-off-by: Jordan Niethe >> > --- >> > v14: - Split out from powerpc: Set ARCH_HAS_STRICT_MODULE_RWX >> > - Add and use strict_module_rwx_enabled() helper >> > --- >> > arch/powerpc/include/asm/mmu.h | 5 + >> > arch/powerpc/kernel/module.c | 4 +++- >> > 2 files changed, 8 insertions(+), 1 deletion(-) >> > >> > diff --git a/arch/powerpc/include/asm/mmu.h >> > b/arch/powerpc/include/asm/mmu.h >> > index 607168b1aef4..7710bf0cbf8a 100644 >> > --- a/arch/powerpc/include/asm/mmu.h >> > +++ b/arch/powerpc/include/asm/mmu.h >> > @@ -357,6 +357,11 @@ static inline bool strict_kernel_rwx_enabled(void) >> > return false; >> > } >> > #endif >> > + >> > +static inline bool strict_module_rwx_enabled(void) >> > +{ >> > + return IS_ENABLED(CONFIG_STRICT_MODULE_RWX) && >> > strict_kernel_rwx_enabled(); >> > +} >> >> Looking at arch/Kconfig, I have the feeling that it is possible to select >> CONFIG_STRICT_MODULE_RWX >> without selecting CONFIG_STRICT_KERNEL_RWX. >> >> In that case, strict_kernel_rwx_enabled() will return false. > Ok, if someone did that currently it would break things, e.g. code > patching. I think it should be made impossible to enable > CONFIG_STRICT_MODULE_RWX without CONFIG_STRICT_KERNEL_RWX? Yeah I don't see any reason to support that combination. We should be moving to a world where both are on by default, or in fact are always enabled. cheers
Re: [PATCH v2] lockdown,selinux: avoid bogus SELinux lockdown permission checks
Ondrej Mosnacek writes: > Commit 59438b46471a ("security,lockdown,selinux: implement SELinux > lockdown") added an implementation of the locked_down LSM hook to > SELinux, with the aim to restrict which domains are allowed to perform > operations that would breach lockdown. > > However, in several places the security_locked_down() hook is called in > situations where the current task isn't doing any action that would > directly breach lockdown, leading to SELinux checks that are basically > bogus. > > Since in most of these situations converting the callers such that > security_locked_down() is called in a context where the current task > would be meaningful for SELinux is impossible or very non-trivial (and > could lead to TOCTOU issues for the classic Lockdown LSM > implementation), fix this by modifying the hook to accept a struct cred > pointer as argument, where NULL will be interpreted as a request for a > "global", task-independent lockdown decision only. Then modify SELinux > to ignore calls with cred == NULL. > > Since most callers will just want to pass current_cred() as the cred > parameter, rename the hook to security_cred_locked_down() and provide > the original security_locked_down() function as a simple wrapper around > the new hook. > > The callers migrated to the new hook, passing NULL as cred: > 1. arch/powerpc/xmon/xmon.c > Here the hook seems to be called from non-task context and is only > used for redacting some sensitive values from output sent to > userspace. It's hard to follow but it actually disables interactive use of xmon entirely if lockdown is in confidentiality mode, and disables modifications of the kernel in integrity mode. But that's not really that important, the patch looks fine. Acked-by: Michael Ellerman (powerpc) cheers
Re: Fwd: [Bug 213069] New: kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:147! Oops: Exception in kernel mode, sig: 5 [#1]
On 5/17/21 1:14 PM, Aneesh Kumar K.V wrote: > On 5/17/21 12:55 PM, Anshuman Khandual wrote: >> >> >> On 5/17/21 11:25 AM, Aneesh Kumar K.V wrote: >>> On 5/17/21 11:17 AM, Christophe Leroy wrote: +aneesh +linuxppc-dev list Le 17/05/2021 à 07:44, Anshuman Khandual a écrit : > Hello Christophe, > > DEBUG_VM_PGTABLE has now been re-enabled on powerpc recently ? was not > aware about this. From the error log, it failed explicitly on 4K page > size hash config. > > static inline pmd_t hash__pmd_mkhuge(pmd_t pmd) > { > BUG(); --> Failed > return pmd; > } > > static inline pmd_t __pmd_mkhuge(pmd_t pmd) > { > if (radix_enabled()) > return radix__pmd_mkhuge(pmd); > return hash__pmd_mkhuge(pmd); > } > > pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot) > { > unsigned long pmdv; > > pmdv = (pfn << PAGE_SHIFT) & PTE_RPN_MASK; > > return __pmd_mkhuge(pmd_set_protbits(__pmd(pmdv), pgprot)); > } > > It seems like on powerpc, where pfn_pmd() makes a huge page but which > is not supported on 4K hash config thus triggering the BUG(). But all > pfn_pmd() call sites inside the debug_vm_pgtable() test are protected > with CONFIG_TRANSPARENT_HUGEPAGE. IIUC unlike powerpc, pfn_pmd() does > not directly make a huge page on other platforms. > > Looking at arch/powerpc/include/asm/book3s/64/hash-4k.h, all relevant > THP helpers has BUG() or 0 which indicates THP might not be supported > on 4K page size hash config ? > > But looking at arch/powerpc/platforms/Kconfig.cputype, it seems like > HAVE_ARCH_TRANSPARENT_HUGEPAGE is invariably selected on PPC_BOOK3S_64 > platforms which I assume includes 4K page size hash config as well. > > Is THP some how getting enabled on this 4K page size hash config where > it should not be (thus triggering the BUG) ? OR am I missing something > here. 
> >>> >>> We should put those pfn_pmd() and pfn_pud() after >>> >>> if (!has_transparent_hugepage()) >>> return; >>> >>> >>> On hash with 4K page size, we can't support leaf page table entry and PMD >>> and PUD level. Hence we don't support THP for them. >> >> But could CONFIG_TRANSPARENT_HUGEPAGE be enabled on such configs ? >> Otherwise, wondering how the BUG() just got triggered there. >> > > yes. We do support THP with radix translation and 4K page size. I was actually wondering about whether 4K page size in hash config supports THP and could enable CONFIG_TRANSPARENT_HUGEPAGE ? But as there is a BUG() trigger on hash__pmd_mkhuge(), it seems like THP config should not have been enabled for such platforms but it does for some reason. Could not we disable it ?
[Bug 213069] kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:147! Oops: Exception in kernel mode, sig: 5 [#1]
https://bugzilla.kernel.org/show_bug.cgi?id=213069 Michael Ellerman (mich...@ellerman.id.au) changed: What|Removed |Added Status|NEW |ASSIGNED CC||mich...@ellerman.id.au -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
Re: [FSL P50x0] KVM HV doesn't work anymore
On 17 May 2021 at 09:42am, Nicholas Piggin wrote: Excerpts from Christian Zigotzky's message of May 15, 2021 11:46 pm: On 15 May 2021 at 12:08pm Christophe Leroy wrote: Le 15/05/2021 à 11:48, Christian Zigotzky a écrit : Hi All, I bisected today [1] and the bisecting itself was OK but the reverting of the bad commit doesn't solve the issue. Do you have an idea which commit could be responsible for this issue? Maybe the bisecting wasn't successful. I will look in the kernel git log. Maybe there is a commit that affected KVM HV on FSL P50x0 machines. If the uImage doesn't load, it may be because of the size of uImage. See https://github.com/linuxppc/issues/issues/208 Is there a significant size difference with and without KVM HV ? Maybe you can try to remove another option to reduce the size of the uImage. I tried it but it doesn't solve the issue. The uImage works without KVM HV in a virtual e5500 QEMU machine. Any more progress with this? I would say that bisect might have just been a bit unstable and maybe by chance some things did not crash so it's pointing to the wrong patch. Upstream merge of powerpc-5.13-1 was good and powerpc-5.13-2 was bad? Between that looks like some KVM MMU rework. You could try the patch before this one b1c5356e873c ("KVM: PPC: Convert to the gfn-based MMU notifier callbacks"). That won't revert cleanly so just try running the tree at that point. If it works, test the patch and see if it fails. Thanks, Nick Hi Nick, Thanks a lot for your answer. Yes, there is a little bit of progress. The RC2 of kernel 5.13 successfully boots with -smp 3 in a virtual e5500 QEMU machine. -smp 4 doesn't work anymore since the PowerPC updates 5.13-2. I used -smp 4 before 5.13 because my FSL P5040 machine has 4 cores. Could you please post a patch for reverting the commit before b1c5356e873c ("KVM: PPC: Convert to the gfn-based MMU notifier callbacks")? Thanks in advance, Christian
[Bug 206733] i2c i2c-3: i2c-powermac: modalias failure on /uni-n@f8000000/i2c@f8001000/cereal@1c0
https://bugzilla.kernel.org/show_bug.cgi?id=206733 Paul Osmialowski (newch...@king.net.pl) changed: What|Removed |Added CC||newch...@king.net.pl --- Comment #4 from Paul Osmialowski (newch...@king.net.pl) --- I've just updated to 5.10.27-gentoo on some old iBook G4, and of course I've spotted this `i2c-powermac: modalias failure on /uni-n@f800/i2c@f8001000/cereal@1c0` message. It is not worrying now after I've found Ben's comment. I've also spotted one more error message (Flag mismatch) that seems new to me, can you also make a judgement if it is also harmless?: [...] [4.515861] hid-generic 0003:17EF:602E.0001: input,hidraw0: USB HID v1.11 Mouse [USB Optical Mouse] on usb-0001:10:1b.0-1/input0 [4.660923] input: PowerMac Beep as /devices/pci0001:10/0001:10:17.0/input/input3 [4.665549] genirq: Flags mismatch irq 21. 0001 (i2sbus: i2s-a (tx)) vs. 0001 (PMac Output) [...] -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
[PATCH v2] lockdown, selinux: avoid bogus SELinux lockdown permission checks
Commit 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown") added an implementation of the locked_down LSM hook to SELinux, with the aim to restrict which domains are allowed to perform operations that would breach lockdown. However, in several places the security_locked_down() hook is called in situations where the current task isn't doing any action that would directly breach lockdown, leading to SELinux checks that are basically bogus. Since in most of these situations converting the callers such that security_locked_down() is called in a context where the current task would be meaningful for SELinux is impossible or very non-trivial (and could lead to TOCTOU issues for the classic Lockdown LSM implementation), fix this by modifying the hook to accept a struct cred pointer as argument, where NULL will be interpreted as a request for a "global", task-independent lockdown decision only. Then modify SELinux to ignore calls with cred == NULL. Since most callers will just want to pass current_cred() as the cred parameter, rename the hook to security_cred_locked_down() and provide the original security_locked_down() function as a simple wrapper around the new hook. The callers migrated to the new hook, passing NULL as cred: 1. arch/powerpc/xmon/xmon.c Here the hook seems to be called from non-task context and is only used for redacting some sensitive values from output sent to userspace. 2. fs/tracefs/inode.c:tracefs_create_file() Here the call is used to prevent creating new tracefs entries when the kernel is locked down. Assumes that locking down is one-way - i.e. if the hook returns non-zero once, it will never return zero again, thus no point in creating these files. 3. kernel/trace/bpf_trace.c:bpf_probe_read_kernel{,_str}_common() Called when a BPF program calls a helper that could leak kernel memory. The task context is not relevant here, since the program may very well be run in the context of a different task than the consumer of the data. 
See: https://bugzilla.redhat.com/show_bug.cgi?id=1955585 4. net/xfrm/xfrm_user.c:copy_to_user_*() Here a cryptographic secret is redacted based on the value returned from the hook. There are two possible actions that may lead here: a) A netlink message XFRM_MSG_GETSA with NLM_F_DUMP set - here the task context is relevant, since the dumped data is sent back to the current task. b) When deleting an SA via XFRM_MSG_DELSA, the dumped SAs are broadcast to tasks subscribed to XFRM events - here the SELinux check is not meaningful as the current task's creds do not represent the tasks that could potentially see the secret. It really doesn't seem worth it to try to preserve the check in the a) case, since the eventual leak can be circumvented anyway via b), plus there is no way for the task to indicate that it doesn't care about the actual key value, so the check could generate a lot of noise. Improvements-suggested-by: Casey Schaufler Fixes: 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown") Signed-off-by: Ondrej Mosnacek --- v2: - change to a single hook based on suggestions by Casey Schaufler v1: https://lore.kernel.org/lkml/20210507114048.138933-1-omosn...@redhat.com/ arch/powerpc/xmon/xmon.c | 4 ++-- fs/tracefs/inode.c | 2 +- include/linux/lsm_hook_defs.h | 3 ++- include/linux/lsm_hooks.h | 3 ++- include/linux/security.h | 11 --- kernel/trace/bpf_trace.c | 4 ++-- net/xfrm/xfrm_user.c | 2 +- security/lockdown/lockdown.c | 5 +++-- security/security.c | 6 +++--- security/selinux/hooks.c | 12 +--- 10 files changed, 33 insertions(+), 19 deletions(-) diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c index c8173e92f19d..90992793b4c5 100644 --- a/arch/powerpc/xmon/xmon.c +++ b/arch/powerpc/xmon/xmon.c @@ -299,7 +299,7 @@ static bool xmon_is_locked_down(void) static bool lockdown; if (!lockdown) { - lockdown = !!security_locked_down(LOCKDOWN_XMON_RW); + lockdown = !!security_cred_locked_down(NULL, LOCKDOWN_XMON_RW); if (lockdown) {
printf("xmon: Disabled due to kernel lockdown\n"); xmon_is_ro = true; @@ -307,7 +307,7 @@ static bool xmon_is_locked_down(void) } if (!xmon_is_ro) { - xmon_is_ro = !!security_locked_down(LOCKDOWN_XMON_WR); + xmon_is_ro = !!security_cred_locked_down(NULL, LOCKDOWN_XMON_WR); if (xmon_is_ro) printf("xmon: Read-only due to kernel lockdown\n"); } diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c index 1261e8b41edb..7edde3fc22f5 100644 --- a/fs/tracefs/inode.c +++ b/fs/tracefs/inode.c @@ -396,7 +396,7 @@ struct dentry *tracefs_create_file(const char *name,
[PATCH v5] pseries/drmem: update LMBs after LPM
After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be updated by the hypervisor in the case the NUMA topology of the LPAR's memory is updated. This is handled by the kernel, but the memory's node is not updated because there is no way to move a memory block between nodes from the Linux kernel point of view. If later a memory block is added or removed, drmem_update_dt() is called and it is overwriting the DT node ibm,dynamic-reconfiguration-memory to match the added or removed LMB. But the LMB's associativity node has not been updated after the DT node update and thus the node is overwritten by the Linux's topology instead of the hypervisor one. Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is updated to force an update of the LMB's associativity. However, ignore the call to that hook when the update has been triggered by drmem_update_dt(). Because, in that case, the LMB tree has been used to set the DT property and thus it doesn't need to be updated back. Since drmem_update_dt() is called under the protection of the device_hotplug_lock and the hook is called in the same context, use a simple boolean variable to detect that call. Cc: Nathan Lynch Cc: Aneesh Kumar K.V Cc: Tyrel Datwyler Signed-off-by: Laurent Dufour --- V5: - Reword the commit's description to address Nathan's comments. V4: - Prevent the LMB to be updated back in the case the request came from the LMB tree's update. V3: - Check rd->dn->name instead of rd->dn->full_name V2: - Take Tyrel's idea to rely on OF_RECONFIG_UPDATE_PROPERTY instead of introducing a new hook mechanism. 
--- arch/powerpc/include/asm/drmem.h | 1 + arch/powerpc/mm/drmem.c | 46 +++ .../platforms/pseries/hotplug-memory.c| 4 ++ 3 files changed, 51 insertions(+) diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h index bf2402fed3e0..4265d5e95c2c 100644 --- a/arch/powerpc/include/asm/drmem.h +++ b/arch/powerpc/include/asm/drmem.h @@ -111,6 +111,7 @@ int drmem_update_dt(void); int __init walk_drmem_lmbs_early(unsigned long node, void *data, int (*func)(struct drmem_lmb *, const __be32 **, void *)); +void drmem_update_lmbs(struct property *prop); #endif static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb) diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c index 9af3832c9d8d..22197b18d85e 100644 --- a/arch/powerpc/mm/drmem.c +++ b/arch/powerpc/mm/drmem.c @@ -18,6 +18,7 @@ static int n_root_addr_cells, n_root_size_cells; static struct drmem_lmb_info __drmem_info; struct drmem_lmb_info *drmem_info = &__drmem_info; +static bool in_drmem_update; u64 drmem_lmb_memory_max(void) { @@ -178,6 +179,11 @@ int drmem_update_dt(void) if (!memory) return -1; + /* +* Set in_drmem_update to prevent the notifier callback from processing +* the DT property back, since the change is coming from the LMB tree. +*/ + in_drmem_update = true; prop = of_find_property(memory, "ibm,dynamic-memory", NULL); if (prop) { rc = drmem_update_dt_v1(memory, prop); @@ -186,6 +192,7 @@ int drmem_update_dt(void) if (prop) rc = drmem_update_dt_v2(memory, prop); } + in_drmem_update = false; of_node_put(memory); return rc; @@ -307,6 +314,45 @@ int __init walk_drmem_lmbs_early(unsigned long node, void *data, return ret; } +/* + * Update the LMB associativity index.
+ */ +static int update_lmb(struct drmem_lmb *updated_lmb, + __maybe_unused const __be32 **usm, + __maybe_unused void *data) +{ + struct drmem_lmb *lmb; + + for_each_drmem_lmb(lmb) { + if (lmb->drc_index != updated_lmb->drc_index) + continue; + + lmb->aa_index = updated_lmb->aa_index; + break; + } + return 0; +} + +/* + * Update the LMB associativity index. + * + * This needs to be called when the hypervisor is updating the + * dynamic-reconfiguration-memory node property. + */ +void drmem_update_lmbs(struct property *prop) +{ + /* +* Don't update the LMBs if triggered by the update done in +* drmem_update_dt(), the LMB values have been used to update the DT +* property in that case. +*/ + if (in_drmem_update) + return; + if (!strcmp(prop->name, "ibm,dynamic-memory")) + __walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb); + else if (!strcmp(prop->name, "ibm,dynamic-memory-v2")) + __walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb); +} #endif static int init_drmem_lmb_size(struct device_node *dn) diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c index
Re: [PATCH v5 5/9] powerpc/mm/book3s64: Update tlb flush routines to take a page walk cache flush argument
On 5/15/21 10:05 PM, Guenter Roeck wrote: On Thu, Apr 22, 2021 at 11:13:19AM +0530, Aneesh Kumar K.V wrote: No functional change in this patch Signed-off-by: Aneesh Kumar K.V --- .../include/asm/book3s/64/tlbflush-radix.h| 19 +++- arch/powerpc/include/asm/book3s/64/tlbflush.h | 23 --- arch/powerpc/mm/book3s64/radix_hugetlbpage.c | 4 +-- arch/powerpc/mm/book3s64/radix_tlb.c | 29 +++ 4 files changed, 42 insertions(+), 33 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h index 8b33601cdb9d..171441a43b35 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h @@ -56,15 +56,18 @@ static inline void radix__flush_all_lpid_guest(unsigned int lpid) } #endif -extern void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma, - unsigned long start, unsigned long end); -extern void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long start, -unsigned long end, int psize); -extern void radix__flush_pmd_tlb_range(struct vm_area_struct *vma, - unsigned long start, unsigned long end); -extern void radix__flush_tlb_range(struct vm_area_struct *vma, unsigned long start, +void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma, + unsigned long start, unsigned long end, + bool flush_pwc); +void radix__flush_pmd_tlb_range(struct vm_area_struct *vma, + unsigned long start, unsigned long end, + bool flush_pwc); +void radix__flush_tlb_pwc_range_psize(struct mm_struct *mm, unsigned long start, + unsigned long end, int psize, bool flush_pwc); +void radix__flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end); -extern void radix__flush_tlb_kernel_range(unsigned long start, unsigned long end); +void radix__flush_tlb_kernel_range(unsigned long start, unsigned long end); + extern void radix__local_flush_tlb_mm(struct mm_struct *mm); extern void radix__local_flush_all_mm(struct mm_struct *mm); 
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h index 215973b4cb26..f9f8a3a264f7 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h @@ -45,13 +45,30 @@ static inline void tlbiel_all_lpid(bool radix) hash__tlbiel_all(TLB_INVAL_SCOPE_LPID); } +static inline void flush_pmd_tlb_pwc_range(struct vm_area_struct *vma, + unsigned long start, + unsigned long end, + bool flush_pwc) +{ + if (radix_enabled()) + return radix__flush_pmd_tlb_range(vma, start, end, flush_pwc); + return hash__flush_tlb_range(vma, start, end); ^^ +} In this specific case we won't have build errors because: static inline void hash__flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end) { But I agree the below is better to read. static inline void flush_pmd_tlb_pwc_range(struct vm_area_struct *vma, unsigned long start, unsigned long end, bool flush_pwc) { if (radix_enabled()) radix__flush_pmd_tlb_range(vma, start, end, flush_pwc); else hash__flush_tlb_range(vma, start, end); return; } #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE static inline void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end) +{ + return flush_pmd_tlb_pwc_range(vma, start, end, false); ^^ Doesn't that cause build warnings/errors all over the place ? Guenter -aneesh
Re: [PATCH] lockdown, selinux: fix bogus SELinux lockdown permission checks
On Sat, May 15, 2021 at 2:57 AM Casey Schaufler wrote: > On 5/14/2021 8:12 AM, Ondrej Mosnacek wrote: > > On Wed, May 12, 2021 at 7:12 PM Casey Schaufler > > wrote: > >> On 5/12/2021 9:44 AM, Ondrej Mosnacek wrote: > >>> On Wed, May 12, 2021 at 6:18 PM Casey Schaufler > >>> wrote: > On 5/12/2021 6:21 AM, Ondrej Mosnacek wrote: > > On Sat, May 8, 2021 at 12:17 AM Casey Schaufler > > wrote: > >> On 5/7/2021 4:40 AM, Ondrej Mosnacek wrote: > >>> Commit 59438b46471a ("security,lockdown,selinux: implement SELinux > >>> lockdown") added an implementation of the locked_down LSM hook to > >>> SELinux, with the aim to restrict which domains are allowed to perform > >>> operations that would breach lockdown. > >>> > >>> However, in several places the security_locked_down() hook is called > >>> in > >>> situations where the current task isn't doing any action that would > >>> directly breach lockdown, leading to SELinux checks that are basically > >>> bogus. > >>> > >>> Since in most of these situations converting the callers such that > >>> security_locked_down() is called in a context where the current task > >>> would be meaningful for SELinux is impossible or very non-trivial (and > >>> could lead to TOCTOU issues for the classic Lockdown LSM > >>> implementation), fix this by adding a separate hook > >>> security_locked_down_globally() > >> This is a poor solution to the stated problem. Rather than adding > >> a new hook you should add the task as a parameter to the existing hook > >> and let the security modules do as they will based on its value. > >> If the caller does not have an appropriate task it should pass NULL. > >> The lockdown LSM can ignore the task value and SELinux can make its > >> own decision based on the task value passed. > > The problem with that approach is that all callers would then need to > > be updated and I intended to keep the patch small as I'd like it to go > > to stable kernels as well. 
> > > > But it does seem to be a better long-term solution - would it work for > > you (and whichever maintainer would be taking the patch(es)) if I just > > added another patch that refactors it to use the task parameter? > I can't figure out what you're suggesting. Are you saying that you > want to add a new hook *and* add the task parameter? > >>> No, just to keep this patch as-is (and let it go to stable in this > >>> form) and post another (non-stable) patch on top of it that undoes the > >>> new hook and re-implements the fix using your suggestion. (Yeah, it'll > >>> look weird, but I'm not sure how better to handle such situation - I'm > >>> open to doing it whatever different way the maintainers prefer.) > >> James gets to make the call on this one. If it was my call I would > >> tell you to make the task parameter change and accept the backport > >> pain. I think that as a security developer community we spend way too > >> much time and effort trying to avoid being noticed in source trees. > > Hm... actually, what about this attached patch? It switches to a > > single hook with a cred argument (I figured cred makes more sense than > > task_struct, since the rest of task_struct should be irrelevant for > > the LSM, anyway...) right from the start and keeps the original > > security_locked_down() function only as a simple wrapper around the > > main hook. > > > > At that point I think converting the other callers to call > > security_cred_locked_down() directly isn't really worth it, since the > > resulting calls would just be more verbose without much benefit. So > > I'm tempted to just leave the security_locked_down() helper as is, so > > that the more common pattern can be still achieved with a simpler > > call. > > > > What do you think? > > It's still a bit kludgy, but a big improvement over the previous version. > I wouldn't object to this approach. Ok, thanks. I'll post it as a v2 then. 
-- Ondrej Mosnacek Software Engineer, Linux Security - SELinux kernel Red Hat, Inc.
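The shape Ondrej converged on — one cred-aware hook, with security_locked_down() kept only as a thin wrapper for the common call pattern — can be sketched in plain C. Everything below (the cred struct contents, current_cred(), the SELinux-style check) is an illustrative stand-in for this discussion, not the actual LSM plumbing:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for kernel types; not the real LSM interface. */
struct cred { int selinux_sid; };

static struct cred current_cred_storage = { .selinux_sid = 42 };
static const struct cred *current_cred(void) { return &current_cred_storage; }

enum lockdown_reason { LOCKDOWN_NONE, LOCKDOWN_DEV_MEM };

/*
 * The one real hook: takes an explicit cred, or NULL when no task is
 * meaningful for the check. SELinux can decide per-subject from the
 * cred, while the classic Lockdown LSM simply ignores it.
 */
static int security_cred_locked_down(const struct cred *cred,
				     enum lockdown_reason what)
{
	if (what == LOCKDOWN_NONE)
		return 0;
	/* SELinux-style per-subject decision, only when a cred exists. */
	if (cred && cred->selinux_sid == 0)
		return -1; /* would be -EPERM in the kernel */
	return 0;
}

/* Kept as a simple wrapper so existing call sites stay short. */
static int security_locked_down(enum lockdown_reason what)
{
	return security_cred_locked_down(current_cred(), what);
}
```

This keeps the stable-backport surface small (callers of security_locked_down() are untouched) while giving callers with no meaningful task a way to pass NULL explicitly.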
[PATCH 0/9] tty: hvc_console: Fix some coding style issues
Fix some issues reported by checkpatch.pl. All of them are coding style issues, no function changes. Xiaofei Tan (9): tty: hvc_console: Fix spaces required around that '=' tty: hvc_console: Fix "foo * bar" should be "foo *bar" tty: hvc_console: Remove trailing whitespace tty: hvc_console: Fix issues of code indent should use tabs tty: hvc_console: Delete spaces prohibited around open parenthesis '(' and ')' tty: hvc_console: Fix coding style issues of block comments tty: hvc_console: Add a blank line after declarations tty: hvc_console: Remove the repeated words 'no' and 'from' tty: hvc_console: Move open brace { on the previous line drivers/tty/hvc/hvc_console.c | 37 ++--- 1 file changed, 22 insertions(+), 15 deletions(-) -- 2.8.1
[PATCH 5/9] tty: hvc_console: Delete spaces prohibited around open parenthesis '(' and ')'
Delete spaces prohibited after that open parenthesis '(' and before that close parenthesis ')', reported by checkpatch.pl. Signed-off-by: Xiaofei Tan --- drivers/tty/hvc/hvc_console.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 2e5c133..39018e5 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -689,7 +689,7 @@ static int __hvc_poll(struct hvc_struct *hp, bool may_sleep) spin_unlock_irqrestore(&hp->lock, flags); tty_hangup(tty); spin_lock_irqsave(&hp->lock, flags); - } else if ( n == -EAGAIN ) { + } else if (n == -EAGAIN) { /* * Some back-ends can only ensure a certain min * num of bytes read, which may be > 'count'. -- 2.8.1
[PATCH 7/9] tty: hvc_console: Add a blank line after declarations
Add a blank line after declarations, reported by checkpatch.pl. Signed-off-by: Xiaofei Tan --- drivers/tty/hvc/hvc_console.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index a61cdf0..f31efeb 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -922,6 +922,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data, /* We wait until a driver actually comes along */ if (atomic_inc_not_zero(&hvc_needs_init)) { int err = hvc_init(); + if (err) return ERR_PTR(err); } -- 2.8.1
[PATCH 2/9] tty: hvc_console: Fix "foo * bar" should be "foo *bar"
Fix "foo * bar" should be "foo *bar", reported by checkpatch.pl. Signed-off-by: Xiaofei Tan --- drivers/tty/hvc/hvc_console.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index a1eca9d..ddf07ff 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -344,7 +344,7 @@ static int hvc_install(struct tty_driver *driver, struct tty_struct *tty) * The TTY interface won't be used until after the vio layer has exposed the vty * adapter to the kernel. */ -static int hvc_open(struct tty_struct *tty, struct file * filp) +static int hvc_open(struct tty_struct *tty, struct file *filp) { struct hvc_struct *hp = tty->driver_data; unsigned long flags; @@ -386,7 +386,7 @@ static int hvc_open(struct tty_struct *tty, struct file * filp) return rc; } -static void hvc_close(struct tty_struct *tty, struct file * filp) +static void hvc_close(struct tty_struct *tty, struct file *filp) { struct hvc_struct *hp = tty->driver_data; unsigned long flags; -- 2.8.1
[PATCH 8/9] tty: hvc_console: Remove the repeated words 'no' and 'from'
Remove the repeated words 'no' and 'from', reported by checkpatch.pl. Signed-off-by: Xiaofei Tan --- drivers/tty/hvc/hvc_console.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index f31efeb..b6720b0 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -293,7 +293,7 @@ int hvc_instantiate(uint32_t vtermno, int index, const struct hv_ops *ops) if (vtermnos[index] != -1) return -1; - /* make sure no no tty has been registered in this index */ + /* make sure no tty has been registered in this index */ hp = hvc_get_by_index(index); if (hp) { tty_port_put(&hp->port); @@ -622,7 +622,7 @@ static u32 timeout = MIN_TIMEOUT; /* * Maximum number of bytes to get from the console driver if hvc_poll is * called from driver (and can't sleep). Any more than this and we break - * and start polling with khvcd. This value was derived from from an OpenBMC + * and start polling with khvcd. This value was derived from an OpenBMC * console with the OPAL driver that results in about 0.25ms interrupts off * latency. */ -- 2.8.1
[PATCH 3/9] tty: hvc_console: Remove trailing whitespace
Remove trailing whitespace, reported by checkpatch.pl. Signed-off-by: Xiaofei Tan --- drivers/tty/hvc/hvc_console.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index ddf07ff..13f63d5 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -204,7 +204,7 @@ static struct tty_driver *hvc_console_device(struct console *c, int *index) } static int hvc_console_setup(struct console *co, char *options) -{ +{ if (co->index < 0 || co->index >= MAX_NR_HVC_CONSOLES) return -ENODEV; -- 2.8.1
[PATCH 6/9] tty: hvc_console: Fix coding style issues of block comments
Fix coding style issues of block comments, reported by checkpatch.pl. Besides, add a period at the end of the sentences. Signed-off-by: Xiaofei Tan --- drivers/tty/hvc/hvc_console.c | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 39018e5..a61cdf0 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -177,7 +177,8 @@ static void hvc_console_print(struct console *co, const char *b, r = cons_ops[index]->put_chars(vtermnos[index], c, i); if (r <= 0) { /* throw away characters on error -* but spin in case of -EAGAIN */ +* but spin in case of -EAGAIN. +*/ if (r != -EAGAIN) { i = 0; } else { @@ -484,7 +485,8 @@ static int hvc_push(struct hvc_struct *hp) return 0; } /* throw away output on error; this happens when - there is no session connected to the vterm. */ +* there is no session connected to the vterm. +*/ hp->n_outbuf = 0; } else hp->n_outbuf -= n; @@ -707,7 +709,8 @@ static int __hvc_poll(struct hvc_struct *hp, bool may_sleep) /* XXX should support a sequence */ if (buf[i] == '\x0f') { /* ^O */ /* if ^O is pressed again, reset -* sysrq_pressed and flip ^O char */ +* sysrq_pressed and flip ^O char. +*/ sysrq_pressed = !sysrq_pressed; if (sysrq_pressed) continue; @@ -749,7 +752,8 @@ static int __hvc_poll(struct hvc_struct *hp, bool may_sleep) if (read_total) { /* Activity is occurring, so reset the polling backoff value to - a minimum for performance. */ +* a minimum for performance. +*/ timeout = MIN_TIMEOUT; tty_flip_buffer_push(&hp->port); @@ -1037,7 +1041,8 @@ static int hvc_init(void) tty_set_operations(drv, &hvc_ops); /* Always start the kthread because there can be hotplug vty adapters -* added later. */ +* added later. +*/ hvc_task = kthread_run(khvcd, NULL, "khvcd"); if (IS_ERR(hvc_task)) { printk(KERN_ERR "Couldn't create kthread for console.\n"); -- 2.8.1
[PATCH 1/9] tty: hvc_console: Fix spaces required around that '='
Fix spaces required around that '=', reported by checkpatch.pl. Signed-off-by: Xiaofei Tan --- drivers/tty/hvc/hvc_console.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index d0f0253..a1eca9d 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -944,7 +944,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data, * find index to use: * see if this vterm id matches one registered for console. */ - for (i=0; i < MAX_NR_HVC_CONSOLES; i++) + for (i = 0; i < MAX_NR_HVC_CONSOLES; i++) if (vtermnos[i] == hp->vtermno && cons_ops[i] == hp->ops) break; -- 2.8.1
[PATCH 4/9] tty: hvc_console: Fix issues of code indent should use tabs
Fix issues of code indent should use tabs, reported by checkpatch.pl. Signed-off-by: Xiaofei Tan --- drivers/tty/hvc/hvc_console.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 13f63d5..2e5c133 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -268,7 +268,7 @@ static void hvc_check_console(int index) if (hvc_console.flags & CON_ENABLED) return; - /* If this index is what the user requested, then register + /* If this index is what the user requested, then register * now (setup won't fail at this point). It's ok to just * call register again if previously .setup failed. */ -- 2.8.1
[PATCH 9/9] tty: hvc_console: Move open brace { on the previous line
Fix issues that open brace { should be on the previous line, reported by checkpatch.pl. Signed-off-by: Xiaofei Tan --- drivers/tty/hvc/hvc_console.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index b6720b0..b96399e 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -140,8 +140,9 @@ static int hvc_flush(struct hvc_struct *hp) * static because kmalloc will not work during early console init. */ static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES]; -static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = - {[0 ... MAX_NR_HVC_CONSOLES - 1] = -1}; +static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = { + [0 ... MAX_NR_HVC_CONSOLES - 1] = -1 +}; /* * Console APIs, NOT TTY. These APIs are available immediately when -- 2.8.1
Re: Fwd: [Bug 213069] New: kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:147! Oops: Exception in kernel mode, sig: 5 [#1]
On 5/17/21 12:55 PM, Anshuman Khandual wrote: On 5/17/21 11:25 AM, Aneesh Kumar K.V wrote: On 5/17/21 11:17 AM, Christophe Leroy wrote: +aneesh +linuxppc-dev list Le 17/05/2021 à 07:44, Anshuman Khandual a écrit : Hello Christophe, DEBUG_VM_PGTABLE has now been re-enabled on powerpc recently ? was not aware about this. From the error log, it failed explicitly on 4K page size hash config. static inline pmd_t hash__pmd_mkhuge(pmd_t pmd) { BUG(); --> Failed return pmd; } static inline pmd_t __pmd_mkhuge(pmd_t pmd) { if (radix_enabled()) return radix__pmd_mkhuge(pmd); return hash__pmd_mkhuge(pmd); } pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot) { unsigned long pmdv; pmdv = (pfn << PAGE_SHIFT) & PTE_RPN_MASK; return __pmd_mkhuge(pmd_set_protbits(__pmd(pmdv), pgprot)); } It seems like on powerpc, where pfn_pmd() makes a huge page but which is not supported on 4K hash config thus triggering the BUG(). But all pfn_pmd() call sites inside the debug_vm_pgtable() test are protected with CONFIG_TRANSPARENT_HUGEPAGE. IIUC unlike powerpc, pfn_pmd() does not directly make a huge page on other platforms. Looking at arch/powerpc/include/asm/book3s/64/hash-4k.h, all relevant THP helpers has BUG() or 0 which indicates THP might not be supported on 4K page size hash config ? But looking at arch/powerpc/platforms/Kconfig.cputype, it seems like HAVE_ARCH_TRANSPARENT_HUGEPAGE is invariably selected on PPC_BOOK3S_64 platforms which I assume includes 4K page size hash config as well. Is THP some how getting enabled on this 4K page size hash config where it should not be (thus triggering the BUG) ? OR am I missing something here. We should put those pfn_pmd() and pfn_pud() after if (!has_transparent_hugepage()) return; On hash with 4K page size, we can't support leaf page table entry and PMD and PUD level. Hence we don't support THP for them. But could CONFIG_TRANSPARENT_HUGEPAGE be enabled on such configs ? Otherwise, wondering how the BUG() just got triggered there. yes. 
We do support THP with radix translation and 4K page size. -aneesh
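Aneesh's suggested ordering — bail out of the PMD/PUD tests before any pfn_pmd()/pfn_pud() call when THP is unavailable — reduces to putting the guard first. A minimal model of just that control flow (the helpers below are hypothetical stubs, not the real mm/debug_vm_pgtable.c code, where pfn_pmd() on a 4K hash config hits the BUG() in hash__pmd_mkhuge()):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in: false models a 4K hash config without THP. */
static bool thp_available;
static bool has_transparent_hugepage(void) { return thp_available; }

/* Counts calls that would BUG() on 4K hash if reached without THP. */
static int pfn_pmd_calls;
static void pfn_pmd_stub(void) { pfn_pmd_calls++; }

static void pmd_basic_tests(void)
{
	/*
	 * Guard must come before pfn_pmd(): the compile-time
	 * CONFIG_TRANSPARENT_HUGEPAGE check is not enough, because on
	 * book3s/64 THP support is a runtime property of the MMU config.
	 */
	if (!has_transparent_hugepage())
		return;
	pfn_pmd_stub();
	/* ... rest of the PMD checks ... */
}
```

The same reordering applies to pmd_advanced_tests() and the PUD variants.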
Re: [PATCH 1/2] powerpc/interrupt: Refactor interrupt_exit_user_prepare() and syscall_exit_prepare()
Excerpts from Christophe Leroy's message of May 14, 2021 6:28 pm: > Last part of interrupt_exit_user_prepare() and syscall_exit_prepare() > are identical. > > Create a __interrupt_exit_user_prepare() function that is called by > both. > > Note that it replaces a local_irq_save(flags) by local_irq_disable(). > This is similar because the flags are never used. On ppc 8xx it is > more efficient because it doesn't require reading MSR. Can these cleanups go after my interrupt performance improvements? I posted them for last series but were dropped due to crashes without time to resubmit. I'm working on them again now. Thanks, Nick
Re: [FSL P50x0] KVM HV doesn't work anymore
Excerpts from Christian Zigotzky's message of May 15, 2021 11:46 pm: > On 15 May 2021 at 12:08pm Christophe Leroy wrote: >> >> >> Le 15/05/2021 à 11:48, Christian Zigotzky a écrit : >>> Hi All, >>> >>> I bisected today [1] and the bisecting itself was OK but the >>> reverting of the bad commit doesn't solve the issue. Do you have an >>> idea which commit could be responsible for this issue? Maybe the >>> bisecting wasn't successful. I will look in the kernel git log. Maybe >>> there is a commit that affected KVM HV on FSL P50x0 machines. >> >> If the uImage doesn't load, it may be because of the size of uImage. >> >> See https://github.com/linuxppc/issues/issues/208 >> >> Is there a significant size difference with and without KVM HV ? >> >> Maybe you can try to remove another option to reduce the size of the >> uImage. > I tried it but it doesn't solve the issue. The uImage works without KVM > HV in a virtual e5500 QEMU machine. Any more progress with this? I would say that bisect might have just been a bit unstable and maybe by chance some things did not crash so it's pointing to the wrong patch. Upstream merge of powerpc-5.13-1 was good and powerpc-5.13-2 was bad? Between that looks like some KVM MMU rework. You could try the patch before this one b1c5356e873c ("KVM: PPC: Convert to the gfn-based MMU notifier callbacks"). That won't revert cleanly so just try run the tree at that point. If it works, test the patch and see if it fails. Thanks, Nick
Re: Fwd: [Bug 213069] New: kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:147! Oops: Exception in kernel mode, sig: 5 [#1]
On 5/17/21 11:25 AM, Aneesh Kumar K.V wrote: > On 5/17/21 11:17 AM, Christophe Leroy wrote: >> +aneesh >> +linuxppc-dev list >> >> Le 17/05/2021 à 07:44, Anshuman Khandual a écrit : >>> Hello Christophe, >>> >>> DEBUG_VM_PGTABLE has now been re-enabled on powerpc recently ? was not >>> aware about this. From the error log, it failed explicitly on 4K page >>> size hash config. >>> >>> static inline pmd_t hash__pmd_mkhuge(pmd_t pmd) >>> { >>> BUG(); --> Failed >>> return pmd; >>> } >>> >>> static inline pmd_t __pmd_mkhuge(pmd_t pmd) >>> { >>> if (radix_enabled()) >>> return radix__pmd_mkhuge(pmd); >>> return hash__pmd_mkhuge(pmd); >>> } >>> >>> pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot) >>> { >>> unsigned long pmdv; >>> >>> pmdv = (pfn << PAGE_SHIFT) & PTE_RPN_MASK; >>> >>> return __pmd_mkhuge(pmd_set_protbits(__pmd(pmdv), pgprot)); >>> } >>> >>> It seems like on powerpc, where pfn_pmd() makes a huge page but which >>> is not supported on 4K hash config thus triggering the BUG(). But all >>> pfn_pmd() call sites inside the debug_vm_pgtable() test are protected >>> with CONFIG_TRANSPARENT_HUGEPAGE. IIUC unlike powerpc, pfn_pmd() does >>> not directly make a huge page on other platforms. >>> >>> Looking at arch/powerpc/include/asm/book3s/64/hash-4k.h, all relevant >>> THP helpers has BUG() or 0 which indicates THP might not be supported >>> on 4K page size hash config ? >>> >>> But looking at arch/powerpc/platforms/Kconfig.cputype, it seems like >>> HAVE_ARCH_TRANSPARENT_HUGEPAGE is invariably selected on PPC_BOOK3S_64 >>> platforms which I assume includes 4K page size hash config as well. >>> >>> Is THP some how getting enabled on this 4K page size hash config where >>> it should not be (thus triggering the BUG) ? OR am I missing something >>> here. >>> >> > > We should put those pfn_pmd() and pfn_pud() after > > if (!has_transparent_hugepage()) > return; > > > On hash with 4K page size, we can't support leaf page table entry and PMD and > PUD level. 
Hence we don't support THP for them. But could CONFIG_TRANSPARENT_HUGEPAGE be enabled on such configs ? Otherwise, wondering how the BUG() just got triggered there.
Re: [PATCH v14 7/9] powerpc: Set ARCH_HAS_STRICT_MODULE_RWX
Le 17/05/2021 à 05:28, Jordan Niethe a écrit : From: Russell Currey To enable strict module RWX on powerpc, set: CONFIG_STRICT_MODULE_RWX=y You should also have CONFIG_STRICT_KERNEL_RWX=y set to have any real security benefit. ARCH_HAS_STRICT_MODULE_RWX is set to require ARCH_HAS_STRICT_KERNEL_RWX. This is due to a quirk in arch/Kconfig and arch/powerpc/Kconfig that makes STRICT_MODULE_RWX *on by default* in configurations where STRICT_KERNEL_RWX is *unavailable*. Since this doesn't make much sense, and module RWX without kernel RWX doesn't make much sense, having the same dependencies as kernel RWX works around this problem. Book32s/32 processors with a hash mmu (i.e. 604 core) can not set memory ^^ Book32s ==> Book3s protection on a page by page basis so do not enable. It is not exactly that. The problem on 604 is for _exec_ protection. Note that on book3s/32, on both 603 and 604 core, it is not possible to write protect kernel pages. So maybe it would make sense to disable ARCH_HAS_STRICT_MODULE_RWX on CONFIG_PPC_BOOK3S_32 completely, I'm not sure. 
Signed-off-by: Russell Currey [jpn: - predicate on !PPC_BOOK3S_604 - make module_alloc() use PAGE_KERNEL protection] Signed-off-by: Jordan Niethe Reviewed-by: Christophe Leroy --- v10: - Predicate on !PPC_BOOK3S_604 - Make module_alloc() use PAGE_KERNEL protection v11: - Neaten up v13: Use strict_kernel_rwx_enabled() v14: Make changes to module_alloc() its own commit --- arch/powerpc/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index cce0a137b046..cb5d9d862c35 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -140,6 +140,7 @@ config PPC select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64 select ARCH_HAS_SET_MEMORY select ARCH_HAS_STRICT_KERNEL_RWX if ((PPC_BOOK3S_64 || PPC32) && !HIBERNATION) + select ARCH_HAS_STRICT_MODULE_RWX if ARCH_HAS_STRICT_KERNEL_RWX && !PPC_BOOK3S_604 select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST select ARCH_HAS_UACCESS_FLUSHCACHE select ARCH_HAS_COPY_MC if PPC64
Re: [PATCH v14 3/9] powerpc/modules: Make module_alloc() Strict Module RWX aware
On Mon, May 17, 2021 at 4:37 PM Christophe Leroy wrote: > > > > Le 17/05/2021 à 05:28, Jordan Niethe a écrit : > > Make module_alloc() use PAGE_KERNEL protections instead of > > PAGE_KERNEL_EXEX if Strict Module RWX is enabled. > > > > Signed-off-by: Jordan Niethe > > --- > > v14: - Split out from powerpc: Set ARCH_HAS_STRICT_MODULE_RWX > > - Add and use strict_module_rwx_enabled() helper > > --- > > arch/powerpc/include/asm/mmu.h | 5 + > > arch/powerpc/kernel/module.c | 4 +++- > > 2 files changed, 8 insertions(+), 1 deletion(-) > > > > diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h > > index 607168b1aef4..7710bf0cbf8a 100644 > > --- a/arch/powerpc/include/asm/mmu.h > > +++ b/arch/powerpc/include/asm/mmu.h > > @@ -357,6 +357,11 @@ static inline bool strict_kernel_rwx_enabled(void) > > return false; > > } > > #endif > > + > > +static inline bool strict_module_rwx_enabled(void) > > +{ > > + return IS_ENABLED(CONFIG_STRICT_MODULE_RWX) && > > strict_kernel_rwx_enabled(); > > +} > > Looking at arch/Kconfig, I have the feeling that it is possible to select > CONFIG_STRICT_MODULE_RWX > without selecting CONFIG_STRICT_KERNEL_RWX. > > In that case, strict_kernel_rwx_enabled() will return false. Ok, if someone did that currently it would break things, e.g. code patching. I think it should it be made impossible to CONFIG_STRICT_MODULE_RWX without CONFIG_STRICT_KERNEL_RWX? > > > #endif /* !__ASSEMBLY__ */ > > > > /* The kernel use the constants below to index in the page sizes array. > > diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c > > index 3f35c8d20be7..ed04a3ba66fe 100644 > > --- a/arch/powerpc/kernel/module.c > > +++ b/arch/powerpc/kernel/module.c > > @@ -92,12 +92,14 @@ int module_finalize(const Elf_Ehdr *hdr, > > static __always_inline void * > > __module_alloc(unsigned long size, unsigned long start, unsigned long end) > > { > > + pgprot_t prot = strict_module_rwx_enabled() ? 
PAGE_KERNEL : > > PAGE_KERNEL_EXEC; > > + > > /* > >* Don't do huge page allocations for modules yet until more testing > >* is done. STRICT_MODULE_RWX may require extra work to support this > >* too. > >*/ > > - return __vmalloc_node_range(size, 1, start, end, GFP_KERNEL, > > PAGE_KERNEL_EXEC, > > + return __vmalloc_node_range(size, 1, start, end, GFP_KERNEL, prot, > > VM_FLUSH_RESET_PERMS | VM_NO_HUGE_VMAP, > > NUMA_NO_NODE, > > __builtin_return_address(0)); > > } > >
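The failure mode being discussed reduces to this helper's truth table: with STRICT_MODULE_RWX selected but STRICT_KERNEL_RWX absent, strict_module_rwx_enabled() is false and modules silently stay writable+executable. A userspace model of just that logic (the booleans stand in for IS_ENABLED(CONFIG_...), which is resolved by Kconfig, not at runtime):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for IS_ENABLED(CONFIG_...); real values come from Kconfig. */
static bool config_strict_kernel_rwx;
static bool config_strict_module_rwx;

static bool strict_kernel_rwx_enabled(void)
{
	return config_strict_kernel_rwx;
}

/*
 * Mirrors the proposed helper: module RWX is only effective when kernel
 * RWX is too. This is why selecting STRICT_MODULE_RWX alone would
 * surprisingly do nothing, and why making the Kconfig dependency
 * explicit (as in patch 7/9) closes the gap.
 */
static bool strict_module_rwx_enabled(void)
{
	return config_strict_module_rwx && strict_kernel_rwx_enabled();
}
```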
Re: [RFC 1/4] drivers/nvdimm: Add perf interface to expose nvdimm performance stats
On 5/14/21 5:17 PM, Peter Zijlstra wrote: > On Thu, May 13, 2021 at 05:56:14PM +0530, kajoljain wrote: > >> But yes the current read/add/del functions are not adding value. We >> could add an arch/platform specific function which could handle the >> capturing of the counter data and do the rest of the operation here, >> is this approach better? > > Right; have your register_nvdimm_pmu() set pmu->{add,del,read} to > nd_pmu->{add,del,read} directly, don't bother with these intermediates. > Also you can WARN_ON_ONCE() if any of them are NULL and fail > registration at that point. > Hi Peter, I will make all required changes and send next version of this patchset soon. Thanks, Kajol Jain
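Peter's suggestion — wire the driver callbacks straight into the pmu with no pass-through intermediates, and fail registration when any are NULL — can be sketched like this. The struct shapes and register_nvdimm_pmu() signature are illustrative only, not the final nvdimm/perf API:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative shapes only; the real structs live in perf/nvdimm code. */
struct perf_event;

struct pmu {
	int  (*add)(struct perf_event *ev, int flags);
	void (*del)(struct perf_event *ev, int flags);
	void (*read)(struct perf_event *ev);
};

struct nvdimm_pmu {
	struct pmu pmu;
	int  (*add)(struct perf_event *ev, int flags);
	void (*del)(struct perf_event *ev, int flags);
	void (*read)(struct perf_event *ev);
};

static int register_nvdimm_pmu(struct nvdimm_pmu *nd_pmu)
{
	/* The kernel version would WARN_ON_ONCE() before failing. */
	if (!nd_pmu->add || !nd_pmu->del || !nd_pmu->read)
		return -EINVAL;
	/* No intermediates: point the pmu ops at the driver ops directly. */
	nd_pmu->pmu.add  = nd_pmu->add;
	nd_pmu->pmu.del  = nd_pmu->del;
	nd_pmu->pmu.read = nd_pmu->read;
	return 0;
}

/* Demo driver callbacks and instances for exercising the flow. */
static int  demo_add(struct perf_event *ev, int flags) { (void)ev; (void)flags; return 0; }
static void demo_del(struct perf_event *ev, int flags) { (void)ev; (void)flags; }
static void demo_read(struct perf_event *ev) { (void)ev; }

static struct nvdimm_pmu demo_pmu = { .add = demo_add, .del = demo_del, .read = demo_read };
static struct nvdimm_pmu bad_pmu; /* all callbacks NULL: must be rejected */
```

The design point is that perf core then calls the driver's own add/del/read directly, instead of thin wrappers that add no value.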
Re: [PATCH v14 6/9] powerpc/bpf: Write protect JIT code
Le 17/05/2021 à 05:28, Jordan Niethe a écrit : Add the necessary call to bpf_jit_binary_lock_ro() to remove write and add exec permissions to the JIT image after it has finished being written. Without CONFIG_STRICT_MODULE_RWX the image will be writable and executable until the call to bpf_jit_binary_lock_ro(). And _with_ CONFIG_STRICT_MODULE_RWX what will happen ? It will be _writable_ but not _executable_ ? Reviewed-by: Christophe Leroy Signed-off-by: Jordan Niethe --- v10: New to series v11: Remove CONFIG_STRICT_MODULE_RWX conditional --- arch/powerpc/net/bpf_jit_comp.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c index 6c8c268e4fe8..53aefee3fe70 100644 --- a/arch/powerpc/net/bpf_jit_comp.c +++ b/arch/powerpc/net/bpf_jit_comp.c @@ -237,6 +237,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) fp->jited_len = alloclen; bpf_flush_icache(bpf_hdr, (u8 *)bpf_hdr + (bpf_hdr->pages * PAGE_SIZE)); + bpf_jit_binary_lock_ro(bpf_hdr); if (!fp->is_func || extra_pass) { bpf_prog_fill_jited_linfo(fp, addrs); out_addrs:
Re: [PATCH v14 3/9] powerpc/modules: Make module_alloc() Strict Module RWX aware
Le 17/05/2021 à 05:28, Jordan Niethe a écrit : Make module_alloc() use PAGE_KERNEL protections instead of PAGE_KERNEL_EXEX if Strict Module RWX is enabled. Signed-off-by: Jordan Niethe --- v14: - Split out from powerpc: Set ARCH_HAS_STRICT_MODULE_RWX - Add and use strict_module_rwx_enabled() helper --- arch/powerpc/include/asm/mmu.h | 5 + arch/powerpc/kernel/module.c | 4 +++- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h index 607168b1aef4..7710bf0cbf8a 100644 --- a/arch/powerpc/include/asm/mmu.h +++ b/arch/powerpc/include/asm/mmu.h @@ -357,6 +357,11 @@ static inline bool strict_kernel_rwx_enabled(void) return false; } #endif + +static inline bool strict_module_rwx_enabled(void) +{ + return IS_ENABLED(CONFIG_STRICT_MODULE_RWX) && strict_kernel_rwx_enabled(); +} Looking at arch/Kconfig, I have the feeling that it is possible to select CONFIG_STRICT_MODULE_RWX without selecting CONFIG_STRICT_KERNEL_RWX. In that case, strict_kernel_rwx_enabled() will return false. #endif /* !__ASSEMBLY__ */ /* The kernel use the constants below to index in the page sizes array. diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c index 3f35c8d20be7..ed04a3ba66fe 100644 --- a/arch/powerpc/kernel/module.c +++ b/arch/powerpc/kernel/module.c @@ -92,12 +92,14 @@ int module_finalize(const Elf_Ehdr *hdr, static __always_inline void * __module_alloc(unsigned long size, unsigned long start, unsigned long end) { + pgprot_t prot = strict_module_rwx_enabled() ? PAGE_KERNEL : PAGE_KERNEL_EXEC; + /* * Don't do huge page allocations for modules yet until more testing * is done. STRICT_MODULE_RWX may require extra work to support this * too. */ - return __vmalloc_node_range(size, 1, start, end, GFP_KERNEL, PAGE_KERNEL_EXEC, + return __vmalloc_node_range(size, 1, start, end, GFP_KERNEL, prot, VM_FLUSH_RESET_PERMS | VM_NO_HUGE_VMAP, NUMA_NO_NODE, __builtin_return_address(0)); }
[Bug 213069] kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:147! Oops: Exception in kernel mode, sig: 5 [#1]
https://bugzilla.kernel.org/show_bug.cgi?id=213069 Christophe Leroy (christophe.le...@csgroup.eu) changed: What|Removed |Added CC||christophe.le...@csgroup.eu --- Comment #2 from Christophe Leroy (christophe.le...@csgroup.eu) --- The bug is not in powerpc but in function pmd_basic_tests() in mm/debug_vm_pgtable.c (https://elixir.bootlin.com/linux/v5.13-rc1/source/mm/debug_vm_pgtable.c#L146) pfn_pmd() should not be called before the has_transparent_hugepage() verification. Same problem in pmd_advanced_tests(). And there is the exact same issue with PUD tests and the pfn_pud() function. -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
[PATCH 4/4] powerpc: Enable KFENCE on BOOK3S/64
From: Christophe Leroy This reuses the DEBUG_PAGEALLOC logic. Tested with CONFIG_KFENCE + CONFIG_KUNIT + CONFIG_KFENCE_KUNIT_TEST on radix and hash. Signed-off-by: Christophe Leroy [jpn: Handle radix] Signed-off-by: Jordan Niethe --- arch/powerpc/Kconfig | 2 +- arch/powerpc/include/asm/book3s/64/pgtable.h | 2 +- arch/powerpc/include/asm/kfence.h| 19 +++ arch/powerpc/mm/book3s64/hash_utils.c| 12 ++-- arch/powerpc/mm/book3s64/radix_pgtable.c | 8 +--- 5 files changed, 32 insertions(+), 11 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 6df64d6815df..1743364d7370 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -196,7 +196,7 @@ config PPC select HAVE_ARCH_KASAN if PPC32 && PPC_PAGE_SHIFT <= 14 select HAVE_ARCH_KASAN_VMALLOC if PPC32 && PPC_PAGE_SHIFT <= 14 select HAVE_ARCH_KGDB - select HAVE_ARCH_KFENCE if PPC32 + select HAVE_ARCH_KFENCE if ARCH_SUPPORTS_DEBUG_PAGEALLOC select HAVE_ARCH_MMAP_RND_BITS select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT select HAVE_ARCH_NVRAM_OPS diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h index b89482aed82a..35300f2ee5d0 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -822,7 +822,7 @@ static inline bool check_pte_access(unsigned long access, unsigned long ptev) * Generic functions with hash/radix callbacks */ -#ifdef CONFIG_DEBUG_PAGEALLOC +#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_KFENCE) static inline void __kernel_map_pages(struct page *page, int numpages, int enable) { if (radix_enabled()) diff --git a/arch/powerpc/include/asm/kfence.h b/arch/powerpc/include/asm/kfence.h index a9846b68c6b9..9d388df7c1a8 100644 --- a/arch/powerpc/include/asm/kfence.h +++ b/arch/powerpc/include/asm/kfence.h @@ -11,11 +11,29 @@ #include #include +#if defined(CONFIG_PPC64) && !defined(PPC64_ELF_ABI_v2) +#define ARCH_FUNC_PREFIX "." 
+#endif + static inline bool arch_kfence_init_pool(void) { return true; } +#ifdef CONFIG_PPC64 +static inline bool kfence_protect_page(unsigned long addr, bool protect) +{ + struct page *page; + + page = virt_to_page(addr); + if (protect) + __kernel_map_pages(page, 1, 0); + else + __kernel_map_pages(page, 1, 1); + + return true; +} +#else static inline bool kfence_protect_page(unsigned long addr, bool protect) { pte_t *kpte = virt_to_kpte(addr); @@ -29,5 +47,6 @@ static inline bool kfence_protect_page(unsigned long addr, bool protect) return true; } +#endif #endif /* __ASM_POWERPC_KFENCE_H */ diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c index fe5cf1cf4dd5..fecb379426e7 100644 --- a/arch/powerpc/mm/book3s64/hash_utils.c +++ b/arch/powerpc/mm/book3s64/hash_utils.c @@ -323,8 +323,8 @@ int htab_bolt_mapping(unsigned long vstart, unsigned long vend, break; cond_resched(); - if (debug_pagealloc_enabled() && - (paddr >> PAGE_SHIFT) < linear_map_hash_count) + if (debug_pagealloc_enabled_or_kfence() && + (paddr >> PAGE_SHIFT) < linear_map_hash_count) linear_map_hash_slots[paddr >> PAGE_SHIFT] = ret | 0x80; } return ret < 0 ? ret : 0; @@ -672,7 +672,7 @@ static void __init htab_init_page_sizes(void) bool aligned = true; init_hpte_page_sizes(); - if (!debug_pagealloc_enabled()) { + if (!debug_pagealloc_enabled_or_kfence()) { /* * Pick a size for the linear mapping. 
Currently, we only * support 16M, 1M and 4K which is the default @@ -960,7 +960,7 @@ static void __init htab_initialize(void) prot = pgprot_val(PAGE_KERNEL); - if (debug_pagealloc_enabled()) { + if (debug_pagealloc_enabled_or_kfence()) { linear_map_hash_count = memblock_end_of_DRAM() >> PAGE_SHIFT; linear_map_hash_slots = memblock_alloc_try_nid( linear_map_hash_count, 1, MEMBLOCK_LOW_LIMIT, @@ -1936,7 +1936,7 @@ long hpte_insert_repeating(unsigned long hash, unsigned long vpn, return slot; } -#ifdef CONFIG_DEBUG_PAGEALLOC +#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_KFENCE) static DEFINE_SPINLOCK(linear_map_hash_lock); static void kernel_map_linear_page(unsigned long vaddr, unsigned long lmi) @@ -2009,7 +2009,7 @@ void hash__kernel_map_pages(struct page *page, int numpages, int enable) } local_irq_restore(flags); } -#endif /* CONFIG_DEBUG_PAGEALLOC */ +#endif /* CONFIG_DEBUG_PAGEALLOC || CONFIG_KFENCE */ void
[PATCH 3/4] powerpc/64s: Allow double call of kernel_[un]map_linear_page()
From: Christophe Leroy

If the page is already mapped (or already unmapped, respectively), bail out.

Signed-off-by: Christophe Leroy
Signed-off-by: Jordan Niethe
---
 arch/powerpc/mm/book3s64/hash_utils.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index d74482cce064..fe5cf1cf4dd5 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -1953,6 +1953,9 @@ static void kernel_map_linear_page(unsigned long vaddr, unsigned long lmi)
 	if (!vsid)
 		return;
 
+	if (linear_map_hash_slots[lmi] & 0x80)
+		return;
+
 	ret = hpte_insert_repeating(hash, vpn, __pa(vaddr), mode,
 				    HPTE_V_BOLTED,
 				    mmu_linear_psize, mmu_kernel_ssize);
@@ -1972,7 +1975,10 @@ static void kernel_unmap_linear_page(unsigned long vaddr, unsigned long lmi)
 	hash = hpt_hash(vpn, PAGE_SHIFT, mmu_kernel_ssize);
 
 	spin_lock(&linear_map_hash_lock);
-	BUG_ON(!(linear_map_hash_slots[lmi] & 0x80));
+	if (!(linear_map_hash_slots[lmi] & 0x80)) {
+		spin_unlock(&linear_map_hash_lock);
+		return;
+	}
 	hidx = linear_map_hash_slots[lmi] & 0x7f;
 	linear_map_hash_slots[lmi] = 0;
 	spin_unlock(&linear_map_hash_lock);
-- 
2.25.1
[PATCH 2/4] powerpc/64s: Remove unneeded #ifdef CONFIG_DEBUG_PAGEALLOC in hash_utils
From: Christophe Leroy

debug_pagealloc_enabled() is always defined and constant folds to 'false' when CONFIG_DEBUG_PAGEALLOC is not enabled.

Remove the #ifdefs, the code and associated static variables will be optimised out by the compiler when CONFIG_DEBUG_PAGEALLOC is not defined.

Signed-off-by: Christophe Leroy
Signed-off-by: Jordan Niethe
---
 arch/powerpc/mm/book3s64/hash_utils.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 5b9709075fbd..d74482cce064 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -126,11 +126,8 @@ EXPORT_SYMBOL_GPL(mmu_slb_size);
 #ifdef CONFIG_PPC_64K_PAGES
 int mmu_ci_restrictions;
 #endif
-#ifdef CONFIG_DEBUG_PAGEALLOC
 static u8 *linear_map_hash_slots;
 static unsigned long linear_map_hash_count;
-static DEFINE_SPINLOCK(linear_map_hash_lock);
-#endif /* CONFIG_DEBUG_PAGEALLOC */
 struct mmu_hash_ops mmu_hash_ops;
 EXPORT_SYMBOL(mmu_hash_ops);
@@ -326,11 +323,9 @@ int htab_bolt_mapping(unsigned long vstart, unsigned long vend,
 			break;
 
 		cond_resched();
-#ifdef CONFIG_DEBUG_PAGEALLOC
 		if (debug_pagealloc_enabled() &&
 		    (paddr >> PAGE_SHIFT) < linear_map_hash_count)
 			linear_map_hash_slots[paddr >> PAGE_SHIFT] = ret | 0x80;
-#endif /* CONFIG_DEBUG_PAGEALLOC */
 	}
 	return ret < 0 ? ret : 0;
 }
@@ -965,7 +960,6 @@ static void __init htab_initialize(void)
 
 	prot = pgprot_val(PAGE_KERNEL);
 
-#ifdef CONFIG_DEBUG_PAGEALLOC
 	if (debug_pagealloc_enabled()) {
 		linear_map_hash_count = memblock_end_of_DRAM() >> PAGE_SHIFT;
 		linear_map_hash_slots = memblock_alloc_try_nid(
@@ -975,7 +969,6 @@ static void __init htab_initialize(void)
 			panic("%s: Failed to allocate %lu bytes max_addr=%pa\n",
 			      __func__, linear_map_hash_count, &ppc64_rma_size);
 	}
-#endif /* CONFIG_DEBUG_PAGEALLOC */
 
 	/* create bolted the linear mapping in the hash table */
 	for_each_mem_range(i, &base, &size) {
@@ -1944,6 +1937,8 @@ long hpte_insert_repeating(unsigned long hash, unsigned long vpn,
 }
 
 #ifdef CONFIG_DEBUG_PAGEALLOC
+static DEFINE_SPINLOCK(linear_map_hash_lock);
+
 static void kernel_map_linear_page(unsigned long vaddr, unsigned long lmi)
 {
 	unsigned long hash;
-- 
2.25.1
[PATCH 1/4] powerpc/64s: Add DEBUG_PAGEALLOC for radix
There is support for DEBUG_PAGEALLOC on hash but not on radix. Add support on radix.

Signed-off-by: Jordan Niethe
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 10 ++++++++++
 arch/powerpc/include/asm/book3s/64/hash.h    |  2 ++
 arch/powerpc/include/asm/book3s/64/pgtable.h | 19 +++++++++++++++++++
 arch/powerpc/include/asm/book3s/64/radix.h   |  2 ++
 arch/powerpc/include/asm/nohash/pgtable.h    | 10 ++++++++++
 arch/powerpc/include/asm/set_memory.h        |  2 ++
 arch/powerpc/mm/book3s64/hash_utils.c        |  2 +-
 arch/powerpc/mm/book3s64/radix_pgtable.c     | 26 ++++++++++++++++++++++++--
 arch/powerpc/mm/pageattr.c                   |  6 ++++++
 9 files changed, 76 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 83c65845a1a9..30533d409f7f 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -417,6 +417,16 @@ static inline unsigned long pte_pfn(pte_t pte)
 }
 
 /* Generic modifiers for PTE bits */
+static inline pte_t pte_mkabsent(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_PRESENT);
+}
+
+static inline pte_t pte_mkpresent(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_PRESENT);
+}
+
 static inline pte_t pte_wrprotect(pte_t pte)
 {
 	return __pte(pte_val(pte) & ~_PAGE_RW);
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index d959b0195ad9..f6171633cdc2 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -179,6 +179,8 @@ static inline unsigned long hash__pte_update(struct mm_struct *mm,
 	return old;
 }
 
+void hash__kernel_map_pages(struct page *page, int numpages, int enable);
+
 /* Set the dirty and/or accessed bits atomically in a linux PTE, this
  * function doesn't need to flush the hash entry
  */
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index a666d561b44d..b89482aed82a 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -651,6 +651,16 @@ static inline unsigned long pte_pfn(pte_t pte)
 }
 
 /* Generic modifiers for PTE bits */
+static inline pte_t pte_mkabsent(pte_t pte)
+{
+	return __pte_raw(pte_raw(pte) & cpu_to_be64(~_PAGE_PRESENT));
+}
+
+static inline pte_t pte_mkpresent(pte_t pte)
+{
+	return __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_PRESENT));
+}
+
 static inline pte_t pte_wrprotect(pte_t pte)
 {
 	if (unlikely(pte_savedwrite(pte)))
@@ -812,6 +822,15 @@ static inline bool check_pte_access(unsigned long access, unsigned long ptev)
  * Generic functions with hash/radix callbacks
  */
 
+#ifdef CONFIG_DEBUG_PAGEALLOC
+static inline void __kernel_map_pages(struct page *page, int numpages, int enable)
+{
+	if (radix_enabled())
+		radix__kernel_map_pages(page, numpages, enable);
+	hash__kernel_map_pages(page, numpages, enable);
+}
+#endif
+
 static inline void __ptep_set_access_flags(struct vm_area_struct *vma,
 					   pte_t *ptep, pte_t entry,
 					   unsigned long address,
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index 59cab558e2f0..d4fa28a77cc6 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -137,6 +137,8 @@ extern void radix__mark_rodata_ro(void);
 extern void radix__mark_initmem_nx(void);
 #endif
 
+void radix__kernel_map_pages(struct page *page, int numpages, int enable);
+
 extern void radix__ptep_set_access_flags(struct vm_area_struct *vma, pte_t *ptep,
 					 pte_t entry, unsigned long address,
 					 int psize);
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index ac75f4ab0dba..2a57bbb5820a 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -125,6 +125,16 @@ static inline unsigned long pte_pfn(pte_t pte)	{
 	return pte_val(pte) >> PTE_RPN_SHIFT; }
 
 /* Generic modifiers for PTE bits */
+static inline pte_t pte_mkabsent(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_PRESENT);
+}
+
+static inline pte_t pte_mkpresent(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_PRESENT);
+}
+
 static inline pte_t pte_exprotect(pte_t pte)
 {
 	return __pte(pte_val(pte) & ~_PAGE_EXEC);
diff --git a/arch/powerpc/include/asm/set_memory.h b/arch/powerpc/include/asm/set_memory.h
index b040094f7920..4b6dfaad4cc9 100644
--- a/arch/powerpc/include/asm/set_memory.h
+++ b/arch/powerpc/include/asm/set_memory.h
@@ -6,6 +6,8 @@
 #define SET_MEMORY_RW	1
 #define SET_MEMORY_NX	2
 #define SET_MEMORY_X	3
+#define SET_MEMORY_EN	4
+#define SET_MEMORY_DIS	5
 
 int
[PATCH 0/4] powerpc/64s: Enable KFENCE
This adds support for radix to Christophe's series that enabled KFENCE on powerpc/64s/hash:
https://lore.kernel.org/linuxppc-dev/8dfe1bd2abde26337c1d8c1ad0acfcc82185e0d5.1614868445.git.christophe.le...@csgroup.eu/

First implement DEBUG_PAGEALLOC for radix so KFENCE can reuse the same infrastructure.

This requires the "powerpc: Further Strict RWX support" series:
https://lore.kernel.org/linuxppc-dev/20210517032810.129949-1-jniet...@gmail.com/

Christophe Leroy (3):
  powerpc/64s: Remove unneeded #ifdef CONFIG_DEBUG_PAGEALLOC in
    hash_utils
  powerpc/64s: Allow double call of kernel_[un]map_linear_page()
  powerpc: Enable KFENCE on BOOK3S/64

Jordan Niethe (1):
  powerpc/64s: Add DEBUG_PAGEALLOC for radix

 arch/powerpc/Kconfig                         |  2 +-
 arch/powerpc/include/asm/book3s/32/pgtable.h | 10 +++++++++
 arch/powerpc/include/asm/book3s/64/hash.h    |  2 ++
 arch/powerpc/include/asm/book3s/64/pgtable.h | 19 ++++++++++++++++
 arch/powerpc/include/asm/book3s/64/radix.h   |  2 ++
 arch/powerpc/include/asm/kfence.h            | 19 ++++++++++++++++
 arch/powerpc/include/asm/nohash/pgtable.h    | 10 +++++++++
 arch/powerpc/include/asm/set_memory.h        |  2 ++
 arch/powerpc/mm/book3s64/hash_utils.c        | 31 ++++++++++----------
 arch/powerpc/mm/book3s64/radix_pgtable.c     | 28 +++++++++++++++---
 arch/powerpc/mm/pageattr.c                   |  6 ++++
 11 files changed, 113 insertions(+), 18 deletions(-)

-- 
2.25.1