Joel, On Tue, Sep 01 2020 at 21:29, Joel Fernandes wrote: > On Tue, Sep 01, 2020 at 10:02:10PM +0200, Thomas Gleixner wrote: >> The generated code for the CONFIG_PRETENT_HT_SECURE=n case is the same > > When you say 'pretend', did you mean 'make' ? The point of this patch is to > protect the kernel from the other hyperthread thus making HT secure for the > kernel contexts and not merely pretending.
I'm paranoid and don't trust HT at all. There is too much shared state. >> --- /dev/null >> +++ b/include/linux/pretend_ht_secure.h >> @@ -0,0 +1,21 @@ >> +#ifndef _LINUX_PRETEND_HT_SECURE_H >> +#define _LINUX_PRETEND_HT_SECURE_H >> + >> +#ifdef CONFIG_PRETEND_HT_SECURE >> +static inline void enter_from_user_ht_sucks(void) >> +{ >> + if (static_branch_unlikely(&pretend_ht_secure_key)) >> + enter_from_user_pretend_ht_is_secure(); >> +} >> + >> +static inline void exit_to_user_ht_sucks(void) >> +{ >> + if (static_branch_unlikely(&pretend_ht_secure_key)) >> + exit_to_user_pretend_ht_is_secure(); > > We already have similar config and static keys for the core-scheduling > feature itself. Can we just make it depend on that? Of course. This was just for illustration. :) > Or, are you saying users may want 'core scheduling' enabled but may want to > leave out the kernel protection? Core scheduling per se without all the protection muck, i.e. a relaxed version which tries to gang schedule threads of a process on a core if feasible has advantages to some workloads. >> @@ -111,6 +113,12 @@ static __always_inline void exit_to_user >> /* Workaround to allow gradual conversion of architecture code */ >> void __weak arch_do_signal(struct pt_regs *regs) { } >> >> +static inline unsigned long exit_to_user_get_work(void) >> +{ >> + exit_to_user_ht_sucks(); > > Ok, one issue with your patch is it does not take care of the waiting logic. > sched_core_unsafe_exit_wait() needs to be called *after* all of the > exit_to_user_mode_work is processed. This is because > sched_core_unsafe_exit_wait() also checks for any new exit-to-usermode-work > that popped up while it is spinning and breaks out of its spin-till-safe loop > early. This is key to solving the stop-machine issue. If the stopper needs to > run, then the need-resched flag will be set and we break out of the spin and > redo the whole exit_to_user_mode_loop() as it should. And where is the problem? syscall_entry() ... sys_foo() .... return 0; local_irq_disable(); exit_to_user_mode_prepare() ti_work = exit_to_user_get_work() { if (ht_muck) syscall_exit_ht_muck() { .... while (wait) { local_irq_enable(); while (wait) cpu_relax(); local_irq_disable(); } } return READ_ONCE(current_thread_info()->flags); } if (unlikely(ti_work & WORK)) ti_work = exit_loop(ti_work) while (ti_work & MASK) { local_irq_enable(); ..... local_irq_disable(); ti_work = exit_to_user_get_work() { See above } } It covers both the 'no work' and the 'do work' exit path. If that's not sufficient, then something is fundamentally wrong with your design. Thanks, tglx