Re: [PATCH RFC] x86/bus_lock: Enable bus lock detection

2020-07-30 Thread Thomas Gleixner
pet...@infradead.org writes:
> On Wed, Jul 29, 2020 at 08:40:57PM +, Fenghua Yu wrote:
>> Can we disable Bus Lock Detection before handling it and re-enable it
>> after handling it? Will that resolve the recursion issue?
>
> Because WRMSR is cheap, right?
>
> You have to unconditionally {dis,en}able it on #DB entry/exit. Not only
> when it's a DR_BUS_LOCK, _always_. Then maybe. I'm too tired to think
> through the IST mess.
>
> ISTs suck, they're horrible crap.
>
> Suppose we get a #DB, then we get an NMI right before it does WRMSR, so
> BUS_LOCK is still on, then the NMI does a dodgy LOCK op, we die.
>
> So that means, you get to disable it on every NMI-like exception too,
> but we happen to care about performance for those, you lose.
>
> Also, what happens if you have a hardware watchpoint on the instruction
> that causes DR_BUS_LOCK? Does that work as expected?

Q: Why on earth are Intel hardware folks cramming this into #DB?
A: Just because there was a bit left in DR6 to indicate it, right?

Q: Why can't hardware folks talk to us _before_ they make the x86 exception
   trainwreck even worse?
A: Just because they know that we'd tell them to go back to the drawing
   board.

Q: Is that going to be supported by the kernel?
A: No, go back to the drawing board and talk to us _before_ coming back
   with the next half thought out tinkerware cast in silicon.

I'm really tired of wasting time dealing with such misfeatures which create
more problems than they solve.

Thanks,

Thomas




Re: [PATCH RFC] x86/bus_lock: Enable bus lock detection

2020-07-29 Thread Sean Christopherson
On Wed, Jul 29, 2020 at 10:07:14PM +, Fenghua Yu wrote:
> Hi, Sean,
> 
> On Wed, Jul 29, 2020 at 01:39:05PM -0700, Sean Christopherson wrote:
> > On Wed, Jul 29, 2020 at 08:35:57PM +, Fenghua Yu wrote:
> > > If sld=fatal and bld=ratelimit (both sld and bld are enabled in hw),
> > > a split lock always generates #AC and kills the app, and bld will never have
> > > a chance to trigger #DB for the split lock. So effectively the combination
> > > makes the kernel take two different actions after detecting a bus lock: if
> > > the bus lock comes from a split lock, fatal (sld); if the bus lock comes from
> > > a lock to non-WB memory, ratelimit (bld). This does not seem to be a useful
> > > combination, and it is not what the user really wants because the user wants
> > > ratelimit for BLD, right?
> > 
> > I understood all of that.  And as a user I want to run sld=fatal and
> > bld=ratelimit to provide maximum protection, i.e. disallow split locks at
> > all times, and ratelimit the crud SLD #AC can't catch.
> 
> Then this will expand the current usage, and we do need two options. Let me work
> on adding a new "bus_lock_detect=" option as you suggested.

I'd wait for feedback from others before spending too much effort rewriting
everything; I'm just one person with an opinion.


Re: [PATCH RFC] x86/bus_lock: Enable bus lock detection

2020-07-29 Thread Fenghua Yu
Hi, Sean,

On Wed, Jul 29, 2020 at 01:39:05PM -0700, Sean Christopherson wrote:
> On Wed, Jul 29, 2020 at 08:35:57PM +, Fenghua Yu wrote:
> > If sld=fatal and bld=ratelimit (both sld and bld are enabled in hw),
> > a split lock always generates #AC and kills the app, and bld will never have
> > a chance to trigger #DB for the split lock. So effectively the combination
> > makes the kernel take two different actions after detecting a bus lock: if
> > the bus lock comes from a split lock, fatal (sld); if the bus lock comes from
> > a lock to non-WB memory, ratelimit (bld). This does not seem to be a useful
> > combination, and it is not what the user really wants because the user wants
> > ratelimit for BLD, right?
> 
> I understood all of that.  And as a user I want to run sld=fatal and
> bld=ratelimit to provide maximum protection, i.e. disallow split locks at
> all times, and ratelimit the crud SLD #AC can't catch.

Then this will expand the current usage, and we do need two options. Let me work
on adding a new "bus_lock_detect=" option as you suggested.

Thanks.

-Fenghua


Re: [PATCH RFC] x86/bus_lock: Enable bus lock detection

2020-07-29 Thread peterz
On Wed, Jul 29, 2020 at 08:40:57PM +, Fenghua Yu wrote:
> On Wed, Jul 29, 2020 at 10:49:47AM +0200, pet...@infradead.org wrote:
> > On Fri, Jul 17, 2020 at 02:35:00PM -0700, Fenghua Yu wrote:
> > 
> > > #DB for bus lock detect fixes all issues in #AC for split lock detect:
> > > 1) It's architectural ... just need to look at one CPUID bit to know it
> > >exists
> > > 2) The IA32_DEBUGCTL MSR, which reports bus lock in #DB, is per-thread.
> > >So each process or guest can have different behavior.
> > 
> > And it generates a whole new problem due to #DB being an IST, and

> > we very much rely on #DB never recursing, which we carefully crafted by
> > disallowing hardware breakpoints on noinstr code and clearing DR7 early.
> > 
> > But now it can... please keep the pieces.
> 
> Can we disable Bus Lock Detection before handling it and re-enable it
> after handling it? Will that resolve the recursion issue?

Because WRMSR is cheap, right?

You have to unconditionally {dis,en}able it on #DB entry/exit. Not only
when it's a DR_BUS_LOCK, _always_. Then maybe. I'm too tired to think
through the IST mess.

ISTs suck, they're horrible crap.

Suppose we get a #DB, then we get an NMI right before it does WRMSR, so
BUS_LOCK is still on, then the NMI does a dodgy LOCK op, we die.

So that means, you get to disable it on every NMI-like exception too,
but we happen to care about performance for those, you lose.
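
To make the window concrete, here is a minimal kernel-style sketch of the
proposed disable/re-enable (not from the patch; the IA32_DEBUGCTL bit name and
the handler shape are assumptions for illustration only):

  /* Assumed name for the enable bit in IA32_DEBUGCTL. */
  #define DEBUGCTLMSR_BUS_LOCK_DETECT	(1UL << 2)

  static __always_inline void bld_set(bool on)
  {
          u64 debugctl;

          rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
          if (on)
                  debugctl |= DEBUGCTLMSR_BUS_LOCK_DETECT;
          else
                  debugctl &= ~DEBUGCTLMSR_BUS_LOCK_DETECT;
          wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
  }

  static void sketch_exc_debug(struct pt_regs *regs, unsigned long dr6)
  {
          /*
           * Race: an NMI that lands before this WRMSR still runs with bus
           * lock detection armed; a locked access on that path recurses
           * into #DB on the IST stack.
           */
          bld_set(false);

          handle_debug(regs, dr6, user_mode(regs));

          bld_set(true);
  }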


Also, what happens if you have a hardware watchpoint on the instruction
that causes DR_BUS_LOCK? Does that work as expected?




Re: [PATCH RFC] x86/bus_lock: Enable bus lock detection

2020-07-29 Thread Fenghua Yu
Hi, Peter,

On Wed, Jul 29, 2020 at 10:49:47AM +0200, pet...@infradead.org wrote:
> On Fri, Jul 17, 2020 at 02:35:00PM -0700, Fenghua Yu wrote:
> 
> > #DB for bus lock detect fixes all issues in #AC for split lock detect:
> > 1) It's architectural ... just need to look at one CPUID bit to know it
> >exists
> > 2) The IA32_DEBUGCTL MSR, which reports bus lock in #DB, is per-thread.
> >So each process or guest can have different behavior.
> 
> And it generates a whole new problem due to #DB being an IST, and
> 
> > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> > index b038695f36c5..58725567da39 100644
> > --- a/arch/x86/kernel/traps.c
> > +++ b/arch/x86/kernel/traps.c
> > @@ -812,6 +812,16 @@ static void handle_debug(struct pt_regs *regs, unsigned long dr6, bool user)
> >         if (!user && !dr6)
> >                 return;
> >  
> > +       /* Handle bus lock. */
> > +       if (!(dr6 & DR_BUS_LOCK)) {
> > +               cond_local_irq_enable(regs);
> > +               if (user)
> > +                       handle_user_bus_lock(regs);
> > +               else
> > +                       handle_kernel_bus_lock(regs);
> > +               goto out;
> > +       }
> > +
> >         /*
> >          * If dr6 has no reason to give us about the origin of this trap,
> >          * then it's very likely the result of an icebp/int01 trap.
> 
> we very much rely on #DB never recursing, which we carefully crafted by
> disallowing hardware breakpoints on noinstr code and clearing DR7 early.
> 
> But now it can... please keep the pieces.

Can we disable Bus Lock Detection before handling it and re-enable it
after handling it? Will that resolve the recursion issue?

Thanks.

-Fenghua


Re: [PATCH RFC] x86/bus_lock: Enable bus lock detection

2020-07-29 Thread Sean Christopherson
On Wed, Jul 29, 2020 at 08:35:57PM +, Fenghua Yu wrote:
> Hi, Sean,
> 
> On Wed, Jul 29, 2020 at 01:00:33PM -0700, Sean Christopherson wrote:
> > On Wed, Jul 29, 2020 at 07:42:59PM +, Fenghua Yu wrote:
> > > > Smushing the two into a single option is confusing, e.g. from the table
> > > > below it's not at all clear what will happen if sld=fatal, both features
> > > > are supported, and the kernel generates a split lock.
> > > > 
> > > > Given that both SLD (per-core, not architectural) and BLD (#DB recursion and
> > > > inverted DR6 flag) have warts, it would be very nice to enable/disable them
> > > > independently.  The lock to non-WB behavior for BLD may also be problematic,
> > > > e.g. maybe it turns out that fixing drivers to avoid locks to non-WB isn't
> > > > as straightforward as avoiding split locks.
> > > 
> > > But the two features are related if both of them are enabled in hardware:
> > > If a split lock happens, SLD will generate #AC before instruction execution
> > > and BLD will generate #DB after instruction execution.
> > > 
> > > The software needs to make them exclusive. The same kernel option reflects
> > > the relationship and makes them exclusive, e.g. "fatal" enables SLD and
> > > disables BLD, and "warn" does it the other way around.
> > 
> > Why do they need to be exclusive?  We've already established that BLD catches
> > things that SLD does not.  What's wrong with running sld=fatal and bld=ratelimit
> > so that split locks never happen and kill applications, and non-WB locks are
> > ratelimited?
> 
> Sorry if I didn't explain bus lock and split lock detections clearly before.
> 
> There are two causes of bus locks:
> 1. a locked access across cache line boundary: this is split lock.
> 2. a locked access to non-WB memory.
> 
> BLD detects both causes and SLD only detects the first one, i.e. BLD can detect
> both split lock AND lock to non-WB memory.
> 
> If sld=fatal and bld=ratelimit (both sld and bld are enabled in hw),
> a split lock always generates #AC and kills the app, and bld will never have
> a chance to trigger #DB for the split lock. So effectively the combination
> makes the kernel take two different actions after detecting a bus lock: if
> the bus lock comes from a split lock, fatal (sld); if the bus lock comes from
> a lock to non-WB memory, ratelimit (bld). This does not seem to be a useful
> combination, and it is not what the user really wants because the user wants
> ratelimit for BLD, right?

I understood all of that.  And as a user I want to run sld=fatal and
bld=ratelimit to provide maximum protection, i.e. disallow split locks at
all times, and ratelimit the crud SLD #AC can't catch.


Re: [PATCH RFC] x86/bus_lock: Enable bus lock detection

2020-07-29 Thread Fenghua Yu
Hi, Sean,

On Wed, Jul 29, 2020 at 01:00:33PM -0700, Sean Christopherson wrote:
> On Wed, Jul 29, 2020 at 07:42:59PM +, Fenghua Yu wrote:
> > > Smushing the two into a single option is confusing, e.g. from the table
> > > below it's not at all clear what will happen if sld=fatal, both features
> > > are supported, and the kernel generates a split lock.
> > > 
> > Given that both SLD (per-core, not architectural) and BLD (#DB recursion and
> > inverted DR6 flag) have warts, it would be very nice to enable/disable them
> > independently.  The lock to non-WB behavior for BLD may also be problematic,
> > e.g. maybe it turns out that fixing drivers to avoid locks to non-WB isn't
> > as straightforward as avoiding split locks.
> > 
> > But the two features are related if both of them are enabled in hardware:
> > If a split lock happens, SLD will generate #AC before instruction execution
> > and BLD will generate #DB after instruction execution.
> > 
> > The software needs to make them exclusive. The same kernel option reflects
> > the relationship and makes them exclusive, e.g. "fatal" enables SLD and
> > disables BLD, and "warn" does it the other way around.
> 
> Why do they need to be exclusive?  We've already established that BLD catches
> things that SLD does not.  What's wrong with running sld=fatal and bld=ratelimit
> so that split locks never happen and kill applications, and non-WB locks are
> ratelimited?

Sorry if I didn't explain bus lock and split lock detections clearly before.

There are two causes of bus locks:
1. a locked access across cache line boundary: this is split lock.
2. a locked access to non-WB memory.

BLD detects both causes and SLD only detects the first one, i.e. BLD can detect
both split lock AND lock to non-WB memory.
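
For reference, cause 1 is easy to reproduce from userspace; a minimal example
(illustration only, not from the patch) that performs a LOCK-prefixed RMW
straddling a cache line boundary:

  #include <stdint.h>
  #include <stdlib.h>

  int main(void)
  {
          /* One cache line is 64 bytes on these parts. */
          char *buf = aligned_alloc(64, 128);
          /* A 4-byte operand at offset 62 crosses the line boundary
           * (deliberately misaligned, purely for illustration). */
          uint32_t *p = (uint32_t *)(buf + 62);

          /* GCC emits a LOCK-prefixed RMW regardless of alignment: a split lock. */
          __sync_fetch_and_add(p, 1);

          free(buf);
          return 0;
  }

Cause 2 (a locked access to non-WB, e.g. UC, memory) needs a mapping of device
or otherwise uncached memory and is not shown here.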

If sld=fatal and bld=ratelimit (both sld and bld are enabled in hw),
a split lock always generates #AC and kills the app, and bld will never have
a chance to trigger #DB for the split lock. So effectively the combination
makes the kernel take two different actions after detecting a bus lock: if
the bus lock comes from a split lock, fatal (sld); if the bus lock comes from
a lock to non-WB memory, ratelimit (bld). This does not seem to be a useful
combination, and it is not what the user really wants because the user wants
ratelimit for BLD, right?

> > If using two different kernel options, the user needs to give the right options
> > to make both work, e.g. can the user give this combination
> > "split_lock_detect=fatal bus_lock_detect=warn"? What does the combination
> > mean?
> 
> Split locks are fatal, non-WB locks are logged but not fatal.

Similarly here: a bus lock from a split lock is fatal (sld triggers #AC) and a
bus lock from a lock to non-WB memory is warn (bld triggers #DB). That does not
seem to be what the user really wants, right?

Thanks.

-Fenghua


Re: [PATCH RFC] x86/bus_lock: Enable bus lock detection

2020-07-29 Thread peterz
On Wed, Jul 29, 2020 at 01:00:33PM -0700, Sean Christopherson wrote:
> Why do they need to be exclusive?  We've already established that BLD catches
> things that SLD does not.  What's wrong with running sld=fatal and bld=ratelimit
> so that split locks never happen and kill applications, and non-WB locks are
> ratelimited?

It's all moot until there's a sane proposal for #DB that isn't utterly
wrecked.


Re: [PATCH RFC] x86/bus_lock: Enable bus lock detection

2020-07-29 Thread Sean Christopherson
On Wed, Jul 29, 2020 at 07:42:59PM +, Fenghua Yu wrote:
> > Smushing the two into a single option is confusing, e.g. from the table
> > below it's not at all clear what will happen if sld=fatal, both features
> > are supported, and the kernel generates a split lock.
> > 
> > Given that both SLD (per-core, not architectural) and BLD (#DB recursion and
> > inverted DR6 flag) have warts, it would be very nice to enable/disable them
> > independently.  The lock to non-WB behavior for BLD may also be problematic,
> > e.g. maybe it turns out that fixing drivers to avoid locks to non-WB isn't
> > as straightforward as avoiding split locks.
> 
> But the two features are related if both of them are enabled in hardware:
> If a split lock happens, SLD will generate #AC before instruction execution
> and BLD will generate #DB after instruction execution.
> 
> The software needs to make them exclusive. The same kernel option reflects
> the relationship and makes them exclusive, e.g. "fatal" enables SLD and
> disables BLD, and "warn" does it the other way around.

Why do they need to be exclusive?  We've already established that BLD catches
things that SLD does not.  What's wrong with running sld=fatal and bld=ratelimit
so that split locks never happen and kill applications, and non-WB locks are
ratelimited?

Sure, sld==warn with bld!=off is a bit silly, but the kernel can easily handle
that particular case.

> If using two different kernel options, the user needs to give the right options
> to make both work, e.g. can the user give this combination
> "split_lock_detect=fatal bus_lock_detect=warn"? What does the combination
> mean?

Split locks are fatal, non-WB locks are logged but not fatal.

> There could be many combinations of the two options, some of them
> are meaningful and some of them aren't. Maintaining the combinations is
unnecessarily complex, right?

Honestly, it seems less complex than deciphering the resulting behavior from
that table.

  sld=off|warn|fatal
  bld=off|warn|ratelimit

As above, sld then could become

  if (sld == warn && bld != off) {
          pr_warn("disabling SLD in favor of BLD\n");
          sld = off;
  }

Everything else should simply work.  The necessary refactoring for SLD should
be minimal as well.
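
Fleshed out a little (a rough sketch only, not the actual patch; the option
names, enum values and the __setup-based parsing are assumptions), the two
options could stay completely independent, with the one silly combination
resolved at boot:

  static enum { sld_off, sld_warn, sld_fatal } sld_state = sld_off;
  static enum { bld_off, bld_warn, bld_ratelimit } bld_state = bld_off;

  static int __init split_lock_detect_setup(char *arg)
  {
          if (!strcmp(arg, "warn"))
                  sld_state = sld_warn;
          else if (!strcmp(arg, "fatal"))
                  sld_state = sld_fatal;
          else
                  sld_state = sld_off;
          return 1;
  }
  __setup("split_lock_detect=", split_lock_detect_setup);

  static int __init bus_lock_detect_setup(char *arg)
  {
          if (!strcmp(arg, "warn"))
                  bld_state = bld_warn;
          else if (!strcmp(arg, "ratelimit"))
                  bld_state = bld_ratelimit;
          else
                  bld_state = bld_off;
          return 1;
  }
  __setup("bus_lock_detect=", bus_lock_detect_setup);

  static void __init sld_bld_resolve(void)
  {
          /* The one combination called out above as silly: defer to BLD. */
          if (sld_state == sld_warn && bld_state != bld_off) {
                  pr_warn("disabling SLD in favor of BLD\n");
                  sld_state = sld_off;
          }
  }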


Re: [PATCH RFC] x86/bus_lock: Enable bus lock detection

2020-07-29 Thread Fenghua Yu
Hi, Sean,

On Wed, Jul 29, 2020 at 11:46:14AM -0700, Sean Christopherson wrote:
> On Wed, Jul 29, 2020 at 11:09:16AM -0700, Yu, Fenghua wrote:
> > > > Some CPUs have ability to notify the kernel by an #DB trap after the
> > > > instruction acquires a bus lock and is executed. This allows the
> > > > kernel to enforce user application throttling or mitigations and also
> > > > provides a better environment to debug kernel split lock issues since
> > > > the kernel can continue instead of crashing.
> > > >
> > > > #DB for bus lock detect fixes all issues in #AC for split lock detect:
> > > 
> > > Fixes "all" issues... and creates some new ones, e.g. there are use cases
> > > where preventing the split lock from happening in the first place is 
> > > strongly
> > > desired.  It's why that train wreck exists.
> > 
> > Bus Lock Detection doesn't replace Split Lock Detection. If both features
> > are enabled, default behavior is warning from bus lock, fatal behavior is
> > still from split lock. See the behavior table as follows.
> > 
> > > 
> > > > 1) It's architectural ... just need to look at one CPUID bit to know it
> > > >exists
> > > > 2) The IA32_DEBUGCTL MSR, which reports bus lock in #DB, is per-thread.
> > > >So each process or guest can have different behavior.
> > > > 3) It has support for VMM/guests (new VMEXIT codes, etc).
> > > >
> > > > Use the existing kernel command line option "split_lock_detect=" to
> > > > handle #DB for bus lock:
> > > 
> > > Are SLD and BLD mutually exclusive?  Can we even guarantee that given the
> > > track record of SLD?  If not, we'll likely want to allow the user to choose
> > > between SLD and BLD via split_lock_detect.
> > 
> > The two hardware features can be enabled on the same platform.
> > But they are mutually exclusive in the kernel because #AC from SLD happens
> > before the instruction is executed and #DB happens after the instruction is
> > executed.
> > 
> > Right now, if both of them are enabled, "warn" behavior goes to
> > bus lock and "fatal" behavior goes to split lock.
> > 
> > Do you want the user to override the behaviors by something like this?
> > 
> > split_lock_detect=warn[,sld]: if given "sld" while both features are enabled,
> > warn behavior is from split lock instead of bus lock detection.
> > 
> > split_lock_detect=fatal[,bld]: if given "bld" while both features are enabled,
> > fatal behavior is from bus lock detection.
> 
> IMO these should be completely independent features (that happen to share
> some code).
> 
> BLD in fatal mode doesn't make any sense because it can't be fatal without
> a completely different implementation, e.g. the bus lock has already
> happened and the application can eat the SIGBUS.  The current SLD code
> works because the split lock is prevented entirely, i.e. eating SIGBUS
> doesn't allow the application to make forward progress.

If "fatal" is meaningless for BLD, we can remove it for BLD. If we need
it in the future, we can add it later.

The "ratelimit:N" maybe more useful for BLD: it mitigates DOS from bus locks.

> 
> Smushing the two into a single option is confusing, e.g. from the table
> below it's not at all clear what will happen if sld=fatal, both features
> are supported, and the kernel generates a split lock.
> 
> Given that both SLD (per-core, not architectural) and BLD (#DB recursion and
> inverted DR6 flag) have warts, it would be very nice to enable/disable them
> independently.  The lock to non-WB behavior for BLD may also be problematic,
> e.g. maybe it turns out that fixing drivers to avoid locks to non-WB isn't
> as straightforward as avoiding split locks.

But the two features are related if both of them are enabled in hardware:
If a split lock happens, SLD will generate #AC before instruction execution
and BLD will generate #DB after instruction execution.

The software needs to make them exclusive. The same kernel option reflects
the relationship and makes them exclusive, e.g. "fatal" enables SLD and
disables BLD, and "warn" does it the other way around.

If using two different kernel options, the user needs to give the right options
to make both work, e.g. can the user give this combination
"split_lock_detect=fatal bus_lock_detect=warn"? What does the combination
mean? There could be many combinations of the two options, some of them
are meaningful and some of them aren't. Maintaining the combinations is
unnecessarily complex, right?

Thanks.

-Fenghua


Re: [PATCH RFC] x86/bus_lock: Enable bus lock detection

2020-07-29 Thread Sean Christopherson
On Wed, Jul 29, 2020 at 11:09:16AM -0700, Yu, Fenghua wrote:
> > > Some CPUs have ability to notify the kernel by an #DB trap after the
> > > instruction acquires a bus lock and is executed. This allows the
> > > kernel to enforce user application throttling or mitigations and also
> > > provides a better environment to debug kernel split lock issues since
> > > the kernel can continue instead of crashing.
> > >
> > > #DB for bus lock detect fixes all issues in #AC for split lock detect:
> > 
> > Fixes "all" issues... and creates some new ones, e.g. there are use cases
> > where preventing the split lock from happening in the first place is strongly
> > desired.  It's why that train wreck exists.
> 
> Bus Lock Detection doesn't replace Split Lock Detection. If both features
> are enabled, default behavior is warning from bus lock, fatal behavior is
> still from split lock. See the behavior table as follows.
> 
> > 
> > > 1) It's architectural ... just need to look at one CPUID bit to know it
> > >exists
> > > 2) The IA32_DEBUGCTL MSR, which reports bus lock in #DB, is per-thread.
> > >So each process or guest can have different behavior.
> > > 3) It has support for VMM/guests (new VMEXIT codes, etc).
> > >
> > > Use the existing kernel command line option "split_lock_detect=" to
> > > handle #DB for bus lock:
> > 
> > Are SLD and BLD mutually exclusive?  Can we even guarantee that given the
> > track record of SLD?  If not, we'll likely want to allow the user to choose
> > between SLD and BLD via split_lock_detect.
> 
> The two hardware features can be enabled on the same platform.
> But they are mutually exclusive in the kernel because #AC from SLD happens
> before the instruction is executed and #DB happens after the instruction is
> executed.
> 
> Right now, if both of them are enabled, "warn" behavior goes to
> bus lock and "fatal" behavior goes to split lock.
> 
> Do you want the user to override the behaviors by something like this?
> 
> split_lock_detect=warn[,sld]: if given "sld" while both features are enabled,
> warn behavior is from split lock instead of bus lock detection.
> 
> split_lock_detect=fatal[,bld]: if given "bld" while both features are enabled,
> fatal behavior is from bus lock detection.

IMO these should be completely independent features (that happen to share
some code).

BLD in fatal mode doesn't make any sense because it can't be fatal without
a completely different implementation, e.g. the bus lock has already
happened and the application can eat the SIGBUS.  The current SLD code
works because the split lock is prevented entirely, i.e. eating SIGBUS
doesn't allow the application to make forward progress.
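
A small userspace illustration of that point (assumption: a machine booted
with split_lock_detect=fatal; this is not from the thread, and it reuses the
misaligned-atomic construction shown earlier in this digest):

  #include <signal.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <stdlib.h>

  static void on_sigbus(int sig)
  {
          /* Returning retries the same split-locked instruction. */
          (void)sig;
  }

  int main(void)
  {
          char *buf = aligned_alloc(64, 128);
          uint32_t *p = (uint32_t *)(buf + 62);   /* straddles a cache line */

          signal(SIGBUS, on_sigbus);

          /*
           * #AC is delivered before the access executes, so under sld=fatal
           * this faults, the handler returns, the instruction re-executes and
           * faults again: the program never gets past this line.
           */
          __sync_fetch_and_add(p, 1);

          puts("made forward progress");  /* unreachable in fatal mode */
          return 0;
  }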

Smushing the two into a single option is confusing, e.g. from the table
below it's not at all clear what will happen if sld=fatal, both features
are supported, and the kernel generates a split lock.

Given that both SLD (per-core, not architectural) and BLD (#DB recursion and
inverted DR6 flag) have warts, it would be very nice to enable/disable them
independently.  The lock to non-WB behavior for BLD may also be problematic,
e.g. maybe it turns out that fixing drivers to avoid locks to non-WB isn't
as straightforward as avoiding split locks.

> > >  /*
> > >   * Default to sld_off because most systems do not support split lock detection
> > > - * split_lock_setup() will switch this to sld_warn on systems that support
> > > - * split lock detect, unless there is a command line override.
> > > + * sld_state_setup() will switch this to sld_warn on systems that support
> > > + * split lock/bus lock detect, unless there is a command line override.
> > >   */
> > >  static enum split_lock_detect_state sld_state __ro_after_init = sld_off;
> > >  static u64 msr_test_ctrl_cache __ro_after_init;
> > > +/* Split lock detection is enabled if it's true. */
> > > +static bool sld;
> > > +/* Bus lock detection is enabled if it's true. */
> > > +static bool bld;
> > 
> > Why can't these be tracked/reflected in X86_FEATURE_*?
> 
sld and bld are enabled depending on the kernel parameter "split_lock_detect=".
> They are not static and cannot be tracked by static X86_FEATURE_*.

X86_FEATURE_* flags aren't static; the kernel sets/clears them all over the
place.
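
For example (sketch only; the feature bit name below is an assumption for the
CPUID bit the patch would add):

  #include <asm/cpufeature.h>

  /*
   * Clear the cap when the command line disables the feature, so the rest of
   * the kernel can test boot_cpu_has() instead of a private bool.
   */
  static void __init bld_apply_cmdline(bool enable)
  {
          if (!enable)
                  setup_clear_cpu_cap(X86_FEATURE_BUS_LOCK_DETECT);
  }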


RE: [PATCH RFC] x86/bus_lock: Enable bus lock detection

2020-07-29 Thread Yu, Fenghua
Hi, Sean,

> > A bus lock [1] is acquired either through split locked access to
> > writeback (WB) memory or by using locks to uncacheable (UC) memory
> > (e.g. direct device
> 
> Does SLD not detect the lock to UC memory?

The statement might not be accurate: Split Lock Detection doesn't detect a bus
lock to non-WB memory (including UC memory).

> 
> > assignment). This is typically >1000 cycles slower than an atomic
> > operation within a cache line. It also disrupts performance on other cores.
> >
> > Although split lock can be detected by #AC trap, the trap is triggered
> > before the instruction acquires bus lock. This makes it difficult to
> > mitigate bus lock (e.g. throttle the user application).
> 
> Mitigate _in a non-fatal way_.  The #AC makes it very easy to mitigate split
> locks; it just has the side effect of SIGBUS or killing the KVM guest.

Adding "in a non-fatal way" is more better. Will fix it.

> 
> > Some CPUs have ability to notify the kernel by an #DB trap after the
> > instruction acquires a bus lock and is executed. This allows the
> > kernel to enforce user application throttling or mitigations and also
> > provides a better environment to debug kernel split lock issues since
> > the kernel can continue instead of crashing.
> >
> > #DB for bus lock detect fixes all issues in #AC for split lock detect:
> 
> Fixes "all" issues... and creates some new ones, e.g. there are use cases
> where preventing the split lock from happening in the first place is strongly
> desired.  It's why that train wreck exists.

Bus Lock Detection doesn't replace Split Lock Detection. If both features
are enabled, default behavior is warning from bus lock, fatal behavior is
still from split lock. See the behavior table as follows.

> 
> > 1) It's architectural ... just need to look at one CPUID bit to know it
> >exists
> > 2) The IA32_DEBUGCTL MSR, which reports bus lock in #DB, is per-thread.
> >So each process or guest can have different behavior.
> > 3) It has support for VMM/guests (new VMEXIT codes, etc).
> >
> > Use the existing kernel command line option "split_lock_detect=" to
> > handle #DB for bus lock:
> 
> Are SLD and BLD mutually exclusive?  Can we even guarantee that given the
> track record of SLD?  If not, we'll likely want to allow the user to choose
> > between SLD and BLD via split_lock_detect.

The two hardware features can be enabled on the same platform.
But they are mutually exclusive in the kernel because #AC from SLD happens
before the instruction is executed and #DB happens after the instruction is
executed.

Right now, if both of them are enabled, "warn" behavior goes to
bus lock and "fatal" behavior goes to split lock.

Do you want the user to override the behaviors by something like this?

split_lock_detect=warn[,sld]: if given "sld" while both features are enabled,
warn behavior is from split lock instead of bus lock detection.

split_lock_detect=fatal[,bld]: if given "bld" while both features are enabled,
fatal behavior is from bus lock detection.

> 
> > split_lock_detect=
> >               #AC for split lock         #DB for bus lock
> >
> > off           Do nothing                 Do nothing
> >
> > warn          Kernel OOPs                Kernel warns rate limited
> >               Warn once per task and     and continues to run.
> >               disable future checking    Warn once per task and
> >               and continue to run.
> >                                          When both features are
> >                                          supported, warn in #DB
> >
> > fatal         Kernel OOPs                Kernel warn rate limited
> 
> Unless the lock to UC #DB is new behavior, why would we revert to allowing
> split locks in the kernel?

SLD cannot detect a lock to non-WB memory (including UC). BLD can detect bus
locks from both split locks and locks to non-WB memory.

> 
> >               Send SIGBUS to user        Send SIGBUS to user
> >                                          When both features are
> >                                          supported, fatal in #AC.
> >
> > ratelimit:N   Do nothing                 Kernel warns rate limited
> 
> This should be more than "Do nothing" for #AC, e.g. fall back to warn or at
> least print a loud error.

Ok. Will fall back to warn.

> 
> >                                          and continue to run.
> >                                          Limit bus lock rate to
> >                                          N per second in the
> >                                          current non root user.
> >
> > On systems that support #DB for bus lock detection the default is "warn".
> >
> > [1] Chapter 8
> > https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> >
> > Signed-off-by: Fenghua Yu 
> > Reviewed-by: Tony Luck 
> > ---
> >  .../admin-guide/kernel-parameters.txt |  48 +-
> >  

Re: [PATCH RFC] x86/bus_lock: Enable bus lock detection

2020-07-29 Thread peterz
On Tue, Jul 28, 2020 at 08:02:32PM -0700, Sean Christopherson wrote:
> Maybe it's just me, but it'd be nice to break this into multiple patches
> so that the SLD refactoring is separate from the introduction of BLD.  As
> is, I find it hard to review as I can't easily distinguish refactoring from
> new functionality.

This. Absolutely 100% this. It's like Fenghua has never sent a patch
before...


Re: [PATCH RFC] x86/bus_lock: Enable bus lock detection

2020-07-29 Thread peterz
On Fri, Jul 17, 2020 at 02:35:00PM -0700, Fenghua Yu wrote:

> #DB for bus lock detect fixes all issues in #AC for split lock detect:
> 1) It's architectural ... just need to look at one CPUID bit to know it
>exists
> 2) The IA32_DEBUGCTL MSR, which reports bus lock in #DB, is per-thread.
>So each process or guest can have different behavior.

And it generates a whole new problem due to #DB being an IST, and

> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index b038695f36c5..58725567da39 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -812,6 +812,16 @@ static void handle_debug(struct pt_regs *regs, unsigned long dr6, bool user)
>         if (!user && !dr6)
>                 return;
>  
> +       /* Handle bus lock. */
> +       if (!(dr6 & DR_BUS_LOCK)) {
> +               cond_local_irq_enable(regs);
> +               if (user)
> +                       handle_user_bus_lock(regs);
> +               else
> +                       handle_kernel_bus_lock(regs);
> +               goto out;
> +       }
> +
>         /*
>          * If dr6 has no reason to give us about the origin of this trap,
>          * then it's very likely the result of an icebp/int01 trap.

we very much rely on #DB never recursing, which we carefully crafted by
disallowing hardware breakpoints on noinstr code and clearing DR7 early.

But now it can... please keep the pieces.


Re: [PATCH RFC] x86/bus_lock: Enable bus lock detection

2020-07-28 Thread Sean Christopherson
On Fri, Jul 17, 2020 at 02:35:00PM -0700, Fenghua Yu wrote:
> A bus lock [1] is acquired either through split locked access to writeback (WB)
> memory or by using locks to uncacheable (UC) memory (e.g. direct device

Does SLD not detect the lock to UC memory?

> assignment). This is typically >1000 cycles slower than an atomic operation
> within a cache line. It also disrupts performance on other cores.
> 
> Although split lock can be detected by #AC trap, the trap is triggered
> before the instruction acquires bus lock. This makes it difficult to
> mitigate bus lock (e.g. throttle the user application).

Mitigate _in a non-fatal way_.  The #AC makes it very easy to mitigate split
locks; it just has the side effect of SIGBUS or killing the KVM guest.

> Some CPUs have the ability to notify the kernel by a #DB trap after the
> instruction acquires a bus lock and is executed. This allows the kernel
> to enforce user application throttling or mitigations and also provides
> a better environment to debug kernel split lock issues since the kernel
> can continue instead of crashing.
> 
> #DB for bus lock detect fixes all issues in #AC for split lock detect:

Fixes "all" issues... and creates some new ones, e.g. there are use cases
where preventing the split lock from happening in the first place is
strongly desired.  It's why that train wreck exists.

> 1) It's architectural ... just need to look at one CPUID bit to know it
>exists
> 2) The IA32_DEBUGCTL MSR, which reports bus lock in #DB, is per-thread.
>So each process or guest can have different behavior.
> 3) It has support for VMM/guests (new VMEXIT codes, etc).
> 
> Use the existing kernel command line option "split_lock_detect=" to handle
> #DB for bus lock:

Are SLD and BLD mutually exclusive?  Can we even guarantee that given the
track record of SLD?  If not, we'll likely want to allow the user to choose
between SLD and BLD via split_lock_detect.

> split_lock_detect=
>               #AC for split lock         #DB for bus lock
>
> off           Do nothing                 Do nothing
>
> warn          Kernel OOPs                Kernel warns rate limited
>               Warn once per task and     and continues to run.
>               disable future checking    Warn once per task and
>               and continue to run.
>                                          When both features are
>                                          supported, warn in #DB
>
> fatal         Kernel OOPs                Kernel warn rate limited

Unless the lock to UC #DB is new behavior, why would we revert to allowing
split locks in the kernel?

>               Send SIGBUS to user        Send SIGBUS to user
>                                          When both features are
>                                          supported, fatal in #AC.
>
> ratelimit:N   Do nothing                 Kernel warns rate limited

This should be more than "Do nothing" for #AC, e.g. fall back to warn or
at least print a loud error.

>                                          and continue to run.
>                                          Limit bus lock rate to
>                                          N per second in the
>                                          current non root user.
> 
> On systems that support #DB for bus lock detection the default is "warn".
> 
> [1] Chapter 8 
> https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> 
> Signed-off-by: Fenghua Yu 
> Reviewed-by: Tony Luck 
> ---
>  .../admin-guide/kernel-parameters.txt |  48 +-
>  arch/x86/include/asm/cpu.h|  16 +-
>  arch/x86/include/asm/cpufeatures.h|   1 +
>  arch/x86/include/asm/msr-index.h  |   1 +
>  arch/x86/include/uapi/asm/debugreg.h  |   3 +-
>  arch/x86/kernel/cpu/common.c  |   2 +-
>  arch/x86/kernel/cpu/intel.c   | 156 +++---
>  arch/x86/kernel/traps.c   |  10 ++
>  include/linux/sched/user.h|   4 +-
>  kernel/user.c |   7 +
>  10 files changed, 214 insertions(+), 34 deletions(-)

Maybe it's just me, but it'd be nice to break this into multiple patches
so that the SLD refactoring is separate from the introduction of BLD.  As
is, I find it hard to review as I can't easily distinguish refactoring from
new functionality.

> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index fb95fad81c79..7a1cb6fe8b8e 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4816,27 +4816,59 @@
>   spia_peddr=
>  
>   split_lock_detect=
> - [X86] Enable split lock detection
> + [X86] Enable split lock detection or bus lock detection
>  
>