Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-12 Thread peterz
On Wed, Aug 12, 2020 at 10:18:32AM +0200, pet...@infradead.org wrote:
> >  trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
> >  rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
> >  trace_irq_enable_rcuidle+0x87/0x120 
> > include/trace/events/preemptirq.h:40
> >  trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
> >  rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
> >  trace_irq_enable_rcuidle+0x87/0x120 
> > include/trace/events/preemptirq.h:40
> >  trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
> >  rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
> >  trace_irq_enable_rcuidle+0x87/0x120 
> > include/trace/events/preemptirq.h:40
> >  trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
> >  rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
> > 
> > <... repeated many many times ...>
> > 
> >  trace_irq_enable_rcuidle+0x87/0x120 
> > include/trace/events/preemptirq.h:40
> >  trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
> >  rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
> > Lost 500 message(s)!
> > BUG: stack guard page was hit at cab483ba (stack is 
> > b1442365..c26f9ad3)
> > BUG: stack guard page was hit at 318ff8d8 (stack is 
> > fd87d656..58100136)
> > ---[ end trace 4157e0bb4a65941a ]---
> 
> Wheee... recursion! Let me try and see if I can make something of that.

All that's needed is enabling the preemptirq tracepoints. Lemme go fix.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-12 Thread peterz
On Wed, Aug 12, 2020 at 10:06:50AM +0200, Marco Elver wrote:
> On Tue, Aug 11, 2020 at 10:17PM +0200, pet...@infradead.org wrote:
> > On Tue, Aug 11, 2020 at 11:46:51AM +0200, pet...@infradead.org wrote:
> > 
> > > So let me once again see if I can't find a better solution for this all.
> > > Clearly it needs one :/
> > 
> > So the below boots without triggering the debug code from Marco -- it
> > should allow nesting local_irq_save/restore under raw_local_irq_*().
> > 
> > I tried unconditional counting, but there's some _reallly_ wonky /
> > asymmetric code that wrecks that and I've not been able to come up with
> > anything useful.
> > 
> > This one starts counting when local_irq_save() finds it didn't disable
> > IRQs while lockdep though it did. At that point, local_irq_restore()
> > will decrement and enable things again when it reaches 0.
> > 
> > This assumes local_irq_save()/local_irq_restore() are nested sane, which
> > is mostly true.
> > 
> > This leaves #PF, which I fixed in these other patches, but I realized it
> > needs fixing for all architectures :-( No bright ideas there yet.
> > 
> > ---
> >  arch/x86/entry/thunk_32.S   |  5 
> >  include/linux/irqflags.h| 45 +++-
> >  init/main.c | 16 
> >  kernel/locking/lockdep.c| 58 
> > +
> >  kernel/trace/trace_preemptirq.c | 33 +++
> >  5 files changed, 134 insertions(+), 23 deletions(-)
> 
> Testing this again with syzkaller produced some new reports:
> 
>   BUG: stack guard page was hit in error_entry
>   BUG: stack guard page was hit in exc_int3
>   PANIC: double fault in error_entry
>   PANIC: double fault in exc_int3
> 
> Most of them have corrupted reports, but this one might be useful:
> 
>   BUG: stack guard page was hit at 1fab0982 (stack is 
> 063f33dc..bf04b0d8)
>   BUG: stack guard page was hit at ca97ac69 (stack is 
> af3e6c84..1597e1bf)
>   kernel stack overflow (double-fault):  [#1] PREEMPT SMP
>   CPU: 1 PID: 4709 Comm: kworker/1:1H Not tainted 5.8.0+ #5
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 
> 04/01/2014
>   Workqueue: events_highpri snd_vmidi_output_work
>   RIP: 0010:exc_int3+0x5/0xf0 arch/x86/kernel/traps.c:636
>   Code: c9 85 4d 89 e8 31 c0 e8 a9 7d 68 fd e9 90 fe ff ff e8 0f 35 00 00 
> 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 55 53 48 89 fb  76 0e 00 00 
> 85 c0 74 03 5b 5d c3 f6 83 88 00 00 00 03 74 7e 48
>   RSP: 0018:c90008114000 EFLAGS: 00010083
>   RAX: 84e00e17 RBX: c90008114018 RCX: 84e00e17
>   RDX:  RSI: 84e00a39 RDI: c90008114018
>   RBP:  R08:  R09: 
>   R10:  R11:  R12: 
>   R13:  R14:  R15: 
>   FS:  () GS:88807dc8() 
> knlGS:
>   CS:  0010 DS:  ES:  CR0: 80050033
>   CR2: c90008113ff8 CR3: 2dae4006 CR4: 00770ee0
>   DR0:  DR1:  DR2: 
>   DR3:  DR6: fffe0ff0 DR7: 0400
>   PKRU: 
>   Call Trace:
>asm_exc_int3+0x31/0x40 arch/x86/include/asm/idtentry.h:537
>   RIP: 0010:arch_static_branch include/trace/events/preemptirq.h:40 
> [inline]
>   RIP: 0010:static_key_false include/linux/jump_label.h:200 [inline]
>   RIP: 0010:trace_irq_enable_rcuidle+0xd/0x120 
> include/trace/events/preemptirq.h:40
>   Code: 24 08 48 89 df e8 43 8d ef ff 48 89 df 5b e9 4a 2e 99 03 66 2e 0f 
> 1f 84 00 00 00 00 00 55 41 56 53 48 89 fb e8 84 1a fd ff cc <1f> 44 00 00 5b 
> 41 5e 5d c3 65 8b 05 ab 74 c3 7e 89 c0 31 f6 48 0f
>   RSP: 0018:c900081140f8 EFLAGS: 0093
>   RAX: 813d9e8c RBX: 81314dd3 RCX: 888076ce6000
>   RDX:  RSI:  RDI: 81314dd3
>   RBP:  R08: 813da3d4 R09: 0001
>   R10:  R11:  R12: 
>   R13: 0082 R14:  R15: 888076ce6000
>trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
>rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
>trace_irq_enable_rcuidle+0x87/0x120 
> include/trace/events/preemptirq.h:40
>trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
>rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
>trace_irq_enable_rcuidle+0x87/0x120 
> include/trace/events/preemptirq.h:40
>trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
>rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
> 

Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-11 Thread peterz
On Tue, Aug 11, 2020 at 11:46:51AM +0200, pet...@infradead.org wrote:

> So let me once again see if I can't find a better solution for this all.
> Clearly it needs one :/

So the below boots without triggering the debug code from Marco -- it
should allow nesting local_irq_save/restore under raw_local_irq_*().

I tried unconditional counting, but there's some _reallly_ wonky /
asymmetric code that wrecks that and I've not been able to come up with
anything useful.

This one starts counting when local_irq_save() finds it didn't disable
IRQs while lockdep though it did. At that point, local_irq_restore()
will decrement and enable things again when it reaches 0.

This assumes local_irq_save()/local_irq_restore() are nested sane, which
is mostly true.

This leaves #PF, which I fixed in these other patches, but I realized it
needs fixing for all architectures :-( No bright ideas there yet.

---
 arch/x86/entry/thunk_32.S   |  5 
 include/linux/irqflags.h| 45 +++-
 init/main.c | 16 
 kernel/locking/lockdep.c| 58 +
 kernel/trace/trace_preemptirq.c | 33 +++
 5 files changed, 134 insertions(+), 23 deletions(-)

diff --git a/arch/x86/entry/thunk_32.S b/arch/x86/entry/thunk_32.S
index 3a07ce3ec70b..f1f96d4d8cd6 100644
--- a/arch/x86/entry/thunk_32.S
+++ b/arch/x86/entry/thunk_32.S
@@ -29,11 +29,6 @@ SYM_CODE_START_NOALIGN(\name)
 SYM_CODE_END(\name)
.endm
 
-#ifdef CONFIG_TRACE_IRQFLAGS
-   THUNK trace_hardirqs_on_thunk,trace_hardirqs_on_caller,1
-   THUNK trace_hardirqs_off_thunk,trace_hardirqs_off_caller,1
-#endif
-
 #ifdef CONFIG_PREEMPTION
THUNK preempt_schedule_thunk, preempt_schedule
THUNK preempt_schedule_notrace_thunk, preempt_schedule_notrace
diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h
index bd5c55755447..67e2a16d3846 100644
--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -23,12 +23,16 @@
   extern void lockdep_hardirqs_on_prepare(unsigned long ip);
   extern void lockdep_hardirqs_on(unsigned long ip);
   extern void lockdep_hardirqs_off(unsigned long ip);
+  extern void lockdep_hardirqs_save(unsigned long ip, unsigned long flags);
+  extern void lockdep_hardirqs_restore(unsigned long ip, unsigned long flags);
 #else
   static inline void lockdep_softirqs_on(unsigned long ip) { }
   static inline void lockdep_softirqs_off(unsigned long ip) { }
   static inline void lockdep_hardirqs_on_prepare(unsigned long ip) { }
   static inline void lockdep_hardirqs_on(unsigned long ip) { }
   static inline void lockdep_hardirqs_off(unsigned long ip) { }
+  static inline void lockdep_hardirqs_save(unsigned long ip, unsigned long 
flags) { }
+  static inline void lockdep_hardirqs_restore(unsigned long ip, unsigned long 
flags) { }
 #endif
 
 #ifdef CONFIG_TRACE_IRQFLAGS
@@ -49,10 +53,13 @@ struct irqtrace_events {
 DECLARE_PER_CPU(int, hardirqs_enabled);
 DECLARE_PER_CPU(int, hardirq_context);
 
-  extern void trace_hardirqs_on_prepare(void);
-  extern void trace_hardirqs_off_finish(void);
-  extern void trace_hardirqs_on(void);
-  extern void trace_hardirqs_off(void);
+extern void trace_hardirqs_on_prepare(void);
+extern void trace_hardirqs_off_finish(void);
+extern void trace_hardirqs_on(void);
+extern void trace_hardirqs_off(void);
+extern void trace_hardirqs_save(unsigned long flags);
+extern void trace_hardirqs_restore(unsigned long flags);
+
 # define lockdep_hardirq_context() (this_cpu_read(hardirq_context))
 # define lockdep_softirq_context(p)((p)->softirq_context)
 # define lockdep_hardirqs_enabled()(this_cpu_read(hardirqs_enabled))
@@ -120,17 +127,19 @@ do {  \
 #else
 # define trace_hardirqs_on_prepare()   do { } while (0)
 # define trace_hardirqs_off_finish()   do { } while (0)
-# define trace_hardirqs_on()   do { } while (0)
-# define trace_hardirqs_off()  do { } while (0)
-# define lockdep_hardirq_context() 0
-# define lockdep_softirq_context(p)0
-# define lockdep_hardirqs_enabled()0
-# define lockdep_softirqs_enabled(p)   0
-# define lockdep_hardirq_enter()   do { } while (0)
-# define lockdep_hardirq_threaded()do { } while (0)
-# define lockdep_hardirq_exit()do { } while (0)
-# define lockdep_softirq_enter()   do { } while (0)
-# define lockdep_softirq_exit()do { } while (0)
+# define trace_hardirqs_on()   do { } while (0)
+# define trace_hardirqs_off()  do { } while (0)
+# define trace_hardirqs_save(flags)do { } while (0)
+# define trace_hardirqs_restore(flags) do { } while (0)
+# define lockdep_hardirq_context() 0
+# define lockdep_softirq_context(p)0
+# define lockdep_hardirqs_enabled()0
+# define lockdep_softirqs_enabled(p)   0
+# define 

Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-11 Thread peterz
On Tue, Aug 11, 2020 at 11:20:54AM +0200, pet...@infradead.org wrote:
> On Tue, Aug 11, 2020 at 10:38:50AM +0200, Jürgen Groß wrote:
> > In case you don't want to do it I can send the patch for the Xen
> > variants.
> 
> I might've opened a whole new can of worms here. I'm not sure we
> can/want to fix the entire fallout this release :/
> 
> Let me ponder this a little, because the more I look at things, the more
> problems I keep finding... bah bah bah.

That is, most of these irq-tracking problem are new because commit:

  859d069ee1dd ("lockdep: Prepare for NMI IRQ state tracking")

changed irq-tracking to ignore the lockdep recursion count.

This then allows:

lock_acquire()
  raw_local_irq_save();
  current->lockdep_recursion++;
  trace_lock_acquire()
   ... tracing ...
 #PF under raw_local_irq_*()

  __lock_acquire()
arch_spin_lock(_lock)
  pv-spinlock-wait()
local_irq_save() under raw_local_irq_*()

However afaict that just made a bad situation worse. There already were
issues, take for example:

  trace_clock_global()
raw_local_irq_save();
arch_spin_lock()
  pv-spinlock-wait
local_irq_save()

And that has no lockdep_recursion to 'save' the say.

The tracing recursion does however avoid some of the obvious fails
there, like trace_clock calling into paravirt which then calls back into
tracing. But still, that would've caused IRQ tracking problems even with
the old code.

And in that respect, this is all the exact same problem as that other
set of patches has ( 20200807192336.405068...@infradead.org ).

Now, on the flip side, it does find actual problems, the trace_lock_*()
things were using RCU in RCU-disabled code, and here I found that
trace_clock_global() thinkg (and I suspect there's more of that).

But at this point I'm not entirelty sure how best to proceed... tracing
uses arch_spinlock_t, which means all spinlock implementations should be
notrace, but then that drops into paravirt and all hell breaks loose
because Hyper-V then calls into the APIC code etc.. etc..

At that rate we'll have the entire kernel marked notrace, and I'm fairly
sure that's not a solution either.

So let me once again see if I can't find a better solution for this all.
Clearly it needs one :/
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-11 Thread peterz
On Tue, Aug 11, 2020 at 10:38:50AM +0200, Jürgen Groß wrote:
> In case you don't want to do it I can send the patch for the Xen
> variants.

I might've opened a whole new can of worms here. I'm not sure we
can/want to fix the entire fallout this release :/

Let me ponder this a little, because the more I look at things, the more
problems I keep finding... bah bah bah.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-11 Thread Jürgen Groß

On 11.08.20 10:12, Peter Zijlstra wrote:

On Tue, Aug 11, 2020 at 09:57:55AM +0200, Jürgen Groß wrote:

On 11.08.20 09:41, Peter Zijlstra wrote:

On Fri, Aug 07, 2020 at 05:19:03PM +0200, Marco Elver wrote:


My hypothesis here is simply that kvm_wait() may be called in a place
where we get the same case I mentioned to Peter,

raw_local_irq_save(); /* or other IRQs off without tracing */
...
kvm_wait() /* IRQ state tracing gets confused */
...
raw_local_irq_restore();

and therefore, using raw variants in kvm_wait() works. It's also safe
because it doesn't call any other libraries that would result in corrupt


Yes, this is definitely an issue.

Tracing, we also musn't call into tracing when using raw_local_irq_*().
Because then we re-intoduce this same issue all over again.

Both halt() and safe_halt() are more paravirt calls, but given we're in
a KVM paravirt call already, I suppose we can directly use native_*()
here.

Something like so then... I suppose, but then the Xen variants need TLC
too.


Just to be sure I understand you correct:

You mean that xen_qlock_kick() and xen_qlock_wait() and all functions
called by those should gain the "notrace" attribute, right?

I am not sure why the kick variants need it, though. IMO those are
called only after the lock has been released, so they should be fine
without notrace.


The issue happens when someone uses arch_spinlock_t under
raw_local_irq_*().


And again: we shouldn't forget the Hyper-V variants.


Bah, my grep failed :/ Also *groan*, that's calling apic->send_IPI().


In case you don't want to do it I can send the patch for the Xen
variants.


Juergen
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-11 Thread Jürgen Groß

On 11.08.20 10:12, Peter Zijlstra wrote:

On Tue, Aug 11, 2020 at 09:57:55AM +0200, Jürgen Groß wrote:

On 11.08.20 09:41, Peter Zijlstra wrote:

On Fri, Aug 07, 2020 at 05:19:03PM +0200, Marco Elver wrote:


My hypothesis here is simply that kvm_wait() may be called in a place
where we get the same case I mentioned to Peter,

raw_local_irq_save(); /* or other IRQs off without tracing */
...
kvm_wait() /* IRQ state tracing gets confused */
...
raw_local_irq_restore();

and therefore, using raw variants in kvm_wait() works. It's also safe
because it doesn't call any other libraries that would result in corrupt


Yes, this is definitely an issue.

Tracing, we also musn't call into tracing when using raw_local_irq_*().
Because then we re-intoduce this same issue all over again.

Both halt() and safe_halt() are more paravirt calls, but given we're in
a KVM paravirt call already, I suppose we can directly use native_*()
here.

Something like so then... I suppose, but then the Xen variants need TLC
too.


Just to be sure I understand you correct:

You mean that xen_qlock_kick() and xen_qlock_wait() and all functions
called by those should gain the "notrace" attribute, right?

I am not sure why the kick variants need it, though. IMO those are
called only after the lock has been released, so they should be fine
without notrace.


The issue happens when someone uses arch_spinlock_t under
raw_local_irq_*().


Ah, okay.




And again: we shouldn't forget the Hyper-V variants.


Bah, my grep failed :/ Also *groan*, that's calling apic->send_IPI().


I've seen that, too. :-(


Juergen
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-11 Thread Peter Zijlstra
On Tue, Aug 11, 2020 at 09:57:55AM +0200, Jürgen Groß wrote:
> On 11.08.20 09:41, Peter Zijlstra wrote:
> > On Fri, Aug 07, 2020 at 05:19:03PM +0200, Marco Elver wrote:
> > 
> > > My hypothesis here is simply that kvm_wait() may be called in a place
> > > where we get the same case I mentioned to Peter,
> > > 
> > >   raw_local_irq_save(); /* or other IRQs off without tracing */
> > >   ...
> > >   kvm_wait() /* IRQ state tracing gets confused */
> > >   ...
> > >   raw_local_irq_restore();
> > > 
> > > and therefore, using raw variants in kvm_wait() works. It's also safe
> > > because it doesn't call any other libraries that would result in corrupt
> > 
> > Yes, this is definitely an issue.
> > 
> > Tracing, we also musn't call into tracing when using raw_local_irq_*().
> > Because then we re-intoduce this same issue all over again.
> > 
> > Both halt() and safe_halt() are more paravirt calls, but given we're in
> > a KVM paravirt call already, I suppose we can directly use native_*()
> > here.
> > 
> > Something like so then... I suppose, but then the Xen variants need TLC
> > too.
> 
> Just to be sure I understand you correct:
> 
> You mean that xen_qlock_kick() and xen_qlock_wait() and all functions
> called by those should gain the "notrace" attribute, right?
> 
> I am not sure why the kick variants need it, though. IMO those are
> called only after the lock has been released, so they should be fine
> without notrace.

The issue happens when someone uses arch_spinlock_t under
raw_local_irq_*().

> And again: we shouldn't forget the Hyper-V variants.

Bah, my grep failed :/ Also *groan*, that's calling apic->send_IPI().


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-11 Thread Jürgen Groß

On 11.08.20 09:41, Peter Zijlstra wrote:

On Fri, Aug 07, 2020 at 05:19:03PM +0200, Marco Elver wrote:


My hypothesis here is simply that kvm_wait() may be called in a place
where we get the same case I mentioned to Peter,

raw_local_irq_save(); /* or other IRQs off without tracing */
...
kvm_wait() /* IRQ state tracing gets confused */
...
raw_local_irq_restore();

and therefore, using raw variants in kvm_wait() works. It's also safe
because it doesn't call any other libraries that would result in corrupt


Yes, this is definitely an issue.

Tracing, we also musn't call into tracing when using raw_local_irq_*().
Because then we re-intoduce this same issue all over again.

Both halt() and safe_halt() are more paravirt calls, but given we're in
a KVM paravirt call already, I suppose we can directly use native_*()
here.

Something like so then... I suppose, but then the Xen variants need TLC
too.


Just to be sure I understand you correct:

You mean that xen_qlock_kick() and xen_qlock_wait() and all functions
called by those should gain the "notrace" attribute, right?

I am not sure why the kick variants need it, though. IMO those are
called only after the lock has been released, so they should be fine
without notrace.

And again: we shouldn't forget the Hyper-V variants.


Juergen
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-11 Thread Peter Zijlstra
On Fri, Aug 07, 2020 at 05:19:03PM +0200, Marco Elver wrote:

> My hypothesis here is simply that kvm_wait() may be called in a place
> where we get the same case I mentioned to Peter,
> 
>   raw_local_irq_save(); /* or other IRQs off without tracing */
>   ...
>   kvm_wait() /* IRQ state tracing gets confused */
>   ...
>   raw_local_irq_restore();
> 
> and therefore, using raw variants in kvm_wait() works. It's also safe
> because it doesn't call any other libraries that would result in corrupt

Yes, this is definitely an issue.

Tracing, we also musn't call into tracing when using raw_local_irq_*().
Because then we re-intoduce this same issue all over again.

Both halt() and safe_halt() are more paravirt calls, but given we're in
a KVM paravirt call already, I suppose we can directly use native_*()
here.

Something like so then... I suppose, but then the Xen variants need TLC
too.

---
 arch/x86/include/asm/irqflags.h |  4 ++--
 arch/x86/include/asm/kvm_para.h | 18 +-
 arch/x86/kernel/kvm.c   | 14 +++---
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index 02a0cf547d7b..7c614db25274 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -54,13 +54,13 @@ static __always_inline void native_irq_enable(void)
asm volatile("sti": : :"memory");
 }

-static inline __cpuidle void native_safe_halt(void)
+static __always_inline __cpuidle void native_safe_halt(void)
 {
mds_idle_clear_cpu_buffers();
asm volatile("sti; hlt": : :"memory");
 }

-static inline __cpuidle void native_halt(void)
+static __always_inline __cpuidle void native_halt(void)
 {
mds_idle_clear_cpu_buffers();
asm volatile("hlt": : :"memory");
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 49d3a9edb06f..90f7ea58ebb0 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -30,7 +30,7 @@ static inline bool kvm_check_and_clear_guest_paused(void)
  * noted by the particular hypercall.
  */

-static inline long kvm_hypercall0(unsigned int nr)
+static __always_inline long kvm_hypercall0(unsigned int nr)
 {
long ret;
asm volatile(KVM_HYPERCALL
@@ -40,7 +40,7 @@ static inline long kvm_hypercall0(unsigned int nr)
return ret;
 }

-static inline long kvm_hypercall1(unsigned int nr, unsigned long p1)
+static __always_inline long kvm_hypercall1(unsigned int nr, unsigned long p1)
 {
long ret;
asm volatile(KVM_HYPERCALL
@@ -50,8 +50,8 @@ static inline long kvm_hypercall1(unsigned int nr, unsigned 
long p1)
return ret;
 }

-static inline long kvm_hypercall2(unsigned int nr, unsigned long p1,
- unsigned long p2)
+static __always_inline long kvm_hypercall2(unsigned int nr, unsigned long p1,
+  unsigned long p2)
 {
long ret;
asm volatile(KVM_HYPERCALL
@@ -61,8 +61,8 @@ static inline long kvm_hypercall2(unsigned int nr, unsigned 
long p1,
return ret;
 }

-static inline long kvm_hypercall3(unsigned int nr, unsigned long p1,
- unsigned long p2, unsigned long p3)
+static __always_inline long kvm_hypercall3(unsigned int nr, unsigned long p1,
+  unsigned long p2, unsigned long p3)
 {
long ret;
asm volatile(KVM_HYPERCALL
@@ -72,9 +72,9 @@ static inline long kvm_hypercall3(unsigned int nr, unsigned 
long p1,
return ret;
 }

-static inline long kvm_hypercall4(unsigned int nr, unsigned long p1,
- unsigned long p2, unsigned long p3,
- unsigned long p4)
+static __always_inline long kvm_hypercall4(unsigned int nr, unsigned long p1,
+  unsigned long p2, unsigned long p3,
+  unsigned long p4)
 {
long ret;
asm volatile(KVM_HYPERCALL
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 233c77d056c9..15f8dfd8812d 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -779,7 +779,7 @@ arch_initcall(kvm_alloc_cpumask);
 #ifdef CONFIG_PARAVIRT_SPINLOCKS

 /* Kick a cpu by its apicid. Used to wake up a halted vcpu */
-static void kvm_kick_cpu(int cpu)
+static notrace kvm_kick_cpu(int cpu)
 {
int apicid;
unsigned long flags = 0;
@@ -790,14 +790,14 @@ static void kvm_kick_cpu(int cpu)

 #include 

-static void kvm_wait(u8 *ptr, u8 val)
+static notrace kvm_wait(u8 *ptr, u8 val)
 {
unsigned long flags;

if (in_nmi())
return;

-   local_irq_save(flags);
+   raw_local_irq_save(flags);

if (READ_ONCE(*ptr) != val)
goto out;
@@ -808,16 +808,16 @@ static void kvm_wait(u8 *ptr, u8 val)
 * in irq spinlock slowpath and no 

Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-11 Thread Jürgen Groß

On 11.08.20 09:00, Marco Elver wrote:

On Fri, 7 Aug 2020 at 17:19, Marco Elver  wrote:

On Fri, Aug 07, 2020 at 02:08PM +0200, Marco Elver wrote:

On Fri, 7 Aug 2020 at 14:04, Jürgen Groß  wrote:


On 07.08.20 13:38, Marco Elver wrote:

On Fri, Aug 07, 2020 at 12:35PM +0200, Jürgen Groß wrote:

...

I think CONFIG_PARAVIRT_XXL shouldn't matter, but I'm not completely
sure about that. CONFIG_PARAVIRT_SPINLOCKS would be my primary suspect.


Yes, PARAVIRT_XXL doesn't make a different. When disabling
PARAVIRT_SPINLOCKS, however, the warnings go away.


Thanks for testing!

I take it you are doing the tests in a KVM guest?


Yes, correct.


If so I have a gut feeling that the use of local_irq_save() and
local_irq_restore() in kvm_wait() might be fishy. I might be completely
wrong here, though.


Happy to help debug more, although I might need patches or pointers
what to play with.


BTW, I think Xen's variant of pv spinlocks is fine (no playing with IRQ
on/off).

Hyper-V seems to do the same as KVM, and kicking another vcpu could be
problematic as well, as it is just using IPI.


I experimented a bit more, and the below patch seems to solve the
warnings. However, that was based on your pointer about kvm_wait(), and
I can't quite tell if it is the right solution.

My hypothesis here is simply that kvm_wait() may be called in a place
where we get the same case I mentioned to Peter,

 raw_local_irq_save(); /* or other IRQs off without tracing */
 ...
 kvm_wait() /* IRQ state tracing gets confused */
 ...
 raw_local_irq_restore();

and therefore, using raw variants in kvm_wait() works. It's also safe
because it doesn't call any other libraries that would result in corrupt
IRQ state AFAIK.


Just to follow-up, it'd still be nice to fix this. Suggestions?

I could send the below as a patch, but can only go off my above
hypothesis and the fact that syzbot is happier, so not entirely
convincing.


Peter has told me via IRC he will look soon further into this.

Your finding suggests that the pv-lock implementation for Hyper-V
needs some tweaking, too. For that purpose I'm adding Wei to Cc.


Juergen



Thanks,
-- Marco


-- >8 --

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 233c77d056c9..1d412d1466f0 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -797,7 +797,7 @@ static void kvm_wait(u8 *ptr, u8 val)
 if (in_nmi())
 return;

-   local_irq_save(flags);
+   raw_local_irq_save(flags);

 if (READ_ONCE(*ptr) != val)
 goto out;
@@ -810,10 +810,10 @@ static void kvm_wait(u8 *ptr, u8 val)
 if (arch_irqs_disabled_flags(flags))
 halt();
 else
-   safe_halt();
+   raw_safe_halt();

  out:
-   local_irq_restore(flags);
+   raw_local_irq_restore(flags);
  }

  #ifdef CONFIG_X86_32




___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-07 Thread Jürgen Groß

On 07.08.20 13:38, Marco Elver wrote:

On Fri, Aug 07, 2020 at 12:35PM +0200, Jürgen Groß wrote:

On 07.08.20 11:50, Marco Elver wrote:

On Fri, Aug 07, 2020 at 11:24AM +0200, Jürgen Groß wrote:

On 07.08.20 11:01, Marco Elver wrote:

On Thu, 6 Aug 2020 at 18:06, Marco Elver  wrote:

On Thu, 6 Aug 2020 at 15:17, Marco Elver  wrote:

On Thu, Aug 06, 2020 at 01:32PM +0200, pet...@infradead.org wrote:

On Thu, Aug 06, 2020 at 09:47:23AM +0200, Marco Elver wrote:

Testing my hypothesis that raw then nested non-raw
local_irq_save/restore() breaks IRQ state tracking -- see the reproducer
below. This is at least 1 case I can think of that we're bound to hit.

...


/me goes ponder things...

How's something like this then?

---
include/linux/sched.h |  3 ---
kernel/kcsan/core.c   | 62 
---
2 files changed, 44 insertions(+), 21 deletions(-)


Thank you! That approach seems to pass syzbot (also with
CONFIG_PARAVIRT) and kcsan-test tests.

I had to modify it some, so that report.c's use of the restore logic
works and not mess up the IRQ trace printed on KCSAN reports (with
CONFIG_KCSAN_VERBOSE).

I still need to fully convince myself all is well now and we don't end
up with more fixes. :-) If it passes further testing, I'll send it as a
real patch (I want to add you as Co-developed-by, but would need your
Signed-off-by for the code you pasted, I think.)


I let it run on syzbot through the night, and it's fine without
PARAVIRT (see below). I have sent the patch (need your Signed-off-by
as it's based on your code, thank you!):
https://lkml.kernel.org/r/20200807090031.3506555-1-el...@google.com


With CONFIG_PARAVIRT=y (without the notrace->noinstr patch), I still
get lockdep DEBUG_LOCKS_WARN_ON(!lockdep_hardirqs_enabled()), although
it takes longer for syzbot to hit them. But I think that's expected
because we can still get the recursion that I pointed out, and will
need that patch.


Never mind, I get these warnings even if I don't turn on KCSAN
(CONFIG_KCSAN=n). Something else is going on with PARAVIRT=y that
throws off IRQ state tracking. :-/


What are the settings of CONFIG_PARAVIRT_XXL and
CONFIG_PARAVIRT_SPINLOCKS in this case?


I attached a config.

$> grep PARAVIRT .config
CONFIG_PARAVIRT=y
CONFIG_PARAVIRT_XXL=y
# CONFIG_PARAVIRT_DEBUG is not set
CONFIG_PARAVIRT_SPINLOCKS=y
# CONFIG_PARAVIRT_TIME_ACCOUNTING is not set
CONFIG_PARAVIRT_CLOCK=y


Anything special I need to do to reproduce the problem? Or would you be
willing to do some more rounds with different config settings?


I can only test it with syzkaller, but that probably doesn't help if you
don't already have it set up. It can't seem to find a C reproducer.

I did some more rounds with different configs.


I think CONFIG_PARAVIRT_XXL shouldn't matter, but I'm not completely
sure about that. CONFIG_PARAVIRT_SPINLOCKS would be my primary suspect.


Yes, PARAVIRT_XXL doesn't make a different. When disabling
PARAVIRT_SPINLOCKS, however, the warnings go away.


Thanks for testing!

I take it you are doing the tests in a KVM guest?

If so I have a gut feeling that the use of local_irq_save() and
local_irq_restore() in kvm_wait() might be fishy. I might be completely
wrong here, though.

BTW, I think Xen's variant of pv spinlocks is fine (no playing with IRQ
on/off).

Hyper-V seems to do the same as KVM, and kicking another vcpu could be
problematic as well, as it is just using IPI.


Juergen
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-07 Thread Jürgen Groß

On 07.08.20 11:50, Marco Elver wrote:

On Fri, Aug 07, 2020 at 11:24AM +0200, Jürgen Groß wrote:

On 07.08.20 11:01, Marco Elver wrote:

On Thu, 6 Aug 2020 at 18:06, Marco Elver  wrote:

On Thu, 6 Aug 2020 at 15:17, Marco Elver  wrote:

On Thu, Aug 06, 2020 at 01:32PM +0200, pet...@infradead.org wrote:

On Thu, Aug 06, 2020 at 09:47:23AM +0200, Marco Elver wrote:

Testing my hypothesis that raw then nested non-raw
local_irq_save/restore() breaks IRQ state tracking -- see the reproducer
below. This is at least 1 case I can think of that we're bound to hit.

...


/me goes ponder things...

How's something like this then?

---
   include/linux/sched.h |  3 ---
   kernel/kcsan/core.c   | 62 
---
   2 files changed, 44 insertions(+), 21 deletions(-)


Thank you! That approach seems to pass syzbot (also with
CONFIG_PARAVIRT) and kcsan-test tests.

I had to modify it some, so that report.c's use of the restore logic
works and not mess up the IRQ trace printed on KCSAN reports (with
CONFIG_KCSAN_VERBOSE).

I still need to fully convince myself all is well now and we don't end
up with more fixes. :-) If it passes further testing, I'll send it as a
real patch (I want to add you as Co-developed-by, but would need your
Signed-off-by for the code you pasted, I think.)


I let it run on syzbot through the night, and it's fine without
PARAVIRT (see below). I have sent the patch (need your Signed-off-by
as it's based on your code, thank you!):
https://lkml.kernel.org/r/20200807090031.3506555-1-el...@google.com


With CONFIG_PARAVIRT=y (without the notrace->noinstr patch), I still
get lockdep DEBUG_LOCKS_WARN_ON(!lockdep_hardirqs_enabled()), although
it takes longer for syzbot to hit them. But I think that's expected
because we can still get the recursion that I pointed out, and will
need that patch.


Never mind, I get these warnings even if I don't turn on KCSAN
(CONFIG_KCSAN=n). Something else is going on with PARAVIRT=y that
throws off IRQ state tracking. :-/


What are the settings of CONFIG_PARAVIRT_XXL and
CONFIG_PARAVIRT_SPINLOCKS in this case?


I attached a config.

$> grep PARAVIRT .config
CONFIG_PARAVIRT=y
CONFIG_PARAVIRT_XXL=y
# CONFIG_PARAVIRT_DEBUG is not set
CONFIG_PARAVIRT_SPINLOCKS=y
# CONFIG_PARAVIRT_TIME_ACCOUNTING is not set
CONFIG_PARAVIRT_CLOCK=y


Anything special I need to do to reproduce the problem? Or would you be
willing to do some more rounds with different config settings?

I think CONFIG_PARAVIRT_XXL shouldn't matter, but I'm not completely
sure about that. CONFIG_PARAVIRT_SPINLOCKS would be my primary suspect.


Juergen
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-07 Thread Jürgen Groß

On 07.08.20 11:01, Marco Elver wrote:

On Thu, 6 Aug 2020 at 18:06, Marco Elver  wrote:

On Thu, 6 Aug 2020 at 15:17, Marco Elver  wrote:

On Thu, Aug 06, 2020 at 01:32PM +0200, pet...@infradead.org wrote:

On Thu, Aug 06, 2020 at 09:47:23AM +0200, Marco Elver wrote:

Testing my hypothesis that raw then nested non-raw
local_irq_save/restore() breaks IRQ state tracking -- see the reproducer
below. This is at least 1 case I can think of that we're bound to hit.

...


/me goes ponder things...

How's something like this then?

---
  include/linux/sched.h |  3 ---
  kernel/kcsan/core.c   | 62 ---
  2 files changed, 44 insertions(+), 21 deletions(-)


Thank you! That approach seems to pass syzbot (also with
CONFIG_PARAVIRT) and kcsan-test tests.

I had to modify it some, so that report.c's use of the restore logic
works and not mess up the IRQ trace printed on KCSAN reports (with
CONFIG_KCSAN_VERBOSE).

I still need to fully convince myself all is well now and we don't end
up with more fixes. :-) If it passes further testing, I'll send it as a
real patch (I want to add you as Co-developed-by, but would need your
Signed-off-by for the code you pasted, I think.)


I let it run on syzbot through the night, and it's fine without
PARAVIRT (see below). I have sent the patch (need your Signed-off-by
as it's based on your code, thank you!):
https://lkml.kernel.org/r/20200807090031.3506555-1-el...@google.com


With CONFIG_PARAVIRT=y (without the notrace->noinstr patch), I still
get lockdep DEBUG_LOCKS_WARN_ON(!lockdep_hardirqs_enabled()), although
it takes longer for syzbot to hit them. But I think that's expected
because we can still get the recursion that I pointed out, and will
need that patch.


Never mind, I get these warnings even if I don't turn on KCSAN
(CONFIG_KCSAN=n). Something else is going on with PARAVIRT=y that
throws off IRQ state tracking. :-/


What are the settings of CONFIG_PARAVIRT_XXL and
CONFIG_PARAVIRT_SPINLOCKS in this case?


Juergen
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-06 Thread peterz
On Thu, Aug 06, 2020 at 09:47:23AM +0200, Marco Elver wrote:
> Testing my hypothesis that raw then nested non-raw
> local_irq_save/restore() breaks IRQ state tracking -- see the reproducer
> below. This is at least 1 case I can think of that we're bound to hit.

Aaargh!

> diff --git a/init/main.c b/init/main.c
> index 15bd0efff3df..0873319dcff4 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -1041,6 +1041,22 @@ asmlinkage __visible void __init start_kernel(void)
>   sfi_init_late();
>   kcsan_init();
>  
> + /* DEBUG CODE */
> + lockdep_assert_irqs_enabled(); /* Pass. */
> + {
> + unsigned long flags1;
> + raw_local_irq_save(flags1);

This disables IRQs but doesn't trace..

> + {
> + unsigned long flags2;
> + lockdep_assert_irqs_enabled(); /* Pass - expectedly 
> blind. */

Indeed, we didn't trace the above disable, so software state is still
on.

> + local_irq_save(flags2);

So here we save IRQ state, and unconditionally disable IRQs and trace
them disabled.

> + lockdep_assert_irqs_disabled(); /* Pass. */
> + local_irq_restore(flags2);

But here, we restore IRQ state to 'disabled' and explicitly trace it
disabled *again* (which is a bit daft, but whatever).

> + }
> + raw_local_irq_restore(flags1);

This then restores the IRQ state to enable, but no tracing.

> + }
> + lockdep_assert_irqs_enabled(); /* FAIL! */

And we're out of sync... :/

/me goes ponder things...

How's something like this then?

---
 include/linux/sched.h |  3 ---
 kernel/kcsan/core.c   | 62 ---
 2 files changed, 44 insertions(+), 21 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 06ec60462af0..2f5aef57e687 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1193,9 +1193,6 @@ struct task_struct {
 
 #ifdef CONFIG_KCSAN
struct kcsan_ctxkcsan_ctx;
-#ifdef CONFIG_TRACE_IRQFLAGS
-   struct irqtrace_events  kcsan_save_irqtrace;
-#endif
 #endif
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
diff --git a/kernel/kcsan/core.c b/kernel/kcsan/core.c
index 9147ff6a12e5..9c4436bf0561 100644
--- a/kernel/kcsan/core.c
+++ b/kernel/kcsan/core.c
@@ -291,17 +291,50 @@ static inline unsigned int get_delay(void)
0);
 }
 
-void kcsan_save_irqtrace(struct task_struct *task)
+/*
+ * KCSAN hooks are everywhere, which means they're NMI like for interrupt
+ * tracing. In order to present a 'normal' as possible context to the code
+ * called by KCSAN when reporting errors we need to update the irq-tracing
+ * state.
+ *
+ * Save and restore the IRQ state trace touched by KCSAN, since KCSAN's
+ * runtime is entered for every memory access, and potentially useful
+ * information is lost if dirtied by KCSAN.
+ */
+
+struct kcsan_irq_state {
+   unsigned long   flags;
+#ifdef CONFIG_TRACE_IRQFLAGS
+   int hardirqs;
+   struct irqtrace_events  irqtrace;
+#endif
+};
+
+void kcsan_save_irqtrace(struct kcsan_irq_state *irq_state)
 {
 #ifdef CONFIG_TRACE_IRQFLAGS
-   task->kcsan_save_irqtrace = task->irqtrace;
+   irq_state->irqtrace = task->irqtrace;
+   irq_state->hardirq = lockdep_hardirqs_enabled();
 #endif
+   if (!kcsan_interrupt_watcher) {
+   raw_local_irq_save(irq_state->flags);
+   lockdep_hardirqs_off(CALLER_ADDR0);
+   }
 }
 
-void kcsan_restore_irqtrace(struct task_struct *task)
+void kcsan_restore_irqtrace(struct kcsan_irq_state *irq_state)
 {
+   if (!kcsan_interrupt_watcher) {
+#ifdef CONFIG_TRACE_IRQFLAGS
+   if (irq_state->hardirqs) {
+   lockdep_hardirqs_on_prepare(CALLER_ADDR0);
+   lockdep_hardirqs_on(CALLER_ADDR0);
+   }
+#endif
+   raw_local_irq_restore(irq_state->flags);
+   }
 #ifdef CONFIG_TRACE_IRQFLAGS
-   task->irqtrace = task->kcsan_save_irqtrace;
+   task->irqtrace = irq_state->irqtrace;
 #endif
 }
 
@@ -350,11 +383,13 @@ static noinline void kcsan_found_watchpoint(const 
volatile void *ptr,
flags = user_access_save();
 
if (consumed) {
-   kcsan_save_irqtrace(current);
+   struct kcsan_irq_state irqstate;
+
+   kcsan_save_irqtrace();
kcsan_report(ptr, size, type, KCSAN_VALUE_CHANGE_MAYBE,
 KCSAN_REPORT_CONSUMED_WATCHPOINT,
 watchpoint - watchpoints);
-   kcsan_restore_irqtrace(current);
+   kcsan_restore_irqtrace();
} else {
/*
 * The other thread may not print any diagnostics, as it has
@@ -387,7 +422,7 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t 
size, int type)
unsigned long access_mask;
enum 

Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-05 Thread Jürgen Groß

On 05.08.20 16:12, pet...@infradead.org wrote:

On Wed, Aug 05, 2020 at 03:59:40PM +0200, Marco Elver wrote:

On Wed, Aug 05, 2020 at 03:42PM +0200, pet...@infradead.org wrote:



Shouldn't we __always_inline those? They're going to be really small.


I can send a v2, and you can choose. For reference, though:

86271ee0 :
86271ee0:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
86271ee5:   48 83 3d 43 87 e4 01cmpq   $0x0,0x1e48743(%rip)   
 # 880ba630 
86271eec:   00
86271eed:   74 0d   je 86271efc 

86271eef:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
86271ef4:   ff 14 25 30 a6 0b 88callq  
*0x880ba630
86271efb:   c3  retq
86271efc:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
86271f01:   0f 0b   ud2



86271a90 :
86271a90:   53  push   %rbx
86271a91:   48 89 fbmov%rdi,%rbx
86271a94:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
86271a99:   48 83 3d 97 8b e4 01cmpq   $0x0,0x1e48b97(%rip)   
 # 880ba638 
86271aa0:   00
86271aa1:   74 11   je 86271ab4 

86271aa3:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
86271aa8:   48 89 dfmov%rbx,%rdi
86271aab:   ff 14 25 38 a6 0b 88callq  
*0x880ba638
86271ab2:   5b  pop%rbx
86271ab3:   c3  retq
86271ab4:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
86271ab9:   0f 0b   ud2



Blergh, that's abysmall. In part I suspect because you have
CONFIG_PARAVIRT_DEBUG, let me try and untangle that PV macro maze.



Probably. I have found the following in my kernel:

fff81540a5f :
81540a5f:   ff 14 25 40 a4 23 82callq  *0x8223a440
81540a66:   c3  retq


Juergen
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-05 Thread peterz
On Wed, Aug 05, 2020 at 04:12:37PM +0200, pet...@infradead.org wrote:
> On Wed, Aug 05, 2020 at 03:59:40PM +0200, Marco Elver wrote:
> > On Wed, Aug 05, 2020 at 03:42PM +0200, pet...@infradead.org wrote:
> 
> > > Shouldn't we __always_inline those? They're going to be really small.
> > 
> > I can send a v2, and you can choose. For reference, though:
> > 
> > 86271ee0 :
> > 86271ee0:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
> > 86271ee5:   48 83 3d 43 87 e4 01cmpq   
> > $0x0,0x1e48743(%rip)# 880ba630 
> > 86271eec:   00
> > 86271eed:   74 0d   je 86271efc 
> > 
> > 86271eef:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
> > 86271ef4:   ff 14 25 30 a6 0b 88callq  
> > *0x880ba630
> > 86271efb:   c3  retq
> > 86271efc:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
> > 86271f01:   0f 0b   ud2
> 
> > 86271a90 :
> > 86271a90:   53  push   %rbx
> > 86271a91:   48 89 fbmov%rdi,%rbx
> > 86271a94:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
> > 86271a99:   48 83 3d 97 8b e4 01cmpq   
> > $0x0,0x1e48b97(%rip)# 880ba638 
> > 86271aa0:   00
> > 86271aa1:   74 11   je 86271ab4 
> > 
> > 86271aa3:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
> > 86271aa8:   48 89 dfmov%rbx,%rdi
> > 86271aab:   ff 14 25 38 a6 0b 88callq  
> > *0x880ba638
> > 86271ab2:   5b  pop%rbx
> > 86271ab3:   c3  retq
> > 86271ab4:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
> > 86271ab9:   0f 0b   ud2
> 
> 
> Blergh, that's abysmall. In part I suspect because you have
> CONFIG_PARAVIRT_DEBUG, let me try and untangle that PV macro maze.

Yeah, look here:

 00462149 :
   462149:  ff 14 25 00 00 00 00callq  *0x0
000346214c: R_X86_64_32Spv_ops+0x120
0007   462150:  c3  retq


That's exactly what I was expecting.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-05 Thread peterz
On Wed, Aug 05, 2020 at 03:59:40PM +0200, Marco Elver wrote:
> On Wed, Aug 05, 2020 at 03:42PM +0200, pet...@infradead.org wrote:

> > Shouldn't we __always_inline those? They're going to be really small.
> 
> I can send a v2, and you can choose. For reference, though:
> 
>   86271ee0 :
>   86271ee0:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
>   86271ee5:   48 83 3d 43 87 e4 01cmpq   
> $0x0,0x1e48743(%rip)# 880ba630 
>   86271eec:   00
>   86271eed:   74 0d   je 86271efc 
> 
>   86271eef:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
>   86271ef4:   ff 14 25 30 a6 0b 88callq  
> *0x880ba630
>   86271efb:   c3  retq
>   86271efc:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
>   86271f01:   0f 0b   ud2

>   86271a90 :
>   86271a90:   53  push   %rbx
>   86271a91:   48 89 fbmov%rdi,%rbx
>   86271a94:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
>   86271a99:   48 83 3d 97 8b e4 01cmpq   
> $0x0,0x1e48b97(%rip)# 880ba638 
>   86271aa0:   00
>   86271aa1:   74 11   je 86271ab4 
> 
>   86271aa3:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
>   86271aa8:   48 89 dfmov%rbx,%rdi
>   86271aab:   ff 14 25 38 a6 0b 88callq  
> *0x880ba638
>   86271ab2:   5b  pop%rbx
>   86271ab3:   c3  retq
>   86271ab4:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
>   86271ab9:   0f 0b   ud2


Blergh, that's abysmall. In part I suspect because you have
CONFIG_PARAVIRT_DEBUG, let me try and untangle that PV macro maze.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-05 Thread peterz
On Wed, Aug 05, 2020 at 03:26:29PM +0200, Marco Elver wrote:
> Add missing noinstr to arch_local*() helpers, as they may be called from
> noinstr code.
> 
> On a KCSAN config with CONFIG_PARAVIRT=y, syzbot stumbled across corrupt

Cute, so I've been working on adding objtool support for this a little:

  
https://lkml.kernel.org/r/20200803143231.ge2...@hirez.programming.kicks-ass.net

> diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
> index 3d2afecde50c..a606f2ba2b5e 100644
> --- a/arch/x86/include/asm/paravirt.h
> +++ b/arch/x86/include/asm/paravirt.h
> @@ -760,27 +760,27 @@ bool __raw_callee_save___native_vcpu_is_preempted(long 
> cpu);
>   ((struct paravirt_callee_save) { func })
>  
>  #ifdef CONFIG_PARAVIRT_XXL
> -static inline notrace unsigned long arch_local_save_flags(void)
> +static inline noinstr unsigned long arch_local_save_flags(void)
>  {
>   return PVOP_CALLEE0(unsigned long, irq.save_fl);
>  }
>  
> -static inline notrace void arch_local_irq_restore(unsigned long f)
> +static inline noinstr void arch_local_irq_restore(unsigned long f)
>  {
>   PVOP_VCALLEE1(irq.restore_fl, f);
>  }
>  
> -static inline notrace void arch_local_irq_disable(void)
> +static inline noinstr void arch_local_irq_disable(void)
>  {
>   PVOP_VCALLEE0(irq.irq_disable);
>  }
>  
> -static inline notrace void arch_local_irq_enable(void)
> +static inline noinstr void arch_local_irq_enable(void)
>  {
>   PVOP_VCALLEE0(irq.irq_enable);
>  }
>  
> -static inline notrace unsigned long arch_local_irq_save(void)
> +static inline noinstr unsigned long arch_local_irq_save(void)
>  {
>   unsigned long f;
>  

Shouldn't we __always_inline those? They're going to be really small.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization