On 05/25/2018 10:31 AM, Nick Desaulniers wrote: > On Fri, May 25, 2018 at 9:53 AM <h...@zytor.com> wrote: >> On May 25, 2018 9:46:42 AM PDT, Nick Desaulniers <ndesaulni...@google.com> > wrote: >>> On Fri, May 25, 2018 at 9:33 AM <h...@zytor.com> wrote: >>>> On May 25, 2018 9:27:40 AM PDT, Nick Desaulniers >>> <ndesaulni...@google.com> wrote: >>> When you say >>> >>>> It still should be available as as inline, however, but now "extern >>> inline". >>> >>> Am I understanding correctly that native_save_fl should be inlined into >>> all >>> call sites (modulo the problematic pv_irq_ops.save_fl case)? Because >>> for >>> these two assembly implementations, it's not, but maybe there's >>> something >>> missing in my implementation? > >> Yes, that's what "extern inline" means. Maybe it needs a must inline > annotation, but that's really messed up. >
What about doing something like suggested here: https://bugs.llvm.org/show_bug.cgi?id=37512#c17 This would keep the definition in C and make it easier for compilers to inline. -Tom > I don't think it's possible to inline a function from an external > translation unit without something like LTO. > > If I move the implementation of native_save_fl() to a separate .c (with out > of line assembly) or .S, neither clang nor gcc will inline that assembly to > any call sites, whether the declaration of native_save_fl() looks like: > > extern inline unsigned long native_save_fl(void); > > or > > __attribute__((always_inline)) > > extern inline unsigned long native_save_fl(void); > > I think an external copy is the best approach for the paravirt code: > > diff --git a/arch/x86/kernel/irqflags.c b/arch/x86/kernel/irqflags.c > new file mode 100644 > index 000000000000..e173ba8bee7b > --- /dev/null > +++ b/arch/x86/kernel/irqflags.c > @@ -0,0 +1,24 @@ > +#include <asm/asm.h> > + > +extern unsigned long native_save_fl_no_stack_protector(void); > +extern void native_restore_fl_no_stack_protector(unsigned long flags); > + > +asm( > +".pushsection .text;" > +".global native_save_fl_no_stack_protector;" > +".type native_save_fl_no_stack_protector, @function;" > +"native_save_fl_no_stack_protector:" > +"pushf;" > +"pop %" _ASM_AX ";" > +"ret;" > +".popsection"); > + > +asm( > +".pushsection .text;" > +".global native_restore_fl_no_stack_protector;" > +".type native_restore_fl_no_stack_protector, @function;" > +"native_restore_fl_no_stack_protector:" > +"push %" _ASM_DI ";" > +"popf;" > +"ret;" > +".popsection"); > diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile > index 02d6f5cf4e70..8824d01c0c35 100644 > --- a/arch/x86/kernel/Makefile > +++ b/arch/x86/kernel/Makefile > @@ -61,6 +61,7 @@ obj-y += alternative.o i8253.o > hw_breakpoint.o > obj-y += tsc.o tsc_msr.o io_delay.o rtc.o > obj-y += pci-iommu_table.o > obj-y += resource.o > +obj-y += irqflags.o > > obj-y += process.o > obj-y += fpu/ > --- a/arch/x86/kernel/paravirt.c > +++ b/arch/x86/kernel/paravirt.c > @@ -322,9 +322,12 @@ struct pv_time_ops pv_time_ops = { > .steal_clock = native_steal_clock, > }; > > +extern unsigned long native_save_fl_no_stack_protector(void); > +extern void native_restore_fl_no_stack_protector(unsigned long flags); > + > __visible struct pv_irq_ops pv_irq_ops = { > - .save_fl = __PV_IS_CALLEE_SAVE(native_save_fl), > - .restore_fl = __PV_IS_CALLEE_SAVE(native_restore_fl), > + .save_fl = __PV_IS_CALLEE_SAVE(native_save_fl_no_stack_protector), > + .restore_fl = > __PV_IS_CALLEE_SAVE(native_restore_fl_no_stack_protector), > .irq_disable = __PV_IS_CALLEE_SAVE(native_irq_disable), > .irq_enable = __PV_IS_CALLEE_SAVE(native_irq_enable), > .safe_halt = native_safe_halt, > > Thoughts? >