Re: [PATCH 25/25] [PATCH] add paravirtualization support for x86_64

2007-08-09 Thread Jeremy Fitzhardinge
Andi Kleen wrote:
 +if (opfunc == NULL)
 +/* If there's no function, patch it with a ud2a (BUG) */
 +ret = paravirt_patch_insns(site, len, start_ud2a, end_ud2a);
 

 This will actually give corrupted BUGs because you don't supply
 the full inline BUG header. Perhaps another trap would be better.
   

The BUG handler will still report it as a normal illegal instruction. 
It should never happen; the main thing is that it clearly points out
where the problem is (as opposed to jumping to a NULL pointer and
getting the unhelpful oh, eip is zero symptom).

 +EXPORT_SYMBOL(paravirt_ops);
 

 Definitely _GPL at least.
   

No, for the same reason as i386.

 +extern struct paravirt_ops paravirt_ops;
 

 Should be native_paravirt_ops I guess

   

No, because its the current set of pv_ops.  It starts all native, but it
is either completely or partially overwritten by hypervisor-specific ops.

 +
 + * This generates an indirect call based on the operation type number.
 

 The macros here don't
   

Yes, PARAVIRT_CALL does: call *(paravirt_ops+%c[paravirt_typenum]*8);


 + : =a(f)
 + : paravirt_type(save_fl),
 +   paravirt_clobber(CLBR_RAX)
 + : memory, cc);
 +return f;
 +}
 +
 +static inline void raw_local_irq_restore(unsigned long f)
 +{
 +__asm__ __volatile__(paravirt_alt(PARAVIRT_CALL)
 + :
 + : D (f),
 

 Have you investigated if a different input register generates better/smaller 
 code? I would assume rdi to be usually used already for the caller's 
 arguments so it will produce spilling

 Similar for the rax return in the other functions.
   

This has to match the normal C calling convention though, doesn't it?

J
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 25/25] [PATCH] add paravirtualization support for x86_64

2007-08-09 Thread Jeremy Fitzhardinge
Glauber de Oliveira Costa wrote:
 +static unsigned native_patch(u8 type, u16 clobbers, void *insns, unsigned 
 len)
 +{
 + const unsigned char *start, *end;
 + unsigned ret;
 +
 + switch(type) {
 +#define SITE(x)  case PARAVIRT_PATCH(x): start = start_##x; end = 
 end_##x; goto patch_site
 + SITE(irq_disable);
 + SITE(irq_enable);
 + SITE(restore_fl);
 + SITE(save_fl);
 + SITE(iret);
 + SITE(sysret);
 + SITE(swapgs);
 + SITE(read_cr2);
 + SITE(read_cr3);
 + SITE(write_cr3);
 + SITE(clts);
 + SITE(flush_tlb_single);
 + SITE(wbinvd);
 +#undef SITE
 +
 + patch_site:
 + ret = paravirt_patch_insns(insns, len, start, end);
 + break;
 +
 + case PARAVIRT_PATCH(make_pgd):
 + case PARAVIRT_PATCH(pgd_val):
 + case PARAVIRT_PATCH(make_pte):
 + case PARAVIRT_PATCH(pte_val):
 + case PARAVIRT_PATCH(make_pmd):
 + case PARAVIRT_PATCH(pmd_val):
 + case PARAVIRT_PATCH(make_pud):
 + case PARAVIRT_PATCH(pud_val):
 + /* These functions end up returning what
 +they're passed in the first argument */
   

Is this still true with 64-bit?  Either way, I don't think its worth
having this here.  The damage to codegen around all those sites has
already happened, and the additional cost of a noop direct call is
pretty trivial.  I think this is a nanooptimisation which risks more
problems than it could possibly be worth.

 + case PARAVIRT_PATCH(set_pte):
 + case PARAVIRT_PATCH(set_pmd):
 + case PARAVIRT_PATCH(set_pud):
 + case PARAVIRT_PATCH(set_pgd):
 + /* These functions end up storing the second
 +  * argument in the location pointed by the first */
 + ret = paravirt_patch_store_reg(insns, len);
 + break;
   

Ditto, really.  Do this in a later patch if it actually seems to help.

 +unsigned paravirt_patch_copy_reg(void *site, unsigned len)
 +{
 + unsigned char *mov = site;
 + if (len  3)
 + return len;
 +
 + /* This is mov %rdi, %rax */
 + *mov++ = 0x48;
 + *mov++ = 0x89;
 + *mov   = 0xf8;
 + return 3;
 +}
 +
 +unsigned paravirt_patch_store_reg(void *site, unsigned len)
 +{
 + unsigned char *mov = site;
 + if (len  3)
 + return len;
 +
 + /* This is mov %rsi, (%rdi) */
 + *mov++ = 0x48;
 + *mov++ = 0x89;
 + *mov   = 0x37;
 + return 3;
 +}
   

These seem excessively special-purpose.  Are their only uses the ones I
commented on above.

 +/*
 + * integers must be use with care here. They can break the PARAVIRT_PATCH(x)
 + * macro, that divides the offset in the structure by 8, to get a number
 + * associated with the hook. Dividing by four would be a solution, but it
 + * would limit the future growth of the structure if needed.
   

Why not just stick them at the end of the structure?


J

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 25/25] [PATCH] add paravirtualization support for x86_64

2007-08-09 Thread Jeremy Fitzhardinge
Glauber de Oliveira Costa wrote:
 +/*
 + * integers must be use with care here. They can break the 
 PARAVIRT_PATCH(x)
 + * macro, that divides the offset in the structure by 8, to get a number
 + * associated with the hook. Dividing by four would be a solution, but it
 + * would limit the future growth of the structure if needed.

   
 Why not just stick them at the end of the structure?
 

 Does it really matter?
   

Well, yes, if alignment is an issue.


J
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 25/25] [PATCH] add paravirtualization support for x86_64

2007-08-09 Thread Glauber de Oliveira Costa
  + case PARAVIRT_PATCH(make_pgd):
  + case PARAVIRT_PATCH(pgd_val):
  + case PARAVIRT_PATCH(make_pte):
  + case PARAVIRT_PATCH(pte_val):
  + case PARAVIRT_PATCH(make_pmd):
  + case PARAVIRT_PATCH(pmd_val):
  + case PARAVIRT_PATCH(make_pud):
  + case PARAVIRT_PATCH(pud_val):
  + /* These functions end up returning what
  +they're passed in the first argument */
 

 Is this still true with 64-bit?  Either way, I don't think its worth
 having this here.  The damage to codegen around all those sites has
 already happened, and the additional cost of a noop direct call is
 pretty trivial.  I think this is a nanooptimisation which risks more
 problems than it could possibly be worth.

No it is not. But it is just the comment that is broken. (I forgot to
update it). The case here, is that they put in rax what they receive
in rdi.

  + case PARAVIRT_PATCH(set_pte):
  + case PARAVIRT_PATCH(set_pmd):
  + case PARAVIRT_PATCH(set_pud):
  + case PARAVIRT_PATCH(set_pgd):
  + /* These functions end up storing the second
  +  * argument in the location pointed by the first */
  + ret = paravirt_patch_store_reg(insns, len);
  + break;
 

 Ditto, really.  Do this in a later patch if it actually seems to help.

Okay, I can remove them both.

  +/*
  + * integers must be use with care here. They can break the 
  PARAVIRT_PATCH(x)
  + * macro, that divides the offset in the structure by 8, to get a number
  + * associated with the hook. Dividing by four would be a solution, but it
  + * would limit the future growth of the structure if needed.
 

 Why not just stick them at the end of the structure?

Does it really matter?

-- 
Glauber de Oliveira Costa.
Free as in Freedom
http://glommer.net

The less confident you are, the more serious you have to act.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 25/25] [PATCH] add paravirtualization support for x86_64

2007-08-09 Thread Andi Kleen

 Hm.  So x86-64 doesn't make 64-bit pointers be 64-bit aligned?

The ABI does of course, although the penalty of not doing it on current
CPUs is only minor.

-Andi
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 25/25] [PATCH] add paravirtualization support for x86_64

2007-08-09 Thread Steven Rostedt
--
On Thu, 9 Aug 2007, Andi Kleen wrote:

  This has to match the normal C calling convention though, doesn't it?

 Native cli/sti/save/restore_flags are all only assembly and can be easily
 (in fact more easily than in C) written as pure assembler functions. Then
 you can use whatever calling convention you want.

I agree.
Should we make a paravirt_ops_asm.S file that can implement these native
funcions, and so we can get rid of the C functions only doing asm?


 While some paravirt implementations may have more complicated implementations
 i guess it's still a reasonable requirement to make them simple enough
 in pure assembler. If not they can use a trampoline, but that's hopefully
 not needed.

It works for lguest64.  I'm sure it should be no problem with other HVs.

-- Steve

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


[PATCH 25/25] [PATCH] add paravirtualization support for x86_64

2007-08-08 Thread Glauber de Oliveira Costa
This is finally, the patch we were all looking for. This
patch adds a paravirt.h header with the definition of paravirt_ops
struct. Also, it defines a bunch of inline functions that will
replace, or hook, the other calls. Every one of those functions
adds an entry in the parainstructions section (see vmlinux.lds.S).
Those entries can then be used to runtime-patch the paravirt_ops
functions.

paravirt.c contains implementations of paravirt functions that
are used natively, such as the native_patch. It also fill the
paravirt_ops structure with the whole lot of functions that
were (re)defined throughout this patch set.

There are also changes in asm-offsets.c. paravirt.h needs it
to find out the offsets into the structure of functions
such as irq_enable, used in assembly files.

The text in Kconfig is the same as i386 one.

Signed-off-by: Glauber de Oliveira Costa [EMAIL PROTECTED]
Signed-off-by: Steven Rostedt [EMAIL PROTECTED]
---
 arch/x86_64/Kconfig  |   11 +
 arch/x86_64/kernel/Makefile  |1 +
 arch/x86_64/kernel/asm-offsets.c |   14 +
 arch/x86_64/kernel/paravirt.c|  455 +++
 arch/x86_64/kernel/vmlinux.lds.S |6 +
 include/asm-x86_64/paravirt.h|  901 ++
 6 files changed, 1388 insertions(+), 0 deletions(-)

diff --git a/arch/x86_64/Kconfig b/arch/x86_64/Kconfig
index ffa0364..bfea34c 100644
--- a/arch/x86_64/Kconfig
+++ b/arch/x86_64/Kconfig
@@ -373,6 +373,17 @@ config NODES_SHIFT
 
 # Dummy CONFIG option to select ACPI_NUMA from drivers/acpi/Kconfig.
 
+config PARAVIRT
+   bool Paravirtualization support (EXPERIMENTAL)
+   depends on EXPERIMENTAL
+   help
+ Paravirtualization is a way of running multiple instances of
+ Linux on the same machine, under a hypervisor.  This option
+ changes the kernel so it can modify itself when it is run
+ under a hypervisor, improving performance significantly.
+ However, when run without a hypervisor the kernel is
+ theoretically slower.  If in doubt, say N.
+
 config X86_64_ACPI_NUMA
bool ACPI NUMA detection
depends on NUMA
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index ff5d8c9..120467f 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -38,6 +38,7 @@ obj-$(CONFIG_X86_VSMP)+= vsmp.o
 obj-$(CONFIG_K8_NB)+= k8.o
 obj-$(CONFIG_AUDIT)+= audit.o
 
+obj-$(CONFIG_PARAVIRT) += paravirt.o
 obj-$(CONFIG_MODULES)  += module.o
 obj-$(CONFIG_PCI)  += early-quirks.o
 
diff --git a/arch/x86_64/kernel/asm-offsets.c b/arch/x86_64/kernel/asm-offsets.c
index 778953b..a8ffc95 100644
--- a/arch/x86_64/kernel/asm-offsets.c
+++ b/arch/x86_64/kernel/asm-offsets.c
@@ -15,6 +15,9 @@
 #include asm/segment.h
 #include asm/thread_info.h
 #include asm/ia32.h
+#ifdef CONFIG_PARAVIRT
+#include asm/paravirt.h
+#endif
 
 #define DEFINE(sym, val) \
 asm volatile(\n- #sym  %0  #val : : i (val))
@@ -72,6 +75,17 @@ int main(void)
   offsetof (struct rt_sigframe32, uc.uc_mcontext));
BLANK();
 #endif
+#ifdef CONFIG_PARAVIRT
+#define ENTRY(entry) DEFINE(PARAVIRT_ ## entry, offsetof(struct paravirt_ops, 
entry))
+   ENTRY(paravirt_enabled);
+   ENTRY(irq_disable);
+   ENTRY(irq_enable);
+   ENTRY(sysret);
+   ENTRY(iret);
+   ENTRY(read_cr2);
+   ENTRY(swapgs);
+   BLANK();
+#endif
DEFINE(pbe_address, offsetof(struct pbe, address));
DEFINE(pbe_orig_address, offsetof(struct pbe, orig_address));
DEFINE(pbe_next, offsetof(struct pbe, next));
diff --git a/arch/x86_64/kernel/paravirt.c b/arch/x86_64/kernel/paravirt.c
new file mode 100644
index 000..a41c1c0
--- /dev/null
+++ b/arch/x86_64/kernel/paravirt.c
@@ -0,0 +1,455 @@
+/*  Paravirtualization interfaces
+Copyright (C) 2007 Glauber de Oliveira Costa and Steven Rostedt,
+Red Hat Inc.
+Based on i386 work by Rusty Russell.
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or
+(at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+*/
+#include linux/errno.h
+#include linux/module.h
+#include linux/efi.h
+#include linux/bcd.h
+#include linux/start_kernel.h
+
+#include asm/bug.h
+#include asm/paravirt.h
+#include asm/desc.h
+#include asm/setup.h
+#include asm/irq.h
+#include 

Re: [PATCH 25/25] [PATCH] add paravirtualization support for x86_64

2007-08-08 Thread Andi Kleen

 +config PARAVIRT
 +   bool Paravirtualization support (EXPERIMENTAL)

This should be hidden and selected by the clients as needed
(I already did this change on i386) 

Users know nothing about paravirt, they just know about Xen, lguest
etc.

Strictly you would at least need a !X86_VSMP dependency, but
with the vsmp change i requested that will be unnecessary

Is this really synced with the latest version of the i386 code?


 +#ifdef CONFIG_PARAVIRT
 +#include asm/paravirt.h
 +#endif


 +#include linux/errno.h
 +#include linux/module.h
 +#include linux/efi.h
 +#include linux/bcd.h
 +#include linux/start_kernel.h
 +
 +#include asm/bug.h
 +#include asm/paravirt.h
 +#include asm/desc.h
 +#include asm/setup.h
 +#include asm/irq.h
 +#include asm/delay.h
 +#include asm/fixmap.h
 +#include asm/apic.h
 +#include asm/tlbflush.h
 +#include asm/msr.h
 +#include asm/page.h
 +#include asm/pgtable.h
 +#include asm/proto.h
 +#include asm/e820.h
 +#include asm/time.h
 +#include asm/asm-offsets.h
 +#include asm/smp.h
 +#include asm/irqflags.h


Are the includes really all needed?


 + if (opfunc == NULL)
 + /* If there's no function, patch it with a ud2a (BUG) */
 + ret = paravirt_patch_insns(site, len, start_ud2a, end_ud2a);

This will actually give corrupted BUGs because you don't supply
the full inline BUG header. Perhaps another trap would be better.


 +EXPORT_SYMBOL(paravirt_ops);

Definitely _GPL at least.



 +extern struct paravirt_ops paravirt_ops;

Should be native_paravirt_ops I guess

 +
 + * This generates an indirect call based on the operation type number.

The macros here don't

 +static inline unsigned long read_msr(unsigned int msr)
 +{
 + int __err;

No need for __ in inlines

 +/* The paravirtualized I/O functions */
 +static inline void slow_down_io(void) {

I doubt this needs to be inline and it's large

 + __asm__ __volatile__(paravirt_alt(PARAVIRT_CALL)

No __*__ in new code please

 +  : =a(f)
 +  : paravirt_type(save_fl),
 +paravirt_clobber(CLBR_RAX)
 +  : memory, cc);
 + return f;
 +}
 +
 +static inline void raw_local_irq_restore(unsigned long f)
 +{
 + __asm__ __volatile__(paravirt_alt(PARAVIRT_CALL)
 +  :
 +  : D (f),

Have you investigated if a different input register generates better/smaller 
code? I would assume rdi to be usually used already for the caller's 
arguments so it will produce spilling

Similar for the rax return in the other functions.



-Andi
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 25/25] [PATCH] add paravirtualization support for x86_64

2007-08-08 Thread Steven Rostedt

On Wed, 8 Aug 2007, Andi Kleen wrote:

 Strictly you would at least need a !X86_VSMP dependency, but
 with the vsmp change i requested that will be unnecessary

 Is this really synced with the latest version of the i386 code?

Glauber started the paravirt ops 64 a second time around, from scratch
using the PVOPS of i386 as a base. But since we couldn't just take a the
PVOPS patch from i386 and apply it to x86_64, this was mainly done by
looking at i386 code and massaging it for x86_64. Somethings may have
slipped (and we may have been looking at different versions of PVOPS).
It's not easy trying to keep up with a moving target ;-)

-- Steve

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 25/25] [PATCH] add paravirtualization support for x86_64

2007-08-08 Thread Glauber de Oliveira Costa
On 8/8/07, Andi Kleen [EMAIL PROTECTED] wrote:

 Is this really synced with the latest version of the i386 code?
Roasted already commented on this. I will check out and change it here.


  +#ifdef CONFIG_PARAVIRT
  +#include asm/paravirt.h
  +#endif


  +#include linux/errno.h
  +#include linux/module.h
  +#include linux/efi.h
  +#include linux/bcd.h
  +#include linux/start_kernel.h
  +
  +#include asm/bug.h
  +#include asm/paravirt.h
  +#include asm/desc.h
  +#include asm/setup.h
  +#include asm/irq.h
  +#include asm/delay.h
  +#include asm/fixmap.h
  +#include asm/apic.h
  +#include asm/tlbflush.h
  +#include asm/msr.h
  +#include asm/page.h
  +#include asm/pgtable.h
  +#include asm/proto.h
  +#include asm/e820.h
  +#include asm/time.h
  +#include asm/asm-offsets.h
  +#include asm/smp.h
  +#include asm/irqflags.h


 Are the includes really all needed?
delay.h is not needed anymore. Most of them, could be maybe moved to
paravirt.c , which is the one that really needs all the native_
things. Yeah, it will be better code this way, will change.


  + if (opfunc == NULL)
  + /* If there's no function, patch it with a ud2a (BUG) */
  + ret = paravirt_patch_insns(site, len, start_ud2a, end_ud2a);

 This will actually give corrupted BUGs because you don't supply
 the full inline BUG header. Perhaps another trap would be better.

You mean this:
  +#include asm/bug.h
?


  +EXPORT_SYMBOL(paravirt_ops);

 Definitely _GPL at least.
Sure.


 Should be native_paravirt_ops I guess

makes sense.

  +
  + * This generates an indirect call based on the operation type number.

 The macros here don't


  +static inline unsigned long read_msr(unsigned int msr)
  +{
  + int __err;

 No need for __ in inlines
Right. Thanks.


  +/* The paravirtualized I/O functions */
  +static inline void slow_down_io(void) {

 I doubt this needs to be inline and it's large
In a second look, i386 have such a function in io.h because they need
slow_down_io in a bunch of I/O instructions. It seems that we do not.
Could we just get rid of it, then?

  + __asm__ __volatile__(paravirt_alt(PARAVIRT_CALL)

 No __*__ in new code please

Yup, will fix.

  +  : =a(f)
  +  : paravirt_type(save_fl),
  +paravirt_clobber(CLBR_RAX)
  +  : memory, cc);
  + return f;
  +}
  +
  +static inline void raw_local_irq_restore(unsigned long f)
  +{
  + __asm__ __volatile__(paravirt_alt(PARAVIRT_CALL)
  +  :
  +  : D (f),

 Have you investigated if a different input register generates better/smaller
 code? I would assume rdi to be usually used already for the caller's
 arguments so it will produce spilling

 Similar for the rax return in the other functions.
I don't think we can do different. These functions can be patched, and
if it happens, they will put their return value in rax. So we'd better
expect it there.
Same goes for rdi, as they will expect the value to be there as an input.

I don't think it will spill in the normal case, as rdi is already the
parameter. So the compiler will just leave it there, untouched.

-- 
Glauber de Oliveira Costa.
Free as in Freedom
http://glommer.net

The less confident you are, the more serious you have to act.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 25/25] [PATCH] add paravirtualization support for x86_64

2007-08-08 Thread Andi Kleen
On Thursday 09 August 2007 01:18:57 Rusty Russell wrote:
 On Wed, 2007-08-08 at 11:49 -0300, Glauber de Oliveira Costa wrote:
  On 8/8/07, Andi Kleen [EMAIL PROTECTED] wrote:
+EXPORT_SYMBOL(paravirt_ops);
  
   Definitely _GPL at least.
  Sure.
 
 We ended up making it EXPORT_SYMBOL in i386 because every driver wants
 to save and restore interrupt state.

Ah true.
 
 But questionably-licensed drivers might be less of a concern on x86-64.

Nvidia/ATI and other binary modules exist too and users will probably unhappy 
if they cannot run them anymore.

But at usually irq state changes should be patched in anyways and won't
need paravirt I guess? 

Hmm, actually thinking about it the module loader probably has no clue
that the relocation it linked will be overwritten so it'll check
for the export anyways.

So the alternatives would be to add ugly hacks to the module loader
or split paravirt_ops in common and low level system areas or 
export it as a normal export.

Not sure what's best. Ok using a normal export is easiest and not
that big an issue.

-Andi
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization