Re: [kvm-devel] The SMP RHEL 5.1 PAE guest can't boot up issue

2008-02-29 Thread Zhao Forrest

  I believe the patch is still necessary, since we still need to guarantee
  that a vcpu's tsc is monotonic.  I think there are three issues to be
  addressed:

  1. The majority of intel machines don't need the offset adjustment since
  they already have a constant rate tsc that is synchronized on all cpus.
  I think this is indicated by X86_FEATURE_CONSTANT_TSC (though I'm not
  100% certain if it means that the rate is the same for all cpus, Thomas
  can you clarify?)

  This will improve tsc quality for those machines, but we can't depend on
  it, since some machines don't have constant tsc.  Further, I don't think
  really large machines can have constant tsc since clock distribution
  becomes difficult or impossible.

I have another newbie question: can the current Linux kernel handle an
unsynced TSC? If the kernel can't handle this case, then it is still a
problem to run Linux on hardware with unsynced TSCs.

Thanks,
Forrest



[kvm-devel] Can Linux kernel handle unsynced TSC?

2008-02-29 Thread Zhao Forrest
For example,
1 rdtsc() is invoked on CPU0
2 process is migrated to CPU1, and rdtsc() is invoked on CPU1
3 if the TSC on CPU1 is slower than the TSC on CPU0, can the kernel guarantee
that the second rdtsc() doesn't return a value smaller than the one
returned by the first rdtsc()?

Thanks,
Forrest



[kvm-devel] Can Linux kernel handle unsynced TSC?

2008-02-29 Thread Zhao Forrest
Sorry for reposting it.

For example,
1 rdtsc() is invoked on CPU0
2 process is migrated to CPU1, and rdtsc() is invoked on CPU1
3 if the TSC on CPU1 is slower than the TSC on CPU0, can the kernel guarantee
that the second rdtsc() doesn't return a value smaller than the one
returned by the first rdtsc()?

Thanks,
Forrest



Re: [kvm-devel] Can Linux kernel handle unsynced TSC?

2008-02-29 Thread Peter Zijlstra

On Fri, 2008-02-29 at 16:55 +0800, Zhao Forrest wrote:
 Sorry for reposting it.
 
 For example,
 1 rdtsc() is invoked on CPU0
 2 process is migrated to CPU1, and rdtsc() is invoked on CPU1
 3 if TSC on CPU1 is slower than TSC on CPU0, can kernel guarantee
 that the second rdtsc() doesn't return a value smaller than the one
 returned by the first rdtsc()?

No, rdtsc() goes directly to the hardware. You need a (preferably cheap)
clock abstraction layer on top if you need this.
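To make the distinction concrete (a minimal userspace sketch, not part of the
original mail: the rdtsc() and monotonic_ns() helper names are just for
illustration; clock_gettime(CLOCK_MONOTONIC) is the standard POSIX call and
needs -lrt on older glibc):

#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Raw TSC read -- "goes directly to the hardware": values read on
 * different CPUs are not comparable if the TSCs are unsynced, so a
 * process migrating between CPUs can see the value jump backwards. */
static inline uint64_t rdtsc(void)
{
        uint32_t lo, hi;
        asm volatile("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
}

/* One possible abstraction layer usable from userspace: the kernel's
 * monotonic clock, which is guaranteed never to go backwards. */
static uint64_t monotonic_ns(void)
{
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

int main(void)
{
        printf("tsc=%llu mono=%llu ns\n",
               (unsigned long long)rdtsc(),
               (unsigned long long)monotonic_ns());
        return 0;
}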




[kvm-devel] I/O bandwidth control on KVM

2008-02-29 Thread Ryo Tsuruta
Hello all,

I've implemented a block device which throttles block I/O bandwidth,
called dm-ioband, and have been trying to throttle I/O bandwidth in a
KVM environment. Unfortunately it doesn't work well: the number of
issued I/Os does not follow the bandwidth setting.
On the other hand, I got good results when accessing the local disk
directly on the local machine.

I'm not so familiar with KVM. Could anyone give me any advice?

For dm-ioband details, please see the website at
http://people.valinux.co.jp/~ryov/dm-ioband/

                 The number of issued I/Os
 -------------------------------------------------
|     device              |   sda11   |   sda12   |
|     weight setting      |    80%    |    20%    |
|-------+-----------------+-----------+-----------|
|  KVM  | I/Os            |   4397    |   2902    |
|       | ratio to total  |   60.2%   |   39.8%   |
|-------+-----------------+-----------+-----------|
| local | I/Os            |   5447    |   1314    |
|       | ratio to total  |   80.6%   |   19.4%   |
 -------------------------------------------------

The test environment and the procedure are as follows:

  o Prepare two partitions, sda11 and sda12.
  o Create two bandwidth control devices, mapped to sda11 and sda12
    respectively.
  o Give weights of 80 and 20 to the two bandwidth control devices
    respectively.
  o Run two virtual machines, each virtual machine's disk mapped to one
    of the bandwidth control devices.
  o Run 128 processes issuing random read/write direct I/O with 4KB data
    on each virtual machine at the same time (a sketch of one such
    process follows this list).
  o Count the number of I/Os completed in 60 seconds.
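For reference, a rough sketch of what one of the 128 I/O processes could
look like (the real test harness is not shown in this mail; the device
path, the read-only mix and the fixed 60-second run length here are
placeholders):

#define _GNU_SOURCE
#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        const char *dev = argc > 1 ? argv[1] : "/dev/vda1"; /* placeholder */
        const size_t bs = 4096;                 /* 4KB per I/O, as above   */
        time_t end = time(NULL) + 60;           /* count I/Os done in 60s  */
        long long count = 0;
        void *buf;
        int fd;

        fd = open(dev, O_RDONLY | O_DIRECT);    /* direct I/O, bypass page cache */
        if (fd < 0 || posix_memalign(&buf, 4096, bs))
                return 1;

        off_t blocks = lseek(fd, 0, SEEK_END) / bs;
        srand(getpid());

        while (time(NULL) < end) {
                /* random 4KB-aligned offset within the device */
                off_t off = (off_t)(rand() % blocks) * bs;
                if (pread(fd, buf, bs, off) == (ssize_t)bs)
                        count++;
        }
        printf("%lld I/Os\n", count);
        close(fd);
        return 0;
}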

Access through KVM:

  Virtual Machine 1 (VM1), in cgroup ioband1, and Virtual Machine 2 (VM2),
  in cgroup ioband2, each run 128 read/write processes with O_DIRECT
  against their virtual disk /dev/vda1.  VM1's disk is backed by
  /dev/mapper/ioband1 (80% for cgroup ioband1) on /dev/sda11, and VM2's
  disk by /dev/mapper/ioband2 (20% for cgroup ioband2) on /dev/sda12.
  dm-ioband controls the I/O bandwidth according to the cgroup of the
  issuing tasks.

Direct access:

  The same workloads run directly on the host: 128 O_DIRECT read/write
  processes in cgroup ioband1 against /dev/mapper/ioband1 (80%, on
  /dev/sda11) and 128 processes in cgroup ioband2 against
  /dev/mapper/ioband2 (20%, on /dev/sda12), again with dm-ioband
  controlling the bandwidth according to the cgroup of the issuing tasks.

Thanks,
Ryo Tsuruta



[kvm-devel] FW: KVM Test result, kernel 4a7f582.., userspace bc6db37..

2008-02-29 Thread Zhao, Yunfeng
Zhao, Yunfeng wrote:
 Hi, all,
 This is today's KVM test result against kvm.git
 4a7f582a07e14763ee4714b681e98b3b134d1d46 and kvm-userspace.git
 bc6db37817ce749dcc88fbc761a36bb8df5cf60a.
 The LTP and kernel build tests on the PAE Linux guest failed,
 because these cases boot guests with an SMP 2.6.9 kernel; this is
 related to today's new issue.
 In manual testing, save/restore showed no problem the first time.

 Save/restore test cases passed in manual testing. Because the
 command has been changed, they failed in automated testing. We
 will update the test cases.
 
 One new issue:
 1. Can not boot guests with 2.6.9 smp pae kernel
 https://sourceforge.net/tracker/index.php?func=detail&aid=1903732&group_id=180599&atid=893831
 
We suspect this issue is caused by this commit:
 kvm: bios: mark extra cpus as present
kvm-userspace: 538c90271b9431f8c7f2ebfdffdab07749b97d86



Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Andrea Arcangeli
On Thu, Feb 28, 2008 at 04:59:59PM -0800, Christoph Lameter wrote:
 And thus the device driver may stop receiving data on a UP system? It will 
 never get the ack.

Not sure I follow, sorry.

My idea was:

   post the invalidate in the mmio region of the device
   smp_call_function()
   while (mmio device wait-bitflag is on);

Instead of the current:

   smp_call_function()
   post the invalidate in the mmio region of the device
   while (mmio device wait-bitflag is on);

To decrease the wait loop time.
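In code form, the proposed ordering would look roughly like this (a sketch
only: struct mydev, mmio_post_invalidate(), mmio_invalidate_pending() and
local_tlb_flush_ipi() are made-up names for illustration; smp_call_function()
is shown in its older 4-argument form of that era):

static void local_tlb_flush_ipi(void *info);	/* IPI handler, defined elsewhere */

static void flush_secondary_tlbs(struct mydev *dev,
				 unsigned long start, unsigned long end)
{
	/* 1. kick off the device-side invalidate first ... */
	mmio_post_invalidate(dev, start, end);

	/* 2. ... then IPI the other CPUs ... */
	smp_call_function(local_tlb_flush_ipi, NULL, 0, 1);

	/* 3. ... and only then spin until the device signals completion,
	 *    so most of the device latency is hidden behind the IPIs. */
	while (mmio_invalidate_pending(dev))
		cpu_relax();
}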

 invalidate_page_before/end could be realized as an 
 invalidate_range_begin/end on a page sized range?

If we go this route, once you add support to xpmem, you'll have to
make the anon_vma lock a mutex too, that would be fine with me
though. The main reason invalidate_page exists, is to allow you to
leave it as non-sleep-capable even after you make invalidate_range
sleep capable, and to implement the mmu_rmap_notifiers sleep capable
in all the paths that invalidate_page would be called. That was the
strategy you had in your patch. I'll try to drop invalidate_page. I
wonder if then you won't need the mmu_rmap_notifiers anymore.



Re: [kvm-devel] [PATCH] mmu notifiers #v7

2008-02-29 Thread Andrea Arcangeli
On Thu, Feb 28, 2008 at 05:03:01PM -0800, Christoph Lameter wrote:
 I thought you wanted to get rid of the sync via pte lock?

Sure. _notify is happening inside the pt lock by coincidence, to
reduce the changes to mm/* as long as the mmu notifiers aren't
sleep capable.

 What changes to do_wp_page do you envision?

Converting it to invalidate_range_begin/end.

 What is the trouble with the current do_wp_page modifications? There is 
 no need for invalidate_page() there so far. invalidate_range() does the 
 trick there.

No trouble. It's just that I didn't want to mangle the logic of
do_wp_page unless it was strictly required; the patch has to be
obviously safe. You need to keep that bit of your patch to make the
mmu notifiers sleepable.



Re: [kvm-devel] Can Linux kernel handle unsynced TSC?

2008-02-29 Thread Zhao Forrest
On 2/29/08, Peter Zijlstra [EMAIL PROTECTED] wrote:

 On Fri, 2008-02-29 at 16:55 +0800, Zhao Forrest wrote:
  Sorry for reposting it.
 
  For example,
  1 rdtsc() is invoked on CPU0
  2 process is migrated to CPU1, and rdtsc() is invoked on CPU1
  3 if TSC on CPU1 is slower than TSC on CPU0, can kernel guarantee
  that the second rdtsc() doesn't return a value smaller than the one
  returned by the first rdtsc()?

 No, rdtsc() goes directly to the hardware. You need a (preferably cheap)
 clock abstraction layer on top if you need this.

Thank you for the clarification. I think gettimeofday() is such a
clock abstraction layer, am I right?



[kvm-devel] catch vmentry failure (was enable gfxboot on VMX)

2008-02-29 Thread Guillaume Thouvenin
On Mon, 18 Feb 2008 10:39:31 +0100
Alexander Graf [EMAIL PROTECTED] wrote:


  So if you want to see a VMentry failure, just remove the SS patching
  and you'll see one. My guess would be that you see a lot of problems
  with otherwise working code too then, though, as SS can be anything in
  that state.

So I ran some tests and you were right: removing the SS patching
showed a VM entry failure, but it also generated lots of problems. Thus
I tried to modify the code a little, and with the following patch (see
the end of the email) I can detect VM entry failures without generating
other problems. It works when you use a distribution that is
big-real-mode free. I pasted the patch just to show the idea.

It's interesting because we can continue to use virtual mode for the
majority of distributions, and when a VM entry failure is detected it
means we need to switch from virtual mode to full real mode emulation.
Such a failure is caught in handle_vmentry_failure() with the patch
applied. If it's doable, the next step is to modify the SS segment
selector so that the VM entry succeeds, and to switch from virtual mode
to real mode emulation; that could be done in
handle_vmentry_failure(). Does it make sense?

Regards,
Guillaume

---

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 46e0e58..c2c3897 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1166,15 +1166,13 @@ static void enter_pmode(struct kvm_vcpu *vcpu)
 			(vmcs_readl(CR4_READ_SHADOW) & X86_CR4_VME));
 
 	update_exception_bitmap(vcpu);
-
+
+	fix_pmode_dataseg(VCPU_SREG_SS, &vcpu->arch.rmode.ss);
 	fix_pmode_dataseg(VCPU_SREG_ES, &vcpu->arch.rmode.es);
 	fix_pmode_dataseg(VCPU_SREG_DS, &vcpu->arch.rmode.ds);
 	fix_pmode_dataseg(VCPU_SREG_GS, &vcpu->arch.rmode.gs);
 	fix_pmode_dataseg(VCPU_SREG_FS, &vcpu->arch.rmode.fs);
 
-	vmcs_write16(GUEST_SS_SELECTOR, 0);
-	vmcs_write32(GUEST_SS_AR_BYTES, 0x93);
-
 	vmcs_write16(GUEST_CS_SELECTOR,
 		     vmcs_read16(GUEST_CS_SELECTOR) & ~SELECTOR_RPL_MASK);
 	vmcs_write32(GUEST_CS_AR_BYTES, 0x9b);
@@ -1228,20 +1226,12 @@ static void enter_rmode(struct kvm_vcpu *vcpu)
 	vmcs_writel(GUEST_CR4, vmcs_readl(GUEST_CR4) | X86_CR4_VME);
 	update_exception_bitmap(vcpu);
 
-	vmcs_write16(GUEST_SS_SELECTOR, vmcs_readl(GUEST_SS_BASE) >> 4);
-	vmcs_write32(GUEST_SS_LIMIT, 0xffff);
-	vmcs_write32(GUEST_SS_AR_BYTES, 0xf3);
-
-	vmcs_write32(GUEST_CS_AR_BYTES, 0xf3);
-	vmcs_write32(GUEST_CS_LIMIT, 0xffff);
-	if (vmcs_readl(GUEST_CS_BASE) == 0xffff0000)
-		vmcs_writel(GUEST_CS_BASE, 0xf0000);
-	vmcs_write16(GUEST_CS_SELECTOR, vmcs_readl(GUEST_CS_BASE) >> 4);
-
+	fix_rmode_seg(VCPU_SREG_CS, &vcpu->arch.rmode.cs);
 	fix_rmode_seg(VCPU_SREG_ES, &vcpu->arch.rmode.es);
 	fix_rmode_seg(VCPU_SREG_DS, &vcpu->arch.rmode.ds);
 	fix_rmode_seg(VCPU_SREG_GS, &vcpu->arch.rmode.gs);
 	fix_rmode_seg(VCPU_SREG_FS, &vcpu->arch.rmode.fs);
+	fix_rmode_seg(VCPU_SREG_SS, &vcpu->arch.rmode.ss);
 
 	kvm_mmu_reset_context(vcpu);
 	init_rmode_tss(vcpu->kvm);
@@ -2257,6 +2247,39 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu,
 static const int kvm_vmx_max_exit_handlers =
 	ARRAY_SIZE(kvm_vmx_exit_handlers);
 
+static int handle_vmentry_failure(u32 exit_reason, struct kvm_vcpu *vcpu)
+{
+	unsigned long exit_qualification = vmcs_read64(EXIT_QUALIFICATION);
+	u32 info_field = vmcs_read32(VMX_INSTRUCTION_INFO);
+	unsigned int basic_exit_reason = (uint16_t)exit_reason;
+
+	printk("%s: exit reason 0x%x\n", __FUNCTION__, exit_reason);
+	printk("%s: vmentry failure reason %u\n", __FUNCTION__, basic_exit_reason);
+	printk("%s: VMX-instruction Information field 0x%x\n", __FUNCTION__,
+	       info_field);
+
+	switch (basic_exit_reason) {
+	case EXIT_REASON_INVALID_GUEST_STATE:
+		printk("caused by invalid guest state (%ld).\n", exit_qualification);
+		/* At this point we need to modify the SS selector to pass the
+		 * vmentry test.  This modification prevents the use of virtual
+		 * mode to emulate real mode, so we need to switch to big real
+		 * mode emulation with something like:
+		 *	vcpu->arch.rmode.emulate = 1
+		 */
+		break;
+	case EXIT_REASON_MSR_LOADING:
+		printk("caused by MSR entry %ld loading.\n", exit_qualification);
+		break;
+	case EXIT_REASON_MACHINE_CHECK:
+		printk("caused by machine check.\n");
+		break;
+	default:
+		printk("reason not known yet!\n");
+		break;
+	}
+	return 0;
+}
+
 /*
  * The guest has exited.  See if we can 
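For reference, a sketch (not part of the patch above) of how such a handler
could be reached from the exit path: on VMX, bit 31 of the exit reason field
indicates a failed VM entry, so the exit dispatcher can test it before the
normal handler table. The kvm_handle_exit() shape and VM_EXIT_REASON read are
assumed from the surrounding kvm code of that era.

static int kvm_handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
{
	u32 exit_reason = vmcs_read32(VM_EXIT_REASON);

	/* Bit 31 set means the VM entry itself failed, so none of the
	 * normal exit handlers apply; divert to the new handler. */
	if (unlikely(exit_reason & 0x80000000))
		return handle_vmentry_failure(exit_reason, vcpu);

	/* ... otherwise fall through to the usual
	 * kvm_vmx_exit_handlers[] dispatch ... */
	return 1;
}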

Re: [kvm-devel] Can Linux kernel handle unsynced TSC?

2008-02-29 Thread Peter Zijlstra

On Fri, 2008-02-29 at 22:20 +0800, Zhao Forrest wrote:
 On 2/29/08, Peter Zijlstra [EMAIL PROTECTED] wrote:
 
  On Fri, 2008-02-29 at 16:55 +0800, Zhao Forrest wrote:
   Sorry for reposting it.
  
   For example,
   1 rdtsc() is invoked on CPU0
   2 process is migrated to CPU1, and rdtsc() is invoked on CPU1
   3 if TSC on CPU1 is slower than TSC on CPU0, can kernel guarantee
   that the second rdtsc() doesn't return a value smaller than the one
   returned by the first rdtsc()?
 
  No, rdtsc() goes directly to the hardware. You need a (preferably cheap)
  clock abstraction layer on top if you need this.
 
 Thank you for the clarification. I think gettimeofday() is such kind
 of clock abstraction layer, am I right?

Yes, gtod is one such layer; however, it fails the 'cheap' test for
many definitions of cheap.
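A rough way to see the "not cheap" part for yourself (illustrative only; the
absolute numbers depend on the selected clocksource and on whether the
vsyscall/vDSO path is used):

#include <stdio.h>
#include <sys/time.h>

int main(void)
{
        struct timeval start, end, tv;
        const int iters = 1000000;
        int i;

        gettimeofday(&start, NULL);
        for (i = 0; i < iters; i++)
                gettimeofday(&tv, NULL);        /* call under test */
        gettimeofday(&end, NULL);

        double us = (end.tv_sec - start.tv_sec) * 1e6 +
                    (end.tv_usec - start.tv_usec);
        printf("gettimeofday: ~%.0f ns per call\n", us * 1000.0 / iters);
        return 0;
}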




Re: [kvm-devel] [Qemu-devel] [PATCH] USB 2.0 EHCI emulation

2008-02-29 Thread Gerb Stralko
On Fri, Feb 29, 2008 at 2:33 AM, Arnon Gilboa [EMAIL PROTECTED] wrote:
 In hw/pc.c, replace usb_uhci_piix3_init(pci_bus, piix3_devfn + 2);
  With usb_ehci_init(pci_bus, piix3_devfn + 2);

With these changes I can't add USB devices anymore to a Windows
XP (32 bit) guest.

This is the command I use to start kvm:
/usr/local/bin/kvm/qemu-system-x86_64 -localtime -m 512 -usb -hda win32xp.img

To add a USB device I normally go to the qemu console and type:
info usbhost
find the number for the device I want to connect to, then
usb_add host:03f0:01cda

But with your patch, when I try to add a USB device I get:
Could not add 'USB device host:03f0:01cda'

Since I'm using EHCI emulation, do I need to add USB devices in a
different way? Or should it work exactly the same way?

Thanks,

Jerry

  Note my comments on the original post:
  -tested on XP guest
  -does not support ISO transfers
  -timing issues



  -Original Message-
  From: Gerb Stralko [mailto:[EMAIL PROTECTED]
  Sent: Thursday, February 28, 2008 9:46 PM
  To: Arnon Gilboa
  Cc: [EMAIL PROTECTED]; kvm-devel@lists.sourceforge.net
  Subject: Re: [kvm-devel] [Qemu-devel] [PATCH] USB 2.0 EHCI emulation

Attached is a repost of the preliminary patch implementing USB 2.0
   EHCI  emulation.

  I want to start testing your patches for the EHCI stuff.   Do i need
  to enable anything inorder to get EHCI emulation working after applying
  your patch?

  Unfortunately, with this patch it doesn't work for me.  My guest host
  (windows vista) still became really slow when I add the a usb device.
  
Waiting for your comments,
Arnon
  

  Thanks,

  Jerry






[kvm-devel] KVM-61/62 build fails on SLES 10

2008-02-29 Thread M.J. Rutter
Whereas KVM-60 builds out of the box on SLES 10 SP1 (assuming gcc 3.4 is 
installed), KVM-61 and KVM-62 don't. They fail with:

make[1]: Entering directory `/scratch/KVM/kvm-61/kernel'
# include header priority 1) LINUX 2) KERNELDIR 3) include-compat
make -C /lib/modules/2.6.16.54-0.2.5-smp/build M=`pwd` \
LINUXINCLUDE=-I`pwd`/include -Iinclude -I`pwd`/include-compat \
-include include/linux/autoconf.h \
$@
make[2]: Entering directory 
`/usr/src/linux-2.6.16.54-0.2.5-obj/x86_64/smp'
make -C ../../../linux-2.6.16.54-0.2.5 
O=../linux-2.6.16.54-0.2.5-obj/x86_64/smp
  LD  /scratch/KVM/kvm-61/kernel/built-in.o
  CC [M]  /scratch/KVM/kvm-61/kernel/svm.o
In file included from command line:1:
/scratch/KVM/kvm-61/kernel/external-module-compat.h:10:28: error: 
linux/compiler.h: No such file or directory
/scratch/KVM/kvm-61/kernel/external-module-compat.h:12:26: error: 
linux/string.h: No such file or directory

Trying to fiddle the include path to ensure that it finds 
/usr/src/linux/include then produces an error for linux/clocksource.h.

SLES 10 SP1 uses a kernel whose version is 2.6.16.54-0.2.5-smp, i.e. 
2.6.16 plus various back-ported bits. However, SLES 10 SP1 is the current 
version of SuSE Linux Enterprise Server, so in some sense this is current.

KVM was configured with

./configure --prefix=/usr/local/kvm/kvm-61 \
  --qemu-cc=/scratch/gcc-3.4/bin/gcc-3.4

Michael



Re: [kvm-devel] KVM-61/62 build fails on SLES 10

2008-02-29 Thread M.J. Rutter
On Fri, 29 Feb 2008, M.J. Rutter wrote:

 Whereas KVM-60 builds out of the box on SLES 10 SP1 (assuming gcc 3.4 is 
 installed), KVM-61 and KVM-62 don't.

Bother. Ignore that. As far as I can see, no KVM since about KVM-37 has 
actually run on a kernel that old, due to the lack of hrtimer_init and 
friends. KVM-60 may build, but it certainly doesn't run, so KVM-61/62's 
inability to build is of no consequence.

Michael



[kvm-devel] 64bit host performance

2008-02-29 Thread gerryw
Hello All,

Is there a significant performance advantage with using a 64bit host os? I 
am specifically wondering about the advantages where KVM and QEMU are 
concerned.

Thanks in advance,
-G


Re: [kvm-devel] [PATCH] mmu notifiers #v7

2008-02-29 Thread Christoph Lameter
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:

 On Thu, Feb 28, 2008 at 05:03:01PM -0800, Christoph Lameter wrote:
  I thought you wanted to get rid of the sync via pte lock?
 
 Sure. _notify is happening inside the pt lock by coincidence, to
 reduce the changes to mm/* as long as the mmu notifiers aren't
 sleep capable.

Ok if this is a coincidence then it would be better to separate the 
notifier callouts from the pte macro calls.



Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Christoph Lameter
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:

 On Thu, Feb 28, 2008 at 04:59:59PM -0800, Christoph Lameter wrote:
  And thus the device driver may stop receiving data on a UP system? It will 
  never get the ack.
 
 Not sure to follow, sorry.
 
 My idea was:
 
post the invalidate in the mmio region of the device
smp_call_function()
while (mmio device wait-bitflag is on);

So the device driver on UP can only operate through interrupts? If you are 
hogging the only cpu then driver operations may not be possible.

  invalidate_page_before/end could be realized as an 
  invalidate_range_begin/end on a page sized range?
 
 If we go this route, once you add support to xpmem, you'll have to
 make the anon_vma lock a mutex too, that would be fine with me
 though. The main reason invalidate_page exists, is to allow you to
 leave it as non-sleep-capable even after you make invalidate_range
 sleep capable, and to implement the mmu_rmap_notifiers sleep capable
 in all the paths that invalidate_page would be called. That was the
 strategy you had in your patch. I'll try to drop invalidate_page. I
 wonder if then you won't need the mmu_rmap_notifiers anymore.

I am mainly concerned with making the mmu notifier a generally useful 
feature for multiple users. Xpmem is one example of a different user. It 
should be considered as one example of a different type of callback user. 
It is not the gold standard that you make it to be. RDMA is another and 
there are likely scores of others (DMA engines etc) once it becomes clear 
that such a feature is available. In general the mmu notifier will allow
us to fix the problems caused by memory pinning and mlock by various
devices and other mechanisms that need to access memory directly.

And yes I would like to get rid of the mmu_rmap_notifiers altogether. It 
would be much cleaner with just one mmu_notifier that can sleep in all 
functions.




Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Andrea Arcangeli
On Fri, Feb 29, 2008 at 11:55:17AM -0800, Christoph Lameter wrote:
 post the invalidate in the mmio region of the device
 smp_call_function()
 while (mmio device wait-bitflag is on);
 
 So the device driver on UP can only operate through interrupts? If you are 
 hogging the only cpu then driver operations may not be possible.

There was no irq involved in the above pseudocode; the irq, if
anything, would run on the remote system. Still, irqs can run fine
during the while loop, just as they run fine on top of
smp_call_function. The send-irq followed by the spin-on-a-bitflag
works exactly like smp_call_function, except that the thing being
invalidated isn't a CPU on the NUMA system.

 And yes I would like to get rid of the mmu_rmap_notifiers altogether. It 
 would be much cleaner with just one mmu_notifier that can sleep in all 
 functions.

Agreed. I just thought xpmem needed an invalidate-by-page, but
I'm glad if xpmem can go in sync with the KVM/GRU/DRI model in this
regard.



Re: [kvm-devel] 64bit host performance

2008-02-29 Thread izik eidus
Quoting [EMAIL PROTECTED]:

 Hello All,

 Is there a significant performance advantage with using a 64bit host 
 os? I am specifically wondering about the advantages where KVM and 
 QEMU are concerned.

the mmu code (the page table entry pointers are 64 bits) would run
faster on a 64-bit host;
I think this should be the main difference.


 Thanks in advance,
 -G


Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Christoph Lameter
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:

 Agreed. I just thought xpmem needed an invalidate-by-page, but
 I'm glad if xpmem can go in sync with the KVM/GRU/DRI model in this
 regard.

That means we need both the anon_vma locks and the i_mmap_lock to become 
semaphores. I think semaphores are better than mutexes. Rik and Lee saw 
some performance improvements because the list can be traversed in parallel 
when the anon_vma lock is switched to a rw lock.

Sounds like we get to a conceptually clean version here?




Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Christoph Lameter
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:

 On Fri, Feb 29, 2008 at 01:03:16PM -0800, Christoph Lameter wrote:
  That means we need both the anon_vma locks and the i_mmap_lock to become 
  semaphores. I think semaphores are better than mutexes. Rik and Lee saw 
  some performance improvements because list can be traversed in parallel 
  when the anon_vma lock is switched to be a rw lock.
 
 The improvement was with a rw spinlock IIRC, so I don't see how it's
 related to this.

AFAICT The rw semaphore fastpath is similar in performance to a rw 
spinlock. 
 
 Perhaps the rwlock spinlock can be changed to a rw semaphore without
 measurable overscheduling in the fast path. However theoretically

Overscheduling? You mean overhead?

 speaking the rw_lock spinlock is more efficient than a rw semaphore in
 case of a little contention during the page fault fast path because
 the critical section is just a list_add so it'd be overkill to
 schedule while waiting. That's why currently it's a spinlock (or rw
 spinlock).

On the other hand a semaphore puts the process to sleep and may actually 
improve performance because there is less time spend in a busy loop. 
Other processes may do something useful and we stay off the contended 
cacheline reducing traffic on the interconnect.
 
 preempt-rt runs quite a bit slower, or we could rip spinlocks out of
 the kernel in the first place ;)

The question is why that is the case. It seems that there are issues 
with interrupt on/off that are important here and particularly significant 
with the SLAB allocator (there are significant hacks there to deal with that 
issue). The fastpath that we have in the works for SLUB may address a large 
part of that issue because it no longer relies on disabling interrupts.



Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Andrea Arcangeli
On Fri, Feb 29, 2008 at 01:34:34PM -0800, Christoph Lameter wrote:
 On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
 
  On Fri, Feb 29, 2008 at 01:03:16PM -0800, Christoph Lameter wrote:
   That means we need both the anon_vma locks and the i_mmap_lock to become 
   semaphores. I think semaphores are better than mutexes. Rik and Lee saw 
   some performance improvements because list can be traversed in parallel 
   when the anon_vma lock is switched to be a rw lock.
  
  The improvement was with a rw spinlock IIRC, so I don't see how it's
  related to this.
 
 AFAICT The rw semaphore fastpath is similar in performance to a rw 
 spinlock. 

read side is taken in the slow path.

write side is taken in the fast path.

pagefault is fast path, VM during swapping is slow path.

  Perhaps the rwlock spinlock can be changed to a rw semaphore without
  measurable overscheduling in the fast path. However theoretically
 
 Overscheduling? You mean overhead?

The only possible overhead that a rw semaphore could ever generate vs
a rw lock is overscheduling.

  speaking the rw_lock spinlock is more efficient than a rw semaphore in
  case of a little contention during the page fault fast path because
  the critical section is just a list_add so it'd be overkill to
  schedule while waiting. That's why currently it's a spinlock (or rw
  spinlock).
 
 On the other hand a semaphore puts the process to sleep and may actually 
 improve performance because there is less time spend in a busy loop. 
 Other processes may do something useful and we stay off the contended 
 cacheline reducing traffic on the interconnect.

Yes, that's the positive side. The negative side is that you'll put
the task in uninterruptible sleep, call schedule() and require a
wakeup, because a list_add taking less than 1usec is running on the
other cpu. No other downside. But that's the only reason it's a
spinlock right now; in fact there can't be any other reason.
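For readers following along, the fast path being discussed looks roughly like
this (an approximation of the ~2.6.24-era mm/rmap.c anon_vma_link(), not a
verbatim quote of any particular tree):

/* The whole critical section is a single list_add under the
 * anon_vma spinlock -- the lock that would become a semaphore. */
void anon_vma_link(struct vm_area_struct *vma)
{
	struct anon_vma *anon_vma = vma->anon_vma;

	if (anon_vma) {
		spin_lock(&anon_vma->lock);
		list_add_tail(&vma->anon_vma_node, &anon_vma->head);
		spin_unlock(&anon_vma->lock);
	}
}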



Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Andrea Arcangeli
On Fri, Feb 29, 2008 at 01:03:16PM -0800, Christoph Lameter wrote:
 That means we need both the anon_vma locks and the i_mmap_lock to become 
 semaphores. I think semaphores are better than mutexes. Rik and Lee saw 
 some performance improvements because list can be traversed in parallel 
 when the anon_vma lock is switched to be a rw lock.

The improvement was with a rw spinlock IIRC, so I don't see how it's
related to this.

Perhaps the rwlock spinlock can be changed to a rw semaphore without
measurable overscheduling in the fast path. However theoretically
speaking the rw_lock spinlock is more efficient than a rw semaphore in
case of a little contention during the page fault fast path because
the critical section is just a list_add so it'd be overkill to
schedule while waiting. That's why currently it's a spinlock (or rw
spinlock).

 Sounds like we get to a conceptually clean version here?

I don't have a strong opinion if it should become a semaphore
unconditionally or only with a CONFIG_XPMEM=y. But keep in mind
preempt-rt runs quite a bit slower, or we could rip spinlocks out of
the kernel in the first place ;)



Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Andrea Arcangeli
On Fri, Feb 29, 2008 at 02:12:57PM -0800, Christoph Lameter wrote:
 On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
 
   AFAICT The rw semaphore fastpath is similar in performance to a rw 
   spinlock. 
  
  read side is taken in the slow path.
 
 Slowpath meaning VM slowpath or lock slow path? Its seems that the rwsem 

With slow path I meant the VM. Sorry if that was confusing given locks
also have fast paths (no contention) and slow paths (contention).

 read side path is pretty efficient:

Yes. The assembly doesn't worry me at all.

  pagefault is fast path, VM during swapping is slow path.
 
 Not sure what you are saying here. A pagefault should be considered as a 
 fast path and swapping is not performance critical?

Yes, swapping is I/O bound and it rarely becomes CPU hog in the common
case.

There are corner case workloads (including OOM) where swapping can
become cpu bound (that's also where rwlock helps). But certainly the
speed of fork() and a page fault, is critical for _everyone_, not just
a few workloads and setups.

 Ok too many calls to schedule() because the slow path (of the semaphore) 
 is taken?

Yes, that's the only possible worry when converting a spinlock to
mutex.

 But that is only happening for the contended case. Certainly a spinlock is 
 better for 2p system but the more processors content for the lock (and 
 the longer the hold off is, typical for the processors with 4p or 8p or 
 more) the better a semaphore will work.

Sure. That's also why the PT lock switches for 4-way compiles. A config
option helps to keep the VM optimal for everyone. Here it is possible
it won't be necessary, but I can't be sure given that both the i_mmap_lock
and the anon-vma lock are used in so many places. Some TPC comparison would
be nice before making a default switch IMHO.
