Re: [BUGFIX] MCE: Fix bug of IA32_MCG_STATUS after system reset

2010-01-06 Thread Avi Kivity

On 01/06/2010 09:05 AM, Huang Ying wrote:

@@ -1015,6 +1015,7 @@ void kvm_arch_load_regs(CPUState *env)

   #endif
   set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
   set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
+set_msr_entry(&msrs[n++], MSR_MCG_STATUS, 0);


   

Not sure why you reset this in kvm_arch_load_regs().  Shouldn't this be
in the cpu reset code?
 

I found kvm_arch_load_regs() is called by kvm_arch_cpu_reset(), which is
called by qemu_kvm_system_reset(). It is not in cpu reset path?
   


It is, but it is also called from many other places, which could cause 
this msr to be zeroed.


A better solution is to allocate it a field in CPUState, load and save 
it in kvm_arch_*_regs, and zero it during reset.
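
A minimal sketch of this suggestion, assuming trimmed-down stand-ins for the real structures: struct cpu_state, struct msr_entry, load_regs(), save_regs() and cpu_reset() are hypothetical names for illustration, not the qemu-kvm code. The MSR value lives in a per-CPU field, the load and save paths only copy it, and only the reset path clears it.

```c
#include <stdint.h>

#define MSR_MCG_STATUS 0x17a  /* IA32_MCG_STATUS */

/* Hypothetical, trimmed-down stand-ins for CPUState and the MSR entry type. */
struct cpu_state {
    uint64_t mcg_status;   /* assumed new field holding the guest's MCG_STATUS */
};

struct msr_entry {
    uint32_t index;
    uint64_t data;
};

static void set_msr_entry(struct msr_entry *e, uint32_t index, uint64_t data)
{
    e->index = index;
    e->data = data;
}

/* Load path: push whatever the field holds; no unconditional zeroing. */
static int load_regs(const struct cpu_state *env, struct msr_entry *msrs)
{
    int n = 0;
    set_msr_entry(&msrs[n++], MSR_MCG_STATUS, env->mcg_status);
    return n;
}

/* Save path: copy the value read back from the kernel into the field. */
static void save_regs(struct cpu_state *env, const struct msr_entry *msrs, int n)
{
    for (int i = 0; i < n; i++) {
        if (msrs[i].index == MSR_MCG_STATUS)
            env->mcg_status = msrs[i].data;
    }
}

/* Reset path: the only place where the value is cleared. */
static void cpu_reset(struct cpu_state *env)
{
    env->mcg_status = 0;
}
```

With this split, a load that happens outside of reset writes the saved value back instead of zero.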


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUGFIX] MCE: Fix bug of IA32_MCG_STATUS after system reset

2010-01-06 Thread Huang Ying
On Wed, 2010-01-06 at 16:03 +0800, Avi Kivity wrote:
 On 01/06/2010 09:05 AM, Huang Ying wrote:
  @@ -1015,6 +1015,7 @@ void kvm_arch_load_regs(CPUState *env)
 #endif
 set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
 set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
  +set_msr_entry(&msrs[n++], MSR_MCG_STATUS, 0);
 
 
 
  Not sure why you reset this in kvm_arch_load_regs().  Shouldn't this be
  in the cpu reset code?
   
  I found kvm_arch_load_regs() is called by kvm_arch_cpu_reset(), which is
  called by qemu_kvm_system_reset(). It is not in cpu reset path?
 
 
 It is, but it is also called from many other places, which could cause 
 this msr to be zeroed.
 
 A better solution is to allocate it a field in CPUState, load and save 
 it in kvm_arch_*_regs, and zero it during reset.

Yes. You are right. I will fix this.

Best Regards,
Huang Ying




Re: [BUGFIX] MCE: Fix bug of IA32_MCG_STATUS after system reset

2010-01-06 Thread Huang Ying
On Tue, 2010-01-05 at 18:44 +0800, Andi Kleen wrote:
  --- a/qemu-kvm-x86.c
  +++ b/qemu-kvm-x86.c
  @@ -1015,6 +1015,7 @@ void kvm_arch_load_regs(CPUState *env)
   #endif
   set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
   set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
  +set_msr_entry(&msrs[n++], MSR_MCG_STATUS, 0);
 
 Still need to keep EIPV and possibly RIPV, don't we?

It appears that EIPV and RIPV are meaningless outside of the MCE exception
handler, but I will check the behavior of real hardware.

Best Regards,
Huang Ying




Re: 0.12.x: message Option 'ipv4': Use 'on' or 'off'

2010-01-06 Thread Marcelo Tosatti
On Mon, Jan 04, 2010 at 09:18:01AM +0100, Thomas Beinicke wrote:
 I get the same message since the update, is it -chardev option related?
 
 
 On Saturday 02 January 2010 11:53:42 Thomas Mueller wrote:
  hi
  
  since 0.12.x i get the following messages starting a vm:
  
  Option 'ipv4': Use 'on' or 'off'
  Failed to parse yes for dummy.ipv4
  
  
  command is:
  
  
  kvm -usbdevice tablet -drive file=~/virt/xp/drive1.qcow2,cache=writeback
  -drive file=~/virt/xp/drive2.qcow2,cache=writeback -net nic -net
  user,hostfwd=tcp:127.0.0.1:3389-:3389 -k de-ch -m 1024 -smp 2 -vnc
  127.0.0.1:20 -monitor unix:/var/run/kvm-winxp.socket,server,nowait
  -daemonize -localtime
  
  what is option ipv4?

This patch should fix it:

Fix inet_parse typo

qemu_opt_set wants on/off, not yes/no.

diff --git a/qemu-sockets.c b/qemu-sockets.c
index 8850516..a88b2a7 100644
--- a/qemu-sockets.c
+++ b/qemu-sockets.c
@@ -424,7 +424,7 @@ static int inet_parse(QemuOpts *opts, const char *str)
 __FUNCTION__, str);
 return -1;
 }
-qemu_opt_set(opts, "ipv6", "yes");
+qemu_opt_set(opts, "ipv6", "on");
 } else if (qemu_isdigit(str[0])) {
 /* IPv4 addr */
 if (2 != sscanf(str,"%64[0-9.]:%32[^,]%n",addr,port,&pos)) {
@@ -432,7 +432,7 @@ static int inet_parse(QemuOpts *opts, const char *str)
 __FUNCTION__, str);
 return -1;
 }
-qemu_opt_set(opts, "ipv4", "yes");
+qemu_opt_set(opts, "ipv4", "on");
 } else {
 /* hostname */
 if (2 != sscanf(str,"%64[^:]:%32[^,]%n",addr,port,&pos)) {
@@ -450,9 +450,9 @@ static int inet_parse(QemuOpts *opts, const char *str)
 if (h)
 qemu_opt_set(opts, "to", h+4);
 if (strstr(optstr, ",ipv4"))
-qemu_opt_set(opts, "ipv4", "yes");
+qemu_opt_set(opts, "ipv4", "on");
 if (strstr(optstr, ",ipv6"))
-qemu_opt_set(opts, "ipv6", "yes");
+qemu_opt_set(opts, "ipv6", "on");
 return 0;
 }
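
To illustrate why "yes" triggers the message quoted above, here is a minimal sketch of a strict boolean option parser. parse_bool_opt() is a hypothetical helper written for this example, not QEMU's actual option code; it only assumes, as the patch description says, that the accepted spellings are "on" and "off".

```c
#include <string.h>

/* Hypothetical helper, not QEMU's parser: accepts only "on" or "off"
 * and reports anything else (such as "yes") as an error. */
static int parse_bool_opt(const char *value, int *out)
{
    if (strcmp(value, "on") == 0) {
        *out = 1;
        return 0;
    }
    if (strcmp(value, "off") == 0) {
        *out = 0;
        return 0;
    }
    return -1;  /* the caller would report: Use 'on' or 'off' */
}
```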
 


Re: Soft lockups and cpu frequency scaling

2010-01-06 Thread Marcelo Tosatti
On Mon, Jan 04, 2010 at 03:01:29PM +0100, Martin Schmitt wrote:
 Ciao Marcelo,
 
 sorry for getting back so late. Thanks for your patience. :-)
 
 Marcelo Tosatti schrieb:
 
  I'm running a manually compiled KVM on CentOS 5.4. The KVM installation
  has been carried over from CentOS 5.3, when KVM wasn't distributed with
  the OS. (I tried to migrate to CentOS 5.4 native KVM support, but wasn't
  able to get along with RedHat's interpretation of KVM.)
 
  The KVM version used is 88, on Kernel 2.6.18-128.7.1.el5, as KVM doesn't
  seem to compile on CentOS' current 2.6.18-164.9.1.el5.
 
  Only on CentOS guests, I see very frequent soft lockup messages and
  excessively hanging KVM instances.
  
  Can you please share some of the soft lockup messages.
  
  And how exactly are the VMs hanging?
 
 They are unresponsive for a few seconds. More hiccuping than hanging.
 It appears to be I/O-related in some way, because it happens most
 frequently when I do things on the file system.
 
 Dmesg is full of these:
 
 BUG: soft lockup - CPU#0 stuck for 10s! [kblockd/0:10]
 
 Pid: 10, comm:kblockd/0
 EIP: 0060:[c056f931] CPU: 0
 EIP is at ide_outb+0x4/0x5
  EFLAGS: 0202Not tainted  (2.6.18-164.6.1.el5 #1)
 EAX: 0001 EBX: c07e2f80 ECX: 0286 EDX: c000
 ESI: 0011 EDI:  EBP: c07e3014 DS: 007b ES: 007b
 CR0: 8005003b CR2: b7f3c000 CR3: 12122000 CR4: 06d0
  [c0573cab] ide_dma_start+0x22/0x2e
  [c0576474] ide_do_rw_disk+0x3b2/0x4a6
  [c056de34] ide_do_request+0x533/0x6bf
  [c04de1b9] freed_request+0x1d/0x37
  [c056d8d0] ide_end_request+0xcc/0xd4
  [c056e221] ide_intr+0x167/0x190
  [c044da39] handle_IRQ_event+0x45/0x8c
  [c044db04] __do_IRQ+0x84/0xd6
  [c044da80] __do_IRQ+0x0/0xd6
  [c04074b2] do_IRQ+0x99/0xc3
  [c0405946] common_interrupt+0x1a/0x20
  [c04291ab] __do_softirq+0x57/0x114
  [c04073cf] do_softirq+0x52/0x9c
  [c04059d7] apic_timer_interrupt+0x1f/0x24
  [c056f931] ide_outb+0x4/0x5
  [c0573cab] ide_dma_start+0x22/0x2e
  [c0576474] ide_do_rw_disk+0x3b2/0x4a6
  [c056de34] ide_do_request+0x533/0x6bf
  [c04e710f] cfq_kick_queue+0x70/0x80
  [c0431e8a] run_workqueue+0x78/0xb5
  [c04e709f] cfq_kick_queue+0x0/0x80
  [c043273e] worker_thread+0xd9/0x10b
  [c041e727] default_wake_function+0x0/0xc
  [c0432665] worker_thread+0x0/0x10b
  [c0434b55] kthread+0xc0/0xeb
  [c0434a95] kthread+0x0/0xeb
  [c0405c53] kernel_thread_helper+0x7/0x10
  ===
  by the hangs. The problem already was there on CentOS 5.3 as well.
  With the Debian guests on the same host, I have never had any apparent
  problems.
  
  Questions:
  
  - Is there significant swapping on the host?
  - Are you migrating vm's? 
 
 No migration and no swap activity. The host has plenty of idle RAM:
 
 [r...@zulu ~]# free -m
               total       used       free     shared    buffers     cached
  Mem:          7987       7904         82          0        667       5101
  -/+ buffers/cache:       2135       5851
  Swap:         1983          0       1983
 
  A number of google results suggest that I should work with CPU scaling
  on the CentOS guest systems, but unfortunately, CPU scaling is not
  available in my guests. So, here's my question: How do I enable CPU
  scaling in KVM guests? Or is there any other measure against these soft
  lockups that you can recommend?
  
  What probably was suggested is to disable cpu frequency scaling on the
  host. Please provide more details on the host system.
 
 Host is a Quadcore Xeon HP DL320 G5 with CentOS 5.4, old Kernel
 2.6.18-128.7.1.el5.
 
 There are no hints toward CPU scaling in /sys/devices/system/ on the host:
 
 [r...@zulu ~]# ls -l /sys/devices/system/cpu/cpu0
 total 0
 drwxr-xr-x 5 root root    0 Nov  7 13:47 cache
 -r 1 root root 4096 Jan  4 14:55 crash_notes
 drwxr-xr-x 2 root root    0 Nov  7 13:48 topology
 
 The file Crash Notes contains the following number: 22792b400
 
 Thanks for your help,


Martin,

Can you please share a few more soft lockup messages? (with
backtrace included).

Also qemu command line.

And boot-up messages of host and guest.

Thanks



Re: [BUGFIX] MCE: Fix bug of IA32_MCG_STATUS after system reset

2010-01-06 Thread Marcelo Tosatti
On Wed, Jan 06, 2010 at 04:17:42PM +0800, Huang Ying wrote:
 On Wed, 2010-01-06 at 16:03 +0800, Avi Kivity wrote:
  On 01/06/2010 09:05 AM, Huang Ying wrote:
   @@ -1015,6 +1015,7 @@ void kvm_arch_load_regs(CPUState *env)
  #endif
  set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
  set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
   +set_msr_entry(&msrs[n++], MSR_MCG_STATUS, 0);
  
  
  
   Not sure why you reset this in kvm_arch_load_regs().  Shouldn't this be
   in the cpu reset code?

   I found kvm_arch_load_regs() is called by kvm_arch_cpu_reset(), which is
   called by qemu_kvm_system_reset(). It is not in cpu reset path?
  
  
  It is, but it is also called from many other places, which could cause 
  this msr to be zeroed.
  
  A better solution is to allocate it a field in CPUState, load and save 
  it in kvm_arch_*_regs, and zero it during reset.
 
 Yes. You are right. I will fix this.

BTW, the MCE MSRs are not being migrated. Perhaps you'd like to fix that
while at it.



[PATCH resend] Fix the explanation of write_emulated

2010-01-06 Thread Takuya Yoshikawa
The explanation of write_emulated is confused with
that of read_emulated. This patch fixes it.

Signed-off-by: Takuya Yoshikawa yoshikawa.tak...@oss.ntt.co.jp
---
 arch/x86/include/asm/kvm_emulate.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index 7c18e12..9b697c2 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -74,7 +74,7 @@ struct x86_emulate_ops {
 struct kvm_vcpu *vcpu);
 
/*
-* write_emulated: Read bytes from emulated/special memory area.
+* write_emulated: Write bytes to emulated/special memory area.
 *  @addr:  [IN ] Linear address to which to write.
 *  @val:   [IN ] Value to write to memory (low-order bytes used as
 *required).
-- 
1.6.3.3



Re: Soft lockups and cpu frequency scaling

2010-01-06 Thread Martin Schmitt
Marcelo Tosatti schrieb:

 Can you please share a few more soft lockup messages? (with
 backtrace included).

Full dmesg from guest: http://pastebin.com/f51a966df

 Also qemu command line.

From ps:

/usr/local/kvm/bin/qemu-system-x86_64 -hda /drbd/vweb/vweb.vmdk -vnc
127.0.0.1:4 -m 512 -net nic,macaddr=00:16:3e:6a:a5:10 -net tap
-enable-kvm -k de -pidfile /drbd/vweb/pidfile.pid -monitor
tcp:127.0.0.1:1004,server,nowait -name /drbd/vweb/vweb.vmdk -cpu host
-daemonize

 And boot-up messages of host and guest.

Does the dmesg above cover that for the guest or are you asking for
anything further than that?

Host side dmesg: http://pastebin.com/f47a30d22

Thanks for your time,

-martin

-- 
Martin Schmitt / Schmitt Systemberatung / www.scsy.de
-- http://www.pug.org/index.php/Benutzer:Martin --


Re: [PATCH 4/4] KVM: VMX: Enable EPT 1GB page support

2010-01-06 Thread Marcelo Tosatti
On Tue, Jan 05, 2010 at 07:02:29PM +0800, Sheng Yang wrote:
 
 Signed-off-by: Sheng Yang sh...@linux.intel.com
 ---
  arch/x86/include/asm/vmx.h |1 +
  arch/x86/kvm/mmu.c |8 +---
  arch/x86/kvm/vmx.c |   11 ++-
  3 files changed, 16 insertions(+), 4 deletions(-)
 
 diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
 index 713ed9a..43f1e9b 100644
 --- a/arch/x86/include/asm/vmx.h
 +++ b/arch/x86/include/asm/vmx.h
 @@ -364,6 +364,7 @@ enum vmcs_field {
  #define VMX_EPTP_UC_BIT  (1ull << 8)
  #define VMX_EPTP_WB_BIT  (1ull << 14)
  #define VMX_EPT_2MB_PAGE_BIT (1ull << 16)
 +#define VMX_EPT_1GB_PAGE_BIT (1ull << 17)
  #define VMX_EPT_EXTENT_INDIVIDUAL_BIT (1ull << 24)
  #define VMX_EPT_EXTENT_CONTEXT_BIT   (1ull << 25)
  #define VMX_EPT_EXTENT_GLOBAL_BIT (1ull << 26)
 diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
 index 43cf2ea..9012541 100644
 --- a/arch/x86/kvm/mmu.c
 +++ b/arch/x86/kvm/mmu.c
 @@ -499,8 +499,7 @@ out:
  static int mapping_level(struct kvm_vcpu *vcpu, gfn_t large_gfn)
  {
   struct kvm_memory_slot *slot;
 - int host_level;
 - int level = PT_PAGE_TABLE_LEVEL;
 + int host_level, level, max_level;
  
   slot = gfn_to_memslot(vcpu->kvm, large_gfn);
   if (slot && slot->dirty_bitmap)
 @@ -511,7 +510,10 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t large_gfn)
   if (host_level == PT_PAGE_TABLE_LEVEL)
   return host_level;
  
 - for (level = PT_DIRECTORY_LEVEL; level <= host_level; ++level)
 + max_level = kvm_x86_ops->get_lpage_level() < host_level ?
 + kvm_x86_ops->get_lpage_level() : host_level;
 +

BUG_ON(kvm_x86_ops->get_lpage_level() < host_level) instead? See
the if (host_level == PT_PAGE_TABLE_LEVEL) above.



Re: [PATCH] PPC: Enable lightweight exits again

2010-01-06 Thread Marcelo Tosatti
On Mon, Jan 04, 2010 at 10:19:25PM +0100, Alexander Graf wrote:
 The PowerPC C ABI defines that registers r14-r31 need to be preserved across
 function calls. Since our exit handler is written in C, we can make use of that
 and don't need to reload r14-r31 on every entry/exit cycle.
 
 This technique is also used in the BookE code and is called lightweight exits
 there. To follow the tradition, it's called the same in Book3S.
 
 So far this optimization was disabled though, as the code didn't do what it was
 expected to do, but failed to work.
 
 This patch fixes and enables lightweight exits again.
 
 Signed-off-by: Alexander Graf ag...@suse.de

Applied, thanks.



Re: [PATCH] PPC: Fix typo in rebolting code

2010-01-06 Thread Marcelo Tosatti
On Mon, Jan 04, 2010 at 10:19:22PM +0100, Alexander Graf wrote:
 When we're loading bolted entries into the SLB again, we're checking if an
 entry is in use and only slbmte it when it is.
 
 Unfortunately, the check always goes to the skip label of the first entry,
 resulting in an endless loop when it actually gets triggered.
 
 Signed-off-by: Alexander Graf ag...@suse.de

Applied, thanks.



[PATCH] configure: Correct KVM options in help output

2010-01-06 Thread Pierre Riteau
Signed-off-by: Pierre Riteau pierre.rit...@irisa.fr
---
 configure |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index 81c44e8..4254485 100755
--- a/configure
+++ b/configure
@@ -756,10 +756,10 @@ echo "  --disable-bluez          disable bluez stack connectivity"
 echo "  --enable-bluez           enable bluez stack connectivity"
 echo "  --disable-kvm            disable KVM acceleration support"
 echo "  --enable-kvm             enable KVM acceleration support"
-echo "  --disable-cap-kvm-pit    disable KVM pit support"
-echo "  --enable-cap-kvm-pit     enable KVM pit support"
-echo "  --disable-cap-device-assignment    disable KVM device assignment support"
-echo "  --enable-cap-device-assignment     enable KVM device assignment support"
+echo "  --disable-kvm-cap-pit    disable KVM pit support"
+echo "  --enable-kvm-cap-pit     enable KVM pit support"
+echo "  --disable-kvm-cap-device-assignment    disable KVM device assignment support"
+echo "  --enable-kvm-cap-device-assignment     enable KVM device assignment support"
 echo "  --disable-nptl           disable usermode NPTL support"
 echo "  --enable-nptl            enable usermode NPTL support"
 echo "  --enable-system          enable all system emulation targets"
-- 
1.6.5.2



Re: [PATCH 4/4] KVM: VMX: Enable EPT 1GB page support

2010-01-06 Thread Sheng Yang
On Wednesday 06 January 2010 17:19:15 Marcelo Tosatti wrote:
 On Tue, Jan 05, 2010 at 07:02:29PM +0800, Sheng Yang wrote:
  Signed-off-by: Sheng Yang sh...@linux.intel.com
  ---
   arch/x86/include/asm/vmx.h |1 +
   arch/x86/kvm/mmu.c |8 +---
   arch/x86/kvm/vmx.c |   11 ++-
   3 files changed, 16 insertions(+), 4 deletions(-)
 
  diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
  index 713ed9a..43f1e9b 100644
  --- a/arch/x86/include/asm/vmx.h
  +++ b/arch/x86/include/asm/vmx.h
  @@ -364,6 +364,7 @@ enum vmcs_field {
   #define VMX_EPTP_UC_BIT (1ull << 8)
   #define VMX_EPTP_WB_BIT (1ull << 14)
   #define VMX_EPT_2MB_PAGE_BIT   (1ull << 16)
  +#define VMX_EPT_1GB_PAGE_BIT   (1ull << 17)
   #define VMX_EPT_EXTENT_INDIVIDUAL_BIT  (1ull << 24)
   #define VMX_EPT_EXTENT_CONTEXT_BIT (1ull << 25)
   #define VMX_EPT_EXTENT_GLOBAL_BIT  (1ull << 26)
  diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
  index 43cf2ea..9012541 100644
  --- a/arch/x86/kvm/mmu.c
  +++ b/arch/x86/kvm/mmu.c
  @@ -499,8 +499,7 @@ out:
   static int mapping_level(struct kvm_vcpu *vcpu, gfn_t large_gfn)
   {
  struct kvm_memory_slot *slot;
  -   int host_level;
  -   int level = PT_PAGE_TABLE_LEVEL;
  +   int host_level, level, max_level;
 
  slot = gfn_to_memslot(vcpu->kvm, large_gfn);
  if (slot && slot->dirty_bitmap)
  @@ -511,7 +510,10 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t large_gfn)
  if (host_level == PT_PAGE_TABLE_LEVEL)
  return host_level;
 
  -   for (level = PT_DIRECTORY_LEVEL; level <= host_level; ++level)
  +   max_level = kvm_x86_ops->get_lpage_level() < host_level ?
  +   kvm_x86_ops->get_lpage_level() : host_level;
  +
 
 BUG_ON(kvm_x86_ops->get_lpage_level() < host_level) instead? See
 the if (host_level == PT_PAGE_TABLE_LEVEL) above.
 
Sorry, I don't understand...

Here, EPT supports pages of at most 2MB or 1GB, so we represent that limit
through get_lpage_level(). Host memory backed by 1GB pages can still be
mapped with 2MB EPT pages if the processor does not support 1GB EPT pages.
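
The clamping described above can be sketched as follows. This is a simplified illustration, not the kernel function: max_mapping_level() is a hypothetical helper, and the numeric level values (1 for 4KB, 2 for 2MB, 3 for 1GB, with PT_PDPE_LEVEL naming the last) are assumptions made for the example.

```c
/* Assumed page-table levels: 1 = 4KB, 2 = 2MB, 3 = 1GB. */
enum {
    PT_PAGE_TABLE_LEVEL = 1,
    PT_DIRECTORY_LEVEL  = 2,
    PT_PDPE_LEVEL       = 3,
};

/* Hypothetical helper: the largest level that both the host backing page
 * (host_level) and the EPT hardware (lpage_level) can provide. */
static int max_mapping_level(int host_level, int lpage_level)
{
    if (host_level == PT_PAGE_TABLE_LEVEL)
        return host_level;          /* small host page: nothing to clamp */
    return lpage_level < host_level ? lpage_level : host_level;
}
```

A host page at the 1GB level combined with hardware limited to 2MB pages yields the 2MB level, which is the case that an unconditional BUG_ON would reject.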

-- 
regards
Yang, Sheng


Re: Soft lockups and cpu frequency scaling

2010-01-06 Thread Marcelo Tosatti
On Wed, Jan 06, 2010 at 09:15:02AM +0100, Martin Schmitt wrote:
 Marcelo Tosatti schrieb:
 
  Can you please share a few more soft lockup messages? (with
  backtrace included).
 
 Full dmesg from guest: http://pastebin.com/f51a966df
 
  Also qemu command line.
 
 From ps:
 
 /usr/local/kvm/bin/qemu-system-x86_64 -hda /drbd/vweb/vweb.vmdk -vnc
 127.0.0.1:4 -m 512 -net nic,macaddr=00:16:3e:6a:a5:10 -net tap
 -enable-kvm -k de -pidfile /drbd/vweb/pidfile.pid -monitor
 tcp:127.0.0.1:1004,server,nowait -name /drbd/vweb/vweb.vmdk -cpu host
 -daemonize

Hum, can you try converting that vmdk image to qcow2 or raw? (with
qemu-img convert).

AFAICS the QEMU vmdk implementation is synchronous, so the guest 
waits on IO operations to complete on the host side.

  And boot-up messages of host and guest.
 
 Does the dmesg above cover that for the guest or are you asking for
 anything further than that?
 
 Host side dmesg: http://pastebin.com/f47a30d22
 
 Thanks for your time,
 
 -martin


Re: Soft lockups and cpu frequency scaling

2010-01-06 Thread Martin Schmitt
Marcelo Tosatti schrieb:

 Hum, can you try converting that vmdk image to qcow2 or raw? (with
 qemu-img convert).
 
 AFAICS the QEMU vmdk implementation is synchronous, so the guest 
 waits on IO operations to complete on the host side.

I'll do so, but I can only make the conversion early morning or late
evening, so it won't happen before tomorrow.

The reason why I'm on vmdk is that these were migrated from VMware
server. The free VMware server isn't much fun anymore, and I just need
simple virtualization, no big-iron enterprise BS compliance. ;-)

Thanks,

-martin

-- 
Martin Schmitt / Schmitt Systemberatung / www.scsy.de
-- http://www.pug.org/index.php/Benutzer:Martin --


The HPET issue on Linux

2010-01-06 Thread Sheng Yang
Hi Beth

I still find that the emulated HPET causes some boot failures. For
example, on my 2.6.30 kernel with HPET enabled, the kernel fails check_timer(),
specifically in timer_irq_works().

timer_irq_works() lets 10 ticks pass (using mdelay()) and then checks that
jiffies has advanced by at least 5 ticks. On my machine it mostly sees only
4 ticks when HPET is enabled, so the test fails. With PIT, on the other hand,
it sees more than 10 ticks (understandable if some compensating ticks are
injected). Of course, extending the tick count or the mdelay() time would
work around it.
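
The check described above can be modeled roughly like this. It is a simplified sketch, not the kernel source: timer_irq_seems_to_work() is a hypothetical name, and its two arguments stand for samples of the tick counter taken before and after a delay that should span about 10 ticks.

```c
/* Simplified model of the check described above; not the kernel source. */
static int timer_irq_seems_to_work(unsigned long jiffies_before,
                                   unsigned long jiffies_after)
{
    const unsigned long min_ticks = 5;  /* threshold named in the report */

    /* Unsigned subtraction stays correct across counter wraparound. */
    return (jiffies_after - jiffies_before) >= min_ticks;
}
```

Under this model an advance of 4 ticks fails the check, while 5 or more passes.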

I think this is a major issue with HPET. It may simply be due to the
userspace path for interrupt injection being too long... If that is true,
I think it will not be easy to deal with.

-- 
regards
Yang, Sheng


Re: [PATCH 4/4] KVM: VMX: Enable EPT 1GB page support

2010-01-06 Thread Marcelo Tosatti
On Wed, Jan 06, 2010 at 05:25:27PM +0800, Sheng Yang wrote:
   - for (level = PT_DIRECTORY_LEVEL; level <= host_level; ++level)
   + max_level = kvm_x86_ops->get_lpage_level() < host_level ?
   + kvm_x86_ops->get_lpage_level() : host_level;
   +
  
  BUG_ON(kvm_x86_ops->get_lpage_level() < host_level) instead? See
  the if (host_level == PT_PAGE_TABLE_LEVEL) above.
  
 Sorry, I don't understand...
 
 Here, EPT can support either 2MB or 1GB page at most, So we represent the 
 value through get_lpage_level(). Now 1GB page backed host memory can still 
 using by 2MB page backed EPT if 1GB page EPT is not supported in the 
 processor.

Oh, OK, I misunderstood.

Applied, thanks.



Re: The HPET issue on Linux

2010-01-06 Thread Gleb Natapov
On Wed, Jan 06, 2010 at 05:48:52PM +0800, Sheng Yang wrote:
 Hi Beth
 
 I still found the emulated HPET would result in some boot failure. For 
 example, on my 2.6.30, with HPET enabled, the kernel would fail 
 check_timer(), 
 especially in timer_irq_works().
 
 The testing of timer_irq_works() is let 10 ticks pass(using mdelay()), and 
 want to confirm the clock source with at least 5 ticks advanced in jiffies. 
 I've checked that, on my machine, it would mostly get only 4 ticks when HPET 
 enabled, then fail the test. On the other hand, if I using PIT, it would get 
 more than 10 ticks(maybe understandable if some complementary ticks there). 
 Of 
 course, extend the ticks count/mdelay() time can work.
 
 I think it's a major issue of HPET. And it maybe just due to a too long 
 userspace path for interrupt injection... If it's true, I think it's not easy 
 to deal with it.
 
PIT ticks are reinjected automatically; HPET should probably do the same,
although that may just create another set of problems.

--
Gleb.


Re: [PATCH v3 00/12] KVM: Add host swap event notifications for PV guest

2010-01-06 Thread Jun Koi
On Wed, Jan 6, 2010 at 1:04 AM, Avi Kivity a...@redhat.com wrote:
 On 01/05/2010 05:05 PM, Jun Koi wrote:

 Is it true that to make this work, we will need a (PV) kernel driver
 for each guest OS (Windows, Linux, ...)?



 It's partially usable even without guest modifications; while servicing a
 host page fault we can still deliver interrupts to the guest (which might
 cause a context switch and thus further progress to be made).

Let's say the guest has no PV driver. When we find that a guest page is
swapped out, we can send a page fault to the guest to trick it into
loading that page. Then we don't need the driver at all.

Is that a reasonable solution?

Thanks,
J


Re: [PATCH v3 00/12] KVM: Add host swap event notifications for PV guest

2010-01-06 Thread Gleb Natapov
On Wed, Jan 06, 2010 at 07:17:30PM +0900, Jun Koi wrote:
 On Wed, Jan 6, 2010 at 1:04 AM, Avi Kivity a...@redhat.com wrote:
  On 01/05/2010 05:05 PM, Jun Koi wrote:
 
  Is it true that to make this work, we will need a (PV) kernel driver
  for each guest OS (Windows, Linux, ...)?
 
 
 
  It's partially usable even without guest modifications; while servicing a
  host page fault we can still deliver interrupts to the guest (which might
  cause a context switch and thus further progress to be made).
 
 Lets say, in the case the guest has no PV driver. When we find that a
 guest page is swapped out, we can send a pagefault
 to the guest to trick it to load that page in. And we dont need the
 driver at all.
 
It is not the guest that should load the page. From the guest's point of
view the page is in memory.

 Is that a reasonable solution?
 
 Thanks,
 J

--
Gleb.


Re: The HPET issue on Linux

2010-01-06 Thread Dor Laor

On 01/06/2010 12:09 PM, Gleb Natapov wrote:

On Wed, Jan 06, 2010 at 05:48:52PM +0800, Sheng Yang wrote:

Hi Beth

I still found the emulated HPET would result in some boot failure. For
example, on my 2.6.30, with HPET enabled, the kernel would fail check_timer(),
especially in timer_irq_works().

The testing of timer_irq_works() is let 10 ticks pass(using mdelay()), and
want to confirm the clock source with at least 5 ticks advanced in jiffies.
I've checked that, on my machine, it would mostly get only 4 ticks when HPET
enabled, then fail the test. On the other hand, if I using PIT, it would get
more than 10 ticks(maybe understandable if some complementary ticks there). Of
course, extend the ticks count/mdelay() time can work.

I think it's a major issue of HPET. And it maybe just due to a too long
userspace path for interrupt injection... If it's true, I think it's not easy
to deal with it.


PIT tick are reinjected automatically, HPET should probably do the same
although it may just create another set of problems.


Older Linux kernels automatically adjust for lost ticks, so automatic
reinjection causes time to run too fast. This is why we added the
-no-kvm-pit-reinject flag...
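
The double counting can be shown with a toy accounting model. This is an assumption made purely for illustration, not a description of either implementation: guest_tick_count() is a hypothetical function, and it treats every lost tick as both detected by the guest and reinjected by the host.

```c
/* Toy accounting model, for illustration only.
 * expected: ticks that should have been delivered
 * lost:     ticks that were not delivered on time
 * Returns how many ticks the guest ends up counting. */
static long guest_tick_count(long expected, long lost,
                             int guest_compensates, int host_reinjects)
{
    long counted = expected - lost;  /* ticks delivered on time */

    if (guest_compensates)
        counted += lost;             /* guest adds the ticks it detected as lost */
    if (host_reinjects)
        counted += lost;             /* host delivers the same ticks again */
    return counted;
}
```

When only one side corrects for the loss the count matches the expected value; when both do, the guest counts more ticks than expected, so its clock runs fast.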


It took a lot of time for pit/rtc to stabilize; before the hpet emulation
can be seriously considered, a lot of testing has to be done.




--
Gleb.


Re: [PATCH 4/5] KVM: Lazify fpu activation and deactivation

2010-01-06 Thread Joerg Roedel
On Wed, Jan 06, 2010 at 05:18:13AM +0200, Avi Kivity wrote:
 Joerg, what was the reason the initial npt implementation did not do
 lazy fpu switching?

The lazy fpu switching code needed cr3 accesses to be intercepted. With
NPT this was the only reason left to intercept cr3, so I decided to
switch lazy fpu switching off and not intercept cr3 accesses.

Joerg




Re: [PATCH 4/5] KVM: Lazify fpu activation and deactivation

2010-01-06 Thread Avi Kivity

On 01/06/2010 12:47 PM, Joerg Roedel wrote:

On Wed, Jan 06, 2010 at 05:18:13AM +0200, Avi Kivity wrote:
   

Joerg, what was the reason the initial npt implementation did not do
lazy fpu switching?
 

The lazy fpu switching code needed cr3 accesses to be intercepted. With
NPT this was the only reason left to intercept cr3 so I decided to
switch lazy fpu switching off and don't intercept cr3 accesses.
   


Oh.  In fact the interaction of cr3 intercepts with lazy fpu was a 
mistake which this patchset removes; I'll replace cr0 intercepts with 
selective cr0 write intercept in a new patch.


--
error compiling committee.c: too many arguments to function



Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-06 Thread Mark Cave-Ayland

Avi Kivity wrote:

I think I'm experiencing a regression with the new qemu-kvm-0.12.1.2 
release compared to qemu-kvm-0.12.1.1 with a WinXP guest on Linux.


I can boot my WinXP guest without a problem under qemu-kvm-0.12.1.1, 
however under qemu-kvm-0.12.1.2 a couple of seconds after reaching the 
login screen, the WinXP guest goes BSOD with the following error: 
DRIVER_UNLOADED_WITHOUT_CANCELING_PENDING_OPERATIONS.


I've confirmed by switching between the two installations several 
times that the error consistently occurs with qemu-kvm-0.12.1.2 but 
not qemu-kvm-0.12.1.1. Is this a known issue? This is on an x86_64 
Debian Lenny host with a 2.6.32.2 kernel on Intel.


It's not a known issue.  What's your command line?  What's your host cpu 
type?


Hi Avi,

Good news - I downloaded the userspace git repository and managed to 
identify the offending commit between 0.12.1.1 and 0.12.1.2 using git 
bisect:



4dad7ff32aa6dcf18cef0c606d8fb43ff0b939a1 is first bad commit
commit 4dad7ff32aa6dcf18cef0c606d8fb43ff0b939a1
Author: Avi Kivity a...@redhat.com
Date:   Mon Dec 28 10:48:00 2009 +0200

Reinstate cpuid vendor override when kvm is enabled

Due to upstream qemu changes we no longer expose the host cpu vendor id
to the guest.  This leads to failures when the syscall/sysenter
instructions are used in compatibility mode.

Change the default to override when kvm is enabled.

Signed-off-by: Avi Kivity a...@redhat.com
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com


HTH,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-06 Thread Marcelo Tosatti
On Wed, Jan 06, 2010 at 11:25:14AM +, Mark Cave-Ayland wrote:
 Avi Kivity wrote:

 I think I'm experiencing a regression with the new qemu-kvm-0.12.1.2  
 release compared to qemu-kvm-0.12.1.1 with a WinXP guest on Linux.

 I can boot my WinXP guest without a problem under qemu-kvm-0.12.1.1,  
 however under qemu-kvm-0.12.1.2 a couple of seconds after reaching 
 the login screen, the WinXP guest goes BSOD with the following error: 
 DRIVER_UNLOADED_WITHOUT_CANCELING_PENDING_OPERATIONS.

 I've confirmed by switching between the two installations several  
 times that the error consistently occurs with qemu-kvm-0.12.1.2 but  
 not qemu-kvm-0.12.1.1. Is this a known issue? This is on an x86_64  
 Debian Lenny host with a 2.6.32.2 kernel on Intel.

 It's not a known issue.  What's your command line?  What's your host 
 cpu type?

 Hi Avi,

 Good news - I downloaded the userspace git repository and managed to  
 identify the offending commit between 0.12.1.1 and 0.12.1.2 using git  
 bisect:


 4dad7ff32aa6dcf18cef0c606d8fb43ff0b939a1 is first bad commit
 commit 4dad7ff32aa6dcf18cef0c606d8fb43ff0b939a1
 Author: Avi Kivity a...@redhat.com
 Date:   Mon Dec 28 10:48:00 2009 +0200

 Reinstate cpuid vendor override when kvm is enabled

 Due to upstream qemu changes we no longer expose the host cpu vendor id
 to the guest.  This leads to failures when the syscall/sysenter
 instructions are used in compatibility mode.

 Change the default to override when kvm is enabled.

 Signed-off-by: Avi Kivity a...@redhat.com
 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com


 HTH,

 Mark.

Mark,

Thanks for tracking it down. Is there any difference with -cpu host
option?



Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-06 Thread Gleb Natapov
On Wed, Jan 06, 2010 at 10:29:18AM -0200, Marcelo Tosatti wrote:
 On Wed, Jan 06, 2010 at 11:25:14AM +, Mark Cave-Ayland wrote:
  Avi Kivity wrote:
 
  I think I'm experiencing a regression with the new qemu-kvm-0.12.1.2  
  release compared to qemu-kvm-0.12.1.1 with a WinXP guest on Linux.
 
  I can boot my WinXP guest without a problem under qemu-kvm-0.12.1.1,  
  however under qemu-kvm-0.12.1.2 a couple of seconds after reaching 
  the login screen, the WinXP guest goes BSOD with the following error: 
  DRIVER_UNLOADED_WITHOUT_CANCELING_PENDING_OPERATIONS.
 
  I've confirmed by switching between the two installations several  
  times that the error consistently occurs with qemu-kvm-0.12.1.2 but  
  not qemu-kvm-0.12.1.1. Is this a known issue? This is on an x86_64  
  Debian Lenny host with a 2.6.32.2 kernel on Intel.
 
  It's not a known issue.  What's your command line?  What's your host 
  cpu type?
 
  Hi Avi,
 
  Good news - I downloaded the userspace git repository and managed to  
  identify the offending commit between 0.12.1.1 and 0.12.1.2 using git  
  bisect:
 
 
  4dad7ff32aa6dcf18cef0c606d8fb43ff0b939a1 is first bad commit
  commit 4dad7ff32aa6dcf18cef0c606d8fb43ff0b939a1
  Author: Avi Kivity a...@redhat.com
  Date:   Mon Dec 28 10:48:00 2009 +0200
 
  Reinstate cpuid vendor override when kvm is enabled
 
  Due to upstream qemu changes we no longer expose the host cpu vendor id
  to the guest.  This leads to failures when the syscall/sysenter
  instructions are used in compatibility mode.
 
  Change the default to override when kvm is enabled.
 
  Signed-off-by: Avi Kivity a...@redhat.com
  Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
 
 
  HTH,
 
  Mark.
 
 Mark,
 
 Thanks for tracking it down. Is there any difference with -cpu host
 option?
 
And what does the output of cat /proc/cpuinfo on the host look like?

--
Gleb.


Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-06 Thread Gleb Natapov
On Wed, Jan 06, 2010 at 02:33:36PM +0200, Gleb Natapov wrote:
 On Wed, Jan 06, 2010 at 10:29:18AM -0200, Marcelo Tosatti wrote:
  On Wed, Jan 06, 2010 at 11:25:14AM +, Mark Cave-Ayland wrote:
   Avi Kivity wrote:
  
   I think I'm experiencing a regression with the new qemu-kvm-0.12.1.2  
   release compared to qemu-kvm-0.12.1.1 with a WinXP guest on Linux.
  
   I can boot my WinXP guest without a problem under qemu-kvm-0.12.1.1,  
   however under qemu-kvm-0.12.1.2 a couple of seconds after reaching 
   the login screen, the WinXP guest goes BSOD with the following error: 
   DRIVER_UNLOADED_WITHOUT_CANCELING_PENDING_OPERATIONS.
  
   I've confirmed by switching between the two installations several  
   times that the error consistently occurs with qemu-kvm-0.12.1.2 but  
   not qemu-kvm-0.12.1.1. Is this a known issue? This is on an x86_64  
   Debian Lenny host with a 2.6.32.2 kernel on Intel.
  
   It's not a known issue.  What's your command line?  What's your host 
   cpu type?
  
   Hi Avi,
  
   Good news - I downloaded the userspace git repository and managed to  
   identify the offending commit between 0.12.1.1 and 0.12.1.2 using git  
   bisect:
  
  
   4dad7ff32aa6dcf18cef0c606d8fb43ff0b939a1 is first bad commit
   commit 4dad7ff32aa6dcf18cef0c606d8fb43ff0b939a1
   Author: Avi Kivity a...@redhat.com
   Date:   Mon Dec 28 10:48:00 2009 +0200
  
   Reinstate cpuid vendor override when kvm is enabled
  
   Due to upstream qemu changes we no longer expose the host cpu vendor id
   to the guest.  This leads to failures when the syscall/sysenter
   instructions are used in compatibility mode.
  
   Change the default to override when kvm is enabled.
  
   Signed-off-by: Avi Kivity a...@redhat.com
   Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
  
  
   HTH,
  
   Mark.
  
  Mark,
  
  Thanks for tracking it down. Is there any difference with -cpu host
  option?
  
 And what does the output of cat /proc/cpuinfo on the host look like?
 
Ah sorry, you already did that.

--
Gleb.


Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-06 Thread Avi Kivity

On 01/06/2010 01:25 PM, Mark Cave-Ayland wrote:
Good news - I downloaded the userspace git repository and managed to 
identify the offending commit between 0.12.1.1 and 0.12.1.2 using git 
bisect:



4dad7ff32aa6dcf18cef0c606d8fb43ff0b939a1 is first bad commit
commit 4dad7ff32aa6dcf18cef0c606d8fb43ff0b939a1
Author: Avi Kivity a...@redhat.com
Date:   Mon Dec 28 10:48:00 2009 +0200

Reinstate cpuid vendor override when kvm is enabled

Due to upstream qemu changes we no longer expose the host cpu vendor id
to the guest.  This leads to failures when the syscall/sysenter
instructions are used in compatibility mode.

Change the default to override when kvm is enabled.

Signed-off-by: Avi Kivity a...@redhat.com
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com



Did you install using 0.12.1.1 and then run using 0.12.1.2?  If so, it 
seems Windows XP is not able to move from AMD to Intel.
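
For reference, the vendor id discussed here is the 12-character string
that CPUID leaf 0 returns in EBX, EDX and ECX, in that order. The sketch
below only decodes register values passed in by the caller, so it does
not execute CPUID itself; the function names are hypothetical and not
part of qemu-kvm.

```c
#include <stdint.h>
#include <string.h>

/* Unpack one 32-bit register into four characters, lowest byte first. */
static void unpack_reg(uint32_t reg, char *out)
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((reg >> (8 * i)) & 0xff);
}

/* Build the vendor string from the CPUID leaf 0 registers.
 * The register order is EBX, EDX, ECX; out must hold 13 bytes. */
void cpuid_vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx,
                         char out[13])
{
    unpack_reg(ebx, out);
    unpack_reg(edx, out + 4);
    unpack_reg(ecx, out + 8);
    out[12] = '\0';
}

/* Returns 1 when the registers decode to the given vendor name. */
int cpuid_vendor_is(uint32_t ebx, uint32_t edx, uint32_t ecx,
                    const char *name)
{
    char vendor[13];

    cpuid_vendor_string(ebx, edx, ecx, vendor);
    return strcmp(vendor, name) == 0;
}
```

A guest that was installed while seeing "AuthenticAMD" and later boots
seeing "GenuineIntel" is the kind of move the question above is about.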


--
error compiling committee.c: too many arguments to function



Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-06 Thread Mark Cave-Ayland

Marcelo Tosatti wrote:

 Mark,

 Thanks for tracking it down. Is there any difference with -cpu host
 option?

Yeah; if I launch 0.12.1.1 (which normally works) with an additional
-cpu host option, then kvm crashes in exactly the same way.



HTH,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-06 Thread Mark Cave-Ayland

Avi Kivity wrote:

Did you install using 0.12.1.1 and then run using 0.12.1.2?  If so, it 
seems Windows XP is not able to move from AMD to Intel.


No, although the original VM was built in VirtualBox if that makes a 
difference? I had to manually implement the merge ide fix here 
(http://support.microsoft.com/kb/314082) in order to bring up the VM 
under KVM but it has worked fine until the upgrade to 0.12.1.2.



ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-06 Thread Gleb Natapov
On Wed, Jan 06, 2010 at 12:50:29PM +, Mark Cave-Ayland wrote:
 Avi Kivity wrote:
 
 Did you install using 0.12.1.1 and then run using 0.12.1.2?  If
 so, it seems Windows XP is not able to move from AMD to Intel.
 
 No, although the original VM was built in VirtualBox if that makes a
 difference? I had to manually implement the merge ide fix here
 (http://support.microsoft.com/kb/314082) in order to bring up the VM
 under KVM but it has worked fine until the upgrade to 0.12.1.2.
 
 
Which version of kvm did you use before 0.12.1.1? Does this VM work on
0.11?

--
Gleb.


Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-06 Thread Avi Kivity

On 01/06/2010 02:50 PM, Mark Cave-Ayland wrote:

Avi Kivity wrote:

Did you install using 0.12.1.1 and then run using 0.12.1.2?  If so, 
it seems Windows XP is not able to move from AMD to Intel.


No, although the original VM was built in VirtualBox if that makes a 
difference? I had to manually implement the merge ide fix here 
(http://support.microsoft.com/kb/314082) in order to bring up the VM 
under KVM but it has worked fine until the upgrade to 0.12.1.2.




It probably did make some kind of difference.  Please try a clean install.

--
error compiling committee.c: too many arguments to function



Memory hotplug

2010-01-06 Thread Ryota Ozaki
Hi all,

I would like to know the state of guest memory
hotplug support.

Is the feature already supported?
Otherwise, is anyone working on that?

I know the virtio balloon can change the amount of
guest memory online; however, it can only change it
below the initially assigned amount of memory.
I want to increase guest memory beyond the initial
amount.

Thanks in advance,
 ozaki-r


Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-06 Thread Mark Cave-Ayland

Gleb Natapov wrote:


Can you start it in safe mode and see which driver fails?


Hmmm. Booting in Safe Mode seems to work fine, although the next attempt 
to boot into Normal Mode returns a BSOD with INTERNAL_POWER_ERROR 
which is new to me. Subsequent reboots then return to the 
DRIVER_UNLOADED_WITHOUT_CANCELING_PENDING_OPERATIONS BSOD.



ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-06 Thread Yaniv Kaul

On 1/6/2010 3:14 PM, Mark Cave-Ayland wrote:

Gleb Natapov wrote:


Can you start it in safe mode and see which driver fails?


Hmmm. Booting in Safe Mode seems to work fine, although the next 
attempt to boot into Normal Mode returns a BSOD with 
INTERNAL_POWER_ERROR which is new to me. Subsequent reboots then 
return to the DRIVER_UNLOADED_WITHOUT_CANCELING_PENDING_OPERATIONS 
BSOD.



ATB,

Mark.



Try logging the boot (possible via the F8 menu as well).
Y.


Re: VNC framebuffer block artefacts on qemu-kvm-0.12.1.1

2010-01-06 Thread Marcelo Tosatti
On Tue, Jan 05, 2010 at 04:11:26PM +, Mark Cave-Ayland wrote:
 Hi all,

 Having just upgraded from kvm-85 to qemu-kvm-0.12.1.1 on one of our  
 servers, I've noticed that I am seeing block artefacts when connecting  
 using VNC to the graphical VGA console of a WinXP guest.

 Looking at the VNC output, what I am seeing is that instead of updating  
 some parts of the screen which require a redraw, they are just being  
 replaced by light grey blocks of around 16x16 pixels. Generally, but not  
 always, several of these blocks appear in a row. Moving the mouse over  
 the relevant sections of the screen causes them to be redrawn correctly.

 I've tried this using both the cirrus and vga drivers, switching between  
 16/24/32 bit colour and also different resolutions and unfortunately the  
 effect still remains :( Is there anything else I can do to help try and  
 debug this? Again this is on an x86_64 Debian Lenny host with a 2.6.32.2  
 kernel on Intel.

Mark,

Can you confirm that reverting commit
02c2b87fff97e77a1f6033fb09f53afa267c0c1e fixes the problem? (patch
attached).

Anthony: it's reproducible with upstream/tcg.

diff --git a/vnchextile.h b/vnchextile.h
index 432ed89..c96ede3 100644
--- a/vnchextile.h
+++ b/vnchextile.h
@@ -73,7 +73,7 @@ static void CONCAT(send_hextile_tile_, NAME)(VncState *vs,
         *last_bg = bg;
     }
 
-    if (n_colors < 3 && (!*has_fg || *last_fg != fg)) {
+    if (!*has_fg || *last_fg != fg) {
         flags |= 0x04;
         *has_fg = 1;
         *last_fg = fg;
@@ -165,6 +165,8 @@ static void CONCAT(send_hextile_tile_, NAME)(VncState *vs,
             irow += ds_get_linesize(vs->ds) / sizeof(pixel_t);
         }
 
+        /* A SubrectsColoured subtile invalidates the foreground color */
+        *has_fg = 0;
         if (n_data > (w * h * sizeof(pixel_t))) {
             n_colors = 4;
             flags = 0x01;


Re: Memory hotplug

2010-01-06 Thread Marcelo Tosatti
On Wed, Jan 06, 2010 at 10:07:25PM +0900, Ryota Ozaki wrote:
 Hi all,
 
 I would like to know the state of guest memory
 hotplug support.
 
 Is the feature already supported?

No.

 Otherwise, is anyone working on that?

Not that I know of.

 I know the virtio balloon can change the amount of
 guest memory online; however, it can only change it
 below the initially assigned amount of memory.
 I want to increase guest memory beyond the initial
 amount.

That'd be great. Probably ACPI is the best mechanism
for memory hotplug.



Re: Memory hotplug

2010-01-06 Thread Ryota Ozaki
On Wed, Jan 6, 2010 at 10:59 PM, Marcelo Tosatti mtosa...@redhat.com wrote:
 On Wed, Jan 06, 2010 at 10:07:25PM +0900, Ryota Ozaki wrote:
 Hi all,

 I would like to know the state of guest memory
 hotplug support.

 Is the feature already supported?

 No.

 Otherwise, is anyone working on that?

 Not that I know of.

 I know the virtio balloon can change the amount of
 guest memory online; however, it can only change it
 below the initially assigned amount of memory.
 I want to increase guest memory beyond the initial
 amount.

 That'd be great. Probably ACPI is the best mechanism
 for memory hotplug.

Thanks for your advice. I agree with you.

I'll try to implement it if nobody is working on it.

  ozaki-r


Re: VNC framebuffer block artefacts on qemu-kvm-0.12.1.1

2010-01-06 Thread Mark Cave-Ayland

Marcelo Tosatti wrote:


Mark,

Can you confirm that reverting commit
02c2b87fff97e77a1f6033fb09f53afa267c0c1e fixes the problem? (patch
attached).

Anthony: it's reproducible with upstream/tcg.


Hi Marcelo,

Yes, this solves the problem for me - thanks a lot!

FWIW there is still another race condition in the VNC code somewhere. 
Due to the bad weather in the UK today, I'm working remotely over an SSH 
tunnel which seems to exacerbate the problem. What I see is that when 
scrolling large windows in WinXP quickly, my VNC client disconnects from 
the VGA framebuffer with messages like this:



 CConn:   Throughput 1131 kbit/s - changing to full colour
 CConn:   Using pixel format depth 24 (32bpp) little-endian rgb888
Rect too big: 7088x27 at 4123,40960 exceeds 800x600
 main:Rect too big

 CConn:   Throughput 1093 kbit/s - changing to full colour
 CConn:   Using pixel format depth 24 (32bpp) little-endian rgb888
Rect too big: 30224x50704 at 50448,13840 exceeds 720x400
 main:Rect too big


While I can always reconnect and continue where I left off, it can still 
be quite annoying sometimes.
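
As an illustration of the message in the log above, the bounds check
behind "Rect too big" presumably amounts to testing whether an update
rectangle lies inside the framebuffer. This is a sketch under that
assumption; rect_fits() is a hypothetical helper and is not taken from
any particular VNC client.

```c
/* Returns 1 when the w x h rectangle at (x, y) lies entirely inside a
 * framebuffer of fb_w x fb_h pixels. The comparisons are arranged so
 * that x + w and y + h are never computed and cannot overflow. */
int rect_fits(unsigned x, unsigned y, unsigned w, unsigned h,
              unsigned fb_w, unsigned fb_h)
{
    return w <= fb_w && x <= fb_w - w &&
           h <= fb_h && y <= fb_h - h;
}
```

With the first logged rectangle (7088x27 at 4123,40960 against 800x600)
the check fails on both axes.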



ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-06 Thread Mark Cave-Ayland

Avi Kivity wrote:


It probably did make some kind of difference.  Please try a clean install.


After several hours of testing, I've finally found out what the problem is.

I tried a clean WinXP guest install and that worked, so it was obviously 
a driver issue. After disabling various drivers in the WinXP guest, I 
didn't get anywhere so I decided to take a break and test Marcelo's VNC 
patch. With this applied, I could actually see all of the information in 
the BSOD which showed the error was in intelppm.sys.


A quick search took me to this page here: 
http://blogs.msdn.com/virtual_pc_guy/archive/2005/10/24/484461.aspx 
which explains the issue in more detail. I first tried disabling the 
intelppm driver and rebooting, but that didn't make a difference; 
however disabling the Processor driver worked and my guest VM booted in 
Normal Mode :)


I think the issue is probably similar to that explained in the article 
above; with a new processor reported to the guest, the internal 
processor driver tries to upload some kind of microcode to the new 
device which fails and causes the guest to fall over. Can we teach KVM 
to silently discard these kinds of updates?



ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-06 Thread Mark Cave-Ayland

Mark Cave-Ayland wrote:

A quick search took me to this page here: 
http://blogs.msdn.com/virtual_pc_guy/archive/2005/10/24/484461.aspx 
which explains the issue in more detail. I first tried disabling the 
intelppm driver and rebooting, but that didn't make a difference; 
however disabling the Processor driver worked and my guest VM booted in 
Normal Mode :)


I've just re-created the KVM image fresh from the VDI image once again 
and can confirm that disabling just the Processor driver is enough to 
allow the guest WinXP VM to function in qemu-kvm-0.12.1.2. Perhaps the 
default for -cpu host should not be changed in a micro release as 
there is a risk of breaking existing VMs?



ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


Re: VNC framebuffer block artefacts on qemu-kvm-0.12.1.1

2010-01-06 Thread Anthony Liguori

On 01/06/2010 07:51 AM, Marcelo Tosatti wrote:

On Tue, Jan 05, 2010 at 04:11:26PM +, Mark Cave-Ayland wrote:
   

Hi all,

Having just upgraded from kvm-85 to qemu-kvm-0.12.1.1 on one of our
servers, I've noticed that I am seeing block artefacts when connecting
using VNC to the graphical VGA console of a WinXP guest.

Looking at the VNC output, what I am seeing is that instead of updating
some parts of the screen which require a redraw, they are just being
replaced by light grey blocks of around 16x16 pixels. Generally, but not
always, several of these blocks appear in a row. Moving the mouse over
the relevant sections of the screen causes them to be redrawn correctly.

I've tried this using both the cirrus and vga drivers, switching between
16/24/32 bit colour and also different resolutions and unfortunately the
effect still remains :( Is there anything else I can do to help try and
debug this? Again this is on an x86_64 Debian Lenny host with a 2.6.32.2
kernel on Intel.
 

Mark,

Can you confirm that reverting commit
02c2b87fff97e77a1f6033fb09f53afa267c0c1e fixes the problem? (patch
attached).

Anthony: it's reproducible with upstream/tcg.


Which vnc client is this?

Regards,

Anthony Liguori


Re: VNC framebuffer block artefacts on qemu-kvm-0.12.1.1

2010-01-06 Thread Marcelo Tosatti
On Wed, Jan 06, 2010 at 11:58:14AM -0600, Anthony Liguori wrote:
 On 01/06/2010 07:51 AM, Marcelo Tosatti wrote:
 On Tue, Jan 05, 2010 at 04:11:26PM +, Mark Cave-Ayland wrote:

 Hi all,

 Having just upgraded from kvm-85 to qemu-kvm-0.12.1.1 on one of our
 servers, I've noticed that I am seeing block artefacts when connecting
 using VNC to the graphical VGA console of a WinXP guest.

 Looking at the VNC output, what I am seeing is that instead of updating
 some parts of the screen which require a redraw, they are just being
 replaced by light grey blocks of around 16x16 pixels. Generally, but not
 always, several of these blocks appear in a row. Moving the mouse over
 the relevant sections of the screen causes them to be redrawn correctly.

 I've tried this using both the cirrus and vga drivers, switching between
 16/24/32 bit colour and also different resolutions and unfortunately the
 effect still remains :( Is there anything else I can do to help try and
 debug this? Again this is on an x86_64 Debian Lenny host with a 2.6.32.2
 kernel on Intel.
  
 Mark,

 Can you confirm that reverting commit
 02c2b87fff97e77a1f6033fb09f53afa267c0c1e fixes the problem? (patch
 attached).

 Anthony: it's reproducible with upstream/tcg.

 Which vnc client is this?

vncviewer and vinagre.



Re: The HPET issue on Linux

2010-01-06 Thread Beth Kon

Dor Laor wrote:

On 01/06/2010 12:09 PM, Gleb Natapov wrote:

On Wed, Jan 06, 2010 at 05:48:52PM +0800, Sheng Yang wrote:

Hi Beth

I still found the emulated HPET would result in some boot failures. For
example, on my 2.6.30, with HPET enabled, the kernel would fail
check_timer(), specifically in timer_irq_works().

The test in timer_irq_works() lets 10 ticks pass (using mdelay()) and
then confirms that the clock source advanced jiffies by at least 5 ticks.
I've checked that, on my machine, it would mostly get only 4 ticks when
HPET is enabled, and so fail the test. On the other hand, if I use PIT, it
would get more than 10 ticks (maybe understandable if there are some
compensating ticks). Of course, extending the tick count/mdelay() time
can work.

I think it's a major issue with HPET. And it may just be due to a too-long
userspace path for interrupt injection... If that's true, I think it's not
easy to deal with.
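
A simplified sketch of the pass/fail rule described above, assuming it
reduces to the wrap-safe comparison that the kernel's time_after() macro
performs. The function name and its two parameters are hypothetical; the
real timer_irq_works() reads jiffies itself and delays with mdelay()
between the two reads.

```c
/* jiffies_before: tick counter sampled before the delay of 10 ticks.
 * jiffies_after:  tick counter sampled after the delay.
 * Returns 1 when more than 4 ticks were counted (that is, at least 5),
 * and 0 otherwise. The signed difference keeps the comparison correct
 * when the counter wraps around. */
int timer_irq_works_sketch(unsigned long jiffies_before,
                           unsigned long jiffies_after)
{
    return (long)(jiffies_before + 4UL - jiffies_after) < 0;
}
```

With the numbers reported here, 4 counted ticks fail the check while the
PIT's 10 or more pass it.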


PIT ticks are reinjected automatically; HPET should probably do the same,
although it may just create another set of problems.


Older Linux kernels adjust automatically for lost ticks, so automatic
reinjection causes time to run too fast. This is why we added the
-no-kvm-pit-reinjection flag...


It took a lot of time for pit/rtc to stabilize; before seriously
considering the hpet emulation, lots of testing should be done.

I will try to look into this. Since HPET is edge-triggered, it looks like
this problem is of a different nature than with the PIT.  Is this a solid
failure or intermittent?






--
Gleb.



--
Regards,

Beth Kon



Re: VNC framebuffer block artefacts on qemu-kvm-0.12.1.1

2010-01-06 Thread Anthony Liguori

On 01/06/2010 07:51 AM, Marcelo Tosatti wrote:

On Tue, Jan 05, 2010 at 04:11:26PM +, Mark Cave-Ayland wrote:
   

Hi all,

Having just upgraded from kvm-85 to qemu-kvm-0.12.1.1 on one of our
servers, I've noticed that I am seeing block artefacts when connecting
using VNC to the graphical VGA console of a WinXP guest.

Looking at the VNC output, what I am seeing is that instead of updating
some parts of the screen which require a redraw, they are just being
replaced by light grey blocks of around 16x16 pixels. Generally, but not
always, several of these blocks appear in a row. Moving the mouse over
the relevant sections of the screen causes them to be redrawn correctly.

I've tried this using both the cirrus and vga drivers, switching between
16/24/32 bit colour and also different resolutions and unfortunately the
effect still remains :( Is there anything else I can do to help try and
debug this? Again this is on an x86_64 Debian Lenny host with a 2.6.32.2
kernel on Intel.
 

Mark,

Can you confirm that reverting commit
02c2b87fff97e77a1f6033fb09f53afa267c0c1e fixes the problem? (patch
attached).

Anthony: it's reproducible with upstream/tcg.


What about just adding back this bit:

@@ -165,6 +165,8 @@ static void CONCAT(send_hextile_tile_, NAME)(VncState *vs,
             irow += ds_get_linesize(vs->ds) / sizeof(pixel_t);
         }
 
+        /* A SubrectsColoured subtile invalidates the foreground color */
+        *has_fg = 0;
         if (n_data > (w * h * sizeof(pixel_t))) {
             n_colors = 4;
             flags = 0x01;

I think I can rationalize why that would be needed.
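
One possible reading of that rationale, as a simplified sketch rather
than the QEMU code: the encoder remembers the last foreground it sent so
it can omit ForegroundSpecified (0x04) on later tiles, and after a tile
sent with SubrectsColoured (0x10) it should stop trusting that memory.
The struct, enum and function names are hypothetical; only the flag
values come from the hextile subencoding mask.

```c
#include <assert.h>
#include <stdint.h>

enum {
    HEXTILE_BG_SPECIFIED      = 0x02,
    HEXTILE_FG_SPECIFIED      = 0x04,
    HEXTILE_ANY_SUBRECTS      = 0x08,
    HEXTILE_SUBRECTS_COLOURED = 0x10
};

/* Colours the encoder believes the client currently holds. */
struct hextile_state {
    int has_bg;
    int has_fg;
    uint32_t last_bg;
    uint32_t last_fg;
};

/* Compute the subencoding flags for one tile.
 * n_colors is 1, 2, or 3, where 3 stands for "three or more". */
int hextile_tile_flags(struct hextile_state *st, int n_colors,
                       uint32_t bg, uint32_t fg)
{
    int flags = 0;

    assert(n_colors >= 1 && n_colors <= 3);

    if (!st->has_bg || st->last_bg != bg) {
        flags |= HEXTILE_BG_SPECIFIED;
        st->has_bg = 1;
        st->last_bg = bg;
    }

    if (n_colors == 2) {
        /* All subrects share one foreground; send it only if it changed. */
        flags |= HEXTILE_ANY_SUBRECTS;
        if (!st->has_fg || st->last_fg != fg) {
            flags |= HEXTILE_FG_SPECIFIED;
            st->has_fg = 1;
            st->last_fg = fg;
        }
    } else if (n_colors == 3) {
        /* Each subrect carries its own colour, so forget the foreground. */
        flags |= HEXTILE_ANY_SUBRECTS | HEXTILE_SUBRECTS_COLOURED;
        st->has_fg = 0;
    }

    return flags;
}
```

In this sketch a two-colour tile that follows a coloured tile always
resends its foreground, even when it matches the one sent earlier.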

Regards,

Anthony Liguori


Re: The HPET issue on Linux

2010-01-06 Thread Beth Kon

Beth Kon wrote:

Dor Laor wrote:

On 01/06/2010 12:09 PM, Gleb Natapov wrote:

On Wed, Jan 06, 2010 at 05:48:52PM +0800, Sheng Yang wrote:

Hi Beth

I still found the emulated HPET would result in some boot failures. For
example, on my 2.6.30, with HPET enabled, the kernel would fail
check_timer(), specifically in timer_irq_works().

The test in timer_irq_works() lets 10 ticks pass (using mdelay()) and
then confirms that the clock source advanced jiffies by at least 5 ticks.
I've checked that, on my machine, it would mostly get only 4 ticks when
HPET is enabled, and so fail the test. On the other hand, if I use PIT, it
would get more than 10 ticks (maybe understandable if there are some
compensating ticks). Of course, extending the tick count/mdelay() time
can work.

I think it's a major issue with HPET. And it may just be due to a too-long
userspace path for interrupt injection... If that's true, I think it's not
easy to deal with.


PIT ticks are reinjected automatically; HPET should probably do the same,
although it may just create another set of problems.


Older Linux kernels adjust automatically for lost ticks, so automatic
reinjection causes time to run too fast. This is why we added the
-no-kvm-pit-reinjection flag...


It took a lot of time for pit/rtc to stabilize; before seriously
considering the hpet emulation, lots of testing should be done.

I will try to look into this. Since HPET is edge-triggered, it looks like
this problem is of a different nature than with the PIT.  Is this a solid
failure or intermittent?

Anthony just explained that on x86, even edge-triggered interrupts are
queued in the apic and an eoi will occur, so this is no different from
the PIT.






--
Gleb.






--
Regards,

Beth Kon



Re: The HPET issue on Linux

2010-01-06 Thread Anthony Liguori

On 01/06/2010 01:20 PM, Beth Kon wrote:

Beth Kon wrote:
I will try to look into this. Since HPET is edge-triggered, looks 
like this problem is of a different nature than PIT.  Is this a solid 
failure or intermittent?
Anthony just explained that on x86, even edge-triggered interrupts are 
queued in the apic and an eoi will occur, so this is not different 
than the PIT.


Not quite queued in the sense that multiple events will be delivered in 
order, but I think the point is that you can still detect whether 
delivery succeeded by counting APIC EOIs.


The trouble is that historically we've struggled with doing this in 
userspace.  Maybe it's time to revisit.


Regards,

Anthony Liguori


Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-06 Thread Michael S. Tsirkin
On Tue, Sep 01, 2009 at 07:24:24AM -0700, Davide Libenzi wrote:
 On Tue, 1 Sep 2009, Avi Kivity wrote:
 
  On 09/01/2009 02:45 AM, Davide Libenzi wrote:
   On Thu, 27 Aug 2009, Davide Libenzi wrote:
   
  
On Thu, 27 Aug 2009, Michael S. Tsirkin wrote:

 
 Oh, I stopped pushing EFD_STATE since we have a solution.

Do you guys need the kernel-side eventfd_ctx_read() I posted or not?
Because if nobody uses it, I'm not going to push it.
 
   Guys, I did not get a reply on this. Do you need me to push it, or you're
   not going to use it at the end?
  
  
  We'll use it eventually, but we're still some ways from it.
 
 OK, then bug me when you're going to need it. I won't push it before that.
 
 
 - Davide

So, it turns out that we need this: we thought we didn't because
currently kvm does not zero eventfd counter when it polls eventfd.  But
this causes spurious interrupts when we disconnect irqfd from kvm and
re-connect it back.
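
To make the spurious-interrupt scenario concrete, here is a minimal userspace model in C. It is only a sketch: struct efd_model, efd_signal(), irqfd_handle_event() and irqfd_connect() are hypothetical names invented for illustration, not kernel or KVM interfaces, and the model assumes that a reconnect treats any non-zero counter as a pending event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an eventfd context; not the kernel's struct. */
struct efd_model {
    uint64_t count;
};

/* A signal adds to the counter, as a write to the eventfd would. */
static void efd_signal(struct efd_model *e, uint64_t n)
{
    assert(e != NULL);
    e->count += n;
}

/*
 * Wakeup handler: injects one interrupt if the counter is non-zero.
 * 'consume' selects whether the handler also zeroes the counter.
 * Returns the number of interrupts injected.
 */
static int irqfd_handle_event(struct efd_model *e, bool consume)
{
    assert(e != NULL);
    if (e->count == 0)
        return 0;
    if (consume)
        e->count = 0;
    return 1;
}

/*
 * (Re)connect: a counter that is still non-zero cannot be told apart
 * from an event that arrived while nobody was listening, so it is
 * injected again.  Returns the number of interrupts injected.
 */
static int irqfd_connect(const struct efd_model *e)
{
    assert(e != NULL);
    return e->count > 0 ? 1 : 0;
}
```

With consume == false, one signal yields one interrupt from the handler and a second one at reconnect time; with consume == true the reconnect injects nothing.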

However, since kvm does its own thing with the wait queue, and might
read the counter from wait queue callback (which might be from
interrupt context), a simpler, lower-level interface would be better for
us.  Does the following (build tested only) look palatable?

Thanks!


diff --git a/fs/eventfd.c b/fs/eventfd.c
index d26402f..e350ffd 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -135,6 +135,17 @@ static unsigned int eventfd_poll(struct file *file, poll_table *wait)
 	return events;
 }
 
+/* Caller must have wait queue head lock. */
+ssize_t _eventfd_read_ctx(struct eventfd_ctx *ctx, u64 *ucnt)
+{
+	if (!ctx->count)
+		return -EAGAIN;
+	*ucnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
+	ctx->count -= *ucnt;
+	return sizeof *ucnt;
+}
+EXPORT_SYMBOL_GPL(_eventfd_read_ctx);
+
 static ssize_t eventfd_read(struct file *file, char __user *buf, size_t count,
 			    loff_t *ppos)
 {
@@ -146,17 +157,14 @@ static ssize_t eventfd_read(struct file *file, char __user *buf, size_t count,
 	if (count < sizeof(ucnt))
 		return -EINVAL;
 	spin_lock_irq(&ctx->wqh.lock);
-	res = -EAGAIN;
-	if (ctx->count > 0)
-		res = sizeof(ucnt);
-	else if (!(file->f_flags & O_NONBLOCK)) {
+	res = _eventfd_read_ctx(ctx, &ucnt);
+	if (res < 0 && !(file->f_flags & O_NONBLOCK)) {
 		__add_wait_queue(&ctx->wqh, &wait);
 		for (res = 0;;) {
 			set_current_state(TASK_INTERRUPTIBLE);
-			if (ctx->count > 0) {
-				res = sizeof(ucnt);
+			res = _eventfd_read_ctx(ctx, &ucnt);
+			if (res > 0)
 				break;
-			}
 			if (signal_pending(current)) {
 				res = -ERESTARTSYS;
 				break;
@@ -169,8 +177,6 @@ static ssize_t eventfd_read(struct file *file, char __user *buf, size_t count,
 		__set_current_state(TASK_RUNNING);
 	}
 	if (likely(res > 0)) {
-		ucnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
-		ctx->count -= ucnt;
 		if (waitqueue_active(&ctx->wqh))
 			wake_up_locked_poll(&ctx->wqh, POLLOUT);
 	}
diff --git a/include/linux/eventfd.h b/include/linux/eventfd.h
index 94dd103..a3d0ce9 100644
--- a/include/linux/eventfd.h
+++ b/include/linux/eventfd.h
@@ -34,6 +34,7 @@ struct file *eventfd_fget(int fd);
 struct eventfd_ctx *eventfd_ctx_fdget(int fd);
 struct eventfd_ctx *eventfd_ctx_fileget(struct file *file);
 int eventfd_signal(struct eventfd_ctx *ctx, int n);
+ssize_t _eventfd_read_ctx(struct eventfd_ctx *ctx, u64 *ucnt);
 
 #else /* CONFIG_EVENTFD */
 
@@ -61,6 +62,11 @@ static inline void eventfd_ctx_put(struct eventfd_ctx *ctx)
 
 }
 
+static inline ssize_t _eventfd_read_ctx(struct eventfd_ctx *ctx, u64 *ucnt)
+{
+	return -ENOSYS;
+}
+
 #endif
 
 #endif /* _LINUX_EVENTFD_H */


-- 
MST


Re: The HPET issue on Linux

2010-01-06 Thread Gleb Natapov
On Wed, Jan 06, 2010 at 01:23:28PM -0600, Anthony Liguori wrote:
 On 01/06/2010 01:20 PM, Beth Kon wrote:
 Beth Kon wrote:
 I will try to look into this. Since HPET is edge-triggered,
 looks like this problem is of a different nature than PIT.  Is
 this a solid failure or intermittent?
 Anthony just explained that on x86, even edge-triggered interrupts
 are queued in the apic and an eoi will occur, so this is not
 different than the PIT.
 
 Not quite queued in the sense that multiple events will be delivered
 in order, but I think the point is that you can still detect whether
 delivery succeeded by counting APIC EOIs.
 
 The trouble is that historically we've struggled with doing this in
 userspace.  Maybe it's time to revisit.
 
We reinject PIT interrupts from kernel and RTC interrupts from
userspace.

--
Gleb.


[PATCH] make help output be a little more self-consistent

2010-01-06 Thread Bruce Rogers
Signed-off-by: Bruce Rogers brog...@novell.com
---
 qemu-options.hx |   58 --
 1 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 812d067..fdd5884 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -42,7 +42,7 @@ DEF(smp, HAS_ARG, QEMU_OPTION_smp,
 -smp n[,maxcpus=cpus][,cores=cores][,threads=threads][,sockets=sockets]\n
 set the number of CPUs to 'n' [default=1]\n
 maxcpus= maximum number of total cpus, including\n
-  offline CPUs for hotplug etc.\n
+offline CPUs for hotplug, etc\n
 cores= number of CPU cores on one socket\n
 threads= number of threads on one CPU core\n
 sockets= number of discrete sockets in the system\n)
@@ -406,8 +406,9 @@ ETEXI
 DEF(device, HAS_ARG, QEMU_OPTION_device,
 -device driver[,options]  add device\n)
 DEF(name, HAS_ARG, QEMU_OPTION_name,
--name string1[,process=string2]set the name of the guest\n
-string1 sets the window title and string2 the process name 
(on Linux)\n)
+-name string1[,process=string2]\n
+set the name of the guest\n
+string1 sets the window title and string2 the process 
name (on Linux)\n)
 STEXI
 @item -name @var{name}
 Sets the @var{name} of the guest.
@@ -484,7 +485,7 @@ ETEXI
 
 #ifdef CONFIG_SDL
 DEF(ctrl-grab, 0, QEMU_OPTION_ctrl_grab,
--ctrl-grab   use Right-Ctrl to grab mouse (instead of Ctrl-Alt)\n)
+-ctrl-grab  use Right-Ctrl to grab mouse (instead of Ctrl-Alt)\n)
 #endif
 STEXI
 @item -ctrl-grab
@@ -757,12 +758,12 @@ ETEXI
 #ifdef TARGET_I386
 DEF(smbios, HAS_ARG, QEMU_OPTION_smbios,
 -smbios file=binary\n
-Load SMBIOS entry from binary file\n
+load SMBIOS entry from binary file\n
 -smbios type=0[,vendor=str][,version=str][,date=str][,release=%%d.%%d]\n
-Specify SMBIOS type 0 fields\n
+specify SMBIOS type 0 fields\n
 -smbios 
type=1[,manufacturer=str][,product=str][,version=str][,serial=str]\n
   [,uuid=uuid][,sku=str][,family=str]\n
-Specify SMBIOS type 1 fields\n)
+specify SMBIOS type 1 fields\n)
 #endif
 STEXI
 @item -smbios fi...@var{binary}
@@ -817,13 +818,13 @@ DEF(net, HAS_ARG, QEMU_OPTION_net,
 -net 
tap[,vlan=n][,name=str][,fd=h][,ifname=name][,script=file][,downscript=dfile][,sndbuf=nbytes][,vnet_hdr=on|off]\n
 connect the host TAP network interface to VLAN 'n' and 
use the\n
 network scripts 'file' (default=%s)\n
-and 'dfile' (default=%s);\n
-use '[down]script=no' to disable script execution;\n
+and 'dfile' (default=%s)\n
+use '[down]script=no' to disable script execution\n
 use 'fd=h' to connect to an already opened TAP 
interface\n
-use 'sndbuf=nbytes' to limit the size of the send buffer; 
the\n
-default of 'sndbuf=1048576' can be disabled using 
'sndbuf=0'\n
-use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap 
flag; use\n
-vnet_hdr=on to make the lack of IFF_VNET_HDR support an 
error condition\n
+use 'sndbuf=nbytes' to limit the size of the send buffer 
(the\n
+default of 'sndbuf=1048576' can be disabled using 
'sndbuf=0')\n
+use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap 
flag\n
+use vnet_hdr=on to make the lack of IFF_VNET_HDR support 
an error condition\n
 #endif
 -net 
socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n
 connect the vlan 'n' to another VLAN using a socket 
connection\n
@@ -838,7 +839,7 @@ DEF(net, HAS_ARG, QEMU_OPTION_net,
 #endif
 -net dump[,vlan=n][,file=f][,len=n]\n
 dump traffic on vlan 'n' to file 'f' (max n bytes per 
packet)\n
--net none   use it alone to have zero network devices; if no -net 
option\n
+-net none   use it alone to have zero network devices. If no -net 
option\n
 is provided, the default is '-net nic -net user'\n)
 DEF(netdev, HAS_ARG, QEMU_OPTION_netdev,
 -netdev [
@@ -1590,7 +1591,7 @@ The default device is @code{vc} in graphical mode and 
@code{stdio} in
 non graphical mode.
 ETEXI
 DEF(qmp, HAS_ARG, QEMU_OPTION_qmp, \
--qmp devlike -monitor but opens in 'control' mode.\n)
+-qmp devlike -monitor but opens in 'control' mode\n)
 
 DEF(mon, HAS_ARG, QEMU_OPTION_mon, \
 -mon chardev=[name][,mode=readline|control][,default]\n)
@@ -1608,7 +1609,7 @@ from a script.
 ETEXI
 
 DEF(singlestep, 0, QEMU_OPTION_singlestep, \
--singlestep   always run in singlestep 

Re: The HPET issue on Linux

2010-01-06 Thread Anthony Liguori

On 01/06/2010 01:44 PM, Gleb Natapov wrote:

On Wed, Jan 06, 2010 at 01:23:28PM -0600, Anthony Liguori wrote:

On 01/06/2010 01:20 PM, Beth Kon wrote:

Beth Kon wrote:

I will try to look into this. Since HPET is edge-triggered,
looks like this problem is of a different nature than PIT.  Is
this a solid failure or intermittent?

Anthony just explained that on x86, even edge-triggered interrupts
are queued in the apic and an eoi will occur, so this is not
different than the PIT.

Not quite queued in the sense that multiple events will be delivered
in order, but I think the point is that you can still detect whether
delivery succeeded by counting APIC EOIs.

The trouble is that historically we've struggled with doing this in
userspace.  Maybe it's time to revisit.

We reinject PIT interrupts from kernel and RTC interrupts from
userspace.


Because we can determine that we've missed an RTC interrupt in 
userspace.  We cannot determine this with the PIT without adding a hook 
into the userspace apic that lets us know whether an injection failed or 
not.


Regards,

Anthony Liguori


Re: The HPET issue on Linux

2010-01-06 Thread Gleb Natapov
On Wed, Jan 06, 2010 at 01:51:54PM -0600, Anthony Liguori wrote:
 On 01/06/2010 01:44 PM, Gleb Natapov wrote:
 On Wed, Jan 06, 2010 at 01:23:28PM -0600, Anthony Liguori wrote:
 On 01/06/2010 01:20 PM, Beth Kon wrote:
 Beth Kon wrote:
 I will try to look into this. Since HPET is edge-triggered,
 looks like this problem is of a different nature than PIT.  Is
 this a solid failure or intermittent?
 Anthony just explained that on x86, even edge-triggered interrupts
 are queued in the apic and an eoi will occur, so this is not
 different than the PIT.
 Not quite queued in the sense that multiple events will be delivered
 in order, but I think the point is that you can still detect whether
 delivery succeeded by counting APIC EOIs.
 
 The trouble is that historically we've struggled with doing this in
 userspace.  Maybe it's time to revisit.
 
 We reinject PIT interrupts from kernel and RTC interrupts from
 userspace.
 
 Because we can determine that we've missed an RTC interrupt in
 userspace.  We cannot determine this with the PIT without adding a
 hook into the userspace apic that lets us know whether an injection
 failed or not.
 
We have exactly that hook in apic already and that's how RTC determines
that interrupt was coalesced.
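
A rough model of that kind of hook, for readers following along: the timer clears a "delivered" counter, raises the interrupt, and treats a counter that is still zero as a coalesced tick. This is a simplified sketch, not QEMU's code. All names (apic_model, apic_raise(), apic_guest_ack(), timer_inject()) are made up for illustration, and the model assumes an interrupt counts as delivered only when its pending bit was not already set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one APIC: 256 pending bits plus a delivery counter. */
struct apic_model {
    uint32_t irr[8];
    int irq_delivered;
};

static void apic_reset_delivered(struct apic_model *a)
{
    a->irq_delivered = 0;
}

/* Raise a vector; it only counts as delivered if it was not already pending. */
static void apic_raise(struct apic_model *a, unsigned vector)
{
    assert(vector < 256);
    uint32_t mask = UINT32_C(1) << (vector % 32);
    if (!(a->irr[vector / 32] & mask))
        a->irq_delivered++;
    a->irr[vector / 32] |= mask;
}

/* Simplification: the guest servicing the vector clears the pending bit. */
static void apic_guest_ack(struct apic_model *a, unsigned vector)
{
    assert(vector < 256);
    a->irr[vector / 32] &= ~(UINT32_C(1) << (vector % 32));
}

/* Timer side: returns true if delivered, false (and counts it) if coalesced. */
static bool timer_inject(struct apic_model *a, unsigned vector,
                         unsigned *coalesced)
{
    apic_reset_delivered(a);
    apic_raise(a, vector);
    if (a->irq_delivered == 0) {
        (*coalesced)++;
        return false;
    }
    return true;
}
```

Two back-to-back injections without a guest ack in between give one delivered and one coalesced tick; the coalesced count is what a reinjection scheme would later pay back.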

--
Gleb.


Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-06 Thread Davide Libenzi
On Wed, 6 Jan 2010, Michael S. Tsirkin wrote:

 On Tue, Sep 01, 2009 at 07:24:24AM -0700, Davide Libenzi wrote:
  On Tue, 1 Sep 2009, Avi Kivity wrote:
  
   On 09/01/2009 02:45 AM, Davide Libenzi wrote:
On Thu, 27 Aug 2009, Davide Libenzi wrote:

   
 On Thu, 27 Aug 2009, Michael S. Tsirkin wrote:
 
  
  Oh, I stopped pushing EFD_STATE since we have a solution.
 
 Do you guys need the kernel-side eventfd_ctx_read() I posted or not?
 Because if nobody uses it, I'm not going to push it.
  
Guys, I did not get a reply on this. Do you need me to push it, or 
you're
not going to use it at the end?
   
   
   We'll use it eventually, but we're still some ways from it.
  
  OK, then bug me when you're going to need it. I won't push it before that.
  
  
  - Davide
 
 So, it turns out that we need this: be thought we don't because
 currently kvm does not zero eventfd counter when it polls eventfd.  But
 this causes spurious interrupts when we disconnect irqfd from kvm and
 re-connect it back.
 
 However, since kvm does its own thing with the wait queue, and might
 read the counter from wait queue callback (which might be from
 interrupt context), a simpler, lower-level interface would be better for
 us.  Does the following (build tested only) look palatable?

What is wrong with you? :/
There was a patch attached with this email, and yet you did your own, and 
yet again you managed to add an underscore at the beginning of the API 
name, in an API set where there are no leading underscores.
Even if KVM does that for its own reasons, you should be able to fit your 
naming style to the interface you're adding your droplets into.
I will post an eventfd_read_ctx() to Andrew ASAP.



- Davide




Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-06 Thread Michael S. Tsirkin
On Wed, Jan 06, 2010 at 12:43:07PM -0800, Davide Libenzi wrote:
 On Wed, 6 Jan 2010, Michael S. Tsirkin wrote:
 
  On Tue, Sep 01, 2009 at 07:24:24AM -0700, Davide Libenzi wrote:
   On Tue, 1 Sep 2009, Avi Kivity wrote:
   
On 09/01/2009 02:45 AM, Davide Libenzi wrote:
 On Thu, 27 Aug 2009, Davide Libenzi wrote:
 

  On Thu, 27 Aug 2009, Michael S. Tsirkin wrote:
  
   
   Oh, I stopped pushing EFD_STATE since we have a solution.
  
  Do you guys need the kernel-side eventfd_ctx_read() I posted or not?
  Because if nobody uses it, I'm not going to push it.
   
 Guys, I did not get a reply on this. Do you need me to push it, or 
 you're
 not going to use it at the end?


We'll use it eventually, but we're still some ways from it.
   
   OK, then bug me when you're going to need it. I won't push it before that.
   
   
   - Davide
  
  So, it turns out that we need this: be thought we don't because
  currently kvm does not zero eventfd counter when it polls eventfd.  But
  this causes spurious interrupts when we disconnect irqfd from kvm and
  re-connect it back.
  
  However, since kvm does its own thing with the wait queue, and might
  read the counter from wait queue callback (which might be from
  interrupt context), a simpler, lower-level interface would be better for
  us.  Does the following (build tested only) look palatable?
 
 What is wrong with you? :/

I was trying to be helpful.

 There was a patch attached with this email, and yet you did your own,

I tried to explain, no?  That patch was taking wait queue spinlock and
was assuming that eventfd_read_ctx is called from a task that can block.
KVM attaches its own poller so this is not a good fit for us.  Hope this
clarifies.

 and yet again you managed to add an underscore at the beginning of the
 API name, in an API set where there are no leading underscores.  Even
 if KVM does that for its own reasons, you should be able to fit your
 naming style to the interface you're adding your droplets into.  I
 will post an eventfd_read_ctx() to Andrew ASAP.
 
 
 
 - Davide

Sorry about the underscore - it's easy to fix but I won't bore you with
more patch revisions before I understand what makes you unhappy.

Before KVM starts creating workqueues and doing schedule_work calls just
to work around the API, can we discuss this please? I am not hung on my
patch, but could we have it work so that we can call eventfd_read_ctx
from wait queue callback directly?

Thanks!


-- 
MST


Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-06 Thread Davide Libenzi
On Wed, 6 Jan 2010, Michael S. Tsirkin wrote:

 I tried to explain, no?  That patch was taking wait queue spinlock and
 was assuming that eventfd_read_ctx is called from a task that can block.
 KVM attaches its own poller so this is not a good fit for us.  Hope this
 clarifies.
 
  and yet again you managed to add an underscore at the beginning of the
  API name, in an API set where there are no leading underscores.  Even
  if KVM does that for its own reasons, you should be able to fit your
  naming style to the interface you're adding your droplets into.  I
  will post an eventfd_read_ctx() to Andrew ASAP.
  
  
  
  - Davide
 
 Sorry about the underscore - it's easy to fix but I won't bore you with
 more patch revisions before I understand what makes you unhappy.
 
 Before KVM starts creating workqueues and doing schedule_work calls just
 to work around the API, can we discuss this please? I am not hung on my
 patch, but could we have it work so that we can call eventfd_read_ctx
 from wait queue callback directly?

The read needs to wake possible OUT waiters, *cannot* be done your way.
On top of creating an interface which requires a lock, which no one can get 
from the interface itself, since it's not exposed.
I could split the two and have a locked one, and an unlocked one, but that 
looks shitty too (for the above reason).
I thought you *already* do your own stuff from a work queue, why would you 
need to read from IRQ context?



- Davide




Re: VNC framebuffer block artefacts on qemu-kvm-0.12.1.1

2010-01-06 Thread Mark Cave-Ayland

Anthony Liguori wrote:


What about just adding back this bit:

@@ -165,6 +165,8 @@ static void CONCAT(send_hextile_tile_, NAME)(VncState *vs,

 irow += ds_get_linesize(vs->ds) / sizeof(pixel_t);
 }

+/* A SubrectsColoured subtile invalidates the foreground color */
+*has_fg = 0;
 if (n_data > (w * h * sizeof(pixel_t))) {
 n_colors = 4;
 flags = 0x01;

I think I can rationalize why that would be needed.


Hi Anthony,

I've done a rebuild with just the above part of the patch applied, and I 
can confirm that indeed it resolves the problem on my WinXP guest here.
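
For anyone wondering why that one assignment matters, here is a small standalone sketch of the encoder-side bookkeeping. It is not the QEMU code: struct hextile_state and tile_flags() are hypothetical, background handling is left out, and the flag values are the hextile subencoding bits as I understand them from the RFB protocol description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hextile subencoding bits (assumed from the RFB protocol description). */
enum {
    HEXTILE_RAW               = 0x01,
    HEXTILE_BG_SPECIFIED      = 0x02,
    HEXTILE_FG_SPECIFIED      = 0x04,
    HEXTILE_ANY_SUBRECTS      = 0x08,
    HEXTILE_SUBRECTS_COLOURED = 0x10
};

/* Hypothetical encoder state carried from one tile to the next. */
struct hextile_state {
    bool has_fg;
    uint32_t last_fg;
};

/* Compute the flags for one tile; background handling is omitted. */
static uint8_t tile_flags(struct hextile_state *s, int n_colors, uint32_t fg)
{
    assert(n_colors >= 1);
    uint8_t flags = 0;

    if (n_colors == 1)
        return flags;                 /* solid tile, no subrects */

    flags |= HEXTILE_ANY_SUBRECTS;
    if (n_colors == 2) {
        /* Send the foreground only if the client cannot reuse the old one. */
        if (!s->has_fg || s->last_fg != fg) {
            flags |= HEXTILE_FG_SPECIFIED;
            s->has_fg = true;
            s->last_fg = fg;
        }
        return flags;
    }

    /* Each subrect carries its own colour, so the remembered fg is stale. */
    flags |= HEXTILE_SUBRECTS_COLOURED;
    s->has_fg = false;
    return flags;
}
```

Without the has_fg reset, a two-colour tile that follows a coloured-subrects tile would omit the foreground and the client would paint it with whatever colour it last saw, which is consistent with the block artefacts reported in this thread.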



Many thanks,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-06 Thread Michael S. Tsirkin
On Wed, Jan 06, 2010 at 01:17:02PM -0800, Davide Libenzi wrote:
 On Wed, 6 Jan 2010, Michael S. Tsirkin wrote:
 
  I tried to explain, no?  That patch was taking wait queue spinlock and
  was assuming that eventfd_read_ctx is called from a task that can block.
  KVM attaches its own poller so this is not a good fit for us.  Hope this
  clarifies.
  
   and yet again you managed to add an underscore at the beginning of the
   API name, in an API set where there are no leading underscores.  Even
   if KVM does that for its own reasons, you should be able to fit your
   naming style to the interface you're adding your droplets into.  I
   will post an eventfd_read_ctx() to Andrew ASAP.
   
   
   
   - Davide
  
  Sorry about the underscore - it's easy to fix but I won't bore you with
  more patch revisions before I understand what makes you unhappy.
  
  Before KVM starts creating workqueues and doing schedule_work calls just
  to work around the API, can we discuss this please? I am not hung on my
  patch, but could we have it work so that we can call eventfd_read_ctx
  from wait queue callback directly?
 
 The read needs to wake possible OUT waiters, *cannot* be done your way.

Right. Now I see. Thanks!

 On top of creating an interface which requires a lock, which noone can get 
 from the interface itself, since it's not exposed.

I think here's how KVM gets it: we call poll with our own poll table,
then in poll_queue_proc we get the wait queue pointer, and we use the
wait queue. The lock is in there :)

 I could split the two and have a locked one, and an unlocked one, but that 
 looks shitty too (for the above reason).

Yes, this will work. Thanks!

 I thought you *already* do your own stuff from a work queue, why would you 
 need to read from IRQ context?
 
 
 
 - Davide

This was just a workaround while interrupt delivery was using mutex
locks, so we had a workqueue to work around that limitation.  Now that
it has all been switched to RCU, we'll be getting rid of this.

-- 
MST


Re: The HPET issue on Linux

2010-01-06 Thread Anthony Liguori

On 01/06/2010 02:37 PM, Gleb Natapov wrote:

We have exactly that hook in apic already and that's how RTC determines
that interrupt was coalesced.
   


AFAICT, apic_irq_delivered is only reset explicitly by the RTC when the 
line is lowered.  It's not currently lowered based on EOI.


How can this mechanism be used with the HPET when operating in edge 
triggered mode?


Regards,

Anthony Liguori




Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-06 Thread Davide Libenzi
On Thu, 7 Jan 2010, Michael S. Tsirkin wrote:

  On top of creating an interface which requires a lock, which noone can get 
  from the interface itself, since it's not exposed.
 
 I think here's how KVM gets it: the way it does is by calling poll with
 our own poll table, then in poll_queue_proc we get wait queue pointer,
 and we use the wait queue. Lock is in there :)

Yes, I know you are called locked, but it does not lead to a clean 
interface.



  I could split the two and have a locked one, and an unlocked one, but that 
  looks shitty too (for the above reason).
 
 Yes, this will work. Thanks!

This is a lot more complex than I thought. The wakeup code is already 
enumerating the list, and doing a wakeup might trigger a secondary 
enumeration/recursion.
Do you really need to consume the value from IRQ context, or can you 
simply peek the value, and flush it later?



- Davide




Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-06 Thread Michael S. Tsirkin
On Wed, Jan 06, 2010 at 02:46:06PM -0800, Davide Libenzi wrote:
 On Thu, 7 Jan 2010, Michael S. Tsirkin wrote:
 
   On top of creating an interface which requires a lock, which noone can 
   get 
   from the interface itself, since it's not exposed.
  
  I think here's how KVM gets it: the way it does is by calling poll with
  our own poll table, then in poll_queue_proc we get wait queue pointer,
  and we use the wait queue. Lock is in there :)
 
 Yes, I know you are called locked, but it does not lead to a clean 
 interface.

True.

   I could split the two and have a locked one, and an unlocked one, but 
   that 
   looks shitty too (for the above reason).
  
  Yes, this will work. Thanks!
 
 This is a lot more complex than I thought. The wakeup code is already 
 enumerating the list, and doing a wakeup might trigger a secondary 
 enumeration/recursion.

For KVM, what you describe is, I think, not a problem: we check the wake
type and ignore POLLOUT events.

 Do you really need to consume the value from IRQ context, or you can 
 simply peek the value, and flush it later?
 
 
 
 - Davide

Maybe yes. I'll think it over and get back to you. Thanks!

-- 
MST


Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-06 Thread Davide Libenzi
On Thu, 7 Jan 2010, Michael S. Tsirkin wrote:

 On Wed, Jan 06, 2010 at 02:46:06PM -0800, Davide Libenzi wrote:
  On Thu, 7 Jan 2010, Michael S. Tsirkin wrote:
  
On top of creating an interface which requires a lock, which noone can 
get 
from the interface itself, since it's not exposed.
   
   I think here's how KVM gets it: the way it does is by calling poll with
   our own poll table, then in poll_queue_proc we get wait queue pointer,
   and we use the wait queue. Lock is in there :)
  
  Yes, I know you are called locked, but it does not lead to a clean 
  interface.
 
 True.
 
I could split the two and have a locked one, and an unlocked one, but 
that 
looks shitty too (for the above reason).
   
   Yes, this will work. Thanks!
  
  This is a lot more complex than I thought. The wakeup code is already 
  enumerating the list, and doing a wakeup might trigger a secondary 
  enumeration/recursion.
 
 For KVM what you describe is I think is not a problem: we check wake type
 and ignore POLLOUT events.

You seem to think in one dimension only ;)
The interface needs to be stable for everyone.



 Maybe yes. I'll think it over and get back to you. Thanks!

Let me know.



- Davide




Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-06 Thread Michael S. Tsirkin
On Wed, Jan 06, 2010 at 03:59:11PM -0800, Davide Libenzi wrote:
 On Thu, 7 Jan 2010, Michael S. Tsirkin wrote:
 
  On Wed, Jan 06, 2010 at 02:46:06PM -0800, Davide Libenzi wrote:
   On Thu, 7 Jan 2010, Michael S. Tsirkin wrote:
   
 On top of creating an interface which requires a lock, which noone 
 can get 
 from the interface itself, since it's not exposed.

I think here's how KVM gets it: the way it does is by calling poll with
our own poll table, then in poll_queue_proc we get wait queue pointer,
and we use the wait queue. Lock is in there :)
   
   Yes, I know you are called locked, but it does not lead to a clean 
   interface.
  
  True.
  
 I could split the two and have a locked one, and an unlocked one, but 
 that 
 looks shitty too (for the above reason).

Yes, this will work. Thanks!
   
   This is a lot more complex than I thought. The wakeup code is already 
   enumerating the list, and doing a wakeup might trigger a secondary 
   enumeration/recursion.
  
  For KVM what you describe is I think is not a problem: we check wake type
  and ignore POLLOUT events.
 
 You seem to think in one dimension only ;)
 The interface needs to be stable for everyone.
 

I agree it's better to have an interface that can't be misused.  I was just
pointing out that users *can* break recursion by looking at the event type.
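
As a sketch of what "looking at the event type" could mean: the wakeup callback receives the poll flags of the wakeup and only acts on readable events, so a wakeup aimed at POLLOUT waiters (as a read would issue) does not re-enter the interrupt path. The names below (struct irqfd_stats, irqfd_wakeup_model()) are hypothetical and this is plain userspace C, not the KVM callback.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical counters, just to observe what the callback did. */
struct irqfd_stats {
    int injected;
    int ignored;
};

/*
 * Model of a wait queue callback: 'key' carries the poll flags of the
 * wakeup.  Only POLLIN wakeups inject; anything else is ignored.
 * Returns 1 if an interrupt was injected, 0 otherwise.
 */
static int irqfd_wakeup_model(struct irqfd_stats *st, unsigned long key)
{
    assert(st != NULL);
    if (key & POLLIN) {
        st->injected++;
        return 1;
    }
    st->ignored++;
    return 0;
}
```

This only shows that one consumer can protect itself; it does nothing for other users of the interface, which is the objection raised above.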

  Maybe yes. I'll think it over and get back to you. Thanks!
 
 Let me know.
 
 
 
 - Davide
 


Re: The HPET issue on Linux

2010-01-06 Thread Sheng Yang
On Thursday 07 January 2010 02:36:26 Beth Kon wrote:
 Dor Laor wrote:
  On 01/06/2010 12:09 PM, Gleb Natapov wrote:
  On Wed, Jan 06, 2010 at 05:48:52PM +0800, Sheng Yang wrote:
  Hi Beth
 
  I still found the emulated HPET would result in some boot failure. For
  example, on my 2.6.30, with HPET enabled, the kernel would fail
  check_timer(),
  especially in timer_irq_works().
 
  The testing of timer_irq_works() is let 10 ticks pass(using
  mdelay()), and
  want to confirm the clock source with at least 5 ticks advanced in
  jiffies.
  I've checked that, on my machine, it would mostly get only 4 ticks
  when HPET
  enabled, then fail the test. On the other hand, if I using PIT, it
  would get
  more than 10 ticks(maybe understandable if some complementary ticks
  there). Of
  course, extend the ticks count/mdelay() time can work.
 
  I think it's a major issue of HPET. And it maybe just due to a too long
  userspace path for interrupt injection... If it's true, I think it's
  not easy
  to deal with it.
 
  PIT tick are reinjected automatically, HPET should probably do the same
  although it may just create another set of problems.
 
  Older Linux do automatic adjustment for lost ticks so automatic
  reinjection causes time to run too fast. This is why we added the
  -no-kvm-pit-reinject flag...
 
  It took lots of time to pit/rtc to stabilize, in order of seriously
  consider the hpet emulation, lots of testing should be done.
 
 I will try to look into this. Since HPET is edge-triggered, looks like
 this problem is of a different nature than PIT.  Is this a solid failure
 or intermittent?
 
At least for v2.6.30 on my box, it always fails... Of course, I believe the 
chance of successfully injecting enough interrupts depends on many factors. 
So I think our target can be: not far behind what PIT can do...
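
For reference, the pass/fail rule being discussed can be modelled in a few lines of C. This is a sketch from memory of how the check behaves, not a copy of the kernel source: wait roughly ten ticks, then require that jiffies advanced by more than 4, that is, at least 5 ticks. The function names are hypothetical.

```c
#include <assert.h>

/* Length of the busy-wait covering ten timer ticks, in milliseconds. */
static unsigned long ten_tick_delay_ms(unsigned long hz)
{
    assert(hz > 0);
    return (10UL * 1000UL) / hz;
}

/*
 * Returns 1 if the tick counter advanced by more than 4 between 'start'
 * and 'now', 0 otherwise.  The subtraction is done on unsigned values
 * so that a counter wrap between the two samples is handled.
 */
static int timer_irq_works_model(unsigned long start, unsigned long now)
{
    return (long)((start + 4UL) - now) < 0;
}
```

With the numbers reported in this thread, 4 observed ticks fail the check while the 10 or more seen with the PIT pass it.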

-- 
regards
Yang, Sheng


Re: The HPET issue on Linux

2010-01-06 Thread Gleb Natapov
On Wed, Jan 06, 2010 at 04:42:30PM -0600, Anthony Liguori wrote:
 On 01/06/2010 02:37 PM, Gleb Natapov wrote:
 We have exactly that hook in apic already and that's how RTC determines
 that interrupt was coalesced.
 
 AFAICT, apic_irq_delivered is only reset explicitly by the RTC when
 the line is lowered.  It's not currently lowered based on EOI.
 
Correct. We can expose ACK notifiers to userspace (and if we want to
move assigned devices into userspace we have to), but I'd rather avoid
it.

 How can this mechanism be used with the HPET when operating in edge
 triggered mode?

If an interrupt is coalesced, increment a counter and double the HPET
timer frequency. When the counter reaches zero, return the HPET timer to
its normal frequency.
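
A possible shape for that catch-up logic, as a sketch only. The mail does not say how the counter is paid back, so this model makes an assumption: it accounts in half ticks, because at doubled frequency each delivered interrupt covers half a normal period and therefore repays half a tick. struct hpet_catchup and both functions are hypothetical names, not QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical catch-up state for one periodic timer. */
struct hpet_catchup {
    uint64_t base_period_ns;   /* period the guest programmed */
    uint64_t period_ns;        /* period currently in use */
    unsigned owed_half_ticks;  /* guest time still owed, in half ticks */
};

static void hpet_catchup_init(struct hpet_catchup *h, uint64_t base_period_ns)
{
    assert(base_period_ns >= 2);
    h->base_period_ns = base_period_ns;
    h->period_ns = base_period_ns;
    h->owed_half_ticks = 0;
}

/*
 * Called on every timer expiry.  'delivered' is false when the
 * interrupt was coalesced.  An expiry at the normal rate covers two
 * half ticks of guest time, one at the doubled rate covers one; a
 * delivered interrupt gives the guest one whole tick.
 */
static void hpet_on_expiry(struct hpet_catchup *h, bool delivered)
{
    unsigned due = (h->owed_half_ticks > 0) ? 1 : 2;

    h->owed_half_ticks += due;
    if (delivered)
        h->owed_half_ticks -= 2;

    /* Run at double frequency while anything is owed. */
    h->period_ns = (h->owed_half_ticks > 0) ? h->base_period_ns / 2
                                            : h->base_period_ns;
}
```

In this model one coalesced tick is repaid by two delivered interrupts at the doubled rate, after which the timer drops back to the programmed period and the guest has seen exactly as many ticks as real time warrants.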

--
Gleb.


Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-06 Thread Michael S. Tsirkin
On Wed, Jan 06, 2010 at 03:59:11PM -0800, Davide Libenzi wrote:
 On Thu, 7 Jan 2010, Michael S. Tsirkin wrote:
 
  On Wed, Jan 06, 2010 at 02:46:06PM -0800, Davide Libenzi wrote:
   On Thu, 7 Jan 2010, Michael S. Tsirkin wrote:
   
 On top of creating an interface which requires a lock, which no one
 can get from the interface itself, since it's not exposed.

I think here's how KVM gets it: we call poll with our own poll table,
then in poll_queue_proc we get the wait queue pointer, and we use that
wait queue. The lock is in there :)
   
   Yes, I know you are called locked, but it does not lead to a clean 
   interface.
  
  True.
  
 I could split the two and have a locked one, and an unlocked one, but that
 looks shitty too (for the above reason).

Yes, this will work. Thanks!
   
   This is a lot more complex than I thought. The wakeup code is already 
   enumerating the list, and doing a wakeup might trigger a secondary 
   enumeration/recursion.
  
  For KVM, what you describe is, I think, not a problem: we check the wake
  type and ignore POLLOUT events.
 
 You seem to think in one dimension only ;)
 The interface needs to be stable for everyone.
 
 
 
  Maybe yes. I'll think it over and get back to you. Thanks!
 
 Let me know.
 
 
 
 - Davide

OK. What I think we need is a way to remove ourselves from the eventfd
wait queue and clear the counter atomically.

We currently do
remove_wait_queue(irqfd->wqh, &irqfd->wait);
where wqh saves the eventfd wait queue head.

If we do this before the proposed eventfd_read_ctx, we can lose events.
If we do this after, we can get spurious events.

An unlocked read is one way to fix this.
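
To make the ordering problem concrete, here is a user-space C model of
"remove ourselves and clear the counter atomically". It is only a sketch: the
model_* names are hypothetical, a pthread mutex stands in for the wait queue
head lock, and none of this is the kernel eventfd API. The point is that
unhooking the waiter and draining the counter share one critical section, so
a signal lands either entirely before (seen and drained) or entirely after
(no wakeup for us, counter left for the next user).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for an eventfd context. */
struct model_eventfd {
    pthread_mutex_t lock;     /* plays the role of the wait queue head lock */
    uint64_t count;           /* pending event counter */
    void (*waiter)(void *);   /* single registered wakeup callback, or NULL */
    void *waiter_arg;
};

/* Producer: add events and call the waiter while holding the lock. */
static void model_signal(struct model_eventfd *e, uint64_t n)
{
    pthread_mutex_lock(&e->lock);
    e->count += n;
    if (e->waiter)
        e->waiter(e->waiter_arg);   /* the callback runs "called locked" */
    pthread_mutex_unlock(&e->lock);
}

/* Consumer: unhook the waiter and drain the counter in one critical section. */
static uint64_t model_detach_and_drain(struct model_eventfd *e)
{
    pthread_mutex_lock(&e->lock);
    e->waiter = NULL;
    e->waiter_arg = NULL;
    uint64_t drained = e->count;    /* read without taking the lock again */
    e->count = 0;
    pthread_mutex_unlock(&e->lock);
    return drained;
}

/* Example callback: count how many wakeups were observed. */
static void model_count_wakeup(void *arg)
{
    (*(int *)arg)++;
}
```

Splitting model_detach_and_drain into two separately locked steps would
reopen the window described above: a signal arriving between them is either
drained without ever waking us or wakes us after we meant to be gone,
depending on the order of the steps.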

-- 
MST


Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-06 Thread Davide Libenzi
On Thu, 7 Jan 2010, Michael S. Tsirkin wrote:

 OK. What I think we need is a way to remove ourselves from the eventfd
 wait queue and clear the counter atomically.
 
 We currently do
 remove_wait_queue(irqfd->wqh, &irqfd->wait);
 where wqh saves the eventfd wait queue head.

You do a remove_wait_queue() from inside a callback wakeup on the same 
wait queue head?


 If we do this before the proposed eventfd_read_ctx, we can lose events.
 If we do this after, we can get spurious events.
 
 An unlocked read is one way to fix this.

You posted one line of code and a two-line analysis of the issue. Can you
be a little bit more verbose and show me more code, so that I can actually
see what is going on?


- Davide




Re: [PATCH] PPC: Enable lightweight exits again

2010-01-06 Thread Marcelo Tosatti
On Mon, Jan 04, 2010 at 10:19:25PM +0100, Alexander Graf wrote:
 The PowerPC C ABI defines that registers r14-r31 need to be preserved across
 function calls. Since our exit handler is written in C, we can make use of that
 and don't need to reload r14-r31 on every entry/exit cycle.
 
 This technique is also used in the BookE code and is called lightweight exits
 there. To follow the tradition, it's called the same in Book3S.
 
 So far this optimization was disabled though, as the code didn't do what it was
 expected to do, but failed to work.
 
 This patch fixes and enables lightweight exits again.
 
 Signed-off-by: Alexander Graf ag...@suse.de

Applied, thanks.

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html