date:20120227

Recall: [PATCH 1/2] powerpc/e500: make load_up_spe a normal fuction

2012-02-27 Thread Yin Olivia-R63875

Yin Olivia-R63875 would like to recall the message, "[PATCH 1/2] powerpc/e500: 
make load_up_spe a normal fuction".
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

RE: [PATCH 1/2] powerpc/e500: make load_up_spe a normal fuction

2012-02-27 Thread Yin Olivia-R63875

Hi Scott,

This was reviewed earlier and accepted into the internal tree:
http://linux.freescale.net/patchwork/patch/11100/
http://git.am.freescale.net/gitolite/gitweb.cgi/sdk/kvm.git/commit/?h=for-sdk1.2&id=c5088844dc665dbdae4fa51b8d58dc203bacc17e

I didn't change anything except the line.
I only posted it to the external kvm-ppc mailing list. Should I add my own 
Signed-off-by?

Best Regards,
Olivia

-----Original Message-----
From: Wood Scott-B07421 
Sent: Tuesday, February 28, 2012 3:19 AM
To: Yin Olivia-R63875
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; 
linuxppc-...@lists.ozlabs.org; Liu Yu-B13201
Subject: Re: [PATCH 1/2] powerpc/e500: make load_up_spe a normal fuction

On 02/27/2012 04:59 AM, Olivia Yin wrote:
> So that we can call it in kernel.
> 
> Signed-off-by: Liu Yu 

Explain why we want this, and point out that this makes it similar to 
load_up_fpu.

> ---
>  arch/powerpc/kernel/head_fsl_booke.S |   23 ++-
>  1 files changed, 6 insertions(+), 17 deletions(-)

When posting a patch authored by someone else, more or less unchanged, you 
should put a From: line in the body of the e-mail.

git send-email will do this automatically if you preserve the authorship in the 
git commit.

Also, you should add your own Signed-off-by.

-Scott

Re: [PATCH] kvm: notify host when guest paniced

2012-02-27 Thread Wen Congyang

At 02/27/2012 11:08 PM, Jan Kiszka Wrote:
> On 2012-02-27 04:01, Wen Congyang wrote:
>> We can tell that a guest has panicked when it runs on Xen,
>> but KVM has no such feature. This patch implements it the
>> same way Xen does: register a panic notifier and issue a
>> hypercall when the guest panics.
>>
>> Signed-off-by: Wen Congyang 
>> ---
>>  arch/x86/kernel/kvm.c|   12 
>>  arch/x86/kvm/svm.c   |8 ++--
>>  arch/x86/kvm/vmx.c   |8 ++--
>>  arch/x86/kvm/x86.c   |   13 +++--
>>  include/linux/kvm.h  |1 +
>>  include/linux/kvm_para.h |1 +
>>  6 files changed, 37 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
>> index f0c6fd6..b928d1d 100644
>> --- a/arch/x86/kernel/kvm.c
>> +++ b/arch/x86/kernel/kvm.c
>> @@ -331,6 +331,17 @@ static struct notifier_block kvm_pv_reboot_nb = {
>>  .notifier_call = kvm_pv_reboot_notify,
>>  };
>>  
>> +static int
>> +kvm_pv_panic_notify(struct notifier_block *nb, unsigned long code, void 
>> *unused)
>> +{
>> +kvm_hypercall0(KVM_HC_GUEST_PANIC);
>> +return NOTIFY_DONE;
>> +}
>> +
>> +static struct notifier_block kvm_pv_panic_nb = {
>> +.notifier_call = kvm_pv_panic_notify,
>> +};
>> +
> 
> You should split up host and guest-side changes.

OK

> 
>>  static u64 kvm_steal_clock(int cpu)
>>  {
>>  u64 steal;
>> @@ -417,6 +428,7 @@ void __init kvm_guest_init(void)
>>  
>>  paravirt_ops_setup();
>>  register_reboot_notifier(&kvm_pv_reboot_nb);
>> +atomic_notifier_chain_register(&panic_notifier_list, &kvm_pv_panic_nb);
>>  for (i = 0; i < KVM_TASK_SLEEP_HASHSIZE; i++)
>>  spin_lock_init(&async_pf_sleepers[i].lock);
>>  if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF))
>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> index 0b7690e..38b4705 100644
>> --- a/arch/x86/kvm/svm.c
>> +++ b/arch/x86/kvm/svm.c
>> @@ -1900,10 +1900,14 @@ static int halt_interception(struct vcpu_svm *svm)
>>  
>>  static int vmmcall_interception(struct vcpu_svm *svm)
>>  {
>> +int ret;
>> +
>>  svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
>>  skip_emulated_instruction(&svm->vcpu);
>> -kvm_emulate_hypercall(&svm->vcpu);
>> -return 1;
>> +ret = kvm_emulate_hypercall(&svm->vcpu);
>> +
>> +/* Ignore the error? */
>> +return ret == 0 ? 0 : 1;
> 
> Why can't kvm_emulate_hypercall return the right value?

Because before this patch, kvm always ignored the error.

After rereading the code: kvm_emulate_hypercall() returns -KVM_EPERM
when the vcpu's CPL is not 0. I think we should handle this case
inside kvm_emulate_hypercall() and return 1, but I do not know
how to do it. kvm_queue_exception(vcpu, UD_VECTOR)?
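
The return-value question above can be illustrated with a small user-space model. This is only a sketch, not kernel code: every name and constant in it (toy_vcpu, HC_GUEST_PANIC, EXIT_GUEST_PANIC, TOY_EPERM, TOY_ENOSYS) is a hypothetical stand-in, and it assumes the convention that an exit handler returns 1 to resume the guest and 0 to return to user space. It also assumes, purely for illustration, that a non-zero CPL is reported to the guest through the result register; injecting #UD instead is the alternative raised above and is not modelled here.

```c
#include <assert.h>

/* All values below are illustrative stand-ins, not the kernel's definitions. */
enum { HC_VAPIC_POLL_IRQ = 1, HC_GUEST_PANIC = 2 };
enum { EXIT_NONE = 0, EXIT_GUEST_PANIC = 1 };
enum { TOY_EPERM = 1, TOY_ENOSYS = 1000 };

struct toy_vcpu {
	int cpl;           /* current privilege level of the guest */
	unsigned long nr;  /* hypercall number requested by the guest */
	long rax;          /* result register handed back to the guest */
	int exit_reason;   /* reason reported to user space on exit */
};

/*
 * Returns 1 when the guest should simply be resumed and 0 when the
 * vcpu loop should return to user space.
 */
int toy_emulate_hypercall(struct toy_vcpu *vcpu)
{
	int resume = 1;

	assert(vcpu != 0);
	vcpu->exit_reason = EXIT_NONE;

	/* Unprivileged caller: report the error to the guest, keep running. */
	if (vcpu->cpl != 0) {
		vcpu->rax = -TOY_EPERM;
		return resume;
	}

	switch (vcpu->nr) {
	case HC_VAPIC_POLL_IRQ:
		vcpu->rax = 0;
		break;
	case HC_GUEST_PANIC:
		/* The only case that hands control back to user space. */
		vcpu->rax = 0;
		vcpu->exit_reason = EXIT_GUEST_PANIC;
		resume = 0;
		break;
	default:
		vcpu->rax = -TOY_ENOSYS;
		break;
	}
	return resume;
}

/* The exit handler passes the value through without remapping it. */
int toy_handle_vmcall(struct toy_vcpu *vcpu)
{
	return toy_emulate_hypercall(vcpu);
}
```

With this shape, neither the SVM nor the VMX handler would need a "ret == 0 ? 0 : 1" mapping, since the emulation function already returns the resume/exit decision.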

> 
>>  }
>>  
>>  static unsigned long nested_svm_get_tdp_cr3(struct kvm_vcpu *vcpu)
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 66147ca..1b57ebb 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -4582,9 +4582,13 @@ static int handle_halt(struct kvm_vcpu *vcpu)
>>  
>>  static int handle_vmcall(struct kvm_vcpu *vcpu)
>>  {
>> +int ret;
>> +
>>  skip_emulated_instruction(vcpu);
>> -kvm_emulate_hypercall(vcpu);
>> -return 1;
>> +ret = kvm_emulate_hypercall(vcpu);
>> +
>> +/* Ignore the error? */
>> +return ret == 0 ? 0 : 1;
>>  }
>>  
>>  static int handle_invd(struct kvm_vcpu *vcpu)
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index c9d99e5..3fc2853 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -4923,7 +4923,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>>  u64 param, ingpa, outgpa, ret;
>>  uint16_t code, rep_idx, rep_cnt, res = HV_STATUS_SUCCESS, rep_done = 0;
>>  bool fast, longmode;
>> -int cs_db, cs_l;
>> +int cs_db, cs_l, r = 1;
>>  
>>  /*
>>   * hypercall generates UD from non zero cpl and real mode
>> @@ -4964,6 +4964,10 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>>  case HV_X64_HV_NOTIFY_LONG_SPIN_WAIT:
>>  kvm_vcpu_on_spin(vcpu);
>>  break;
>> +case KVM_HC_GUEST_PANIC:
>> +vcpu->run->exit_reason = KVM_EXIT_GUEST_PANIC;
>> +r = 0;
>> +break;
> 
> That's the wrong place. This is a KVM hypercall, not a HyperV one.

OK, I will remove it.

Thanks
Wen Congyang

> 
>>  default:
>>  res = HV_STATUS_INVALID_HYPERCALL_CODE;
>>  break;
>> @@ -4977,7 +4981,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>>  kvm_register_write(vcpu, VCPU_REGS_RAX, ret & 0x);
>>  }
>>  
>> -return 1;
>> +return r;
>>  }
>>  
>>  int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>> @@ -5013,6 +5017,11 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>>  case KVM_HC_VAPIC_POLL_IRQ:
>>  ret = 0;
>>  break;
>> +case KVM_HC_GUEST_PANIC:
>> +ret = 0;
>> +vcpu->run->exi

[PATCH] KVM: expose Intel cpu new features to guest

2012-02-27 Thread Liu, Jinsong

From ecd8be962f69393c183f941bfdbd7a7d3876d442 Mon Sep 17 00:00:00 2001
From: Liu, Jinsong 
Date: Mon, 27 Feb 2012 05:19:32 +0800
Subject: [PATCH] KVM: expose Intel cpu new features to guest

Intel recently released two new features, HLE and RTM.
Refer to http://software.intel.com/file/41417.
This patch exposes them to the guest.

Signed-off-by: Liu, Jinsong 
---
 arch/x86/include/asm/cpufeature.h |2 ++
 arch/x86/kvm/cpuid.c  |3 ++-
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 17c5d4b..e8d12a8 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -198,10 +198,12 @@
 /* Intel-defined CPU features, CPUID level 0x0007:0 (ebx), word 9 */
 #define X86_FEATURE_FSGSBASE   (9*32+ 0) /* {RD/WR}{FS/GS}BASE instructions*/
 #define X86_FEATURE_BMI1   (9*32+ 3) /* 1st group bit manipulation extensions */
+#define X86_FEATURE_HLE    (9*32+ 4) /* Hardware Lock Elision */
 #define X86_FEATURE_AVX2   (9*32+ 5) /* AVX2 instructions */
 #define X86_FEATURE_SMEP   (9*32+ 7) /* Supervisor Mode Execution Protection */
 #define X86_FEATURE_BMI2   (9*32+ 8) /* 2nd group bit manipulation extensions */
 #define X86_FEATURE_ERMS   (9*32+ 9) /* Enhanced REP MOVSB/STOSB */
+#define X86_FEATURE_RTM    (9*32+11) /* Restricted Transactional Memory */
 
 #if defined(__KERNEL__) && !defined(__ASSEMBLY__)
 
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 9fed5be..c2134b8 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -247,7 +247,8 @@ static int do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 
/* cpuid 7.0.ebx */
const u32 kvm_supported_word9_x86_features =
-   F(FSGSBASE) | F(BMI1) | F(AVX2) | F(SMEP) | F(BMI2) | F(ERMS);
+   F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
+   F(BMI2) | F(ERMS) | F(RTM);
 
/* all calls to cpuid_count() should be made on the same cpu */
get_cpu();
-- 
1.7.1
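
The feature defines in the patch pack a capability word index and a bit position into one number (word * 32 + bit), and word 9 corresponds to CPUID.(EAX=7,ECX=0):EBX according to the comment in cpufeature.h. A minimal sketch of that arithmetic follows; the helper names feature_word, feature_mask and ebx7_has are hypothetical and exist only for this illustration, they are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Same word*32+bit encoding as the defines in the patch. */
#define X86_FEATURE_HLE (9*32 + 4)
#define X86_FEATURE_RTM (9*32 + 11)

/* Index of the 32-bit capability word a feature lives in. */
unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Single-bit mask of the feature within its word. */
uint32_t feature_mask(unsigned int feature)
{
	return UINT32_C(1) << (feature % 32);
}

/*
 * Hypothetical helper: does a raw CPUID.(EAX=7,ECX=0):EBX value
 * advertise the given word-9 feature?
 */
int ebx7_has(uint32_t ebx, unsigned int feature)
{
	assert(feature_word(feature) == 9);
	return (ebx & feature_mask(feature)) != 0;
}
```

Under this encoding HLE is bit 4 and RTM is bit 11 of EBX, so an EBX value with both features set has 0x810 among its bits.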



Re: [PATCH v3] KVM: Allow host IRQ sharing for assigned PCI 2.3 devices

2012-02-27 Thread Jan Kiszka

On 2012-02-28 00:15, Alex Williamson wrote:
> On Mon, 2012-02-27 at 23:07 +0100, Jan Kiszka wrote:
>> On 2012-02-27 22:05, Alex Williamson wrote:
>>> On Fri, 2012-02-10 at 19:17 +0100, Jan Kiszka wrote:
>>>> PCI 2.3 allows to generically disable IRQ sources at device level. This
>>>> enables us to share legacy IRQs of such devices with other host devices
>>>> when passing them to a guest.
>>>>
>>>> The new IRQ sharing feature introduced here is optional, user space has
>>>> to request it explicitly. Moreover, user space can inform us about its
>>>> view of PCI_COMMAND_INTX_DISABLE so that we can avoid unmasking the
>>>> interrupt and signaling it if the guest masked it via the virtualized
>>>> PCI config space.
>>>>
>>>> Signed-off-by: Jan Kiszka 
>>>> ---
>>>>
>>>> Changes in v3:
>>>>  - rebased over current kvm.git (no code conflict, just api.txt)
>>>>
>>>>  Documentation/virtual/kvm/api.txt |   31 ++
>>>>  arch/x86/kvm/x86.c|1 +
>>>>  include/linux/kvm.h   |6 +
>>>>  include/linux/kvm_host.h  |2 +
>>>>  virt/kvm/assigned-dev.c   |  208 
>>>> +++-
>>>>  5 files changed, 219 insertions(+), 29 deletions(-)
>>>>
>>>> diff --git a/Documentation/virtual/kvm/api.txt 
>>>> b/Documentation/virtual/kvm/api.txt
>>>> index 59a3826..5ce0e29 100644
>>>> --- a/Documentation/virtual/kvm/api.txt
>>>> +++ b/Documentation/virtual/kvm/api.txt
>>>> @@ -1169,6 +1169,14 @@ following flags are specified:
>>>>  
>>>>  /* Depends on KVM_CAP_IOMMU */
>>>>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU   (1 << 0)
>>>> +/* The following two depend on KVM_CAP_PCI_2_3 */
>>>> +#define KVM_DEV_ASSIGN_PCI_2_3(1 << 1)
>>>> +#define KVM_DEV_ASSIGN_MASK_INTX  (1 << 2)
>>>> +
>>>> +If KVM_DEV_ASSIGN_PCI_2_3 is set, the kernel will manage legacy INTx 
>>>> interrupts
>>>> +via the PCI-2.3-compliant device-level mask, thus enable IRQ sharing with 
>>>> other
>>>> +assigned devices or host devices. KVM_DEV_ASSIGN_MASK_INTX specifies the
>>>> +guest's view on the INTx mask, see KVM_ASSIGN_SET_INTX_MASK for details.
>>>>  
>>>>  The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure
>>>>  isolation of the device.  Usages not specifying this flag are deprecated.
>>>> @@ -1441,6 +1449,29 @@ The "num_dirty" field is a performance hint for KVM 
>>>> to determine whether it
>>>>  should skip processing the bitmap and just invalidate everything.  It must
>>>>  be set to the number of set bits in the bitmap.
>>>>  
>>>> +4.60 KVM_ASSIGN_SET_INTX_MASK
>>>> +
>>>> +Capability: KVM_CAP_PCI_2_3
>>>> +Architectures: x86
>>>> +Type: vm ioctl
>>>> +Parameters: struct kvm_assigned_pci_dev (in)
>>>> +Returns: 0 on success, -1 on error
>>>> +
>>>> +Informs the kernel about the guest's view on the INTx mask. As long as the
>>>> +guest masks the legacy INTx, the kernel will refrain from unmasking it at
>>>> +hardware level and will not assert the guest's IRQ line. User space is 
>>>> still
>>>> +responsible for applying this state to the assigned device's real config 
>>>> space
>>>> +by setting or clearing the Interrupt Disable bit 10 in the Command 
>>>> register.
>>>> +
>>>> +To avoid that the kernel overwrites the state user space wants to set,
>>>> +KVM_ASSIGN_SET_INTX_MASK has to be called prior to updating the config 
>>>> space.
>>>> +Moreover, user space has to write back its own view on the Interrupt 
>>>> Disable
>>>> +bit whenever modifying the Command word.
>>>
>>> This is very confusing.  I think we need to work on the wording, but
>>> perhaps it's not worth hold up the patch.  It seems the simplest,
>>
>> As I need another round anyway (see below), I'm open for better wording
>> suggestions.
> 
> Now that I know what it does, I'll probably write something just as
> confusing, but here's a shot:
> 
> Allows userspace to mask PCI INTx interrupts from the assigned
> device.  The kernel will not deliver INTx interrupts to the
> guest between setting and clearing of KVM_ASSIGN_SET_INTX_MASK
> via this interface.  This enables use of and emulation of PCI
> 2.3 INTx disable command register behavior.
> 
> This may be used for both PCI 2.3 devices supporting INTx
> disable natively and older devices lacking this support.
> Userspace is responsible for emulating the read value of the
> INTx disable bit in the guest visible PCI command register.
> When modifying the INTx disable state, userspace should precede
> updating the physical device command register by calling this
> ioctl to inform the kernel of the new intended INTx mask state.
> 
> Note that the kernel uses the device INTx disable bit to
> internally manage the device interrupt state for PCI 2.3
> devices.  Reads of this register may therefore not match the
> expected value.  Writes should always use the guest intended
> INTx disable valu

Re: [PATCH v3] KVM: Allow host IRQ sharing for assigned PCI 2.3 devices

2012-02-27 Thread Alex Williamson

On Mon, 2012-02-27 at 23:07 +0100, Jan Kiszka wrote:
> On 2012-02-27 22:05, Alex Williamson wrote:
> > On Fri, 2012-02-10 at 19:17 +0100, Jan Kiszka wrote:
> >> PCI 2.3 allows to generically disable IRQ sources at device level. This
> >> enables us to share legacy IRQs of such devices with other host devices
> >> when passing them to a guest.
> >>
> >> The new IRQ sharing feature introduced here is optional, user space has
> >> to request it explicitly. Moreover, user space can inform us about its
> >> view of PCI_COMMAND_INTX_DISABLE so that we can avoid unmasking the
> >> interrupt and signaling it if the guest masked it via the virtualized
> >> PCI config space.
> >>
> >> Signed-off-by: Jan Kiszka 
> >> ---
> >>
> >> Changes in v3:
> >>  - rebased over current kvm.git (no code conflict, just api.txt)
> >>
> >>  Documentation/virtual/kvm/api.txt |   31 ++
> >>  arch/x86/kvm/x86.c|1 +
> >>  include/linux/kvm.h   |6 +
> >>  include/linux/kvm_host.h  |2 +
> >>  virt/kvm/assigned-dev.c   |  208 
> >> +++-
> >>  5 files changed, 219 insertions(+), 29 deletions(-)
> >>
> >> diff --git a/Documentation/virtual/kvm/api.txt 
> >> b/Documentation/virtual/kvm/api.txt
> >> index 59a3826..5ce0e29 100644
> >> --- a/Documentation/virtual/kvm/api.txt
> >> +++ b/Documentation/virtual/kvm/api.txt
> >> @@ -1169,6 +1169,14 @@ following flags are specified:
> >>  
> >>  /* Depends on KVM_CAP_IOMMU */
> >>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU   (1 << 0)
> >> +/* The following two depend on KVM_CAP_PCI_2_3 */
> >> +#define KVM_DEV_ASSIGN_PCI_2_3(1 << 1)
> >> +#define KVM_DEV_ASSIGN_MASK_INTX  (1 << 2)
> >> +
> >> +If KVM_DEV_ASSIGN_PCI_2_3 is set, the kernel will manage legacy INTx 
> >> interrupts
> >> +via the PCI-2.3-compliant device-level mask, thus enable IRQ sharing with 
> >> other
> >> +assigned devices or host devices. KVM_DEV_ASSIGN_MASK_INTX specifies the
> >> +guest's view on the INTx mask, see KVM_ASSIGN_SET_INTX_MASK for details.
> >>  
> >>  The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure
> >>  isolation of the device.  Usages not specifying this flag are deprecated.
> >> @@ -1441,6 +1449,29 @@ The "num_dirty" field is a performance hint for KVM 
> >> to determine whether it
> >>  should skip processing the bitmap and just invalidate everything.  It must
> >>  be set to the number of set bits in the bitmap.
> >>  
> >> +4.60 KVM_ASSIGN_SET_INTX_MASK
> >> +
> >> +Capability: KVM_CAP_PCI_2_3
> >> +Architectures: x86
> >> +Type: vm ioctl
> >> +Parameters: struct kvm_assigned_pci_dev (in)
> >> +Returns: 0 on success, -1 on error
> >> +
> >> +Informs the kernel about the guest's view on the INTx mask. As long as the
> >> +guest masks the legacy INTx, the kernel will refrain from unmasking it at
> >> +hardware level and will not assert the guest's IRQ line. User space is 
> >> still
> >> +responsible for applying this state to the assigned device's real config 
> >> space
> >> +by setting or clearing the Interrupt Disable bit 10 in the Command 
> >> register.
> >> +
> >> +To avoid that the kernel overwrites the state user space wants to set,
> >> +KVM_ASSIGN_SET_INTX_MASK has to be called prior to updating the config 
> >> space.
> >> +Moreover, user space has to write back its own view on the Interrupt 
> >> Disable
> >> +bit whenever modifying the Command word.
> > 
> > This is very confusing.  I think we need to work on the wording, but
> > perhaps it's not worth hold up the patch.  It seems the simplest,
> 
> As I need another round anyway (see below), I'm open for better wording
> suggestions.

Now that I know what it does, I'll probably write something just as
confusing, but here's a shot:

Allows userspace to mask PCI INTx interrupts from the assigned
device.  The kernel will not deliver INTx interrupts to the
guest between setting and clearing of KVM_ASSIGN_SET_INTX_MASK
via this interface.  This enables use of and emulation of PCI
2.3 INTx disable command register behavior.

This may be used for both PCI 2.3 devices supporting INTx
disable natively and older devices lacking this support.
Userspace is responsible for emulating the read value of the
INTx disable bit in the guest visible PCI command register.
When modifying the INTx disable state, userspace should precede
updating the physical device command register by calling this
ioctl to inform the kernel of the new intended INTx mask state.
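
The read-emulation duty described in the paragraph above can be sketched in a few lines. This is only an illustration under stated assumptions: guest_visible_command is a hypothetical helper name, and the mask value 0x400 follows from "Interrupt Disable bit 10" in the quoted documentation. The idea is that user space never shows the guest the physical INTx disable bit, which the kernel may be toggling, and substitutes the guest's own last-written view instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND_INTX_DISABLE 0x400  /* bit 10 of the Command register */

/*
 * Build the Command register value a guest read should return:
 * all bits come from the physical register except INTx disable,
 * which reflects what the guest itself last asked for.
 */
uint16_t guest_visible_command(uint16_t physical, bool guest_intx_disabled)
{
	uint16_t value = physical & (uint16_t)~PCI_COMMAND_INTX_DISABLE;

	if (guest_intx_disabled)
		value |= PCI_COMMAND_INTX_DISABLE;

	/* The emulated bit must always match the guest's view. */
	assert(((value & PCI_COMMAND_INTX_DISABLE) != 0) == guest_intx_disabled);
	return value;
}
```

For example, if the kernel currently has the physical bit set (0x0407) but the guest never masked INTx, the guest would read 0x0007.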

Note that the kernel uses the device INTx disable bit to
internally manage the device interrupt state for PCI 2.3
devices.  Reads of this register may therefore not match the
expected value.  Writes should always use the guest intended
INTx disable value rather than attempting to read-copy-update
the current physical device state.  Races

[Bug 42829] New: KVM Guest with virtio network driver loses network connectivity

2012-02-27 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=42829

   Summary: KVM Guest with virtio network driver loses network
connectivity
   Product: Virtualization
   Version: unspecified
Kernel Version: 3.3.0-rc5
  Platform: All
OS/Version: Linux
  Tree: Mainline
Status: NEW
  Severity: high
  Priority: P1
 Component: kvm
AssignedTo: virtualization_...@kernel-bugs.osdl.org
ReportedBy: stefan.bo...@gmail.com
CC: a...@redhat.com
Regression: Yes


Description:

Running KVM guests with virtio network interfaces, the guest will
(probably in some - unidentified - circumstances) stop receiving packets.
A tcpdump on the bugged interface will show only ARP requests being
sent by the server and unanswered.

Possible workarounds:

Temporarily:
- restart network interface in guest

Permanent:
- use the e1000 network driver as a replacement for the virtio_net driver


How to reproduce:
-
1) start KVM guest:
qemu-kvm
-nodefaults
-name vps
-chroot /chroot
-runas kvm
-pidfile /var/run/kvm/vps.pid
-vnc 1.2.3.4:0
-vga std --full-screen
-smp 2 -m 1g -cpu host
-mem-path /hugepages
-mem-prealloc
-kvm-shadow-memory 1g
-enable-kvm
-daemonize
-rtc base=localtime,clock=host,driftfix=none
-balloon virtio
-net nic,model=virtio,vlan=0,macaddr=52:54:00:33:22:11
-net bridge,br=br0,vlan=0
-drive aio=native,index=0,media=disk,cache=none,if=virtio,file=vps.img
-boot order=c,menu=off &

2) generate heavy traffic (after a few minutes the virtio network card crashes):
 from host to guest:
  screen ping 10.8.7.2 -s 65507
  screen iperf -s -i 1 -f M
 from guest to host:
  screen ping 10.8.7.1 -s 65507
  screen iperf -c 10.8.7.2 -i 1 -f M

3) restart the guest network interface and repeat step 2) until the network crashes again

Above situation occurs on (my tests):
host kernel 3.3+
qemu-kvm 1.0+
any guest Linux 2.6+ - 3.3+ kernel
any guest Windows 7+

Other users have described the same situation on some forums:
http://bugs.centos.org/view.php?id=5526

http://serverfault.com/questions/362038/qemu-kvm-virtual-machine-virtio-network-freeze-under-load
(includes a possible old patch, not tested)

Please fix the virtio_net driver crash described above.

Thank you for your time.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.

Re: [Qemu-devel] KVM call agenda for Tuesday 28th

2012-02-27 Thread Paolo Bonzini

On 27/02/2012 23:06, Anthony Liguori wrote:
> 
> Thanks!  One thing I'm having trouble following on your proposal: What
> commands are valid within
> blockdev-start-transaction/blockdev-commit-transaction?
> 
> If I do:
> 
> blockdev-start-transaction
> stop
> drive-reopen
> drive-mirror
> blockdev-end-transaction
> 
> What state should I expect that my guest is in (paused or running)?

Paused.  Only the two new commands and blockdev-snapshot-sync are part
of the transaction (edited the wiki now).

What I like most in Jeff's new command is that it's not even a question.
 On the other hand we have to be sure that we can extend it, and perhaps
change its name already in 1.1...

Paolo

Re: [PATCH v3] KVM: Allow host IRQ sharing for assigned PCI 2.3 devices

2012-02-27 Thread Jan Kiszka

On 2012-02-27 22:05, Alex Williamson wrote:
> On Fri, 2012-02-10 at 19:17 +0100, Jan Kiszka wrote:
>> PCI 2.3 allows to generically disable IRQ sources at device level. This
>> enables us to share legacy IRQs of such devices with other host devices
>> when passing them to a guest.
>>
>> The new IRQ sharing feature introduced here is optional, user space has
>> to request it explicitly. Moreover, user space can inform us about its
>> view of PCI_COMMAND_INTX_DISABLE so that we can avoid unmasking the
>> interrupt and signaling it if the guest masked it via the virtualized
>> PCI config space.
>>
>> Signed-off-by: Jan Kiszka 
>> ---
>>
>> Changes in v3:
>>  - rebased over current kvm.git (no code conflict, just api.txt)
>>
>>  Documentation/virtual/kvm/api.txt |   31 ++
>>  arch/x86/kvm/x86.c|1 +
>>  include/linux/kvm.h   |6 +
>>  include/linux/kvm_host.h  |2 +
>>  virt/kvm/assigned-dev.c   |  208 
>> +++-
>>  5 files changed, 219 insertions(+), 29 deletions(-)
>>
>> diff --git a/Documentation/virtual/kvm/api.txt 
>> b/Documentation/virtual/kvm/api.txt
>> index 59a3826..5ce0e29 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -1169,6 +1169,14 @@ following flags are specified:
>>  
>>  /* Depends on KVM_CAP_IOMMU */
>>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
>> +/* The following two depend on KVM_CAP_PCI_2_3 */
>> +#define KVM_DEV_ASSIGN_PCI_2_3  (1 << 1)
>> +#define KVM_DEV_ASSIGN_MASK_INTX(1 << 2)
>> +
>> +If KVM_DEV_ASSIGN_PCI_2_3 is set, the kernel will manage legacy INTx 
>> interrupts
>> +via the PCI-2.3-compliant device-level mask, thus enable IRQ sharing with 
>> other
>> +assigned devices or host devices. KVM_DEV_ASSIGN_MASK_INTX specifies the
>> +guest's view on the INTx mask, see KVM_ASSIGN_SET_INTX_MASK for details.
>>  
>>  The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure
>>  isolation of the device.  Usages not specifying this flag are deprecated.
>> @@ -1441,6 +1449,29 @@ The "num_dirty" field is a performance hint for KVM 
>> to determine whether it
>>  should skip processing the bitmap and just invalidate everything.  It must
>>  be set to the number of set bits in the bitmap.
>>  
>> +4.60 KVM_ASSIGN_SET_INTX_MASK
>> +
>> +Capability: KVM_CAP_PCI_2_3
>> +Architectures: x86
>> +Type: vm ioctl
>> +Parameters: struct kvm_assigned_pci_dev (in)
>> +Returns: 0 on success, -1 on error
>> +
>> +Informs the kernel about the guest's view on the INTx mask. As long as the
>> +guest masks the legacy INTx, the kernel will refrain from unmasking it at
>> +hardware level and will not assert the guest's IRQ line. User space is still
>> +responsible for applying this state to the assigned device's real config 
>> space
>> +by setting or clearing the Interrupt Disable bit 10 in the Command register.
>> +
>> +To avoid that the kernel overwrites the state user space wants to set,
>> +KVM_ASSIGN_SET_INTX_MASK has to be called prior to updating the config 
>> space.
>> +Moreover, user space has to write back its own view on the Interrupt Disable
>> +bit whenever modifying the Command word.
> 
> This is very confusing.  I think we need to work on the wording, but
> perhaps it's not worth hold up the patch.  It seems the simplest,

As I need another round anyway (see below), I'm open for better wording
suggestions.

> un-optimized version of writing to the command register from userspace
> is then:
> 
> ioctl(kvm_fd, KVM_ASSIGN_SET_INTX_MASK,
>  .flags = (command & PCI_COMMAND_INTX_DISABLE) ?
>  KVM_DEV_ASSIGN_MASK_INTX : 0);
> pwrite(config_fd, &command, 2, PCI_COMMAND);
> 
> From the v1 discussion, I take it that in the case where we're unmasking
> a pending interrupt, the ioctl will post the interrupt, leaving INTx
> disable set; the pwrite will clear INTx disable on the device, assuming
> irq is still pending, trigger the kvm irq handler, which will set INTx

s/set/clear? Yes.

> disable and repost the interrupt.  We assume that single spurious
> interrupts are ok 

Spurious for the host, but not visible for the guest at any time (unless
user space messes it up).

> and we also assume that it's the responsibility of
> userspace to present an emulated INTx disable value on read to avoid
> confusing guests.
> 
> More below...
> 
>> +
>> +See KVM_ASSIGN_DEV_IRQ for the data structure. The target device is 
>> specified
>> +by assigned_dev_id. In the flags field, only KVM_DEV_ASSIGN_MASK_INTX is
>> +evaluated.
>> +
>>  4.62 KVM_CREATE_SPAPR_TCE
>>  
>>  Capability: KVM_CAP_SPAPR_TCE
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 2bd77a3..1f11435 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -2099,6 +2099,7 @@ int kvm_dev_ioctl_check_extension(long ext)
>>  case KVM_CAP_XSAVE:
>>  case KVM_CAP_ASYNC_PF:
>>  case KVM_CAP_GET_TSC_KHZ:
>> +case KVM_CAP_PCI_2_3:
>>

Re: [Qemu-devel] KVM call agenda for Tuesday 28th

2012-02-27 Thread Anthony Liguori


On 02/27/2012 03:58 PM, Paolo Bonzini wrote:
> On 27/02/2012 18:21, Eric Blake wrote:
>>> Please send in any agenda items you are interested in covering.
>>
>> Given all the threads on snapshot/mirror/migrate/reopen in the blockdev
>> layer, that sounds like a worthwhile topic to discuss on a phone call.
>
> I put a description of the existing proposals here:
>
> http://wiki.qemu.org/Features/SnapshotsMultipleDevices/CommandSetProposals

Thanks!  One thing I'm having trouble following on your proposal: What commands 
are valid within blockdev-start-transaction/blockdev-commit-transaction?

If I do:

blockdev-start-transaction
stop
drive-reopen
drive-mirror
blockdev-end-transaction

What state should I expect that my guest is in (paused or running)?

Regards,

Anthony Liguori

>
> Paolo


Re: [Qemu-devel] KVM call agenda for Tuesday 28th

2012-02-27 Thread Paolo Bonzini

On 27/02/2012 18:21, Eric Blake wrote:
>> > Please send in any agenda items you are interested in covering.
> Given all the threads on snapshot/mirror/migrate/reopen in the blockdev
> layer, that sounds like a worthwhile topic to discuss on a phone call.

I put a description of the existing proposals here:

http://wiki.qemu.org/Features/SnapshotsMultipleDevices/CommandSetProposals

Paolo

Re: [PATCH v3] KVM: Allow host IRQ sharing for assigned PCI 2.3 devices

2012-02-27 Thread Alex Williamson

On Fri, 2012-02-10 at 19:17 +0100, Jan Kiszka wrote:
> PCI 2.3 allows to generically disable IRQ sources at device level. This
> enables us to share legacy IRQs of such devices with other host devices
> when passing them to a guest.
> 
> The new IRQ sharing feature introduced here is optional, user space has
> to request it explicitly. Moreover, user space can inform us about its
> view of PCI_COMMAND_INTX_DISABLE so that we can avoid unmasking the
> interrupt and signaling it if the guest masked it via the virtualized
> PCI config space.
> 
> Signed-off-by: Jan Kiszka 
> ---
> 
> Changes in v3:
>  - rebased over current kvm.git (no code conflict, just api.txt)
> 
>  Documentation/virtual/kvm/api.txt |   31 ++
>  arch/x86/kvm/x86.c|1 +
>  include/linux/kvm.h   |6 +
>  include/linux/kvm_host.h  |2 +
>  virt/kvm/assigned-dev.c   |  208 +++-
>  5 files changed, 219 insertions(+), 29 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/api.txt 
> b/Documentation/virtual/kvm/api.txt
> index 59a3826..5ce0e29 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -1169,6 +1169,14 @@ following flags are specified:
>  
>  /* Depends on KVM_CAP_IOMMU */
>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU  (1 << 0)
> +/* The following two depend on KVM_CAP_PCI_2_3 */
> +#define KVM_DEV_ASSIGN_PCI_2_3   (1 << 1)
> +#define KVM_DEV_ASSIGN_MASK_INTX (1 << 2)
> +
> +If KVM_DEV_ASSIGN_PCI_2_3 is set, the kernel will manage legacy INTx 
> interrupts
> +via the PCI-2.3-compliant device-level mask, thus enable IRQ sharing with 
> other
> +assigned devices or host devices. KVM_DEV_ASSIGN_MASK_INTX specifies the
> +guest's view on the INTx mask, see KVM_ASSIGN_SET_INTX_MASK for details.
>  
>  The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure
>  isolation of the device.  Usages not specifying this flag are deprecated.
> @@ -1441,6 +1449,29 @@ The "num_dirty" field is a performance hint for KVM to 
> determine whether it
>  should skip processing the bitmap and just invalidate everything.  It must
>  be set to the number of set bits in the bitmap.
>  
> +4.60 KVM_ASSIGN_SET_INTX_MASK
> +
> +Capability: KVM_CAP_PCI_2_3
> +Architectures: x86
> +Type: vm ioctl
> +Parameters: struct kvm_assigned_pci_dev (in)
> +Returns: 0 on success, -1 on error
> +
> +Informs the kernel about the guest's view on the INTx mask. As long as the
> +guest masks the legacy INTx, the kernel will refrain from unmasking it at
> +hardware level and will not assert the guest's IRQ line. User space is still
> +responsible for applying this state to the assigned device's real config 
> space
> +by setting or clearing the Interrupt Disable bit 10 in the Command register.
> +
> +To avoid that the kernel overwrites the state user space wants to set,
> +KVM_ASSIGN_SET_INTX_MASK has to be called prior to updating the config space.
> +Moreover, user space has to write back its own view on the Interrupt Disable
> +bit whenever modifying the Command word.

This is very confusing.  I think we need to work on the wording, but
perhaps it's not worth holding up the patch.  It seems the simplest,
un-optimized version of writing to the command register from userspace
is then:

ioctl(kvm_fd, KVM_ASSIGN_SET_INTX_MASK,
 .flags = (command & PCI_COMMAND_INTX_DISABLE) ?
 KVM_DEV_ASSIGN_MASK_INTX : 0);
pwrite(config_fd, &command, 2, PCI_COMMAND);

From the v1 discussion, I take it that in the case where we're unmasking
a pending interrupt, the ioctl will post the interrupt, leaving INTx
disable set; the pwrite will clear INTx disable on the device, assuming
irq is still pending, trigger the kvm irq handler, which will set INTx
disable and repost the interrupt.  We assume that single spurious
interrupts are ok and we also assume that it's the responsibility of
userspace to present an emulated INTx disable value on read to avoid
confusing guests.
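
A minimal sketch of that sequence in C, under stated assumptions: the
KVM_DEV_ASSIGN_MASK_INTX value is defined locally for illustration only
(the real one comes from the patched include/linux/kvm.h), and
intx_mask_flags() and emulated_command() are hypothetical helper names,
not part of the patch.  The ioctl and pwrite calls themselves stay in a
comment, since they need a real VM and config-space file descriptor.

```c
#include <stdint.h>

/* Bit 10 of the PCI Command register (Interrupt Disable). */
#define PCI_COMMAND_INTX_DISABLE 0x400
/* Assumed value, for illustration; use the definition from linux/kvm.h. */
#define KVM_DEV_ASSIGN_MASK_INTX (1u << 2)

/*
 * Order of operations on a guest write to the Command register, as
 * described in the documentation text quoted above:
 *   1. KVM_ASSIGN_SET_INTX_MASK with flags = intx_mask_flags(command)
 *   2. pwrite() of the Command word to the device's config space
 */

/* Flags for KVM_ASSIGN_SET_INTX_MASK, derived from the guest's Command word. */
static uint32_t intx_mask_flags(uint16_t command)
{
	return (command & PCI_COMMAND_INTX_DISABLE) ? KVM_DEV_ASSIGN_MASK_INTX : 0;
}

/*
 * Value to present on a guest read of the Command register: the device's
 * real bits, except INTx Disable, which reflects the guest's own view.
 */
static uint16_t emulated_command(uint16_t hw_command, uint16_t guest_command)
{
	return (uint16_t)((hw_command & ~PCI_COMMAND_INTX_DISABLE) |
			  (guest_command & PCI_COMMAND_INTX_DISABLE));
}
```

The second helper covers the last assumption above: the kernel may leave
INTx disable set in hardware, so userspace substitutes the guest's value
for that one bit on reads.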

More below...

> +
> +See KVM_ASSIGN_DEV_IRQ for the data structure. The target device is specified
> +by assigned_dev_id. In the flags field, only KVM_DEV_ASSIGN_MASK_INTX is
> +evaluated.
> +
>  4.62 KVM_CREATE_SPAPR_TCE
>  
>  Capability: KVM_CAP_SPAPR_TCE
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 2bd77a3..1f11435 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2099,6 +2099,7 @@ int kvm_dev_ioctl_check_extension(long ext)
>   case KVM_CAP_XSAVE:
>   case KVM_CAP_ASYNC_PF:
>   case KVM_CAP_GET_TSC_KHZ:
> + case KVM_CAP_PCI_2_3:
>   r = 1;
>   break;
>   case KVM_CAP_COALESCED_MMIO:
> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
> index acbe429..6c322a9 100644
> --- a/include/linux/kvm.h
> +++ b/include/linux/kvm.h
> @@ -588,6 +588,7 @@ struct kvm_ppc_pvinfo {
>  #define KVM_CAP_TSC_DEADLINE_TIMER 72
>  #define KVM_CAP_S390_UCONTROL 73
>  #define KVM_CAP_SYNC_REGS 74
> +#define KVM_CA

Re: [PATCH-WIP 01/13] xen/arm: use r12 to pass the hypercall number to the hypervisor

2012-02-27 Thread Peter Maydell

On 27 February 2012 16:27, Ian Campbell  wrote:
> R12 is not accessible from the 16 bit "T1" Thumb encoding of mov
> immediate (which can only target r0..r7).
>
> Since we support only ARMv7+ there are "T2" and "T3" encodings available
> which do allow direct mov of an immediate into R12, but are 32 bit Thumb
> instructions.
>
> Should we use r7 instead to maximise instruction density for Thumb code?

r7 is (used by gcc as) the Thumb frame pointer; I don't know if this
makes it worth avoiding in this context.

-- PMM

Re: [PATCH v6 1/4] KVM: PPC: epapr: Factor out the epapr init

2012-02-27 Thread Scott Wood

On 02/23/2012 03:22 AM, Liu Yu wrote:
> +static int __init epapr_paravirt_init(void)
> +{
> + struct device_node *hyper_node;
> + const u32 *insts;
> + int len, i;
> +
> + hyper_node = of_find_node_by_path("/hypervisor");
> + if (!hyper_node)
> + return -ENODEV;
> +
> + insts = of_get_property(hyper_node, "hcall-instructions", &len);
> + if (!insts)
> + return 0;

-ENODEV here too.

> + if (!(len % 4) && len <= (4 * 4)) {
> + for (i = 0; i < (len / 4); i++)
> + patch_instruction(epapr_hypercall_start + i, insts[i]);
> +
> + epapr_paravirt_enabled = true;
> + } else {
> + printk(KERN_WARNING
> +"ePAPR paravirt: hcall-instructions format error\n");
> + }

Do this:

if (error) {
print error
return error code
}

continue with function

Not this:

if (!error) {
continue with function
} else {
report the error from several lines back
}
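
Applied to the length check in the hunk above, the preferred shape could
look roughly like the sketch below.  It is a user-space model, not kernel
code: patch_hcall_sketch() is a hypothetical name, the copy into dest[]
stands in for patch_instruction(), fprintf() stands in for printk(), and
the choice of -EINVAL for the format error is an assumption.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical model of the init path: validate first, then do the work. */
static int patch_hcall_sketch(const uint32_t *insts, int len, uint32_t *dest)
{
	int i;

	if (!insts)
		return -ENODEV;

	/* Handle the error right where it is detected, then leave. */
	if (len % 4 || len > 4 * 4) {
		fprintf(stderr, "ePAPR paravirt: hcall-instructions format error\n");
		return -EINVAL;	/* assumed error code */
	}

	/* The normal path continues at the function's base indentation. */
	for (i = 0; i < len / 4; i++)
		dest[i] = insts[i];

	return 0;
}
```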

> @@ -33,6 +34,14 @@ config KVM_GUEST
>  
> In case of doubt, say Y
>  
> +config EPAPR_PARAVIRT
> + bool "ePAPR para-virtualization support"
> + default n
> + help
> +   Used to enalbe ePAPR complied para-virtualization support for guest.
> +
> +   In case of doubt, say Y

s/Used to enalbe/Enable/

s/complied/compliant/ (or just s/complied//)

-Scott


Re: [PATCH v6 4/4] KVM: PPC: epapr: Update other hypercall invoking

2012-02-27 Thread Scott Wood

On 02/23/2012 03:22 AM, Liu Yu wrote:
> diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
> index 2dcdbc9..99ebdde 100644
> --- a/drivers/virt/Kconfig
> +++ b/drivers/virt/Kconfig
> @@ -15,6 +15,7 @@ if VIRT_DRIVERS
>  config FSL_HV_MANAGER
>   tristate "Freescale hypervisor management driver"
>   depends on FSL_SOC
> + select EPAPR_PARAVIRT
>   help
>The Freescale hypervisor management driver provides several 
> services
> to drivers and applications related to the Freescale hypervisor:

What about the byte channel driver, and possibly others?

Grep for the hypercalls to make sure you got everything that uses this.

-Scott


Re: [PATCH-WIP 01/13] xen/arm: use r12 to pass the hypercall number to the hypervisor

2012-02-27 Thread Ian Campbell

On Mon, 2012-02-27 at 17:53 +, Dave Martin wrote:
> On Thu, Feb 23, 2012 at 05:48:22PM +, Stefano Stabellini wrote:
> > We need a register to pass the hypercall number because we might not
> > know it at compile time and HVC only takes an immediate argument.
> > 
> > Among the available registers r12 seems to be the best choice because it
> > is defined as "intra-procedure call scratch register".
> 
> This would be massively simplified if you didn't try to inline the HVC.
> Does it really need to be inline?
>
> > +#define __HYPERCALL ".word 0xe1400070 + " __HVC_IMM(XEN_HYPERCALL_TAG)
> 
> Please, do not do this.  It won't work in Thumb, where the encodings are
> different.
> 
> It is reasonable to expect anyone building Xen to have reasonably new
> tools, so you can justifiably use
> 
> AFLAGS_thisfile.o := -Wa,-march=armv7-a+virt
> 
> in the Makefile and just use the hvc instruction directly.

Our aim is for guest kernel binaries not to be specific to Xen -- i.e.
they should be able to run on baremetal and other hypervisors as well.
The differences should only be in the device-tree passed to the kernel.

> Of course, this is only practical if the HVC invocation is not inlined.

I suppose we could make the stub functions out of line, we just copied
what Xen does on x86.

The only thing which springs to mind is that 5 argument hypercalls will
end up pushing the fifth argument to the stack only to pop it back into
r4 for the hypercall and IIRC it also needs to preserve r4 (callee saved
reg) which is going to involve some small amount of code to move stuff
around too.

So by inlining the functions we avoid some thunking because the compiler
would know exactly what was happening at the hypercall site.

We don't currently have any 6 argument hypercalls but the same would
extend there.

> If we can't avoid macro-ising HVC, we should do it globally, not locally
> to the Xen code.  That way we at least keep all the horror in one place.

That sounds like a good idea to me.

Given that Stefano is proposing to make the ISS a (per-hypervisor)
constant we could consider just defining the Thumb and non-Thumb
constants instead of doing all the construction with the __HVC_IMM stuff
-- that would remove a big bit of the macroization.

Ian.


Re: [PATCH 36/37] KVM: PPC: booke: expose guest registers on irq reinject

2012-02-27 Thread Scott Wood

On 02/26/2012 05:59 AM, Alexander Graf wrote:
> 
> On 25.02.2012, at 00:40, Scott Wood wrote:
> 
>> On 02/24/2012 08:26 AM, Alexander Graf wrote:
>>> +static void kvmppc_fill_pt_regs(struct kvm_vcpu *vcpu, struct pt_regs *regs)
>>> {
>>> -   int r = RESUME_HOST;
>>> +   int i;
>>>
>>> -   /* update before a new last_exit_type is rewritten */
>>> -   kvmppc_update_timing_stats(vcpu);
>>> +   for (i = 0; i < 32; i++)
>>> +   regs->gpr[i] = kvmppc_get_gpr(vcpu, i);
>>> +   regs->nip = vcpu->arch.pc;
>>> +   regs->msr = vcpu->arch.shared->msr;
>>> +   regs->ctr = vcpu->arch.ctr;
>>> +   regs->link = vcpu->arch.lr;
>>> +   regs->xer = kvmppc_get_xer(vcpu);
>>> +   regs->ccr = kvmppc_get_cr(vcpu);
>>> +   regs->dar = get_guest_dear(vcpu);
>>> +   regs->dsisr = get_guest_esr(vcpu);
>>> +}
>>
>> How much overhead does this add to every interrupt?  Can't we keep this
>> to the minimum that perf cares about?
> 
> I would rather not make assumptions on what perf cares about - maybe we want 
> to one day implement "perf kvm" and then perf could rely on pretty much 
> anything in there.

In that case I think we should be populating a real pt_regs from the
start, as in my original patchset.

I only agreed to take it out because I thought the set of things we'd
copy would be minimal.  This seems like a lot of overhead.

I'm not familiar with "perf kvm", but if it's kvm-specific surely the
KVM code should know/dictate what it can rely on?  Or maybe there can be
a debug option that enables full pt_regs (similar to exit timing)?

Could we just set regs to NULL when the debug option isn't enabled?

>>> +static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
>>> +unsigned int exit_nr)
>>> +{
>>> +   struct pt_regs regs = *current->thread.regs;
>>>
>>> +   kvmppc_fill_pt_regs(vcpu, &regs);
>>
>> Why are you copying out of current->thread.regs?  That's old junk data,
>> set by some previous exception and possibly overwritten since.
> 
> Because it gives us good default values for anything we don't set. Do you 
> have other recommendations?

It does not give good default values for anything.  It is junk,
unallocated memory, overwritten by who knows what.  Same as the memory
you're copying to.

To avoid garbage in fields we don't set, fill it with zeroes first.
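
A minimal user-space sketch of that suggestion, with assumptions labeled:
struct regs_sketch is a hypothetical stand-in for struct pt_regs showing
only a few fields, and fill_regs_sketch() is not a function from the
patch.

```c
#include <string.h>

/* Hypothetical stand-in for the kernel's struct pt_regs. */
struct regs_sketch {
	unsigned long gpr[32];
	unsigned long nip;
	unsigned long msr;
	unsigned long trap;	/* example of a field the fill code never sets */
};

/* Zero the whole structure first, then set the fields we have values for. */
static void fill_regs_sketch(struct regs_sketch *regs,
			     unsigned long guest_pc, unsigned long guest_msr)
{
	memset(regs, 0, sizeof(*regs));
	regs->nip = guest_pc;
	regs->msr = guest_msr;
}
```

Anything not explicitly assigned then reads as zero rather than as
whatever happened to be in that memory.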

-Scott


Re: [PATCH-WIP 01/13] xen/arm: use r12 to pass the hypercall number to the hypervisor

2012-02-27 Thread Ian Campbell

On Mon, 2012-02-27 at 18:03 +, Dave Martin wrote:
> On Mon, Feb 27, 2012 at 04:27:23PM +, Ian Campbell wrote:
> > On Thu, 2012-02-23 at 17:48 +, Stefano Stabellini wrote:
> > > We need a register to pass the hypercall number because we might not
> > > know it at compile time and HVC only takes an immediate argument.
> > > 
> > > Among the available registers r12 seems to be the best choice because it
> > > is defined as "intra-procedure call scratch register".
> > 
> > R12 is not accessible from the 16 bit "T1" Thumb encoding of mov
> > immediate (which can only target r0..r7).
> 
> This is untrue.  The important instructions, like MOV Rd, Rn can access
> all the regs.  But anyway, there is no such thing as a Thumb-1 kernel,
> so we won't really care.

I did say "mov immediate", which is the one which matters when loading a
constant hypercall number (the common case). AFAIK the "mov Rd, #imm" T1
encoding cannot access all registers.

The "mov rd,rn" form only helps for syscall(2) like functions, which are
unusual, at least for Xen, although as Stefano says, they do exist.

> > Since we support only ARMv7+ there are "T2" and "T3" encodings available
> > which do allow direct mov of an immediate into R12, but are 32 bit Thumb
> > instructions.
> > 
> > Should we use r7 instead to maximise instruction density for Thumb code?
> 
> The difference seems trivial when put into context, even if you code a
> special Thumb version of the code to maximise density (the Thumb-2 code
> which gets built from assembler in the kernel is very suboptimal in
> size, but there simply isn't a high proportion of asm code in the kernel
> anyway.)  I wouldn't consider the ARM/Thumb differences as an important
> factor when deciding on a register.

OK, that's useful information. thanks.

> One argument for _not_ using r12 for this purpose is that it is then
> harder to put a generic "HVC" function (analogous to the "syscall"
> syscall) out-of-line, since r12 could get destroyed by the call.

For an out of line syscall(2) wouldn't the syscall number either be in a
standard C calling convention argument register or on the stack when the
function was called, since it is just a normal argument at that point?
As you point out it cannot be passed in r12 (and could never be, due to
the clobbering).

The syscall function itself would have to move the arguments and syscall
nr etc around before issuing the syscall.

I think the same is true of a similar hypercall(2)

> If you don't think you will ever care about putting HVC out of line
> though, it may not matter.

Ian.


Re: [PATCH 24/37] KVM: PPC: booke: rework rescheduling checks

2012-02-27 Thread Scott Wood

On 02/24/2012 08:26 AM, Alexander Graf wrote:
> -void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
> +int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>  {
>   unsigned long *pending = &vcpu->arch.pending_exceptions;
>   unsigned long old_pending = vcpu->arch.pending_exceptions;
> @@ -283,6 +283,8 @@ void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>  
>   /* Tell the guest about our interrupt status */
>   kvmppc_update_int_pending(vcpu, *pending, old_pending);
> +
> + return 0;
>  }
>  
>  pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn)
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 9979be1..3fcec2c 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -439,8 +439,9 @@ static void kvmppc_core_check_exceptions(struct kvm_vcpu *vcpu)
>  }
>  
>  /* Check pending exceptions and deliver one, if possible. */
> -void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
> +int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>  {
> + int r = 0;
>   WARN_ON_ONCE(!irqs_disabled());
>  
>   kvmppc_core_check_exceptions(vcpu);
> @@ -451,8 +452,44 @@ void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>   local_irq_disable();
>  
>   kvmppc_set_exit_type(vcpu, EMULATED_MTMSRWE_EXITS);
> - kvmppc_core_check_exceptions(vcpu);
> + r = 1;
>   };
> +
> + return r;
> +}
> +
> +/*
> + * Common checks before entering the guest world.  Call with interrupts
> + * disabled.
> + *
> + * returns !0 if a signal is pending and check_signal is true
> + */
> +static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool check_signal)
> +{
> + int r = 0;
> +
> + WARN_ON_ONCE(!irqs_disabled());
> + while (true) {
> + if (need_resched()) {
> + local_irq_enable();
> + cond_resched();
> + local_irq_disable();
> + continue;
> + }
> +
> + if (kvmppc_core_prepare_to_enter(vcpu)) {
> + /* interrupts got enabled in between, so we
> +are back at square 1 */
> + continue;
> + }
> +
> +
> + if (check_signal && signal_pending(current))
> + r = 1;

If there is a signal pending and MSR[WE] is set, we'll loop forever
without reaching this check.

-Scott


Re: [PATCH 1/2] powerpc/e500: make load_up_spe a normal fuction

2012-02-27 Thread Scott Wood

On 02/27/2012 04:59 AM, Olivia Yin wrote:
> So that we can call it in kernel.
> 
> Signed-off-by: Liu Yu 

Explain why we want this, and point out that this makes it similar to
load_up_fpu.

> ---
>  arch/powerpc/kernel/head_fsl_booke.S |   23 ++-
>  1 files changed, 6 insertions(+), 17 deletions(-)

When posting a patch authored by someone else, more or less unchanged,
you should put a From: line in the body of the e-mail.

git send-email will do this automatically if you preserve the authorship
in the git commit.

Also, you should add your own Signed-off-by.

-Scott


[PATCH 3/4] [KVM-autotest] tests.cfg.sample: change import order

2012-02-27 Thread Lukas Doktor

Currently subtests.cfg is processed first, followed by all other configs.
My test needs to override the smp parameter in some variants, which is
currently impossible.

In words, the current order means: we define the subtest variants, then
we specify the base, guest and other details. In the end we limit what
we want to execute.

My proposed order enables forcing base/guest params in subtest variants.

In words, this means we specify the base, guest system, cdkeys, etc., and
in the end we define the subtests with their various variants. Then we
limit what we actually want to execute, but now a subtest can force
various base/guest settings.

Signed-off-by: Lukas Doktor 
---
 client/tests/kvm/tests-shared.cfg.sample |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/client/tests/kvm/tests-shared.cfg.sample b/client/tests/kvm/tests-shared.cfg.sample
index c6304b3..bda982d 100644
--- a/client/tests/kvm/tests-shared.cfg.sample
+++ b/client/tests/kvm/tests-shared.cfg.sample
@@ -5,11 +5,11 @@
 
 # Include the base config files.
 include base.cfg
-include subtests.cfg
 include guest-os.cfg
 include guest-hw.cfg
 include cdkeys.cfg
 include virtio-win.cfg
+include subtests.cfg
 
 # Virtualization type (kvm or libvirt)
 vm_type = kvm
-- 
1.7.7.6


[PATCH 2/4] [KVM-autotest] virt.virt_vm: Add option to create raw images with dd

2012-02-27 Thread Lukas Doktor

Adds an option to create raw images with dd (a non-sparse file, unlike
the one qemu-img creates).

Signed-off-by: Lukas Doktor 
---
 client/virt/virt_vm.py |   38 ++
 1 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/client/virt/virt_vm.py b/client/virt/virt_vm.py
index 6cdb91e..06db7a6 100644
--- a/client/virt/virt_vm.py
+++ b/client/virt/virt_vm.py
@@ -336,23 +336,37 @@ def create_image(params, root_dir):
image_cluster_size (optional) -- the cluster size for the image
image_size -- the requested size of the image (a string
qemu-img can understand, such as '10G')
+   create_with_dd -- use dd to create the image (raw format only)
 """
-qemu_img_cmd = virt_utils.get_path(root_dir, params.get("qemu_img_binary",
-   "qemu-img"))
-qemu_img_cmd += " create"
-
 format = params.get("image_format", "qcow2")
-qemu_img_cmd += " -f %s" % format
+image_filename = get_image_filename(params, root_dir)
+size = params.get("image_size", "10G")
+if params.get("create_with_dd") == "yes" and format == "raw":
+# maps K,M,G,T => (count, bs)
+human = {'K': (1, 1),
+ 'M': (1, 1024),
+ 'G': (1024, 1024),
+ 'T': (1024, 1048576),
+}
+if human.has_key(size[-1]):
+block_size = human[size[-1]][1]
+size = int(size[:-1]) * human[size[-1]][0]
+qemu_img_cmd = ("dd if=/dev/zero of=%s count=%s bs=%sK"
+% (image_filename, size, block_size))
+else:
+qemu_img_cmd = virt_utils.get_path(root_dir,
+params.get("qemu_img_binary", "qemu-img"))
+qemu_img_cmd += " create"
 
-image_cluster_size = params.get("image_cluster_size", None)
-if image_cluster_size is not None:
-qemu_img_cmd += " -o cluster_size=%s" % image_cluster_size
+qemu_img_cmd += " -f %s" % format
 
-image_filename = get_image_filename(params, root_dir)
-qemu_img_cmd += " %s" % image_filename
+image_cluster_size = params.get("image_cluster_size", None)
+if image_cluster_size is not None:
+qemu_img_cmd += " -o cluster_size=%s" % image_cluster_size
 
-size = params.get("image_size", "10G")
-qemu_img_cmd += " %s" % size
+qemu_img_cmd += " %s" % image_filename
+
+qemu_img_cmd += " %s" % size
 
 utils.system(qemu_img_cmd)
 return image_filename
-- 
1.7.7.6
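
The suffix table in the patch is easy to sanity-check, since count times
bs must equal the requested size.  The sketch below redoes that arithmetic
in C; dd_geometry() is a hypothetical name, and rejecting sizes without a
K/M/G/T suffix is an assumption of the sketch, not something the patch
specifies.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper mirroring the suffix table in the patch: maps a size
 * such as "10G" to dd's count= value and its bs= value in KiB.
 * Returns 0 on success, -1 if the suffix is not one of K, M, G, T.
 */
static int dd_geometry(const char *size, unsigned long long *count,
		       unsigned long *bs_kib)
{
	size_t n = strlen(size);
	unsigned long long mult;

	if (n < 2)
		return -1;

	switch (size[n - 1]) {
	case 'K': mult = 1;    *bs_kib = 1;       break;
	case 'M': mult = 1;    *bs_kib = 1024;    break;
	case 'G': mult = 1024; *bs_kib = 1024;    break;
	case 'T': mult = 1024; *bs_kib = 1048576; break;
	default:
		return -1;	/* assumed: no suffix means no dd geometry */
	}

	/* strtoull() stops parsing at the suffix character. */
	*count = strtoull(size, NULL, 10) * mult;
	return 0;
}
```

For "10G" this gives count=10240 and bs=1024K, which multiplies out to
10 GiB.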


[PATCH 1/4] [KVM-autotest] virt.kvm_vm: Make snapshot and boot params optional

2012-02-27 Thread Lukas Doktor

Currently the boot and snapshot parameters are either 'yes' or not present.
This patch allows specifying 'yes', 'no', or leaving them unset. The 'no'
option is necessary, e.g., when -snapshot is present and we want to override
it with 'snapshot=off' on one device.

Signed-off-by: Lukas Doktor 
---
 client/virt/kvm_vm.py |   17 +
 1 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/client/virt/kvm_vm.py b/client/virt/kvm_vm.py
index cc181d7..fcbdde4 100644
--- a/client/virt/kvm_vm.py
+++ b/client/virt/kvm_vm.py
@@ -201,11 +201,8 @@ class VM(virt_vm.BaseVM):
 Add option to qemu parameters.
 """
 fmt=",%s=%s"
-if value and isinstance(value, bool):
-if value:
-return fmt % (option, "on")
-else:
-return fmt % (option, "off")
+if isinstance(value, bool):
+return fmt % (option, "on" if value else "off")
 elif value and isinstance(value, str):
 # "EMPTY_STRING" and "NULL_STRING" is used for testing illegal
 # foramt of option.
@@ -301,7 +298,7 @@ class VM(virt_vm.BaseVM):
 return " -cdrom '%s'" % filename
 
 def add_drive(help, filename, index=None, format=None, cache=None,
-  werror=None, rerror=None, serial=None, snapshot=False,
+  werror=None, rerror=None, serial=None, snapshot=None,
   boot=False, blkdebug=None, bus=None, port=None,
   bootindex=None, removable=None, min_io_size=None,
   opt_io_size=None, physical_block_size=None,
@@ -648,8 +645,12 @@ class VM(virt_vm.BaseVM):
 image_params.get("drive_werror"),
 image_params.get("drive_rerror"),
 image_params.get("drive_serial"),
-image_params.get("image_snapshot") == "yes",
-image_params.get("image_boot") == "yes",
+True if image_params.get("image_snapshot") == "yes" else (
+False if image_params.get("image_snapshot") == "no" else
+None),
+True if image_params.get("image_boot") == "yes" else (
+False if image_params.get("image_boot") == "no" else
+None),
 virt_vm.get_image_blkdebug_filename(image_params,
 self.virt_dir),
 bus,
-- 
1.7.7.6


RE: [PATCH 24/37] KVM: PPC: booke: rework rescheduling checks

2012-02-27 Thread Bhushan Bharat-R65777



> -Original Message-
> From: Alexander Graf [mailto:ag...@suse.de]
> Sent: Monday, February 27, 2012 11:53 PM
> To: Bhushan Bharat-R65777
> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; 
> linuxppc-...@lists.ozlabs.org;
> Wood Scott-B07421
> Subject: Re: [PATCH 24/37] KVM: PPC: booke: rework rescheduling checks
> 
> On 02/27/2012 06:33 PM, Alexander Graf wrote:
> > On 02/27/2012 05:34 PM, Bhushan Bharat-R65777 wrote:
> >>
> >>> +}
> >>> +
> >>> +/*
> >>> + * Common checks before entering the guest world.  Call with
> >>> interrupts
> >>> + * disabled.
> >>> + *
> >>> + * returns !0 if a signal is pending and check_signal is true  */
> >>> +static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool
> >>> +check_signal) {
> >>> +int r = 0;
> >>> +
> >>> +WARN_ON_ONCE(!irqs_disabled());
> >>> +while (true) {
> >>> +if (need_resched()) {
> >>> +local_irq_enable();
> >>> +cond_resched();
> >>> +local_irq_disable();
> >>> +continue;
> >>> +}
> >>> +
> >>> +if (kvmppc_core_prepare_to_enter(vcpu)) {
> >> kvmppc_prepare_to_enter() is called even on heavyweight_exit. Should
> >> not this be called only on lightweight_exit?
> >
> > Yeah, we don't need to call it when exiting anyways. That's a
> > functional change though, which this patch is trying not to introduce.
> > So we should rather do that as a patch on top.
> 
> So how about this (warning! broken whitespace)?
> 
> 
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index
> 7a16b56..616aa2d 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -464,7 +464,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>*
>* returns !0 if a signal is pending and check_signal is true
>*/
> -static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool
> check_signal)
> +static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
>   {
>  int r = 0;
> 
> @@ -483,7 +483,7 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu,
> bool check_signal)
>  continue;
>  }
> 
> -   if (check_signal && signal_pending(current))
> +   if (signal_pending(current))
>  r = 1;
> 
>  break;
> @@ -507,7 +507,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
> kvm_vcpu
> *vcpu)
>  }
> 
>  local_irq_disable();
> -   if (kvmppc_prepare_to_enter(vcpu, true)) {
> +   if (kvmppc_prepare_to_enter(vcpu)) {
>  kvm_run->exit_reason = KVM_EXIT_INTR;
>  ret = -EINTR;
>  goto out;
> @@ -941,13 +941,16 @@ int kvmppc_handle_exit(struct kvm_run *run, struct
> kvm_vcpu *vcpu,
>   * To avoid clobbering exit_reason, only check for signals if we
>   * aren't already exiting to userspace for some other reason.
>   */
> -   local_irq_disable();
> -   if (kvmppc_prepare_to_enter(vcpu, !(r & RESUME_HOST))) {
> -   run->exit_reason = KVM_EXIT_INTR;
> -   r = (-EINTR << 2) | RESUME_HOST | (r & RESUME_FLAG_NV);
> -   kvmppc_account_exit(vcpu, SIGNAL_EXITS);
> +   if (!(r & RESUME_HOST)) {
> +   local_irq_disable();
> +   if (kvmppc_prepare_to_enter(vcpu)) {
> +   run->exit_reason = KVM_EXIT_INTR;
> +   r = (-EINTR << 2) | RESUME_HOST | (r &
> RESUME_FLAG_NV);
> +   kvmppc_account_exit(vcpu, SIGNAL_EXITS);
> +   }
>  }
> 
> +out:

Why?
Otherwise looks ok to me.

Thanks
-Bharat

>  return r;
>   }
> 
> 



Re: [PATCH 24/37] KVM: PPC: booke: rework rescheduling checks

2012-02-27 Thread Alexander Graf


On 02/27/2012 06:33 PM, Alexander Graf wrote:
> On 02/27/2012 05:34 PM, Bhushan Bharat-R65777 wrote:
>>
>>> +}
>>> +
>>> +/*
>>> + * Common checks before entering the guest world.  Call with interrupts
>>> + * disabled.
>>> + *
>>> + * returns !0 if a signal is pending and check_signal is true  */
>>> +static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool
>>> +check_signal) {
>>> +int r = 0;
>>> +
>>> +WARN_ON_ONCE(!irqs_disabled());
>>> +while (true) {
>>> +if (need_resched()) {
>>> +local_irq_enable();
>>> +cond_resched();
>>> +local_irq_disable();
>>> +continue;
>>> +}
>>> +
>>> +if (kvmppc_core_prepare_to_enter(vcpu)) {
>> kvmppc_prepare_to_enter() is called even on heavyweight_exit. Should
>> not this be called only on lightweight_exit?
>
> Yeah, we don't need to call it when exiting anyways. That's a
> functional change though, which this patch is trying not to introduce.
> So we should rather do that as a patch on top.

So how about this (warning! broken whitespace)?


diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 7a16b56..616aa2d 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -464,7 +464,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
  *
  * returns !0 if a signal is pending and check_signal is true
  */
-static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool check_signal)
+static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
 {
int r = 0;

@@ -483,7 +483,7 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool check_signal)
continue;
}

-   if (check_signal && signal_pending(current))
+   if (signal_pending(current))
r = 1;

break;
@@ -507,7 +507,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
}

local_irq_disable();
-   if (kvmppc_prepare_to_enter(vcpu, true)) {
+   if (kvmppc_prepare_to_enter(vcpu)) {
kvm_run->exit_reason = KVM_EXIT_INTR;
ret = -EINTR;
goto out;
@@ -941,13 +941,16 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 * To avoid clobbering exit_reason, only check for signals if we
 * aren't already exiting to userspace for some other reason.
 */
-   local_irq_disable();
-   if (kvmppc_prepare_to_enter(vcpu, !(r & RESUME_HOST))) {
-   run->exit_reason = KVM_EXIT_INTR;
-   r = (-EINTR << 2) | RESUME_HOST | (r & RESUME_FLAG_NV);
-   kvmppc_account_exit(vcpu, SIGNAL_EXITS);
+   if (!(r & RESUME_HOST)) {
+   local_irq_disable();
+   if (kvmppc_prepare_to_enter(vcpu)) {
+   run->exit_reason = KVM_EXIT_INTR;
+   r = (-EINTR << 2) | RESUME_HOST | (r & RESUME_FLAG_NV);
+   kvmppc_account_exit(vcpu, SIGNAL_EXITS);
+   }
}

+out:
return r;
 }



Re: [PATCH-WIP 01/13] xen/arm: use r12 to pass the hypercall number to the hypervisor

2012-02-27 Thread Dave Martin

On Mon, Feb 27, 2012 at 04:27:23PM +, Ian Campbell wrote:
> On Thu, 2012-02-23 at 17:48 +, Stefano Stabellini wrote:
> > We need a register to pass the hypercall number because we might not
> > know it at compile time and HVC only takes an immediate argument.
> > 
> > Among the available registers r12 seems to be the best choice because it
> > is defined as "intra-procedure call scratch register".
> 
> R12 is not accessible from the 16 bit "T1" Thumb encoding of mov
> immediate (which can only target r0..r7).

This is untrue.  The important instructions, like MOV Rd, Rn can access
all the regs.  But anyway, there is no such thing as a Thumb-1 kernel,
so we won't really care.

> Since we support only ARMv7+ there are "T2" and "T3" encodings available
> which do allow direct mov of an immediate into R12, but are 32 bit Thumb
> instructions.
> 
> Should we use r7 instead to maximise instruction density for Thumb code?

The difference seems trivial when put into context, even if you code a
special Thumb version of the code to maximise density (the Thumb-2 code
which gets built from assembler in the kernel is very suboptimal in
size, but there simply isn't a high proportion of asm code in the kernel
anyway.)  I wouldn't consider the ARM/Thumb differences as an important
factor when deciding on a register.

One argument for _not_ using r12 for this purpose is that it is then
harder to put a generic "HVC" function (analogous to the "syscall"
syscall) out-of-line, since r12 could get destroyed by the call.  

If you don't think you will ever care about putting HVC out of line
though, it may not matter.

Cheers
---Dave

Re: [PATCH-WIP 01/13] xen/arm: use r12 to pass the hypercall number to the hypervisor

2012-02-27 Thread Dave Martin

On Thu, Feb 23, 2012 at 05:48:22PM +, Stefano Stabellini wrote:
> We need a register to pass the hypercall number because we might not
> know it at compile time and HVC only takes an immediate argument.
> 
> Among the available registers r12 seems to be the best choice because it
> is defined as "intra-procedure call scratch register".

This would be massively simplified if you didn't try to inline the HVC.
Does it really need to be inline?

> 
> Use the ISS to pass an hypervisor specific tag.
> 
> Signed-off-by: Stefano Stabellini 
> CC: kvm@vger.kernel.org
> ---
>  arch/arm/include/asm/xen/hypercall.h |   87 
> +++---
>  1 files changed, 48 insertions(+), 39 deletions(-)
> 
> diff --git a/arch/arm/include/asm/xen/hypercall.h 
> b/arch/arm/include/asm/xen/hypercall.h
> index 404e63f0..04eba1c 100644
> --- a/arch/arm/include/asm/xen/hypercall.h
> +++ b/arch/arm/include/asm/xen/hypercall.h
> @@ -33,13 +33,17 @@
>  #ifndef _ASM_ARM_XEN_HYPERCALL_H
>  #define _ASM_ARM_XEN_HYPERCALL_H
>  
> -#define __HVC_IMM(name)  "( " #name " & 0xf) + "   \
> - "((" #name " << 4) & 0xfff00)"
> +#include 
> +#include 
>  
> -#define HYPERCALL(name) ".word 0xe1400070 + " __HVC_IMM(name)
> -#define __HYPERCALL(name) HYPERCALL(__HYPERVISOR_##name)
> +#define XEN_HYPERCALL_TAG  "0XEA1"
> +
> +#define __HVC_IMM(tag)   "( " tag " & 0xf) + " \
> + "((" tag " << 4) & 0xfff00)"
> +#define __HYPERCALL ".word 0xe1400070 + " __HVC_IMM(XEN_HYPERCALL_TAG)

Please, do not do this.  It won't work in Thumb, where the encodings are
different.

It is reasonable to expect anyone building Xen to have reasonably new
tools, so you can justifiably use

AFLAGS_thisfile.o := -Wa,-march=armv7-a+virt

in the Makefile and just use the hvc instruction directly.


Of course, this is only practical if the HVC invocation is not inlined.
If we can't avoid macro-ising HVC, we should do it globally, not locally
to the Xen code.  That way we at least keep all the horror in one place.

Cheers
---Dave

>  
>  #define __HYPERCALL_RETREG   "r0"
> +#define __HYPERCALL_NUMBER   "r12"
>  #define __HYPERCALL_ARG1REG  "r0"
>  #define __HYPERCALL_ARG2REG  "r1"
>  #define __HYPERCALL_ARG3REG  "r2"
> @@ -48,30 +52,32 @@
>  
>  #define __HYPERCALL_DECLS\
>   register unsigned long __res  asm(__HYPERCALL_RETREG);  \
> + register unsigned long __num  asm(__HYPERCALL_NUMBER) = __num; \
>   register unsigned long __arg1 asm(__HYPERCALL_ARG1REG) = __arg1; \
>   register unsigned long __arg2 asm(__HYPERCALL_ARG2REG) = __arg2; \
>   register unsigned long __arg3 asm(__HYPERCALL_ARG3REG) = __arg3; \
>   register unsigned long __arg4 asm(__HYPERCALL_ARG4REG) = __arg4; \
>   register unsigned long __arg5 asm(__HYPERCALL_ARG5REG) = __arg5;
>  
> -#define __HYPERCALL_0PARAM   "=r" (__res)
> +#define __HYPERCALL_0PARAM   "=r" (__res), "+r" (__num)
>  #define __HYPERCALL_1PARAM   __HYPERCALL_0PARAM, "+r" (__arg1)
>  #define __HYPERCALL_2PARAM   __HYPERCALL_1PARAM, "+r" (__arg2)
>  #define __HYPERCALL_3PARAM   __HYPERCALL_2PARAM, "+r" (__arg3)
>  #define __HYPERCALL_4PARAM   __HYPERCALL_3PARAM, "+r" (__arg4)
>  #define __HYPERCALL_5PARAM   __HYPERCALL_4PARAM, "+r" (__arg5)
>  
> -#define __HYPERCALL_0ARG()
> -#define __HYPERCALL_1ARG(a1) \
> - __HYPERCALL_0ARG()  __arg1 = (unsigned long)(a1);
> -#define __HYPERCALL_2ARG(a1,a2) \
> - __HYPERCALL_1ARG(a1)  __arg2 = (unsigned long)(a2);
> -#define __HYPERCALL_3ARG(a1,a2,a3) \
> - __HYPERCALL_2ARG(a1,a2)  __arg3 = (unsigned long)(a3);
> -#define __HYPERCALL_4ARG(a1,a2,a3,a4) \
> - __HYPERCALL_3ARG(a1,a2,a3)  __arg4 = (unsigned long)(a4);
> -#define __HYPERCALL_5ARG(a1,a2,a3,a4,a5) \
> - __HYPERCALL_4ARG(a1,a2,a3,a4)  __arg5 = (unsigned long)(a5);
> +#define __HYPERCALL_0ARG(hypercall) \
> + __num = (unsigned long)hypercall;
> +#define __HYPERCALL_1ARG(hypercall,a1) \
> + __HYPERCALL_0ARG(hypercall)  __arg1 = (unsigned long)(a1);
> +#define __HYPERCALL_2ARG(hypercall,a1,a2) \
> + __HYPERCALL_1ARG(hypercall,a1)  __arg2 = (unsigned long)(a2);
> +#define __HYPERCALL_3ARG(hypercall,a1,a2,a3) \
> + __HYPERCALL_2ARG(hypercall,a1,a2)  __arg3 = (unsigned long)(a3);
> +#define __HYPERCALL_4ARG(hypercall,a1,a2,a3,a4) \
> + __HYPERCALL_3ARG(hypercall,a1,a2,a3)  __arg4 = (unsigned long)(a4);
> +#define __HYPERCALL_5ARG(hypercall,a1,a2,a3,a4,a5)

Re: [PATCH 24/37] KVM: PPC: booke: rework rescheduling checks

2012-02-27 Thread Alexander Graf


On 02/27/2012 05:34 PM, Bhushan Bharat-R65777 wrote:



>> +}
>> +
>> +/*
>> + * Common checks before entering the guest world.  Call with interrupts
>> + * disabled.
>> + *
>> + * returns !0 if a signal is pending and check_signal is true
>> + */
>> +static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool check_signal)
>> +{
>> +	int r = 0;
>> +
>> +	WARN_ON_ONCE(!irqs_disabled());
>> +	while (true) {
>> +		if (need_resched()) {
>> +			local_irq_enable();
>> +			cond_resched();
>> +			local_irq_disable();
>> +			continue;
>> +		}
>> +
>> +		if (kvmppc_core_prepare_to_enter(vcpu)) {
>
> kvmppc_prepare_to_enter() is called even on heavyweight_exit. Shouldn't this
> be called only on lightweight_exit?


Yeah, we don't need to call it when exiting anyways. That's a functional 
change though, which this patch is trying not to introduce. So we should 
rather do that as a patch on top.


Alex


Re: [RFC/PATCH 1/2] kvm tools, seabios: Add "--bios" option to "vm run"

2012-02-27 Thread ron minnich

I'm less interested in this argument than in getting something that
works, however it happens, so I'll let it stop with the comment that I
don't agree with you :-)

thanks

ron

Re: [RFC/PATCH 1/2] kvm tools, seabios: Add "--bios" option to "vm run"

2012-02-27 Thread Gleb Natapov

On Mon, Feb 27, 2012 at 08:46:10AM -0800, ron minnich wrote:
> On Mon, Feb 27, 2012 at 8:38 AM, Gleb Natapov  wrote:
> 
> > Then make kvm-tool set it. Why do you need coreboot/seabios for that?
> 
> First thing we looked at. For a number of reasons it seems ugly.
That is not "ugly"; it is by design, isn't it?

> Either we can pick a bunch of fixed base addresses for the kvm-tool
> resources, and track them all as new devices are added to kvm-tool
> over time (yuck), or we can replicate some of the pci BAR code in
> kvm-tool that already exists in seabios/coreboot, which also seems
> yucky.
The kvm-tool design goal, as far as I can tell, was to be as
self-contained as possible. kvm-tool should be usable without seabios at
all if legacy bios functionality is not needed. To achieve that, some
firmware functionality (mostly related to HW initialization) is already
replicated in kvm-tool. Again, this is by design. Coreboot, when used
in conjunction with seabios on real HW, takes care of HW initialization,
so if kvm-tool wants to follow its design direction it should assume the role
of coreboot and load Seabios only when legacy bios functionality is
needed.

BTW there is code duplication between coreboot and seabios too. Both of
them can be used to init QEMU HW.

> 
> kvm-tool provides a more realistic environment in some ways for guests
> than qemu.
How does kvm-tool provide a more realistic environment? You have it
backwards.

> Less is initialized.
Nothing is initialized in QEMU. Every single bit of initialization is
done by guest code, be it Seabios/coreboot/openfirmware/tianocore or the
guest OS itself. And yes, qemu can run all of the firmwares above
without a single line of code to support any of them.

>  Given that coreboot does the things we
> need done, and it's easy to build, I don't see a reason not to use it.
I am not very familiar with coreboot, but I think you will have to write
kvm-tool platform support for coreboot before you can run it in
kvm-tool.

--
Gleb.

Re: [Qemu-devel] KVM call agenda for Tuesday 28th

2012-02-27 Thread Eric Blake

On 02/27/2012 05:22 AM, Juan Quintela wrote:
> 
> Hi
> 
> Please send in any agenda items you are interested in covering.

Given all the threads on snapshot/mirror/migrate/reopen in the blockdev
layer, that sounds like a worthwhile topic to discuss on a phone call.

-- 
Eric Blake   ebl...@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


Re: [PATCH 2/2] Expose tsc deadline timer cpuid to guest

2012-02-27 Thread Jan Kiszka

On 2012-02-27 17:05, Liu, Jinsong wrote:
> Jan Kiszka wrote:
>> On 2012-01-07 19:23, Liu, Jinsong wrote:
>>> Jan Kiszka wrote:
>>>> On 2012-01-05 18:07, Liu, Jinsong wrote:
>>>>>> Sorry, it remains bogus to expose the tsc deadline timer feature
>>>>>> on machines < pc-1.1. That's just like we introduced kvmclock
>>>>>> only to pc-0.14 onward. The reason is that guest OSes so far
>>>>>> running on qemu-1.0 or older without deadline timer support must
>>>>>> not find that feature when being migrated to a host with qemu-1.1
>>>>>> in pc-1.0 compat mode. Yes, the user can explicitly disable it,
>>>>>> but that is not the idea of legacy machine models. They should
>>>>>> provide the very same environment that older qemu versions
>>>>>> offered. 
>>>>>>
>>>>>
>>>>> Not quite clear about this point.
>>>>> Per my understanding, if a kvm guest running on an older qemu
>>>>> without tsc deadline timer support,
>>>>> then after migrate, the guest would still cannot find tsc deadline
>>>>> feature, no matter older or newer host/qemu/pc-xx it migrate to.
>>>>
>>>> What should prevent this? The feature flags are not part of the
>>>> vmstate. They are part of the vm configuration which is not migrated
>>>> but defined by starting qemu on the target host.
>>>>
>>>
>>> Thanks! understand this point ("They are part of the vm
>>> configuration which is not migrated but defined by starting qemu on
>>> the target host").  
>>>
>>> But kvmclock example still cannot satisfy the purpose "guest running
>>> on old qemu/pc-0.13 without kvmclock support must not find kvmclock
>>> feature when being migrated to a host with new qemu/pc-0.13 compat
>>> mode". After migration, guest can possibily find kvmclock
>>> feature CPUID.0x4001.KVM_FEATURE_CLOCKSOURCE: pc_init1(...,
>>> kvmclock_enabled) { pc_cpus_init(cpu_model);// the point to
>>> decide and expose cpuid features to guest   
>>>
>>> if (kvmclock_enabled) {// the difference point between
>>> pc-0.13 vs. pc-0.14, related nothing to cpuid features.
>>> kvmclock_create(); } }
>>
>> Right, not a perfect example: the cpuid feature is not influenced by
>> this mechanism, only the fact if a kvmclock device (for save/restore)
>> should be created. I guess we ignored this back then, only focusing on
>> the more obvious issue of the addition device.
>>
>>>
>>> Seems currently there is no good way to satisfy "guest running on
>>> old qemu/pc-xx without feature A support must not find feature A
>>> when being migrated to a host with new qemu/pc-xx compat mode", i.e.
>>> considering   
>>> * if running with '-cpu host' then migrate;
>>> * each time we add a new cpuid feature it need add one or more new
>>> machine model? is it necessary to bind pc-xx with cpuid feature? 
>>> * logically cpuid features should better be controlled by cpu model,
>>> not by machine model. 
>>
>> The compatibility machines define the possible cpu models. If I select
> 
> How does a machine define the possible cpu models?
> The cpu model is defined by the qemu option '-cpu ...', while the machine
> model is defined by '-machine ...'.
> 
>> pc-0.14, e.g. -cpu kvm64 should not give me features that 0.14 was not
>> exposing.
>>
> 
> In that case, it is '-cpu kvm64' that decides which cpuid features
> are exposed to the guest, not '-machine pc-0.14'.
> 
> IMO, what our patch needs to do is expose a cpuid feature to the guest
> (CPUID.01H:ECX.TSC_Deadline[bit 24]), which is decided by the cpu model,
> not the machine model:
> pc_init1(..., cpu_model, ...)
> {
>     pc_cpus_init(cpu_model);   // this is the whole logic exposing cpuid features to guest
>     ...
> }
> 
> Am I misunderstanding something?

My point is that

  qemu-version-A [-cpu whatever]

should provide the same VM as

  qemu-version-B -machine pc-A [-cpu whatever]

specifically if you leave out the cpu specification.

So the compat machine could establish a feature mask (e.g. append some
"-tsc_deadline" in this case). But, indeed, we need a new channel for this.
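
A rough sketch of the feature-mask idea, to make it concrete. This is not existing QEMU code: struct compat_machine, the two machine entries and guest_cpuid_ecx() are hypothetical names for illustration. The only fixed fact is that CPUID.01H:ECX bit 24 advertises the TSC deadline timer. The assumption is that each compat machine type carries a set of feature bits it must hide, which is applied on top of whatever the cpu model selected.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 24 advertises the TSC deadline timer. */
#define ECX_TSC_DEADLINE (1u << 24)

/* Hypothetical per-machine compat data, not an existing QEMU structure. */
struct compat_machine {
	const char *name;
	uint32_t ecx_features_off; /* bits this machine type must hide */
};

/* pc-1.0 predates the feature, so it masks it; pc-1.1 does not. */
static const struct compat_machine pc_1_0 = { "pc-1.0", ECX_TSC_DEADLINE };
static const struct compat_machine pc_1_1 = { "pc-1.1", 0 };

/* Combine the cpu model's feature word with the machine's compat mask. */
static uint32_t guest_cpuid_ecx(const struct compat_machine *m,
				uint32_t cpu_model_ecx)
{
	return cpu_model_ecx & ~m->ecx_features_off;
}

/* The same cpu model yields different guest-visible bits per machine. */
static void compat_mask_selftest(void)
{
	uint32_t model = ECX_TSC_DEADLINE | 1u;

	assert((guest_cpuid_ecx(&pc_1_0, model) & ECX_TSC_DEADLINE) == 0);
	assert((guest_cpuid_ecx(&pc_1_1, model) & ECX_TSC_DEADLINE) != 0);
}
```

The cpu model still decides which features are candidates; the machine type only removes what the corresponding older qemu version could not have exposed.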

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [RFC/PATCH 1/2] kvm tools, seabios: Add "--bios" option to "vm run"

2012-02-27 Thread ron minnich

On Mon, Feb 27, 2012 at 8:38 AM, Gleb Natapov  wrote:

> Then make kvm-tool set it. Why do you need coreboot/seabios for that?

First thing we looked at. For a number of reasons it seems ugly.
Either we can pick a bunch of fixed base addresses for the kvm-tool
resources, and track them all as new devices are added to kvm-tool
over time (yuck), or we can replicate some of the pci BAR code in
kvm-tool that already exists in seabios/coreboot, which also seems
yucky.
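
To give an idea of how much BAR code is involved, here is a minimal sketch of the core of such an allocator. It is not taken from kvm-tool, seabios or coreboot; struct bar_window and bar_alloc() are hypothetical names. It relies only on the PCI rule that a BAR's size is a power of two and its base must be aligned to that size, and it assumes a single 32-bit memory window handed out front to back.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical MMIO window: addresses in [next, end) are still free. */
struct bar_window {
	uint32_t next;
	uint32_t end;
};

/*
 * Hand out a naturally aligned region of 'size' bytes.
 * Returns the base address, or 0 if the window is exhausted.
 */
static uint32_t bar_alloc(struct bar_window *w, uint32_t size)
{
	/* PCI BAR sizes are powers of two */
	assert(size != 0 && (size & (size - 1)) == 0);

	/* round the next free address up to a multiple of size */
	uint32_t base = (w->next + size - 1) & ~(size - 1);

	/* base < next catches wrap-around in the rounding above */
	if (base < w->next || base > w->end || w->end - base < size)
		return 0;

	w->next = base + size;
	return base;
}
```

A real implementation would also have to probe each BAR's size, handle I/O and 64-bit BARs, and program bridge windows, which is the part that already exists in the firmware.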

kvm-tool provides a more realistic environment in some ways for guests
than qemu. Less is initialized. Given that coreboot does the things we
need done, and it's easy to build, I don't see a reason not to use it.

Of course due to the ongoing ld alignment bug issue, I can't build
seabios to test this idea, so have not. If I had time I'd put in a
patch to seabios to not depend on this feature, since it's broken on
every linux system I own, but ...

ron

Re: [RFC/PATCH 1/2] kvm tools, seabios: Add "--bios" option to "vm run"

2012-02-27 Thread Gleb Natapov

On Mon, Feb 27, 2012 at 08:24:02AM -0800, ron minnich wrote:
> On Mon, Feb 27, 2012 at 3:15 AM, Gleb Natapov  wrote:
> 
> > AFAIK kvm-tool initialize PCI by itself so it actually may work (and,
> > very likely, the best thing to do).
> 
> we're going in circles. kvm-tool does not set the base address of some
> of the BARs and seabios does not do it quite correctly. That's what
> we're trying to decide how to fix. Probably the fastest way out is
> just run coreboot+seabios for kvm-tool.
> 
Then make kvm-tool set it. Why do you need coreboot/seabios for that?

--
Gleb.

RE: [PATCH 24/37] KVM: PPC: booke: rework rescheduling checks

2012-02-27 Thread Bhushan Bharat-R65777



> -Original Message-
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf 
> Of
> Alexander Graf
> Sent: Friday, February 24, 2012 7:56 PM
> To: kvm-...@vger.kernel.org
> Cc: kvm@vger.kernel.org; linuxppc-...@lists.ozlabs.org; Wood Scott-B07421
> Subject: [PATCH 24/37] KVM: PPC: booke: rework rescheduling checks
> 
> Instead of checking whether we should reschedule only when we exited due to an
> interrupt, let's always check before entering the guest back again. This gets
> the target more in line with the other archs.
> 
> Also while at it, generalize the whole thing so that eventually we could have 
> a
> single kvmppc_prepare_to_enter function for all ppc targets that does signal 
> and
> reschedule checking for us.
> 
> Signed-off-by: Alexander Graf 
> ---
>  arch/powerpc/include/asm/kvm_ppc.h |2 +-
>  arch/powerpc/kvm/book3s.c  |4 ++-
>  arch/powerpc/kvm/booke.c   |   70 ---
>  3 files changed, 52 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h
> b/arch/powerpc/include/asm/kvm_ppc.h
> index e709975..7f0a3da 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -95,7 +95,7 @@ extern int kvmppc_core_vcpu_translate(struct kvm_vcpu *vcpu,
> extern void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu);  extern 
> void
> kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu);
> 
> -extern void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu);
> +extern int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu);
>  extern int kvmppc_core_pending_dec(struct kvm_vcpu *vcpu);  extern void
> kvmppc_core_queue_program(struct kvm_vcpu *vcpu, ulong flags);  extern void
> kvmppc_core_queue_dec(struct kvm_vcpu *vcpu); diff --git
> a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index 7d54f4e..c8ead7b
> 100644
> --- a/arch/powerpc/kvm/book3s.c
> +++ b/arch/powerpc/kvm/book3s.c
> @@ -258,7 +258,7 @@ static bool clear_irqprio(struct kvm_vcpu *vcpu, unsigned
> int priority)
>   return true;
>  }
> 
> -void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
> +int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>  {
>   unsigned long *pending = &vcpu->arch.pending_exceptions;
>   unsigned long old_pending = vcpu->arch.pending_exceptions; @@ -283,6
> +283,8 @@ void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
> 
>   /* Tell the guest about our interrupt status */
>   kvmppc_update_int_pending(vcpu, *pending, old_pending);
> +
> + return 0;
>  }
> 
>  pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn) diff --git
> a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 9979be1..3fcec2c
> 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -439,8 +439,9 @@ static void kvmppc_core_check_exceptions(struct kvm_vcpu
> *vcpu)  }
> 
>  /* Check pending exceptions and deliver one, if possible. */ -void
> kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
> +int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>  {
> + int r = 0;
>   WARN_ON_ONCE(!irqs_disabled());
> 
>   kvmppc_core_check_exceptions(vcpu);
> @@ -451,8 +452,44 @@ void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>   local_irq_disable();
> 
>   kvmppc_set_exit_type(vcpu, EMULATED_MTMSRWE_EXITS);
> - kvmppc_core_check_exceptions(vcpu);
> + r = 1;
>   };
> +
> + return r;
> +}
> +
> +/*
> + * Common checks before entering the guest world.  Call with interrupts
> + * disabled.
> + *
> + * returns !0 if a signal is pending and check_signal is true
> + */
> +static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool check_signal)
> +{
> + int r = 0;
> +
> + WARN_ON_ONCE(!irqs_disabled());
> + while (true) {
> + if (need_resched()) {
> + local_irq_enable();
> + cond_resched();
> + local_irq_disable();
> + continue;
> + }
> +
> + if (kvmppc_core_prepare_to_enter(vcpu)) {

kvmppc_prepare_to_enter() is called even on heavyweight_exit. Shouldn't this
be called only on lightweight_exit?

Thanks
-Bharat

> + /* interrupts got enabled in between, so we
> +are back at square 1 */
> + continue;
> + }
> +
> + if (check_signal && signal_pending(current))
> + r = 1;
> +
> + break;
> + }
> +
> + return r;
>  }
> 
>  int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) @@ 
> -470,10
> +507,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>   }
> 
>   local_irq_disable();
> -
> - kvmppc_core_prepare_to_enter(vcpu);
> -
> - if (signal_pending(current)) {
> + if (kvmppc_prepare_to_enter(vcpu, true)) {
>   kvm_run->exit_reason = KVM_EXIT_INTR;
>

Re: [PATCH-WIP 01/13] xen/arm: use r12 to pass the hypercall number to the hypervisor

2012-02-27 Thread Ian Campbell

On Thu, 2012-02-23 at 17:48 +, Stefano Stabellini wrote:
> We need a register to pass the hypercall number because we might not
> know it at compile time and HVC only takes an immediate argument.
> 
> Among the available registers r12 seems to be the best choice because it
> is defined as "intra-procedure call scratch register".

R12 is not accessible from the 16 bit "T1" Thumb encoding of mov
immediate (which can only target r0..r7).

Since we support only ARMv7+ there are "T2" and "T3" encodings available
which do allow direct mov of an immediate into R12, but are 32 bit Thumb
instructions.

Should we use r7 instead to maximise instruction density for Thumb code?
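
To illustrate the density point, a small sketch of the 16-bit T1 form. The helper names are made up. The sketch assumes the T1 "MOVS Rd, #imm8" layout (00100 Rd imm8), which only has three bits for the register and eight for the immediate; anything else needs one of the 32-bit encodings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: can "mov rd, #imm" use the 16-bit T1 encoding? */
static bool thumb_mov_imm_fits_t1(unsigned rd, uint32_t imm)
{
	return rd <= 7 && imm <= 0xff;
}

/* Hypothetical helper: assemble the T1 encoding, 00100 Rd(3) imm8(8). */
static uint16_t thumb_movs_t1(unsigned rd, uint8_t imm8)
{
	assert(rd <= 7); /* only r0..r7 are encodable */
	return (uint16_t)(0x2000u | (rd << 8) | imm8);
}
```

So loading a small hypercall number into r7 can take one halfword, whereas loading it into r12 always takes a 32-bit instruction in Thumb code.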

Ian.


Re: [RFC/PATCH 1/2] kvm tools, seabios: Add "--bios" option to "vm run"

2012-02-27 Thread ron minnich

On Mon, Feb 27, 2012 at 3:15 AM, Gleb Natapov  wrote:

> AFAIK kvm-tool initialize PCI by itself so it actually may work (and,
> very likely, the best thing to do).

we're going in circles. kvm-tool does not set the base address of some
of the BARs and seabios does not do it quite correctly. That's what
we're trying to decide how to fix. Probably the fastest way out is
just run coreboot+seabios for kvm-tool.

ron

RE: [PATCH 2/2] Expose tsc deadline timer cpuid to guest

2012-02-27 Thread Liu, Jinsong

Jan Kiszka wrote:
> On 2012-01-07 19:23, Liu, Jinsong wrote:
>> Jan Kiszka wrote:
>>> On 2012-01-05 18:07, Liu, Jinsong wrote:
>>>>> Sorry, it remains bogus to expose the tsc deadline timer feature
>>>>> on machines < pc-1.1. That's just like we introduced kvmclock
>>>>> only to pc-0.14 onward. The reason is that guest OSes so far
>>>>> running on qemu-1.0 or older without deadline timer support must
>>>>> not find that feature when being migrated to a host with qemu-1.1
>>>>> in pc-1.0 compat mode. Yes, the user can explicitly disable it,
>>>>> but that is not the idea of legacy machine models. They should
>>>>> provide the very same environment that older qemu versions
>>>>> offered. 
>>>>>
>>>>
>>>> Not quite clear about this point.
>>>> Per my understanding, if a kvm guest running on an older qemu
>>>> without tsc deadline timer support,
>>>> then after migrate, the guest would still cannot find tsc deadline
>>>> feature, no matter older or newer host/qemu/pc-xx it migrate to.
>>> 
>>> What should prevent this? The feature flags are not part of the
>>> vmstate. They are part of the vm configuration which is not migrated
>>> but defined by starting qemu on the target host.
>>> 
>> 
>> Thanks! understand this point ("They are part of the vm
>> configuration which is not migrated but defined by starting qemu on
>> the target host").  
>> 
>> But kvmclock example still cannot satisfy the purpose "guest running
>> on old qemu/pc-0.13 without kvmclock support must not find kvmclock
>> feature when being migrated to a host with new qemu/pc-0.13 compat
>> mode". After migration, guest can possibily find kvmclock
>> feature CPUID.0x4001.KVM_FEATURE_CLOCKSOURCE: pc_init1(...,
>> kvmclock_enabled) { pc_cpus_init(cpu_model);// the point to
>> decide and expose cpuid features to guest   
>> 
>> if (kvmclock_enabled) {// the difference point between
>> pc-0.13 vs. pc-0.14, related nothing to cpuid features.
>> kvmclock_create(); } }
> 
> Right, not a perfect example: the cpuid feature is not influenced by
> this mechanism, only the fact if a kvmclock device (for save/restore)
> should be created. I guess we ignored this back then, only focusing on
> the more obvious issue of the addition device.
> 
>> 
>> Seems currently there is no good way to satisfy "guest running on
>> old qemu/pc-xx without feature A support must not find feature A
>> when being migrated to a host with new qemu/pc-xx compat mode", i.e.
>> considering   
>> * if running with '-cpu host' then migrate;
>> * each time we add a new cpuid feature it need add one or more new
>> machine model? is it necessary to bind pc-xx with cpuid feature? 
>> * logically cpuid features should better be controlled by cpu model,
>> not by machine model. 
> 
> The compatibility machines define the possible cpu models. If I select

How does a machine define the possible cpu models?
The cpu model is defined by the qemu option '-cpu ...', while the machine
model is defined by '-machine ...'.

> pc-0.14, e.g. -cpu kvm64 should not give me features that 0.14 was not
> exposing.
> 

In that case, it is '-cpu kvm64' that decides which cpuid features
are exposed to the guest, not '-machine pc-0.14'.

IMO, what our patch needs to do is expose a cpuid feature to the guest
(CPUID.01H:ECX.TSC_Deadline[bit 24]), which is decided by the cpu model,
not the machine model:
pc_init1(..., cpu_model, ...)
{
    pc_cpus_init(cpu_model);   // this is the whole logic exposing cpuid features to guest
    ...
}

Am I misunderstanding something?

Thanks,
Jinsong

Re: [PATCH 1/2] powerpc/e500: make load_up_spe a normal fuction

2012-02-27 Thread Tabi Timur-B04825

On Mon, Feb 27, 2012 at 4:59 AM, Olivia Yin  wrote:
> So that we can call it in kernel.

And why would we want that?

-- 
Timur Tabi
Linux kernel developer at Freescale

Re: [PATCH v3] KVM: Allow host IRQ sharing for assigned PCI 2.3 devices

2012-02-27 Thread Jan Kiszka

On 2012-02-10 19:17, Jan Kiszka wrote:
> PCI 2.3 allows to generically disable IRQ sources at device level. This
> enables us to share legacy IRQs of such devices with other host devices
> when passing them to a guest.
> 
> The new IRQ sharing feature introduced here is optional, user space has
> to request it explicitly. Moreover, user space can inform us about its
> view of PCI_COMMAND_INTX_DISABLE so that we can avoid unmasking the
> interrupt and signaling it if the guest masked it via the virtualized
> PCI config space.
> 
> Signed-off-by: Jan Kiszka 
> ---
> 
> Changes in v3:
>  - rebased over current kvm.git (no code conflict, just api.txt)
> 
>  Documentation/virtual/kvm/api.txt |   31 ++
>  arch/x86/kvm/x86.c|1 +
>  include/linux/kvm.h   |6 +
>  include/linux/kvm_host.h  |2 +
>  virt/kvm/assigned-dev.c   |  208 +++-
>  5 files changed, 219 insertions(+), 29 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/api.txt 
> b/Documentation/virtual/kvm/api.txt
> index 59a3826..5ce0e29 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -1169,6 +1169,14 @@ following flags are specified:
>  
>  /* Depends on KVM_CAP_IOMMU */
>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU  (1 << 0)
> +/* The following two depend on KVM_CAP_PCI_2_3 */
> +#define KVM_DEV_ASSIGN_PCI_2_3   (1 << 1)
> +#define KVM_DEV_ASSIGN_MASK_INTX (1 << 2)
> +
> +If KVM_DEV_ASSIGN_PCI_2_3 is set, the kernel will manage legacy INTx 
> interrupts
> +via the PCI-2.3-compliant device-level mask, thus enable IRQ sharing with 
> other
> +assigned devices or host devices. KVM_DEV_ASSIGN_MASK_INTX specifies the
> +guest's view on the INTx mask, see KVM_ASSIGN_SET_INTX_MASK for details.
>  
>  The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure
>  isolation of the device.  Usages not specifying this flag are deprecated.
> @@ -1441,6 +1449,29 @@ The "num_dirty" field is a performance hint for KVM to 
> determine whether it
>  should skip processing the bitmap and just invalidate everything.  It must
>  be set to the number of set bits in the bitmap.
>  
> +4.60 KVM_ASSIGN_SET_INTX_MASK
> +
> +Capability: KVM_CAP_PCI_2_3
> +Architectures: x86
> +Type: vm ioctl
> +Parameters: struct kvm_assigned_pci_dev (in)
> +Returns: 0 on success, -1 on error
> +
> +Informs the kernel about the guest's view on the INTx mask. As long as the
> +guest masks the legacy INTx, the kernel will refrain from unmasking it at
> +hardware level and will not assert the guest's IRQ line. User space is still
> +responsible for applying this state to the assigned device's real config 
> space
> +by setting or clearing the Interrupt Disable bit 10 in the Command register.
> +
> +To avoid that the kernel overwrites the state user space wants to set,
> +KVM_ASSIGN_SET_INTX_MASK has to be called prior to updating the config space.
> +Moreover, user space has to write back its own view on the Interrupt Disable
> +bit whenever modifying the Command word.
> +
> +See KVM_ASSIGN_DEV_IRQ for the data structure. The target device is specified
> +by assigned_dev_id. In the flags field, only KVM_DEV_ASSIGN_MASK_INTX is
> +evaluated.
> +
>  4.62 KVM_CREATE_SPAPR_TCE
>  
>  Capability: KVM_CAP_SPAPR_TCE
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 2bd77a3..1f11435 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2099,6 +2099,7 @@ int kvm_dev_ioctl_check_extension(long ext)
>   case KVM_CAP_XSAVE:
>   case KVM_CAP_ASYNC_PF:
>   case KVM_CAP_GET_TSC_KHZ:
> + case KVM_CAP_PCI_2_3:
>   r = 1;
>   break;
>   case KVM_CAP_COALESCED_MMIO:
> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
> index acbe429..6c322a9 100644
> --- a/include/linux/kvm.h
> +++ b/include/linux/kvm.h
> @@ -588,6 +588,7 @@ struct kvm_ppc_pvinfo {
>  #define KVM_CAP_TSC_DEADLINE_TIMER 72
>  #define KVM_CAP_S390_UCONTROL 73
>  #define KVM_CAP_SYNC_REGS 74
> +#define KVM_CAP_PCI_2_3 75
>  
>  #ifdef KVM_CAP_IRQ_ROUTING
>  
> @@ -784,6 +785,9 @@ struct kvm_s390_ucas_mapping {
>  /* Available with KVM_CAP_TSC_CONTROL */
>  #define KVM_SET_TSC_KHZ   _IO(KVMIO,  0xa2)
>  #define KVM_GET_TSC_KHZ   _IO(KVMIO,  0xa3)
> +/* Available with KVM_CAP_PCI_2_3 */
> +#define KVM_ASSIGN_SET_INTX_MASK  _IOW(KVMIO,  0xa4, \
> +struct kvm_assigned_pci_dev)
>  
>  /*
>   * ioctls for vcpu fds
> @@ -857,6 +861,8 @@ struct kvm_s390_ucas_mapping {
>  #define KVM_SET_ONE_REG_IOW(KVMIO,  0xac, struct kvm_one_reg)
>  
>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU  (1 << 0)
> +#define KVM_DEV_ASSIGN_PCI_2_3   (1 << 1)
> +#define KVM_DEV_ASSIGN_MASK_INTX (1 << 2)
>  
>  struct kvm_assigned_pci_dev {
>   __u32 assigned_dev_id;
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 9698080..d1d68f4 100644
> --- a/in

Re: [PATCH] kvm: notify host when guest paniced

2012-02-27 Thread Jan Kiszka

On 2012-02-27 04:01, Wen Congyang wrote:
> We can know the guest is panicked when the guest runs on xen.
> But we do not have such a feature on kvm. This patch implements
> this feature, and the implementation is the same as xen:
> register a panic notifier, and call a hypercall when the guest
> is panicked.
> 
> Signed-off-by: Wen Congyang 
> ---
>  arch/x86/kernel/kvm.c|   12 
>  arch/x86/kvm/svm.c   |8 ++--
>  arch/x86/kvm/vmx.c   |8 ++--
>  arch/x86/kvm/x86.c   |   13 +++--
>  include/linux/kvm.h  |1 +
>  include/linux/kvm_para.h |1 +
>  6 files changed, 37 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index f0c6fd6..b928d1d 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -331,6 +331,17 @@ static struct notifier_block kvm_pv_reboot_nb = {
>   .notifier_call = kvm_pv_reboot_notify,
>  };
>  
> +static int
> +kvm_pv_panic_notify(struct notifier_block *nb, unsigned long code, void 
> *unused)
> +{
> + kvm_hypercall0(KVM_HC_GUEST_PANIC);
> + return NOTIFY_DONE;
> +}
> +
> +static struct notifier_block kvm_pv_panic_nb = {
> + .notifier_call = kvm_pv_panic_notify,
> +};
> +

You should split up host and guest-side changes.

>  static u64 kvm_steal_clock(int cpu)
>  {
>   u64 steal;
> @@ -417,6 +428,7 @@ void __init kvm_guest_init(void)
>  
>   paravirt_ops_setup();
>   register_reboot_notifier(&kvm_pv_reboot_nb);
> + atomic_notifier_chain_register(&panic_notifier_list, &kvm_pv_panic_nb);
>   for (i = 0; i < KVM_TASK_SLEEP_HASHSIZE; i++)
>   spin_lock_init(&async_pf_sleepers[i].lock);
>   if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF))
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 0b7690e..38b4705 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -1900,10 +1900,14 @@ static int halt_interception(struct vcpu_svm *svm)
>  
>  static int vmmcall_interception(struct vcpu_svm *svm)
>  {
> + int ret;
> +
>   svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
>   skip_emulated_instruction(&svm->vcpu);
> - kvm_emulate_hypercall(&svm->vcpu);
> - return 1;
> + ret = kvm_emulate_hypercall(&svm->vcpu);
> +
> + /* Ignore the error? */
> + return ret == 0 ? 0 : 1;

Why can't kvm_emulate_hypercall return the right value?

>  }
>  
>  static unsigned long nested_svm_get_tdp_cr3(struct kvm_vcpu *vcpu)
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 66147ca..1b57ebb 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -4582,9 +4582,13 @@ static int handle_halt(struct kvm_vcpu *vcpu)
>  
>  static int handle_vmcall(struct kvm_vcpu *vcpu)
>  {
> + int ret;
> +
>   skip_emulated_instruction(vcpu);
> - kvm_emulate_hypercall(vcpu);
> - return 1;
> + ret = kvm_emulate_hypercall(vcpu);
> +
> + /* Ignore the error? */
> + return ret == 0 ? 0 : 1;
>  }
>  
>  static int handle_invd(struct kvm_vcpu *vcpu)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c9d99e5..3fc2853 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4923,7 +4923,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>   u64 param, ingpa, outgpa, ret;
>   uint16_t code, rep_idx, rep_cnt, res = HV_STATUS_SUCCESS, rep_done = 0;
>   bool fast, longmode;
> - int cs_db, cs_l;
> + int cs_db, cs_l, r = 1;
>  
>   /*
>* hypercall generates UD from non zero cpl and real mode
> @@ -4964,6 +4964,10 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>   case HV_X64_HV_NOTIFY_LONG_SPIN_WAIT:
>   kvm_vcpu_on_spin(vcpu);
>   break;
> + case KVM_HC_GUEST_PANIC:
> + vcpu->run->exit_reason = KVM_EXIT_GUEST_PANIC;
> + r = 0;
> + break;

That's the wrong place. This is a KVM hypercall, not a HyperV one.

>   default:
>   res = HV_STATUS_INVALID_HYPERCALL_CODE;
>   break;
> @@ -4977,7 +4981,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>   kvm_register_write(vcpu, VCPU_REGS_RAX, ret & 0xffffffff);
>   }
>  
> - return 1;
> + return r;
>  }
>  
>  int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> @@ -5013,6 +5017,11 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>   case KVM_HC_VAPIC_POLL_IRQ:
>   ret = 0;
>   break;
> + case KVM_HC_GUEST_PANIC:
> + ret = 0;
> + vcpu->run->exit_reason = KVM_EXIT_GUEST_PANIC;
> + r = 0;
> + break;
>   default:
>   ret = -KVM_ENOSYS;
>   break;
> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
> index acbe429..8f0e31b 100644
> --- a/include/linux/kvm.h
> +++ b/include/linux/kvm.h
> @@ -163,6 +163,7 @@ struct kvm_pit_config {
>  #define KVM_EXIT_OSI  18
>  #define KVM_EXIT_PAPR_HCALL19
>  #define KVM_EXIT_S390_UCONTROL 20
> +#define KVM_EXIT_GUEST_PANIC

Re: [PATCHv3 0/4] standard pci bridge device

2012-02-27 Thread Gerd Hoffmann

On 02/20/12 23:52, Michael S. Tsirkin wrote:
> Here's a new version of the patch.
> 
> TODOs:
> - windows guest testing
> 
> Changes from v2:
> - added slot id capability
> - migration support
> - misc fixes
> - fix checkpatch errors

64bit prefetch memory window works now:

00:10.0 PCI bridge: Red Hat, Inc. Device 0001 (prog-if 00 [Normal decode])
Physical Slot: 16
Flags: bus master, 66MHz, fast devsel, latency 0
Memory at f5126000 (32-bit, non-prefetchable) [size=256]
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
Memory behind bridge: f500-f50f
Prefetchable memory behind bridge: f800-fbff
Capabilities: [48] Slot ID: 0 slots, First+, chassis 01
Capabilities: [40] Hot-plug capable
Kernel modules: shpchp

Looks good to me, did only light testing though.

cheers,
  Gerd

KVM call agenda for Tuesday 28th

2012-02-27 Thread Juan Quintela


Hi

Please send in any agenda items you are interested in covering.

Cheers,

Juan.

[PATCH] vhost: don't forget to schedule()

2012-02-27 Thread Nadav Har'El

This is a tiny, but important, patch to vhost.

Vhost's worker thread only called schedule() when it had no work to do, and
it wanted to go to sleep. But if there's always work to do, e.g., the guest
is running a network-intensive program like netperf with small message sizes,
schedule() was *never* called. This had several negative implications (on
non-preemptive kernels):

 1. Passing time was not properly accounted to the "vhost" process (ps and
top would wrongly show it using zero CPU time).

 2. Sometimes error messages about RCU timeouts would be printed, if the
core running the vhost thread didn't schedule() for a very long time.

 3. Worst of all, a vhost thread would "hog" the core. If several vhost
threads need to share the same core, typically one would get most of the
CPU time (and its associated guest most of the performance), while the
others hardly get any work done.

The trivial solution is to add

    if (need_resched())
        schedule();

after every piece of work. This does not call the heavy schedule() every
time, only when the timer interrupt has decided that a reschedule is
warranted (so need_resched() returns true).
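
The effect can be modelled in user space. The following is only a sketch under stated assumptions: the sim_* names are hypothetical, the "timer tick" is faked by raising a flag every tick_every work items, and nothing here is the kernel's need_resched() or schedule().

```c
/* User-space model of the worker loop; not kernel code. */
#include <assert.h>
#include <stdbool.h>

struct sim {
	int pending;        /* queued work items */
	int tick_every;     /* fake timer: request a resched every N items */
	int done;           /* work items completed so far */
	int yields;         /* times the worker gave up the CPU */
	bool resched_flag;  /* stands in for need_resched() */
};

bool sim_need_resched(const struct sim *s)
{
	return s->resched_flag;
}

void sim_schedule(struct sim *s)
{
	s->yields++;
	s->resched_flag = false;
}

void sim_do_work(struct sim *s)
{
	s->pending--;
	s->done++;
	if (s->tick_every > 0 && s->done % s->tick_every == 0)
		s->resched_flag = true;   /* the "timer interrupt" fired */
}

/*
 * Drain the queue. With check_resched == false this models the old loop,
 * which never yields while work is available; with true it models the
 * patched loop. Returns the number of yields.
 */
int sim_worker(struct sim *s, bool check_resched)
{
	assert(s);
	while (s->pending > 0) {
		sim_do_work(s);
		if (check_resched && sim_need_resched(s))
			sim_schedule(s);
	}
	return s->yields;
}
```

In this model the unpatched loop finishes a busy queue without yielding once, while the patched loop yields only when the fake tick has asked for it, not after every item.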

Thanks to Abel Gordon for this patch.

Signed-off-by: Nadav Har'El 
---
 vhost.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index c14c42b..ae66278 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -222,6 +222,8 @@ static int vhost_worker(void *data)
if (work) {
__set_current_state(TASK_RUNNING);
work->fn(work);
+   if (need_resched())
+   schedule();
} else
schedule();
 
-- 
Nadav Har'El|Monday, Feb 27 2012, 
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |Ways to Relieve Stress #10: Make up a
http://nadav.harel.org.il   |language and ask people for directions.

Re: Current kernel fails to compile with KVM on PowerPC

2012-02-27 Thread Jörg Sommer

Alexander Graf wrote on Mon 27 Feb, 01:30 (+0100):
> On 27.02.2012, at 01:08, Jörg Sommer wrote:
> 
> > Alexander Graf wrote on Sun 26 Feb, 12:43 (+0100):
> >> On 25.02.2012, at 15:51, Jörg Sommer wrote:
> >>> Jörg Sommer wrote on Tue 21 Feb, 09:32 (+0100):
>  Alexander Graf wrote on Mon 20 Feb, 22:27 (+0100):
> > On 20.02.2012, at 18:38, Jörg Sommer wrote:
> >> Alexander Graf wrote on Tue 22 Nov, 22:29 (+0100):
> >>> On 22.11.2011, at 21:04, Jörg Sommer wrote:
>  [1] »kernel BUG at include/linux/kvm_host.h:603!«
>  http://www.mail-archive.com/kvm@vger.kernel.org/msg61433.html
> >>> 
> >>> This is unfortunately still there. It's because of preemption being
> >>> enabled. Please just use CONFIG_PREEMPT_NONE for the time being
> >> 
> >> This doesn't help. I've built with CONFIG_PREEMPT_NONE, but I'm getting
> >> this Oops when I start qemu.
> > 
> > Could you please try git://git.kernel.org/pub/scm/virt/kvm/kvm.git? I
> > fixed a bunch of things with preemption since then and it definitely
> > worked for me. If it still fails in that tree, I can try again to
> > reproduce it :).
>  
>  This kernel (e9badff4b38a3f8b2c20aa8a30db210caf85a497) fails to build:
>  
>  CC [M]  arch/powerpc/kvm/book3s_pr.o
>  arch/powerpc/kvm/book3s_pr.c: In function ‘kvm_vcpu_ioctl_get_one_reg’:
>  arch/powerpc/kvm/book3s_pr.c:883:45: error: cast to pointer from integer 
>  of different size [-Werror=int-to-pointer-cast]
>  arch/powerpc/kvm/book3s_pr.c:883:80: error: cast to pointer from integer 
>  of different size [-Werror=int-to-pointer-cast]
> > 
> >> Yikes. Does this patch work for you?
> > 
> >> diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
> >> index ee222ec..f329eae 100644
> >> --- a/arch/powerpc/kvm/book3s_pr.c
> >> +++ b/arch/powerpc/kvm/book3s_pr.c
> >> @@ -880,7 +880,8 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, 
> >> struct kvm_one_reg *reg)
> >> 
> >>switch (reg->id) {
> >>case KVM_REG_PPC_HIOR:
> >> -   r = put_user(to_book3s(vcpu)->hior, (u64 __user 
> >> *)reg->addr);
> >> +   r = put_user(to_book3s(vcpu)->hior,
> >> +(u64 __user *)(long)reg->addr);
> > 
> > Yes and no. It brings me a step further, but not to a working kernel.
> > 
> >  CHK include/linux/version.h
> >  CHK include/generated/utsrelease.h
> >  CALL    scripts/checksyscalls.sh
> >  CC [M]  arch/powerpc/kvm/book3s_pr.o
> >  LD [M]  arch/powerpc/kvm/kvm.o
> >  Building modules, stage 2.
> >  MODPOST 227 modules
> > ERROR: "__get_user_bad" [arch/powerpc/kvm/kvm.ko] undefined!
> 
> Ah, because you can't get_user u64s I suppose. Sigh. As a quick hack,
> just comment out the get/put_user lines - you don't care about
> configuring HIOR on ppc32 anyways. I'll try to come up with something
> :)
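
For reference, the cast from the quoted patch can be shown outside the kernel. This is a sketch only: struct one_reg_sketch is a hypothetical stand-in for struct kvm_one_reg, and uintptr_t is used here as the user-space equivalent of the (long) cast in the patch. It addresses the int-to-pointer-cast error only, not the __get_user_bad link failure.

```c
/* User-space illustration of casting a 64-bit address field to a pointer. */
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: the address field is 64-bit on every arch. */
struct one_reg_sketch {
	uint64_t id;
	uint64_t addr;
};

/*
 * Casting a uint64_t straight to a pointer warns on 32-bit targets
 * because the sizes differ. Going through a pointer-sized integer
 * first makes the narrowing explicit.
 */
uint64_t *reg_addr_to_ptr(const struct one_reg_sketch *reg)
{
	assert(reg);
	return (uint64_t *)(uintptr_t)reg->addr;
}
```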

I've removed these lines and got a kernel. But it crashes:

# modprobe kvm
# qemu-system-ppc -enable-kvm -curses
[  155.982144] BUG: sleeping function called from invalid context at 
arch/powerpc/kvm/../../../virt/kvm/kvm_main.c:1078
[  155.982552] in_atomic(): 0, irqs_disabled(): 1, pid: 1727, name: 
qemu-system-ppc
[  155.982807] Call Trace:
[  155.982916] [e31ad820] [c000bc44] show_stack+0xbc/0x194 (unreliable)
[  155.983175] [e31ad870] [c047bc2c] dump_stack+0x30/0x38
[  155.983372] [e31ad880] [c0062070] __might_sleep+0xf8/0x100
[  155.983620] [e31ad890] [ea6c1830] hva_to_pfn.isra.41+0xc0/0x340 [kvm]
[  155.983869] [e31ad8d0] [ea6c1b6c] __gfn_to_pfn+0xbc/0xc4 [kvm]
[  155.984110] [e31ad8f0] [ea6c1bec] gfn_to_pfn+0x38/0x40 [kvm]
[  155.984335] [e31ad900] [ea6c9f60] kvmppc_gfn_to_pfn+0xb8/0xc8 [kvm]
[  155.984571] [e31ad920] [ea6ce454] kvmppc_mmu_map_page+0x3c/0x274 [kvm]
[  155.984817] [e31ad970] [ea6cadc4] kvmppc_handle_pagefault+0x264/0x3d0 [kvm]
[  155.985083] [e31ad9c0] [ea6cb22c] kvmppc_handle_exit+0x18c/0x800 [kvm]
[  155.985329] [e31ada00] [ea6cd18c] kvmppc_handler_highmem+0x5c/0x6c [kvm]
[  155.985580] [e31adac0] [ea6cbebc] kvmppc_vcpu_run+0x184/0x244 [kvm]
[  155.985817] [e31ade20] [ea6c6170] kvm_arch_vcpu_ioctl_run+0x348/0x374 [kvm]
[  155.986080] [e31ade50] [ea6bfc70] kvm_vcpu_ioctl+0x158/0x888 [kvm]
[  155.986308] [e31adea0] [c0129080] do_vfs_ioctl+0x714/0x78c
[  155.986506] [e31adf10] [c0129160] sys_ioctl+0x68/0x8c
[  155.986693] [e31adf40] [c0013b70] ret_from_syscall+0x0/0x38
[  155.986915] --- Exception: c01 at 0xf4eda98
[  155.986921] LR = 0xf4ed9fc
[  155.992590] Page fault in user mode with in_atomic() = 1 mm = e3021e00
[  155.992869] NIP = 1017551c  MSR = d032
[  155.993273] PowerMac
[  155.993357] Modules linked in: kvm ipv6 fuse option usb_wwan usbserial 
snd_powermac b43 mac80211 cfg80211 snd_aoa_i2sbus usb_storage snd_pcm_oss 
snd_mixer_oss snd_pcm snd_page_alloc snd_seq snd_timer snd_seq_d
[  155.994742] NIP: 1017551c LR: 10175514 CTR: 0f5a3420
[  155.994920]

Re: [RFC/PATCH 1/2] kvm tools, seabios: Add "--bios" option to "vm run"

2012-02-27 Thread Gleb Natapov

On Mon, Feb 27, 2012 at 11:44:57AM +0100, Gerd Hoffmann wrote:
>   Hi,
> 
> > So looking at SeaBIOS code, it seems to me we could simply make LKVM
> > lie to it by claiming to be coreboot and get away with it, no? We'd
> > basically avoid all the PCI allocation passes and such.
> 
> I doubt this is going to fly if you want seabios to boot from a
> virtio-blk-pci device ...
> 
AFAIK kvm-tool initializes PCI by itself, so it may actually work (and is,
very likely, the best thing to do).

--
Gleb.

[PATCH 2/2] KVM: booke: Improve SPE switch

2012-02-27 Thread Olivia Yin

As book3s does for the FP switch, this patch switches SPE state between
qemu and the guest instead of between the host and the guest.
This way, loading guest SPE state simulates a host SPE load-up,
and the host decides when to give up SPE state.
It therefore cooperates better with host SPE usage,
which gives some performance benefit on UP hosts (lazy SPE).

Moreover, since the patch saves guest SPE state into the Linux thread fields,
it makes it possible to emulate guest SPE instructions in the host,
so that we can avoid injecting SPE exceptions into the guest.

The patch also turns all asm code into C code
and adds SPE stat counters.
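
The lazy load-up described above can be modelled as follows. This is a sketch under stated assumptions, not the patch itself: every *_sketch name is hypothetical, SKETCH_MSR_SPE is an illustrative bit value, and the real register load is replaced by a counter.

```c
/* Stand-alone model of lazy SPE load-up; not kernel code. */
#include <assert.h>

#define SKETCH_MSR_SPE (1u << 25)       /* illustrative bit position */

struct thread_sketch {
	unsigned int msr;               /* models current->thread.regs->msr */
	int spe_loads;                  /* how often state was really loaded */
};

struct vcpu_spe_sketch {
	unsigned int shadow_msr;        /* MSR bits the guest runs with */
};

/* Stands in for load_up_spe(); only counts the expensive reload. */
void load_up_spe_sketch(struct thread_sketch *t)
{
	t->spe_loads++;
}

/* Load SPE state only if the current thread does not already own it. */
void vcpu_enable_spe_sketch(struct thread_sketch *cur,
			    struct vcpu_spe_sketch *vcpu)
{
	assert(cur && vcpu);
	if (!(cur->msr & SKETCH_MSR_SPE)) {
		load_up_spe_sketch(cur);
		cur->msr |= SKETCH_MSR_SPE;
	}
	vcpu->shadow_msr |= SKETCH_MSR_SPE;
}

/* Guest cleared MSR[SPE]: drop the shadow bit, leave thread state alone. */
void vcpu_drop_spe_sketch(struct vcpu_spe_sketch *vcpu)
{
	assert(vcpu);
	vcpu->shadow_msr &= ~SKETCH_MSR_SPE;
}
```

In this model, toggling the guest's SPE bit off and on again costs no reload as long as the thread still owns the SPE state, which is the lazy behaviour the description refers to.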

Signed-off-by: Liu Yu 
---
 arch/powerpc/include/asm/kvm_host.h |   11 +-
 arch/powerpc/kernel/asm-offsets.c   |7 
 arch/powerpc/kvm/booke.c|   63 +++
 arch/powerpc/kvm/booke.h|8 +
 arch/powerpc/kvm/booke_interrupts.S |   37 
 arch/powerpc/kvm/e500.c |5 ---
 arch/powerpc/kvm/timing.c   |5 +++
 arch/powerpc/kvm/timing.h   |   11 ++
 8 files changed, 83 insertions(+), 64 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 1843d5d..6186d08 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -117,6 +117,11 @@ struct kvm_vcpu_stat {
u32 st;
u32 st_slow;
 #endif
+#ifdef CONFIG_SPE
+   u32 spe_unavail;
+   u32 spe_fp_data;
+   u32 spe_fp_round;
+#endif
 };
 
 enum kvm_exit_types {
@@ -147,6 +152,11 @@ enum kvm_exit_types {
FP_UNAVAIL,
DEBUG_EXITS,
TIMEINGUEST,
+#ifdef CONFIG_SPE
+   SPE_UNAVAIL,
+   SPE_FP_DATA,
+   SPE_FP_ROUND,
+#endif
__NUMBER_OF_KVM_EXIT_TYPES
 };
 
@@ -330,7 +340,6 @@ struct kvm_vcpu_arch {
 #ifdef CONFIG_SPE
ulong evr[32];
ulong spefscr;
-   ulong host_spefscr;
u64 acc;
 #endif
 #ifdef CONFIG_ALTIVEC
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 8e0db0b..ff68f71 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -604,13 +604,6 @@ int main(void)
DEFINE(TLBCAM_MAS7, offsetof(struct tlbcam, MAS7));
 #endif
 
-#if defined(CONFIG_KVM) && defined(CONFIG_SPE)
-   DEFINE(VCPU_EVR, offsetof(struct kvm_vcpu, arch.evr[0]));
-   DEFINE(VCPU_ACC, offsetof(struct kvm_vcpu, arch.acc));
-   DEFINE(VCPU_SPEFSCR, offsetof(struct kvm_vcpu, arch.spefscr));
-   DEFINE(VCPU_HOST_SPEFSCR, offsetof(struct kvm_vcpu, arch.host_spefscr));
-#endif
-
 #ifdef CONFIG_KVM_EXIT_TIMING
DEFINE(VCPU_TIMING_EXIT_TBU, offsetof(struct kvm_vcpu,
arch.timing_exit.tv32.tbu));
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index ee9e1ee..f20010b 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -55,6 +55,11 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
{ "dec",VCPU_STAT(dec_exits) },
{ "ext_intr",   VCPU_STAT(ext_intr_exits) },
{ "halt_wakeup", VCPU_STAT(halt_wakeup) },
+#ifdef CONFIG_SPE
+   { "spe_unavail", VCPU_STAT(spe_unavail) },
+   { "spe_fp_data", VCPU_STAT(spe_fp_data) },
+   { "spe_fp_round", VCPU_STAT(spe_fp_round) },
+#endif
{ NULL }
 };
 
@@ -80,11 +85,11 @@ void kvmppc_dump_vcpu(struct kvm_vcpu *vcpu)
 }
 
 #ifdef CONFIG_SPE
-void kvmppc_vcpu_disable_spe(struct kvm_vcpu *vcpu)
+static void kvmppc_vcpu_disable_spe(struct kvm_vcpu *vcpu)
 {
preempt_disable();
-   enable_kernel_spe();
-   kvmppc_save_guest_spe(vcpu);
+   if (current->thread.regs->msr & MSR_SPE)
+   giveup_spe(current);
vcpu->arch.shadow_msr &= ~MSR_SPE;
preempt_enable();
 }
@@ -92,8 +97,10 @@ void kvmppc_vcpu_disable_spe(struct kvm_vcpu *vcpu)
 static void kvmppc_vcpu_enable_spe(struct kvm_vcpu *vcpu)
 {
preempt_disable();
-   enable_kernel_spe();
-   kvmppc_load_guest_spe(vcpu);
+   if (!(current->thread.regs->msr & MSR_SPE)) {
+   load_up_spe(NULL);
+   current->thread.regs->msr |= MSR_SPE;
+   }
vcpu->arch.shadow_msr |= MSR_SPE;
preempt_enable();
 }
@@ -104,7 +111,7 @@ static void kvmppc_vcpu_sync_spe(struct kvm_vcpu *vcpu)
if (!(vcpu->arch.shadow_msr & MSR_SPE))
kvmppc_vcpu_enable_spe(vcpu);
} else if (vcpu->arch.shadow_msr & MSR_SPE) {
-   kvmppc_vcpu_disable_spe(vcpu);
+   vcpu->arch.shadow_msr &= ~MSR_SPE;
}
 }
 #else
@@ -124,7 +131,8 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)
vcpu->arch.shared->msr = new_msr;
 
kvmppc_mmu_msr_notify(vcpu, old_msr);
-   kvmppc_vcpu_sync_spe(vcpu);
+   if ((old_msr ^ new_msr) & MSR_SPE)
+   kvmppc_vcpu_sync_spe(vcpu);
 }
 
 static void kvmppc_booke_queue_irqprio(struc

[PATCH 1/2] powerpc/e500: make load_up_spe a normal fuction

2012-02-27 Thread Olivia Yin

So that we can call it in kernel.

Signed-off-by: Liu Yu 
---
 arch/powerpc/kernel/head_fsl_booke.S |   23 ++-
 1 files changed, 6 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/kernel/head_fsl_booke.S 
b/arch/powerpc/kernel/head_fsl_booke.S
index d5d78c4..c96e025 100644
--- a/arch/powerpc/kernel/head_fsl_booke.S
+++ b/arch/powerpc/kernel/head_fsl_booke.S
@@ -539,8 +539,10 @@ interrupt_base:
/* SPE Unavailable */
START_EXCEPTION(SPEUnavailable)
NORMAL_EXCEPTION_PROLOG
-   bne load_up_spe
-   addi    r3,r1,STACK_FRAME_OVERHEAD
+   beq 1f
+   bl  load_up_spe
+   b   fast_exception_return
+1: addi    r3,r1,STACK_FRAME_OVERHEAD
EXC_XFER_EE_LITE(0x2010, KernelSPE)
 #else
EXCEPTION(0x2020, SPEUnavailable, unknown_exception, EXC_XFER_EE)
@@ -743,7 +745,7 @@ tlb_write_entry:
 /* Note that the SPE support is closely modeled after the AltiVec
  * support.  Changes to one are likely to be applicable to the
  * other!  */
-load_up_spe:
+_GLOBAL(load_up_spe)
 /*
  * Disable SPE for the task which had SPE previously,
  * and save its SPE registers in its thread_struct.
@@ -791,20 +793,7 @@ load_up_spe:
subi    r4,r5,THREAD
stw r4,last_task_used_spe@l(r3)
 #endif /* !CONFIG_SMP */
-   /* restore registers and return */
-2: REST_4GPRS(3, r11)
-   lwz r10,_CCR(r11)
-   REST_GPR(1, r11)
-   mtcr    r10
-   lwz r10,_LINK(r11)
-   mtlr    r10
-   REST_GPR(10, r11)
-   mtspr   SPRN_SRR1,r9
-   mtspr   SPRN_SRR0,r12
-   REST_GPR(9, r11)
-   REST_GPR(12, r11)
-   lwz r11,GPR11(r11)
-   rfi
+   blr
 
 /*
  * SPE unavailable trap from kernel - print a message, but let
-- 
1.6.4



Re: AMD SVM specification

2012-02-27 Thread Andre Przywara


On 02/27/2012 11:09 AM, Prateek Sharma wrote:

Hello,
  I know this is not the right forum for this, but I am looking for
the recent specification/documentation of the AMD NPT and SVM
features. All I can find is the old Pacifica document dating back to
2005, and only the NPT whitepaper.


The technical SVM documentation is in the AMD64 Architecture 
Programmer's Manual (APM) Volume 2:

http://support.amd.com/us/Processor_TechDocs/24593_APM_v2.pdf

SVM is detailed in chapter 15, with appendix B & C containing the needed 
bits for the data structures. Other chapters in this document contain 
details about paging, though this is mostly not AMD specific.


Regards,
Andre.


The reason I seek this documentation is that I wish to modify some
KVM code. Specifically, I am interested in how the hardware
sets/resets the accessed/dirty bits of the guest/nested tables
[https://lkml.org/lkml/2011/6/22/20]

Any help will be appreciated.

Prateek



--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany


Re: [RFC/PATCH 1/2] kvm tools, seabios: Add "--bios" option to "vm run"

2012-02-27 Thread Gerd Hoffmann

  Hi,

> So looking at SeaBIOS code, it seems to me we could simply make LKVM
> lie to it by claiming to be coreboot and get away with it, no? We'd
> basically avoid all the PCI allocation passes and such.

I doubt this is going to fly if you want seabios to boot from a
virtio-blk-pci device ...

cheers,
  Gerd

Re: [RFC/PATCH 1/2] kvm tools, seabios: Add "--bios" option to "vm run"

2012-02-27 Thread Gerd Hoffmann

On 02/24/12 19:54, Pekka Enberg wrote:
> Hi,
> 
> I played around with the "--debug-ioport" command line option and was able 
> to cheat my way past SeaBIOS POST phase. Should SeaBIOS automatically pick 
> up virtio devices and attempt to boot them?

Yes, it can handle virtio-blk (and soon virtio-scsi too).

You probably want to implement the fw_cfg (firmware config) interface
which is used as communication path between qemu and seabios, to pass
around stuff like e820 table and option roms.

Also http://sgabios.googlecode.com/svn/trunk is nice when working with a
serial console.

HTH,
  Gerd


Re: [PATCH]qemu: deal with guest paniced event

2012-02-27 Thread Jan Kiszka

On 2012-02-27 04:05, Wen Congyang wrote:
> When the host knows the guest has panicked, it will set
> exit_reason to KVM_EXIT_GUEST_PANIC. So if qemu receives
> this exit_reason, we can send an event to tell the management
> application that the guest has panicked.
> 
> Signed-off-by: Wen Congyang 
> ---
>  kvm-all.c |3 +++
>  linux-headers/linux/kvm.h |1 +
>  monitor.c |3 +++
>  monitor.h |1 +
>  4 files changed, 8 insertions(+), 0 deletions(-)
> 
> diff --git a/kvm-all.c b/kvm-all.c
> index c4babda..ae428ab 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -1190,6 +1190,9 @@ int kvm_cpu_exec(CPUState *env)
>  (uint64_t)run->hw.hardware_exit_reason);
>  ret = -1;
>  break;
> +case KVM_EXIT_GUEST_PANIC:
> +monitor_protocol_event(QEVENT_GUEST_PANICED, NULL);
> +break;
>  case KVM_EXIT_INTERNAL_ERROR:
>  ret = kvm_handle_internal_error(env, run);
>  break;
> diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
> index f6b5343..45dd031 100644
> --- a/linux-headers/linux/kvm.h
> +++ b/linux-headers/linux/kvm.h
> @@ -163,6 +163,7 @@ struct kvm_pit_config {
>  #define KVM_EXIT_OSI  18
>  #define KVM_EXIT_PAPR_HCALL19
>  #define KVM_EXIT_S390_UCONTROL 20
> +#define KVM_EXIT_GUEST_PANIC   21
>  
>  /* For KVM_EXIT_INTERNAL_ERROR */
>  #define KVM_INTERNAL_ERROR_EMULATION 1

linux-headers are supposed to be synchronized in a separate patch,
naming the upstream or kvm.git hash they pull in. IOW: the KVM ABI
change has to be applied first.

Jan


