[PATCH] KVM: VMX: Inform user about INTEL_TXT dependency

2010-11-14 Thread Jan Kiszka
From: Jan Kiszka 

Without CONFIG_INTEL_TXT, the user must not enable this feature in the
BIOS. Otherwise, KVM will not work. Explain this dependency via a kernel
log message.

Signed-off-by: Jan Kiszka 
---
 arch/x86/kvm/vmx.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9367abc..ebafd57 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1306,8 +1306,13 @@ static __init int vmx_disabled_by_bios(void)
&& tboot_enabled())
return 1;
if (!(msr & FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX)
-   && !tboot_enabled())
+   && !tboot_enabled()) {
+#ifndef CONFIG_INTEL_TXT
+   printk(KERN_INFO "kvm: if TXT is enabled in the bios, "
+"kvm depends on CONFIG_INTEL_TXT\n");
+#endif
return 1;
+   }
}
 
return 0;
-- 
1.7.1
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] intel-iommu: Fix use after release during device attach

2010-11-14 Thread Jan Kiszka
Am 02.11.2010 08:31, Sheng Yang wrote:
> On Tuesday 02 November 2010 15:05:51 Jan Kiszka wrote:
>> From: Jan Kiszka 
>>
>> Obtail the new pgd pointer before releasing the page containing this
>> value.
>>
>> Signed-off-by: Jan Kiszka 
>> ---
>>
>> Who is taking care of this? The kvm tree?
>>
>>  drivers/pci/intel-iommu.c |2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
>> index 4789f8e..35463dd 100644
>> --- a/drivers/pci/intel-iommu.c
>> +++ b/drivers/pci/intel-iommu.c
>> @@ -3627,9 +3627,9 @@ static int intel_iommu_attach_device(struct
>> iommu_domain *domain,
>>
>>  pte = dmar_domain->pgd;
>>  if (dma_pte_present(pte)) {
>> -free_pgtable_page(dmar_domain->pgd);
>>  dmar_domain->pgd = (struct dma_pte *)
>>  phys_to_virt(dma_pte_addr(pte));
>> +free_pgtable_page(pte);
>>  }
>>  dmar_domain->agaw--;
>>  }
> 
> Reviewed-by: Sheng Yang 
> 
> CC iommu mailing list and David.

Ping...

I think this fix also qualifies for stable (.35 and .36).

Jan




signature.asc
Description: OpenPGP digital signature


Re: [PATCH] KVM: VMX: Inform user about INTEL_TXT dependency

2010-11-14 Thread Avi Kivity

On 11/14/2010 11:18 AM, Jan Kiszka wrote:

From: Jan Kiszka

Without CONFIG_INTEL_TXT, the user must not enable this feature in the
BIOS. Otherwise, KVM will not work. Explain this dependency via a kernel
log message.

Signed-off-by: Jan Kiszka
---
  arch/x86/kvm/vmx.c |7 ++-
  1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9367abc..ebafd57 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1306,8 +1306,13 @@ static __init int vmx_disabled_by_bios(void)
&&  tboot_enabled())
return 1;
if (!(msr&  FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX)
-   &&  !tboot_enabled())
+   &&  !tboot_enabled()) {
+#ifndef CONFIG_INTEL_TXT
+   printk(KERN_INFO "kvm: if TXT is enabled in the bios, "
+"kvm depends on CONFIG_INTEL_TXT\n");
+#endif
return 1;
+   }
}



Maybe reword to an instruction?

Something like

  kvm: TXT enabled in the bios.  Either disable TXT in the bios, or 
enable CONFIG_INTEL_TXT in your kernel.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify

2010-11-14 Thread Avi Kivity

On 11/12/2010 11:20 AM, Stefan Hajnoczi wrote:

>  Who guarantees that less common virtio-blk and virtio-net guest drivers
>  for non-Linux OSes are fine with it?  Maybe you should add a feature flag
>  that the guest has to ACK to enable it.

Virtio-blk and virtio-net are fine.  Both of those devices are
expected to operate asynchronously.  SeaBIOS and gPXE virtio-net
drivers spin but they expect to and it is okay in those environments.
They already burn CPU today.

Virtio-console expects synchronous virtqueue kick.  In Linux,
virtio_console.c __send_control_msg() and send_buf() will spin.  Qemu
userspace is able to complete those requests synchronously so that the
guest never actually burns CPU (e.g.
hw/virtio-serial-bus.c:send_control_msg()).  I don't want to burn CPU
in places where we previously didn't.


This is a horrible bug.  virtio is an asynchronous API.  Some hypervisor 
implementations cannot even provide synchronous notifications.



It's good that QEMU can decide whether or not to handle virtqueue kick
in the vcpu thread.  For high performance asynchronous devices like
virtio-net and virtio-blk it makes sense to use ioeventfd.  For others
it may not be useful.  I'm not sure a feature bit that exposes this
detail to the guest would be useful.


The guest should always assume that virtio devices are asynchronous.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify

2010-11-14 Thread Stefan Hajnoczi
On Sun, Nov 14, 2010 at 10:34 AM, Avi Kivity  wrote:
> On 11/12/2010 11:20 AM, Stefan Hajnoczi wrote:
>>
>> >  Who guarantees that less common virtio-blk and virtio-net guest drivers
>> >  for non-Linux OSes are fine with it?  Maybe you should add a feature
>> > flag
>> >  that the guest has to ACK to enable it.
>>
>> Virtio-blk and virtio-net are fine.  Both of those devices are
>> expected to operate asynchronously.  SeaBIOS and gPXE virtio-net
>> drivers spin but they expect to and it is okay in those environments.
>> They already burn CPU today.
>>
>> Virtio-console expects synchronous virtqueue kick.  In Linux,
>> virtio_console.c __send_control_msg() and send_buf() will spin.  Qemu
>> userspace is able to complete those requests synchronously so that the
>> guest never actually burns CPU (e.g.
>> hw/virtio-serial-bus.c:send_control_msg()).  I don't want to burn CPU
>> in places where we previously didn't.
>
> This is a horrible bug.  virtio is an asynchronous API.  Some hypervisor
> implementations cannot even provide synchronous notifications.
>
>> It's good that QEMU can decide whether or not to handle virtqueue kick
>> in the vcpu thread.  For high performance asynchronous devices like
>> virtio-net and virtio-blk it makes sense to use ioeventfd.  For others
>> it may not be useful.  I'm not sure a feature bit that exposes this
>> detail to the guest would be useful.
>
> The guest should always assume that virtio devices are asynchronous.

I agree, but let's enable virtio-ioeventfd carefully because bad code
is out there.

Stefan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KVM: VMX: Inform user about INTEL_TXT dependency

2010-11-14 Thread Jan Kiszka
Am 14.11.2010 11:30, Avi Kivity wrote:
> On 11/14/2010 11:18 AM, Jan Kiszka wrote:
>> From: Jan Kiszka
>>
>> Without CONFIG_INTEL_TXT, the user must not enable this feature in the
>> BIOS. Otherwise, KVM will not work. Explain this dependency via a kernel
>> log message.
>>
>> Signed-off-by: Jan Kiszka
>> ---
>>   arch/x86/kvm/vmx.c |7 ++-
>>   1 files changed, 6 insertions(+), 1 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 9367abc..ebafd57 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -1306,8 +1306,13 @@ static __init int vmx_disabled_by_bios(void)
>>   &&  tboot_enabled())
>>   return 1;
>>   if (!(msr&  FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX)
>> -&&  !tboot_enabled())
>> +&&  !tboot_enabled()) {
>> +#ifndef CONFIG_INTEL_TXT
>> +printk(KERN_INFO "kvm: if TXT is enabled in the bios, "
>> + "kvm depends on CONFIG_INTEL_TXT\n");
>> +#endif
>>   return 1;
>> +}
>>   }
>>
> 
> Maybe reword to an instruction?
> 
> Something like
> 
>   kvm: TXT enabled in the bios.  Either disable TXT in the bios, or
> enable CONFIG_INTEL_TXT in your kernel.
> 

I always get an aching head when thinking about these dependency: Does
FEATURE_CONTROL_LOCKED && !FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX
imply that the BIOS uses TXT? Or could it also mean that it just
disabled VT-x explicitly? As CONFIG_INTEL_TXT is off, we do not know if
tboot_enabled is off as well.

Jan



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v2 5/5] KVM: MMU: retry #PF for softmmu

2010-11-14 Thread Avi Kivity

On 11/12/2010 08:50 AM, Xiao Guangrong wrote:

Retry #PF for softmmu only when the current vcpu has the same
root shadow page as the time when #PF occurs. it means they
have same paging environment



The process could have been killed and replaced by another using the 
same cr3.  Or we may be running a guest that uses the same cr3 for all 
processes.  Or another thread may have mmap()ed something else over the 
same address.  So I think this is incorrect.


Meanwhile, I applied patches 1-4.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/4] KVM: MMU: don't drop spte if overwrite it from W to RO

2010-11-14 Thread Avi Kivity

On 11/12/2010 12:30 PM, Xiao Guangrong wrote:

We just need flush tlb if overwrite a writable spte with a read-only one



What are the advantages?  Avoid playing with rmap, and avoid a window 
where the spte is missing?


(they are worth the patch, just seeing if I'm not missing something)

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Fwd: [PATCH]: An implementation of HyperV KVP functionality

2010-11-14 Thread Dor Laor

FYI. Long ago we discussed key value approach on top of virtio-serial.

 Original Message 
Subject: [PATCH]: An implementation of HyperV  KVP  functionality
Date: Thu, 11 Nov 2010 13:03:10 -0700
From: Ky Srinivasan 
To: , 
CC: Haiyang Zhang , Greg KH 

I am enclosing a patch that implements the KVP (Key Value Pair) 
functionality for Linux guests on HyperV. This functionality allows 
Microsoft Management stack to query information from the guest. This 
functionality is implemented in two parts: (a) A kernel component that 
communicates with the host and (b) A user level daemon that implements 
data gathering. The attached patch (kvp.patch) implements the kernel 
component. I am also attaching the code for the user-level daemon 
(kvp_daemon.c)  for reference.


Regards,

K. Y


From: K. Y. Srinivasan 

Subject: An implementation of key/value pair feature (KVP) for Linux 
on HyperV.

Signed-off-by: K. Y. Srinivasan  

Index: linux.trees.git/drivers/staging/hv/kvp.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux.trees.git/drivers/staging/hv/kvp.c2010-11-11 13:45:17.0 
-0500
@@ -0,0 +1,404 @@
+/*
+ * An implementation of key value pair (KVP) functionality for Linux.
+ *
+ *
+ * Copyright (C) 2010, Novell, Inc.
+ * Author : K. Y. Srinivasan 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
+ * NON INFRINGEMENT.  See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ */
+
+
+#include 
+#include 
+#include 
+
+#include "logging.h"
+#include "osd.h"
+#include "vmbus.h"
+#include "vmbus_packet_format.h"
+#include "vmbus_channel_interface.h"
+#include "version_info.h"
+#include "channel.h"
+#include "vmbus_private.h"
+#include "vmbus_api.h"
+#include "utils.h"
+#include "kvp.h"
+
+
+/*
+ *
+ * The following definitions are shared with the user-mode component; do not
+ * change any of this without making the corresponding changes in
+ * the KVP user-mode component.
+ */
+
+#define CN_KVP_VAL 0x1 /* This supports queries from the kernel */
+#define CN_KVP_USER_VAL   0x2 /* This supports queries from the user */
+
+
+/*
+ * KVP protocol: The user mode component first registers with the
+ * the kernel component. Subsequently, the kernel component requests, data
+ * for the specified keys. In response to this message the user mode component
+ * fills in the value corresponding to the specified key. We overload the
+ * sequence field in the cn_msg header to define our KVP message types.
+ *
+ * XXXKYS: Have a shared header file between the user and kernel (TODO)
+ */
+
+enum kvp_op {
+   KVP_REGISTER = 0, /* Register the user mode component */
+   KVP_KERNEL_GET,/*Kernel is requesting the value for the specified key*/
+   KVP_KERNEL_SET, /*Kernel is providing the value for the specified key*/
+   KVP_USER_GET, /*User is requesting the value for the specified key*/
+   KVP_USER_SET /*User is providing the value for the specified key*/
+};
+
+
+
+#define KVP_KEY_SIZE512
+#define KVP_VALUE_SIZE  2048
+
+
+typedef struct kvp_msg {
+   __u32 kvp_key; /* Key */
+   __u8  kvp_value[0]; /* Corresponding value */
+} kvp_msg_t;
+
+/*
+ * End of shared definitions.
+ */
+
+/*
+ * Registry value types.
+ */
+
+#define REG_SZ 1
+
+/*
+ * Array of keys we support in Linux.
+ *
+ */
+#define KVP_MAX_KEY10
+#define KVP_LIC_VERSION 1
+
+
+static char *kvp_keys[KVP_MAX_KEY] = {"FullyQualifiedDomainName",
+   "IntegrationServicesVersion",
+   "NetworkAddressIPv4",
+   "NetworkAddressIPv6",
+   "OSBuildNumber",
+   "OSName",
+   "OSMajorVersion",
+   "OSMinorVersion",
+   "OSVersion",
+   "ProcessorArchitecture",
+   };
+
+/*
+ * Global state maintained for transaction that is being processed.
+ * Note that only one transaction can be active at any point in time.
+ *
+ * This state is set when we receive a request from the host; we
+ * cleanup this state when the transaction is completed - when we respond
+ * to the host with the key value.
+ */
+
+static u8 *recv_buffer; /* the receive buffer that we allocated */
+static int recv_len; /*

Re: [PATCH 3/4] KVM: MMU: notrap it if gpte's reserved is set

2010-11-14 Thread Avi Kivity

On 11/12/2010 12:34 PM, Xiao Guangrong wrote:

We can past the page fault to guest directly if gpte's reserved
is set



How can that work? shadow_notrap_nonpresent_pte causes a fault with 
PFEC.P=PFEC.RSVD=0, while we need PFEC.P=PFEC.RSVD=1.



Signed-off-by: Xiao Guangrong
---
  arch/x86/kvm/paging_tmpl.h |   10 ++
  1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 291342d..952357a 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -760,6 +760,7 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct 
kvm_mmu_page *sp,
pt_element_t gpte;
gpa_t pte_gpa;
gfn_t gfn;
+   bool gpte_invalid;

if (!is_shadow_present_pte(sp->spt[i]))
continue;
@@ -771,12 +772,13 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct 
kvm_mmu_page *sp,
return -EINVAL;

gfn = gpte_to_gfn(gpte);
-   if (is_rsvd_bits_set(&vcpu->arch.mmu, gpte, PT_PAGE_TABLE_LEVEL)
- || gfn != sp->gfns[i] || !is_present_gpte(gpte)
- || !(gpte&  PT_ACCESSED_MASK)) {
+   gpte_invalid = is_present_gpte(gpte) ||
+  is_rsvd_bits_set(&vcpu->arch.mmu, gpte, PT_PAGE_TABLE_LEVEL);


Shouldn't that be &&?


+   if (gpte_invalid || gfn != sp->gfns[i] ||
+ !(gpte&  PT_ACCESSED_MASK)) {
u64 nonpresent;

-   if (is_present_gpte(gpte) || !clear_unsync)
+   if (gpte_invalid || !clear_unsync)
nonpresent = shadow_trap_nonpresent_pte;
else
nonpresent = shadow_notrap_nonpresent_pte;



--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KVM: VMX: Inform user about INTEL_TXT dependency

2010-11-14 Thread Avi Kivity

On 11/14/2010 12:41 PM, Jan Kiszka wrote:

Am 14.11.2010 11:30, Avi Kivity wrote:
>  On 11/14/2010 11:18 AM, Jan Kiszka wrote:
>>  From: Jan Kiszka
>>
>>  Without CONFIG_INTEL_TXT, the user must not enable this feature in the
>>  BIOS. Otherwise, KVM will not work. Explain this dependency via a kernel
>>  log message.
>>
>>  Signed-off-by: Jan Kiszka
>>  ---
>>arch/x86/kvm/vmx.c |7 ++-
>>1 files changed, 6 insertions(+), 1 deletions(-)
>>
>>  diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>  index 9367abc..ebafd57 100644
>>  --- a/arch/x86/kvm/vmx.c
>>  +++ b/arch/x86/kvm/vmx.c
>>  @@ -1306,8 +1306,13 @@ static __init int vmx_disabled_by_bios(void)
>>&&   tboot_enabled())
>>return 1;
>>if (!(msr&   FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX)
>>  -&&   !tboot_enabled())
>>  +&&   !tboot_enabled()) {
>>  +#ifndef CONFIG_INTEL_TXT
>>  +printk(KERN_INFO "kvm: if TXT is enabled in the bios, "
>>  + "kvm depends on CONFIG_INTEL_TXT\n");
>>  +#endif
>>return 1;
>>  +}
>>}
>>
>
>  Maybe reword to an instruction?
>
>  Something like
>
>kvm: TXT enabled in the bios.  Either disable TXT in the bios, or
>  enable CONFIG_INTEL_TXT in your kernel.
>

I always get an aching head when thinking about these dependency: Does
FEATURE_CONTROL_LOCKED&&  !FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX
imply that the BIOS uses TXT? Or could it also mean that it just
disabled VT-x explicitly?


Probably the latter, at least that's what we took it to mean before it 
was renamed to that long string.



As CONFIG_INTEL_TXT is off, we do not know if
tboot_enabled is off as well.


I guess, if FEATURE_CONTROL_VMXON_ENABLED_INSIDER_SMX_YADA_YADA_YADA is 
set, then the bios wants us to enable TXT.  But if both bits are clear, 
the bios really doesn't want us to play with vmx.  But it would be good 
to get Intel guidance before we pass our confusion on to users.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify

2010-11-14 Thread Avi Kivity

On 11/14/2010 12:37 PM, Stefan Hajnoczi wrote:

On Sun, Nov 14, 2010 at 10:34 AM, Avi Kivity  wrote:
>  On 11/12/2010 11:20 AM, Stefan Hajnoczi wrote:
>>
>>  >Who guarantees that less common virtio-blk and virtio-net guest drivers
>>  >for non-Linux OSes are fine with it?  Maybe you should add a feature
>>  >  flag
>>  >that the guest has to ACK to enable it.
>>
>>  Virtio-blk and virtio-net are fine.  Both of those devices are
>>  expected to operate asynchronously.  SeaBIOS and gPXE virtio-net
>>  drivers spin but they expect to and it is okay in those environments.
>>  They already burn CPU today.
>>
>>  Virtio-console expects synchronous virtqueue kick.  In Linux,
>>  virtio_console.c __send_control_msg() and send_buf() will spin.  Qemu
>>  userspace is able to complete those requests synchronously so that the
>>  guest never actually burns CPU (e.g.
>>  hw/virtio-serial-bus.c:send_control_msg()).  I don't want to burn CPU
>>  in places where we previously didn't.
>
>  This is a horrible bug.  virtio is an asynchronous API.  Some hypervisor
>  implementations cannot even provide synchronous notifications.
>
>>  It's good that QEMU can decide whether or not to handle virtqueue kick
>>  in the vcpu thread.  For high performance asynchronous devices like
>>  virtio-net and virtio-blk it makes sense to use ioeventfd.  For others
>>  it may not be useful.  I'm not sure a feature bit that exposes this
>>  detail to the guest would be useful.
>
>  The guest should always assume that virtio devices are asynchronous.

I agree, but let's enable virtio-ioeventfd carefully because bad code
is out there.


Sure.  Note as long as the thread waiting on ioeventfd doesn't consume 
too much cpu, it will awaken quickly and we won't have the "transaction 
per timeslice" effect.


btw, what about virtio-blk with linux-aio?  Have you benchmarked that 
with and without ioeventfd?


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -v2] Monitor command: pfa2hva, translate guest physical address to host virtual address

2010-11-14 Thread Avi Kivity

On 11/12/2010 03:16 AM, Huang Ying wrote:

On Thu, 2010-11-11 at 17:39 +0800, Avi Kivity wrote:
>  On 11/11/2010 02:56 AM, Huang Ying wrote:
>  >  On Thu, 2010-11-11 at 00:49 +0800, Anthony Liguori wrote:
>  >  >   On 11/10/2010 02:34 AM, Avi Kivity wrote:
>  >  >   >>   Why the gpa ->hva mapping is not
>  >  >   >>   consistent for RAM if -mempath is not used?
>  >  >   >
>  >  >   >   Video RAM in the range a-b and PCI mapped RAM can change 
gpas
>  >  >   >   (while remaining in the same hva).
>  >  >   >
>  >  >   >   Even for ordinary RAM, which doesn't normally change gpa/hva, I'm 
not
>  >  >   >   sure we want to guarantee that it won't.
>  >  >
>  >  >   We can't universally either.  Memory hot remove potentially breaks the
>  >  >   mapping and some non-x86 architectures (like ARM) can alias RAM via a
>  >  >   guest triggered mechanism.
>  >
>  >  Thanks for clarification. Now I think we have two options.
>  >
>  >  1) Clearly mark gpa2hva (pfa2hva now, should renamed to gpa2hva) as a
>  >  testing only interface, and should be used only on restricted
>  >  environment (normal memory, without hot-remove, maybe only on x86).
>  >
>  >  2) Find some way to lock the gpa ->   hva mapping during operating. Such
>  >  as gpa2hva_begin and gpa2hva_end and lock the mapping in between.
>  >
>  >  I think 2) may be possible. But if there are no other users, why do that
>  >  for a test case? So I think 1) may be the better option.
>  >
>
>  3) Move the poisoning code into qemu, so the command becomes
>
>  posison-address
>
>  (though physical addresses aren't stable either)

Poisoning includes:

a) gva ->  gpa
b) gpa ->  hva
c) hva ->  hpa
d) inject the MCE into host via some external tool

poison-address need to do b, c, d. Do you think it is good to do all
these inside qemu?


If d is not too complicated (an ioctl?), we might to it in qemu.


>  4) Use -mempath on /dev/shm and poison a page in the backing file

If we can poison a page in the backing file, how do we know the
corresponding gpa and hpa?


I think you currently can't know it's gpa (why do you want to?); the 
upcoming NUMA enhancements should allow this (the plan is to have a file 
per guest NUMA node, so we can tell the host what policy to apply for 
that node).


gpa->hpa translation can be derived from /proc/pid/maps.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -v2] Monitor command: pfa2hva, translate guest physical address to host virtual address

2010-11-14 Thread Avi Kivity

On 11/14/2010 01:08 PM, Avi Kivity wrote:



>  4) Use -mempath on /dev/shm and poison a page in the backing file

If we can poison a page in the backing file, how do we know the
corresponding gpa and hpa?


I think you currently can't know it's gpa (why do you want to?); the 
upcoming NUMA enhancements should allow this (the plan is to have a 
file per guest NUMA node, so we can tell the host what policy to apply 
for that node).


gpa->hpa translation can be derived from /proc/pid/maps.



That would be gpa->hva.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify

2010-11-14 Thread Avi Kivity

On 11/14/2010 01:05 PM, Avi Kivity wrote:

I agree, but let's enable virtio-ioeventfd carefully because bad code
is out there.



Sure.  Note as long as the thread waiting on ioeventfd doesn't consume 
too much cpu, it will awaken quickly and we won't have the 
"transaction per timeslice" effect.


btw, what about virtio-blk with linux-aio?  Have you benchmarked that 
with and without ioeventfd?




And, what about efficiency?  As in bits/cycle?

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PCI passthrough on Sony Vaio F11 laptop...

2010-11-14 Thread Erik Brakkee

Jan Kiszka wrote:


Strange, should work. I would suggest to post your full kernel log,
maybe there is some enlightening message hidden.

I don't think it is a problem of your kernel version, but I'm able to
pass through devices on OpenSUSE 11.3 with
kernel-desktop-2.6.36-90.1.x86_64 from their kernel repository.

Jan

   
Exactly what server logs do you need. Is this only /var/log/messages or 
more? And do I need to set specific options there?

Any other log files that you need?

Before, generating these logs I will upgrade to a later kernel. As far 
as I can tell, that will still be a 2.6.34 kernel. Perhaps I should try 
the 2.6.36 kernel as well. Do you have the URL for the kernel repository 
I should use? (cannot find an obvious kernel repository in YAST2).


Cheers
  Erik

--

Nonsense and other useful things: http://brakkee.org
MountainHoppers: http://mountainhoppers.nl
Track Detective: http://trackdetective.com
Twitter: http://twitter.com/ErikBrakkee



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][RFC] qemu:virtio-net: Use TUNSETTXFILTER for MAC filteringlogin register about

2010-11-14 Thread Dragos Tatulea
I'd like to pick up this patch and rework it based on your
suggestions. Need it for [1]. Most opinions seem to be pointing to a
generic software mac filtering in the VLAN layer with hw support if
necessary. Here's what I had in mind:
- For the generic mac filtering we can use something similar to
VirtIONet's mac_table.
- Make receive_filter a callback in NetClientInfo. NIC's that don't
support filtering can use a generic_receive_filter based on the mac
table (but should be turned off by default).
- For hw filtering support I'm not sure how to validate the
"interesting" setup with a tap/macvtap and virtio-net on top of it.
- Of course, this needs to be documented as "best effort" filtering.
Just like the virtio-net mac table.

Please give me your feedback. Thanks.

[1] - http://www.linux-kvm.org/page/GuestProgrammableMacVlanFiltering

-- Dragos
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] KVM test: Add Fedora 14 to the list of guests

2010-11-14 Thread Lucas Meneghel Rodrigues
As usual, make it the default guest on the default
kvm autotest run.

Signed-off-by: Lucas Meneghel Rodrigues 
---
 client/tests/kvm/get_started.py|6 +++---
 client/tests/kvm/tests.cfg.sample  |   14 +++---
 client/tests/kvm/tests_base.cfg.sample |   26 --
 3 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/client/tests/kvm/get_started.py b/client/tests/kvm/get_started.py
index 6fa6b5f..353cfbe 100755
--- a/client/tests/kvm/get_started.py
+++ b/client/tests/kvm/get_started.py
@@ -82,11 +82,11 @@ if __name__ == "__main__":
 logging.info("3 - Verifying iso (make sure we have the OS ISO needed for "
  "the default test set)")
 
-iso_name = "Fedora-13-x86_64-DVD.iso"
-fedora_dir = "pub/fedora/linux/releases/13/Fedora/x86_64/iso"
+iso_name = "Fedora-14-x86_64-DVD.iso"
+fedora_dir = "pub/fedora/linux/releases/14/Fedora/x86_64/iso"
 url = os.path.join("http://download.fedoraproject.org/";, fedora_dir,
iso_name)
-hash = "65c7f1aad3feb888ae3daadaf45d4a2a32b8773a"
+hash = "381d7336c6d1685cbb4eae49cdef2247"
 destination = os.path.join(base_dir, 'isos', 'linux')
 check_iso(url, destination, hash)
 
diff --git a/client/tests/kvm/tests.cfg.sample 
b/client/tests/kvm/tests.cfg.sample
index ce3e307..a564510 100644
--- a/client/tests/kvm/tests.cfg.sample
+++ b/client/tests/kvm/tests.cfg.sample
@@ -43,8 +43,8 @@ variants:
 # Subtest choice. You can modify that line to add more subtests
 only unattended_install.cdrom boot shutdown
 
-# Runs qemu, f13 64 bit guest OS, install, boot, shutdown
-- @qemu_f13_quick:
+# Runs qemu, f14 64 bit guest OS, install, boot, shutdown
+- @qemu_f14_quick:
 # We want qemu for this run
 qemu_binary = /usr/bin/qemu
 only qcow2
@@ -55,13 +55,13 @@ variants:
 only up
 only no_pci_assignable
 only smallpages
-only Fedora.13.64
+only Fedora.14.64
 only unattended_install.cdrom boot shutdown
 # qemu needs -enable-kvm on the cmdline
 extra_params += ' -enable-kvm'
 
-# Runs qemu-kvm, f13 64 bit guest OS, install, boot, shutdown
-- @qemu_kvm_f13_quick:
+# Runs qemu-kvm, f14 64 bit guest OS, install, boot, shutdown
+- @qemu_kvm_f14_quick:
 # We want qemu-kvm for this run
 qemu_binary = /usr/bin/qemu-kvm
 only qcow2
@@ -70,7 +70,7 @@ variants:
 only smp2
 only no_pci_assignable
 only smallpages
-only Fedora.13.64
+only Fedora.14.64
 only unattended_install.cdrom boot shutdown
 
 # You may provide information about the DTM server for WHQL tests here:
@@ -87,4 +87,4 @@ variants:
 #kill_unresponsive_vms.* ?= no
 
 # Choose your test list from the testsets defined
-only qemu_kvm_f13_quick
+only qemu_kvm_f14_quick
diff --git a/client/tests/kvm/tests_base.cfg.sample 
b/client/tests/kvm/tests_base.cfg.sample
index f783f6b..2ae7f78 100644
--- a/client/tests/kvm/tests_base.cfg.sample
+++ b/client/tests/kvm/tests_base.cfg.sample
@@ -930,6 +930,28 @@ variants:
 #floppy = images/f13-64/ks.vfd
 cdrom_unattended = images/f13-64/ks.iso
 
+- 14.32:
+image_name = f14-32
+cdrom_cd1 = linux/Fedora-14-i386-DVD.iso
+md5sum = 1cc67641506d2f931d669b8d3528dded
+md5sum_1m = d314ab126dabab686111e6a0d71d2e67
+unattended_install.cdrom:
+unattended_file = unattended/Fedora-14.ks
+tftp = images/f14-32/tftpboot
+#floppy = images/f14-32/ks.vfd
+cdrom_unattended = images/f14-32/ks.iso
+
+- 14.64:
+image_name = f14-64
+cdrom_cd1 = linux/Fedora-14-x86_64-DVD.iso
+md5sum = f2ebf941dc45f99ee3e8a457c9544552
+md5sum_1m = df029f9cffbc3517937a91124a1e0c3a
+unattended_install.cdrom:
+unattended_file = unattended/Fedora-14.ks
+tftp = images/f14-64/tftpboot
+#floppy = images/f14-64/ks.vfd
+cdrom_unattended = images/f14-64/ks.iso
+
 
 - DSL-4.2.5:
 no setup dbench bonnie linux_s3
@@ -1957,8 +1979,8 @@ variants:
 
 
 virtio_net|virtio_blk|e1000|balloon_check:
-only Fedora.11 Fedora.12 Fedora.13 RHEL.5 RHEL.6 OpenSUSE.11 SLES.11 
Ubuntu-8.10-server
-# only WinXP Win2003 Win2008 WinVista Win7 Fedora.11 Fedora.12 Fedora.13 
RHEL.5 OpenSUSE.11 SLES.11 Ubuntu-8.10-server
+only Fedora.11 Fedora.12 Fedora.13 Fedora.14 RHEL.5 RHEL.6 OpenSUSE.11 
SLES.11 Ubuntu-8.10-server
+# only WinXP Win2003 Win2008 WinVista Win7 Fedora

[PATCHv4 04/15] Add get_fw_dev_path callback to ISA bus in qdev.

2010-11-14 Thread Gleb Natapov
Use device ioports to create unique device path.

Signed-off-by: Gleb Natapov 
---
 hw/isa-bus.c |   16 
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/hw/isa-bus.c b/hw/isa-bus.c
index c0ac7e9..c423c1b 100644
--- a/hw/isa-bus.c
+++ b/hw/isa-bus.c
@@ -31,11 +31,13 @@ static ISABus *isabus;
 target_phys_addr_t isa_mem_base = 0;
 
 static void isabus_dev_print(Monitor *mon, DeviceState *dev, int indent);
+static char *isabus_get_fw_dev_path(DeviceState *dev);
 
 static struct BusInfo isa_bus_info = {
 .name  = "ISA",
 .size  = sizeof(ISABus),
 .print_dev = isabus_dev_print,
+.get_fw_dev_path = isabus_get_fw_dev_path,
 };
 
 ISABus *isa_bus_new(DeviceState *dev)
@@ -188,4 +190,18 @@ static void isabus_register_devices(void)
 sysbus_register_withprop(&isabus_bridge_info);
 }
 
+static char *isabus_get_fw_dev_path(DeviceState *dev)
+{
+ISADevice *d = (ISADevice*)dev;
+char path[40];
+int off;
+
+off = snprintf(path, sizeof(path), "%s", qdev_fw_name(dev));
+if (d->nioports) {
+snprintf(path + off, sizeof(path) - off, "@%04x", d->ioports[0]);
+}
+
+return strdup(path);
+}
+
 device_init(isabus_register_devices)
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv4 02/15] Introduce new BusInfo callback get_fw_dev_path.

2010-11-14 Thread Gleb Natapov
New get_fw_dev_path callback will be used for build device path usable
by firmware in contrast to qdev qemu internal device path.

Signed-off-by: Gleb Natapov 
---
 hw/qdev.h |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/hw/qdev.h b/hw/qdev.h
index 9f90efe..dc669b3 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -49,12 +49,14 @@ struct DeviceState {
 
 typedef void (*bus_dev_printfn)(Monitor *mon, DeviceState *dev, int indent);
 typedef char *(*bus_get_dev_path)(DeviceState *dev);
+typedef char *(*bus_get_fw_dev_path)(DeviceState *dev);
 
 struct BusInfo {
 const char *name;
 size_t size;
 bus_dev_printfn print_dev;
 bus_get_dev_path get_dev_path;
+bus_get_fw_dev_path get_fw_dev_path;
 Property *props;
 };
 
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv4 01/15] Introduce fw_name field to DeviceInfo structure.

2010-11-14 Thread Gleb Natapov
Add "fw_name" to DeviceInfo to use in device path building. In
contrast to "name" "fw_name" should refer to functionality device
provides instead of particular device model like "name" does.

Signed-off-by: Gleb Natapov 
---
 hw/fdc.c|1 +
 hw/ide/isa.c|1 +
 hw/ide/qdev.c   |1 +
 hw/isa-bus.c|1 +
 hw/lance.c  |1 +
 hw/piix_pci.c   |1 +
 hw/qdev.h   |6 ++
 hw/usb-hub.c|1 +
 hw/usb-net.c|1 +
 hw/virtio-pci.c |1 +
 10 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/hw/fdc.c b/hw/fdc.c
index c159dcb..a467c4b 100644
--- a/hw/fdc.c
+++ b/hw/fdc.c
@@ -2040,6 +2040,7 @@ static const VMStateDescription vmstate_isa_fdc ={
 static ISADeviceInfo isa_fdc_info = {
 .init = isabus_fdc_init1,
 .qdev.name  = "isa-fdc",
+.qdev.fw_name  = "fdc",
 .qdev.size  = sizeof(FDCtrlISABus),
 .qdev.no_user = 1,
 .qdev.vmsd  = &vmstate_isa_fdc,
diff --git a/hw/ide/isa.c b/hw/ide/isa.c
index 6b57e0d..9856435 100644
--- a/hw/ide/isa.c
+++ b/hw/ide/isa.c
@@ -98,6 +98,7 @@ ISADevice *isa_ide_init(int iobase, int iobase2, int isairq,
 
 static ISADeviceInfo isa_ide_info = {
 .qdev.name  = "isa-ide",
+.qdev.fw_name  = "ide",
 .qdev.size  = sizeof(ISAIDEState),
 .init   = isa_ide_initfn,
 .qdev.reset = isa_ide_reset,
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 0808760..6d27b60 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -134,6 +134,7 @@ static int ide_drive_initfn(IDEDevice *dev)
 
 static IDEDeviceInfo ide_drive_info = {
 .qdev.name  = "ide-drive",
+.qdev.fw_name  = "drive",
 .qdev.size  = sizeof(IDEDrive),
 .init   = ide_drive_initfn,
 .qdev.props = (Property[]) {
diff --git a/hw/isa-bus.c b/hw/isa-bus.c
index 4e306de..26036e0 100644
--- a/hw/isa-bus.c
+++ b/hw/isa-bus.c
@@ -153,6 +153,7 @@ static int isabus_bridge_init(SysBusDevice *dev)
 static SysBusDeviceInfo isabus_bridge_info = {
 .init = isabus_bridge_init,
 .qdev.name  = "isabus-bridge",
+.qdev.fw_name  = "isa",
 .qdev.size  = sizeof(SysBusDevice),
 .qdev.no_user = 1,
 };
diff --git a/hw/lance.c b/hw/lance.c
index dc12144..1a3bb1a 100644
--- a/hw/lance.c
+++ b/hw/lance.c
@@ -141,6 +141,7 @@ static void lance_reset(DeviceState *dev)
 static SysBusDeviceInfo lance_info = {
 .init   = lance_init,
 .qdev.name  = "lance",
+.qdev.fw_name  = "ethernet",
 .qdev.size  = sizeof(SysBusPCNetState),
 .qdev.reset = lance_reset,
 .qdev.vmsd  = &vmstate_lance,
diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index b5589b9..38f9d9e 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -365,6 +365,7 @@ static PCIDeviceInfo i440fx_info[] = {
 static SysBusDeviceInfo i440fx_pcihost_info = {
 .init = i440fx_pcihost_initfn,
 .qdev.name= "i440FX-pcihost",
+.qdev.fw_name = "pci",
 .qdev.size= sizeof(I440FXState),
 .qdev.no_user = 1,
 };
diff --git a/hw/qdev.h b/hw/qdev.h
index 579328a..9f90efe 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -139,6 +139,7 @@ typedef void (*qdev_resetfn)(DeviceState *dev);
 
 struct DeviceInfo {
 const char *name;
+const char *fw_name;
 const char *alias;
 const char *desc;
 size_t size;
@@ -288,6 +289,11 @@ void qdev_prop_set_defaults(DeviceState *dev, Property 
*props);
 void qdev_prop_register_global_list(GlobalProperty *props);
 void qdev_prop_set_globals(DeviceState *dev);
 
+static inline const char *qdev_fw_name(DeviceState *dev)
+{
+return dev->info->fw_name ? : dev->info->alias ? : dev->info->name;
+}
+
 /* This is a nasty hack to allow passing a NULL bus to qdev_create.  */
 extern struct BusInfo system_bus_info;
 
diff --git a/hw/usb-hub.c b/hw/usb-hub.c
index 2a1edfc..8e3a96b 100644
--- a/hw/usb-hub.c
+++ b/hw/usb-hub.c
@@ -545,6 +545,7 @@ static int usb_hub_initfn(USBDevice *dev)
 static struct USBDeviceInfo hub_info = {
 .product_desc   = "QEMU USB Hub",
 .qdev.name  = "usb-hub",
+.qdev.fw_name= "hub",
 .qdev.size  = sizeof(USBHubState),
 .init   = usb_hub_initfn,
 .handle_packet  = usb_hub_handle_packet,
diff --git a/hw/usb-net.c b/hw/usb-net.c
index 70f9263..2287ee1 100644
--- a/hw/usb-net.c
+++ b/hw/usb-net.c
@@ -1496,6 +1496,7 @@ static USBDevice *usb_net_init(const char *cmdline)
 static struct USBDeviceInfo net_info = {
 .product_desc   = "QEMU USB Network Interface",
 .qdev.name  = "usb-net",
+.qdev.fw_name= "network",
 .qdev.size  = sizeof(USBNetState),
 .init   = usb_net_initfn,
 .handle_packet  = usb_generic_handle_packet,
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 729917d..be2c92f 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -697,6 +697,7 @@ static int virtio_9p_init_pci(PCIDevice *pci_dev)
 static PCIDeviceInfo virtio_info[] = {
 {
 .qdev.name = "virtio-blk-pci",
+.qdev.alias = "virtio-blk",
 .qdev.size = sizeof(VirtIOPCIProxy),
 .init  = virtio_

[PATCHv4 05/15] Store IDE bus id in IDEBus structure for easy access.

2010-11-14 Thread Gleb Natapov

Signed-off-by: Gleb Natapov 
---
 hw/ide/cmd646.c   |4 ++--
 hw/ide/internal.h |3 ++-
 hw/ide/isa.c  |2 +-
 hw/ide/piix.c |4 ++--
 hw/ide/qdev.c |3 ++-
 hw/ide/via.c  |4 ++--
 6 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/hw/ide/cmd646.c b/hw/ide/cmd646.c
index ff80dd5..b2cbdbc 100644
--- a/hw/ide/cmd646.c
+++ b/hw/ide/cmd646.c
@@ -257,8 +257,8 @@ static int pci_cmd646_ide_initfn(PCIDevice *dev)
 pci_conf[PCI_INTERRUPT_PIN] = 0x01; // interrupt on pin 1
 
 irq = qemu_allocate_irqs(cmd646_set_irq, d, 2);
-ide_bus_new(&d->bus[0], &d->dev.qdev);
-ide_bus_new(&d->bus[1], &d->dev.qdev);
+ide_bus_new(&d->bus[0], &d->dev.qdev, 0);
+ide_bus_new(&d->bus[1], &d->dev.qdev, 1);
 ide_init2(&d->bus[0], irq[0]);
 ide_init2(&d->bus[1], irq[1]);
 
diff --git a/hw/ide/internal.h b/hw/ide/internal.h
index d652e06..c0a1abc 100644
--- a/hw/ide/internal.h
+++ b/hw/ide/internal.h
@@ -448,6 +448,7 @@ struct IDEBus {
 IDEDevice *slave;
 BMDMAState *bmdma;
 IDEState ifs[2];
+int bus_id;
 uint8_t unit;
 uint8_t cmd;
 qemu_irq irq;
@@ -565,7 +566,7 @@ void ide_init2_with_non_qdev_drives(IDEBus *bus, DriveInfo 
*hd0,
 void ide_init_ioport(IDEBus *bus, int iobase, int iobase2);
 
 /* hw/ide/qdev.c */
-void ide_bus_new(IDEBus *idebus, DeviceState *dev);
+void ide_bus_new(IDEBus *idebus, DeviceState *dev, int bus_id);
 IDEDevice *ide_create_drive(IDEBus *bus, int unit, DriveInfo *drive);
 
 #endif /* HW_IDE_INTERNAL_H */
diff --git a/hw/ide/isa.c b/hw/ide/isa.c
index 4206afd..8c59c5a 100644
--- a/hw/ide/isa.c
+++ b/hw/ide/isa.c
@@ -67,7 +67,7 @@ static int isa_ide_initfn(ISADevice *dev)
 {
 ISAIDEState *s = DO_UPCAST(ISAIDEState, dev, dev);
 
-ide_bus_new(&s->bus, &s->dev.qdev);
+ide_bus_new(&s->bus, &s->dev.qdev, 0);
 ide_init_ioport(&s->bus, s->iobase, s->iobase2);
 isa_init_irq(dev, &s->irq, s->isairq);
 isa_init_ioport_range(dev, s->iobase, 8);
diff --git a/hw/ide/piix.c b/hw/ide/piix.c
index 07483e8..d0b04a3 100644
--- a/hw/ide/piix.c
+++ b/hw/ide/piix.c
@@ -129,8 +129,8 @@ static int pci_piix_ide_initfn(PCIIDEState *d)
 
 vmstate_register(&d->dev.qdev, 0, &vmstate_ide_pci, d);
 
-ide_bus_new(&d->bus[0], &d->dev.qdev);
-ide_bus_new(&d->bus[1], &d->dev.qdev);
+ide_bus_new(&d->bus[0], &d->dev.qdev, 0);
+ide_bus_new(&d->bus[1], &d->dev.qdev, 1);
 ide_init_ioport(&d->bus[0], 0x1f0, 0x3f6);
 ide_init_ioport(&d->bus[1], 0x170, 0x376);
 
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 6d27b60..88ff657 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -29,9 +29,10 @@ static struct BusInfo ide_bus_info = {
 .size  = sizeof(IDEBus),
 };
 
-void ide_bus_new(IDEBus *idebus, DeviceState *dev)
+void ide_bus_new(IDEBus *idebus, DeviceState *dev, int bus_id)
 {
 qbus_create_inplace(&idebus->qbus, &ide_bus_info, dev, NULL);
+idebus->bus_id = bus_id;
 }
 
 static int ide_qdev_init(DeviceState *qdev, DeviceInfo *base)
diff --git a/hw/ide/via.c b/hw/ide/via.c
index b2c7cad..cc48b2b 100644
--- a/hw/ide/via.c
+++ b/hw/ide/via.c
@@ -158,8 +158,8 @@ static int vt82c686b_ide_initfn(PCIDevice *dev)
 
 vmstate_register(&dev->qdev, 0, &vmstate_ide_pci, d);
 
-ide_bus_new(&d->bus[0], &d->dev.qdev);
-ide_bus_new(&d->bus[1], &d->dev.qdev);
+ide_bus_new(&d->bus[0], &d->dev.qdev, 0);
+ide_bus_new(&d->bus[1], &d->dev.qdev, 1);
 ide_init2(&d->bus[0], isa_reserve_irq(14));
 ide_init2(&d->bus[1], isa_reserve_irq(15));
 ide_init_ioport(&d->bus[0], 0x1f0, 0x3f6);
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv4 11/15] Add bootindex parameter to net/block/fd device

2010-11-14 Thread Gleb Natapov
If bootindex is specified on command line a string that describes device
in firmware readable way is added into sorted list. Later this list will
be passed into firmware to control boot order.

Signed-off-by: Gleb Natapov 
---
 block_int.h |4 +++-
 hw/e1000.c  |4 
 hw/eepro100.c   |3 +++
 hw/fdc.c|8 
 hw/ide/qdev.c   |5 +
 hw/ne2000.c |3 +++
 hw/pcnet.c  |4 
 hw/qdev.c   |   32 
 hw/qdev.h   |1 +
 hw/rtl8139.c|4 
 hw/usb-net.c|2 ++
 hw/virtio-blk.c |2 ++
 hw/virtio-net.c |2 ++
 net.h   |4 +++-
 sysemu.h|2 ++
 vl.c|   40 
 16 files changed, 118 insertions(+), 2 deletions(-)

diff --git a/block_int.h b/block_int.h
index 3c3adb5..0a0e47d 100644
--- a/block_int.h
+++ b/block_int.h
@@ -227,6 +227,7 @@ typedef struct BlockConf {
 uint16_t logical_block_size;
 uint16_t min_io_size;
 uint32_t opt_io_size;
+int32_t bootindex;
 } BlockConf;
 
 static inline unsigned int get_physical_block_exp(BlockConf *conf)
@@ -249,6 +250,7 @@ static inline unsigned int get_physical_block_exp(BlockConf 
*conf)
 DEFINE_PROP_UINT16("physical_block_size", _state,   \
_conf.physical_block_size, 512), \
 DEFINE_PROP_UINT16("min_io_size", _state, _conf.min_io_size, 0),  \
-DEFINE_PROP_UINT32("opt_io_size", _state, _conf.opt_io_size, 0)
+DEFINE_PROP_UINT32("opt_io_size", _state, _conf.opt_io_size, 0),\
+DEFINE_PROP_INT32("bootindex", _state, _conf.bootindex, -1) \
 
 #endif /* BLOCK_INT_H */
diff --git a/hw/e1000.c b/hw/e1000.c
index 532efdc..053f33e 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -30,6 +30,7 @@
 #include "net.h"
 #include "net/checksum.h"
 #include "loader.h"
+#include "sysemu.h"
 
 #include "e1000_hw.h"
 
@@ -1148,6 +1149,9 @@ static int pci_e1000_init(PCIDevice *pci_dev)
   d->dev.qdev.info->name, d->dev.qdev.id, d);
 
 qemu_format_nic_info_str(&d->nic->nc, macaddr);
+
+add_boot_device_path(d->conf.bootindex, &pci_dev->qdev, "ethernet-...@0");
+
 return 0;
 }
 
diff --git a/hw/eepro100.c b/hw/eepro100.c
index 41d792a..80adac6 100644
--- a/hw/eepro100.c
+++ b/hw/eepro100.c
@@ -46,6 +46,7 @@
 #include "pci.h"
 #include "net.h"
 #include "eeprom93xx.h"
+#include "sysemu.h"
 
 #define KiB 1024
 
@@ -1907,6 +1908,8 @@ static int e100_nic_init(PCIDevice *pci_dev)
 s->vmstate->name = s->nic->nc.model;
 vmstate_register(&pci_dev->qdev, -1, s->vmstate, s);
 
+add_boot_device_path(s->conf.bootindex, &pci_dev->qdev, "ethernet-...@0");
+
 return 0;
 }
 
diff --git a/hw/fdc.c b/hw/fdc.c
index 5ab754b..7b1349f 100644
--- a/hw/fdc.c
+++ b/hw/fdc.c
@@ -35,6 +35,7 @@
 #include "sysbus.h"
 #include "qdev-addr.h"
 #include "blockdev.h"
+#include "sysemu.h"
 
 //
 /* debug Floppy devices */
@@ -523,6 +524,8 @@ typedef struct FDCtrlSysBus {
 typedef struct FDCtrlISABus {
 ISADevice busdev;
 struct FDCtrl state;
+int32_t bootindexA;
+int32_t bootindexB;
 } FDCtrlISABus;
 
 static uint32_t fdctrl_read (void *opaque, uint32_t reg)
@@ -1992,6 +1995,9 @@ static int isabus_fdc_init1(ISADevice *dev)
 qdev_set_legacy_instance_id(&dev->qdev, iobase, 2);
 ret = fdctrl_init_common(fdctrl);
 
+add_boot_device_path(isa->bootindexA, &dev->qdev, "flo...@0");
+add_boot_device_path(isa->bootindexB, &dev->qdev, "flo...@1");
+
 return ret;
 }
 
@@ -2051,6 +2057,8 @@ static ISADeviceInfo isa_fdc_info = {
 .qdev.props = (Property[]) {
 DEFINE_PROP_DRIVE("driveA", FDCtrlISABus, state.drives[0].bs),
 DEFINE_PROP_DRIVE("driveB", FDCtrlISABus, state.drives[1].bs),
+DEFINE_PROP_INT32("bootindexA", FDCtrlISABus, bootindexA, -1),
+DEFINE_PROP_INT32("bootindexB", FDCtrlISABus, bootindexB, -1),
 DEFINE_PROP_END_OF_LIST(),
 },
 };
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 01a181b..69a00e2 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -21,6 +21,7 @@
 #include "qemu-error.h"
 #include 
 #include "blockdev.h"
+#include "sysemu.h"
 
 /* - */
 
@@ -143,6 +144,10 @@ static int ide_drive_initfn(IDEDevice *dev)
 if (!dev->serial) {
 dev->serial = qemu_strdup(s->drive_serial_str);
 }
+
+add_boot_device_path(dev->conf.bootindex, &dev->qdev,
+ dev->unit ? "d...@1" : "d...@0");
+
 return 0;
 }
 
diff --git a/hw/ne2000.c b/hw/ne2000.c
index 126e7cf..f4bbac2 100644
--- a/hw/ne2000.c
+++ b/hw/ne2000.c
@@ -26,6 +26,7 @@
 #include "net.h"
 #include "ne2000.h"
 #include "loader.h"
+#include "sysemu.h"
 
 /* debug NE2000 card */
 //#define DEBUG_NE2000
@@ -746,6 +747,8 @@ static int pci_ne2000_init(PCIDevice *pci_dev)
 }
 }
 
+add_boot_device_path(s->c.bootindex, &pci_dev->qdev, "

[PATCHv4 07/15] Add get_dev_path callback for system bus.

2010-11-14 Thread Gleb Natapov
Prints out mmio or pio used to access child device.

Signed-off-by: Gleb Natapov 
---
 hw/pci_host.c |2 ++
 hw/sysbus.c   |   30 ++
 hw/sysbus.h   |4 
 3 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/hw/pci_host.c b/hw/pci_host.c
index bc5b771..28d45bf 100644
--- a/hw/pci_host.c
+++ b/hw/pci_host.c
@@ -197,6 +197,7 @@ void pci_host_conf_register_ioport(pio_addr_t ioport, 
PCIHostState *s)
 {
 pci_host_init(s);
 register_ioport_simple(&s->conf_noswap_handler, ioport, 4, 4);
+sysbus_init_ioports(&s->busdev, ioport, 4);
 }
 
 int pci_host_data_register_mmio(PCIHostState *s, int swap)
@@ -215,4 +216,5 @@ void pci_host_data_register_ioport(pio_addr_t ioport, 
PCIHostState *s)
 register_ioport_simple(&s->data_noswap_handler, ioport, 4, 1);
 register_ioport_simple(&s->data_noswap_handler, ioport, 4, 2);
 register_ioport_simple(&s->data_noswap_handler, ioport, 4, 4);
+sysbus_init_ioports(&s->busdev, ioport, 4);
 }
diff --git a/hw/sysbus.c b/hw/sysbus.c
index d817721..1583bd8 100644
--- a/hw/sysbus.c
+++ b/hw/sysbus.c
@@ -22,11 +22,13 @@
 #include "monitor.h"
 
 static void sysbus_dev_print(Monitor *mon, DeviceState *dev, int indent);
+static char *sysbus_get_fw_dev_path(DeviceState *dev);
 
 struct BusInfo system_bus_info = {
 .name   = "System",
 .size   = sizeof(BusState),
 .print_dev  = sysbus_dev_print,
+.get_fw_dev_path = sysbus_get_fw_dev_path,
 };
 
 void sysbus_connect_irq(SysBusDevice *dev, int n, qemu_irq irq)
@@ -106,6 +108,16 @@ void sysbus_init_mmio_cb(SysBusDevice *dev, 
target_phys_addr_t size,
 dev->mmio[n].cb = cb;
 }
 
+void sysbus_init_ioports(SysBusDevice *dev, pio_addr_t ioport, pio_addr_t size)
+{
+pio_addr_t i;
+
+for (i = 0; i < size; i++) {
+assert(dev->num_pio < QDEV_MAX_PIO);
+dev->pio[dev->num_pio++] = ioport++;
+}
+}
+
 static int sysbus_device_init(DeviceState *dev, DeviceInfo *base)
 {
 SysBusDeviceInfo *info = container_of(base, SysBusDeviceInfo, qdev);
@@ -171,3 +183,21 @@ static void sysbus_dev_print(Monitor *mon, DeviceState 
*dev, int indent)
indent, "", s->mmio[i].addr, s->mmio[i].size);
 }
 }
+
+static char *sysbus_get_fw_dev_path(DeviceState *dev)
+{
+SysBusDevice *s = sysbus_from_qdev(dev);
+char path[40];
+int off;
+
+off = snprintf(path, sizeof(path), "%s", qdev_fw_name(dev));
+
+if (s->num_mmio) {
+snprintf(path + off, sizeof(path) - off, "@"TARGET_FMT_plx,
+ s->mmio[0].addr);
+} else if (s->num_pio) {
+snprintf(path + off, sizeof(path) - off, "@i%04x", s->pio[0]);
+}
+
+return strdup(path);
+}
diff --git a/hw/sysbus.h b/hw/sysbus.h
index 5980901..e9eb618 100644
--- a/hw/sysbus.h
+++ b/hw/sysbus.h
@@ -6,6 +6,7 @@
 #include "qdev.h"
 
 #define QDEV_MAX_MMIO 32
+#define QDEV_MAX_PIO 32
 #define QDEV_MAX_IRQ 256
 
 typedef struct SysBusDevice SysBusDevice;
@@ -23,6 +24,8 @@ struct SysBusDevice {
 mmio_mapfunc cb;
 ram_addr_t iofunc;
 } mmio[QDEV_MAX_MMIO];
+int num_pio;
+pio_addr_t pio[QDEV_MAX_PIO];
 };
 
 typedef int (*sysbus_initfn)(SysBusDevice *dev);
@@ -45,6 +48,7 @@ void sysbus_init_mmio_cb(SysBusDevice *dev, 
target_phys_addr_t size,
 mmio_mapfunc cb);
 void sysbus_init_irq(SysBusDevice *dev, qemu_irq *p);
 void sysbus_pass_irq(SysBusDevice *dev, SysBusDevice *target);
+void sysbus_init_ioports(SysBusDevice *dev, pio_addr_t ioport, pio_addr_t 
size);
 
 
 void sysbus_connect_irq(SysBusDevice *dev, int n, qemu_irq irq);
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv4 08/15] Add get_fw_dev_path callback for pci bus.

2010-11-14 Thread Gleb Natapov

Signed-off-by: Gleb Natapov 
---
 hw/pci.c |  108 -
 1 files changed, 85 insertions(+), 23 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index 962886e..114b435 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -43,12 +43,14 @@
 
 static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent);
 static char *pcibus_get_dev_path(DeviceState *dev);
+static char *pcibus_get_fw_dev_path(DeviceState *dev);
 
 struct BusInfo pci_bus_info = {
 .name   = "PCI",
 .size   = sizeof(PCIBus),
 .print_dev  = pcibus_dev_print,
 .get_dev_path = pcibus_get_dev_path,
+.get_fw_dev_path = pcibus_get_fw_dev_path,
 .props  = (Property[]) {
 DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1),
 DEFINE_PROP_STRING("romfile", PCIDevice, romfile),
@@ -1061,45 +1063,63 @@ void pci_msi_notify(PCIDevice *dev, unsigned int vector)
 typedef struct {
 uint16_t class;
 const char *desc;
+const char *fw_name;
+uint16_t fw_ign_bits;
 } pci_class_desc;
 
 static const pci_class_desc pci_class_descriptions[] =
 {
-{ 0x0100, "SCSI controller"},
-{ 0x0101, "IDE controller"},
-{ 0x0102, "Floppy controller"},
-{ 0x0103, "IPI controller"},
-{ 0x0104, "RAID controller"},
+{ 0x0001, "VGA controller", "display"},
+{ 0x0100, "SCSI controller", "scsi"},
+{ 0x0101, "IDE controller", "ide"},
+{ 0x0102, "Floppy controller", "fdc"},
+{ 0x0103, "IPI controller", "ipi"},
+{ 0x0104, "RAID controller", "raid"},
 { 0x0106, "SATA controller"},
 { 0x0107, "SAS controller"},
 { 0x0180, "Storage controller"},
-{ 0x0200, "Ethernet controller"},
-{ 0x0201, "Token Ring controller"},
-{ 0x0202, "FDDI controller"},
-{ 0x0203, "ATM controller"},
+{ 0x0200, "Ethernet controller", "ethernet"},
+{ 0x0201, "Token Ring controller", "token-ring"},
+{ 0x0202, "FDDI controller", "fddi"},
+{ 0x0203, "ATM controller", "atm"},
 { 0x0280, "Network controller"},
-{ 0x0300, "VGA controller"},
+{ 0x0300, "VGA controller", "display", 0x00ff},
 { 0x0301, "XGA controller"},
 { 0x0302, "3D controller"},
 { 0x0380, "Display controller"},
-{ 0x0400, "Video controller"},
-{ 0x0401, "Audio controller"},
+{ 0x0400, "Video controller", "video"},
+{ 0x0401, "Audio controller", "sound"},
 { 0x0402, "Phone"},
 { 0x0480, "Multimedia controller"},
-{ 0x0500, "RAM controller"},
-{ 0x0501, "Flash controller"},
+{ 0x0500, "RAM controller", "memory"},
+{ 0x0501, "Flash controller", "flash"},
 { 0x0580, "Memory controller"},
-{ 0x0600, "Host bridge"},
-{ 0x0601, "ISA bridge"},
-{ 0x0602, "EISA bridge"},
-{ 0x0603, "MC bridge"},
-{ 0x0604, "PCI bridge"},
-{ 0x0605, "PCMCIA bridge"},
-{ 0x0606, "NUBUS bridge"},
-{ 0x0607, "CARDBUS bridge"},
+{ 0x0600, "Host bridge", "host"},
+{ 0x0601, "ISA bridge", "isa"},
+{ 0x0602, "EISA bridge", "eisa"},
+{ 0x0603, "MC bridge", "mca"},
+{ 0x0604, "PCI bridge", "pci"},
+{ 0x0605, "PCMCIA bridge", "pcmcia"},
+{ 0x0606, "NUBUS bridge", "nubus"},
+{ 0x0607, "CARDBUS bridge", "cardbus"},
 { 0x0608, "RACEWAY bridge"},
 { 0x0680, "Bridge"},
-{ 0x0c03, "USB controller"},
+{ 0x0700, "Serial port", "serial"},
+{ 0x0701, "Parallel port", "parallel"},
+{ 0x0800, "Interrupt controller", "interrupt-controller"},
+{ 0x0801, "DMA controller", "dma-controller"},
+{ 0x0802, "Timer", "timer"},
+{ 0x0803, "RTC", "rtc"},
+{ 0x0900, "Keyboard", "keyboard"},
+{ 0x0901, "Pen", "pen"},
+{ 0x0902, "Mouse", "mouse"},
+{ 0x0A00, "Dock station", "dock", 0x00ff},
+{ 0x0B00, "i386 cpu", "cpu", 0x00ff},
+{ 0x0c00, "Fireware contorller", "fireware"},
+{ 0x0c01, "Access bus controller", "access-bus"},
+{ 0x0c02, "SSA controller", "ssa"},
+{ 0x0c03, "USB controller", "usb"},
+{ 0x0c04, "Fibre channel controller", "fibre-channel"},
 { 0, NULL}
 };
 
@@ -1825,6 +1845,48 @@ static void pcibus_dev_print(Monitor *mon, DeviceState 
*dev, int indent)
 }
 }
 
+static char *pci_dev_fw_name(DeviceState *dev, char *buf, int len)
+{
+PCIDevice *d = (PCIDevice *)dev;
+const char *name = NULL;
+const pci_class_desc *desc =  pci_class_descriptions;
+int class = pci_get_word(d->config + PCI_CLASS_DEVICE);
+
+while (desc->desc &&
+  (class & ~desc->fw_ign_bits) !=
+  (desc->class & ~desc->fw_ign_bits)) {
+desc++;
+}
+
+if (desc->desc) {
+name = desc->fw_name;
+}
+
+if (name) {
+pstrcpy(buf, len, name);
+} else {
+snprintf(buf, len, "pci%04x,%04x",
+ pci_get_word(d->config + PCI_VENDOR_ID),
+ pci_get_word(d->config + PCI_DEVICE_ID));
+}
+
+return buf;
+}
+
+static char *pcibus_get_fw_dev_path(DeviceState *dev)
+{
+PCIDevice *d = (PCIDevice *)dev;
+char path[50], na

[PATCHv4 10/15] Add get_dev_path callback for usb bus.

2010-11-14 Thread Gleb Natapov

Signed-off-by: Gleb Natapov 
---
 hw/usb-bus.c |   42 ++
 1 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/hw/usb-bus.c b/hw/usb-bus.c
index 256b881..8b4583c 100644
--- a/hw/usb-bus.c
+++ b/hw/usb-bus.c
@@ -5,11 +5,13 @@
 #include "monitor.h"
 
 static void usb_bus_dev_print(Monitor *mon, DeviceState *qdev, int indent);
+static char *usbbus_get_fw_dev_path(DeviceState *dev);
 
 static struct BusInfo usb_bus_info = {
 .name  = "USB",
 .size  = sizeof(USBBus),
 .print_dev = usb_bus_dev_print,
+.get_fw_dev_path = usbbus_get_fw_dev_path,
 };
 static int next_usb_bus = 0;
 static QTAILQ_HEAD(, USBBus) busses = QTAILQ_HEAD_INITIALIZER(busses);
@@ -307,3 +309,43 @@ USBDevice *usbdevice_create(const char *cmdline)
 }
 return usb->usbdevice_init(params);
 }
+
+static int usbbus_get_fw_dev_path_helper(USBDevice *d, USBBus *bus, char *p,
+ int len)
+{
+int l = 0;
+USBPort *port;
+
+QTAILQ_FOREACH(port, &bus->used, next) {
+if (port->dev == d) {
+if (port->pdev) {
+l = usbbus_get_fw_dev_path_helper(port->pdev, bus, p, len);
+}
+l += snprintf(p + l, len - l, "%...@%x/", qdev_fw_name(&d->qdev),
+  port->index);
+break;
+}
+}
+
+return l;
+}
+
+static char *usbbus_get_fw_dev_path(DeviceState *dev)
+{
+USBDevice *d = (USBDevice*)dev;
+USBBus *bus = usb_bus_from_device(d);
+char path[100];
+int l;
+
+assert(d->attached != 0);
+
+l = usbbus_get_fw_dev_path_helper(d, bus, path, sizeof(path));
+
+if (l == 0) {
+abort();
+}
+
+path[l-1] = '\0';
+
+return strdup(path);
+}
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv4 06/15] Add get_fw_dev_path callback to IDE bus.

2010-11-14 Thread Gleb Natapov

Signed-off-by: Gleb Natapov 
---
 hw/ide/qdev.c |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 88ff657..01a181b 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -24,9 +24,12 @@
 
 /* - */
 
+static char *idebus_get_fw_dev_path(DeviceState *dev);
+
 static struct BusInfo ide_bus_info = {
 .name  = "IDE",
 .size  = sizeof(IDEBus),
+.get_fw_dev_path = idebus_get_fw_dev_path,
 };
 
 void ide_bus_new(IDEBus *idebus, DeviceState *dev, int bus_id)
@@ -35,6 +38,16 @@ void ide_bus_new(IDEBus *idebus, DeviceState *dev, int 
bus_id)
 idebus->bus_id = bus_id;
 }
 
+static char *idebus_get_fw_dev_path(DeviceState *dev)
+{
+char path[30];
+
+snprintf(path, sizeof(path), "%...@%d", qdev_fw_name(dev),
+ ((IDEBus*)dev->parent_bus)->bus_id);
+
+return strdup(path);
+}
+
 static int ide_qdev_init(DeviceState *qdev, DeviceInfo *base)
 {
 IDEDevice *dev = DO_UPCAST(IDEDevice, qdev, qdev);
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv4 09/15] Record which USBDevice USBPort belongs too.

2010-11-14 Thread Gleb Natapov
Ports on root hub will have NULL here. This is needed to reconstruct
path from device to its root hub to build device path.

Signed-off-by: Gleb Natapov 
---
 hw/usb-bus.c  |3 ++-
 hw/usb-hub.c  |2 +-
 hw/usb-musb.c |2 +-
 hw/usb-ohci.c |2 +-
 hw/usb-uhci.c |2 +-
 hw/usb.h  |3 ++-
 6 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/hw/usb-bus.c b/hw/usb-bus.c
index b692503..256b881 100644
--- a/hw/usb-bus.c
+++ b/hw/usb-bus.c
@@ -110,11 +110,12 @@ USBDevice *usb_create_simple(USBBus *bus, const char 
*name)
 }
 
 void usb_register_port(USBBus *bus, USBPort *port, void *opaque, int index,
-   usb_attachfn attach)
+   USBDevice *pdev, usb_attachfn attach)
 {
 port->opaque = opaque;
 port->index = index;
 port->attach = attach;
+port->pdev = pdev;
 QTAILQ_INSERT_TAIL(&bus->free, port, next);
 bus->nfree++;
 }
diff --git a/hw/usb-hub.c b/hw/usb-hub.c
index 8e3a96b..8a3f829 100644
--- a/hw/usb-hub.c
+++ b/hw/usb-hub.c
@@ -535,7 +535,7 @@ static int usb_hub_initfn(USBDevice *dev)
 for (i = 0; i < s->nb_ports; i++) {
 port = &s->ports[i];
 usb_register_port(usb_bus_from_device(dev),
-  &port->port, s, i, usb_hub_attach);
+  &port->port, s, i, &s->dev, usb_hub_attach);
 port->wPortStatus = PORT_STAT_POWER;
 port->wPortChange = 0;
 }
diff --git a/hw/usb-musb.c b/hw/usb-musb.c
index 7f15842..9efe7a6 100644
--- a/hw/usb-musb.c
+++ b/hw/usb-musb.c
@@ -343,7 +343,7 @@ struct MUSBState {
 }
 
 usb_bus_new(&s->bus, NULL /* FIXME */);
-usb_register_port(&s->bus, &s->port, s, 0, musb_attach);
+usb_register_port(&s->bus, &s->port, s, 0, NULL, musb_attach);
 
 return s;
 }
diff --git a/hw/usb-ohci.c b/hw/usb-ohci.c
index c60fd8d..59604cf 100644
--- a/hw/usb-ohci.c
+++ b/hw/usb-ohci.c
@@ -1705,7 +1705,7 @@ static void usb_ohci_init(OHCIState *ohci, DeviceState 
*dev,
 usb_bus_new(&ohci->bus, dev);
 ohci->num_ports = num_ports;
 for (i = 0; i < num_ports; i++) {
-usb_register_port(&ohci->bus, &ohci->rhport[i].port, ohci, i, 
ohci_attach);
+usb_register_port(&ohci->bus, &ohci->rhport[i].port, ohci, i, NULL, 
ohci_attach);
 }
 
 ohci->async_td = 0;
diff --git a/hw/usb-uhci.c b/hw/usb-uhci.c
index 1d83400..b9b822f 100644
--- a/hw/usb-uhci.c
+++ b/hw/usb-uhci.c
@@ -1115,7 +1115,7 @@ static int usb_uhci_common_initfn(UHCIState *s)
 
 usb_bus_new(&s->bus, &s->dev.qdev);
 for(i = 0; i < NB_PORTS; i++) {
-usb_register_port(&s->bus, &s->ports[i].port, s, i, uhci_attach);
+usb_register_port(&s->bus, &s->ports[i].port, s, i, NULL, uhci_attach);
 }
 s->frame_timer = qemu_new_timer(vm_clock, uhci_frame_timer, s);
 s->expire_time = qemu_get_clock(vm_clock) +
diff --git a/hw/usb.h b/hw/usb.h
index 00d2802..0b32d77 100644
--- a/hw/usb.h
+++ b/hw/usb.h
@@ -203,6 +203,7 @@ struct USBPort {
 USBDevice *dev;
 usb_attachfn attach;
 void *opaque;
+USBDevice *pdev;
 int index; /* internal port index, may be used with the opaque */
 QTAILQ_ENTRY(USBPort) next;
 };
@@ -312,7 +313,7 @@ USBDevice *usb_create(USBBus *bus, const char *name);
 USBDevice *usb_create_simple(USBBus *bus, const char *name);
 USBDevice *usbdevice_create(const char *cmdline);
 void usb_register_port(USBBus *bus, USBPort *port, void *opaque, int index,
-   usb_attachfn attach);
+   USBDevice *pdev, usb_attachfn attach);
 void usb_unregister_port(USBBus *bus, USBPort *port);
 int usb_device_attach(USBDevice *dev);
 int usb_device_detach(USBDevice *dev);
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv4 13/15] Add bootindex for option roms.

2010-11-14 Thread Gleb Natapov
Extend -option-rom command to have additional parameter ,bootindex=.

Signed-off-by: Gleb Natapov 
---
 hw/loader.c|   16 +++-
 hw/loader.h|8 
 hw/multiboot.c |3 ++-
 hw/ne2000.c|2 +-
 hw/nseries.c   |2 +-
 hw/pc.c|7 ---
 hw/pci.c   |2 +-
 hw/pcnet.c |2 +-
 qemu-config.c  |   17 +
 sysemu.h   |6 +-
 vl.c   |   11 +--
 11 files changed, 56 insertions(+), 20 deletions(-)

diff --git a/hw/loader.c b/hw/loader.c
index 1e98326..eb198f6 100644
--- a/hw/loader.c
+++ b/hw/loader.c
@@ -107,7 +107,7 @@ int load_image_targphys(const char *filename,
 
 size = get_image_size(filename);
 if (size > 0)
-rom_add_file_fixed(filename, addr);
+rom_add_file_fixed(filename, addr, -1);
 return size;
 }
 
@@ -557,10 +557,11 @@ static void rom_insert(Rom *rom)
 }
 
 int rom_add_file(const char *file, const char *fw_dir,
- target_phys_addr_t addr)
+ target_phys_addr_t addr, int32_t bootindex)
 {
 Rom *rom;
 int rc, fd = -1;
+char devpath[100];
 
 rom = qemu_mallocz(sizeof(*rom));
 rom->name = qemu_strdup(file);
@@ -605,7 +606,12 @@ int rom_add_file(const char *file, const char *fw_dir,
 snprintf(fw_file_name, sizeof(fw_file_name), "%s/%s", rom->fw_dir,
  basename);
 fw_cfg_add_file(fw_cfg, fw_file_name, rom->data, rom->romsize);
+snprintf(devpath, sizeof(devpath), "/r...@%s", fw_file_name);
+} else {
+snprintf(devpath, sizeof(devpath), "/rom@" TARGET_FMT_plx, addr);
 }
+
+add_boot_device_path(bootindex, NULL, devpath);
 return 0;
 
 err:
@@ -635,12 +641,12 @@ int rom_add_blob(const char *name, const void *blob, 
size_t len,
 
 int rom_add_vga(const char *file)
 {
-return rom_add_file(file, "vgaroms", 0);
+return rom_add_file(file, "vgaroms", 0, -1);
 }
 
-int rom_add_option(const char *file)
+int rom_add_option(const char *file, int32_t bootindex)
 {
-return rom_add_file(file, "genroms", 0);
+return rom_add_file(file, "genroms", 0, bootindex);
 }
 
 static void rom_reset(void *unused)
diff --git a/hw/loader.h b/hw/loader.h
index 1f82fc5..fc6bdff 100644
--- a/hw/loader.h
+++ b/hw/loader.h
@@ -22,7 +22,7 @@ void pstrcpy_targphys(const char *name,
 
 
 int rom_add_file(const char *file, const char *fw_dir,
- target_phys_addr_t addr);
+ target_phys_addr_t addr, int32_t bootindex);
 int rom_add_blob(const char *name, const void *blob, size_t len,
  target_phys_addr_t addr);
 int rom_load_all(void);
@@ -31,8 +31,8 @@ int rom_copy(uint8_t *dest, target_phys_addr_t addr, size_t 
size);
 void *rom_ptr(target_phys_addr_t addr);
 void do_info_roms(Monitor *mon);
 
-#define rom_add_file_fixed(_f, _a)  \
-rom_add_file(_f, NULL, _a)
+#define rom_add_file_fixed(_f, _a, _i)  \
+rom_add_file(_f, NULL, _a, _i)
 #define rom_add_blob_fixed(_f, _b, _l, _a)  \
 rom_add_blob(_f, _b, _l, _a)
 
@@ -43,6 +43,6 @@ void do_info_roms(Monitor *mon);
 #define PC_ROM_SIZE(PC_ROM_MAX - PC_ROM_MIN_VGA)
 
 int rom_add_vga(const char *file);
-int rom_add_option(const char *file);
+int rom_add_option(const char *file, int32_t bootindex);
 
 #endif
diff --git a/hw/multiboot.c b/hw/multiboot.c
index f9097a2..b438019 100644
--- a/hw/multiboot.c
+++ b/hw/multiboot.c
@@ -325,7 +325,8 @@ int load_multiboot(void *fw_cfg,
 fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, mb_bootinfo_data,
  sizeof(bootinfo));
 
-option_rom[nb_option_roms] = "multiboot.bin";
+option_rom[nb_option_roms].name = "multiboot.bin";
+option_rom[nb_option_roms].bootindex = 0;
 nb_option_roms++;
 
 return 1; /* yes, we are multiboot */
diff --git a/hw/ne2000.c b/hw/ne2000.c
index f4bbac2..67e0cb0 100644
--- a/hw/ne2000.c
+++ b/hw/ne2000.c
@@ -742,7 +742,7 @@ static int pci_ne2000_init(PCIDevice *pci_dev)
 if (!pci_dev->qdev.hotplugged) {
 static int loaded = 0;
 if (!loaded) {
-rom_add_option("pxe-ne2k_pci.bin");
+rom_add_option("pxe-ne2k_pci.bin", -1);
 loaded = 1;
 }
 }
diff --git a/hw/nseries.c b/hw/nseries.c
index 04a028d..39780ef 100644
--- a/hw/nseries.c
+++ b/hw/nseries.c
@@ -1326,7 +1326,7 @@ static void n8x0_init(ram_addr_t ram_size, const char 
*boot_device,
 qemu_register_reset(n8x0_boot_init, s);
 }
 
-if (option_rom[0] && (boot_device[0] == 'n' || !kernel_filename)) {
+if (option_rom[0].name && (boot_device[0] == 'n' || !kernel_filename)) {
 int rom_size;
 uint8_t nolo_tags[0x1];
 /* No, wait, better start at the ROM.  */
diff --git a/hw/pc.c b/hw/pc.c
index 69b13bf..5111a76 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -733,7 +733,8 @@ static void load_linux(void *fw_cfg,
 fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
 fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_

[PATCHv4 03/15] Keep track of ISA ports ISA device is using in qdev.

2010-11-14 Thread Gleb Natapov
Store all io ports used by device in ISADevice structure.

Signed-off-by: Gleb Natapov 
---
 hw/cs4231a.c |1 +
 hw/fdc.c |3 +++
 hw/gus.c |4 
 hw/ide/isa.c |2 ++
 hw/isa-bus.c |   25 +
 hw/isa.h |4 
 hw/m48t59.c  |1 +
 hw/mc146818rtc.c |1 +
 hw/ne2000-isa.c  |3 +++
 hw/parallel.c|5 +
 hw/pckbd.c   |3 +++
 hw/sb16.c|4 
 hw/serial.c  |1 +
 13 files changed, 57 insertions(+), 0 deletions(-)

diff --git a/hw/cs4231a.c b/hw/cs4231a.c
index 4d5ce5c..598f032 100644
--- a/hw/cs4231a.c
+++ b/hw/cs4231a.c
@@ -645,6 +645,7 @@ static int cs4231a_initfn (ISADevice *dev)
 isa_init_irq (dev, &s->pic, s->irq);
 
 for (i = 0; i < 4; i++) {
+isa_init_ioport(dev, i);
 register_ioport_write (s->port + i, 1, 1, cs_write, s);
 register_ioport_read (s->port + i, 1, 1, cs_read, s);
 }
diff --git a/hw/fdc.c b/hw/fdc.c
index a467c4b..5ab754b 100644
--- a/hw/fdc.c
+++ b/hw/fdc.c
@@ -1983,6 +1983,9 @@ static int isabus_fdc_init1(ISADevice *dev)
   &fdctrl_write_port, fdctrl);
 register_ioport_write(iobase + 0x07, 1, 1,
   &fdctrl_write_port, fdctrl);
+isa_init_ioport_range(dev, iobase + 1, 5);
+isa_init_ioport(dev, iobase + 7);
+
 isa_init_irq(&isa->busdev, &fdctrl->irq, isairq);
 fdctrl->dma_chann = dma_chann;
 
diff --git a/hw/gus.c b/hw/gus.c
index e9016d8..ff9e7c7 100644
--- a/hw/gus.c
+++ b/hw/gus.c
@@ -264,20 +264,24 @@ static int gus_initfn (ISADevice *dev)
 
 register_ioport_write (s->port, 1, 1, gus_writeb, s);
 register_ioport_write (s->port, 1, 2, gus_writew, s);
+isa_init_ioport_range(dev, s->port, 2);
 
 register_ioport_read ((s->port + 0x100) & 0xf00, 1, 1, gus_readb, s);
 register_ioport_read ((s->port + 0x100) & 0xf00, 1, 2, gus_readw, s);
+isa_init_ioport_range(dev, (s->port + 0x100) & 0xf00, 2);
 
 register_ioport_write (s->port + 6, 10, 1, gus_writeb, s);
 register_ioport_write (s->port + 6, 10, 2, gus_writew, s);
 register_ioport_read (s->port + 6, 10, 1, gus_readb, s);
 register_ioport_read (s->port + 6, 10, 2, gus_readw, s);
+isa_init_ioport_range(dev, s->port + 6, 10);
 
 
 register_ioport_write (s->port + 0x100, 8, 1, gus_writeb, s);
 register_ioport_write (s->port + 0x100, 8, 2, gus_writew, s);
 register_ioport_read (s->port + 0x100, 8, 1, gus_readb, s);
 register_ioport_read (s->port + 0x100, 8, 2, gus_readw, s);
+isa_init_ioport_range(dev, s->port + 0x100, 8);
 
 DMA_register_channel (s->emu.gusdma, GUS_read_DMA, s);
 s->emu.himemaddr = s->himem;
diff --git a/hw/ide/isa.c b/hw/ide/isa.c
index 9856435..4206afd 100644
--- a/hw/ide/isa.c
+++ b/hw/ide/isa.c
@@ -70,6 +70,8 @@ static int isa_ide_initfn(ISADevice *dev)
 ide_bus_new(&s->bus, &s->dev.qdev);
 ide_init_ioport(&s->bus, s->iobase, s->iobase2);
 isa_init_irq(dev, &s->irq, s->isairq);
+isa_init_ioport_range(dev, s->iobase, 8);
+isa_init_ioport(dev, s->iobase2);
 ide_init2(&s->bus, s->irq);
 vmstate_register(&dev->qdev, 0, &vmstate_ide_isa, s);
 return 0;
diff --git a/hw/isa-bus.c b/hw/isa-bus.c
index 26036e0..c0ac7e9 100644
--- a/hw/isa-bus.c
+++ b/hw/isa-bus.c
@@ -92,6 +92,31 @@ void isa_init_irq(ISADevice *dev, qemu_irq *p, int isairq)
 dev->nirqs++;
 }
 
+static void isa_init_ioport_one(ISADevice *dev, uint16_t ioport)
+{
+assert(dev->nioports < ARRAY_SIZE(dev->ioports));
+dev->ioports[dev->nioports++] = ioport;
+}
+
+static int isa_cmp_ports(const void *p1, const void *p2)
+{
+return *(uint16_t*)p1 - *(uint16_t*)p2;
+}
+
+void isa_init_ioport_range(ISADevice *dev, uint16_t start, uint16_t length)
+{
+int i;
+for (i = start; i < start + length; i++) {
+isa_init_ioport_one(dev, i);
+}
+qsort(dev->ioports, dev->nioports, sizeof(dev->ioports[0]), isa_cmp_ports);
+}
+
+void isa_init_ioport(ISADevice *dev, uint16_t ioport)
+{
+isa_init_ioport_range(dev, ioport, 1);
+}
+
 static int isa_qdev_init(DeviceState *qdev, DeviceInfo *base)
 {
 ISADevice *dev = DO_UPCAST(ISADevice, qdev, qdev);
diff --git a/hw/isa.h b/hw/isa.h
index aaf0272..4794b76 100644
--- a/hw/isa.h
+++ b/hw/isa.h
@@ -14,6 +14,8 @@ struct ISADevice {
 DeviceState qdev;
 uint32_t isairq[2];
 int nirqs;
+uint16_t ioports[32];
+int nioports;
 };
 
 typedef int (*isa_qdev_initfn)(ISADevice *dev);
@@ -26,6 +28,8 @@ ISABus *isa_bus_new(DeviceState *dev);
 void isa_bus_irqs(qemu_irq *irqs);
 qemu_irq isa_reserve_irq(int isairq);
 void isa_init_irq(ISADevice *dev, qemu_irq *p, int isairq);
+void isa_init_ioport(ISADevice *dev, uint16_t ioport);
+void isa_init_ioport_range(ISADevice *dev, uint16_t start, uint16_t length);
 void isa_qdev_register(ISADeviceInfo *info);
 ISADevice *isa_create(const char *name);
 ISADevice *isa_create_simple(const char *name);
diff --git a/hw/m48t59.c b/hw/m

[PATCHv4 00/15] boot order specification

2010-11-14 Thread Gleb Natapov
This is current state of the patch series for people to comment on.
I am using open firmware naming scheme to specify device path names.
In this submission I addressed all comment from previous one and added
option rom support and rebased to qemu upstream.

Kevin can you double check that the names are usable by Seabios? Reading
PC boot specification it looks like Seabios will not be able to take
full advantage of this though. Only one BCV can be bootable, so only disk
with lowest boot index will be bootable by Seabios. Is this correct? 

Names look like this on pci machine:
/p...@i0cf8/i...@1,1/dr...@1/d...@0
/p...@i0cf8/i...@1/f...@03f1/flo...@1
/p...@i0cf8/i...@1/f...@03f1/flo...@0
/p...@i0cf8/i...@1,1/dr...@1/d...@1
/p...@i0cf8/i...@1,1/dr...@0/d...@0
/p...@i0cf8/s...@3/d...@0
/p...@i0cf8/ether...@4/ethernet-...@0
/p...@i0cf8/ether...@5/ethernet-...@0
/p...@i0cf8/i...@1,1/dr...@0/d...@1
/p...@i0cf8/i...@1/i...@01e8/dr...@0/d...@0
/p...@i0cf8/u...@1,2/netw...@0/ether...@0
/p...@i0cf8/u...@1,2/h...@1/netw...@0/ether...@0
/r...@genroms/linuxboot.bin

and on isa machine:
/isa/i...@0170/dr...@0/d...@0
/isa/f...@03f1/flo...@1
/isa/f...@03f1/flo...@0
/isa/i...@0170/dr...@0/d...@1


Instead of using get_dev_path() callback I introduces another one
get_fw_dev_path. Unfortunately the way get_dev_path() callback is used
in migration code makes it hard to reuse it for other purposes. First
of all it is not called recursively so caller expects it to provide
unique name by itself. Device path though is inherently recursive. Each
individual element may not be unique, but the whole path will be. On
the other hand to call get_dev_path() recursively in migration code we
should implement it for all possible buses first. Other problem is
compatibility. If we change get_dev_path() output format now we will not
be able to migrate from old qemu to new one without some additional
compatibility layer.

Gleb Natapov (15):
  Introduce fw_name field to DeviceInfo structure.
  Introduce new BusInfo callback get_fw_dev_path.
  Keep track of ISA ports ISA device is using in qdev.
  Add get_fw_dev_path callback to ISA bus in qdev.
  Store IDE bus id in IDEBus structure for easy access.
  Add get_fw_dev_path callback to IDE bus.
  Add get_dev_path callback for system bus.
  Add get_fw_dev_path callback for pci bus.
  Record which USBDevice USBPort belongs too.
  Add get_dev_path callback for usb bus.
  Add bootindex parameter to net/block/fd device
  Change fw_cfg_add_file() to get full file path as a parameter.
  Add bootindex for option roms.
  Add notifier that will be called when machine is fully created.
  Pass boot device list to firmware.

 block_int.h   |4 +-
 hw/cs4231a.c  |1 +
 hw/e1000.c|4 ++
 hw/eepro100.c |3 +
 hw/fdc.c  |   12 +
 hw/fw_cfg.c   |   30 -
 hw/fw_cfg.h   |8 ++-
 hw/gus.c  |4 ++
 hw/ide/cmd646.c   |4 +-
 hw/ide/internal.h |3 +-
 hw/ide/isa.c  |5 ++-
 hw/ide/piix.c |4 +-
 hw/ide/qdev.c |   22 +-
 hw/ide/via.c  |4 +-
 hw/isa-bus.c  |   42 +++
 hw/isa.h  |4 ++
 hw/lance.c|1 +
 hw/loader.c   |   32 +++---
 hw/loader.h   |8 ++--
 hw/m48t59.c   |1 +
 hw/mc146818rtc.c  |1 +
 hw/multiboot.c|3 +-
 hw/ne2000-isa.c   |3 +
 hw/ne2000.c   |5 ++-
 hw/nseries.c  |2 +-
 hw/parallel.c |5 ++
 hw/pc.c   |7 ++-
 hw/pci.c  |  110 +++---
 hw/pci_host.c |2 +
 hw/pckbd.c|3 +
 hw/pcnet.c|6 ++-
 hw/piix_pci.c |1 +
 hw/qdev.c |   32 ++
 hw/qdev.h |9 
 hw/rtl8139.c  |4 ++
 hw/sb16.c |4 ++
 hw/serial.c   |1 +
 hw/sysbus.c   |   30 ++
 hw/sysbus.h   |4 ++
 hw/usb-bus.c  |   45 -
 hw/usb-hub.c  |3 +-
 hw/usb-musb.c |2 +-
 hw/usb-net.c  |3 +
 hw/usb-ohci.c |2 +-
 hw/usb-uhci.c |2 +-
 hw/usb.h  |3 +-
 hw/virtio-blk.c   |2 +
 hw/virtio-net.c   |2 +
 hw/virtio-pci.c   |1 +
 net.h |4 +-
 qemu-config.c |   17 
 sysemu.h  |   11 +-
 vl.c  |  117 -
 53 files changed, 565 insertions(+), 77 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv4 12/15] Change fw_cfg_add_file() to get full file path as a parameter.

2010-11-14 Thread Gleb Natapov
Change fw_cfg_add_file() to get full file path as a parameter instead
of building one internally. Two reasons for that. First caller may need
to know how file is named. Second this moves policy of file naming out
from fw_cfg. Platform may want to use more then two levels of
directories for instance.

Signed-off-by: Gleb Natapov 
---
 hw/fw_cfg.c |   16 
 hw/fw_cfg.h |4 ++--
 hw/loader.c |   16 ++--
 3 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
index 72866ae..7b9434f 100644
--- a/hw/fw_cfg.c
+++ b/hw/fw_cfg.c
@@ -277,10 +277,9 @@ int fw_cfg_add_callback(FWCfgState *s, uint16_t key, 
FWCfgCallback callback,
 return 1;
 }
 
-int fw_cfg_add_file(FWCfgState *s,  const char *dir, const char *filename,
-uint8_t *data, uint32_t len)
+int fw_cfg_add_file(FWCfgState *s,  const char *filename, uint8_t *data,
+uint32_t len)
 {
-const char *basename;
 int i, index;
 
 if (!s->files) {
@@ -297,15 +296,8 @@ int fw_cfg_add_file(FWCfgState *s,  const char *dir, const 
char *filename,
 
 fw_cfg_add_bytes(s, FW_CFG_FILE_FIRST + index, data, len);
 
-basename = strrchr(filename, '/');
-if (basename) {
-basename++;
-} else {
-basename = filename;
-}
-
-snprintf(s->files->f[index].name, sizeof(s->files->f[index].name),
- "%s/%s", dir, basename);
+pstrcpy(s->files->f[index].name, sizeof(s->files->f[index].name),
+filename);
 for (i = 0; i < index; i++) {
 if (strcmp(s->files->f[index].name, s->files->f[i].name) == 0) {
 FW_CFG_DPRINTF("%s: skip duplicate: %s\n", __FUNCTION__,
diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
index 4d13a4f..856bf91 100644
--- a/hw/fw_cfg.h
+++ b/hw/fw_cfg.h
@@ -60,8 +60,8 @@ int fw_cfg_add_i32(FWCfgState *s, uint16_t key, uint32_t 
value);
 int fw_cfg_add_i64(FWCfgState *s, uint16_t key, uint64_t value);
 int fw_cfg_add_callback(FWCfgState *s, uint16_t key, FWCfgCallback callback,
 void *callback_opaque, uint8_t *data, size_t len);
-int fw_cfg_add_file(FWCfgState *s, const char *dir, const char *filename,
-uint8_t *data, uint32_t len);
+int fw_cfg_add_file(FWCfgState *s, const char *filename, uint8_t *data,
+uint32_t len);
 FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
 target_phys_addr_t crl_addr, target_phys_addr_t 
data_addr);
 
diff --git a/hw/loader.c b/hw/loader.c
index 49ac1fa..1e98326 100644
--- a/hw/loader.c
+++ b/hw/loader.c
@@ -592,8 +592,20 @@ int rom_add_file(const char *file, const char *fw_dir,
 }
 close(fd);
 rom_insert(rom);
-if (rom->fw_file && fw_cfg)
-fw_cfg_add_file(fw_cfg, rom->fw_dir, rom->fw_file, rom->data, 
rom->romsize);
+if (rom->fw_file && fw_cfg) {
+const char *basename;
+char fw_file_name[56];
+
+basename = strrchr(rom->fw_file, '/');
+if (basename) {
+basename++;
+} else {
+basename = rom->fw_file;
+}
+snprintf(fw_file_name, sizeof(fw_file_name), "%s/%s", rom->fw_dir,
+ basename);
+fw_cfg_add_file(fw_cfg, fw_file_name, rom->data, rom->romsize);
+}
 return 0;
 
 err:
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv4 15/15] Pass boot device list to firmware.

2010-11-14 Thread Gleb Natapov

Signed-off-by: Gleb Natapov 
---
 hw/fw_cfg.c |   14 ++
 hw/fw_cfg.h |4 +++-
 sysemu.h|1 +
 vl.c|   51 +++
 4 files changed, 69 insertions(+), 1 deletions(-)

diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
index 7b9434f..f6a67db 100644
--- a/hw/fw_cfg.c
+++ b/hw/fw_cfg.c
@@ -53,6 +53,7 @@ struct FWCfgState {
 FWCfgFiles *files;
 uint16_t cur_entry;
 uint32_t cur_offset;
+Notifier machine_ready;
 };
 
 static void fw_cfg_write(FWCfgState *s, uint8_t value)
@@ -315,6 +316,15 @@ int fw_cfg_add_file(FWCfgState *s,  const char *filename, 
uint8_t *data,
 return 1;
 }
 
+static void fw_cfg_machine_ready(struct Notifier* n)
+{
+uint32_t len;
+char *bootindex = get_boot_devices_list(&len);
+
+fw_cfg_add_bytes(container_of(n, FWCfgState, machine_ready),
+ FW_CFG_BOOTINDEX, (uint8_t*)bootindex, len);
+}
+
 FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
 target_phys_addr_t ctl_addr, target_phys_addr_t 
data_addr)
 {
@@ -343,6 +353,10 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t 
data_port,
 fw_cfg_add_i16(s, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
 fw_cfg_add_i16(s, FW_CFG_BOOT_MENU, (uint16_t)boot_menu);
 
+
+s->machine_ready.notify = fw_cfg_machine_ready;
+qemu_add_machine_init_done_notifier(&s->machine_ready);
+
 return s;
 }
 
diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
index 856bf91..4d61410 100644
--- a/hw/fw_cfg.h
+++ b/hw/fw_cfg.h
@@ -30,7 +30,9 @@
 
 #define FW_CFG_FILE_FIRST   0x20
 #define FW_CFG_FILE_SLOTS   0x10
-#define FW_CFG_MAX_ENTRY(FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
+#define FW_CFG_FILE_LAST_SLOT   (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
+#define FW_CFG_BOOTINDEX(FW_CFG_FILE_LAST_SLOT + 1)
+#define FW_CFG_MAX_ENTRYFW_CFG_BOOTINDEX 
 
 #define FW_CFG_WRITE_CHANNEL0x4000
 #define FW_CFG_ARCH_LOCAL   0x8000
diff --git a/sysemu.h b/sysemu.h
index c42f33a..38a20a3 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -196,4 +196,5 @@ void register_devices(void);
 
 void add_boot_device_path(int32_t bootindex, DeviceState *dev,
   const char *suffix);
+char *get_boot_devices_list(uint32_t *size);
 #endif
diff --git a/vl.c b/vl.c
index 918d988..cca1e76 100644
--- a/vl.c
+++ b/vl.c
@@ -735,6 +735,57 @@ void add_boot_device_path(int32_t bootindex, DeviceState 
*dev,
 QTAILQ_INSERT_TAIL(&fw_boot_order, node, link);
 }
 
+/*
+ * This function returns device list as an array in a below format:
+ * +-+-+---+-+---+--
+ * |  n  |  l1 |   devpath1|  l2 |  devpath2 | ...
+ * +-+-+---+-+---+--
+ * where:
+ *   n - a number of devise pathes (one byte)
+ *   l - length of following device path string (one byte)
+ *   devpath - non-null terminated string of length l representing
+ * one device path
+ */
+char *get_boot_devices_list(uint32_t *size)
+{
+FWBootEntry *i;
+uint32_t total = 1, c = 0;
+char *list = qemu_malloc(1);
+
+QTAILQ_FOREACH(i, &fw_boot_order, link) {
+char *devpath = NULL, *bootpath;
+int len;
+
+if (i->dev) {
+devpath = qdev_get_fw_dev_path(i->dev);
+assert(devpath);
+}
+
+if (i->suffix && devpath) {
+bootpath = qemu_malloc(strlen(devpath) + strlen(i->suffix) + 2);
+sprintf(bootpath, "%s/%s", devpath, i->suffix);
+qemu_free(devpath);
+} else if (devpath) {
+bootpath = devpath;
+} else {
+bootpath = strdup(i->suffix);
+}
+
+len = strlen(bootpath);
+list = qemu_realloc(list, total + len + 1);
+list[total++] = len;
+memcpy(&list[total], bootpath, len);
+total += len;
+c++;
+qemu_free(bootpath);
+}
+
+list[0] = c;
+*size = total;
+
+return list;
+}
+
 static void numa_add(const char *optarg)
 {
 char option[128];
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv4 14/15] Add notifier that will be called when machine is fully created.

2010-11-14 Thread Gleb Natapov
Action that depends on fully initialized device model should register
with this notifier chain.

Signed-off-by: Gleb Natapov 
---
 sysemu.h |2 ++
 vl.c |   15 +++
 2 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/sysemu.h b/sysemu.h
index 48f8eee..c42f33a 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -60,6 +60,8 @@ void qemu_system_reset(void);
 void qemu_add_exit_notifier(Notifier *notify);
 void qemu_remove_exit_notifier(Notifier *notify);
 
+void qemu_add_machine_init_done_notifier(Notifier *notify);
+
 void do_savevm(Monitor *mon, const QDict *qdict);
 int load_vmstate(const char *name);
 void do_delvm(Monitor *mon, const QDict *qdict);
diff --git a/vl.c b/vl.c
index e8ada75..918d988 100644
--- a/vl.c
+++ b/vl.c
@@ -253,6 +253,9 @@ static void *boot_set_opaque;
 static NotifierList exit_notifiers =
 NOTIFIER_LIST_INITIALIZER(exit_notifiers);
 
+static NotifierList machine_init_done_notifiers =
+NOTIFIER_LIST_INITIALIZER(machine_init_done_notifiers);
+
 int kvm_allowed = 0;
 uint32_t xen_domid;
 enum xen_mode xen_mode = XEN_EMULATE;
@@ -1778,6 +1781,16 @@ static void qemu_run_exit_notifiers(void)
 notifier_list_notify(&exit_notifiers);
 }
 
+void qemu_add_machine_init_done_notifier(Notifier *notify)
+{
+notifier_list_add(&machine_init_done_notifiers, notify);
+}
+
+static void qemu_run_machine_init_done_notifiers(void)
+{
+notifier_list_notify(&machine_init_done_notifiers);
+}
+
 static const QEMUOption *lookup_opt(int argc, char **argv,
 const char **poptarg, int *poptind)
 {
@@ -3023,6 +3036,8 @@ int main(int argc, char **argv, char **envp)
 exit(1);
 }
 
+qemu_run_machine_init_done_notifiers();
+
 qemu_system_reset();
 if (loadvm) {
 if (load_vmstate(loadvm) < 0) {
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PCI passthrough on Sony Vaio F11 laptop...

2010-11-14 Thread Jan Kiszka
Am 14.11.2010 14:21, Erik Brakkee wrote:
> Jan Kiszka wrote:
>>
>> Strange, should work. I would suggest to post your full kernel log,
>> maybe there is some enlightening message hidden.
>>
>> I don't think it is a problem of your kernel version, but I'm able to
>> pass through devices on OpenSUSE 11.3 with
>> kernel-desktop-2.6.36-90.1.x86_64 from their kernel repository.
>>
>> Jan
>>
>>
> Exactly what server logs do you need. Is this only /var/log/messages or
> more? And do I need to set specific options there?
> Any other log files that you need?

dmesg > log-file

> 
> Before, generating these logs I will upgrade to a later kernel. As far
> as I can tell, that will still be a 2.6.34 kernel. Perhaps I should try
> the 2.6.36 kernel as well. Do you have the URL for the kernel repository
> I should use? (cannot find an obvious kernel repository in YAST2).

http://download.opensuse.org/repositories/Kernel:/HEAD/openSUSE_11.3

Jan



signature.asc
Description: OpenPGP digital signature


Re: [PATCHv4 08/15] Add get_fw_dev_path callback for pci bus.

2010-11-14 Thread Michael S. Tsirkin
On Sun, Nov 14, 2010 at 05:39:34PM +0200, Gleb Natapov wrote:
> 
> Signed-off-by: Gleb Natapov 
> ---
>  hw/pci.c |  108 -
>  1 files changed, 85 insertions(+), 23 deletions(-)
> 
> diff --git a/hw/pci.c b/hw/pci.c
> index 962886e..114b435 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -43,12 +43,14 @@
>  
>  static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent);
>  static char *pcibus_get_dev_path(DeviceState *dev);
> +static char *pcibus_get_fw_dev_path(DeviceState *dev);
>  
>  struct BusInfo pci_bus_info = {
>  .name   = "PCI",
>  .size   = sizeof(PCIBus),
>  .print_dev  = pcibus_dev_print,
>  .get_dev_path = pcibus_get_dev_path,
> +.get_fw_dev_path = pcibus_get_fw_dev_path,
>  .props  = (Property[]) {
>  DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1),
>  DEFINE_PROP_STRING("romfile", PCIDevice, romfile),
> @@ -1061,45 +1063,63 @@ void pci_msi_notify(PCIDevice *dev, unsigned int 
> vector)
>  typedef struct {
>  uint16_t class;
>  const char *desc;
> +const char *fw_name;
> +uint16_t fw_ign_bits;
>  } pci_class_desc;
>  
>  static const pci_class_desc pci_class_descriptions[] =
>  {
> -{ 0x0100, "SCSI controller"},
> -{ 0x0101, "IDE controller"},
> -{ 0x0102, "Floppy controller"},
> -{ 0x0103, "IPI controller"},
> -{ 0x0104, "RAID controller"},
> +{ 0x0001, "VGA controller", "display"},
> +{ 0x0100, "SCSI controller", "scsi"},
> +{ 0x0101, "IDE controller", "ide"},
> +{ 0x0102, "Floppy controller", "fdc"},
> +{ 0x0103, "IPI controller", "ipi"},
> +{ 0x0104, "RAID controller", "raid"},
>  { 0x0106, "SATA controller"},
>  { 0x0107, "SAS controller"},
>  { 0x0180, "Storage controller"},
> -{ 0x0200, "Ethernet controller"},
> -{ 0x0201, "Token Ring controller"},
> -{ 0x0202, "FDDI controller"},
> -{ 0x0203, "ATM controller"},
> +{ 0x0200, "Ethernet controller", "ethernet"},
> +{ 0x0201, "Token Ring controller", "token-ring"},
> +{ 0x0202, "FDDI controller", "fddi"},
> +{ 0x0203, "ATM controller", "atm"},
>  { 0x0280, "Network controller"},
> -{ 0x0300, "VGA controller"},
> +{ 0x0300, "VGA controller", "display", 0x00ff},
>  { 0x0301, "XGA controller"},
>  { 0x0302, "3D controller"},
>  { 0x0380, "Display controller"},
> -{ 0x0400, "Video controller"},
> -{ 0x0401, "Audio controller"},
> +{ 0x0400, "Video controller", "video"},
> +{ 0x0401, "Audio controller", "sound"},
>  { 0x0402, "Phone"},
>  { 0x0480, "Multimedia controller"},
> -{ 0x0500, "RAM controller"},
> -{ 0x0501, "Flash controller"},
> +{ 0x0500, "RAM controller", "memory"},
> +{ 0x0501, "Flash controller", "flash"},
>  { 0x0580, "Memory controller"},
> -{ 0x0600, "Host bridge"},
> -{ 0x0601, "ISA bridge"},
> -{ 0x0602, "EISA bridge"},
> -{ 0x0603, "MC bridge"},
> -{ 0x0604, "PCI bridge"},
> -{ 0x0605, "PCMCIA bridge"},
> -{ 0x0606, "NUBUS bridge"},
> -{ 0x0607, "CARDBUS bridge"},
> +{ 0x0600, "Host bridge", "host"},
> +{ 0x0601, "ISA bridge", "isa"},
> +{ 0x0602, "EISA bridge", "eisa"},
> +{ 0x0603, "MC bridge", "mca"},
> +{ 0x0604, "PCI bridge", "pci"},
> +{ 0x0605, "PCMCIA bridge", "pcmcia"},
> +{ 0x0606, "NUBUS bridge", "nubus"},
> +{ 0x0607, "CARDBUS bridge", "cardbus"},
>  { 0x0608, "RACEWAY bridge"},
>  { 0x0680, "Bridge"},
> -{ 0x0c03, "USB controller"},
> +{ 0x0700, "Serial port", "serial"},
> +{ 0x0701, "Parallel port", "parallel"},
> +{ 0x0800, "Interrupt controller", "interrupt-controller"},
> +{ 0x0801, "DMA controller", "dma-controller"},
> +{ 0x0802, "Timer", "timer"},
> +{ 0x0803, "RTC", "rtc"},
> +{ 0x0900, "Keyboard", "keyboard"},
> +{ 0x0901, "Pen", "pen"},
> +{ 0x0902, "Mouse", "mouse"},
> +{ 0x0A00, "Dock station", "dock", 0x00ff},
> +{ 0x0B00, "i386 cpu", "cpu", 0x00ff},
> +{ 0x0c00, "Fireware contorller", "fireware"},
> +{ 0x0c01, "Access bus controller", "access-bus"},
> +{ 0x0c02, "SSA controller", "ssa"},
> +{ 0x0c03, "USB controller", "usb"},
> +{ 0x0c04, "Fibre channel controller", "fibre-channel"},
>  { 0, NULL}
>  };
>  
> @@ -1825,6 +1845,48 @@ static void pcibus_dev_print(Monitor *mon, DeviceState 
> *dev, int indent)
>  }
>  }
>  
> +static char *pci_dev_fw_name(DeviceState *dev, char *buf, int len)
> +{
> +PCIDevice *d = (PCIDevice *)dev;
> +const char *name = NULL;
> +const pci_class_desc *desc =  pci_class_descriptions;
> +int class = pci_get_word(d->config + PCI_CLASS_DEVICE);
> +
> +while (desc->desc &&
> +  (class & ~desc->fw_ign_bits) !=
> +  (desc->class & ~desc->fw_ign_bits)) {
> +desc++;
> +}
> +
> +if (desc->desc) {
> +name = desc->fw_name;
> +}
> +
> +if (name) {
> +pstrcpy(buf, l

Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-14 Thread Michael S. Tsirkin
On Sun, Nov 14, 2010 at 05:39:41PM +0200, Gleb Natapov wrote:
> 
> Signed-off-by: Gleb Natapov 
> ---
>  hw/fw_cfg.c |   14 ++
>  hw/fw_cfg.h |4 +++-
>  sysemu.h|1 +
>  vl.c|   51 +++
>  4 files changed, 69 insertions(+), 1 deletions(-)
> 
> diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
> index 7b9434f..f6a67db 100644
> --- a/hw/fw_cfg.c
> +++ b/hw/fw_cfg.c
> @@ -53,6 +53,7 @@ struct FWCfgState {
>  FWCfgFiles *files;
>  uint16_t cur_entry;
>  uint32_t cur_offset;
> +Notifier machine_ready;
>  };
>  
>  static void fw_cfg_write(FWCfgState *s, uint8_t value)
> @@ -315,6 +316,15 @@ int fw_cfg_add_file(FWCfgState *s,  const char 
> *filename, uint8_t *data,
>  return 1;
>  }
>  
> +static void fw_cfg_machine_ready(struct Notifier* n)
> +{
> +uint32_t len;
> +char *bootindex = get_boot_devices_list(&len);
> +
> +fw_cfg_add_bytes(container_of(n, FWCfgState, machine_ready),
> + FW_CFG_BOOTINDEX, (uint8_t*)bootindex, len);
> +}
> +
>  FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
>  target_phys_addr_t ctl_addr, target_phys_addr_t 
> data_addr)
>  {
> @@ -343,6 +353,10 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t 
> data_port,
>  fw_cfg_add_i16(s, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
>  fw_cfg_add_i16(s, FW_CFG_BOOT_MENU, (uint16_t)boot_menu);
>  
> +
> +s->machine_ready.notify = fw_cfg_machine_ready;
> +qemu_add_machine_init_done_notifier(&s->machine_ready);
> +
>  return s;
>  }
>  
> diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
> index 856bf91..4d61410 100644
> --- a/hw/fw_cfg.h
> +++ b/hw/fw_cfg.h
> @@ -30,7 +30,9 @@
>  
>  #define FW_CFG_FILE_FIRST   0x20
>  #define FW_CFG_FILE_SLOTS   0x10
> -#define FW_CFG_MAX_ENTRY(FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
> +#define FW_CFG_FILE_LAST_SLOT   (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
> +#define FW_CFG_BOOTINDEX(FW_CFG_FILE_LAST_SLOT + 1)
> +#define FW_CFG_MAX_ENTRYFW_CFG_BOOTINDEX 
>  
>  #define FW_CFG_WRITE_CHANNEL0x4000
>  #define FW_CFG_ARCH_LOCAL   0x8000
> diff --git a/sysemu.h b/sysemu.h
> index c42f33a..38a20a3 100644
> --- a/sysemu.h
> +++ b/sysemu.h
> @@ -196,4 +196,5 @@ void register_devices(void);
>  
>  void add_boot_device_path(int32_t bootindex, DeviceState *dev,
>const char *suffix);
> +char *get_boot_devices_list(uint32_t *size);
>  #endif
> diff --git a/vl.c b/vl.c
> index 918d988..cca1e76 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -735,6 +735,57 @@ void add_boot_device_path(int32_t bootindex, DeviceState 
> *dev,
>  QTAILQ_INSERT_TAIL(&fw_boot_order, node, link);
>  }
>  
> +/*
> + * This function returns device list as an array in a below format:
> + * +-+-+---+-+---+--
> + * |  n  |  l1 |   devpath1|  l2 |  devpath2 | ...
> + * +-+-+---+-+---+--

No one will ever want > 256 devices? Let's make it 4 byte or something.

> + * where:
> + *   n - a number of devise pathes (one byte)
> + *   l - length of following device path string (one byte)

Might not fit: with pci we can have 256 nested buses.
How about simply null-terminating each path?

> + *   devpath - non-null terminated string of length l representing
> + * one device path
> + */

Document return value + parameters as well?

> +char *get_boot_devices_list(uint32_t *size)
> +{
> +FWBootEntry *i;
> +uint32_t total = 1, c = 0;
> +char *list = qemu_malloc(1);
> +
> +QTAILQ_FOREACH(i, &fw_boot_order, link) {
> +char *devpath = NULL, *bootpath;
> +int len;
> +
> +if (i->dev) {
> +devpath = qdev_get_fw_dev_path(i->dev);
> +assert(devpath);
> +}
> +
> +if (i->suffix && devpath) {
> +bootpath = qemu_malloc(strlen(devpath) + strlen(i->suffix) + 2);
> +sprintf(bootpath, "%s/%s", devpath, i->suffix);
> +qemu_free(devpath);

devpath is allocated with strdup, not qemu_malloc,
so I guess it should be freed with free?
Alternatively, let's add qemu_strdup
Might be a good idea: fix error handling here and elsewhere.

> +} else if (devpath) {
> +bootpath = devpath;
> +} else {
> +bootpath = strdup(i->suffix);
> +}

assert(bootpath).

> +
> +len = strlen(bootpath);
> +list = qemu_realloc(list, total + len + 1);
> +list[total++] = len;
> +memcpy(&list[total], bootpath, len);
> +total += len;
> +c++;
> +qemu_free(bootpath);

Man, is this tricky.

> +}
> +
> +list[0] = c;
> +*size = total;
> +
> +return list;
> +}
> +
>  static void numa_add(const char *optarg)
>  {
>  char option[128];
> -- 
> 1.7.1
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo i

Re: [PATCHv4 08/15] Add get_fw_dev_path callback for pci bus.

2010-11-14 Thread Gleb Natapov
On Sun, Nov 14, 2010 at 08:27:20PM +0200, Michael S. Tsirkin wrote:
> On Sun, Nov 14, 2010 at 05:39:34PM +0200, Gleb Natapov wrote:
> > +static char *pcibus_get_fw_dev_path(DeviceState *dev)
> > +{
> > +PCIDevice *d = (PCIDevice *)dev;
> > +char path[50], name[33];
> > +int off;
> > +
> > +off = snprintf(path, sizeof(path), "%...@%x",
> > +   pci_dev_fw_name(dev, name, sizeof name),
> > +   PCI_SLOT(d->devfn));
> > +if (PCI_FUNC(d->devfn))
> > +snprintf(path + off, sizeof(path) + off, ",%x", 
> > PCI_FUNC(d->devfn));
> 
> Can we *always* specify a slot number in the name?
> If yes I think we should, because I think we saw that the short form is
> ambiguous: if there's a device at a slot != 0, openfirmware will
> to match that against the name.
You accidentally something here.

> Right?
> 
I do not see what is ambiguous here. Can you explain?

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-14 Thread Gleb Natapov
On Sun, Nov 14, 2010 at 08:41:37PM +0200, Michael S. Tsirkin wrote:
> On Sun, Nov 14, 2010 at 05:39:41PM +0200, Gleb Natapov wrote:
> > 
> > Signed-off-by: Gleb Natapov 
> > ---
> >  hw/fw_cfg.c |   14 ++
> >  hw/fw_cfg.h |4 +++-
> >  sysemu.h|1 +
> >  vl.c|   51 +++
> >  4 files changed, 69 insertions(+), 1 deletions(-)
> > 
> > diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
> > index 7b9434f..f6a67db 100644
> > --- a/hw/fw_cfg.c
> > +++ b/hw/fw_cfg.c
> > @@ -53,6 +53,7 @@ struct FWCfgState {
> >  FWCfgFiles *files;
> >  uint16_t cur_entry;
> >  uint32_t cur_offset;
> > +Notifier machine_ready;
> >  };
> >  
> >  static void fw_cfg_write(FWCfgState *s, uint8_t value)
> > @@ -315,6 +316,15 @@ int fw_cfg_add_file(FWCfgState *s,  const char 
> > *filename, uint8_t *data,
> >  return 1;
> >  }
> >  
> > +static void fw_cfg_machine_ready(struct Notifier* n)
> > +{
> > +uint32_t len;
> > +char *bootindex = get_boot_devices_list(&len);
> > +
> > +fw_cfg_add_bytes(container_of(n, FWCfgState, machine_ready),
> > + FW_CFG_BOOTINDEX, (uint8_t*)bootindex, len);
> > +}
> > +
> >  FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
> >  target_phys_addr_t ctl_addr, target_phys_addr_t 
> > data_addr)
> >  {
> > @@ -343,6 +353,10 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t 
> > data_port,
> >  fw_cfg_add_i16(s, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
> >  fw_cfg_add_i16(s, FW_CFG_BOOT_MENU, (uint16_t)boot_menu);
> >  
> > +
> > +s->machine_ready.notify = fw_cfg_machine_ready;
> > +qemu_add_machine_init_done_notifier(&s->machine_ready);
> > +
> >  return s;
> >  }
> >  
> > diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
> > index 856bf91..4d61410 100644
> > --- a/hw/fw_cfg.h
> > +++ b/hw/fw_cfg.h
> > @@ -30,7 +30,9 @@
> >  
> >  #define FW_CFG_FILE_FIRST   0x20
> >  #define FW_CFG_FILE_SLOTS   0x10
> > -#define FW_CFG_MAX_ENTRY(FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
> > +#define FW_CFG_FILE_LAST_SLOT   (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
> > +#define FW_CFG_BOOTINDEX(FW_CFG_FILE_LAST_SLOT + 1)
> > +#define FW_CFG_MAX_ENTRYFW_CFG_BOOTINDEX 
> >  
> >  #define FW_CFG_WRITE_CHANNEL0x4000
> >  #define FW_CFG_ARCH_LOCAL   0x8000
> > diff --git a/sysemu.h b/sysemu.h
> > index c42f33a..38a20a3 100644
> > --- a/sysemu.h
> > +++ b/sysemu.h
> > @@ -196,4 +196,5 @@ void register_devices(void);
> >  
> >  void add_boot_device_path(int32_t bootindex, DeviceState *dev,
> >const char *suffix);
> > +char *get_boot_devices_list(uint32_t *size);
> >  #endif
> > diff --git a/vl.c b/vl.c
> > index 918d988..cca1e76 100644
> > --- a/vl.c
> > +++ b/vl.c
> > @@ -735,6 +735,57 @@ void add_boot_device_path(int32_t bootindex, 
> > DeviceState *dev,
> >  QTAILQ_INSERT_TAIL(&fw_boot_order, node, link);
> >  }
> >  
> > +/*
> > + * This function returns device list as an array in a below format:
> > + * +-+-+---+-+---+--
> > + * |  n  |  l1 |   devpath1|  l2 |  devpath2 | ...
> > + * +-+-+---+-+---+--
> 
> No one will ever want > 256 devices? Let's make it 4 byte or something.
> 
More then 256 _boot_ devices. I think this is more then reasonable.

> > + * where:
> > + *   n - a number of devise pathes (one byte)
> > + *   l - length of following device path string (one byte)
> 
> Might not fit: with pci we can have 256 nested buses.
Theoretically. But will it practically happen?

> How about simply null-terminating each path?
> 
Will be harder for Seabios. I can use more then one byte for length, but
I tried to avoid endianess handling.

> > + *   devpath - non-null terminated string of length l representing
> > + * one device path
> > + */
> 
> Document return value + parameters as well?
> 
Return value is documented above :)

> > +char *get_boot_devices_list(uint32_t *size)
> > +{
> > +FWBootEntry *i;
> > +uint32_t total = 1, c = 0;
> > +char *list = qemu_malloc(1);
> > +
> > +QTAILQ_FOREACH(i, &fw_boot_order, link) {
> > +char *devpath = NULL, *bootpath;
> > +int len;
> > +
> > +if (i->dev) {
> > +devpath = qdev_get_fw_dev_path(i->dev);
> > +assert(devpath);
> > +}
> > +
> > +if (i->suffix && devpath) {
> > +bootpath = qemu_malloc(strlen(devpath) + strlen(i->suffix) + 
> > 2);
> > +sprintf(bootpath, "%s/%s", devpath, i->suffix);
> > +qemu_free(devpath);
> 
> devpath is allocated with strdup, not qemu_malloc,
> so I guess it should be freed with free?
There is code that do it like this all over qemu.

> Alternatively, let's add qemu_strdup
> Might be a good idea: fix error handling here and elsewhere.
> 
> > +} else if (devpath) {
> > +bootpath = devpath;
> > +} else {
>

Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-14 Thread Blue Swirl
On Sun, Nov 14, 2010 at 3:39 PM, Gleb Natapov  wrote:
>
> Signed-off-by: Gleb Natapov 
> ---
>  hw/fw_cfg.c |   14 ++
>  hw/fw_cfg.h |    4 +++-
>  sysemu.h    |    1 +
>  vl.c        |   51 +++
>  4 files changed, 69 insertions(+), 1 deletions(-)
>
> diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
> index 7b9434f..f6a67db 100644
> --- a/hw/fw_cfg.c
> +++ b/hw/fw_cfg.c
> @@ -53,6 +53,7 @@ struct FWCfgState {
>     FWCfgFiles *files;
>     uint16_t cur_entry;
>     uint32_t cur_offset;
> +    Notifier machine_ready;
>  };
>
>  static void fw_cfg_write(FWCfgState *s, uint8_t value)
> @@ -315,6 +316,15 @@ int fw_cfg_add_file(FWCfgState *s,  const char 
> *filename, uint8_t *data,
>     return 1;
>  }
>
> +static void fw_cfg_machine_ready(struct Notifier* n)
> +{
> +    uint32_t len;
> +    char *bootindex = get_boot_devices_list(&len);
> +
> +    fw_cfg_add_bytes(container_of(n, FWCfgState, machine_ready),
> +                     FW_CFG_BOOTINDEX, (uint8_t*)bootindex, len);

I started to implement this to OpenBIOS but I noticed a small issue.
First the first byte must be read to determine length. Then the read
routine will be called again to read the correct amount of bytes. This
would work, but since there is no shortage of IDs, I'd prefer a system
where one ID is used to query the length and another ID is used to
read the data, without the length byte. This is similar how command
line, initrd etc. are handled.

This would have the advantage that since fw_cfg uses little endian
format, the length value would easily scale to for example 64 bits to
support terabytes of boot device lists. ;-)
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-14 Thread Michael S. Tsirkin
On Sun, Nov 14, 2010 at 08:49:59PM +, Blue Swirl wrote:
> On Sun, Nov 14, 2010 at 3:39 PM, Gleb Natapov  wrote:
> >
> > Signed-off-by: Gleb Natapov 
> > ---
> >  hw/fw_cfg.c |   14 ++
> >  hw/fw_cfg.h |    4 +++-
> >  sysemu.h    |    1 +
> >  vl.c        |   51 +++
> >  4 files changed, 69 insertions(+), 1 deletions(-)
> >
> > diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
> > index 7b9434f..f6a67db 100644
> > --- a/hw/fw_cfg.c
> > +++ b/hw/fw_cfg.c
> > @@ -53,6 +53,7 @@ struct FWCfgState {
> >     FWCfgFiles *files;
> >     uint16_t cur_entry;
> >     uint32_t cur_offset;
> > +    Notifier machine_ready;
> >  };
> >
> >  static void fw_cfg_write(FWCfgState *s, uint8_t value)
> > @@ -315,6 +316,15 @@ int fw_cfg_add_file(FWCfgState *s,  const char 
> > *filename, uint8_t *data,
> >     return 1;
> >  }
> >
> > +static void fw_cfg_machine_ready(struct Notifier* n)
> > +{
> > +    uint32_t len;
> > +    char *bootindex = get_boot_devices_list(&len);
> > +
> > +    fw_cfg_add_bytes(container_of(n, FWCfgState, machine_ready),
> > +                     FW_CFG_BOOTINDEX, (uint8_t*)bootindex, len);
> 
> I started to implement this to OpenBIOS but I noticed a small issue.
> First the first byte must be read to determine length. Then the read
> routine will be called again to read the correct amount of bytes. This
> would work, but since there is no shortage of IDs, I'd prefer a system
> where one ID is used to query the length and another ID is used to
> read the data, without the length byte. This is similar how command
> line, initrd etc. are handled.
> 
> This would have the advantage that since fw_cfg uses little endian
> format, the length value would easily scale to for example 64 bits to
> support terabytes of boot device lists. ;-)

Yea. Let's just print # of devices as a property, in ASCII.
No endian-ness, no nothing.
Also - can we just NULL-terminate each ID?
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-14 Thread Michael S. Tsirkin
On Sun, Nov 14, 2010 at 08:52:37PM +0200, Gleb Natapov wrote:
> On Sun, Nov 14, 2010 at 08:41:37PM +0200, Michael S. Tsirkin wrote:
> > On Sun, Nov 14, 2010 at 05:39:41PM +0200, Gleb Natapov wrote:
> > > 
> > > Signed-off-by: Gleb Natapov 
> > > ---
> > >  hw/fw_cfg.c |   14 ++
> > >  hw/fw_cfg.h |4 +++-
> > >  sysemu.h|1 +
> > >  vl.c|   51 +++
> > >  4 files changed, 69 insertions(+), 1 deletions(-)
> > > 
> > > diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
> > > index 7b9434f..f6a67db 100644
> > > --- a/hw/fw_cfg.c
> > > +++ b/hw/fw_cfg.c
> > > @@ -53,6 +53,7 @@ struct FWCfgState {
> > >  FWCfgFiles *files;
> > >  uint16_t cur_entry;
> > >  uint32_t cur_offset;
> > > +Notifier machine_ready;
> > >  };
> > >  
> > >  static void fw_cfg_write(FWCfgState *s, uint8_t value)
> > > @@ -315,6 +316,15 @@ int fw_cfg_add_file(FWCfgState *s,  const char 
> > > *filename, uint8_t *data,
> > >  return 1;
> > >  }
> > >  
> > > +static void fw_cfg_machine_ready(struct Notifier* n)
> > > +{
> > > +uint32_t len;
> > > +char *bootindex = get_boot_devices_list(&len);
> > > +
> > > +fw_cfg_add_bytes(container_of(n, FWCfgState, machine_ready),
> > > + FW_CFG_BOOTINDEX, (uint8_t*)bootindex, len);
> > > +}
> > > +
> > >  FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
> > >  target_phys_addr_t ctl_addr, target_phys_addr_t 
> > > data_addr)
> > >  {
> > > @@ -343,6 +353,10 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t 
> > > data_port,
> > >  fw_cfg_add_i16(s, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
> > >  fw_cfg_add_i16(s, FW_CFG_BOOT_MENU, (uint16_t)boot_menu);
> > >  
> > > +
> > > +s->machine_ready.notify = fw_cfg_machine_ready;
> > > +qemu_add_machine_init_done_notifier(&s->machine_ready);
> > > +
> > >  return s;
> > >  }
> > >  
> > > diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
> > > index 856bf91..4d61410 100644
> > > --- a/hw/fw_cfg.h
> > > +++ b/hw/fw_cfg.h
> > > @@ -30,7 +30,9 @@
> > >  
> > >  #define FW_CFG_FILE_FIRST   0x20
> > >  #define FW_CFG_FILE_SLOTS   0x10
> > > -#define FW_CFG_MAX_ENTRY(FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
> > > +#define FW_CFG_FILE_LAST_SLOT   (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
> > > +#define FW_CFG_BOOTINDEX(FW_CFG_FILE_LAST_SLOT + 1)
> > > +#define FW_CFG_MAX_ENTRYFW_CFG_BOOTINDEX 
> > >  
> > >  #define FW_CFG_WRITE_CHANNEL0x4000
> > >  #define FW_CFG_ARCH_LOCAL   0x8000
> > > diff --git a/sysemu.h b/sysemu.h
> > > index c42f33a..38a20a3 100644
> > > --- a/sysemu.h
> > > +++ b/sysemu.h
> > > @@ -196,4 +196,5 @@ void register_devices(void);
> > >  
> > >  void add_boot_device_path(int32_t bootindex, DeviceState *dev,
> > >const char *suffix);
> > > +char *get_boot_devices_list(uint32_t *size);
> > >  #endif
> > > diff --git a/vl.c b/vl.c
> > > index 918d988..cca1e76 100644
> > > --- a/vl.c
> > > +++ b/vl.c
> > > @@ -735,6 +735,57 @@ void add_boot_device_path(int32_t bootindex, 
> > > DeviceState *dev,
> > >  QTAILQ_INSERT_TAIL(&fw_boot_order, node, link);
> > >  }
> > >  
> > > +/*
> > > + * This function returns device list as an array in a below format:
> > > + * +-+-+---+-+---+--
> > > + * |  n  |  l1 |   devpath1|  l2 |  devpath2 | ...
> > > + * +-+-+---+-+---+--
> > 
> > No one will ever want > 256 devices? Let's make it 4 byte or something.
> > 
> More then 256 _boot_ devices. I think this is more then reasonable.
> 
> > > + * where:
> > > + *   n - a number of devise pathes (one byte)
> > > + *   l - length of following device path string (one byte)
> > 
> > Might not fit: with pci we can have 256 nested buses.
> Theoretically. But will it practically happen?

Why not? It's easy to specify this on qemu command line.
You do nothing to detect this and gracefully fail either, do you?

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-14 Thread Michael S. Tsirkin
On Sun, Nov 14, 2010 at 08:52:37PM +0200, Gleb Natapov wrote:
> > > +
> > > +len = strlen(bootpath);
> > > +list = qemu_realloc(list, total + len + 1);
> > > +list[total++] = len;
> > > +memcpy(&list[total], bootpath, len);
> > > +total += len;
> > > +c++;
> > > +qemu_free(bootpath);
> > 
> > Man, is this tricky.
> > 
> Nah, not at all.

I think it will be easier if we don't try to do this in one pass.
1. pass: calculate total length and # of devices
2. allocate
2. pass fill in

> --
>   Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-14 Thread Blue Swirl
On Sun, Nov 14, 2010 at 8:54 PM, Michael S. Tsirkin  wrote:
> On Sun, Nov 14, 2010 at 08:49:59PM +, Blue Swirl wrote:
>> On Sun, Nov 14, 2010 at 3:39 PM, Gleb Natapov  wrote:
>> >
>> > Signed-off-by: Gleb Natapov 
>> > ---
>> >  hw/fw_cfg.c |   14 ++
>> >  hw/fw_cfg.h |    4 +++-
>> >  sysemu.h    |    1 +
>> >  vl.c        |   51 +++
>> >  4 files changed, 69 insertions(+), 1 deletions(-)
>> >
>> > diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
>> > index 7b9434f..f6a67db 100644
>> > --- a/hw/fw_cfg.c
>> > +++ b/hw/fw_cfg.c
>> > @@ -53,6 +53,7 @@ struct FWCfgState {
>> >     FWCfgFiles *files;
>> >     uint16_t cur_entry;
>> >     uint32_t cur_offset;
>> > +    Notifier machine_ready;
>> >  };
>> >
>> >  static void fw_cfg_write(FWCfgState *s, uint8_t value)
>> > @@ -315,6 +316,15 @@ int fw_cfg_add_file(FWCfgState *s,  const char 
>> > *filename, uint8_t *data,
>> >     return 1;
>> >  }
>> >
>> > +static void fw_cfg_machine_ready(struct Notifier* n)
>> > +{
>> > +    uint32_t len;
>> > +    char *bootindex = get_boot_devices_list(&len);
>> > +
>> > +    fw_cfg_add_bytes(container_of(n, FWCfgState, machine_ready),
>> > +                     FW_CFG_BOOTINDEX, (uint8_t*)bootindex, len);
>>
>> I started to implement this to OpenBIOS but I noticed a small issue.
>> First the first byte must be read to determine length. Then the read
>> routine will be called again to read the correct amount of bytes. This
>> would work, but since there is no shortage of IDs, I'd prefer a system
>> where one ID is used to query the length and another ID is used to
>> read the data, without the length byte. This is similar how command
>> line, initrd etc. are handled.
>>
>> This would have the advantage that since fw_cfg uses little endian
>> format, the length value would easily scale to for example 64 bits to
>> support terabytes of boot device lists. ;-)
>
> Yea. Let's just print # of devices as a property, in ASCII.
> No endian-ness, no nothing.
> Also - can we just NULL-terminate each ID?

No, we should use LE numbers like other IDs. To be more specific, this
is what I meant (instead of FW_CFG_BOOTINDEX):
FW_CFG_BOOTINDEX_LEN: get LE integer length of the boot device data.
FW_CFG_BOOTINDEX_DATA: get the boot device data as NUL terminated C
strings, all strings back-to-back. The reader can determine number of
strings.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv4 13/15] Add bootindex for option roms.

2010-11-14 Thread Blue Swirl
On Sun, Nov 14, 2010 at 3:39 PM, Gleb Natapov  wrote:
> Extend -option-rom command to have additional parameter ,bootindex=.

This patch is broken:
  CCarm-softmmu/palm.o
/src/qemu/hw/palm.c: In function 'palmte_init':
/src/qemu/hw/palm.c:237: error: incompatible type for argument 1 of
'get_image_size'
/src/qemu/hw/palm.c:245: error: incompatible type for argument 1 of
'load_image_targphys'
cc1: warnings being treated as errors
/src/qemu/hw/palm.c:250: error: format '%s' expects type 'char *', but
argument 4 has type 'QEMUOptionRom'
  CCarm-softmmu/nseries.o
/src/qemu/hw/nseries.c: In function 'n8x0_init':
/src/qemu/hw/nseries.c:1346: error: incompatible type for argument 1
of 'load_image_targphys'
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-14 Thread Michael S. Tsirkin
On Sun, Nov 14, 2010 at 09:13:03PM +, Blue Swirl wrote:
> On Sun, Nov 14, 2010 at 8:54 PM, Michael S. Tsirkin  wrote:
> > On Sun, Nov 14, 2010 at 08:49:59PM +, Blue Swirl wrote:
> >> On Sun, Nov 14, 2010 at 3:39 PM, Gleb Natapov  wrote:
> >> >
> >> > Signed-off-by: Gleb Natapov 
> >> > ---
> >> >  hw/fw_cfg.c |   14 ++
> >> >  hw/fw_cfg.h |    4 +++-
> >> >  sysemu.h    |    1 +
> >> >  vl.c        |   51 +++
> >> >  4 files changed, 69 insertions(+), 1 deletions(-)
> >> >
> >> > diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
> >> > index 7b9434f..f6a67db 100644
> >> > --- a/hw/fw_cfg.c
> >> > +++ b/hw/fw_cfg.c
> >> > @@ -53,6 +53,7 @@ struct FWCfgState {
> >> >     FWCfgFiles *files;
> >> >     uint16_t cur_entry;
> >> >     uint32_t cur_offset;
> >> > +    Notifier machine_ready;
> >> >  };
> >> >
> >> >  static void fw_cfg_write(FWCfgState *s, uint8_t value)
> >> > @@ -315,6 +316,15 @@ int fw_cfg_add_file(FWCfgState *s,  const char 
> >> > *filename, uint8_t *data,
> >> >     return 1;
> >> >  }
> >> >
> >> > +static void fw_cfg_machine_ready(struct Notifier* n)
> >> > +{
> >> > +    uint32_t len;
> >> > +    char *bootindex = get_boot_devices_list(&len);
> >> > +
> >> > +    fw_cfg_add_bytes(container_of(n, FWCfgState, machine_ready),
> >> > +                     FW_CFG_BOOTINDEX, (uint8_t*)bootindex, len);
> >>
> >> I started to implement this to OpenBIOS but I noticed a small issue.
> >> First the first byte must be read to determine length. Then the read
> >> routine will be called again to read the correct amount of bytes. This
> >> would work, but since there is no shortage of IDs, I'd prefer a system
> >> where one ID is used to query the length and another ID is used to
> >> read the data, without the length byte. This is similar how command
> >> line, initrd etc. are handled.
> >>
> >> This would have the advantage that since fw_cfg uses little endian
> >> format, the length value would easily scale to for example 64 bits to
> >> support terabytes of boot device lists. ;-)
> >
> > Yea. Let's just print # of devices as a property, in ASCII.
> > No endian-ness, no nothing.
> > Also - can we just NULL-terminate each ID?
> 
> No, we should use LE numbers like other IDs. To be more specific, this
> is what I meant (instead of FW_CFG_BOOTINDEX):
> FW_CFG_BOOTINDEX_LEN: get LE integer length of the boot device data.
> FW_CFG_BOOTINDEX_DATA: get the boot device data as NUL terminated C
> strings, all strings back-to-back. The reader can determine number of
> strings.

Sounds good.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-14 Thread Blue Swirl
On Sun, Nov 14, 2010 at 3:39 PM, Gleb Natapov  wrote:
>
> Signed-off-by: Gleb Natapov 
> ---
>  hw/fw_cfg.c |   14 ++
>  hw/fw_cfg.h |    4 +++-
>  sysemu.h    |    1 +
>  vl.c        |   51 +++
>  4 files changed, 69 insertions(+), 1 deletions(-)
>
> diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
> index 7b9434f..f6a67db 100644
> --- a/hw/fw_cfg.c
> +++ b/hw/fw_cfg.c
> @@ -53,6 +53,7 @@ struct FWCfgState {
>     FWCfgFiles *files;
>     uint16_t cur_entry;
>     uint32_t cur_offset;
> +    Notifier machine_ready;
>  };
>
>  static void fw_cfg_write(FWCfgState *s, uint8_t value)
> @@ -315,6 +316,15 @@ int fw_cfg_add_file(FWCfgState *s,  const char 
> *filename, uint8_t *data,
>     return 1;
>  }
>
> +static void fw_cfg_machine_ready(struct Notifier* n)
> +{
> +    uint32_t len;
> +    char *bootindex = get_boot_devices_list(&len);
> +
> +    fw_cfg_add_bytes(container_of(n, FWCfgState, machine_ready),
> +                     FW_CFG_BOOTINDEX, (uint8_t*)bootindex, len);
> +}
> +
>  FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
>                         target_phys_addr_t ctl_addr, target_phys_addr_t 
> data_addr)
>  {
> @@ -343,6 +353,10 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t 
> data_port,
>     fw_cfg_add_i16(s, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
>     fw_cfg_add_i16(s, FW_CFG_BOOT_MENU, (uint16_t)boot_menu);
>
> +
> +    s->machine_ready.notify = fw_cfg_machine_ready;
> +    qemu_add_machine_init_done_notifier(&s->machine_ready);
> +
>     return s;
>  }
>
> diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
> index 856bf91..4d61410 100644
> --- a/hw/fw_cfg.h
> +++ b/hw/fw_cfg.h
> @@ -30,7 +30,9 @@
>
>  #define FW_CFG_FILE_FIRST       0x20
>  #define FW_CFG_FILE_SLOTS       0x10
> -#define FW_CFG_MAX_ENTRY        (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
> +#define FW_CFG_FILE_LAST_SLOT   (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
> +#define FW_CFG_BOOTINDEX        (FW_CFG_FILE_LAST_SLOT + 1)
> +#define FW_CFG_MAX_ENTRY        FW_CFG_BOOTINDEX

This should be
#define FW_CFG_MAX_ENTRY(FW_CFG_BOOTINDEX + 1)
because the check is like this:
if ((key & FW_CFG_ENTRY_MASK) >= FW_CFG_MAX_ENTRY) {
s->cur_entry = FW_CFG_INVALID;

With that change, I got the bootindex passed to OpenBIOS:
OpenBIOS for Sparc64
Configuration device id QEMU version 1 machine id 0
kernel cmdline
CPUs: 1 x SUNW,UltraSPARC-IIi
UUID: ----
bootindex num_strings 1
bootindex /p...@01fe/i...@5/dr...@1/d...@0

The device path does not match exactly, but it's close:
/p...@1fe,0/pci-...@5/i...@600/d...@0
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -v2] Monitor command: pfa2hva, translate guest physical address to host virtual address

2010-11-14 Thread Huang Ying
On Sun, 2010-11-14 at 19:08 +0800, Avi Kivity wrote:
> On 11/12/2010 03:16 AM, Huang Ying wrote:
> > On Thu, 2010-11-11 at 17:39 +0800, Avi Kivity wrote:
> > >  On 11/11/2010 02:56 AM, Huang Ying wrote:
> > >  >  On Thu, 2010-11-11 at 00:49 +0800, Anthony Liguori wrote:
> > >  >  >   On 11/10/2010 02:34 AM, Avi Kivity wrote:
> > >  >  >   >>   Why the gpa ->hva mapping is not
> > >  >  >   >>   consistent for RAM if -mempath is not used?
> > >  >  >   >
> > >  >  >   >   Video RAM in the range a-b and PCI mapped RAM can 
> > > change gpas
> > >  >  >   >   (while remaining in the same hva).
> > >  >  >   >
> > >  >  >   >   Even for ordinary RAM, which doesn't normally change gpa/hva, 
> > > I'm not
> > >  >  >   >   sure we want to guarantee that it won't.
> > >  >  >
> > >  >  >   We can't universally either.  Memory hot remove potentially 
> > > breaks the
> > >  >  >   mapping and some non-x86 architectures (like ARM) can alias RAM 
> > > via a
> > >  >  >   guest triggered mechanism.
> > >  >
> > >  >  Thanks for clarification. Now I think we have two options.
> > >  >
> > >  >  1) Clearly mark gpa2hva (pfa2hva now, should renamed to gpa2hva) as a
> > >  >  testing only interface, and should be used only on restricted
> > >  >  environment (normal memory, without hot-remove, maybe only on x86).
> > >  >
> > >  >  2) Find some way to lock the gpa ->   hva mapping during operating. 
> > > Such
> > >  >  as gpa2hva_begin and gpa2hva_end and lock the mapping in between.
> > >  >
> > >  >  I think 2) may be possible. But if there are no other users, why do 
> > > that
> > >  >  for a test case? So I think 1) may be the better option.
> > >  >
> > >
> > >  3) Move the poisoning code into qemu, so the command becomes
> > >
> > >  posison-address
> > >
> > >  (though physical addresses aren't stable either)
> >
> > Poisoning includes:
> >
> > a) gva ->  gpa
> > b) gpa ->  hva
> > c) hva ->  hpa
> > d) inject the MCE into host via some external tool
> >
> > poison-address need to do b, c, d. Do you think it is good to do all
> > these inside qemu?
> 
> If d is not too complicated (an ioctl?), we might to it in qemu.

The issue of d) is that there are multiple ways to inject MCE. Now one
software based, one APEI based, and maybe some others in the future.
They all use different interfaces. And as debug interface, there are not
considered kernel ABI too (some are in debugfs). So I think it is better
to use these ABI only in some test suite.

Best Regards,
Huang Ying


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] Re: [PATCH] virtio-9p: fix build on !CONFIG_UTIMENSAT v2

2010-11-14 Thread Hidetoshi Seto
(2010/11/14 14:58), Chris Wright wrote:
> * Hidetoshi Seto (seto.hideto...@jp.fujitsu.com) wrote:
>> +/*
>> + * Fallback: use utimes() instead of utimensat().
>> + * See commit 74bc02b2d2272dc88fb98d43e631eb154717f517 for known 
>> problem.
>> + */
>> +struct timeval tv[2];
>> +int i;
>> +
>> +for (i = 0; i < 2; i++) {
>> +if (times[i].tv_nsec == UTIME_OMIT || times[i].tv_nsec == 
>> UTIME_NOW) {
>> +tv[i].tv_sec = 0;
>> +tv[i].tv_usec = 0;
> 
> I don't think this is accurate in either case.  It will set the
> atime, mtime, or both to 0.
> 
> For UTIME_NOW (in both) we'd simply pass NULL to utimes(2).  For
> UTIME_OMIT (in both) we'd simply skip the call to utimes(2) altogether.
> 
> The harder part is a mixed mode (i.e. the truncate fix mentioned in the
> above commit).  I think the only way to handle UTIME_NOW in one is to
> call gettimeofday (or clock_gettime for better resolution) to find out
> what current time is.  And for UTIME_OMIT call stat to find out what the
> current setting is and reset to that value.  Both of those cases can
> possibly zero out the extra precision (providing only seconds
> resolution).

Thank you for comments!
I'll post an updated one soon.

Thanks,
H.Seto

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3] virtio-9p: fix build on !CONFIG_UTIMENSAT

2010-11-14 Thread Hidetoshi Seto
This patch introduce a fallback mechanism for old systems that do not
support utimensat().  This fix build failure with following warnings:

hw/virtio-9p-local.c: In function 'local_utimensat':
hw/virtio-9p-local.c:479: warning: implicit declaration of function 'utimensat'
hw/virtio-9p-local.c:479: warning: nested extern declaration of 'utimensat'

and:

hw/virtio-9p.c: In function 'v9fs_setattr_post_chmod':
hw/virtio-9p.c:1410: error: 'UTIME_NOW' undeclared (first use in this function)
hw/virtio-9p.c:1410: error: (Each undeclared identifier is reported only once
hw/virtio-9p.c:1410: error: for each function it appears in.)
hw/virtio-9p.c:1413: error: 'UTIME_OMIT' undeclared (first use in this function)
hw/virtio-9p.c: In function 'v9fs_wstat_post_chmod':
hw/virtio-9p.c:2905: error: 'UTIME_OMIT' undeclared (first use in this function)

v3:
  - Use better alternative handling for UTIME_NOW/OMIT
  - Move qemu_utimensat() to cutils.c
V2:
  - Introduce qemu_utimensat()

Signed-off-by: Hidetoshi Seto 
---
 cutils.c |   43 +++
 hw/virtio-9p-local.c |4 ++--
 qemu-common.h|   10 ++
 3 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/cutils.c b/cutils.c
index 536ee93..3c18941 100644
--- a/cutils.c
+++ b/cutils.c
@@ -288,3 +288,46 @@ int fcntl_setfl(int fd, int flag)
 }
 #endif
 
+int qemu_utimensat(int dirfd, const char *path, const struct timespec *times,
+   int flags)
+{
+#ifdef CONFIG_UTIMENSAT
+return utimensat(dirfd, path, times, flags);
+#else
+/* Fallback: use utimes() instead of utimensat() */
+struct timeval tv[2], tv_now;
+struct stat st;
+int i;
+
+/* happy if special cases */
+if (times[0].tv_nsec == UTIME_OMIT && times[1].tv_nsec == UTIME_OMIT) {
+return 0;
+}
+if (times[0].tv_nsec == UTIME_NOW && times[1].tv_nsec == UTIME_NOW) {
+return utimes(path, NULL);
+}
+
+/* prepare for hard cases */
+if (times[0].tv_nsec == UTIME_NOW || times[1].tv_nsec == UTIME_NOW) {
+gettimeofday(&tv_now, NULL);
+}
+if (times[0].tv_nsec == UTIME_OMIT || times[1].tv_nsec == UTIME_OMIT) {
+stat(path, &st);
+}
+
+for (i = 0; i < 2; i++) {
+if (times[i].tv_nsec == UTIME_NOW) {
+tv[i].tv_sec = tv_now.tv_sec;
+tv[i].tv_usec = 0;
+} else if (times[i].tv_nsec == UTIME_OMIT) {
+tv[i].tv_sec = (i == 0) ? st.st_atime : st.st_mtime;
+tv[i].tv_usec = 0;
+} else {
+tv[i].tv_sec = times[i].tv_sec;
+tv[i].tv_usec = times[i].tv_nsec / 1000;
+}
+}
+
+return utimes(path, &tv[0]);
+#endif
+}
diff --git a/hw/virtio-9p-local.c b/hw/virtio-9p-local.c
index 0d52020..41603ea 100644
--- a/hw/virtio-9p-local.c
+++ b/hw/virtio-9p-local.c
@@ -480,9 +480,9 @@ static int local_chown(FsContext *fs_ctx, const char *path, 
FsCred *credp)
 }
 
 static int local_utimensat(FsContext *s, const char *path,
-  const struct timespec *buf)
+   const struct timespec *buf)
 {
-return utimensat(AT_FDCWD, rpath(s, path), buf, AT_SYMLINK_NOFOLLOW);
+return qemu_utimensat(AT_FDCWD, rpath(s, path), buf, AT_SYMLINK_NOFOLLOW);
 }
 
 static int local_remove(FsContext *ctx, const char *path)
diff --git a/qemu-common.h b/qemu-common.h
index 2fbc27f..7fe4c16 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -146,6 +146,16 @@ time_t mktimegm(struct tm *tm);
 int qemu_fls(int i);
 int qemu_fdatasync(int fd);
 int fcntl_setfl(int fd, int flag);
+#ifndef CONFIG_UTIMENSAT
+#ifndef UTIME_NOW
+# define UTIME_NOW ((1l << 30) - 1l)
+#endif
+#ifndef UTIME_OMIT
+# define UTIME_OMIT((1l << 30) - 2l)
+#endif
+#endif
+int qemu_utimensat(int dirfd, const char *path, const struct timespec *times,
+int flags);
 
 /* path.c */
 void init_paths(const char *prefix);
-- 
1.7.3.1


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] virtio-9p: fix build on !CONFIG_UTIMENSAT

2010-11-14 Thread Chris Wright
* Hidetoshi Seto (seto.hideto...@jp.fujitsu.com) wrote:
> This patch introduce a fallback mechanism for old systems that do not
> support utimensat().  This fix build failure with following warnings:
> 
> hw/virtio-9p-local.c: In function 'local_utimensat':
> hw/virtio-9p-local.c:479: warning: implicit declaration of function 
> 'utimensat'
> hw/virtio-9p-local.c:479: warning: nested extern declaration of 'utimensat'
> 
> and:
> 
> hw/virtio-9p.c: In function 'v9fs_setattr_post_chmod':
> hw/virtio-9p.c:1410: error: 'UTIME_NOW' undeclared (first use in this 
> function)
> hw/virtio-9p.c:1410: error: (Each undeclared identifier is reported only once
> hw/virtio-9p.c:1410: error: for each function it appears in.)
> hw/virtio-9p.c:1413: error: 'UTIME_OMIT' undeclared (first use in this 
> function)
> hw/virtio-9p.c: In function 'v9fs_wstat_post_chmod':
> hw/virtio-9p.c:2905: error: 'UTIME_OMIT' undeclared (first use in this 
> function)
> 
> v3:
>   - Use better alternative handling for UTIME_NOW/OMIT
>   - Move qemu_utimensat() to cutils.c
> V2:
>   - Introduce qemu_utimensat()
> 
> Signed-off-by: Hidetoshi Seto 

Looks good to me (no strong opinion on the cutils vs oslib-posix that
Jes mentioned).

Acked-by: Chris Wright 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-14 Thread Kevin O'Connor
On Sun, Nov 14, 2010 at 05:39:41PM +0200, Gleb Natapov wrote:
> +/*
> + * This function returns device list as an array in a below format:
> + * +-+-+---+-+---+--
> + * |  n  |  l1 |   devpath1|  l2 |  devpath2 | ...
> + * +-+-+---+-+---+--
> + * where:
> + *   n - a number of devise pathes (one byte)
> + *   l - length of following device path string (one byte)
> + *   devpath - non-null terminated string of length l representing
> + * one device path
> + */

Why not just return a newline separated list that is null terminated?

-Kevin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 9/9] pci: Store capability offsets in PCIDevice

2010-11-14 Thread Alex Williamson
On Sat, 2010-11-13 at 23:05 +0200, Michael S. Tsirkin wrote:
> On Fri, Nov 12, 2010 at 10:47:21AM -0700, Alex Williamson wrote:
> > This not only makes pci_find_capability a directly lookup, but also
> > allows us to better track added capabilities and avoids the proliferation
> > of random additional capability offset markers.
> > 
> > Signed-off-by: Alex Williamson 
> 
> There shouldn't be any need to store offsets
> separately as find_capability gives you the value,
> and duplicating same data in two places is bad
> as we need to keep them in sync now.
> 
> We track offset to msi and msix capabilities as an optimization:
> because they are used on data path. I can't see why would
> we need to optimize any other capability like this.

That's what I figured.  I'm not going to fight for this one, but I do
like the consistency of accessing all capabilities directly instead of
having a few that are more equal than the rest.  Can the benefit of
having msix_cap or msi_cap even be measured?  In my casual observations
with device assignment, those registers don't get touched enough to
warrant a special case either.  We've got a lot bigger problems if we
can't keep a couple tables in sync between an add and delete call.
Thanks,

Alex

> > ---
> > 
> >  hw/msix.c |   15 +++
> >  hw/pci.c  |   20 ++--
> >  hw/pci.h  |5 +++--
> >  3 files changed, 28 insertions(+), 12 deletions(-)
> > 
> > diff --git a/hw/msix.c b/hw/msix.c
> > index b98b34a..060f27b 100644
> > --- a/hw/msix.c
> > +++ b/hw/msix.c
> > @@ -204,7 +204,6 @@ static int msix_add_config(struct PCIDevice *pdev, 
> > unsigned short nentries,
> >  pci_set_long(config + MSIX_PBA_OFFSET, (bar_size + 
> > MSIX_PAGE_PENDING) |
> >   bar_nr);
> >  }
> > -pdev->msix_cap = config_offset;
> >  /* Make flags bit writeable. */
> >  pdev->wmask[config_offset + MSIX_CONTROL_OFFSET] |= MSIX_ENABLE_MASK |
> > MSIX_MASKALL_MASK;
> > @@ -253,7 +252,8 @@ static void msix_clr_pending(PCIDevice *dev, int vector)
> >  
> >  static int msix_function_masked(PCIDevice *dev)
> >  {
> > -return dev->config[dev->msix_cap + MSIX_CONTROL_OFFSET] & 
> > MSIX_MASKALL_MASK;
> > +return dev->config[dev->caps[PCI_CAP_ID_MSIX] +
> > +   MSIX_CONTROL_OFFSET] & MSIX_MASKALL_MASK;
> >  }
> >  
> >  static int msix_is_masked(PCIDevice *dev, int vector)
> > @@ -275,7 +275,7 @@ static void msix_handle_mask_update(PCIDevice *dev, int 
> > vector)
> >  void msix_write_config(PCIDevice *dev, uint32_t addr,
> > uint32_t val, int len)
> >  {
> > -unsigned enable_pos = dev->msix_cap + MSIX_CONTROL_OFFSET;
> > +unsigned enable_pos = dev->caps[PCI_CAP_ID_MSIX] + MSIX_CONTROL_OFFSET;
> >  int vector;
> >  
> >  if (!range_covers_byte(addr, len, enable_pos)) {
> > @@ -334,7 +334,7 @@ static CPUReadMemoryFunc * const msix_mmio_read[] = {
> >  void msix_mmio_map(PCIDevice *d, int region_num,
> > pcibus_t addr, pcibus_t size, int type)
> >  {
> > -uint8_t *config = d->config + d->msix_cap;
> > +uint8_t *config = d->config + d->caps[PCI_CAP_ID_MSIX];
> >  uint32_t table = pci_get_long(config + MSIX_TABLE_OFFSET);
> >  uint32_t offset = table & ~(MSIX_PAGE_SIZE - 1);
> >  /* TODO: for assigned devices, we'll want to make it possible to map
> > @@ -437,7 +437,6 @@ int msix_uninit(PCIDevice *dev)
> >  if (!(dev->cap_present & QEMU_PCI_CAP_MSIX))
> >  return 0;
> >  pci_del_capability(dev, PCI_CAP_ID_MSIX, MSIX_CAP_LENGTH);
> > -dev->msix_cap = 0;
> >  msix_free_irq_entries(dev);
> >  dev->msix_entries_nr = 0;
> >  cpu_unregister_io_memory(dev->msix_mmio_index);
> > @@ -493,7 +492,7 @@ int msix_present(PCIDevice *dev)
> >  int msix_enabled(PCIDevice *dev)
> >  {
> >  return (dev->cap_present & QEMU_PCI_CAP_MSIX) &&
> > -(dev->config[dev->msix_cap + MSIX_CONTROL_OFFSET] &
> > +(dev->config[dev->caps[PCI_CAP_ID_MSIX] + MSIX_CONTROL_OFFSET] &
> >   MSIX_ENABLE_MASK);
> >  }
> >  
> > @@ -534,8 +533,8 @@ void msix_reset(PCIDevice *dev)
> >  if (!(dev->cap_present & QEMU_PCI_CAP_MSIX))
> >  return;
> >  msix_free_irq_entries(dev);
> > -dev->config[dev->msix_cap + MSIX_CONTROL_OFFSET] &=
> > -   ~dev->wmask[dev->msix_cap + MSIX_CONTROL_OFFSET];
> > +dev->config[dev->caps[PCI_CAP_ID_MSIX] + MSIX_CONTROL_OFFSET] &=
> > +   ~dev->wmask[dev->caps[PCI_CAP_ID_MSIX] + MSIX_CONTROL_OFFSET];
> >  memset(dev->msix_table_page, 0, MSIX_PAGE_SIZE);
> >  msix_mask_all(dev, dev->msix_entries_nr);
> >  }
> > diff --git a/hw/pci.c b/hw/pci.c
> > index bc25be7..773afa5 100644
> > --- a/hw/pci.c
> > +++ b/hw/pci.c
> > @@ -1990,15 +1990,24 @@ int pci_add_capability_at_offset(PCIDevice *pdev, 
> > uint8_t cap_id,
> >  {
> >  uint8_t i, *config = pdev->config + offset;
> >  
> > +/* Check overlap with existing capabilities, valid cap, already added 

Re: Why so many vm exits caused by ept violation

2010-11-14 Thread lidong chen
i use the virtio dirvier of rhle6, the guest os version is 2.6.16.

the vm exits which caused by io instruction and apic access reduce a lot.
the total number of vm exits decreased by 50% :)

but the ept violation seem increase on cpu 0, cpu 1, cpu 2, cpu3.

(KVM)10 times on cpu 2,
0, 1989
1, 9937
7, 7951
12, 2709
28, 20
30, 18536
44, 29058
48, 29800

and not all cpu have this problem: cpu4,cpu5,cpu6,cpu7
10 times on cpu 5,
0, 13934
1, 35955
7, 26
12, 3841
28, 11224
30, 29389
44, 5631

I bind vcpu like this:
  virsh vcpupin brd1vm4 0 7
virsh vcpupin brd3vm4 0 0
virsh vcpupin brd3vm4 1 4
virsh vcpupin brd5vm4 0 1
virsh vcpupin brd5vm4 1 5
virsh vcpupin brd9vm4 0 2
virsh vcpupin brd9vm4 1 6
virsh vcpupin brd11vm4 0 3
virsh vcpupin brd11vm4 1 7

the /proc/interrupt of guest os is below:
   CPU0   CPU1
  0:1896802  0IO-APIC-edge  timer
  1:  8  0IO-APIC-edge  i8042
  4: 14  0IO-APIC-edge  serial
  8:  0  0IO-APIC-edge  rtc
  9:  0  0   IO-APIC-level  acpi
 10:  0  0   IO-APIC-level  virtio1, virtio2, virtio5
 11:  1  0   IO-APIC-level  virtio0, virtio3, virtio4
 12:104  0IO-APIC-edge  i8042
 50:  1  0   PCI-MSI-X  virtio2-output
 58:  0  0   PCI-MSI-X  virtio3-config
 66:2046985  0   PCI-MSI-X  virtio3-input
 74:  2  0   PCI-MSI-X  virtio3-output
 82:  0  0   PCI-MSI-X  virtio4-config
 90:217  0   PCI-MSI-X  virtio4-input
 98:  0  0   PCI-MSI-X  virtio4-output
177:  0  0   PCI-MSI-X  virtio0-config
185: 341831  0   PCI-MSI-X  virtio0-input
193:  1  0   PCI-MSI-X  virtio0-output
201:  0  0   PCI-MSI-X  virtio1-config
209: 188747  0   PCI-MSI-X  virtio1-input
217:  1  0   PCI-MSI-X  virtio1-output
225:  0  0   PCI-MSI-X  virtio2-config
233:2204149  0   PCI-MSI-X  virtio2-input
NMI:14557671426226
LOC:18960991896637
ERR:  0
MIS:  0

the application on all vcpu is the same. so i think the new pv driver
caused ept violation .


2010/11/10 Avi Kivity :
> On 11/10/2010 09:09 AM, lidong chen wrote:
>>
>> after slove the ept violation problem, i found  the vm exits times is
>> still more than xen.
>>
>> and i found most of them is caused by io instruction. then i stat the
>> port number, most of them is caused by virio_net.
>
> This is also solved by the msi-capable virtio driver.
>
> --
> error compiling committee.c: too many arguments to function
>
>
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 5/5] KVM: MMU: retry #PF for softmmu

2010-11-14 Thread Xiao Guangrong
On 11/14/2010 06:46 PM, Avi Kivity wrote:
> On 11/12/2010 08:50 AM, Xiao Guangrong wrote:
>> Retry #PF for softmmu only when the current vcpu has the same
>> root shadow page as the time when #PF occurs. it means they
>> have same paging environment
>>
> 

Hi Avi,

Thanks for your review.

> The process could have been killed and replaced by another using the
> same cr3.  

Yeah, this 'retry' is unnecessary if the process is killed, but this
case is infrequent, the most case is the process keeps running and try
to access the fault address later. 

And, we can get few advantages even if the process have been killed,
since we can fix the page mapping for the other processes which have
the same CR3, if other process accessed the fault address, the #PF
can be avoid. (of course we can't speculate other process can access
the fault address later)

After all, this is a speculate path, i thinks it can work well in most
case. :-)

> Or we may be running a guest that uses the same cr3 for all
> processes.  

We can allow to retry #PF in the same CR3 even if there are the different
processes, since these processes have the same page mapping, the later #PF
can avoid if the page mapping have been fixed.

> Or another thread may have mmap()ed something else over the
> same address. 

The mmap virtual address is also visible for other threads since the threads
have the same page table, so i think this case is the same as above?




--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/4] KVM: MMU: don't drop spte if overwrite it from W to RO

2010-11-14 Thread Xiao Guangrong
On 11/14/2010 06:52 PM, Avi Kivity wrote:
> On 11/12/2010 12:30 PM, Xiao Guangrong wrote:
>> We just need flush tlb if overwrite a writable spte with a read-only one
>>
> 
> What are the advantages?  Avoid playing with rmap, and avoid a window
> where the spte is missing?
> 

Both, but only the first was in my mind when i'm making the patch :-)
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] KVM: MMU: notrap it if gpte's reserved is set

2010-11-14 Thread Xiao Guangrong
On 11/14/2010 06:56 PM, Avi Kivity wrote:
> On 11/12/2010 12:34 PM, Xiao Guangrong wrote:
>> We can past the page fault to guest directly if gpte's reserved
>> is set
>>
> 
> How can that work? shadow_notrap_nonpresent_pte causes a fault with
> PFEC.P=PFEC.RSVD=0, while we need PFEC.P=PFEC.RSVD=1.
> 

Ah, i missed it for a long time, thanks for you point it out.

The same mistake is in 'prefetch' path, i'll fix it in the v2 version.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Why so many vm exits caused by ept violation

2010-11-14 Thread lidong chen
the gpa caused ept violation is below:

most of them is F202.(4060217344)

error gpa is 4060217344
error gpa is 4060217344
error gpa is 4060217344
error gpa is 4060217344
error gpa is 4060217344
error gpa is 4060217344
error gpa is 4060217344
error gpa is 4060217344
error gpa is 4060217344
error gpa is 4060217344

other is F20A(4060741632).

error gpa is 4060741632
error gpa is 4060741632


2010/11/15, lidong chen :
> i use the virtio dirvier of rhle6, the guest os version is 2.6.16.
>
> the vm exits which caused by io instruction and apic access reduce a lot.
> the total number of vm exits decreased by 50% :)
>
> but the ept violation seem increase on cpu 0, cpu 1, cpu 2, cpu3.
>
> (KVM)10 times on cpu 2,
> 0, 1989
> 1, 9937
> 7, 7951
> 12, 2709
> 28, 20
> 30, 18536
> 44, 29058
> 48, 29800
>
> and not all cpu have this problem: cpu4,cpu5,cpu6,cpu7
> 10 times on cpu 5,
> 0, 13934
> 1, 35955
> 7, 26
> 12, 3841
> 28, 11224
> 30, 29389
> 44, 5631
>
> I bind vcpu like this:
>   virsh vcpupin brd1vm4 0 7
> virsh vcpupin brd3vm4 0 0
> virsh vcpupin brd3vm4 1 4
> virsh vcpupin brd5vm4 0 1
> virsh vcpupin brd5vm4 1 5
> virsh vcpupin brd9vm4 0 2
> virsh vcpupin brd9vm4 1 6
> virsh vcpupin brd11vm4 0 3
> virsh vcpupin brd11vm4 1 7
>
> the /proc/interrupt of guest os is below:
>CPU0   CPU1
>   0:1896802  0IO-APIC-edge  timer
>   1:  8  0IO-APIC-edge  i8042
>   4: 14  0IO-APIC-edge  serial
>   8:  0  0IO-APIC-edge  rtc
>   9:  0  0   IO-APIC-level  acpi
>  10:  0  0   IO-APIC-level  virtio1, virtio2, virtio5
>  11:  1  0   IO-APIC-level  virtio0, virtio3, virtio4
>  12:104  0IO-APIC-edge  i8042
>  50:  1  0   PCI-MSI-X  virtio2-output
>  58:  0  0   PCI-MSI-X  virtio3-config
>  66:2046985  0   PCI-MSI-X  virtio3-input
>  74:  2  0   PCI-MSI-X  virtio3-output
>  82:  0  0   PCI-MSI-X  virtio4-config
>  90:217  0   PCI-MSI-X  virtio4-input
>  98:  0  0   PCI-MSI-X  virtio4-output
> 177:  0  0   PCI-MSI-X  virtio0-config
> 185: 341831  0   PCI-MSI-X  virtio0-input
> 193:  1  0   PCI-MSI-X  virtio0-output
> 201:  0  0   PCI-MSI-X  virtio1-config
> 209: 188747  0   PCI-MSI-X  virtio1-input
> 217:  1  0   PCI-MSI-X  virtio1-output
> 225:  0  0   PCI-MSI-X  virtio2-config
> 233:2204149  0   PCI-MSI-X  virtio2-input
> NMI:14557671426226
> LOC:18960991896637
> ERR:  0
> MIS:  0
>
> the application on all vcpu is the same. so i think the new pv driver
> caused ept violation .
>
>
> 2010/11/10 Avi Kivity :
>> On 11/10/2010 09:09 AM, lidong chen wrote:
>>>
>>> after slove the ept violation problem, i found  the vm exits times is
>>> still more than xen.
>>>
>>> and i found most of them is caused by io instruction. then i stat the
>>> port number, most of them is caused by virio_net.
>>
>> This is also solved by the msi-capable virtio driver.
>>
>> --
>> error compiling committee.c: too many arguments to function
>>
>>
>
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Why so many vm exits caused by ept violation

2010-11-14 Thread lidong chen
the address is the Region 1 of virtio_net.

why virtio_net use this address caused ept violation?

00:04.0 Ethernet controller: Unknown device 1af4:1000
Subsystem: Unknown device 1af4:0001
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
SERR- TAbort-
SERR- :
> the gpa caused ept violation is below:
>
> most of them is F202.(4060217344)
>
> error gpa is 4060217344
> error gpa is 4060217344
> error gpa is 4060217344
> error gpa is 4060217344
> error gpa is 4060217344
> error gpa is 4060217344
> error gpa is 4060217344
> error gpa is 4060217344
> error gpa is 4060217344
> error gpa is 4060217344
>
> other is F20A(4060741632).
>
> error gpa is 4060741632
> error gpa is 4060741632
>
>
> 2010/11/15, lidong chen :
>> i use the virtio dirvier of rhle6, the guest os version is 2.6.16.
>>
>> the vm exits which caused by io instruction and apic access reduce a lot.
>> the total number of vm exits decreased by 50% :)
>>
>> but the ept violation seem increase on cpu 0, cpu 1, cpu 2, cpu3.
>>
>> (KVM)10 times on cpu 2,
>> 0, 1989
>> 1, 9937
>> 7, 7951
>> 12, 2709
>> 28, 20
>> 30, 18536
>> 44, 29058
>> 48, 29800
>>
>> and not all cpu have this problem: cpu4,cpu5,cpu6,cpu7
>> 10 times on cpu 5,
>> 0, 13934
>> 1, 35955
>> 7, 26
>> 12, 3841
>> 28, 11224
>> 30, 29389
>> 44, 5631
>>
>> I bind vcpu like this:
>>   virsh vcpupin brd1vm4 0 7
>> virsh vcpupin brd3vm4 0 0
>> virsh vcpupin brd3vm4 1 4
>> virsh vcpupin brd5vm4 0 1
>> virsh vcpupin brd5vm4 1 5
>> virsh vcpupin brd9vm4 0 2
>> virsh vcpupin brd9vm4 1 6
>> virsh vcpupin brd11vm4 0 3
>> virsh vcpupin brd11vm4 1 7
>>
>> the /proc/interrupt of guest os is below:
>>CPU0   CPU1
>>   0:1896802  0IO-APIC-edge  timer
>>   1:  8  0IO-APIC-edge  i8042
>>   4: 14  0IO-APIC-edge  serial
>>   8:  0  0IO-APIC-edge  rtc
>>   9:  0  0   IO-APIC-level  acpi
>>  10:  0  0   IO-APIC-level  virtio1, virtio2, virtio5
>>  11:  1  0   IO-APIC-level  virtio0, virtio3, virtio4
>>  12:104  0IO-APIC-edge  i8042
>>  50:  1  0   PCI-MSI-X  virtio2-output
>>  58:  0  0   PCI-MSI-X  virtio3-config
>>  66:2046985  0   PCI-MSI-X  virtio3-input
>>  74:  2  0   PCI-MSI-X  virtio3-output
>>  82:  0  0   PCI-MSI-X  virtio4-config
>>  90:217  0   PCI-MSI-X  virtio4-input
>>  98:  0  0   PCI-MSI-X  virtio4-output
>> 177:  0  0   PCI-MSI-X  virtio0-config
>> 185: 341831  0   PCI-MSI-X  virtio0-input
>> 193:  1  0   PCI-MSI-X  virtio0-output
>> 201:  0  0   PCI-MSI-X  virtio1-config
>> 209: 188747  0   PCI-MSI-X  virtio1-input
>> 217:  1  0   PCI-MSI-X  virtio1-output
>> 225:  0  0   PCI-MSI-X  virtio2-config
>> 233:2204149  0   PCI-MSI-X  virtio2-input
>> NMI:14557671426226
>> LOC:18960991896637
>> ERR:  0
>> MIS:  0
>>
>> the application on all vcpu is the same. so i think the new pv driver
>> caused ept violation .
>>
>>
>> 2010/11/10 Avi Kivity :
>>> On 11/10/2010 09:09 AM, lidong chen wrote:

 after slove the ept violation problem, i found  the vm exits times is
 still more than xen.

 and i found most of them is caused by io instruction. then i stat the
 port number, most of them is caused by virio_net.
>>>
>>> This is also solved by the msi-capable virtio driver.
>>>
>>> --
>>> error compiling committee.c: too many arguments to function
>>>
>>>
>>
>
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/7] KVM: assigned dev: MSI-X mask support

2010-11-14 Thread Sheng Yang
On Friday 12 November 2010 19:25:16 Michael S. Tsirkin wrote:
> On Fri, Nov 12, 2010 at 06:54:01PM +0800, Sheng Yang wrote:
> > On Friday 12 November 2010 18:47:29 Michael S. Tsirkin wrote:
> > > On Fri, Nov 12, 2010 at 06:13:48PM +0800, Sheng Yang wrote:
> > > > On Friday 12 November 2010 17:53:13 Michael S. Tsirkin wrote:
> > > > > On Thu, Nov 11, 2010 at 03:46:59PM +0800, Sheng Yang wrote:
> > > > > > This patch enable per-vector mask for assigned devices using
> > > > > > MSI-X.
> > > > > > 
> > > > > > This patch provided two new APIs: one is for guest to specific
> > > > > > device's MSI-X table address in MMIO, the other is for userspace
> > > > > > to get information about mask bit.
> > > > > > 
> > > > > > All the mask bit operation are kept in kernel, in order to
> > > > > > accelerate. Userspace shouldn't access the device MMIO directly
> > > > > > for the information, instead it should uses provided API to do
> > > > > > so.
> > > > > > 
> > > > > > Signed-off-by: Sheng Yang 
> > > > > > ---
> > > > > > 
> > > > > >  arch/x86/kvm/x86.c   |1 +
> > > > > >  include/linux/kvm.h  |   32 +
> > > > > >  include/linux/kvm_host.h |5 +
> > > > > >  virt/kvm/assigned-dev.c  |  316
> > > > > >  +- 4 files
> > 
> > changed,
> > 
> > > > 353
> > > > 
> > > > > >  insertions(+), 1 deletions(-)
> > > > > > 
> > > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > > > index f3f86b2..847f1e1 100644
> > > > > > --- a/arch/x86/kvm/x86.c
> > > > > > +++ b/arch/x86/kvm/x86.c
> > > > > > @@ -1926,6 +1926,7 @@ int kvm_dev_ioctl_check_extension(long ext)
> > > > > > 
> > > > > > case KVM_CAP_DEBUGREGS:
> > > > > > case KVM_CAP_X86_ROBUST_SINGLESTEP:
> > > > > > 
> > > > > > case KVM_CAP_XSAVE:
> > > > > > +   case KVM_CAP_MSIX_MASK:
> > > > > > r = 1;
> > > > > > break;
> > > > > > 
> > > > > > case KVM_CAP_COALESCED_MMIO:
> > > > > > diff --git a/include/linux/kvm.h b/include/linux/kvm.h
> > > > > > index 919ae53..32cd244 100644
> > > > > > --- a/include/linux/kvm.h
> > > > > > +++ b/include/linux/kvm.h
> > > > > > @@ -540,6 +540,9 @@ struct kvm_ppc_pvinfo {
> > > > > > 
> > > > > >  #endif
> > > > > >  #define KVM_CAP_PPC_GET_PVINFO 57
> > > > > >  #define KVM_CAP_PPC_IRQ_LEVEL 58
> > > > > > 
> > > > > > +#ifdef __KVM_HAVE_MSIX
> > > > > > +#define KVM_CAP_MSIX_MASK 59
> > > > > > +#endif
> > > > > > 
> > > > > >  #ifdef KVM_CAP_IRQ_ROUTING
> > > > > > 
> > > > > > @@ -671,6 +674,9 @@ struct kvm_clock_data {
> > > > > > 
> > > > > >  #define KVM_XEN_HVM_CONFIG_IOW(KVMIO,  0x7a, struct
> > > > > >  kvm_xen_hvm_config) #define KVM_SET_CLOCK
> > > > > >  _IOW(KVMIO, 0x7b, struct kvm_clock_data) #define KVM_GET_CLOCK
> > > > > >  _IOR(KVMIO,  0x7c, struct kvm_clock_data)
> > > > > > 
> > > > > > +/* Available with KVM_CAP_MSIX_MASK */
> > > > > > +#define KVM_GET_MSIX_ENTRY_IOWR(KVMIO,  0x7d, struct
> > > > > > kvm_msix_entry) +#define KVM_UPDATE_MSIX_MMIO  _IOW(KVMIO, 
> > > > > > 0x7e, struct kvm_msix_mmio)
> > > > > > 
> > > > > >  /* Available with KVM_CAP_PIT_STATE2 */
> > > > > >  #define KVM_GET_PIT2  _IOR(KVMIO,  0x9f, struct
> > > > > >  kvm_pit_state2) #define KVM_SET_PIT2  _IOW(KVMIO, 
> > > > > >  0xa0, struct kvm_pit_state2)
> > > > > > 
> > > > > > @@ -794,4 +800,30 @@ struct kvm_assigned_msix_entry {
> > > > > > 
> > > > > > __u16 padding[3];
> > > > > >  
> > > > > >  };
> > > > > > 
> > > > > > +#define KVM_MSIX_TYPE_ASSIGNED_DEV 1
> > > > > > +
> > > > > > +#define KVM_MSIX_FLAG_MASKBIT  (1 << 0)
> > > > > > +#define KVM_MSIX_FLAG_QUERY_MASKBIT(1 << 0)
> > > > > > +
> > > > > > +struct kvm_msix_entry {
> > > > > > +   __u32 id;
> > > > > > +   __u32 type;
> > > > > > +   __u32 entry; /* The index of entry in the MSI-X table */
> > > > > > +   __u32 flags;
> > > > > > +   __u32 query_flags;
> > > > > > +   __u32 reserved[5];
> > > > > > +};
> > > > > > +
> > > > > > +#define KVM_MSIX_MMIO_FLAG_REGISTER(1 << 0)
> > > > > > +#define KVM_MSIX_MMIO_FLAG_UNREGISTER  (1 << 1)
> > > > > > +
> > > > > > +struct kvm_msix_mmio {
> > > > > > +   __u32 id;
> > > > > > +   __u32 type;
> > > > > > +   __u64 base_addr;
> > > > > > +   __u32 max_entries_nr;
> > > > > > +   __u32 flags;
> > > > > > +   __u32 reserved[6];
> > > > > > +};
> > > > > > +
> > > > > > 
> > > > > >  #endif /* __LINUX_KVM_H */
> > > > > > 
> > > > > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > > > > index 4b31539..9d49074 100644
> > > > > > --- a/include/linux/kvm_host.h
> > > > > > +++ b/include/linux/kvm_host.h
> > > > > > @@ -445,6 +445,7 @@ struct kvm_guest_msix_entry {
> > > > > > 
> > > > > >  };
> > > > > >  
> > > > > >  #define KVM_ASSIGNED_ENABLED_IOMMU (1 << 0)
> > > > > > 
> > > > > > +#define KVM_ASSIGNED_ENABLED_MSIX_MMIO (1 << 1)
> > > > > > 
> > > > > >  struct kvm_ass

Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-14 Thread Gleb Natapov
On Sun, Nov 14, 2010 at 10:40:33PM -0500, Kevin O'Connor wrote:
> On Sun, Nov 14, 2010 at 05:39:41PM +0200, Gleb Natapov wrote:
> > +/*
> > + * This function returns device list as an array in a below format:
> > + * +-+-+---+-+---+--
> > + * |  n  |  l1 |   devpath1|  l2 |  devpath2 | ...
> > + * +-+-+---+-+---+--
> > + * where:
> > + *   n - a number of devise pathes (one byte)
> > + *   l - length of following device path string (one byte)
> > + *   devpath - non-null terminated string of length l representing
> > + * one device path
> > + */
> 
> Why not just return a newline separated list that is null terminated?
> 
Doing it like this will needlessly complicate firmware side. How do you
know how much memory to allocate before reading device list? Doing it
like Blue suggest (have BOOTINDEX_LEN and BOOTINDEX_STRING) solves this.
To create nice array from bootindex string you firmware will still have
to do additional pass on it though. With format like above the code
would look like that:

qemu_cfg_read(&n, 1);
arr = alloc(n);
for (i=0; ihttp://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/7] KVM: assigned dev: MSI-X mask support

2010-11-14 Thread Michael S. Tsirkin
On Mon, Nov 15, 2010 at 03:37:21PM +0800, Sheng Yang wrote:
> > > We can back to them if there is someone really did it in that way. But
> > > for all hypervisors using QEmu, I think we haven't seen such kind of
> > > behavior yet.
> > 
> > I would rather stick to the spec than go figure out what do BSD/Sun/Mac do,
> > or will do.
> 
> Sure, but no hurry for that. It doesn't similar to the API case, so we can 
> achieve 
> it incrementally.

Isn't the proposed way to solve this to move vector address/data
handling into kernel too? If yes it does affect the API.

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/7] KVM: assigned dev: MSI-X mask support

2010-11-14 Thread Sheng Yang
On Monday 15 November 2010 15:42:50 Michael S. Tsirkin wrote:
> On Mon, Nov 15, 2010 at 03:37:21PM +0800, Sheng Yang wrote:
> > > > We can back to them if there is someone really did it in that way.
> > > > But for all hypervisors using QEmu, I think we haven't seen such
> > > > kind of behavior yet.
> > > 
> > > I would rather stick to the spec than go figure out what do BSD/Sun/Mac
> > > do, or will do.
> > 
> > Sure, but no hurry for that. It doesn't similar to the API case, so we
> > can achieve it incrementally.
> 
> Isn't the proposed way to solve this to move vector address/data
> handling into kernel too? If yes it does affect the API.

It didn't afffect the API used by this patch. So the code can still be modified 
after later.

--
regards
Yang, Sheng
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-14 Thread Michael S. Tsirkin
On Mon, Nov 15, 2010 at 09:40:08AM +0200, Gleb Natapov wrote:
> On Sun, Nov 14, 2010 at 10:40:33PM -0500, Kevin O'Connor wrote:
> > On Sun, Nov 14, 2010 at 05:39:41PM +0200, Gleb Natapov wrote:
> > > +/*
> > > + * This function returns device list as an array in a below format:
> > > + * +-+-+---+-+---+--
> > > + * |  n  |  l1 |   devpath1|  l2 |  devpath2 | ...
> > > + * +-+-+---+-+---+--
> > > + * where:
> > > + *   n - a number of devise pathes (one byte)
> > > + *   l - length of following device path string (one byte)
> > > + *   devpath - non-null terminated string of length l representing
> > > + * one device path
> > > + */
> > 
> > Why not just return a newline separated list that is null terminated?
> > 
> Doing it like this will needlessly complicate firmware side. How do you
> know how much memory to allocate before reading device list?

Do a memory scan, count newlines until you reach 0?

> Doing it
> like Blue suggest (have BOOTINDEX_LEN and BOOTINDEX_STRING) solves this.
> To create nice array from bootindex string you firmware will still have
> to do additional pass on it though.

Why is this a problem? Pass over memory is cheap, isn't it?

> With format like above the code
> would look like that:
> 
> qemu_cfg_read(&n, 1);
> arr = alloc(n);
> for (i=0; i  qemu_cfg_read(&l, 1);
>  arr[i] = zalloc(l+1);
>  qemu_cfg_read(arr[i], l);
> }
>  
> 
> --
>   Gleb.


At this point I don't care about format.
But I would like one without 1-byte-length limitations,
just so we can cover whatever pci can through at us.

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html