Re: Call for help: moving the kvm wiki

2008-10-12 Thread Avi Kivity

Tomasz Chmielewski wrote:

As you may have noticed, the kvm wiki is overrun by spammers.  In the
past I regularly cleaned up the spam, but some time ago I gave
up.


So I'm looking for a volunteer to locate a spam-free public wiki host
(candidates include wiki.kernel.org and fedorahosted.org) and transfer
the contents (minus the spam).  I don't think we need to transfer the
editing history, but the conversion should adapt to the target's wiki
syntax.


Just use captcha or a similar system with your existing wiki.



I'm not the wiki's administrator.


My experience with administering wikis is:

* if you don't use any preventive measures, your wiki will turn into a
  collection of garbage very soon, unless you spend lots of time
  monitoring changes



This has happened, even with registration.


* requiring users to register before using the wiki results in:
  - far fewer contributions from users - very big disadvantage,
  - less spam, but bots can register, so some spam will go through



I'm not worried about the registration burden, if someone is too lazy to 
register, they'll be lazy with the content as well.



* using captcha or a similar system prevents 100% of spam
  - lots of people don't like captchas very much, as often they are hard
to read


Definitely.


  - personally, on my site (http://wpkg.org) the user has to solve a simple
arithmetic problem before the changes are accepted, like:

  10 + 7 = ... [Accept button]
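
The arithmetic check described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the struct and function names are hypothetical, it is not a MoinMoin plugin, and a real wiki would also need to bind the challenge to the edit session.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical challenge type; not taken from any wiki engine. */
struct challenge {
    int a;
    int b;
};

/* Build a challenge from two small operands (caller supplies randomness). */
struct challenge challenge_make(int a, int b)
{
    struct challenge c = { a, b };
    return c;
}

/* Render the question shown next to the Accept button. */
int challenge_format(const struct challenge *c, char *out, size_t n)
{
    return snprintf(out, n, "%d + %d = ...", c->a, c->b);
}

/* Accept the edit only if the submitted text parses to the right sum. */
int challenge_check(const struct challenge *c, const char *answer)
{
    char *end;
    long v;

    assert(c != NULL && answer != NULL);
    v = strtol(answer, &end, 10);
    if (end == answer || *end != '\0')
        return 0;               /* empty input or trailing garbage */
    return v == (long)c->a + c->b;
}
```

With the operands from the example, challenge_check() accepts only the string 17 and rejects anything else, including empty input.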


So I suggest going through a list of extensions/plugins for MoinMoin 
(this is the wiki KVM uses), choosing something appropriate, and the 
spam should be gone.


Problem is, we're on an old version, with no prospect of upgrading.  
That's why I'd like to move to an existing, well maintained wiki host.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/5] kvm: qemu: Improve virtio_net recv buffer allocation scheme

2008-10-12 Thread Avi Kivity

Mark McLoughlin wrote:

From: Herbert Xu [EMAIL PROTECTED]

Currently, in order to receive large packets, the guest must allocate
max-sized packet buffers and pass them to the host. Each of these
max-sized packets occupies 20 ring entries, which means we can only
transfer a maximum of 12 packets in a single batch with a 256 entry
ring.

When receiving packets from external networks, we only receive MTU
sized packets and so the throughput observed is throttled by the
number of packets the ring can hold.

Implement the VIRTIO_NET_F_MRG_RXBUF feature to let guests know that
we can merge smaller buffers together in order to handle large packets.

This scheme allows us to be efficient in our use of ring entries
while still supporting large packets. Benchmarking using netperf from
an external machine to a guest over a 10Gb/s network shows a 100%
improvement from ~1Gb/s to ~2Gb/s. With a local host-guest benchmark
with GSO disabled on the host side, throughput was seen to increase
from 700Mb/s to 1.7Gb/s.

Based on a patch from Herbert, with the feature renamed from
datahead and some re-factoring for readability.
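
The batch limit quoted in the commit message is plain integer division, which a short C sketch makes explicit. The ring size (256) and the per-packet cost (20 entries) come from the text above; the one-entry-per-packet case is only an assumption about how an MTU-sized packet might fit into a single merged buffer, and the function name is hypothetical.

```c
#include <assert.h>

/* Packets that fit in one batch when every packet consumes a fixed
 * number of ring entries (integer division discards the remainder). */
unsigned packets_per_batch(unsigned ring_entries, unsigned entries_per_packet)
{
    assert(entries_per_packet > 0);
    return ring_entries / entries_per_packet;
}
```

packets_per_batch(256, 20) gives the 12 packets mentioned above; under the assumed one entry per packet, the same ring would hold 256.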


diff --git a/qemu/hw/virtio-net.c b/qemu/hw/virtio-net.c
index 403247b..afa5fe5 100644
--- a/qemu/hw/virtio-net.c
+++ b/qemu/hw/virtio-net.c
@@ -34,9 +34,13 @@
 #define VIRTIO_NET_F_HOST_TSO6 12  /* Host can handle TSOv6 in. */
 #define VIRTIO_NET_F_HOST_ECN  13  /* Host can handle TSO[6] w/ ECN in. */
 #define VIRTIO_NET_F_HOST_UFO  14  /* Host can handle UFO in. */
+#define VIRTIO_NET_F_MRG_RXBUF 15  /* Host can merge receive buffers. */
 
  


What's the status of the guest side of this feature?


 #define TX_TIMER_INTERVAL 15 /* 150 us */
 
+/* Should be the largest MAX_SKB_FRAGS supported. */

+#define VIRTIO_NET_MAX_FRAGS   18
+
  


This should be advertised by the host to the guest (or vice-versa?).  
We're embedding Linux-specific magic numbers in a guest-OS-agnostic ABI.


Preferably, there shouldn't be a limit at all.


@@ -209,7 +220,12 @@ static void virtio_net_receive(void *opaque, const uint8_t 
*buf, int size)
 if (virtqueue_pop(n-rx_vq, elem) == 0)
return;
 
-if (elem.in_num  1 || elem.in_sg[0].iov_len != sizeof(*hdr)) {

+if (n-mergeable_rx_bufs) {
+   if (elem.in_num  1 || elem.in_sg[0].iov_len  TARGET_PAGE_SIZE) {
+   fprintf(stderr, virtio-net IOV is irregular\n);
+   exit(1);
+   }
  


Again, this is burying details of the current Linux stack into the ABI.  
The Linux stack may change not to be page oriented, or maybe this won't 
fit well with how Windows views things.  Can this be made not to depend on 
the size of the iov elements?



+} else if (elem.in_num  1 || elem.in_sg[0].iov_len != sizeof(*hdr)) {
fprintf(stderr, virtio-net header not in first element\n);
exit(1);
 }
@@ -229,11 +245,49 @@ static void virtio_net_receive(void *opaque, const 
uint8_t *buf, int size)
 }
 
 /* copy in packet.  ugh */

-iov_fill(elem.in_sg[1], elem.in_num - 1,
-buf + offset, size - offset);
 
-/* signal other side */

-virtqueue_push(n-rx_vq, elem, total);
+if (n-mergeable_rx_bufs) {
+   int i = 0;
+
+   elem.in_sg[0].iov_base += sizeof(*hdr);
+   elem.in_sg[0].iov_len  -= sizeof(*hdr);
+
+   offset += iov_fill(elem.in_sg[0], elem.in_num,
+  buf + offset, size - offset);
+
+   /* signal other side */
+   virtqueue_fill(n-rx_vq, elem, total, i++);
+
+   while (offset  size) {
+   int len;
+
+   if (virtqueue_pop(n-rx_vq, elem) == 0) {
+   fprintf(stderr, virtio-net truncating packet\n);
+   exit(1);
+   }
+
+   if (elem.in_num  1 || elem.in_sg[0].iov_len  TARGET_PAGE_SIZE) {
+   fprintf(stderr, virtio-net IOV is irregular\n);
+   exit(1);
+   }
+
+   len = iov_fill(elem.in_sg[0], elem.in_num,
+  buf + offset, size - offset);
+
+   virtqueue_fill(n-rx_vq, elem, len, i++);
+
+   offset += len;
+   }
+
+   virtqueue_flush(n-rx_vq, i);
+} else {
+   iov_fill(elem.in_sg[1], elem.in_num - 1,
+buf + offset, size - offset);
+
+   /* signal other side */
+   virtqueue_push(n-rx_vq, elem, total);
+}
+
  


Can we merge the two sides of the if () so that the only difference is 
the number of times we go through the loop?


Anthony, please review this as well, my virtio-foo is pretty superficial.

--
error compiling committee.c: too many arguments to function



[ANNOUNCE] kvm-77 release

2008-10-12 Thread Avi Kivity
This release fixes the -std-vga regression which bothered those of us 
who have large or widescreen monitors (note the option is now named 
'-vga std' due to upstream qemu changes).  Other significant changes 
include better disk performance if you have a fast host storage subsystem.


Changes from kvm-76:
- merge bochs-bios-cvs
- merge qemu-svn
  - more -cpu options
  - faster disk emulation (esp. with scsi/virtio)
- improved NMI support (Jan Kiszka)
- improve 4GB memory support (Alex Williamson)
- memory alias cleanups (Glauber Costa)
- fix kvmtrace segfault (Ryota OZAKI)
- make external module compile on split source/object configs (Alexander Graf)
  - allows compiling on opensuse
- fix -std-vga regression
- fix migration failure at end of migration protocol
- map mmio pages for device assignment (Weidong Han)
- silence lapic kernel messages (Jan Kiszka)
- fix vcpu reset (Gleb Natapov)
- fix missed invlpg on EPT-enabled machines with EPT disabled (Marcelo Tosatti)
- device assignment on ia64 (Xiantao Zhang)
- memory type support on EPT (Sheng Yang)


Notes:
If you use the modules bundled with kvm-77, you can use any version
of Linux from 2.6.16 upwards.  You may also use kvm-77 userspace with
the kvm modules provided by Linux 2.6.25 or above.  Some features may
only be available in newer releases.

http://kvm.qumranet.com


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


Re: ACPI

2008-10-12 Thread Avi Kivity

jd wrote:
Hi 
 We ship some images (that kick off the install) out of the box and would like to know people's experiences and developers' opinions on ACPI. This would help us determine whether it should be enabled by default.


-- For Windows guests 
-- For Linux guests 

  


I recommend you enable ACPI for all guests.

--
error compiling committee.c: too many arguments to function



[PATCH][RFC] vmchannel a data channel between host and guest.

2008-10-12 Thread Gleb Natapov
Hello,

 Sometimes there is a need to pass various bits of information between host
and guest (mostly for management purposes such as host screen resolution
changes or runtime statistics of a guest). To do that we need some way to
pass data between host and guest. The attached patch implements vmchannel, which can
be used for this purpose. It is based on the virtio infrastructure and
supports more than one channel. The vmchannel presents itself as a PCI
device to a guest, so a guest driver is also required. The one for Linux is
attached. It uses the netlink connector to communicate with userspace.

Comments are welcome.

--
Gleb.
diff --git a/qemu/Makefile.target b/qemu/Makefile.target
index 5462092..6cf13f7 100644
--- a/qemu/Makefile.target
+++ b/qemu/Makefile.target
@@ -612,7 +612,7 @@ OBJS += rtl8139.o
 OBJS += e1000.o
 
 # virtio devices
-OBJS += virtio.o virtio-net.o virtio-blk.o virtio-balloon.o
+OBJS += virtio.o virtio-net.o virtio-blk.o virtio-balloon.o virtio-vmchannel.o
 
 OBJS += device-hotplug.o
 
diff --git a/qemu/hw/pc.c b/qemu/hw/pc.c
index 1d42aa7..e8c5531 100644
--- a/qemu/hw/pc.c
+++ b/qemu/hw/pc.c
@@ -1141,6 +1141,7 @@ static void pc_init1(ram_addr_t ram_size, int vga_ram_size,
 			drives_table[index].bdrv);
 	unit_id++;
 	}
+virtio_vmchannel_init(pci_bus);
 }
 
 if (extboot_drive != -1) {
diff --git a/qemu/hw/virtio-vmchannel.c b/qemu/hw/virtio-vmchannel.c
new file mode 100644
index 000..1ce76ec
--- /dev/null
+++ b/qemu/hw/virtio-vmchannel.c
@@ -0,0 +1,239 @@
+/*
+ * Virtio VMChannel Device
+ *
+ * Copyright RedHat, inc. 2008
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "sysemu.h"
+#include "virtio.h"
+#include "pc.h"
+#include "qemu-kvm.h"
+#include "qemu-char.h"
+#include "virtio-vmchannel.h"
+
+#define DEBUG_VMCHANNEL
+
+#ifdef DEBUG_VMCHANNEL
+#define VMCHANNEL_DPRINTF(fmt, args...) \
+	do { printf("VMCHANNEL: " fmt , ##args); } while (0)
+#else
+#define VMCHANNEL_DPRINTF(fmt, args...)
+#endif
+
+typedef struct VirtIOVMChannel {
+    VirtIODevice vdev;
+    VirtQueue *sq;
+    VirtQueue *rq;
+} VirtIOVMChannel;
+
+typedef struct VMChannel {
+    CharDriverState *hd;
+    VirtQueueElement elem;
+    uint32_t id;
+    size_t len;
+} VMChannel;
+
+typedef struct VMChannelDesc {
+    uint32_t id;
+    uint32_t len;
+} VMChannelDesc;
+
+typedef struct VMChannelCfg {
+    uint32_t count;
+    uint32_t ids[MAX_VMCHANNEL_DEVICES];
+} VMChannelCfg;
+
+static VirtIOVMChannel *vmchannel;
+
+static VMChannel vmchannel_descs[MAX_VMCHANNEL_DEVICES];
+static int vmchannel_desc_idx;
+
+static int vmchannel_can_read(void *opaque)
+{
+    VMChannel *c = opaque;
+
+    /* device not yet configured */
+    if (vmchannel->rq->vring.avail == NULL)
+        return 0;
+
+    if (!c->len) {
+        int i;
+
+        if (virtqueue_pop(vmchannel->rq, &c->elem) == 0)
+            return 0;
+
+        if (c->elem.in_num < 1 ||
+                c->elem.in_sg[0].iov_len < sizeof(VMChannelDesc)) {
+            fprintf(stderr, "vmchannel: wrong receive descriptor\n");
+            return 0;
+        }
+
+        for (i = 0; i < c->elem.in_num; i++)
+            c->len += c->elem.in_sg[i].iov_len;
+
+        c->len -= sizeof(VMChannelDesc);
+    }
+
+    return (int)c->len;
+}
+
+static void vmchannel_read(void *opaque, const uint8_t *buf, int size)
+{
+    VMChannel *c = opaque;
+    VMChannelDesc *desc;
+    int i;
+
+    VMCHANNEL_DPRINTF("read %d bytes from channel %d\n", size, c->id);
+
+    if (!c->len) {
+        fprintf(stderr, "vmchannel: trying to receive into empty descriptor\n");
+        exit(1);
+    }
+
+    if (size <= 0 || size > c->len) {
+        fprintf(stderr, "vmchannel: read size is wrong\n");
+        exit(1);
+    }
+
+    desc = (VMChannelDesc*)c->elem.in_sg[0].iov_base;
+    desc->id = c->id;
+    desc->len = size;
+
+    c->elem.in_sg[0].iov_base = desc + 1;
+    c->elem.in_sg[0].iov_len -= sizeof(VMChannelDesc);
+
+    for (i = 0; i < c->elem.in_num && size; i++) {
+        struct iovec *iov = &c->elem.in_sg[i];
+        size_t len;
+
+        len = MIN(size, iov->iov_len);
+        memcpy(iov->iov_base, buf, len);
+        size -= len;
+        buf += len;
+    }
+
+    if (size) {
+        fprintf(stderr, "vmchannel: dropping %d bytes of data\n", size);
+        exit(1);
+    }
+
+    virtqueue_push(vmchannel->rq, &c->elem, desc->len);
+    c->len = 0;
+    virtio_notify(&vmchannel->vdev, vmchannel->rq);
+}
+
+static void virtio_vmchannel_handle_recv(VirtIODevice *vdev, VirtQueue *outputq)
+{
+    if (kvm_enabled())
+        qemu_kvm_notify_work();
+}
+
+static VMChannel *vmchannel_lookup(uint32_t id)
+{
+    int i;
+
+    for (i = 0; i < vmchannel_desc_idx; i++) {
+        if (vmchannel_descs[i].id == id)
+            return &vmchannel_descs[i];
+    }
+    return NULL;
+}
+
+static void virtio_vmchannel_handle_send(VirtIODevice *vdev, VirtQueue *outputq)
+{
+

bridging a wifi interface into kvm guest possible?

2008-10-12 Thread Michael Tokarev

[cross-posted to netdev and kvm lists]
[..which failed due to wrong (old) kvm address.
 Please excuse me for the repost]

Hello!

I'm trying to set up a [virtual/guest] network of hosts to
form something like a DMZ and a gateway, but in virtual
hardware instead of real hardware.  One of the things
I tried is to run the gateway/router machine inside a
guest system too, not only all the dmz hosts (there are
some obscure historical reasons for that, don't ask ;).

Real hardware has 2 ethernet interfaces - external and
internal LAN.  In order for the gateway to run as a
guest, one has to move external interface into guest.

Since kvm does not [fully] support PCI device moving
(what's the right word for this?) from host to guest
(which is the simplest solution possible), I was
thinking about something different: bridging.  Since
bridge is already used to connect gateway host to the
LAN, why not use it for external<=>gateway link too?
The difference is that there will be no IP address on
the host on that external bridge, i.e. the host will
not participate in the IP traffic transmission, only
ethernet.

So far so good, and that setup worked on a test environment,
worked flawlessly (well.. almost -- for some reason, under
some circumstances, linux starts broadcasting certain
packets over all bridges it has.. but that's different
issue/topic).  Worked up until I tried it on production,
which is different from the test setup by the fact that
for external interface, we have an old 11Mbps wifi card,
instead of a real ethernet NIC.

And I learned the hard way that bridging does not really
work with wifi cards (it works with some, and even that
requires.. some tweaking and additional software).

I tried to set up the mac address on the guest-gateway
to be the same as the one on wifi, but that obviously
didn't help.

After browsing kernel options (unrelated to this issue),
I noticed a device called macvlan.  So I wonder if that
can be used in my case, -- just to move a wifi interface
to a guest system.

I found very little documentation about macvlan.  The
patchset that introduced it back in 2007 says that macvlan
puts the underlying device into promisc mode (which is where
a wifi driver has problems).

Or maybe there's another solution to this problem of mine (not
counting getting additional hardware for the wifi link,
which obviously will work; or replacing the wifi card
with something more advanced).

Thank you!

/mjt


[ kvm-Bugs-2138166 ] Vista guest fails to start on kvm-76

2008-10-12 Thread SourceForge.net
Bugs item #2138166, was opened at 2008-09-30 08:39
Message generated for change (Comment added) made by johnrrousseau
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2138166&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: qemu
Group: None
Status: Open
Resolution: Fixed
Priority: 5
Private: No
Submitted By: John Rousseau (johnrrousseau)
Assigned to: Nobody/Anonymous (nobody)
Summary: Vista guest fails to start on kvm-76

Initial Comment:
CPU: Intel(R) Core(TM)2 Duo CPU T7250  @ 2.00GHz
Build: kvm-76
Host kernel: 2.6.26.3-29.fc9.x86_64
Host arch: x86_64
Guest: Windows Vista Ultimate 64-bit
QEMU command: qemu-system-x86_64 -hda /home/jrr/vista-x86_64.img -m 2048M -net 
nic,vlan=0,macaddr=52:54:00:12:32:00 -net tap,vlan=0,ifname=tap0 -std-vga 
-full-screen -smp 2

I've been running this guest on this host with kvm-75 without difficulty. 
kvm-76, built the same way that kvm-75 was (and on the same machine), fails to 
start my guest. The guest window is up, but the guest fails to complete startup.

Command line output is:
kvm_create_phys_mem: File existsset_vram_mapping: cannot allocate memory: File 
exists
set_vram_mapping failed
kvm: get_dirty_pages returned -2

The last line repeats hundreds of times. 

--

Comment By: John Rousseau (johnrrousseau)
Date: 2008-10-12 08:50

Message:
I've confirmed that this issue is resolved with kvm-77.

--

Comment By: Marco Menardi (markit)
Date: 2008-10-10 08:02

Message:
I've the same issue with my XP-32 guests, I've Debian64 sid, Phenom 9550,
kernel 2.6.26-1-amd64. Everything works like a charm with kvm-75 instead
(and I've had to revert to 75, of course). Any news? Would love to have the
forthcoming kvm-77 with this blocking bug fixed.

--

Comment By: John Rousseau (johnrrousseau)
Date: 2008-10-02 20:06

Message:
kvm-2646c5.tar.gz: Worked fine
kvm-d558461.tar.gz: Failed (showed this bug)

I've never used git before, but if you teach me to fish...

I installed git, pulled the userspace and kernel trees, built kvm-75 and
kvm-76 and got the expected results, but when I did a bisect on kvm-75
(good) and kvm-76 (bad) I kept getting sparse trees that I couldn't build.
configure among other things was missing. What am I doing wrong?

Also, what should I be syncing my kernel tree to when I am bisecting the
userspace tree?

Thanks.

--

Comment By: Glauber de Oliveira Costa (glommer)
Date: 2008-10-02 12:27

Message:
Are you using git? If so, can you bisect to find out who the culprit is?

If not, I've managed to archive two strategic commits you should try:

http://glommer.net/kvm-2646c5.tar.gz  and
http://glommer.net/kvm-d558461.tar.gz

please report success or failure with them

thanks!

--

Comment By: John Rousseau (johnrrousseau)
Date: 2008-10-02 11:48

Message:
I applied the patch to kvm-76 and ran into basically the same problem. The
guest still hung during boot and I got the plume of kvm: get_dirty_pages
returned -2 errors, but the first message kvm_create_phys_mem: File
existsset_vram_mapping: cannot allocate memory:
File exists wasn't displayed.

--

Comment By: Glauber de Oliveira Costa (glommer)
Date: 2008-10-02 09:01

Message:
can you please test the patch at http://glommer.net/band-aid.patch ?

--

Comment By: Brian Jackson (iggy_cav)
Date: 2008-09-30 10:06

Message:
This was reported on the mailing list. It's a problem with sdl output. Not
specific to any guest. Until the problem is fixed, I'd suggest using vnc
output.

--



[PATCH] read UUID from qemu

2008-10-12 Thread Gleb Natapov
A similar patch was sent to the bochs devel list, but I propose applying this
patch now rather than waiting for the bochs developers to apply it and then
merging.

---

Add support for new FW configuration channel to the BIOS.
Read UUID from QEMU using this channel.

Signed-off-by: Gleb Natapov [EMAIL PROTECTED]

diff --git a/bios/rombios32.c b/bios/rombios32.c
index 921e202..a91b155 100755
--- a/bios/rombios32.c
+++ b/bios/rombios32.c
@@ -444,31 +444,51 @@ void wrmsr_smp(uint32_t index, uint64_t val)
     p->ecx = 0;
 }
 
-void uuid_probe(void)
-{
 #ifdef BX_QEMU
-    uint32_t eax, ebx, ecx, edx;
+#define QEMU_CFG_CTL_PORT 0x510
+#define QEMU_CFG_DATA_PORT 0x511
+#define QEMU_CFG_SIGNATURE  0x00
+#define QEMU_CFG_ID 0x01
+#define QEMU_CFG_UUID   0x02
+
+int qemu_cfg_port;
+
+void qemu_cfg_select(int f)
+{
+    outw(QEMU_CFG_CTL_PORT, f);
+}
 
-    // check if backdoor port exists
-    asm volatile ("outl %%eax, %%dx"
-        : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx)
-        : "a" (0x564d5868), "b" (0), "c" (0xa), "d" (0x5658));
-    if (ebx == 0x564d5868) {
-        uint32_t *uuid_ptr = (uint32_t *)bios_uuid;
-        // get uuid
-        asm volatile ("outl %%eax, %%dx"
-            : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx)
-            : "a" (0x564d5868), "c" (0x13), "d" (0x5658));
-        uuid_ptr[0] = eax;
-        uuid_ptr[1] = ebx;
-        uuid_ptr[2] = ecx;
-        uuid_ptr[3] = edx;
-    } else
+int qemu_cfg_port_probe()
+{
+    char *sig = "QEMU";
+    int i;
+
+    qemu_cfg_select(QEMU_CFG_SIGNATURE);
+
+    for (i = 0; i < 4; i++)
+        if (inb(QEMU_CFG_DATA_PORT) != sig[i])
+            return 0;
+
+    return 1;
+}
+
+void qemu_cfg_read(uint8_t *buf, int len)
+{
+    while (len--)
+        *(buf++) = inb(QEMU_CFG_DATA_PORT);
+}
 #endif
-    {
-        // UUID not set
-        memset(bios_uuid, 0, 16);
+
+void uuid_probe(void)
+{
+#ifdef BX_QEMU
+    if(qemu_cfg_port) {
+        qemu_cfg_select(QEMU_CFG_UUID);
+        qemu_cfg_read(bios_uuid, 16);
+        return;
     }
+#endif
+    memset(bios_uuid, 0, 16);
 }
 
 void cpu_probe(void)
@@ -2085,6 +2105,10 @@ void rombios32_init(void)
 
     init_smp_msrs();
 
+#ifdef BX_QEMU
+    qemu_cfg_port = qemu_cfg_port_probe();
+#endif
+
     ram_probe();
 
     cpu_probe();
--
Gleb.
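
The select-then-read protocol used by the patch can be exercised outside a BIOS with a small mock. The sketch below is an assumption-laden illustration, not the patch itself: port_outw(), port_inb() and the mock_* state are hypothetical stand-ins for real port I/O and for the QEMU side of the channel, and the probe is repeated on each call instead of being cached in a qemu_cfg_port variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QEMU_CFG_CTL_PORT   0x510
#define QEMU_CFG_DATA_PORT  0x511
#define QEMU_CFG_SIGNATURE  0x00
#define QEMU_CFG_UUID       0x02

/* Mock device state (hypothetical, stands in for the QEMU side). */
static const uint8_t mock_sig[4] = { 'Q', 'E', 'M', 'U' };
static uint8_t mock_uuid[16];
static const uint8_t *mock_item;    /* currently selected item */
static size_t mock_len, mock_pos;

void mock_set_uuid(const uint8_t uuid[16])
{
    memcpy(mock_uuid, uuid, 16);
}

/* Mock control-port write: selecting a key rewinds the read position. */
void port_outw(uint16_t port, uint16_t val)
{
    if (port != QEMU_CFG_CTL_PORT)
        return;
    mock_pos = 0;
    if (val == QEMU_CFG_SIGNATURE) {
        mock_item = mock_sig;
        mock_len = sizeof(mock_sig);
    } else if (val == QEMU_CFG_UUID) {
        mock_item = mock_uuid;
        mock_len = sizeof(mock_uuid);
    } else {
        mock_item = NULL;
        mock_len = 0;
    }
}

/* Mock data-port read: next byte of the selected item, 0 past the end. */
uint8_t port_inb(uint16_t port)
{
    if (port != QEMU_CFG_DATA_PORT || mock_pos >= mock_len)
        return 0;
    return mock_item[mock_pos++];
}

/* BIOS side: same shape as the functions added by the patch. */
void cfg_select(uint16_t key)
{
    port_outw(QEMU_CFG_CTL_PORT, key);
}

int cfg_probe(void)
{
    static const char sig[] = "QEMU";
    int i;

    cfg_select(QEMU_CFG_SIGNATURE);
    for (i = 0; i < 4; i++)
        if (port_inb(QEMU_CFG_DATA_PORT) != (uint8_t)sig[i])
            return 0;
    return 1;
}

void cfg_read(uint8_t *buf, int len)
{
    assert(buf != NULL);
    while (len--)
        *buf++ = port_inb(QEMU_CFG_DATA_PORT);
}

/* Fill uuid from the channel, or zero it when the channel is absent. */
void uuid_probe(uint8_t uuid[16])
{
    if (cfg_probe()) {
        cfg_select(QEMU_CFG_UUID);
        cfg_read(uuid, 16);
    } else {
        memset(uuid, 0, 16);
    }
}
```

The point of the mock is only to show the ordering: write the key to the control port, then read successive bytes from the data port.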


Re: [Qemu-devel] [RFC] Disk integrity in QEMU

2008-10-12 Thread Dor Laor

Avi Kivity wrote:

Chris Wright wrote:

I think it's safe to say the perf folks are concerned w/ data integrity
first, stable/reproducible results second, and raw performance third.

So seeing data cached in host was simply not what they expected.  I think
write through is sufficient.  However I think that uncached vs. wt will
show up on the radar under reproducible results (need to tune based on
cache size).  And in most overcommit scenarios memory is typically more
precious than cpu, it's unclear to me if the extra buffering is anything
other than memory overhead.  As long as it's configurable then it's
comparable and benchmarking and best practices can dictate best choice.
  


Getting good performance because we have a huge amount of free memory 
in the host is not a good benchmark.  Under most circumstances, the 
free memory will be used either for more guests, or will be given to 
the existing guests, which can utilize it more efficiently than the host.


I can see two cases where this is not true:

- using older, 32-bit guests which cannot utilize all of the cache.  I 
think Windows XP is limited to 512MB of cache, and usually doesn't 
utilize even that.  So if you have an application running on 32-bit 
Windows (or on 32-bit Linux with pae disabled), and a huge host, you 
will see a significant boost from cache=writethrough.  This is a case 
where performance can exceed native, simply because native cannot 
exploit all the resources of the host.


- if cache requirements vary in time across the different guests, and 
if some smart ballooning is not in place, having free memory on the 
host means we utilize it for whichever guest has the greatest need, so 
overall performance improves.




Another justification for O_DIRECT is that many production systems will 
use base images for their VMs.
This is mainly true for desktop virtualization, but probably also for some 
server virtualization deployments.
In these types of scenarios, we can have the whole base image chain 
opened read-only with caching by default, while the
leaf images are opened with cache=off.
Since there is an ongoing effort (both by IT and developers) to keep the 
base images as big as possible, this guarantees that
this data is best suited for caching in the host, while the private leaf 
images will be uncached.
This way we provide good performance and caching for the shared parent 
images while also promising correctness.

Actually this is what happens on mainline qemu with cache=off.

Cheers,
Dor
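
The per-image policy described above (shared base images cached and read-only, private leaf uncached when cache=off) can be written down as a small decision function. This is a sketch with hypothetical names, not qemu's block layer; on a Linux host HOST_UNCACHED would presumably map to opening the file with O_DIRECT.

```c
#include <assert.h>

enum host_cache { HOST_CACHED, HOST_UNCACHED };

struct image_policy {
    int read_only;
    enum host_cache cache;
};

/* depth 0 is the leaf image; depth >= 1 are backing (base) images. */
struct image_policy image_policy_for(int depth, int cache_off)
{
    struct image_policy p;

    assert(depth >= 0);
    if (depth > 0) {
        /* shared base image: read-only, let the host page cache hold it */
        p.read_only = 1;
        p.cache = HOST_CACHED;
    } else {
        /* private leaf image: writable, uncached only if requested */
        p.read_only = 0;
        p.cache = cache_off ? HOST_UNCACHED : HOST_CACHED;
    }
    return p;
}
```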


Re: [Qemu-devel] [RFC] Disk integrity in QEMU

2008-10-12 Thread Jamie Lokier
Dor Laor wrote:
 Actually this is what happens on mainline qemu with cache=off.

Have I understood right that cache=off on a qcow2 image only uses
O_DIRECT for the leaf image, and the chain of base images don't use
O_DIRECT?

Sometimes on a memory constrained host, where the (collective) guest
memory is nearly as big as the host memory, I'm not sure this is what
I want.

-- Jamie


Re: [Qemu-devel] [RFC] Disk integrity in QEMU

2008-10-12 Thread Jamie Lokier
Chris Wright wrote:
 Either wt or uncached (so host O_DSYNC or O_DIRECT) would suffice to get
 it through to host's storage subsystem, and I think that's been the core
 of the discussion (plus defaults, etc).

Just want to point out that the storage commitment from O_DIRECT can
be _weaker_ than O_DSYNC.

On Linux, O_DIRECT never uses storage-device barriers or
transactions, but O_DSYNC sometimes does, and fsync is even more
likely to than O_DSYNC.

I'm not certain, but I think the same applies to other host OSes too -
including Windows, which has its own equivalents to O_DSYNC and
O_DIRECT, and extra documented semantics when they are used together.

Although this is a host implementation detail, unfortunately it means
that O_DIRECT=no-cache and O_DSYNC=write-through-cache is not an
accurate characterisation.

Some might be misled into assuming that cache=off is as strongly
committing their data to hard storage as cache=wb would.

I think you can assume this only when the underlying storage devices'
write caches are disabled.  You cannot assume this if the host
filesystem uses barriers instead of disabling the storage devices'
write cache.

Unfortunately there's not a lot qemu can do about these various quirks,
but at least it should be documented, so that someone requiring
storage commitment (e.g. for a critical guest database) is advised to
investigate whether O_DIRECT and/or O_DSYNC give them what they
require with their combination of host kernel, filesystem, filesystem
options and storage device(s).

-- Jamie


[PATCH] qemu: qemu_fopen_fd: differentiate between reader and writer user

2008-10-12 Thread Uri Lublin
Currently qemu_fopen_ops accepts both get_buffer and put_buffer, but
if both are given (non-NULL) we encounter problems:
  1. There is only one buffer and index, which may mean data corruption.
  2. qemu_fflush (which is also called by qemu_fclose) is writing (flushing)
 some of the data that was read (for the reader part).

Currently qemu_fopen_fd registers both get_buffer and put_buffer functions.

This breaks migration for tcp and ssh migration protocols.

The following patch fixes the above:
  1. It makes sure that at most one of get_buffer and put_buffer is
 given to qemu_fopen_ops.
  2. It changes qemu_fopen_fd to register only get_buffer for a reader
 and only put_buffer for a writer (adding a 'reader' parameter).
  3. The incoming fd migration code calls qemu_fopen_fd as a reader only.

Signed-off-by: Uri Lublin [EMAIL PROTECTED]
---
 qemu/hw/hw.h |2 +-
 qemu/migration.c |2 +-
 qemu/vl.c|   12 ++--
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/qemu/hw/hw.h b/qemu/hw/hw.h
index c9390c1..d965c47 100644
--- a/qemu/hw/hw.h
+++ b/qemu/hw/hw.h
@@ -34,7 +34,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, QEMUFilePutBufferFunc *put_buffer,
                          QEMUFileCloseFunc *close,
                          QEMUFileRateLimit *rate_limit);
 QEMUFile *qemu_fopen(const char *filename, const char *mode);
-QEMUFile *qemu_fopen_fd(int fd);
+QEMUFile *qemu_fopen_fd(int fd, int reader);
 void qemu_fflush(QEMUFile *f);
 int qemu_fclose(QEMUFile *f);
 void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size);
diff --git a/qemu/migration.c b/qemu/migration.c
index 44cb9eb..587c67e 100644
--- a/qemu/migration.c
+++ b/qemu/migration.c
@@ -820,7 +820,7 @@ static int migrate_incoming_page(QEMUFile *f, uint32_t addr)
 static int migrate_incoming_fd(int fd)
 {
     int ret = 0;
-    QEMUFile *f = qemu_fopen_fd(fd);
+    QEMUFile *f = qemu_fopen_fd(fd, 1);
     uint32_t addr, size;
     extern void qemu_announce_self(void);
     unsigned char running;
diff --git a/qemu/vl.c b/qemu/vl.c
index 36e3bb7..1ce188b 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -6712,7 +6712,7 @@ static int fd_close(void *opaque)
     return 0;
 }
 
-QEMUFile *qemu_fopen_fd(int fd)
+QEMUFile *qemu_fopen_fd(int fd, int reader)
 {
     QEMUFileFD *s = qemu_mallocz(sizeof(QEMUFileFD));
 
@@ -6720,7 +6720,10 @@ QEMUFile *qemu_fopen_fd(int fd)
         return NULL;
 
     s->fd = fd;
-    s->file = qemu_fopen_ops(s, fd_put_buffer, fd_get_buffer, fd_close, NULL);
+    if (reader)
+        s->file = qemu_fopen_ops(s, NULL, fd_get_buffer, fd_close, NULL);
+    else
+        s->file = qemu_fopen_ops(s, fd_put_buffer, NULL, fd_close, NULL);
     return s->file;
 }
 
@@ -6826,6 +6829,11 @@ QEMUFile *qemu_fopen_ops(void *opaque, QEMUFilePutBufferFunc *put_buffer,
 {
     QEMUFile *f;
 
+    if (put_buffer && get_buffer) {
+        fprintf(stderr, "%s: only one of get_buffer and put_buffer "
+                "functions may be given\n", __FUNCTION__);
+        return NULL;
+    }
     f = qemu_mallocz(sizeof(QEMUFile));
     if (!f)
         return NULL;
-- 
1.5.5.1
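
The rule the patch enforces (a file object is either a reader or a writer, never both, because there is only one buffer and index) can be shown with a toy model that compiles on its own. All toy_* names are hypothetical and the callbacks do no real I/O; only the shape of the check mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int (*put_fn)(void *opaque, const unsigned char *buf, int size);
typedef int (*get_fn)(void *opaque, unsigned char *buf, int size);

/* Toy stand-in for a buffered file with a single buffer and index. */
struct toy_file {
    void *opaque;
    put_fn put;
    get_fn get;
};

/* Refuse to create a file that is both reader and writer: one shared
 * buffer cannot safely serve both directions. */
struct toy_file *toy_fopen_ops(void *opaque, put_fn put, get_fn get)
{
    struct toy_file *f;

    if (put && get)
        return NULL;
    f = calloc(1, sizeof(*f));
    if (!f)
        return NULL;
    f->opaque = opaque;
    f->put = put;
    f->get = get;
    return f;
}

/* Placeholder callbacks; a real backend would call write()/read(). */
int toy_put(void *opaque, const unsigned char *buf, int size)
{
    (void)opaque; (void)buf;
    return size;
}

int toy_get(void *opaque, unsigned char *buf, int size)
{
    (void)opaque; (void)buf; (void)size;
    return 0;
}

/* Register only the callback that matches the requested direction. */
struct toy_file *toy_fopen_fd(int *fd, int reader)
{
    assert(fd != NULL);
    return reader ? toy_fopen_ops(fd, NULL, toy_get)
                  : toy_fopen_ops(fd, toy_put, NULL);
}
```

A caller that passes both callbacks gets NULL back, which matches the early return the patch adds to qemu_fopen_ops.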



[ kvm-Bugs-2149609 ] Booting IA32e Windows guest meets BSOD

2008-10-12 Thread SourceForge.net
Bugs item #2149609, was opened at 2008-10-06 16:25
Message generated for change (Comment added) made by kiszka
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2149609&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Jiajun Xu (jiajun)
Assigned to: Nobody/Anonymous (nobody)
Summary: Booting IA32e Windows guest meets BSOD

Initial Comment:
With latest commit, kvm.git 77d0a44d2393f43836fd38c235dfda2de4c4630a and 
userspace.git d32aaf6d02d102855e5d8d1e23f0c55ca214e871, IA32e Windows 
guest(including Windows XP, Windows 2003, Windows Vista) fails to boot. Guest 
will meet BSOD when booting, error code 0x000A(IRQL_NOT_LESS_OR_EQUAL).

Previous commit, kvm.git a509fff8ed134115f2fd413e92a92cddc1709a5f userspace.git 
42621e776ac3a12930c3fec19c60e68e563df4cc has no such issue.

--

Comment By: Jan Kiszka (kiszka)
Date: 2008-10-12 18:32

Message:
OK, this (the NMI watchdog) has now likely become a regression in -77. I tried
reproducing it with a Windows Server 2003 R2 (64-bit), but without success.
Can you describe your scenario in more detail? Vanilla Windows
installation? Which qemu command line switches precisely? Anything else I
may need to know to reproduce?

Further tests: Does
http://permalink.gmane.org/gmane.comp.emulators.kvm.devel/22635 change the
situation? What does
http://permalink.gmane.org/gmane.comp.emulators.kvm.devel/22634 make kvm
report to the kernel log?

--

Comment By: Jiajun Xu (jiajun)
Date: 2008-10-10 16:09

Message:
We found that kernel.git cbb44eaa2d961f1eb975b52c7be1c82178b3c580 is the
first commit that introduces the issue. With kernel.git
ba8ab77ebfba9898764c39bc2f00540a5a67a1e9, Windows guests boot successfully.

--

Comment By: Jiajun Xu (jiajun)
Date: 2008-10-06 16:53

Message:
OK. I will try to find the cause. With no-kvm-irqchip or no-kvm-pit, we
did not hit this issue.

--

Comment By: Jan Kiszka (kiszka)
Date: 2008-10-06 16:38

Message:
Can you try to narrow down the patches that cause this, specifically on the
kernel side? I assume you run with the in-kernel irqchip, right?

--



Re: [PATCH] qemu: qemu_fopen_fd: differentiate between reader and writer user

2008-10-12 Thread Avi Kivity

Uri Lublin wrote:

Currently qemu_fopen_ops accepts both get_buffer and put_buffer, but
if both are given (non NULL) we encounter problems:
  1. There is only one buffer and index, which may mean data corruption.
  2. qemu_flush (which is also called by qemu_fclose) is writing (flushing)
 some of the data that was read (for the reader part).

Currently qemu_fopen_fd registers both get_buffer and put_buffer functions.

This breaks migration for tcp and ssh migration protocols.

The following patch fixes the above by:
  1. It makes sure that at most one of get_buffer and put_buffer is
 given to qemu_fopen_ops.
  2. It changes qemu_fopen_fd to register only get_buffer for a reader
 and only put_buffer for a writer (adding a 'reader' parameter).
  3. The incoming fd migration code calls qemu_fopen_fd as a reader only.

  


Anthony, this is a problem with qemu-upstream so I'd like to solve it in 
a way that's acceptable for upstream.


The proposed patch is less than ideal IMO as it introduces limitations 
on what you can do with a file.  An alternative implementation would add 
a read/write mode to the buffer, based on the last access type.  When 
switching from read to write, we drop the buffer, and when switching 
from write to read, we flush it and then drop it.  This is more complex 
but results in a cleaner API.
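
The mode-switching idea described above could look roughly like the following. This is only a sketch under assumptions: `sketch_file`, its fields, and the in-memory `backing` array are hypothetical stand-ins invented for illustration, not the real QEMUFile structure or any existing qemu function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for QEMUFile; every name is illustrative. */
enum sketch_mode { SKETCH_NONE, SKETCH_READ, SKETCH_WRITE };

struct sketch_file {
    enum sketch_mode mode;      /* direction of the last buffered access */
    unsigned char buf[8];       /* the single shared buffer */
    size_t buf_index;           /* bytes queued (write) or consumed (read) */
    size_t buf_size;            /* bytes loaded when reading, 0 otherwise */
    size_t offset;              /* buffer start when writing, end when reading */
    unsigned char backing[64];  /* in-memory "file" standing in for the fd */
    size_t len;                 /* valid bytes in backing */
};

/* Push pending written bytes to the backing store. */
static void sketch_flush(struct sketch_file *f)
{
    if (f->mode != SKETCH_WRITE || f->buf_index == 0)
        return;
    memcpy(f->backing + f->offset, f->buf, f->buf_index);
    f->offset += f->buf_index;
    if (f->offset > f->len)
        f->len = f->offset;
    f->buf_index = 0;
}

/* Switch direction: flush then drop after writes, just drop after reads. */
static void sketch_set_mode(struct sketch_file *f, enum sketch_mode mode)
{
    if (f->mode == mode)
        return;
    if (f->mode == SKETCH_WRITE)
        sketch_flush(f);
    else if (f->mode == SKETCH_READ)
        f->offset -= f->buf_size - f->buf_index;  /* give back unread bytes */
    f->buf_index = 0;
    f->buf_size = 0;
    f->mode = mode;
}

static void sketch_seek(struct sketch_file *f, size_t pos)
{
    sketch_set_mode(f, SKETCH_NONE);
    f->offset = pos;
}

static int sketch_put_byte(struct sketch_file *f, unsigned char v)
{
    sketch_set_mode(f, SKETCH_WRITE);
    if (f->offset + f->buf_index >= sizeof(f->backing))
        return -1;                      /* demo store is full */
    if (f->buf_index == sizeof(f->buf))
        sketch_flush(f);
    f->buf[f->buf_index++] = v;
    return 0;
}

static int sketch_get_byte(struct sketch_file *f)
{
    sketch_set_mode(f, SKETCH_READ);
    if (f->buf_index == f->buf_size) {  /* buffer exhausted: refill */
        size_t n = f->offset < f->len ? f->len - f->offset : 0;
        if (n > sizeof(f->buf))
            n = sizeof(f->buf);
        if (n == 0)
            return -1;                  /* end of data */
        memcpy(f->buf, f->backing + f->offset, n);
        f->offset += n;
        f->buf_size = n;
        f->buf_index = 0;
    }
    return f->buf[f->buf_index++];
}
```

In this sketch a caller never states whether the file is a reader or a writer; the direction is inferred from each access, which is the API simplification the paragraph argues for.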


--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] [RFC] Disk integrity in QEMU

2008-10-12 Thread Anthony Liguori

Jamie Lokier wrote:

Dor Laor wrote:
  

Actually this is what happens on mainline qemu with cache=off.



Have I understood right that cache=off on a qcow2 image only uses
O_DIRECT for the leaf image, and the chain of base images don't use
O_DIRECT?
  


Yeah, that's a bug IMHO and in my patch to add O_DSYNC, I fix that.  I 
think an argument for O_DIRECT in a leaf and wb in the leaf is seriously 
flawed...


Regards,

Anthony Liguori


Sometimes on a memory constrained host, where the (collective) guest
memory is nearly as big as the host memory, I'm not sure this is what
I want.

-- Jamie


  




Re: [Qemu-devel] [RFC] Disk integrity in QEMU

2008-10-12 Thread Anthony Liguori

Dor Laor wrote:

Avi Kivity wrote:

Since there is ongoing effort (both by IT and developers) to keep the 
base images as big as possible, it guarantees that
this data is best suited for caching in the host while the private 
leaf images will be uncached.


A proper CAS solution is really a much better approach.  qcow2 
deduplication is an interesting concept, but such a hack :-)


This way we provide good performance and caching for the shared parent 
images while also promising correctness.


You get correctness by using O_DSYNC.  cache=off should disable the use 
of the page cache everywhere.
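
As a rough illustration of the distinction drawn here, the open(2) flags for each cache setting might be chosen as below. This is a sketch under assumptions: the enum, its member names, and `sketch_open_flags` are hypothetical and not taken from qemu; only `O_DSYNC` and `O_DIRECT` are real flags, and `O_DIRECT` is Linux-specific, hence the `#ifdef`.

```c
#define _GNU_SOURCE  /* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <fcntl.h>

/* Hypothetical cache settings; the names are assumptions for this example. */
enum sketch_cache_mode {
    SKETCH_CACHE_WRITEBACK,     /* page cache used, writes may be delayed */
    SKETCH_CACHE_WRITETHROUGH,  /* page cache used, writes reach the disk first */
    SKETCH_CACHE_OFF            /* page cache bypassed */
};

/* Return the open(2) flags for one image; the same call would be made for
 * every image in a backing chain so the setting applies everywhere. */
static int sketch_open_flags(enum sketch_cache_mode mode, int base_flags)
{
    switch (mode) {
    case SKETCH_CACHE_WRITETHROUGH:
        return base_flags | O_DSYNC;
    case SKETCH_CACHE_OFF:
#ifdef O_DIRECT
        return base_flags | O_DIRECT;
#else
        return base_flags | O_DSYNC;  /* fallback where O_DIRECT is unavailable */
#endif
    default:
        return base_flags;
    }
}
```

Calling the helper once per image, base images included, is what makes the setting apply to the whole chain rather than to the leaf alone.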


Regards,

Anthony Liguori


Actually this is what happens on mainline qemu with cache=off.

Cheers,
Dor


Re: [PATCH] qemu: qemu_fopen_fd: differentiate between reader and writer user

2008-10-12 Thread Anthony Liguori

Avi Kivity wrote:

Uri Lublin wrote:

Currently qemu_fopen_ops accepts both get_buffer and put_buffer, but
if both are given (non NULL) we encounter problems:
  1. There is only one buffer and index, which may mean data corruption.
  2. qemu_flush (which is also called by qemu_fclose) is writing (flushing)
     some of the data that was read (for the reader part).

Currently qemu_fopen_fd registers both get_buffer and put_buffer functions.

This breaks migration for tcp and ssh migration protocols.

The following patch fixes the above by:
  1. It makes sure that at most one of get_buffer and put_buffer is
     given to qemu_fopen_ops.
  2. It changes qemu_fopen_fd to register only get_buffer for a reader
     and only put_buffer for a writer (adding a 'reader' parameter).
  3. The incoming fd migration code calls qemu_fopen_fd as a reader only.


  


Anthony, this is a problem with qemu-upstream so I'd like to solve it 
in a way that's acceptable for upstream.


The proposed patch is less than ideal IMO as it introduces limitations 
on what you can do with a file.  An alternative implementation would 
add a read/write mode to the buffer, based on the last access type.  
When switching from read to write, we drop the buffer, and when 
switching from write to read, we flush it and then drop it.  This is 
more complex but results in a cleaner API.


I would think a better solution would introduce two buffers, one for 
read and one for write.  That way, you can have a proper bidirectional 
stream.
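
A two-buffer layout of the kind suggested here might be sketched as follows. Everything in it is an assumption made for illustration: `duplex_file` and its helpers are hypothetical, and the `rx`/`tx` arrays merely stand in for the two directions of a socket.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical duplex stream; rx/tx stand in for the two socket directions. */
struct duplex_file {
    unsigned char rbuf[4];      /* read-side buffer */
    size_t rindex, rsize;
    unsigned char wbuf[4];      /* write-side buffer */
    size_t windex;
    const unsigned char *rx;    /* bytes arriving from the peer */
    size_t rx_len, rx_pos;
    unsigned char tx[32];       /* bytes delivered to the peer */
    size_t tx_len;
};

/* Flushing only ever touches the write buffer. */
static int duplex_flush(struct duplex_file *f)
{
    if (f->tx_len + f->windex > sizeof(f->tx))
        return -1;
    memcpy(f->tx + f->tx_len, f->wbuf, f->windex);
    f->tx_len += f->windex;
    f->windex = 0;
    return 0;
}

static int duplex_put_byte(struct duplex_file *f, unsigned char v)
{
    if (f->windex == sizeof(f->wbuf) && duplex_flush(f) < 0)
        return -1;
    f->wbuf[f->windex++] = v;
    return 0;
}

/* Reading only ever touches the read buffer. */
static int duplex_get_byte(struct duplex_file *f)
{
    if (f->rindex == f->rsize) {
        size_t n = f->rx_len - f->rx_pos;
        if (n > sizeof(f->rbuf))
            n = sizeof(f->rbuf);
        if (n == 0)
            return -1;          /* nothing more to read */
        memcpy(f->rbuf, f->rx + f->rx_pos, n);
        f->rx_pos += n;
        f->rsize = n;
        f->rindex = 0;
    }
    return f->rbuf[f->rindex++];
}
```

The sketch treats the two directions as independent streams, as on a socket. For a seekable file both buffers could cover the same offsets, a case this example does not handle.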


Regards,

Anthony Liguori



Re: [PATCH] qemu: qemu_fopen_fd: differentiate between reader and writer user

2008-10-12 Thread Avi Kivity
Anthony Liguori wrote:

 The proposed patch is less than ideal IMO as it introduces
 limitations on what you can do with a file.  An alternative
 implementation would add a read/write mode to the buffer, based on
 the last access type.  When switching from read to write, we drop the
 buffer, and when switching from write to read, we flush it and then
 drop it.  This is more complex but results in a cleaner API.

 I would think a better solution would introduce two buffers, one for
 read and one for write.  That way, you can have a proper bidirectional
 stream.


Complexity goes way up.  Now you need to intercept reads that go to the
write buffer, and vice versa.


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: kvm-76 fails to boot kubuntu 64bits (ok with kvm-75)

2008-10-12 Thread Xavier Gnata

John Rousseau wrote:

My guess is it's this:

https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2138166&group_id=180599



-John

Xavier Gnata wrote:

Hi,

kubuntu 64bits 8.10 beta boots without problem with kvm-75 using this 
command line:
qemu-system-x86_64 -no-quit -serial file:serial.log -hda 
intrepid.img  -boot c -m 1024 -smb qemu -soundhw es1370


It never boots with kvm-76.

Here is the serial output:

it always crashes at boot time with kvm-76:
[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Linux version 2.6.27-4-generic ([EMAIL PROTECTED]) (gcc 
version 4.3.2
(Ubuntu 4.3.2-1ubuntu7) ) #1 SMP Wed Sep 24 01:29:06 UTC 2008 (Ubuntu 
2.6.27-4.

6-generic)
[0.00] Command line: 
root=UUID=cab01a2d-f54f-4c16-be37-4555fd50a068 cons

ole=ttyS0,115200 earlyprintk=serial,ttyS0,115200 quiet splash
[0.00] KERNEL supported cpus:
[0.00]   Intel GenuineIntel
[0.00]   AMD AuthenticAMD
[0.00]   Centaur CentaurHauls
[0.00] BIOS-provided physical RAM map:
[0.00]  BIOS-e820:  - 0009fc00 (usable)
[0.00]  BIOS-e820: 0009fc00 - 000a 
(reserved)
[0.00]  BIOS-e820: 000e8000 - 0010 
(reserved)

[0.00]  BIOS-e820: 0010 - 3fff (usable)
[0.00]  BIOS-e820: 3fff - 4000 (ACPI 
data)
[0.00]  BIOS-e820: fffbd000 - 0001 
(reserved)

[0.00] console [earlyser0] enabled
[1.575835] pci :00:01.0: PIIX3: Enabling Passive Release
Loading, please wait...
Couldnt get a file descriptor referring to the console
*** glibc detected *** modprobe: realloc(): invalid next size: 
0x00ef8c4

0 ***
Aborted
*** glibc detected *** modprobe: realloc(): invalid next size: 
0x015ddc4

0 ***
Aborted
usplash: libusplash.c:289: switch_console: Assertion `(saved_vt >= 0) && (saved_vt < 10)' failed.
*** glibc detected *** modprobe: realloc(): invalid old size: 
0x00725160

***
Aborted
*** glibc detected *** modprobe: realloc(): invalid old size: 
0x01b76160

***
Aborted
*** glibc detected *** modprobe: realloc(): invalid old size: 
0x01f88160

***
Aborted
*** glibc detected *** modprobe: realloc(): invalid old size: 
0x00db4160

***
Aborted
*** glibc detected *** modprobe: realloc(): invalid old size: 
0x01d0d160

***
Aborted
*** glibc detected *** modprobe: realloc(): invalid old size: 
0x008d5160

***
Aborted
udevd[952]: parse_config_file: error parsing /etc/udev/udev.conf, 
line 1:0


udevd[952]: add_to_rules: invalid rule 
'/etc/udev/rules.d/05-options.rules:1'


udevd[952]: add_to_rules: invalid rule 
'/etc/udev/rules.d/05-options.rules:2'


udevd[952]: parse_file: line too long, rule skipped 
'/etc/udev/rules.d/20-names.rules:7'


udevd[952]: add_to_rules: invalid rule 
'/etc/udev/rules.d/40-basic-permissions.rules:7'


udevd[952]: parse_file: line too long, rule skipped 
'/etc/udev/rules.d/60-persistent-storage.rules:7'


udevd[952]: add_to_rules: invalid rule 
'/etc/udev/rules.d/61-persistent-storage-edd.rules:7'


udevd[952]: parse_file: line too long, rule skipped 
'/etc/udev/rules.d/90-modprobe.rules:13'


udevtrigger[954]: parse_config_file: error parsing 
/etc/udev/udev.conf, line 1:0


uname -a 2.6.26.5-1 #1 SMP

processor   : 0 (and 1)
vendor_id   : GenuineIntel
cpu family  : 6
model   : 15
model name  : Intel(R) Core(TM)2 Duo CPU T7500  @ 2.20GHz

Any idea?

Xavier



It is the same with kvm-77. Same failure.
Any patch to be tested?

Xavier


Re: [Qemu-devel] [RFC] Disk integrity in QEMU

2008-10-12 Thread Izik Eidus

Avi Kivity wrote:


LRU typically makes fairly bad decisions since it throws most of the
information it has away.  I recommend looking up LRU-K and similar
algorithms, just to get a feel for this; it is basically the simplest
possible algorithm short of random selection.

Note that Linux doesn't even have an LRU; it has to approximate since it
can't sample all of the pages all of the time.  With a hypervisor that
uses Intel's EPT, it's even worse since we don't have an accessed bit.
On silly benchmarks that just exercise the disk and touch no memory, and
if you tune the host very aggressively, LRU will win on long running
guests since it will eventually page out all unused guest memory (with
Linux guests, it will never even page guest memory in).  On real life
applications I don't think there is much chance.

  

But when using O_DIRECT you actually make the pages not swappable at all...
or am I wrong?
Maybe some kind of combination with the mm shrink could be good;
do_try_to_free_pages is a good point of reference.


Re: with kerenl 2.6.27, CONFIG_KVM_GUEST does not work

2008-10-12 Thread Held Bernhard
> Does the attached work for you?
>
> Avi, do you have thoughts on how to proceed with pvmmu? Using hypercalls
> instead of faults can still be beneficial (for the first write before
> page goes out of sync, or for non-leaf tables which currently don't go
> oos). But at the current state pvmmu should be slower in most loads.
> Perhaps disable it?
>
> KVM: MMU: sync root on paravirt TLB flush
>
> The pvmmu TLB flush handler should request a root sync, similarly to
> a native read-write CR3.
>
> Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 79cb4a9..7e70e97 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -2747,6 +2747,7 @@ static int kvm_pv_mmu_write(struct kvm_vcpu *vcpu,
>  static int kvm_pv_mmu_flush_tlb(struct kvm_vcpu *vcpu)
>  {
>  	kvm_x86_ops->tlb_flush(vcpu);
> +	set_bit(KVM_REQ_MMU_SYNC, &vcpu->requests);
>  	return 1;
>  }

This patch works for me (kvm-77, 2.6.27 host and guest)!

kvm-75 works fine, but kvm-76 and kvm-77 (all unpatched) show lots of
segfaults in the guest (2.6.26.5 or 2.6.27, x86_64 on host and guest).

Thanks for the patch!

HTH,
Bernhard



Re: [PATCH] qemu: qemu_fopen_fd: differentiate between reader and writer user

2008-10-12 Thread Anthony Liguori

Avi Kivity wrote:

Anthony Liguori wrote:
  

The proposed patch is less than ideal IMO as it introduces
limitations on what you can do with a file.  An alternative
implementation would add a read/write mode to the buffer, based on
the last access type.  When switching from read to write, we drop the
buffer, and when switching from write to read, we flush it and then
drop it.  This is more complex but results in a cleaner API.
  

I would think a better solution would introduce two buffers, one for
read and one for write.  That way, you can have a proper bidirectional
stream.




Complexity goes way up.  Now you need to intercept reads that go to the
write buffer, and vice versa.
  


Yeah, Uri: instead of passing an argument to qemu_fopen_ops, it may be 
better to detect the cases where we do a write and set a flag.  Then in 
the fflush() function, only do the put_buffer if the is_write flag is set.


Also, having checks in the read and write functions that test whether the 
is_write flag is set along with whether buf_index > 0, and that fprintf() 
and abort, would be good for debugging.
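
The flag-based approach suggested here could be sketched along these lines. This is a miniature made up for illustration: `flag_file` and its helper names are hypothetical, and the real buffer handling is reduced to a counter so that only the is_write logic remains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical miniature of a buffered file; names are illustrative only. */
struct flag_file {
    int is_write;      /* set once data has been queued for writing */
    size_t buf_index;  /* bytes queued (write) or consumed (read) */
    int put_calls;     /* how many times the backend write ran */
};

/* Stand-in for the put_buffer callback. */
static void flag_backend_put(struct flag_file *f)
{
    f->put_calls++;
}

/* Writing marks the buffer contents as outgoing data. */
static void flag_put_byte(struct flag_file *f)
{
    f->is_write = 1;
    f->buf_index++;
}

/* Flush reaches the backend only when the buffer holds written data. */
static void flag_fflush(struct flag_file *f)
{
    if (f->is_write && f->buf_index > 0) {
        flag_backend_put(f);
        f->buf_index = 0;
    }
}

/* Nonzero when a read would collide with unflushed written data. */
static int flag_read_would_clobber(const struct flag_file *f)
{
    return f->is_write && f->buf_index > 0;
}

/* Debugging aid for the read path: report the misuse and stop. */
static void flag_check_read(const struct flag_file *f)
{
    if (flag_read_would_clobber(f)) {
        fprintf(stderr, "%s: read attempted with pending write data\n",
                __func__);
        abort();
    }
}
```

With this guard, data that was only read is never handed to the write callback on flush, and mixing directions without a flush fails loudly instead of corrupting the stream.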


Regards,

Anthony Liguori




Re: [kvm] Re: [PATCH 0/5] bios: 4G updates

2008-10-12 Thread Kevin O'Connor
Hi,

On Thu, Oct 02, 2008 at 03:33:58PM +0300, Avi Kivity wrote:
 Alex Williamson wrote:
 It works, so I pushed it out.  Alex, can you rebase your bios patches
 on top of current HEAD?
 

 I updated and resent the first patch in the 4 patch follow-on to this
 one.  The remaining 3 patches still apply cleanly.  I think Sheng was
 going to send out a patch to better follow the SDM when changing the
 MTRRs, but the first 3 patches are independent of that.  Thanks,

   

 Applied all, thanks.

As an aside, is there any interest in using SeaBIOS with kvm?

SeaBIOS is a port of bochs bios to gcc.  I've been using SeaBIOS
(along with coreboot) to boot and provide bios functions on real
hardware.  It works fine under qemu also.

I looked at the changes that kvm has in its local bochs bios repo.
Most of the code is the same, however I noticed a number of msr
settings which I didn't fully understand.

If there is interest, the source code repository can be pulled by
running:

git clone git://git.linuxtogo.org/home/seabios.git

There is a git browser at:

http://git.linuxtogo.org/?p=kevin/seabios.git;a=summary

And some precompiled binaries at:

http://linuxtogo.org/~kevin/SeaBIOS/

Thoughts?
-Kevin


Re: [PATCH 1/2] KVM: Handle multiple interrupt sources

2008-10-12 Thread Sheng Yang
On Saturday 11 October 2008 16:10:51 Amit Shah wrote:
> From: Sheng Yang [EMAIL PROTECTED]
>
> Keep a record of current interrupt state before injecting. Don't
> assert/deassert repeatedly, so that every caller of kvm_set_irq()
> can be identified as a separate interrupt source for the IOAPIC/PIC
> to implement logical OR of level triggered interrupts on one IRQ line.
>
> Notice that userspace devices are treated as one device for each IRQ
> line. The correctness of sharing interrupt for each IRQ line should be
> ensured by the userspace program (QEmu).
>
> [Amit: rebase to kvm.git HEAD]

Hi, Amit

Thanks for your work! 

But maybe I am missing something. I suppose my later patch can work independently? I 
think the second patch should solve the whole problem (sorry for replying to 
the second patch rather than [0/2], which caused confusion...). Can you check?

Thanks!
--
regards
Yang, Sheng

> Signed-off-by: Sheng Yang [EMAIL PROTECTED]
> Signed-off-by: Amit Shah [EMAIL PROTECTED]
> ---
>  arch/x86/kvm/x86.c       |   13 ++++++++++++-
>  include/linux/kvm_host.h |    3 +++
>  virt/kvm/kvm_main.c      |   12 +++++++++---
>  3 files changed, 24 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index dda478e..6f45428 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1816,7 +1816,18 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  			goto out;
>  		if (irqchip_in_kernel(kvm)) {
>  			mutex_lock(&kvm->lock);
> -			kvm_set_irq(kvm, irq_event.irq, irq_event.level);
> +			/*
> +			 * Take one IRQ line as from one device, shared IRQ
> +			 * line should also be handled in the userspace before
> +			 * use KVM_IRQ_LINE ioctl to change IRQ line state.
> +			 */
> +			if (kvm->userspace_intrsource_states[irq_event.irq]
> +			    != irq_event.level) {
> +				kvm_set_irq(kvm, irq_event.irq,
> +					    irq_event.level);
> +				kvm->userspace_intrsource_states[irq_event.irq]
> +					= irq_event.level;
> +			}
>  			mutex_unlock(&kvm->lock);
>  			r = 0;
>  		}
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 3833c48..d392e31 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -129,6 +129,8 @@ struct kvm {
>  	unsigned long mmu_notifier_seq;
>  	long mmu_notifier_count;
>  #endif
> +
> +	int userspace_intrsource_states[KVM_IOAPIC_NUM_PINS];
>  };
>
>  /* The guest did something we don't support. */
> @@ -306,6 +308,7 @@ struct kvm_assigned_dev_kernel {
>  	int host_irq;
>  	int guest_irq;
>  	int irq_requested;
> +	int irq_state;
>  	struct pci_dev *dev;
>  	struct kvm *kvm;
>  };
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index cf0ab8e..faa56fb 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -104,8 +104,11 @@ static void kvm_assigned_dev_interrupt_work_handler(struct work_struct *work)
>  	 * finer-grained lock, update this
>  	 */
>  	mutex_lock(&assigned_dev->kvm->lock);
> -	kvm_set_irq(assigned_dev->kvm,
> -		    assigned_dev->guest_irq, 1);
> +	if (assigned_dev->irq_state == 0) {
> +		kvm_set_irq(assigned_dev->kvm,
> +			    assigned_dev->guest_irq, 1);
> +		assigned_dev->irq_state = 1;
> +	}
>  	mutex_unlock(&assigned_dev->kvm->lock);
>  	kvm_put_kvm(assigned_dev->kvm);
>  }
> @@ -134,7 +137,10 @@ static void kvm_assigned_dev_ack_irq(struct kvm_irq_ack_notifier *kian)
>
>  	dev = container_of(kian, struct kvm_assigned_dev_kernel,
>  			   ack_notifier);
> -	kvm_set_irq(dev->kvm, dev->guest_irq, 0);
> +	if (dev->irq_state == 1) {
> +		kvm_set_irq(dev->kvm, dev->guest_irq, 0);
> +		dev->irq_state = 0;
> +	}
>  	enable_irq(dev->host_irq);
>  }
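
The per-source state tracking described in the quoted commit message might be illustrated as follows. This is a sketch under assumptions, not the KVM implementation: `sketch_line`, `sketch_source`, and the counting scheme are hypothetical names and choices made for the example. Each source remembers its own level and reports only changes, so the shared line can compute the logical OR.

```c
#include <assert.h>

/* Hypothetical shared IRQ line: high while any source asserts it. */
struct sketch_line {
    int asserted_sources;
};

/* Hypothetical interrupt source with its last reported level. */
struct sketch_source {
    struct sketch_line *line;
    int state;
};

/* Logical OR of all sources attached to the line. */
static int sketch_line_level(const struct sketch_line *l)
{
    return l->asserted_sources > 0;
}

/* Forward only level changes; repeated asserts or deasserts are ignored. */
static void sketch_source_set(struct sketch_source *s, int level)
{
    level = !!level;
    if (s->state == level)
        return;
    s->state = level;
    s->line->asserted_sources += level ? 1 : -1;
}
```

Because a repeated assert from one source is dropped, a single deassert from that source cannot pull the line low while another source still holds it high.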




Re: [ Re: unhandled vm exit: 0x80000021 vcpu_id 0]

2008-10-12 Thread Sheng Yang
Hi Pier

The only thing I can tell is that the guest seems completely messed up...
It ran into some non-code segment.

 unhandled vm exit: 0x8021 vcpu_id 0
 rax 0007 rbx 1490 rcx  rdx
 19a0 rsi  rdi  rsp
 0080 rbp 96bf r8   r9
  r10  r11  r12
  r13  r14  r15
  rip 002a rflags 00023202
 cs 14a2 (/ p 0 dpl 0 db 0 s 0 type 9 l 0 g 0 avl 0)
 ds 19a0 (/ p 0 dpl 0 db 0 s 0 type 0 l 0 g 0 avl 0)
 es 1a31 (/ p 0 dpl 0 db 0 s 0 type 0 l 0 g 0 avl 0)
 ss 1a29 (/ p 0 dpl 0 db 0 s 0 type 1 l 0 g 0 avl 0)
Segments maybe messed up...

 fs  (/ p 0 dpl 0 db 0 s 0 type 0 l 0 g 0 avl 0)
 gs  (/ p 0 dpl 0 db 0 s 0 type 0 l 0 g 0 avl 0)
 tr 0058 (00201ffa/ p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
 ldt  (/ p 0 dpl 0 db 0 s 0 type 0 l 0 g 0 avl 0)
 gdt 20/1dd8
 idt 201df0/188
 cr0 8019 cr2 0 cr3 144 cr4 0 cr8 0 efer 0

CR0.PE set(sorry for wrong decode before...), CR0.PG set. Guest in
protected mode. But CR4 is wrong, at least CR4.PAE and CR4.VMXE should
be set.

 code: 00 f0 53 ff 00 f0 53 ff 00 f0 a5 fe 00 f0 87 e9 00 f0 53 ff -- 00 f0
 53 ff 00 f0 53 ff 00 f0 53 ff 00 f0 57 ef 00 f0 53 ff 00 f0 3a 83 00 c0 4d
 f8 00 f0

Seems like meaningless code...

Well, I still don't know what the checkpoint did to cause this... At
least it seems to be more than an emulation bug.

Does anybody else have an idea?...

--
regards
Yang, Sheng


Re: [PATCH] qemu: qemu_fopen_fd: differentiate between reader and writer user

2008-10-12 Thread Anthony Liguori

Anthony Liguori wrote:


Also, having checks in the read and write functions that test whether 
the is_write flag is set along with whether buf_index > 0, and that 
fprintf() and abort, would be good for debugging.


I have a patch that does this along with fixing a few other bugs.  It's 
attached.


Regards,

Anthony Liguori



Regards,

Anthony Liguori




diff --git a/hw/hw.h b/hw/hw.h
index e130355..8edd788 100644
--- a/hw/hw.h
+++ b/hw/hw.h
@@ -11,8 +11,8 @@
  * The pos argument can be ignored if the file is only being used for
  * streaming.  The handler should try to write all of the data it can.
  */
-typedef void (QEMUFilePutBufferFunc)(void *opaque, const uint8_t *buf,
- int64_t pos, int size);
+typedef int (QEMUFilePutBufferFunc)(void *opaque, const uint8_t *buf,
+int64_t pos, int size);
 
 /* Read a chunk of data from a file at the given position.  The pos argument
  * can be ignored if the file is only be used for streaming.  The number of
@@ -64,6 +64,7 @@ unsigned int qemu_get_be16(QEMUFile *f);
 unsigned int qemu_get_be32(QEMUFile *f);
 uint64_t qemu_get_be64(QEMUFile *f);
 int qemu_file_rate_limit(QEMUFile *f);
+int qemu_file_has_error(QEMUFile *f);
 
 /* Try to send any outstanding data.  This function is useful when output is
  * halted due to rate limiting or EAGAIN errors occur as it can be used to
diff --git a/vl.c b/vl.c
index 5659fea..d49c648 100644
--- a/vl.c
+++ b/vl.c
@@ -6197,12 +6197,15 @@ struct QEMUFile {
     QEMUFileCloseFunc *close;
     QEMUFileRateLimit *rate_limit;
     void *opaque;
+    int is_write;
 
     int64_t buf_offset; /* start of buffer when writing, end of buffer
                            when reading */
     int buf_index;
     int buf_size; /* 0 when writing */
     uint8_t buf[IO_BUF_SIZE];
+
+    int has_error;
 };
 
 typedef struct QEMUFileFD
@@ -6211,34 +6214,6 @@ typedef struct QEMUFileFD
     QEMUFile *file;
 } QEMUFileFD;
 
-static void fd_put_notify(void *opaque)
-{
-    QEMUFileFD *s = opaque;
-
-    /* Remove writable callback and do a put notify */
-    qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL);
-    qemu_file_put_notify(s->file);
-}
-
-static void fd_put_buffer(void *opaque, const uint8_t *buf,
-                          int64_t pos, int size)
-{
-    QEMUFileFD *s = opaque;
-    ssize_t len;
-
-    do {
-        len = write(s->fd, buf, size);
-    } while (len == -1 && errno == EINTR);
-
-    if (len == -1)
-        len = -errno;
-
-    /* When the fd becomes writable again, register a callback to do
-     * a put notify */
-    if (len == -EAGAIN)
-        qemu_set_fd_handler2(s->fd, NULL, NULL, fd_put_notify, s);
-}
-
 static int fd_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
 {
     QEMUFileFD *s = opaque;
@@ -6269,7 +6244,7 @@ QEMUFile *qemu_fopen_fd(int fd)
         return NULL;
 
     s->fd = fd;
-    s->file = qemu_fopen_ops(s, fd_put_buffer, fd_get_buffer, fd_close, NULL);
+    s->file = qemu_fopen_ops(s, NULL, fd_get_buffer, fd_close, NULL);
     return s->file;
 }
 
@@ -6278,12 +6253,13 @@ typedef struct QEMUFileStdio
     FILE *outfile;
 } QEMUFileStdio;
 
-static void file_put_buffer(void *opaque, const uint8_t *buf,
+static int file_put_buffer(void *opaque, const uint8_t *buf,
                             int64_t pos, int size)
 {
     QEMUFileStdio *s = opaque;
     fseek(s->outfile, pos, SEEK_SET);
     fwrite(buf, 1, size, s->outfile);
+    return size;
 }
 
 static int file_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
@@ -6331,11 +6307,12 @@ typedef struct QEMUFileBdrv
     int64_t base_offset;
 } QEMUFileBdrv;
 
-static void bdrv_put_buffer(void *opaque, const uint8_t *buf,
-                            int64_t pos, int size)
+static int bdrv_put_buffer(void *opaque, const uint8_t *buf,
+                           int64_t pos, int size)
 {
     QEMUFileBdrv *s = opaque;
     bdrv_pwrite(s->bs, s->base_offset + pos, buf, size);
+    return size;
 }
 
 static int bdrv_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
@@ -6384,18 +6361,29 @@ QEMUFile *qemu_fopen_ops(void *opaque, QEMUFilePutBufferFunc *put_buffer,
     f->get_buffer = get_buffer;
     f->close = close;
     f->rate_limit = rate_limit;
+    f->is_write = 0;
 
     return f;
 }
 
+int qemu_file_has_error(QEMUFile *f)
+{
+    return f->has_error;
+}
+
 void qemu_fflush(QEMUFile *f)
 {
     if (!f->put_buffer)
         return;
 
-    if (f->buf_index > 0) {
-        f->put_buffer(f->opaque, f->buf, f->buf_offset, f->buf_index);
-        f->buf_offset += f->buf_index;
+    if (f->is_write && f->buf_index > 0) {
+        int len;
+
+	len = f->put_buffer(f->opaque, f->buf, f->buf_offset, f->buf_index);
+	if (len > 0)
+	    f->buf_offset += f->buf_index;
+	else
+	    f->has_error = 1;
         f->buf_index = 0;
     }
 }
@@ -6407,13 +6395,16 @@ static void qemu_fill_buffer(QEMUFile *f)
     if (!f->get_buffer)
         return;
 
-    len = f->get_buffer(f->opaque, f->buf, f->buf_offset, 

Re: [PATCH 1/2] KVM: Handle multiple interrupt sources

2008-10-12 Thread Amit Shah
- Sheng Yang [EMAIL PROTECTED] wrote:

> On Saturday 11 October 2008 16:10:51 Amit Shah wrote:
> > From: Sheng Yang [EMAIL PROTECTED]
> >
> > Keep a record of current interrupt state before injecting. Don't
> > assert/deassert repeatedly, so that every caller of kvm_set_irq()
> > can be identified as a separate interrupt source for the IOAPIC/PIC
> > to implement logical OR of level triggered interrupts on one IRQ
> > line.
> >
> > Notice that userspace devices are treated as one device for each
> > IRQ line. The correctness of sharing interrupt for each IRQ line
> > should be ensured by the userspace program (QEmu).
> >
> > [Amit: rebase to kvm.git HEAD]
>
> Hi, Amit
>
> Thanks for your work!
>
> But maybe I miss something. I suppose my later patch can work
> independently? I think the second patch should solve the whole problem
> (sorry to reply it to the second rather than [0/2] which made
> confusion...). Can you have a check?

I'm not sure I understand. Which concern are you talking about?

I used the latest patch that you sent and I also verified that it works.

Amit


Re: [PATCH 1/2] KVM: Handle multiple interrupt sources

2008-10-12 Thread Sheng Yang
On Monday 13 October 2008 13:06:18 Amit Shah wrote:
> - Sheng Yang [EMAIL PROTECTED] wrote:
> > On Saturday 11 October 2008 16:10:51 Amit Shah wrote:
> > > From: Sheng Yang [EMAIL PROTECTED]
> > >
> > > Keep a record of current interrupt state before injecting. Don't
> > > assert/deassert repeatedly, so that every caller of kvm_set_irq()
> > > can be identified as a separate interrupt source for the IOAPIC/PIC
> > > to implement logical OR of level triggered interrupts on one IRQ
> > > line.
> > >
> > > Notice that userspace devices are treated as one device for each
> > > IRQ line. The correctness of sharing interrupt for each IRQ line
> > > should be ensured by the userspace program (QEmu).
> > >
> > > [Amit: rebase to kvm.git HEAD]
> >
> > Hi, Amit
> >
> > Thanks for your work!
> >
> > But maybe I miss something. I suppose my later patch can work
> > independently? I think the second patch should solve the whole problem
> > (sorry to reply it to the second rather than [0/2] which made
> > confusion...). Can you have a check?
>
> I'm not sure I understand. Which concern are you talking about?
>
> I used the latest patch that you sent and I also verified that it works.

Well, at least I meant to replace both of my first two patches with my later 
one... I suppose the second patch (my later one, derived from Avi's 
suggestion) should work alone without the first one...
--
regards
Yang, Sheng