[PATCH] kvm: qemu: Provide dummy cpu_vendor_string

2008-08-13 Thread Avi Kivity
From: Philippe Gerum <[EMAIL PROTECTED]>

This patch defines cpu_vendor_string for linux-user builds.

Signed-off-by: Philippe Gerum <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>

diff --git a/qemu/linux-user/main.c b/qemu/linux-user/main.c
index 023aac3..8387b16 100644
--- a/qemu/linux-user/main.c
+++ b/qemu/linux-user/main.c
@@ -33,6 +33,7 @@
 
 static const char *interp_prefix = CONFIG_QEMU_PREFIX;
 const char *qemu_uname_release = CONFIG_UNAME_RELEASE;
+const char *cpu_vendor_string = NULL;
 
 #if defined(__i386__) && !defined(CONFIG_STATIC)
 /* Force usage of an ELF interpreter even if it is an ELF shared
--
To unsubscribe from this list: send the line "unsubscribe kvm-commits" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] kvm: Fix kvm startup script

2008-08-13 Thread Avi Kivity
From: Sheng Yang <[EMAIL PROTECTED]>

Signed-off-by: Sheng Yang <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>

diff --git a/kvm b/kvm
index 2a7dc85..cb9ecf8 100755
--- a/kvm
+++ b/kvm
@@ -18,6 +18,14 @@ config = ShellConfigParser()
 config.read('config.mak')
 
 external_module = config.get('shell', 'want_module')
+
+arch = config.get('shell', 'arch')
+p = re.compile("^i\d86$")
+if len(p.findall(arch)):
+    arch = 'x86_64'
+if arch != 'x86_64' and arch != 'ia64':
+    raise Exception('unsupported architecture %s' % arch)
+
 privileged = os.getuid() == 0
 
 optparser = optparse.OptionParser()
@@ -153,8 +161,12 @@ def remove_module(module):
 raise Exception('failed to remove %s module' % (module,))
 
 def insert_module(module):
+    if arch == 'x86_64':
+        archdir = 'x86'
+    elif arch == 'ia64':
+        archdir = 'ia64'
     if os.spawnl(os.P_WAIT, '/sbin/insmod', 'insmod',
-                 'kernel/%s.ko' % (module,)) != 0:
+                 'kernel/' + archdir + '/%s.ko' % (module,)) != 0:
         raise Exception('failed to load kvm module')
 
 def probe_module(module):
@@ -197,8 +209,6 @@ bootdisk = 'c'
 if options.install:
 bootdisk = 'd'
 
-arch = 'x86_64'
-
 if arch == 'x86_64':
 cmd = 'qemu-system-' + arch
 else:


[PATCH] kvm: qemu: Remove virtio_net tx ring-full heuristic

2008-08-13 Thread Avi Kivity
From: Mark McLoughlin <[EMAIL PROTECTED]>

When virtio_net receives a tx notification from the guest, it
tries to guess whether the notification means that the guest
has no more room in the tx ring, in which case it should
immediately flush the queued buffers.

The heuristic is based on the fact that there are 128
buffer entries in the ring and each packet uses 2 buffers
(i.e. the virtio_net_hdr and the packet's linear data).

Using GSO or increasing the size of the rings will break
that heuristic, so let's remove it and assume that any
notification from the guest after we've disabled
notifications indicates that we should flush our buffers.

Signed-off-by: Mark McLoughlin <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
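
For illustration, the heuristic removed by this patch can be modelled as below. This is a minimal sketch, not the code from virtio-net.c: the function and macro names are hypothetical, and only the numbers 128, 2 and 64 come from the commit message and the diff.

```c
#include <stdint.h>

/* Values taken from the commit message: 128 ring entries and
 * 2 buffers per non-GSO packet (virtio_net_hdr + linear data). */
#define RING_ENTRIES       128
#define BUFFERS_PER_PACKET 2

/* Hypothetical helper: number of buffer chains the guest has made
 * available that the host has not consumed yet. The indices are
 * free-running 16-bit counters, so the subtraction wraps modulo 65536. */
static uint16_t pending_chains(uint16_t avail_idx, uint16_t last_avail_idx)
{
    return (uint16_t)(avail_idx - last_avail_idx);
}

/* The removed guess: the ring is treated as full only when exactly
 * RING_ENTRIES / BUFFERS_PER_PACKET (= 64) chains are pending. */
static int old_ring_full_guess(uint16_t avail_idx, uint16_t last_avail_idx)
{
    return pending_chains(avail_idx, last_avail_idx) ==
           RING_ENTRIES / BUFFERS_PER_PACKET;
}
```

If a packet occupies more than 2 buffers, or the ring has more than 128 entries, the ring can fill up without the pending count ever being exactly 64, which is why the patch flushes on any notification received while the tx timer is active.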

diff --git a/qemu/hw/virtio-net.c b/qemu/hw/virtio-net.c
index 3a39c8f..b001475 100644
--- a/qemu/hw/virtio-net.c
+++ b/qemu/hw/virtio-net.c
@@ -175,8 +175,7 @@ static void virtio_net_handle_tx(VirtIODevice *vdev, VirtQueue *vq)
 {
 VirtIONet *n = to_virtio_net(vdev);
 
-if (n->tx_timer_active &&
-   (vq->vring.avail->idx - vq->last_avail_idx) == 64) {
+if (n->tx_timer_active) {
vq->vring.used->flags &= ~VRING_USED_F_NO_NOTIFY;
qemu_del_timer(n->tx_timer);
n->tx_timer_active = 0;


[PATCH] kvm: qemu: Fix virtio_net tx timer

2008-08-13 Thread Avi Kivity
From: Mark McLoughlin <[EMAIL PROTECTED]>

The current virtio_net tx timer is 2ns, which doesn't make
any sense. Set it to a more reasonable 250us instead.

However, even though we were requesting a 2ns tx timer, it
was actually getting limited to MIN_TIMER_REARM_US which is
currently 250us.

So, even though the timer itself would only fire after
250us, expire_time was only set to +2ns, so we'd get the
timeout callback next time qemu_run_timers() was called from
the mainloop.

This probably accounted for a lot of the jitter in the
throughput numbers - the effective tx timer length was
anywhere between 2ns and 250us depending on e.g. whether
there was rx data available on the tap fd.

Signed-off-by: Mark McLoughlin <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
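
The mismatch described above can be sketched as two independent quantities: the delay programmed into the host alarm, and the expiry time that the timer list is checked against. This is a simplified model under stated assumptions, not qemu's timer code: the helper names are hypothetical, and expressing the 250us minimum in nanoseconds is an assumption of the sketch.

```c
#include <stdint.h>

/* From the commit message: the host alarm is never rearmed for less
 * than 250us (MIN_TIMER_REARM_US). Nanosecond units are assumed here. */
#define MIN_REARM_NS 250000LL

/* Old interval as written in the source: 1000 / 500 == 2 (ns). */
#define OLD_TX_TIMER_INTERVAL_NS (1000 / 500)

/* Hypothetical: delay actually programmed into the host alarm. */
static int64_t alarm_delay_ns(int64_t requested_ns)
{
    return requested_ns < MIN_REARM_NS ? MIN_REARM_NS : requested_ns;
}

/* Hypothetical: the check a qemu_run_timers()-style pass performs. The
 * timer is due as soon as the clock reaches expire_time, no matter
 * whether the alarm or something else woke the main loop. */
static int timer_is_due(int64_t now_ns, int64_t expire_time_ns)
{
    return now_ns >= expire_time_ns;
}
```

With the old value, the alarm wakes the main loop only after 250us, but the timer is already due 2ns after it was armed. Any earlier wakeup, such as rx data on the tap fd, therefore runs the callback early, which matches the jitter described in the commit message.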

diff --git a/qemu/hw/virtio-net.c b/qemu/hw/virtio-net.c
index 2e57e5a..3a39c8f 100644
--- a/qemu/hw/virtio-net.c
+++ b/qemu/hw/virtio-net.c
@@ -26,7 +26,7 @@
 #define VIRTIO_NET_F_MAC   5
 #define VIRTIO_NET_F_GS0   6
 
-#define TX_TIMER_INTERVAL (1000 / 500)
+#define TX_TIMER_INTERVAL 25 /* 250 us */
 
 /* The config defining mac address (6 bytes) */
 struct virtio_net_config


[PATCH] kvm: qemu: Add VIRTIO_F_NOTIFY_ON_EMPTY

2008-08-13 Thread Avi Kivity
From: Mark McLoughlin <[EMAIL PROTECTED]>

Set the VIRTIO_F_NOTIFY_ON_EMPTY feature bit so the
guest can rely on us notifying them when the queue
is empty.

Also, only notify when the available queue is empty
*and* when we've finished with all the buffers we
had detached. Right now, when the queue is empty,
we notify the guest for every used buffer.

Signed-off-by: Mark McLoughlin <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
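
The notification rule after this patch can be restated as a small predicate. The sketch below is a simplified model of the condition in virtio_notify(), assuming a trimmed-down state struct; the struct and function names are hypothetical, while the flag value is the one defined in Linux's linux/virtio_ring.h.

```c
#include <stdint.h>

/* From Linux's linux/virtio_ring.h: the guest sets this flag in
 * avail->flags to ask the host not to interrupt it. */
#define VRING_AVAIL_F_NO_INTERRUPT 1

/* Hypothetical, trimmed-down view of the state virtio_notify() reads. */
struct vq_state {
    uint16_t avail_idx;       /* guest's avail->idx */
    uint16_t last_avail_idx;  /* next entry the host will pop */
    uint16_t avail_flags;     /* guest's avail->flags */
    int      inuse;           /* buffers popped but not yet pushed back */
};

/* Returns 1 if the guest should be interrupted, 0 otherwise. */
static int should_notify(const struct vq_state *vq)
{
    int ring_busy  = vq->inuse || vq->avail_idx != vq->last_avail_idx;
    int suppressed = vq->avail_flags & VRING_AVAIL_F_NO_INTERRUPT;

    /* Suppression is honoured only while work is still outstanding. */
    return !(ring_busy && suppressed);
}
```

The inuse term is what stops the per-buffer notifications: while detached buffers are still outstanding the ring counts as busy, so a suppressed guest is interrupted only once the last buffer has been returned.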

diff --git a/qemu/hw/virtio.c b/qemu/hw/virtio.c
index 3429ac8..e035e4e 100644
--- a/qemu/hw/virtio.c
+++ b/qemu/hw/virtio.c
@@ -138,6 +138,7 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
 /* Make sure buffer is written before we update index. */
 wmb();
 vq->vring.used->idx++;
+vq->inuse--;
 }
 
 int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
@@ -187,6 +188,8 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
 
 elem->index = head;
 
+vq->inuse++;
+
 return elem->in_num + elem->out_num;
 }
 
@@ -275,6 +278,7 @@ static uint32_t virtio_ioport_read(void *opaque, uint32_t addr)
 switch (addr) {
 case VIRTIO_PCI_HOST_FEATURES:
ret = vdev->get_features(vdev);
+   ret |= (1 << VIRTIO_F_NOTIFY_ON_EMPTY);
break;
 case VIRTIO_PCI_GUEST_FEATURES:
ret = vdev->features;
@@ -431,7 +435,7 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
 void virtio_notify(VirtIODevice *vdev, VirtQueue *vq)
 {
 /* Always notify when queue is empty */
-if (vq->vring.avail->idx != vq->last_avail_idx &&
+if ((vq->inuse || vq->vring.avail->idx != vq->last_avail_idx) &&
(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
return;
 
diff --git a/qemu/hw/virtio.h b/qemu/hw/virtio.h
index 61f5038..1adaed3 100644
--- a/qemu/hw/virtio.h
+++ b/qemu/hw/virtio.h
@@ -30,6 +30,10 @@
 /* We've given up on this device. */
 #define VIRTIO_CONFIG_S_FAILED 0x80
 
+/* We notify when the ring is completely used, even if the guest is suppressing
+ * callbacks */
+#define VIRTIO_F_NOTIFY_ON_EMPTY 24
+
 /* from Linux's linux/virtio_ring.h */
 
 /* This marks a buffer as continuing via the next field. */
@@ -86,6 +90,7 @@ struct VirtQueue
 VRing vring;
 uint32_t pfn;
 uint16_t last_avail_idx;
+int inuse;
 void (*handle_output)(VirtIODevice *vdev, VirtQueue *vq);
 };
 


[PATCH] kvm: qemu: Disable recv notifications until avail buffers exhausted

2008-08-13 Thread Avi Kivity
From: Mark McLoughlin <[EMAIL PROTECTED]>

Once we know we have buffers available on the receive ring, we can
safely disable notifications.

Signed-off-by: Mark McLoughlin <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
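
As a rough model of the change, the receive-side check can be written as a function that toggles the notification flag as a side effect. The struct and function names below are hypothetical; the flag value is the one defined in Linux's linux/virtio_ring.h.

```c
#include <stdint.h>

/* From Linux's linux/virtio_ring.h: the host sets this flag in
 * used->flags to tell the guest it does not need a kick. */
#define VRING_USED_F_NO_NOTIFY 1

/* Hypothetical, trimmed-down receive queue state. */
struct rx_state {
    uint16_t avail_idx;       /* guest's avail->idx */
    uint16_t last_avail_idx;  /* next entry the host will pop */
    uint16_t used_flags;      /* host's used->flags */
};

/* Returns 1 if a packet can be delivered to the guest right now. */
static int rx_can_receive(struct rx_state *rx)
{
    if (rx->avail_idx == rx->last_avail_idx) {
        /* Out of buffers: ask the guest to kick us when it adds more. */
        rx->used_flags &= ~VRING_USED_F_NO_NOTIFY;
        return 0;
    }

    /* Buffers are available, so further kicks carry no information. */
    rx->used_flags |= VRING_USED_F_NO_NOTIFY;
    return 1;
}
```

The flag is only cleared again once the available buffers are exhausted, so the guest's next refill is the only event that produces a notification.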

diff --git a/qemu/hw/virtio-net.c b/qemu/hw/virtio-net.c
index b001475..47349ce 100644
--- a/qemu/hw/virtio-net.c
+++ b/qemu/hw/virtio-net.c
@@ -106,9 +106,12 @@ static int virtio_net_can_receive(void *opaque)
!(n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK))
return 0;
 
-if (n->rx_vq->vring.avail->idx == n->rx_vq->last_avail_idx)
+if (n->rx_vq->vring.avail->idx == n->rx_vq->last_avail_idx) {
+   n->rx_vq->vring.used->flags &= ~VRING_USED_F_NO_NOTIFY;
return 0;
+}
 
+n->rx_vq->vring.used->flags |= VRING_USED_F_NO_NOTIFY;
 return 1;
 }
 


[PATCH] kvm: qemu: Move some code around for the next commit

2008-08-13 Thread Avi Kivity
From: Mark McLoughlin <[EMAIL PROTECTED]>

Signed-off-by: Mark McLoughlin <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>

diff --git a/qemu/vl.c b/qemu/vl.c
index 126944d..f5aacf0 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -4369,19 +4369,6 @@ typedef struct TAPState {
 unsigned int has_vnet_hdr : 1;
 } TAPState;
 
-static void tap_receive(void *opaque, const uint8_t *buf, int size)
-{
-TAPState *s = opaque;
-int ret;
-for(;;) {
-ret = write(s->fd, buf, size);
-if (ret < 0 && (errno == EINTR || errno == EAGAIN)) {
-} else {
-break;
-}
-}
-}
-
 static ssize_t tap_receive_iov(void *opaque, const struct iovec *iov,
   int iovcnt)
 {
@@ -4395,6 +4382,19 @@ static ssize_t tap_receive_iov(void *opaque, const struct iovec *iov,
 return len;
 }
 
+static void tap_receive(void *opaque, const uint8_t *buf, int size)
+{
+TAPState *s = opaque;
+int ret;
+for(;;) {
+ret = write(s->fd, buf, size);
+if (ret < 0 && (errno == EINTR || errno == EAGAIN)) {
+} else {
+break;
+}
+}
+}
+
 static int tap_can_send(void *opaque)
 {
 TAPState *s = opaque;


[PATCH] kvm: qemu: Rename tap_readv() to tap_receive_iov()

2008-08-13 Thread Avi Kivity
From: Mark McLoughlin <[EMAIL PROTECTED]>

Rename tap_readv() so as to match tap_receive() naming and
also to allow a tap_writev() helper function to not seem so
weird.

Signed-off-by: Mark McLoughlin <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>

diff --git a/qemu/vl.c b/qemu/vl.c
index fc53ae0..126944d 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -4382,8 +4382,8 @@ static void tap_receive(void *opaque, const uint8_t *buf, int size)
 }
 }
 
-static ssize_t tap_readv(void *opaque, const struct iovec *iov,
-int iovcnt)
+static ssize_t tap_receive_iov(void *opaque, const struct iovec *iov,
+  int iovcnt)
 {
 TAPState *s = opaque;
 ssize_t len;
@@ -4504,7 +4504,7 @@ static TAPState *net_tap_fd_init(VLANState *vlan, int fd, int vnet_hdr)
 s->fd = fd;
 s->has_vnet_hdr = vnet_hdr != 0;
 s->vc = qemu_new_vlan_client(vlan, tap_receive, NULL, s);
-s->vc->fd_readv = tap_readv;
+s->vc->fd_readv = tap_receive_iov;
 #ifdef TUNSETOFFLOAD
 s->vc->set_offload = tap_set_offload;
 #endif


[PATCH] kvm: qemu: Actually enable GSO support

2008-08-13 Thread Avi Kivity
From: Mark McLoughlin <[EMAIL PROTECTED]>

Now that GSO support doesn't break e.g. e1000, we can
enable it.

Signed-off-by: Mark McLoughlin <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>

diff --git a/qemu/vl.c b/qemu/vl.c
index 63e21f2..0ba8ace 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -4746,6 +4746,19 @@ static int tap_open(char *ifname, int ifname_size, int *vnet_hdr)
 }
 memset(&ifr, 0, sizeof(ifr));
 ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
+
+#if defined(TUNGETFEATURES) && defined(IFF_VNET_HDR)
+{
+unsigned int features;
+
+if (ioctl(fd, TUNGETFEATURES, &features) == 0 &&
+features & IFF_VNET_HDR) {
+*vnet_hdr = 1;
+ifr.ifr_flags |= IFF_VNET_HDR;
+}
+}
+#endif
+
 if (ifname[0] != '\0')
 pstrcpy(ifr.ifr_name, IFNAMSIZ, ifname);
 else


[PATCH] kvm: qemu: Increase size of virtio_net rings

2008-08-13 Thread Avi Kivity
From: Mark McLoughlin <[EMAIL PROTECTED]>

GSO packets use up a lot more buffer entries, so double
the size of the rings.

Signed-off-by: Mark McLoughlin <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
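
The effect of the ring size on queue depth is simple integer division. In the sketch below the helper name is hypothetical. The figure of 2 entries per ordinary packet comes from the ring-full heuristic patch in this series; the figure of 18 entries for a GSO packet is an assumption chosen purely for illustration and does not come from the patch.

```c
/* Hypothetical helper: how many packets fit in a ring when each packet
 * consumes a fixed number of buffer entries. */
static unsigned packets_per_ring(unsigned ring_entries,
                                 unsigned entries_per_packet)
{
    return ring_entries / entries_per_packet;
}
```

Under these assumptions, a 128-entry ring holds 64 ordinary packets but only 7 GSO packets, and doubling the ring to 256 entries raises that to 14.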

diff --git a/qemu/hw/virtio-net.c b/qemu/hw/virtio-net.c
index 2298316..61215b1 100644
--- a/qemu/hw/virtio-net.c
+++ b/qemu/hw/virtio-net.c
@@ -35,7 +35,7 @@
 #define VIRTIO_NET_F_HOST_ECN  13  /* Host can handle TSO[6] w/ ECN in. */
 #define VIRTIO_NET_F_HOST_UFO  14  /* Host can handle UFO in. */
 
-#define TX_TIMER_INTERVAL 25 /* 250 us */
+#define TX_TIMER_INTERVAL 15 /* 150 us */
 
 /* The config defining mac address (6 bytes) */
 struct virtio_net_config
@@ -306,8 +306,8 @@ PCIDevice *virtio_net_init(PCIBus *bus, NICInfo *nd, int devfn)
 n->vdev.update_config = virtio_net_update_config;
 n->vdev.get_features = virtio_net_get_features;
 n->vdev.set_features = virtio_net_set_features;
-n->rx_vq = virtio_add_queue(&n->vdev, 128, virtio_net_handle_rx);
-n->tx_vq = virtio_add_queue(&n->vdev, 128, virtio_net_handle_tx);
+n->rx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_rx);
+n->tx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_tx);
 memcpy(n->mac, nd->macaddr, 6);
 n->vc = qemu_new_vlan_client(nd->vlan, virtio_net_receive,
  virtio_net_can_receive, n);


[PATCH] kvm: qemu: Drop the mutex while reading from tapfd

2008-08-13 Thread Avi Kivity
From: Mark McLoughlin <[EMAIL PROTECTED]>

The idea here is that with GSO, packets are much larger
and we can allow the vcpu threads to e.g. process irq
acks during the window where we're reading these
packets from the tapfd.

One known issue with this is that it triggers a subtle
SMP race in the kernel's posix-timers and signalfd code.
See here for more details and a test case:

  http://lkml.org/lkml/2008/7/17/151

The symptoms of this are that:

  a) occasionally throughput drops almost to zero
  b) manually doing "killall -ALRM qemu-kvm" kicks qemu
 out of its funk.

Signed-off-by: Mark McLoughlin <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
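
The pattern of releasing a lock around a blocking read can be sketched with plain pthreads. This is an illustration only: the definitions of kvm_sleep_begin() and kvm_sleep_end() are not shown in the diff, so the assumption here is that they release and re-take a global mutex. The names big_lock and read_without_lock are hypothetical.

```c
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the global lock that kvm_sleep_begin() and
 * kvm_sleep_end() are assumed to release and re-take. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* Read from fd without holding big_lock. The caller must hold the lock
 * on entry and holds it again on return. */
static ssize_t read_without_lock(int fd, void *buf, size_t len)
{
    ssize_t n;

    pthread_mutex_unlock(&big_lock);   /* other threads may run now */
    n = read(fd, buf, len);
    pthread_mutex_lock(&big_lock);     /* re-take before touching state */

    return n;
}
```

Because other threads can run during the read, any shared state inspected before the call has to be treated as stale afterwards.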

diff --git a/qemu/vl.c b/qemu/vl.c
index 0ba8ace..2dc1311 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -4498,7 +4498,9 @@ static void tap_send(void *opaque)
sbuf.buf = s->buf;
s->size = getmsg(s->fd, NULL, &sbuf, &f) >=0 ? sbuf.len : -1;
 #else
+   kvm_sleep_begin();
s->size = read(s->fd, s->buf, sizeof(s->buf));
+   kvm_sleep_end();
 #endif
 
if (s->size == -1 && errno == EINTR)


[PATCH] kvm: qemu: Add support for partial csums and GSO

2008-08-13 Thread Avi Kivity
From: Mark McLoughlin <[EMAIL PROTECTED]>

The tun/tap driver in 2.6.27 contains a new IFF_VNET_HDR
flag which makes every packet read from or written to the
tap fd be preceded by a virtio_net_hdr header.

This allows us to pass larger packets and packets with
partial checksums between the guest and the host, greatly
increasing the achievable bandwidth.

If the tap device has IFF_VNET_HDR enabled, the virtio-net
driver then merely needs to shuffle the headers supplied
by the guest or host to the other side.

We also inform the guest that we can now receive GSO packets
and have it confirm whether it can do likewise. If the guest
can receive GSO packets, we enable GSO on the tun device
using TUNSETOFFLOAD.

Note also that we increase the size of the tap packet buffer
to accommodate the largest possible GSO packet.

Signed-off-by: Mark McLoughlin <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
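
To make the framing concrete, the sketch below splits a buffer read from an IFF_VNET_HDR tap fd into its header and payload. The struct layout follows Linux's linux/virtio_net.h as of this series; the helper split_vnet_frame() is hypothetical and is not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Header layout from Linux's linux/virtio_net.h at the time of this
 * series (10 bytes, no padding). */
struct virtio_net_hdr {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
};

/* Hypothetical helper: split one frame read from a tap fd opened with
 * IFF_VNET_HDR into header and payload. Returns the payload length, or
 * -1 if the frame is too short to contain a header. */
static long split_vnet_frame(const uint8_t *frame, size_t size,
                             struct virtio_net_hdr *hdr,
                             const uint8_t **payload)
{
    if (size < sizeof(*hdr))
        return -1;

    memcpy(hdr, frame, sizeof(*hdr));
    *payload = frame + sizeof(*hdr);
    return (long)(size - sizeof(*hdr));
}
```

Since the header on the tap fd has the same layout as the one the guest supplies, it can be copied across unchanged, and no checksum or segmentation work is needed in userspace.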

diff --git a/qemu/hw/virtio-net.c b/qemu/hw/virtio-net.c
index 47349ce..9b1298e 100644
--- a/qemu/hw/virtio-net.c
+++ b/qemu/hw/virtio-net.c
@@ -22,9 +22,18 @@
 #define VIRTIO_ID_NET  1
 
 /* The feature bitmap for virtio net */
-#define VIRTIO_NET_F_NO_CSUM   0
-#define VIRTIO_NET_F_MAC   5
-#define VIRTIO_NET_F_GS0   6
+#define VIRTIO_NET_F_CSUM  0   /* Host handles pkts w/ partial csum */
+#define VIRTIO_NET_F_GUEST_CSUM 1   /* Guest handles pkts w/ partial csum */
+#define VIRTIO_NET_F_MAC   5   /* Host has given MAC address. */
+#define VIRTIO_NET_F_GSO   6   /* Host handles pkts w/ any GSO type */
+#define VIRTIO_NET_F_GUEST_TSO4 7   /* Guest can handle TSOv4 in. */
+#define VIRTIO_NET_F_GUEST_TSO6 8   /* Guest can handle TSOv6 in. */
+#define VIRTIO_NET_F_GUEST_ECN 9   /* Guest can handle TSO[6] w/ ECN in. */
+#define VIRTIO_NET_F_GUEST_UFO 10  /* Guest can handle UFO in. */
+#define VIRTIO_NET_F_HOST_TSO4 11  /* Host can handle TSOv4 in. */
+#define VIRTIO_NET_F_HOST_TSO6 12  /* Host can handle TSOv6 in. */
+#define VIRTIO_NET_F_HOST_ECN  13  /* Host can handle TSO[6] w/ ECN in. */
+#define VIRTIO_NET_F_HOST_UFO  14  /* Host can handle UFO in. */
 
 #define TX_TIMER_INTERVAL 25 /* 250 us */
 
@@ -42,8 +51,6 @@ struct virtio_net_hdr
 uint8_t flags;
 #define VIRTIO_NET_HDR_GSO_NONE 0   // Not a GSO frame
 #define VIRTIO_NET_HDR_GSO_TCPV4   1   // GSO frame, IPv4 TCP (TSO)
-/* FIXME: Do we need this?  If they said they can handle ECN, do they care? */
-#define VIRTIO_NET_HDR_GSO_TCPV4_ECN   2   // GSO frame, IPv4 TCP w/ ECN
 #define VIRTIO_NET_HDR_GSO_UDP 3   // GSO frame, IPv4 UDP (UFO)
 #define VIRTIO_NET_HDR_GSO_TCPV6   4   // GSO frame, IPv6 TCP
 #define VIRTIO_NET_HDR_GSO_ECN 0x80 // TCP has ECN set
@@ -85,7 +92,38 @@ static void virtio_net_update_config(VirtIODevice *vdev, uint8_t *config)
 
 static uint32_t virtio_net_get_features(VirtIODevice *vdev)
 {
-return (1 << VIRTIO_NET_F_MAC);
+VirtIONet *n = to_virtio_net(vdev);
+VLANClientState *host = n->vc->vlan->first_client;
+uint32_t features = (1 << VIRTIO_NET_F_MAC);
+
+if (tap_has_vnet_hdr(host)) {
+   features |= (1 << VIRTIO_NET_F_CSUM);
+   features |= (1 << VIRTIO_NET_F_GUEST_CSUM);
+   features |= (1 << VIRTIO_NET_F_GUEST_TSO4);
+   features |= (1 << VIRTIO_NET_F_GUEST_TSO6);
+   features |= (1 << VIRTIO_NET_F_GUEST_ECN);
+   features |= (1 << VIRTIO_NET_F_HOST_TSO4);
+   features |= (1 << VIRTIO_NET_F_HOST_TSO6);
+   features |= (1 << VIRTIO_NET_F_HOST_ECN);
+   /* Kernel can't actually handle UFO in software currently. */
+}
+
+return features;
+}
+
+static void virtio_net_set_features(VirtIODevice *vdev, uint32_t features)
+{
+VirtIONet *n = to_virtio_net(vdev);
+VLANClientState *host = n->vc->vlan->first_client;
+
+if (!tap_has_vnet_hdr(host) || !host->set_offload)
+   return;
+
+host->set_offload(host,
+ (features >> VIRTIO_NET_F_GUEST_CSUM) & 1,
+ (features >> VIRTIO_NET_F_GUEST_TSO4) & 1,
+ (features >> VIRTIO_NET_F_GUEST_TSO6) & 1,
+ (features >> VIRTIO_NET_F_GUEST_ECN)  & 1);
 }
 
 /* RX */
@@ -121,6 +159,7 @@ static void virtio_net_receive(void *opaque, const uint8_t *buf, int size)
 VirtQueueElement elem;
 struct virtio_net_hdr *hdr;
 int offset, i;
+int total;
 
 if (virtqueue_pop(n->rx_vq, &elem) == 0)
return;
@@ -134,18 +173,26 @@ static void virtio_net_receive(void *opaque, const uint8_t *buf, int size)
 hdr->flags = 0;
 hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE;
 
-/* copy in packet.  ugh */
 offset = 0;
+total = sizeof(*hdr);
+
+if (tap_has_vnet_hdr(n->vc->vlan->first_client)) {
+   memcpy(hdr, buf, sizeof(*hdr));
+   offset += total;
+}
+
+/* copy in packet.  ugh */
 i = 1;
 while (offset < size && i < elem.in_num) {
   

[PATCH] kvm: qemu: Don't require all drivers to use virtio_net_hdr

2008-08-13 Thread Avi Kivity
From: Mark McLoughlin <[EMAIL PROTECTED]>

Hi Avi,

Thanks for catching the build error in this one.

Here's a new (yet uglier) version; the rest remain the same.

Cheers,
Mark.

Subject: [PATCH 08/11] kvm: qemu: Don't require all drivers to use virtio_net_hdr

The virtio-net driver is the only one which wishes to deal
with virtio_net_hdr headers, so add a "using_vnet_hdr" flag
to allow it to indicate this.

Ideally, we'd only enable IFF_VNET_HDR when
we're using virtio-net, but qemu's various abstractions
would make this very messy.

Signed-off-by: Mark McLoughlin <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
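
The decision encoded by the two flags can be reduced to one helper. This is a sketch with hypothetical names (tap_flags, vnet_hdr_skip, VNET_HDR_LEN); only the has_vnet_hdr and using_vnet_hdr bitfields and the condition itself are taken from the diff.

```c
#include <stddef.h>

/* Size of struct virtio_net_hdr in Linux's linux/virtio_net.h at the
 * time of this series. */
#define VNET_HDR_LEN 10

/* Hypothetical, trimmed-down copy of the two TAPState flags. */
struct tap_flags {
    unsigned int has_vnet_hdr   : 1;  /* tap fd opened with IFF_VNET_HDR */
    unsigned int using_vnet_hdr : 1;  /* the NIC model consumes the header */
};

/* Number of leading bytes to hide from a NIC model that does not
 * understand virtio_net_hdr and, symmetrically, the number of zero
 * bytes to prepend when that model transmits. */
static size_t vnet_hdr_skip(const struct tap_flags *t)
{
    return (t->has_vnet_hdr && !t->using_vnet_hdr) ? VNET_HDR_LEN : 0;
}
```

A model such as e1000 on a vnet-hdr-capable tap device gets 10 bytes stripped or prepended, while virtio-net, or any model on a tap device without IFF_VNET_HDR, sees the buffer untouched.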

diff --git a/qemu/hw/virtio-net.c b/qemu/hw/virtio-net.c
index 9b1298e..2298316 100644
--- a/qemu/hw/virtio-net.c
+++ b/qemu/hw/virtio-net.c
@@ -97,6 +97,7 @@ static uint32_t virtio_net_get_features(VirtIODevice *vdev)
 uint32_t features = (1 << VIRTIO_NET_F_MAC);
 
 if (tap_has_vnet_hdr(host)) {
+   tap_using_vnet_hdr(host, 1);
features |= (1 << VIRTIO_NET_F_CSUM);
features |= (1 << VIRTIO_NET_F_GUEST_CSUM);
features |= (1 << VIRTIO_NET_F_GUEST_TSO4);
diff --git a/qemu/net.h b/qemu/net.h
index ae1a338..4891669 100644
--- a/qemu/net.h
+++ b/qemu/net.h
@@ -46,6 +46,7 @@ void qemu_handler_true(void *opaque);
 void do_info_network(void);
 
 int tap_has_vnet_hdr(void *opaque);
+void tap_using_vnet_hdr(void *opaque, int using_vnet_hdr);
 
 int net_client_init(const char *device, const char *opts);
 void net_client_uninit(NICInfo *nd);
diff --git a/qemu/vl.c b/qemu/vl.c
index f5aacf0..63e21f2 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -4347,6 +4347,10 @@ int tap_has_vnet_hdr(void *opaque)
 return 0;
 }
 
+void tap_using_vnet_hdr(void *opaque, int using_vnet_hdr)
+{
+}
+
 #else /* !defined(_WIN32) */
 
 #ifndef IFF_VNET_HDR
@@ -4367,10 +4371,11 @@ typedef struct TAPState {
 char buf[TAP_BUFSIZE];
 int size;
 unsigned int has_vnet_hdr : 1;
+unsigned int using_vnet_hdr : 1;
 } TAPState;
 
-static ssize_t tap_receive_iov(void *opaque, const struct iovec *iov,
-  int iovcnt)
+static ssize_t tap_writev(void *opaque, const struct iovec *iov,
+ int iovcnt)
 {
 TAPState *s = opaque;
 ssize_t len;
@@ -4382,17 +4387,51 @@ static ssize_t tap_receive_iov(void *opaque, const struct iovec *iov,
 return len;
 }
 
+static ssize_t tap_receive_iov(void *opaque, const struct iovec *iov,
+  int iovcnt)
+{
+#ifdef IFF_VNET_HDR
+TAPState *s = opaque;
+
+if (s->has_vnet_hdr && !s->using_vnet_hdr) {
+   struct iovec *iov_copy;
+   struct virtio_net_hdr hdr = { 0, };
+
+   iov_copy = alloca(sizeof(struct iovec) * (iovcnt + 1));
+
+   iov_copy[0].iov_base = &hdr;
+   iov_copy[0].iov_len  = sizeof(hdr);
+
+   memcpy(&iov_copy[1], iov, sizeof(struct iovec) * iovcnt);
+
+   return tap_writev(opaque, iov_copy, iovcnt + 1);
+}
+#endif
+
+return tap_writev(opaque, iov, iovcnt);
+}
+
 static void tap_receive(void *opaque, const uint8_t *buf, int size)
 {
+struct iovec iov[2];
+int i = 0;
+
+#ifdef IFF_VNET_HDR
 TAPState *s = opaque;
-int ret;
-for(;;) {
-ret = write(s->fd, buf, size);
-if (ret < 0 && (errno == EINTR || errno == EAGAIN)) {
-} else {
-break;
-}
+struct virtio_net_hdr hdr = { 0, };
+
+if (s->has_vnet_hdr && !s->using_vnet_hdr) {
+   iov[i].iov_base = &hdr;
+   iov[i].iov_len  = sizeof(hdr);
+   i++;
 }
+#endif
+
+iov[i].iov_base = (char *) buf;
+iov[i].iov_len  = size;
+i++;
+
+tap_writev(opaque, iov, i);
 }
 
 static int tap_can_send(void *opaque)
@@ -4421,6 +4460,21 @@ static int tap_can_send(void *opaque)
 return can_receive;
 }
 
+static int tap_send_packet(TAPState *s)
+{
+uint8_t *buf = s->buf;
+int size = s->size;
+
+#ifdef IFF_VNET_HDR
+if (s->has_vnet_hdr && !s->using_vnet_hdr) {
+   buf += sizeof(struct virtio_net_hdr);
+   size -= sizeof(struct virtio_net_hdr);
+}
+#endif
+
+return qemu_send_packet(s->vc, buf, size);
+}
+
 static void tap_send(void *opaque)
 {
 TAPState *s = opaque;
@@ -4430,7 +4484,7 @@ static void tap_send(void *opaque)
int err;
 
/* If noone can receive the packet, buffer it */
-   err = qemu_send_packet(s->vc, s->buf, s->size);
+   err = tap_send_packet(s);
if (err == -EAGAIN)
return;
 }
@@ -4454,7 +4508,7 @@ static void tap_send(void *opaque)
int err;
 
/* If noone can receive the packet, buffer it */
-   err = qemu_send_packet(s->vc, s->buf, s->size);
+   err = tap_send_packet(s);
if (err == -EAGAIN)
break;
}
@@ -4469,6 +4523,17 @@ int tap_has_vnet_hdr(void *opaque)
 return s ? s->has_vnet_hdr : 0;
 }
 
+void tap_using_vnet_hdr(void *opaque, int using_vnet_hdr)
+{
+VLANClientState *vc = opaque;
+TAP

[PATCH] KVM: VMX: Clean up magic number 0x66 in init_rmode_tss

2008-08-13 Thread Avi Kivity
From: Sheng Yang <[EMAIL PROTECTED]>

Signed-off-by: Sheng Yang <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c4510fe..337670b 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1732,7 +1732,8 @@ static int init_rmode_tss(struct kvm *kvm)
if (r < 0)
goto out;
data = TSS_BASE_SIZE + TSS_REDIRECTION_SIZE;
-   r = kvm_write_guest_page(kvm, fn++, &data, 0x66, sizeof(u16));
+   r = kvm_write_guest_page(kvm, fn++, &data,
+   TSS_IOPB_BASE_OFFSET, sizeof(u16));
if (r < 0)
goto out;
r = kvm_clear_guest_page(kvm, fn++, 0, PAGE_SIZE);


[PATCH] KVM: remove unused field from the assigned dev struct

2008-08-13 Thread Avi Kivity
From: Ben-Ami Yassour <[EMAIL PROTECTED]>

Remove the unused field struct kvm_assigned_pci_dev assigned_dev
from struct kvm_assigned_dev_kernel.

Signed-off-by: Ben-Ami Yassour <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>

diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index 24805dc..5dd2f35 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -341,7 +341,6 @@ struct kvm_assigned_dev_kernel {
struct kvm_irq_ack_notifier ack_notifier;
struct work_struct interrupt_work;
struct list_head list;
-   struct kvm_assigned_pci_dev assigned_dev;
int assigned_dev_id;
int host_busnr;
int host_devfn;


[PATCH] KVM: set debug registers after "schedulable" section

2008-08-13 Thread Avi Kivity
From: Marcelo Tosatti <[EMAIL PROTECTED]>

The vcpu thread can be preempted after the guest_debug_pre() callback,
resulting in invalid debug registers on the new vcpu.

Move it inside the non-preemptable section.

Signed-off-by: Marcelo Tosatti <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a6299e6..ee005a6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3113,10 +3113,6 @@ static int __vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
down_read(&vcpu->kvm->slots_lock);
vapic_enter(vcpu);
 
-preempted:
-   if (vcpu->guest_debug.enabled)
-   kvm_x86_ops->guest_debug_pre(vcpu);
-
 again:
if (vcpu->requests)
if (test_and_clear_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests))
@@ -3170,6 +3166,9 @@ again:
goto out;
}
 
+   if (vcpu->guest_debug.enabled)
+   kvm_x86_ops->guest_debug_pre(vcpu);
+
vcpu->guest_mode = 1;
/*
 * Make sure that guest_mode assignment won't happen after
@@ -3244,7 +3243,7 @@ out:
if (r > 0) {
kvm_resched(vcpu);
down_read(&vcpu->kvm->slots_lock);
-   goto preempted;
+   goto again;
}
 
post_kvm_run_save(vcpu, kvm_run);
--
To unsubscribe from this list: send the line "unsubscribe kvm-commits" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] kvm: configure: fix qemu options with multiple arguments

2008-08-13 Thread Avi Kivity
From: Avi Kivity <[EMAIL PROTECTED]>

Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>

diff --git a/configure b/configure
index 72337c9..3bb10ce 100755
--- a/configure
+++ b/configure
@@ -8,7 +8,7 @@ objcopy=objcopy
 want_module=1
 qemu_cflags=
 qemu_ldflags=
-qemu_opts=
+qemu_opts=()
 cross_prefix=
 arch=`uname -m`
 target_exec=
@@ -40,9 +40,11 @@ EOF
 while [[ "$1" = -* ]]; do
 opt="$1"; shift
 arg=
+hasarg=
 if [[ "$opt" = *=* ]]; then
arg="${opt#*=}"
opt="${opt%%=*}"
+   hasarg=1
 fi
 case "$opt" in
--prefix)
@@ -70,7 +72,7 @@ while [[ "$1" = -* ]]; do
usage
;;
*)
-   qemu_opts="$qemu_opts $opt"
+   qemu_opts=("${qemu_opts[@]}" "$opt${hasarg:+=$arg}")
;;
 esac
 done
@@ -114,7 +116,7 @@ fi
 --kernel-path="$libkvm_kerneldir" \
 --prefix="$prefix" \
 ${cross_prefix:+"--cross-prefix=$cross_prefix"} \
-${cross_prefix:+"--cpu=$arch"} $qemu_opts
+${cross_prefix:+"--cpu=$arch"} "${qemu_opts[@]}"
 ) || usage
 
 


[PATCH] kvm: qemu: use proper open call for init file

2008-08-13 Thread Avi Kivity
From: Philippe Gerum <[EMAIL PROTECTED]>

This patch fixes misspelled calls to qemu_fopen_file().

Signed-off-by: Philippe Gerum <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>

diff --git a/qemu/hw/ds1225y.c b/qemu/hw/ds1225y.c
index 3b91b4f..b1d0284 100644
--- a/qemu/hw/ds1225y.c
+++ b/qemu/hw/ds1225y.c
@@ -171,13 +171,13 @@ void *ds1225y_init(target_phys_addr_t mem_base, const char *filename)
 s->protection = 7;
 
 /* Read current file */
-file = qemu_fopen(filename, "rb");
+file = qemu_fopen_file(filename, "rb");
 if (file) {
 /* Read nvram contents */
 qemu_get_buffer(file, s->contents, s->chip_size);
 qemu_fclose(file);
 }
-s->file = qemu_fopen(filename, "wb");
+s->file = qemu_fopen_file(filename, "wb");
 if (s->file) {
 /* Write back contents, as 'wb' mode cleaned the file */
 qemu_put_buffer(s->file, s->contents, s->chip_size);