Re: [PATCH 1/4] virtio_blk: deprecate the 1024-byte ID field.

2009-09-30 Thread Christian Borntraeger
Am Dienstag 29 September 2009 19:18:09 schrieb Rusty Russell:
 PCI, lguest and s390 can all only support 256-byte configuration
 space.  So, this giant field broke just about everyone.
 Unfortunately, removing it is not so simple: we don't want to break
 old userspace, but we're going to want to re-use that part of the
 struct.
 
 So, modern users can #define VIRTIO_BLK_IDENTIFY_DEPRECATED to indicate
 that they know it's no longer in the config struct, and can use any
 new features (all new features which add a configuration field will
 conflict with this deprecated one).
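
A minimal sketch of that opt-in, assuming the header simply gates the legacy
field on the macro (the exact mechanics are whatever the patch implements):

	/* modern userspace: we know the 1024-byte ID field is gone from
	 * the config struct, and we opt in to the new layout */
	#define VIRTIO_BLK_IDENTIFY_DEPRECATED
	#include <linux/virtio_blk.h>

Old userspace that doesn't define the macro keeps seeing the old layout.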


Since s390 never used the giant id field, it would be ok for us to just delete
it (without the #define). IIRC kvm-userspace never used it either. But since
qemu upstream does seem to use that field, your way seems to be the only
compatible one...

Oh dear, virtio used to look pretty ;-).
I think at some point in the future we will have to create a virtio2 that gets
rid of all the stuff that accumulated in the early phase of Linux
virtualization.

Anyway, your patch was tested successfully on s390 and survives the current
userspace.

Tested-by: Christian Borntraeger borntrae...@de.ibm.com


[PATCH] virtio_ids: let virtio header files include virtio_ids.h and export it

2009-09-30 Thread Christian Borntraeger
[PATCH] virtio_ids: let header files include virtio_ids.h

Rusty,

commit 3ca4f5ca73057a617f9444a91022d7127041970a
virtio: add virtio IDs file
moved all device IDs into a single file. While the change itself is
a very good one, it can break userspace applications. For example,
if a userspace tool wanted to get the ID of virtio_net, it used to
include virtio_net.h. This no longer works, since virtio_net.h
does not include virtio_ids.h.
This patch moves all #include <linux/virtio_ids.h> statements from the C
files into the header files, making the header files compatible with
the old ones.

In addition, this patch exports virtio_ids.h to userspace.
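
As a quick illustration (not part of the patch): a userspace tool that only
includes virtio_net.h keeps compiling unchanged, e.g.

	#include <stdio.h>
	#include <linux/virtio_net.h>	/* pulls in linux/virtio_ids.h again */

	int main(void)
	{
		printf("virtio-net device ID: %d\n", VIRTIO_ID_NET);
		return 0;
	}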

CCed: Fernando Luis Vazquez Cao ferna...@oss.ntt.co.jp
Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com
---
 Documentation/lguest/lguest.c   |1 -
 drivers/block/virtio_blk.c  |1 -
 drivers/char/hw_random/virtio-rng.c |1 -
 drivers/char/virtio_console.c   |1 -
 drivers/net/virtio_net.c|1 -
 drivers/virtio/virtio_balloon.c |1 -
 include/linux/Kbuild|1 +
 include/linux/virtio_9p.h   |1 +
 include/linux/virtio_balloon.h  |1 +
 include/linux/virtio_blk.h  |1 +
 include/linux/virtio_console.h  |1 +
 include/linux/virtio_net.h  |1 +
 include/linux/virtio_rng.h  |1 +
 net/9p/trans_virtio.c   |1 -
 14 files changed, 7 insertions(+), 7 deletions(-)

Index: linux-2.6/Documentation/lguest/lguest.c
===
--- linux-2.6.orig/Documentation/lguest/lguest.c
+++ linux-2.6/Documentation/lguest/lguest.c
@@ -42,7 +42,6 @@
 #include <signal.h>
 #include <linux/lguest_launcher.h>
 #include <linux/virtio_config.h>
-#include <linux/virtio_ids.h>
 #include <linux/virtio_net.h>
 #include <linux/virtio_blk.h>
 #include <linux/virtio_console.h>
Index: linux-2.6/drivers/block/virtio_blk.c
===
--- linux-2.6.orig/drivers/block/virtio_blk.c
+++ linux-2.6/drivers/block/virtio_blk.c
@@ -3,7 +3,6 @@
 #include <linux/blkdev.h>
 #include <linux/hdreg.h>
 #include <linux/virtio.h>
-#include <linux/virtio_ids.h>
 #include <linux/virtio_blk.h>
 #include <linux/scatterlist.h>
 
Index: linux-2.6/drivers/char/hw_random/virtio-rng.c
===
--- linux-2.6.orig/drivers/char/hw_random/virtio-rng.c
+++ linux-2.6/drivers/char/hw_random/virtio-rng.c
@@ -21,7 +21,6 @@
 #include <linux/scatterlist.h>
 #include <linux/spinlock.h>
 #include <linux/virtio.h>
-#include <linux/virtio_ids.h>
 #include <linux/virtio_rng.h>
 
 /* The host will fill any buffer we give it with sweet, sweet randomness.  We
Index: linux-2.6/drivers/char/virtio_console.c
===
--- linux-2.6.orig/drivers/char/virtio_console.c
+++ linux-2.6/drivers/char/virtio_console.c
@@ -31,7 +31,6 @@
 #include <linux/err.h>
 #include <linux/init.h>
 #include <linux/virtio.h>
-#include <linux/virtio_ids.h>
 #include <linux/virtio_console.h>
 #include "hvc_console.h"
 
Index: linux-2.6/drivers/net/virtio_net.c
===
--- linux-2.6.orig/drivers/net/virtio_net.c
+++ linux-2.6/drivers/net/virtio_net.c
@@ -22,7 +22,6 @@
 #include <linux/ethtool.h>
 #include <linux/module.h>
 #include <linux/virtio.h>
-#include <linux/virtio_ids.h>
 #include <linux/virtio_net.h>
 #include <linux/scatterlist.h>
 #include <linux/if_vlan.h>
Index: linux-2.6/drivers/virtio/virtio_balloon.c
===
--- linux-2.6.orig/drivers/virtio/virtio_balloon.c
+++ linux-2.6/drivers/virtio/virtio_balloon.c
@@ -19,7 +19,6 @@
  */
 //#define DEBUG
 #include <linux/virtio.h>
-#include <linux/virtio_ids.h>
 #include <linux/virtio_balloon.h>
 #include <linux/swap.h>
 #include <linux/kthread.h>
Index: linux-2.6/include/linux/Kbuild
===
--- linux-2.6.orig/include/linux/Kbuild
+++ linux-2.6/include/linux/Kbuild
@@ -363,6 +363,7 @@ unifdef-y += utsname.h
 unifdef-y += videodev2.h
 unifdef-y += videodev.h
 unifdef-y += virtio_config.h
+unifdef-y += virtio_ids.h
 unifdef-y += virtio_blk.h
 unifdef-y += virtio_net.h
 unifdef-y += virtio_9p.h
Index: linux-2.6/include/linux/virtio_9p.h
===
--- linux-2.6.orig/include/linux/virtio_9p.h
+++ linux-2.6/include/linux/virtio_9p.h
@@ -2,6 +2,7 @@
 #define _LINUX_VIRTIO_9P_H
 /* This header is BSD licensed so anyone can use the definitions to implement
  * compatible drivers/servers. */
+#include <linux/virtio_ids.h>
 #include <linux/virtio_config.h>
 
 /* Maximum number of virtio channels per partition (1 for now) */
Index: linux-2.6/include/linux/virtio_balloon.h
===
--- linux-2.6.orig/include/linux/virtio_balloon.h
+++ 

Re: [PATCH] virtio_ids: let virtio header files include virtio_ids.h and export it

2009-09-30 Thread Rusty Russell
On Wed, 30 Sep 2009 06:47:21 pm Christian Borntraeger wrote:
 [PATCH] virtio_ids: let header files include virtio_ids.h

Thanks, applied.

Rusty.


Re: [Pv-drivers] [PATCH 2.6.31-rc9] net: VMware virtual Ethernet NIC driver: vmxnet3

2009-09-30 Thread Arnd Bergmann
On Tuesday 29 September 2009, David Miller wrote:
  
  These header files are indeed shared with the host implementation,
  as you've guessed. If it's not a big deal, we would like to keep
  the names the same, just for our own sanity's sake?
 
 No.  This isn't your source tree, it's everyone's.  So you should
 adhere to basic naming conventions and coding standards of the
 tree regardless of what you happen to use or need to use internally.

Well, there is nothing wrong with making the identifiers the same
everywhere, as long as they all follow the Linux coding style ;-).

I heard that a number of cross-OS device drivers do that nowadays.

Arnd 


RE: [Pv-drivers] [PATCH 2.6.31-rc9] net: VMware virtual Ethernet NIC driver: vmxnet3

2009-09-30 Thread Bhavesh Davda
Hi Chris,

Thanks a bunch for your really thorough review! I'll answer some of your 
questions here. Shreyas can respond to your comments about some of the coding 
style/comments/etc. in a separate mail.

  INTx, MSI, MSI-X (25 vectors) interrupts
  16 Rx queues, 8 Tx queues
 
 Driver doesn't appear to actually support more than a single MSI-X
 interrupt.
 What is your plan for doing real multiqueue?

When we first wrote the driver a couple of years ago, Linux lacked proper
multiqueue support, so we chose to use only a single queue, even though the
emulated device does support 16 Rx and 8 Tx queues and, by design, 25 MSI-X
vectors: 16 for Rx, 8 for Tx and 1 for other asynchronous event notifications.
Actually, a driver can repurpose any of the 25 vectors for any notifications;
I'm just explaining the rationale for designing the device with 25 MSI-X
vectors.
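
For the curious, requesting that layout with the stock MSI-X API looks roughly
like this (a sketch; VMXNET3_NUM_INTRS and vmxnet3_alloc_msix are made-up
names, not our driver's code):

	#include <linux/pci.h>

	#define VMXNET3_NUM_INTRS 25	/* 16 Rx + 8 Tx + 1 event */

	static int vmxnet3_alloc_msix(struct pci_dev *pdev,
				      struct msix_entry *entries)
	{
		int i;

		/* one table entry per vector; .vector is filled in on success */
		for (i = 0; i < VMXNET3_NUM_INTRS; i++)
			entries[i].entry = i;

		/* 0 on success, > 0 = vectors actually available, < 0 = error */
		return pci_enable_msix(pdev, entries, VMXNET3_NUM_INTRS);
	}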

We do have an internal prototype of a Linux vmxnet3 driver with 4 Tx queues and
4 Rx queues, using 9 MSI-X vectors, but it needs some work before we can call it
production-ready.

 How about GRO conversion?

Looks attractive, and we'll work on that in a subsequent patch. Again, when we 
first wrote the driver, the NETIF_F_GRO stuff didn't exist in Linux.

 Also, heavy use of BUG_ON() (counted 51 of them), are you sure that
 none
 of them can be triggered by guest or remote (esp. the ones that happen
 in interrupt context)?  Some initial thoughts below.

We'll definitely audit all the BUG_ONs again to make sure they can't be 
exploited.

  --- /dev/null
  +++ b/drivers/net/vmxnet3/upt1_defs.h
  +#define UPT1_MAX_TX_QUEUES  64
  +#define UPT1_MAX_RX_QUEUES  64
 
 This is different than the 16/8 described above (and seemingly all moot
 since it becomes a single queue device).

Nice catch! Those are not even used and are from the earliest days of our 
driver development. We'll nuke those.

  +/* interrupt moderation level */
  +#define UPT1_IML_NONE 0 /* no interrupt moderation */
  +#define UPT1_IML_HIGHEST  7 /* least intr generated */
  +#define UPT1_IML_ADAPTIVE 8 /* adpative intr moderation */
 
 enum?  also only appears to support adaptive mode?

Yes, the Linux driver currently only asks for adaptive mode, but the device 
supports 8 interrupt moderation levels.
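
For reference, the enum conversion suggested above would be a small sketch
(same values, just typed):

	enum upt1_intr_mod_level {
		UPT1_IML_NONE		= 0,	/* no interrupt moderation */
		UPT1_IML_HIGHEST	= 7,	/* fewest interrupts generated */
		UPT1_IML_ADAPTIVE	= 8,	/* adaptive interrupt moderation */
	};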

  --- /dev/null
  +++ b/drivers/net/vmxnet3/vmxnet3_defs.h
  +struct Vmxnet3_MiscConf {
  +   struct Vmxnet3_DriverInfo driverInfo;
  +   uint64_t uptFeatures;
  +   uint64_t ddPA; /* driver data PA */
  +   uint64_t queueDescPA;  /* queue descriptor table PA */
  +   uint32_t ddLen;        /* driver data len */
  +   uint32_t queueDescLen; /* queue desc. table len in bytes */
  +   uint32_t mtu;
  +   uint16_t maxNumRxSG;
  +   uint8_t  numTxQueues;
  +   uint8_t  numRxQueues;
  +   uint32_t reserved[4];
  +};
 
 should this be packed (or others that are shared w/ device)?  i assume
 you've already done 32 vs 64 here
 

No need for packing since the fields are naturally 64-bit aligned. True for all 
structures shared between the driver and device.
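
One way to back that up at build time (a sketch, not driver code; the offsets
follow from the field sizes quoted above):

	#include <linux/kernel.h>	/* BUILD_BUG_ON */
	#include <linux/stddef.h>	/* offsetof */

	static inline void vmxnet3_check_shared_layout(void)
	{
		/* the three uint64_t fields pack back to back */
		BUILD_BUG_ON(offsetof(struct Vmxnet3_MiscConf, queueDescPA) -
			     offsetof(struct Vmxnet3_MiscConf, uptFeatures) != 16);
		/* u16 + u8 + u8 exactly fill the word before reserved[] */
		BUILD_BUG_ON(offsetof(struct Vmxnet3_MiscConf, reserved) -
			     offsetof(struct Vmxnet3_MiscConf, maxNumRxSG) != 4);
	}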

  +#define VMXNET3_MAX_TX_QUEUES  8
  +#define VMXNET3_MAX_RX_QUEUES  16
 
 different to UPT, I must've missed some layering here

These are the authoritative #defines. Ignore the UPT ones.

  --- /dev/null
  +++ b/drivers/net/vmxnet3/vmxnet3_drv.c
   VMXNET3_WRITE_BAR0_REG(adapter, VMXNET3_REG_IMR + intr_idx * 8, 0);
 
   writel(0, adapter->hw_addr0 + VMXNET3_REG_IMR + intr_idx * 8)
 seems just as clear to me.

Fair enough. We were just trying to clearly show which register accesses go to 
BAR 0 versus BAR 1.

 only ever num_intrs=1, so there's some plan to bump this up and make
 these wrappers useful?

Yes.

  +static void
  +vmxnet3_process_events(struct vmxnet3_adapter *adapter)
 
 Should be trivial to break out to its own MSI-X vector, basically set
 up to do that already.

Yes, and the device is configurable to use any vector for any events, but we
didn't see any compelling reason to do so. ECR events are extremely rare, and
we've got a shadow copy of the ECR register, stored in adapter->shared->ecr,
that avoids an expensive round trip to the device. So we can handle events on
the hot Tx/Rx path cheaply, with minimal overhead. But if you really see a
compelling reason to allocate a separate MSI-X vector for events, we can
certainly do that.

 
 Plan to switch to GRO?

Already answered.

Thanks

- Bhavesh


RE: [Pv-drivers] [PATCH 2.6.31-rc9] net: VMware virtual Ethernet NIC driver: vmxnet3

2009-09-30 Thread Bhavesh Davda
Hi Arnd,

 On Tuesday 29 September 2009, Chris Wright wrote:
   +struct Vmxnet3_MiscConf {
   +   struct Vmxnet3_DriverInfo driverInfo;
   +   uint64_t uptFeatures;
   +   uint64_t ddPA; /* driver data PA */
   +   uint64_t queueDescPA;  /* queue descriptor table PA */
   +   uint32_t ddLen;        /* driver data len */
   +   uint32_t queueDescLen; /* queue desc. table len in bytes */
   +   uint32_t mtu;
   +   uint16_t maxNumRxSG;
   +   uint8_t  numTxQueues;
   +   uint8_t  numRxQueues;
   +   uint32_t reserved[4];
   +};
 
  should this be packed (or others that are shared w/ device)?  i assume
  you've already done 32 vs 64 here
 
 I would not mark it packed, because it already is well-defined on all
 systems. You should add __packed only to the fields where you screwed
 up, but not to structures that already work fine.

You're exactly right; I reiterated as much in my response to Chris.

 One thing that should possibly be fixed is the naming of identifiers,
 e.g. 's/Vmxnet3_MiscConf/vmxnet3_misc_conf/g', unless these header files
 are shared with the host implementation.

These header files are indeed shared with the host implementation, as you've 
guessed. If it's not a big deal, we would like to keep the names the same, just 
for our own sanity's sake?

Thanks!

- Bhavesh

 
   Arnd 


RE: [Pv-drivers] [PATCH 2.6.31-rc9] net: VMware virtual Ethernet NIC driver: vmxnet3

2009-09-30 Thread Bhavesh Davda
  Thanks a bunch for your really thorough review! I'll answer some of
 your questions here. Shreyas can respond to your comments about some of
 the coding style/comments/etc. in a separate mail.
 
 The style is less important at this stage, but making it more consistent
 w/ Linux code certainly eases review.  The StudlyCaps, extra macros
 (screaming caps) and inconsistent space/tabs are visual distractions,
 that's all.

Agreed, but we'll definitely address all the style issues in our subsequent
patch posts. Actually, Shreyas showed me his raw patch and it had tabs, not
spaces, so we're trying to figure out whether Outlook (corporate blessed) or
our Exchange server is converting those tabs to spaces or something.

  We do have an internal prototype of a Linux vmxnet3 driver with 4 Tx
 queues and 4 Rx queues, using 9 MSI-X vectors, but it needs some work
 before calling it production ready.
 
 I'd expect once you switch to alloc_etherdev_mq(), make napi work per
 rx queue, and fix MSI-X allocation (all needed for 4/4), you should
 have enough to support the max of 16/8 (IOW, 4/4 still sounds like an
 artificial limitation).

Absolutely: 4/4 was simply a prototype to see if it helps performance with
certain benchmarks. So far it looks like there's a small performance gain with
microbenchmarks like netperf, but we're hoping that multiple queues with
multiple vectors will show more promise with macro benchmarks like SPECjbb. If
it pans out, we'll most likely make it a module_param with some reasonable
defaults, possibly just 1/1 by default.
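
To sketch the shape of that conversion (the kernel APIs are real; every
vmxnet3 name below is an assumption, not our driver's code):

	#include <linux/etherdevice.h>
	#include <linux/netdevice.h>

	#define VMXNET3_MAX_TX_QUEUES 8
	#define VMXNET3_MAX_RX_QUEUES 16

	struct vmxnet3_rx_queue {
		struct napi_struct napi;
		/* ring state elided */
	};

	struct vmxnet3_adapter {
		struct vmxnet3_rx_queue rx_queue[VMXNET3_MAX_RX_QUEUES];
		/* ... */
	};

	static int vmxnet3_poll_rx(struct napi_struct *napi, int budget)
	{
		/* drain up to budget packets from this queue's Rx ring */
		return 0;
	}

	static struct net_device *vmxnet3_alloc_netdev(void)
	{
		struct net_device *netdev;
		struct vmxnet3_adapter *adapter;
		int i;

		/* multiqueue-aware allocation: one Tx queue per hw queue */
		netdev = alloc_etherdev_mq(sizeof(*adapter),
					   VMXNET3_MAX_TX_QUEUES);
		if (!netdev)
			return NULL;

		adapter = netdev_priv(netdev);
		/* one NAPI context per Rx queue, as suggested */
		for (i = 0; i < VMXNET3_MAX_RX_QUEUES; i++)
			netif_napi_add(netdev, &adapter->rx_queue[i].napi,
				       vmxnet3_poll_rx, 64);
		return netdev;
	}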

   How about GRO conversion?
 
  Looks attractive, and we'll work on that in a subsequent patch.
 Again, when we first wrote the driver, the NETIF_F_GRO stuff didn't
 exist in Linux.
 
 OK, shouldn't be too much work.
 
 Another thing I forgot to mention is that net_device now has
 net_device_stats in it.  So you shouldn't need net_device_stats in
 vmxnet3_adapter.

Cool. Will do.
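
For instance, counter updates would then just touch the embedded stats (a
trivial sketch; vmxnet3_count_rx is a made-up helper):

	#include <linux/netdevice.h>

	static void vmxnet3_count_rx(struct net_device *netdev,
				     unsigned int len)
	{
		netdev->stats.rx_packets++;
		netdev->stats.rx_bytes += len;
	}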

+#define UPT1_MAX_TX_QUEUES  64
+#define UPT1_MAX_RX_QUEUES  64
  
   This is different than the 16/8 described above (and seemingly all moot
   since it becomes a single queue device).
 
  Nice catch! Those are not even used and are from the earliest days of
 our driver development. We'll nuke those.
 
 Could you describe the UPT layer a bit?  There were a number of
 constants that didn't appear to be used.

UPT stands for Uniform Pass Thru, a spec/framework VMware developed with its 
IHV partners to implement the fast path (Tx/Rx) features of vmxnet3 in silicon. 
Some of these #defines that appear not to be used are based on this initial 
spec that VMware shared with its IHV partners.

We divided the emulated vmxnet3 PCIe device's registers into two sets on two
separate BARs: BAR 0 holds the UPT registers we asked IHV partners to
implement, which we emulate in our hypervisor if no physical device compliant
with the UPT spec is available to pass through to a virtual machine; BAR 1
holds the registers we always emulate, used for slow path/control operations
like setting the MAC address, or activating/quiescing/resetting the device, etc.
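
To make the split concrete, the BAR-specific accessors look roughly like this
(a sketch; hw_addr1 and the exact macro bodies are assumptions):

	/* BAR 0: UPT registers, the fast path */
	#define VMXNET3_WRITE_BAR0_REG(adapter, reg, val) \
		writel((val), (adapter)->hw_addr0 + (reg))

	/* BAR 1: always-emulated registers, the slow/control path */
	#define VMXNET3_WRITE_BAR1_REG(adapter, reg, val) \
		writel((val), (adapter)->hw_addr1 + (reg))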

+static void
+vmxnet3_process_events(struct vmxnet3_adapter *adapter)
  
   Should be trivial to break out to its own MSI-X vector, basically set
   up to do that already.
 
  Yes, and the device is configurable to use any vector for any
 events, but didn't see any compelling reason to do so. ECR events
 are extremely rare and we've got a shadow copy of the ECR register that
 avoids an expensive round trip to the device, stored in
 adapter->shared->ecr. So we can cheaply handle events on the hot Tx/Rx
 path with minimal overhead. But if you really see a compelling reason to
 allocate a separate MSI-X vector for events, we can certainly do that.
 
 Nah, just thinking out loud while trying to understand the driver.  I
 figured it'd be the + 1 vector (16 + 8 + 1).

Great. In that case we'll stay with not allocating a separate vector for events 
for now.

Thanks!

- Bhavesh

 
 thanks,
 -chris


Re: [PATCH] virtio_console: Add support for multiple ports for generic guest and host communication

2009-09-30 Thread Amit Shah
On (Tue) Sep 29 2009 [15:31:23], Christian Borntraeger wrote:
 Am Dienstag 29 September 2009 15:09:50 schrieb Amit Shah:
  Great, thanks. However, I was thinking of moving this init to the probe()
  routine instead of the init_console routine, just because multiple
  consoles can be added and we don't want to init this each time; just
  once in probe is fine.
 
 If you have new patch CC me and I can give it a spin.

Hey Christian,

I have a new patch that changes a few things:
- moves the put_char fix to probe instead of doing it in
  init_port_console(), which gets called on each console port found.
- uses port->id instead of a static hvc_vtermno to pass on a value to
  hvc_alloc(). The motivation is explained in comments in the code.
- A few other changes that introduce and make use of port->vcon instead
  of accessing the static virtconsole directly -- aimed at easing a
  future fix to support multiple virtio-console devices.

It would be great if you could test this.

Amit


diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index 6a06913..7b4602f 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -679,6 +679,12 @@ config VIRTIO_CONSOLE
help
  Virtio console for use with lguest and other hypervisors.
 
+ Also serves as a general-purpose serial device for data
+ transfer between the guest and host. Character devices at
+ /dev/vconNN will be created when corresponding ports are
+ found. If specified by the host, a sysfs attribute called
+ 'name' will be populated with a name for the port which can
+ be used by udev scripts to create a symlink to /dev/vconNN.
 
 config HVCS
	tristate "IBM Hypervisor Virtual Console Server support"
diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 0d328b5..16cdcec 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -9,10 +9,8 @@
  * functions.
  :*/
 
-/*M:002 The console can be flooded: while the Guest is processing input the
- * Host can send more.  Buffering in the Host could alleviate this, but it is a
- * difficult problem in general. :*/
 /* Copyright (C) 2006, 2007 Rusty Russell, IBM Corporation
+ * Copyright (C) 2009, Amit Shah, Red Hat, Inc.
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -28,115 +26,468 @@
  * along with this program; if not, write to the Free Software
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
  */
+
+#include <linux/cdev.h>
+#include <linux/device.h>
 #include <linux/err.h>
+#include <linux/fs.h>
 #include <linux/init.h>
+#include <linux/poll.h>
+#include <linux/spinlock.h>
 #include <linux/virtio.h>
 #include <linux/virtio_ids.h>
 #include <linux/virtio_console.h>
+#include <linux/workqueue.h>
 #include "hvc_console.h"
 
-/*D:340 These represent our input and output console queues, and the virtio
- * operations for them. */
-static struct virtqueue *in_vq, *out_vq;
-static struct virtio_device *vdev;
+/* This struct stores data that's common to all the ports */
+struct virtio_console_struct {
+   /*
+* Workqueue handlers where we process deferred work after an
+* interrupt
+*/
+   struct work_struct rx_work;
+   struct work_struct tx_work;
+   struct work_struct config_work;
 
-/* This is our input buffer, and how much data is left in it. */
-static unsigned int in_len;
-static char *in, *inbuf;
+   struct list_head port_head;
+   struct list_head unused_read_head;
+   struct list_head unused_write_head;
 
-/* The operations for our console. */
-static struct hv_ops virtio_cons;
+   /* To protect the list of unused write buffers */
+   spinlock_t write_list_lock;
 
-/* The hvc device */
-static struct hvc_struct *hvc;
+   struct virtio_device *vdev;
+   struct class *class;
+   /* The input and the output queues */
+   struct virtqueue *in_vq, *out_vq;
 
-/*D:310 The put_chars() callback is pretty straightforward.
- *
- * We turn the characters into a scatter-gather list, add it to the output
- * queue and then kick the Host.  Then we sit here waiting for it to finish:
- * inefficient in theory, but in practice implementations will do it
- * immediately (lguest's Launcher does). */
-static int put_chars(u32 vtermno, const char *buf, int count)
+   /* The current config space is stored here */
+   struct virtio_console_config config;
+};
+
+/* This struct holds individual buffers received for each port */
+struct virtio_console_port_buffer {
+   struct list_head next;
+
+   char *buf;
+
+   /* length of the buffer */
+   size_t len;
+   /* offset in the buf from which to consume data */
+   size_t offset;
+};
+
+/* This struct holds the per-port data */
+struct virtio_console_port {
+   /* Next port in the list, head is in the virtio_console_struct */
+   struct 

virtio-blk + suspend to disk

2009-09-30 Thread Gleb Natapov
Hi,

Does anybody know if the subject is supported? It doesn't work, and it
seems that the virtio-pci device has no way to know that a suspend/resume
happened.

--
Gleb.


Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server

2009-09-30 Thread Gregory Haskins
Avi Kivity wrote:
 On 09/26/2009 12:32 AM, Gregory Haskins wrote:

 I realize in retrospect that my choice of words above implies vbus _is_
 complete, but this is not what I was saying.  What I was trying to
 convey is that vbus is _more_ complete.  Yes, in either case some kind
 of glue needs to be written.  The difference is that vbus implements
 more of the glue generally, and leaves less required to be customized
 for each iteration.



 No argument there.  Since you care about non-virt scenarios and virtio
 doesn't, naturally vbus is a better fit for them as the code stands.
  
 Thanks for finally starting to acknowledge there's a benefit, at least.

 
 I think I've mentioned vbus' finer grained layers as helpful here,
 though I doubt the value of this.  Hypervisors are added rarely, while
 devices and drivers are added (and modified) much more often.  I don't
 buy the anything-to-anything promise.

The ease in which a new hypervisor should be able to integrate into the
stack is only one of vbus's many benefits.

 
 To be more precise, IMO virtio is designed to be a performance oriented
 ring-based driver interface that supports all types of hypervisors (e.g.
 shmem based kvm, and non-shmem based Xen).  vbus is designed to be a
 high-performance generic shared-memory interconnect (for rings or
 otherwise) framework for environments where linux is the underpinning
 host (physical or virtual).  They are distinctly different, but
 complementary (the former addresses part of the front-end, and the
 latter addresses the back-end, and a different part of the front-end).

 
 They're not truly complementary since they're incompatible.

No, that is incorrect.  Not to be rude, but for clarity:

  Complementary \Com`ple*menta*ry\, a.
 Serving to fill out or to complete; as, complementary
 numbers.
 [1913 Webster]

Citation: www.dict.org

IOW: Something being complementary has nothing to do with guest/host
binary compatibility.  virtio-pci and virtio-vbus are both equally
complementary to virtio since they fill in the bottom layer of the
virtio stack.

So yes, vbus is truly complementary to virtio afaict.

 A 2.6.27 guest, or Windows guest with the existing virtio drivers, won't work
 over vbus.

Binary compatibility with existing virtio drivers, while nice to have,
is not a specific requirement or goal.  We will simply load an updated
KMP/MSI into those guests and they will work again.  As previously
discussed, this is more or less how any system works today.  It's like
removing an old adapter card and adding a new one to uprev the silicon.

  Further, non-shmem virtio can't work over vbus.

Actually I misspoke earlier when I said virtio works over non-shmem.
Thinking about it some more, both virtio and vbus fundamentally require
shared-memory, since sharing their metadata concurrently on both sides
is their raison d'ĂȘtre.

The difference is that virtio utilizes a pre-translation/mapping (via
->add_buf) from the guest side.  OTOH, vbus uses a post-translation
scheme (via memctx) from the host side.  If anything, vbus is actually
more flexible because it doesn't assume the entire guest address space
is directly mappable.
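
To illustrate the pre-translation point with the current virtqueue API (a
sketch; post_buffer is a made-up helper):

	#include <linux/virtio.h>
	#include <linux/scatterlist.h>

	static int post_buffer(struct virtqueue *vq, void *buf,
			       unsigned int len)
	{
		struct scatterlist sg;

		/* the buffer address is translated to a guest-physical
		 * address here, at add time, before the host sees it */
		sg_init_one(&sg, buf, len);

		/* out_num = 0, in_num = 1: the host writes into the buffer */
		if (vq->vq_ops->add_buf(vq, &sg, 0, 1, buf) < 0)
			return -ENOSPC;

		vq->vq_ops->kick(vq);
		return 0;
	}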

In summary, your statement is incorrect (though it is my fault for
putting that idea in your head).

  Since
 virtio is guest-oriented and host-agnostic, it can't ignore
 non-shared-memory hosts (even though it's unlikely virtio will be
 adopted there)

Well, to be fair, no one said it has to ignore them.  Either the
virtio-vbus transport is present and available to the virtio stack, or
it isn't.  If it's present, it may or may not publish objects for
consumption.  Providing a virtio-vbus transport in no way limits or
degrades the existing capabilities of the virtio stack.  It only
enhances them.

I digress.  The whole point is moot since I realized that the non-shmem
distinction isn't accurate anyway.  They both require shared-memory for
the metadata, and IIUC virtio requires the entire address space to be
mappable whereas vbus only assumes the metadata is.

 
 In addition, the kvm-connector used in AlacrityVM's design strives to
 add value and improve performance via other mechanisms, such as dynamic
   allocation, interrupt coalescing (thus reducing exit-ratio, which is a
 serious issue in KVM)
 
 Do you have measurements of inter-interrupt coalescing rates (excluding
 intra-interrupt coalescing).

I actually do not have a rig set up to explicitly test inter-interrupt
rates at the moment.  Once things stabilize for me, I will try to
re-gather some numbers here.  Last time I looked, however, there were
some decent savings for inter as well.

Inter rates are interesting because they are what tends to ramp up with
IO load, more than intra rates, since guest interrupt mitigation techniques
like NAPI often quell intra rates naturally.  This is especially true for
data-center, cloud, hpc-grid, etc., kinds of workloads (vs vanilla
desktops, etc.) that tend to have multiple IO

INFO: task journal:337 blocked for more than 120 seconds

2009-09-30 Thread Shirley Ma
Hello all,

Has anybody seen this problem before? I keep hitting this issue with a 2.6.31
guest kernel, even with a simple network test.

INFO: task kjournald:337 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

kjournald   D 0041  0   337 2 0x

My test is totally being blocked.

Thanks
Shirley



Re: [PATCH 2.6.32-rc1] net: VMware virtual Ethernet NIC driver: vmxnet3

2009-09-30 Thread Stephen Hemminger
On Wed, 30 Sep 2009 14:34:57 -0700 (PDT)
Shreyas Bhatewara sbhatew...@vmware.com wrote:

Note: your patch was linewrapped again

 +
 +
 +static void
 +vmxnet3_declare_features(struct vmxnet3_adapter *adapter, bool dma64)
 +{
 + struct net_device *netdev = adapter->netdev;
 +
 + netdev->features = NETIF_F_SG |
 + NETIF_F_HW_CSUM |
 + NETIF_F_HW_VLAN_TX |
 + NETIF_F_HW_VLAN_RX |
 + NETIF_F_HW_VLAN_FILTER |
 + NETIF_F_TSO |
 + NETIF_F_TSO6;
 +
 + printk(KERN_INFO "features: sg csum vlan jf tso tsoIPv6");
 +
 + adapter->rxcsum = true;
 + adapter->jumbo_frame = true;
 +
 + if (!disable_lro) {
 + adapter->lro = true;
 + printk(" lro");
 + }

Why not use NETIF_F_LRO and ethtool to control LRO support?


RE: [PATCH 2.6.32-rc1] net: VMware virtual Ethernet NIC driver: vmxnet3

2009-09-30 Thread Shreyas Bhatewara
Stephen,

Thanks for taking a look.



 -Original Message-
 From: Stephen Hemminger [mailto:shemmin...@vyatta.com]
 Sent: Wednesday, September 30, 2009 5:39 PM
 To: Shreyas Bhatewara
 Cc: linux-kernel; netdev; Stephen Hemminger; David S. Miller; Jeff
 Garzik; Anthony Liguori; Chris Wright; Greg Kroah-Hartman; Andrew
 Morton; virtualization; pv-drivers
 Subject: Re: [PATCH 2.6.32-rc1] net: VMware virtual Ethernet NIC
 driver: vmxnet3
 
 On Wed, 30 Sep 2009 14:34:57 -0700 (PDT)
 Shreyas Bhatewara sbhatew...@vmware.com wrote:
 
 Note: your patch was linewrapped again
 

Fixed the alpine option. Should not happen again.

  +
  +
  +static void
  +vmxnet3_declare_features(struct vmxnet3_adapter *adapter, bool dma64)
  +{
  +   struct net_device *netdev = adapter->netdev;
  +
  +   netdev->features = NETIF_F_SG |
  +   NETIF_F_HW_CSUM |
  +   NETIF_F_HW_VLAN_TX |
  +   NETIF_F_HW_VLAN_RX |
  +   NETIF_F_HW_VLAN_FILTER |
  +   NETIF_F_TSO |
  +   NETIF_F_TSO6;
  +
  +   printk(KERN_INFO "features: sg csum vlan jf tso tsoIPv6");
  +
  +   adapter->rxcsum = true;
  +   adapter->jumbo_frame = true;
  +
  +   if (!disable_lro) {
  +   adapter->lro = true;
  +   printk(" lro");
  +   }
 
 Why not use NETIF_F_LRO and ethtool to control LRO support?

Yes, that would be a better way to do it. I will make that change.





Re: [PATCH 2.6.32-rc1] net: VMware virtual Ethernet NIC driver: vmxnet3

2009-09-30 Thread David Miller
From: Stephen Hemminger shemmin...@vyatta.com
Date: Wed, 30 Sep 2009 17:39:23 -0700

 Why not use NETIF_F_LRO and ethtool to control LRO support?

In fact, you must, in order to handle bridging and routing
correctly.

Bridging and routing are illegal with LRO enabled, so the kernel
automatically issues the necessary ethtool commands to disable
LRO on the relevant devices.

Therefore you must support the ethtool LRO operation in order to
support LRO at all.
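
A minimal sketch of that ethtool hook, using the in-tree flags interface
(the vmxnet3_* names are assumptions, not the submitted driver):

	#include <linux/ethtool.h>
	#include <linux/netdevice.h>

	static int vmxnet3_set_flags(struct net_device *netdev, u32 data)
	{
		if (data & ~ETH_FLAG_LRO)
			return -EOPNOTSUPP;	/* only LRO is toggled here */

		if (data & ETH_FLAG_LRO)
			netdev->features |= NETIF_F_LRO;
		else
			netdev->features &= ~NETIF_F_LRO; /* dev_disable_lro()
							     lands here */
		return 0;
	}

	static const struct ethtool_ops vmxnet3_ethtool_ops = {
		.get_flags = ethtool_op_get_flags,
		.set_flags = vmxnet3_set_flags,
	};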