Re: [PATCH] kvm tool: add QCOW verions 1 read/write support

2011-04-15 Thread Pekka Enberg
Hi Christoph,

On Wed, Apr 13, 2011 at 08:01:58PM +0100, Prasad Joshi wrote:
 The patch only implements the basic read write support for QCOW version 1
 images. Many of the QCOW features are not implmented, for example

On Fri, Apr 15, 2011 at 6:24 AM, Christoph Hellwig h...@infradead.org wrote:
 What's the point?  Qcow1 has been deprecated for a long time.

We're going to add QCOW2 too. Do you think there's no value in
supporting QCOW1 and we should drop it completely? I have no
objections in being QCOW2-only.

 Pekka
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Why QCOW1? (was: [PATCH v2] kvm tool: add QCOW verions 1 read/write support)

2011-04-15 Thread Markus Armbruster
Kevin Wolf kw...@redhat.com writes:

 Am 13.04.2011 21:26, schrieb Prasad Joshi:
 The patch only implements the basic read write support for QCOW version 1
 images. Many of the QCOW features are not implmented, for example
  - image creation
  - snapshot
  - copy-on-write
  - encryption

 Yay, more forks, more code duplication!

 Have you thought about a way to actually share code with qemu instead of
 repeating Xen's mistake of copying code, modifying it until merges are
 no longer possible and then let it bitrot?

 If you shared code, you also wouldn't have to use an obsolete, but
 simple-to-implement format, but could use some of the better maintained
 formats of qemu, like qcow2.

 Also at least your qcow1.c is lacking the copyright header. Please add
 this, otherwise you're violating the license.

As discussed already, reimplementing rather than sharing can be excused,
because the existing code is not ready for sharing.

What hasn't been discussed much is the other half of Kevin's remark: why
QCOW1?  Does anyone still use that old hag in anger?  The format people
actually use is QCOW2, and it is properly maintained.

Even for little-used QOW formats, there are more interesting choices:
QED and FVD.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Why QCOW1? (was: [PATCH v2] kvm tool: add QCOW verions 1 read/write support)

2011-04-15 Thread Pekka Enberg
On Fri, Apr 15, 2011 at 9:41 AM, Markus Armbruster arm...@redhat.com wrote:
 What hasn't been discussed much is the other half of Kevin's remark: why
 QCOW1?

QCOW1 was simpler to implement as the first non-raw image format.

Pekka
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm tool: add QCOW verions 1 read/write support

2011-04-15 Thread Kevin Wolf
Am 15.04.2011 08:24, schrieb Pekka Enberg:
 Hi Christoph,
 
 On Wed, Apr 13, 2011 at 08:01:58PM +0100, Prasad Joshi wrote:
 The patch only implements the basic read write support for QCOW version 1
 images. Many of the QCOW features are not implmented, for example
 
 On Fri, Apr 15, 2011 at 6:24 AM, Christoph Hellwig h...@infradead.org wrote:
 What's the point?  Qcow1 has been deprecated for a long time.
 
 We're going to add QCOW2 too. Do you think there's no value in
 supporting QCOW1 and we should drop it completely? I have no
 objections in being QCOW2-only.

We're not going to remove it from qemu soon, but the main point for
keeping it is conversion of old qcow1 images to current formats.

Kevin
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Why QCOW1? (was: [PATCH v2] kvm tool: add QCOW verions 1 read/write support)

2011-04-15 Thread Stefan Hajnoczi
On Fri, Apr 15, 2011 at 7:45 AM, Pekka Enberg penb...@kernel.org wrote:
 On Fri, Apr 15, 2011 at 9:41 AM, Markus Armbruster arm...@redhat.com wrote:
 What hasn't been discussed much is the other half of Kevin's remark: why
 QCOW1?

 QCOW1 was simpler to implement as the first non-raw image format.

Why even use a non-raw image format?  The current implementation only
does sparse files, but POSIX sparse raw files gives you the same
feature.

Besides, why not use btrfs or device-mapper instead of doing image
formats, which ultimately duplicate file system and volume management
code in userspace?

Stefan
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3 v2] kvm tools: Setup bridged network by a script

2011-04-15 Thread Amos Kong
Use original hardcode network by default.

#./kvm run ... -n virtio --tapscript=./util/kvm-ifup-vbr0
# brctl show
bridge name bridge id   STP enabled interfaces
vbr08000.e272c7c391f4   no  tap0
guest)# ifconfig eth6
eth6  Link encap:Ethernet  HWaddr 00:11:22:33:44:55
  inet addr:192.168.33.192  Bcast:192.168.33.255  Mask:255.255.255.0
  inet6 addr: fe80::211:22ff:fe33:4455/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:22 errors:0 dropped:0 overruns:0 frame:0
  TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:3725 (3.6 KiB)  TX bytes:852 (852.0 b)
guest)# ping amosk.info
PING amosk.info (69.175.108.82) 56(84) bytes of data.
64 bytes from nurpulat.uz (69.175.108.82): icmp_seq=1 ttl=43 time=306 ms

Changes from v1:
- rebased to latest tree
- replace system() by execv()

Signed-off-by: Amos Kong kongjian...@gmail.com
---
 tools/kvm/include/kvm/virtio-net.h |1 +
 tools/kvm/kvm-run.c|   10 +-
 tools/kvm/virtio-net.c |   35 +--
 3 files changed, 35 insertions(+), 11 deletions(-)

diff --git a/tools/kvm/include/kvm/virtio-net.h 
b/tools/kvm/include/kvm/virtio-net.h
index c889854..959ccb5 100644
--- a/tools/kvm/include/kvm/virtio-net.h
+++ b/tools/kvm/include/kvm/virtio-net.h
@@ -7,6 +7,7 @@ struct virtio_net_parameters {
struct kvm *self;
const char *host_ip;
char guest_mac[6];
+   const char *script;
 };
 
 void virtio_net__init(const struct virtio_net_parameters *params);
diff --git a/tools/kvm/kvm-run.c b/tools/kvm/kvm-run.c
index 4007402..80e1a6c 100644
--- a/tools/kvm/kvm-run.c
+++ b/tools/kvm/kvm-run.c
@@ -33,6 +33,7 @@
 #define DEFAULT_NETWORKvirtio
 #define DEFAULT_HOST_ADDR  192.168.33.2
 #define DEFAULT_GUEST_MAC  00:11:22:33:44:55
+#define DEFAULT_SCRIPT none
 
 #define MB_SHIFT   (20)
 #define MIN_RAM_SIZE_MB(64ULL)
@@ -65,6 +66,7 @@ static const char *kvm_dev;
 static const char *network;
 static const char *host_ip_addr;
 static const char *guest_mac;
+static const char *script;
 static bool single_step;
 static bool readonly_image;
 extern bool ioport_debug;
@@ -102,6 +104,8 @@ static const struct option options[] = {
Assign this address to the host side networking),
OPT_STRING('\0', guest-mac, guest_mac, aa:bb:cc:dd:ee:ff,
Assign this address to the guest side NIC),
+   OPT_STRING('\0', tapscript, script, Script path,
+Assign a script to process created tap device),
OPT_GROUP(Debug options:),
OPT_STRING('d', kvm-dev, kvm_dev, kvm-dev, KVM device file),
OPT_BOOLEAN('s', single-step, single_step,
@@ -277,6 +281,9 @@ int kvm_cmd_run(int argc, const char **argv, const char 
*prefix)
if (!guest_mac)
guest_mac = DEFAULT_GUEST_MAC;
 
+   if (!script)
+   script = DEFAULT_SCRIPT;
+
term_init();
 
kvm = kvm__init(kvm_dev, ram_size);
@@ -320,7 +327,8 @@ int kvm_cmd_run(int argc, const char **argv, const char 
*prefix)
if (!strncmp(network, virtio, 6)) {
net_params = (struct virtio_net_parameters) {
.host_ip = host_ip_addr,
-   .self = kvm
+   .self = kvm,
+   .script = script
};
sscanf(guest_mac,   %hhx:%hhx:%hhx:%hhx:%hhx:%hhx,
net_params.guest_mac,
diff --git a/tools/kvm/virtio-net.c b/tools/kvm/virtio-net.c
index 8d08272..f8d7276 100644
--- a/tools/kvm/virtio-net.c
+++ b/tools/kvm/virtio-net.c
@@ -17,6 +17,8 @@
 #include arpa/inet.h
 #include sys/types.h
 #include sys/socket.h
+#include unistd.h
+#include sys/wait.h
 
 #define VIRTIO_NET_IRQ 14
 #define VIRTIO_NET_QUEUE_SIZE  128
@@ -280,7 +282,7 @@ static bool virtio_net__tap_init(const struct 
virtio_net_parameters *params)
 {
struct ifreq ifr;
int sock = socket(AF_INET, SOCK_STREAM, 0);
-   int i;
+   int i, pid, status;
struct sockaddr_in sin = {0};
 
for (i = 0 ; i  6 ; i++)
@@ -304,18 +306,31 @@ static bool virtio_net__tap_init(const struct 
virtio_net_parameters *params)
 
ioctl(net_device.tap_fd, TUNSETNOCSUM, 1);
 
+   if (strcmp(params-script, none)) {
+   pid = fork();
+   if (pid == 0) {
+   execl(params-script, params-script, 
net_device.tap_name, NULL);
+   _exit(1);
+   } else {
+   waitpid(pid, status, 0);
+   if (WIFEXITED(status)  WEXITSTATUS(status) != 0) {
+   warning(Fail to setup tap by %s, 
params-script);
+   

Re: [RFT/PATCH v2] kvm tools: Add read-only support for block devices

2011-04-15 Thread Ingo Molnar

* Pekka Enberg penb...@kernel.org wrote:

 Add support for booting guests to host block devices in read-only mode.
 
 Cc: Asias He asias.he...@gmail.com
 Cc: Cyrill Gorcunov gorcu...@gmail.com
 Cc: Prasad Joshi prasadjoshi...@gmail.com
 Cc: Sasha Levin levinsasha...@gmail.com
 Cc: Ingo Molnar mi...@elte.hu
 Signed-off-by: Pekka Enberg penb...@kernel.org
 ---
  tools/kvm/disk-image.c |   39 +++
  1 files changed, 31 insertions(+), 8 deletions(-)

Nice patch - booting into the host's userspace works out of box now. The 
following disk-image-less command:

   ./kvm run ../../arch/x86/boot/bzImage --readonly --image=/dev/sda 
--params=root=/dev/vda1

has booted all the way into the host's Fedora Rawhide userspace:

   ===
   Welcome to Fedora release 16 (Rawhide)!

   [...]

   Fedora release 16 (Rawhide)
   Kernel 2.6.39-rc3-tip+ on an x86_64 (ttyS0)

   aldebaran login: 
   ===

I ran this as unprivileged user (who had read access to the partitions in 
question).

This is a very useful feature for testing kernels - i can test any random Linux 
box's userspace via KVM, without having to copy an image there! :-)

Tested-by: Ingo Molnar mi...@elte.hu

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Why QCOW1? (was: [PATCH v2] kvm tool: add QCOW verions 1 read/write support)

2011-04-15 Thread Pekka Enberg
On Fri, Apr 15, 2011 at 1:14 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 Why even use a non-raw image format?  The current implementation only
 does sparse files, but POSIX sparse raw files gives you the same
 feature.

Because people have existing images they want to boot to?
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Why QCOW1? (was: [PATCH v2] kvm tool: add QCOW verions 1 read/write support)

2011-04-15 Thread Stefan Hajnoczi
On Fri, Apr 15, 2011 at 12:17 PM, Pekka Enberg penb...@kernel.org wrote:
 On Fri, Apr 15, 2011 at 1:14 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 Why even use a non-raw image format?  The current implementation only
 does sparse files, but POSIX sparse raw files gives you the same
 feature.

 Because people have existing images they want to boot to?

People don't have existing QCOW1 images they want to boot from :).

They have vmdk, vhd, vdi, or qcow2.  You can use qemu-img to convert
them to raw.  You can use qemu-nbd if you are desperate to boot from
or inspect them in-place.

But I think the natural path for a native Linux KVM tool is to fully
exploit file systems and block layer features in Linux instead of
implementing a userspace block layer.

Stefan
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Why QCOW1? (was: [PATCH v2] kvm tool: add QCOW verions 1 read/write support)

2011-04-15 Thread Pekka Enberg
On Fri, Apr 15, 2011 at 3:05 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 People don't have existing QCOW1 images they want to boot from :).

 They have vmdk, vhd, vdi, or qcow2.  You can use qemu-img to convert
 them to raw.  You can use qemu-nbd if you are desperate to boot from
 or inspect them in-place.

 But I think the natural path for a native Linux KVM tool is to fully
 exploit file systems and block layer features in Linux instead of
 implementing a userspace block layer.

For optimum performance, sure, but we want to support non-native
images for usability reasons.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Why QCOW1?

2011-04-15 Thread Anthony Liguori

On 04/15/2011 06:17 AM, Pekka Enberg wrote:

On Fri, Apr 15, 2011 at 1:14 PM, Stefan Hajnoczistefa...@gmail.com  wrote:

Why even use a non-raw image format?  The current implementation only
does sparse files, but POSIX sparse raw files gives you the same
feature.

Because people have existing images they want to boot to?


Since this project is trying to be so integrated into the kernel, why 
not add support for image files directly to the kernel?  Add a 
qcow1/qcow2/qed block device type such that you can use normal tools 
(like mount) to work with the image files.


Regards,

Anthony Liguori


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Why QCOW1?

2011-04-15 Thread Kevin Wolf
Am 15.04.2011 14:05, schrieb Stefan Hajnoczi:
 On Fri, Apr 15, 2011 at 12:17 PM, Pekka Enberg penb...@kernel.org wrote:
 On Fri, Apr 15, 2011 at 1:14 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 Why even use a non-raw image format?  The current implementation only
 does sparse files, but POSIX sparse raw files gives you the same
 feature.

 Because people have existing images they want to boot to?
 
 People don't have existing QCOW1 images they want to boot from :).
 
 They have vmdk, vhd, vdi, or qcow2.  You can use qemu-img to convert
 them to raw.  You can use qemu-nbd if you are desperate to boot from
 or inspect them in-place.
 
 But I think the natural path for a native Linux KVM tool is to fully
 exploit file systems and block layer features in Linux instead of
 implementing a userspace block layer.

As a normal user, I can deal with files, but I can't write to block
devices or mount file systems. I'm sure that there are use cases that
don't require file-based formats, but I'm also relatively sure that they
are a minority (at least if you also take convenience into consideration).

Kevin
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/4] kvm tool: use correct function names

2011-04-15 Thread Prasad Joshi
The function name sect_to_l1_offset() is changed to get_l1_index() as it
returns the l1 table index rather than offset.
Also change
- sect_to_l2_offset to get_l2_index
- sect_to_cluster_offset to get_cluster_offset

Signed-off-by: Prasad Joshi prasadjoshi...@gmail.com
---
 tools/kvm/qcow.c |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/kvm/qcow.c b/tools/kvm/qcow.c
index c4e3e48..219fd6b 100644
--- a/tools/kvm/qcow.c
+++ b/tools/kvm/qcow.c
@@ -15,21 +15,21 @@
 #include linux/byteorder.h
 #include linux/types.h
 
-static inline uint64_t sect_to_l1_offset(struct qcow *q, uint64_t offset)
+static inline uint64_t get_l1_index(struct qcow *q, uint64_t offset)
 {
struct qcow1_header *header = q-header;
 
return offset  (header-l2_bits + header-cluster_bits);
 }
 
-static inline uint64_t sect_to_l2_offset(struct qcow *q, uint64_t offset)
+static inline uint64_t get_l2_index(struct qcow *q, uint64_t offset)
 {
struct qcow1_header *header = q-header;
 
return (offset  (header-cluster_bits))  ((1  header-l2_bits)-1);
 }
 
-static inline uint64_t sect_to_cluster_offset(struct qcow *q, uint64_t offset)
+static inline uint64_t get_cluster_offset(struct qcow *q, uint64_t offset)
 {
struct qcow1_header *header = q-header;
 
@@ -53,7 +53,7 @@ static int qcow1_read_sector(struct disk_image *self, 
uint64_t sector, void *dst
if (offset = header-size)
goto out_error;
 
-   l1_idx  = sect_to_l1_offset(self-priv, offset);
+   l1_idx  = get_l1_index(q, offset);
 
if (l1_idx = q-table.table_size)
goto out_error;
@@ -71,7 +71,7 @@ static int qcow1_read_sector(struct disk_image *self, 
uint64_t sector, void *dst
if (pread_in_full(q-fd, l2_table, sizeof(uint64_t) * l2_table_size, 
l2_table_offset)  0)
goto out_error_free_l2;
 
-   l2_idx  = sect_to_l2_offset(self-priv, offset);
+   l2_idx  = get_l2_index(q, offset);
 
if (l2_idx = l2_table_size)
goto out_error_free_l2;
@@ -81,7 +81,7 @@ static int qcow1_read_sector(struct disk_image *self, 
uint64_t sector, void *dst
if (!clust_start)
goto zero_sector;
 
-   clust_offset= sect_to_cluster_offset(self-priv, offset);
+   clust_offset= get_cluster_offset(q, offset);
 
if (pread_in_full(q-fd, dst, dst_len, clust_start + clust_offset)  0)
goto out_error_free_l2;
-- 
1.7.1
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/4] kvm tool: deallocate the cached l1_table in qcow1_disk_close() and in error path of qcow1_probe()

2011-04-15 Thread Prasad Joshi
Signed-off-by: Prasad Joshi prasadjoshi...@gmail.com
---
 tools/kvm/qcow.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/tools/kvm/qcow.c b/tools/kvm/qcow.c
index 243bfa8..9b9af86 100644
--- a/tools/kvm/qcow.c
+++ b/tools/kvm/qcow.c
@@ -115,6 +115,7 @@ static void qcow1_disk_close(struct disk_image *self)
 
q = self-priv;
 
+   free(q-table.l1_table);
free(q-header);
free(q);
 }
@@ -200,6 +201,7 @@ error:
if (!q)
return NULL;
 
+   free(q-table.l1_table);
free(q-header);
free(q);
 
-- 
1.7.1
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/4] kvm tool: check the cluster boundary in the qcow read code.

2011-04-15 Thread Prasad Joshi
Add a new function qcow1_read_cluster() to read a qcow cluster size data at a
time. The function qcow1_read_sector() is modified to use the function
qcow1_read_cluster().

Signed-off-by: Prasad Joshi prasadjoshi...@gmail.com
---
 tools/kvm/qcow.c |  123 ++
 1 files changed, 78 insertions(+), 45 deletions(-)

diff --git a/tools/kvm/qcow.c b/tools/kvm/qcow.c
index 9b9af86..e6d5897 100644
--- a/tools/kvm/qcow.c
+++ b/tools/kvm/qcow.c
@@ -36,68 +36,101 @@ static inline uint64_t get_cluster_offset(struct qcow *q, 
uint64_t offset)
return offset  ((1  header-cluster_bits)-1);
 }
 
-static int qcow1_read_sector(struct disk_image *self, uint64_t sector, void 
*dst, uint32_t dst_len)
+static uint32_t qcow1_read_cluster(struct qcow *q, uint64_t offset, void *dst, 
uint32_t dst_len)
 {
-   struct qcow *q = self-priv;
struct qcow1_header *header = q-header;
-   uint64_t l2_table_offset;
-   uint64_t l2_table_size;
-   uint64_t clust_offset;
-   uint64_t clust_start;
-   uint64_t *l2_table;
-   uint64_t l1_idx;
-   uint64_t l2_idx;
-   uint64_t offset;
-
-   offset  = sector  SECTOR_SHIFT;
-   if (offset = header-size)
-   goto out_error;
+   struct qcow_table *table= q-table;
+   uint32_t length;
+
+   uint32_t l2_table_size;
+   u64 l2_table_offset;
+   u64 *l2_table;
+   u64 l2_idx;
 
-   l1_idx  = get_l1_index(q, offset);
+   uint32_t clust_size;
+   u64 clust_offset;
+   u64 clust_start;
 
-   if (l1_idx = q-table.table_size)
-   goto out_error;
+   u64 l1_idx;
 
-   l2_table_offset = q-table.l1_table[l1_idx];
-   if (!l2_table_offset)
-   goto zero_sector;
+   clust_size= 1  header-cluster_bits;
+   l2_table_size = 1  header-l2_bits;
 
-   l2_table_size   = 1  header-l2_bits;
+   l1_idx = get_l1_index(q, offset);
+   if (l1_idx = table-table_size)
+   goto error;
 
-   l2_table= calloc(l2_table_size, sizeof(uint64_t));
-   if (!l2_table)
-   goto out_error;
+   l2_idx = get_l2_index(q, offset);
+   if (l2_idx = l2_table_size)
+   goto error;
 
-   if (pread_in_full(q-fd, l2_table, sizeof(uint64_t) * l2_table_size, 
l2_table_offset)  0)
-   goto out_error_free_l2;
+   clust_offset = get_cluster_offset(q, offset);
+   if (clust_offset = clust_size)
+   goto error;
 
-   l2_idx  = get_l2_index(q, offset);
+   length = clust_size - clust_offset;
+   if (length  dst_len)
+   length = dst_len;
 
-   if (l2_idx = l2_table_size)
-   goto out_error_free_l2;
+   l2_table_offset = table-l1_table[l1_idx];
+   if (!l2_table_offset) {
+   /* unallocated level 2 table: returned zeroed data */
+   memset(dst, 0, length);
+   goto out;
+   }
 
-   clust_start = be64_to_cpu(l2_table[l2_idx]);
+   l2_table = calloc(l2_table_size, sizeof(u64));
+   if (!l2_table)
+   goto error;
 
-   if (!clust_start)
-   goto zero_sector;
+   if (pread_in_full(q-fd, l2_table, sizeof(u64) * l2_table_size,
+   l2_table_offset)  0)
+   goto free_l2;
 
-   clust_offset= get_cluster_offset(q, offset);
+   clust_start = be64_to_cpu(l2_table[l2_idx]);
+   if (!clust_start) {
+   /* unalloacted cluster return zeroed data */
+   memset(dst, 0, length);
+   free(l2_table);
+   goto out;
+   }
 
-   if (pread_in_full(q-fd, dst, dst_len, clust_start + clust_offset)  0)
-   goto out_error_free_l2;
+   if (pread_in_full(q-fd, dst, length, clust_start + clust_offset)  0)
+   goto free_l2;
 
+out:
+   return length;
+free_l2:
free(l2_table);
-
+error:
return 0;
+}
 
-zero_sector:
-   memset(dst, 0, dst_len);
+static int qcow1_read_sector(struct disk_image *self, uint64_t sector, void 
*dst, uint32_t dst_len)
+{
+   struct qcow *q = self-priv;
+   struct qcow1_header *header = q-header;
+   uint32_t length = 0;
+   char *buf = dst;
+   uint64_t offset;
+   uint32_t nr;
 
-   return 0;
+   while (length  dst_len) {
+   offset = sector  SECTOR_SHIFT;
+   if (offset = header-size)
+   goto error;
 
-out_error_free_l2:
-   free(l2_table);
-out_error:
+   nr = qcow1_read_cluster(q, offset, buf, dst_len - length);
+   if (!nr)
+   goto error;
+
+   length += nr;
+   buf+= nr;
+   sector += (nr  SECTOR_SHIFT);
+   }
+
+   return 0;
+error:
return -1;
 }
 
-- 
1.7.1
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message 

[PATCH 2/4] kvm tool: avoid byte-order conversion during each read operation

2011-04-15 Thread Prasad Joshi
Signed-off-by: Prasad Joshi prasadjoshi...@gmail.com
---
 tools/kvm/qcow.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/tools/kvm/qcow.c b/tools/kvm/qcow.c
index 219fd6b..243bfa8 100644
--- a/tools/kvm/qcow.c
+++ b/tools/kvm/qcow.c
@@ -58,7 +58,7 @@ static int qcow1_read_sector(struct disk_image *self, 
uint64_t sector, void *dst
if (l1_idx = q-table.table_size)
goto out_error;
 
-   l2_table_offset = be64_to_cpu(q-table.l1_table[l1_idx]);
+   l2_table_offset = q-table.l1_table[l1_idx];
if (!l2_table_offset)
goto zero_sector;
 
@@ -128,16 +128,23 @@ struct disk_image_operations qcow1_disk_ops = {
 static int qcow_read_l1_table(struct qcow *q)
 {
struct qcow1_header *header = q-header;
+   struct qcow_table *table = q-table;
+   u64 i;
 
-   q-table.table_size = header-size / ((1  header-l2_bits) * (1 
 header-cluster_bits));
+   table-table_size = header-size / ((1  header-l2_bits) *
+   (1  header-cluster_bits));
 
-   q-table.l1_table   = calloc(q-table.table_size, sizeof(uint64_t));
-   if (!q-table.l1_table)
+   table-l1_table = calloc(table-table_size, sizeof(u64));
+   if (!table-l1_table)
return -1;
 
-   if (pread_in_full(q-fd, q-table.l1_table, sizeof(uint64_t) * 
q-table.table_size, header-l1_table_offset)  0)
+   if (pread_in_full(q-fd, table-l1_table, sizeof(u64) *
+   table-table_size, header-l1_table_offset)  0)
return -1;
 
+   for (i = 0; i  table-table_size; i++)
+   be64_to_cpus(table-l1_table[i]);
+
return 0;
 }
 
-- 
1.7.1
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] kvm tools: Fix virtio console input problem

2011-04-15 Thread Asias He
Signed-off-by: Asias He asias.he...@gmail.com
---
 tools/kvm/term.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/kvm/term.c b/tools/kvm/term.c
index 2245c8d..2bd038d 100644
--- a/tools/kvm/term.c
+++ b/tools/kvm/term.c
@@ -71,9 +71,9 @@ int term_getc_iov(int who, struct iovec *iov, int iovcnt)
return 0;
 
*((int *)iov[0].iov_base)   = c;
-   iov[0].iov_len  = sizeof(int);
+   iov[0].iov_len  = sizeof(char);
 
-   return sizeof(int);
+   return sizeof(char);
 }
 
 int term_putc_iov(int who, struct iovec *iov, int iovcnt)
-- 
1.7.4.1

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH 2/2 V7] qemu,qmp: add inject-nmi qmp command

2011-04-15 Thread Blue Swirl
On Thu, Apr 14, 2011 at 12:55 PM, Daniel P. Berrange
berra...@redhat.com wrote:
 On Wed, Apr 13, 2011 at 10:56:21PM +0300, Blue Swirl wrote:
 On Wed, Apr 13, 2011 at 4:08 PM, Luiz Capitulino lcapitul...@redhat.com 
 wrote:
  On Tue, 12 Apr 2011 21:31:18 +0300
  Blue Swirl blauwir...@gmail.com wrote:
 
  On Tue, Apr 12, 2011 at 10:52 AM, Avi Kivity a...@redhat.com wrote:
   On 04/11/2011 08:15 PM, Blue Swirl wrote:
  
   On Mon, Apr 11, 2011 at 10:01 AM, Markus Armbrusterarm...@redhat.com
    wrote:
 Avi Kivitya...@redhat.com  writes:
   
 On 04/08/2011 12:41 AM, Anthony Liguori wrote:
   
 And it's a good thing to have, but exposing this as the only API 
to
 do something as simple as generating a guest crash dump is not the
 friendliest thing in the world to do to users.
   
 nmi is a fine name for something that corresponds to a real-life 
nmi
 button (often labeled NMI).
   
 Agree.
  
   We could also introduce an alias mechanism for user friendly names, so
   nmi could be used in addition of full path. Aliases could be useful
   for device paths as well.
  
   Yes.  Perhaps limited to the human monitor.
 
  I'd limit all debugging commands (including NMI) to the human monitor.
 
  Why?

 Do they have any real use in production environment? Also, we should
 have the freedom to change the debugging facilities (for example, to
 improve some internal implementation) as we want without regard to
 compatibility to previous versions.

 We have users of libvirt requesting that we add an API for triggering
 a NMI. They want this for support in a production environment, to be
 able to initiate Windows crash dumps.  We really don't want to have to
 use HMP passthrough for this, instead of a proper QMP command.

OK, maybe I shouldn't be very proud of my imagination.

 More generally I don't want to see stuff in HMP, that isn't in the QMP.
 We already have far too much that we have to do via HMP passthrough in
 libvirt due to lack of QMP commands, to the extent that we might as
 well have just ignored QMP and continued with HMP for everything.

 If we want the flexibility to change the debugging commands between
 releases then we should come up with a plan to do this within the
 scope of QMP, not restrict them to HMP only.

If there is a need for the debugging facilities, full QMP support and
some level of compatibility is fine.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] kvm tool: QCOW version 1 write support.

2011-04-15 Thread Prasad Joshi
The code is based on the following QCOW 1 image format specification:

http://people.gnome.org/~markmc/qcow-image-format-version-1.html

Signed-off-by: Prasad Joshi prasadjoshi...@gmail.com
---
 tools/kvm/qcow.c |  148 +-
 1 files changed, 147 insertions(+), 1 deletions(-)

diff --git a/tools/kvm/qcow.c b/tools/kvm/qcow.c
index e6d5897..ff73c55 100644
--- a/tools/kvm/qcow.c
+++ b/tools/kvm/qcow.c
@@ -134,8 +134,154 @@ error:
return -1;
 }
 
-static int qcow1_write_sector(struct disk_image *self, uint64_t sector, void 
*src, uint32_t src_len)
+static inline u64 get_file_length(int fd)
 {
+   struct stat st;
+
+   if (fstat(fd, st)  0)
+   return 0;
+   return st.st_size;
+}
+
+static inline u64 align(u64 address, uint32_t size)
+{
+   return (address + size)  (~(size - 1));
+}
+
+static int qcow_pwrite_with_sync(int fd, void *buf, size_t count, off_t offset)
+{
+   if (pwrite_in_full(fd, buf, count, offset)  0)
+   return -1;
+
+   if (fsync(fd)  0)
+   return -1;
+
+   return 0;
+}
+
+static uint32_t qcow1_write_cluster(struct qcow *q, uint64_t offset, void *buf,
+   uint32_t src_len)
+{
+   struct qcow1_header *header = q-header;
+   struct qcow_table *table= q-table;
+   uint32_t length;
+
+   uint32_t l2_table_size;
+   u64 l2_table_offset;
+   u64 *l2_table;
+   u64 l2_idx;
+
+   uint32_t clust_size;
+   u64 clust_offset;
+   u64 clust_start;
+
+   u64 l1_idx;
+   u64 tmp;
+
+   l2_table_size = 1  header-l2_bits;
+   clust_size= 1  header-cluster_bits;
+
+   l1_idx = get_l1_index(q, offset);
+   if (l1_idx = table-table_size)
+   goto error;
+
+   l2_idx = get_l2_index(q, offset);
+   if (l2_idx = l2_table_size)
+   goto error;
+
+   clust_offset = get_cluster_offset(q, offset);
+   if (clust_offset = clust_size)
+   goto error;
+
+   length = clust_size - clust_offset;
+   if (length  src_len)
+   length = src_len;
+
+   l2_table = calloc(l2_table_size, sizeof(u64));
+   if (!l2_table)
+   goto error;
+
+   l2_table_offset = table-l1_table[l1_idx];
+   if (l2_table_offset) {
+   if (pread_in_full(q-fd, l2_table, l2_table_size * sizeof(u64),
+   l2_table_offset)  0)
+   goto free_l2;
+   } else {
+   /*
+* 1. write a level 2 table of zeros at the end of the file
+* 2. update the level1 table
+*/
+   l2_table_offset = align(get_file_length(q-fd), clust_size);
+   table-l1_table[l1_idx] = l2_table_offset;
+
+   if (qcow_pwrite_with_sync(q-fd, l2_table,
+   l2_table_size * sizeof(u64),
+   l2_table_offset)  0)
+   goto free_l2;
+
+   tmp = cpu_to_be64(l2_table_offset);
+   if (qcow_pwrite_with_sync(q-fd, tmp, sizeof(tmp),
+   header-l1_table_offset +
+   (l1_idx * sizeof(u64)))  0)
+   goto free_l2;
+   }
+
+   clust_start = be64_to_cpu(l2_table[l2_idx]);
+   free(l2_table);
+   if (clust_start) {
+   if (qcow_pwrite_with_sync(q-fd, buf, length,
+   clust_start + clust_offset)  0)
+   goto error;
+   } else {
+   /*
+* Follow the write order to avoid metadata loss
+* 1. write the data at the end of the file
+* 2. update the l2_table with new
+*/
+   clust_start = align(get_file_length(q-fd), clust_size);
+   if (qcow_pwrite_with_sync(q-fd, buf, length,
+   clust_start + clust_offset)  0)
+   goto error;
+
+   tmp = cpu_to_be64(clust_start);
+   if (qcow_pwrite_with_sync(q-fd, tmp, sizeof(tmp),
+   l2_table_offset +
+   (l2_idx * sizeof(u64)))  0)
+   goto error;
+   }
+   return length;
+free_l2:
+   free(l2_table);
+error:
+   return 0;
+}
+
+static int qcow1_write_sector(struct disk_image *self, uint64_t sector,
+   void *src, uint32_t src_len)
+{
+   struct qcow *q = self-priv;
+   struct qcow1_header *header = q-header;
+   uint32_t length = 0;
+   char *buf = src;
+   uint64_t offset;
+   uint32_t nr;
+
+   while (length  src_len) {
+   offset = sector  SECTOR_SHIFT;
+   if (offset = header-size)
+   goto error;
+
+   nr = qcow1_write_cluster(q, offset, buf, src_len 

Re: [PATCH] kvm tools: Fix virtio console input problem

2011-04-15 Thread Ingo Molnar

* Asias He asias.he...@gmail.com wrote:

 Signed-off-by: Asias He asias.he...@gmail.com
 ---
  tools/kvm/term.c |4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)

Mind explaining in the changelog what input problem this is about?

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] kvm tool: Remove the __stringify* defination from the util.h

2011-04-15 Thread Prasad Joshi
Include the Linux kernel header file linux/stringify.h file instead of
redefining the __stringify* macros

Signed-off-by: Prasad Joshi prasadjoshi...@gmail.com
---
 tools/kvm/include/kvm/util.h |5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/tools/kvm/include/kvm/util.h b/tools/kvm/include/kvm/util.h
index ae033cc..c308f3f 100644
--- a/tools/kvm/include/kvm/util.h
+++ b/tools/kvm/include/kvm/util.h
@@ -1,3 +1,5 @@
+#include ../../../../include/linux/stringify.h
+
 #ifndef KVM__UTIL_H
 #define KVM__UTIL_H
 
@@ -27,9 +29,6 @@
 #endif
 #endif
 
-#define __stringify_1(x)   #x
-#define __stringify(x) __stringify_1(x)
-
 extern void die(const char *err, ...) NORETURN __attribute__((format (printf, 
1, 2)));
 extern void die_perror(const char *s) NORETURN;
 extern int error(const char *err, ...) __attribute__((format (printf, 1, 2)));
-- 
1.7.1

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] Store and load PCI device saved state across function resets

2011-04-15 Thread Alex Williamson
Bug https://bugs.launchpad.net/qemu/+bug/754591 is caused because
the KVM module attempts to do a pci_save_state() before assigning
the device to a VM, expecting that the saved state will remain
valid until we release the device.  This is in conflict with our
need to reset devices using PCI sysfs during a VM reset to
quiesce the device.  Any calls to pci_reset_function() will
overwrite the device saved stated prior to reset, and reload and
invalidate the state after.  KVM then ends up trying to restore
the state, but it's already invalid, so the device ends up with
reset values.

This series adds a mechanism to pull the saved state off into
an opaue buffer, which can be reloaded into the device at a later
point.  Thanks,

Alex

---

Alex Williamson (2):
  KVM: Use pci_store/load_saved_state() around VM device usage
  PCI: Add interfaces to store and load the device saved state


 drivers/pci/pci.c|   94 ++
 include/linux/kvm_host.h |1 
 include/linux/pci.h  |3 +
 virt/kvm/assigned-dev.c  |8 ++--
 4 files changed, 103 insertions(+), 3 deletions(-)
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] PCI: Add interfaces to store and load the device saved state

2011-04-15 Thread Alex Williamson
For KVM device assignment, we'd like to save off the state of a device
prior to passing it to the guest and restore it later.  We also want
to allow pci_reset_funciton() to be called while the device is owned
by the guest.  This however overwrites and invalidates the struct pci_dev
buffers, so we can't just manually call save and restore.  Add generic
interfaces for the saved state to be stored into a buffer and reloaded
back into struct pci_dev at a later time.

Signed-off-by: Alex Williamson alex.william...@redhat.com
---

 drivers/pci/pci.c   |   94 +++
 include/linux/pci.h |3 ++
 2 files changed, 97 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 2472e71..2b00354 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -975,6 +975,100 @@ void pci_restore_state(struct pci_dev *dev)
dev-state_saved = false;
 }
 
+struct pci_state {
+   u32 config_space[16];
+   u16 pcie_state[PCI_EXP_SAVE_REGS];
+   u16 pcix_state[1];
+};
+
+/**
+ * pci_store_saved_state - Store the device saved state into a buffer
+ * @dev: - PCI device that we're dealing with
+ *
+ * Returns an opaque buffer containing the device saved state.
+ * NULL if no state or error.
+ */
+void *pci_store_saved_state(struct pci_dev *dev)
+{
+   struct pci_state *state;
+   struct pci_cap_saved_state *cap_state;
+   int pos;
+
+   if (!dev-state_saved)
+   return NULL;
+
+   state = kzalloc(sizeof(*state), GFP_KERNEL);
+   if (!state)
+   return NULL;
+
+   memcpy(state-config_space, dev-saved_config_space,
+  sizeof(state-config_space));
+
+   pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+   cap_state = pci_find_saved_cap(dev, PCI_CAP_ID_EXP);
+   if (cap_state  pos)
+   memcpy(state-pcie_state, cap_state-data,
+  sizeof(state-pcie_state));
+
+   pos = pci_find_capability(dev, PCI_CAP_ID_PCIX);
+   cap_state = pci_find_saved_cap(dev, PCI_CAP_ID_PCIX);
+   if (cap_state  pos)
+   memcpy(state-pcix_state, cap_state-data,
+  sizeof(state-pcix_state));
+
+   return state;
+}
+EXPORT_SYMBOL_GPL(pci_store_saved_state);
+
+/**
+ * pci_load_saved_state - Load the device saved state from buffer
+ * @dev: - PCI device that we're dealing with
+ * @buf: - Saved state returned from pci_store_saved_state()
+ */
+void pci_load_saved_state(struct pci_dev *dev, void *buf)
+{
+   struct pci_state *state = buf;
+   struct pci_cap_saved_state *cap_state;
+   int pos;
+
+   if (!state) {
+   dev-state_saved = false;
+   return;
+   }
+
+   memcpy(dev-saved_config_space, state-config_space,
+  sizeof(state-config_space));
+
+   pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+   cap_state = pci_find_saved_cap(dev, PCI_CAP_ID_EXP);
+   if (cap_state  pos)
+   memcpy(cap_state-data, state-pcie_state,
+  sizeof(state-pcie_state));
+
+   pos = pci_find_capability(dev, PCI_CAP_ID_PCIX);
+   cap_state = pci_find_saved_cap(dev, PCI_CAP_ID_PCIX);
+   if (cap_state  pos)
+   memcpy(cap_state-data, state-pcix_state,
+  sizeof(state-pcix_state));
+
+   dev-state_saved = true;
+}
+EXPORT_SYMBOL_GPL(pci_load_saved_state);
+
+/**
+ * pci_load_and_free_saved_state - Load the device saved state from buffer
+ *and free the buffer
+ * @dev: - PCI device that we're dealing with
+ * @buf: - Pointer to saved state returned from pci_store_saved_state()
+ */
+void pci_load_and_free_saved_state(struct pci_dev *dev, void **buf)
+{
+   pci_load_saved_state(dev, *buf);
+   kfree(*buf);
+   *buf = NULL;
+}
+EXPORT_SYMBOL_GPL(pci_load_and_free_saved_state);
+
 static int do_pci_enable_device(struct pci_dev *dev, int bars)
 {
int err;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 96f70d7..67ce42a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -807,6 +807,9 @@ size_t pci_get_rom_size(struct pci_dev *pdev, void __iomem 
*rom, size_t size);
 /* Power management related routines */
 int pci_save_state(struct pci_dev *dev);
 void pci_restore_state(struct pci_dev *dev);
+void *pci_store_saved_state(struct pci_dev *dev);
+void pci_load_saved_state(struct pci_dev *dev, void *buf);
+void pci_load_and_free_saved_state(struct pci_dev *dev, void **buf);
 int __pci_complete_power_transition(struct pci_dev *dev, pci_power_t state);
 int pci_set_power_state(struct pci_dev *dev, pci_power_t state);
 pci_power_t pci_choose_state(struct pci_dev *dev, pm_message_t state);

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] KVM: Use pci_store/load_saved_state() around VM device usage

2011-04-15 Thread Alex Williamson
Store the device saved state so that we can reload the device back
to the original state when it's unassigned.  This has the benefit
that the state survives across pci_reset_function() calls via
the PCI sysfs reset interface while the VM is using the device.

Signed-off-by: Alex Williamson alex.william...@redhat.com
---

 include/linux/kvm_host.h |1 +
 virt/kvm/assigned-dev.c  |8 +---
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ab42855..d8a1d18 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -513,6 +513,7 @@ struct kvm_assigned_dev_kernel {
struct kvm *kvm;
spinlock_t intx_lock;
char irq_name[32];
+   void *pci_saved_state;
 };
 
 struct kvm_irq_mask_notifier {
diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
index ae72ae6..66c6ccd 100644
--- a/virt/kvm/assigned-dev.c
+++ b/virt/kvm/assigned-dev.c
@@ -197,7 +197,9 @@ static void kvm_free_assigned_device(struct kvm *kvm,
 {
kvm_free_assigned_irq(kvm, assigned_dev);
 
-   __pci_reset_function(assigned_dev-dev);
+   pci_reset_function(assigned_dev-dev);
+   pci_load_and_free_saved_state(assigned_dev-dev,
+ assigned_dev-pci_saved_state);
pci_restore_state(assigned_dev-dev);
 
pci_release_regions(assigned_dev-dev);
@@ -516,7 +518,7 @@ static int kvm_vm_ioctl_assign_device(struct kvm *kvm,
 
pci_reset_function(dev);
pci_save_state(dev);
-
+   match-pci_saved_state = pci_store_saved_state(dev);
match-assigned_dev_id = assigned_dev-assigned_dev_id;
match-host_segnr = assigned_dev-segnr;
match-host_busnr = assigned_dev-busnr;
@@ -546,7 +548,7 @@ out:
mutex_unlock(kvm-lock);
return r;
 out_list_del:
-   pci_restore_state(dev);
+   pci_load_and_free_saved_state(dev, match-pci_saved_state);
list_del(match-list);
pci_release_regions(dev);
 out_disable:

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] KVM: Use pci_store/load_saved_state() around VM device usage

2011-04-15 Thread Jan Kiszka
On 2011-04-15 21:54, Alex Williamson wrote:
 Store the device saved state so that we can reload the device back
 to the original state when it's unassigned.  This has the benefit
 that the state survives across pci_reset_function() calls via
 the PCI sysfs reset interface while the VM is using the device.
 
 Signed-off-by: Alex Williamson alex.william...@redhat.com
 ---
 
  include/linux/kvm_host.h |1 +
  virt/kvm/assigned-dev.c  |8 +---
  2 files changed, 6 insertions(+), 3 deletions(-)
 
 diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
 index ab42855..d8a1d18 100644
 --- a/include/linux/kvm_host.h
 +++ b/include/linux/kvm_host.h
 @@ -513,6 +513,7 @@ struct kvm_assigned_dev_kernel {
   struct kvm *kvm;
   spinlock_t intx_lock;
   char irq_name[32];
 + void *pci_saved_state;
  };
  
  struct kvm_irq_mask_notifier {
 diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
 index ae72ae6..66c6ccd 100644
 --- a/virt/kvm/assigned-dev.c
 +++ b/virt/kvm/assigned-dev.c
 @@ -197,7 +197,9 @@ static void kvm_free_assigned_device(struct kvm *kvm,
  {
   kvm_free_assigned_irq(kvm, assigned_dev);
  
 - __pci_reset_function(assigned_dev-dev);
 + pci_reset_function(assigned_dev-dev);
 + pci_load_and_free_saved_state(assigned_dev-dev,
 +   assigned_dev-pci_saved_state);
   pci_restore_state(assigned_dev-dev);
  
   pci_release_regions(assigned_dev-dev);
 @@ -516,7 +518,7 @@ static int kvm_vm_ioctl_assign_device(struct kvm *kvm,
  
   pci_reset_function(dev);
   pci_save_state(dev);
 -
 + match-pci_saved_state = pci_store_saved_state(dev);
   match-assigned_dev_id = assigned_dev-assigned_dev_id;
   match-host_segnr = assigned_dev-segnr;
   match-host_busnr = assigned_dev-busnr;
 @@ -546,7 +548,7 @@ out:
   mutex_unlock(kvm-lock);
   return r;
  out_list_del:
 - pci_restore_state(dev);
 + pci_load_and_free_saved_state(dev, match-pci_saved_state);

Don't you need to keep the balance, ie. load_and_free, then restore?

   list_del(match-list);
   pci_release_regions(dev);
  out_disable:
 

Thanks for addressing the issue!

Jan



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 2/2] KVM: Use pci_store/load_saved_state() around VM device usage

2011-04-15 Thread Alex Williamson
On Fri, 2011-04-15 at 22:03 +0200, Jan Kiszka wrote:
 On 2011-04-15 21:54, Alex Williamson wrote:
  Store the device saved state so that we can reload the device back
  to the original state when it's unassigned.  This has the benefit
  that the state survives across pci_reset_function() calls via
  the PCI sysfs reset interface while the VM is using the device.
  
  Signed-off-by: Alex Williamson alex.william...@redhat.com
  ---
  
   include/linux/kvm_host.h |1 +
   virt/kvm/assigned-dev.c  |8 +---
   2 files changed, 6 insertions(+), 3 deletions(-)
  
  diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
  index ab42855..d8a1d18 100644
  --- a/include/linux/kvm_host.h
  +++ b/include/linux/kvm_host.h
  @@ -513,6 +513,7 @@ struct kvm_assigned_dev_kernel {
  struct kvm *kvm;
  spinlock_t intx_lock;
  char irq_name[32];
  +   void *pci_saved_state;
   };
   
   struct kvm_irq_mask_notifier {
  diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
  index ae72ae6..66c6ccd 100644
  --- a/virt/kvm/assigned-dev.c
  +++ b/virt/kvm/assigned-dev.c
  @@ -197,7 +197,9 @@ static void kvm_free_assigned_device(struct kvm *kvm,
   {
  kvm_free_assigned_irq(kvm, assigned_dev);
   
  -   __pci_reset_function(assigned_dev-dev);
  +   pci_reset_function(assigned_dev-dev);
  +   pci_load_and_free_saved_state(assigned_dev-dev,
  + assigned_dev-pci_saved_state);
  pci_restore_state(assigned_dev-dev);
   
  pci_release_regions(assigned_dev-dev);
  @@ -516,7 +518,7 @@ static int kvm_vm_ioctl_assign_device(struct kvm *kvm,
   
  pci_reset_function(dev);
  pci_save_state(dev);
  -
  +   match-pci_saved_state = pci_store_saved_state(dev);
  match-assigned_dev_id = assigned_dev-assigned_dev_id;
  match-host_segnr = assigned_dev-segnr;
  match-host_busnr = assigned_dev-busnr;
  @@ -546,7 +548,7 @@ out:
  mutex_unlock(kvm-lock);
  return r;
   out_list_del:
  -   pci_restore_state(dev);
  +   pci_load_and_free_saved_state(dev, match-pci_saved_state);
 
 Don't you need to keep the balance, ie. load_and_free, then restore?

I don't see that pci_save_state() does anything more than buffer the
hardware device state into save areas in struct pci_dev.  So by not
doing a restore, we are leaving that valid, but I don't really see how
that hurts anything.  The only reason we even really need to call
pci_load_and_free_saved_state() here is to free the buffer  Thanks,

Alex

  list_del(match-list);
  pci_release_regions(dev);
   out_disable:
  
 
 Thanks for addressing the issue!
 
 Jan
 



--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] KVM: Use pci_store/load_saved_state() around VM device usage

2011-04-15 Thread Jan Kiszka
On 2011-04-15 22:13, Alex Williamson wrote:
 On Fri, 2011-04-15 at 22:03 +0200, Jan Kiszka wrote:
 On 2011-04-15 21:54, Alex Williamson wrote:
 Store the device saved state so that we can reload the device back
 to the original state when it's unassigned.  This has the benefit
 that the state survives across pci_reset_function() calls via
 the PCI sysfs reset interface while the VM is using the device.

 Signed-off-by: Alex Williamson alex.william...@redhat.com
 ---

  include/linux/kvm_host.h |1 +
  virt/kvm/assigned-dev.c  |8 +---
  2 files changed, 6 insertions(+), 3 deletions(-)

 diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
 index ab42855..d8a1d18 100644
 --- a/include/linux/kvm_host.h
 +++ b/include/linux/kvm_host.h
 @@ -513,6 +513,7 @@ struct kvm_assigned_dev_kernel {
 struct kvm *kvm;
 spinlock_t intx_lock;
 char irq_name[32];
 +   void *pci_saved_state;
  };
  
  struct kvm_irq_mask_notifier {
 diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
 index ae72ae6..66c6ccd 100644
 --- a/virt/kvm/assigned-dev.c
 +++ b/virt/kvm/assigned-dev.c
 @@ -197,7 +197,9 @@ static void kvm_free_assigned_device(struct kvm *kvm,
  {
 kvm_free_assigned_irq(kvm, assigned_dev);
  
 -   __pci_reset_function(assigned_dev-dev);
 +   pci_reset_function(assigned_dev-dev);
 +   pci_load_and_free_saved_state(assigned_dev-dev,
 + assigned_dev-pci_saved_state);
 pci_restore_state(assigned_dev-dev);
  
 pci_release_regions(assigned_dev-dev);
 @@ -516,7 +518,7 @@ static int kvm_vm_ioctl_assign_device(struct kvm *kvm,
  
 pci_reset_function(dev);
 pci_save_state(dev);
 -
 +   match-pci_saved_state = pci_store_saved_state(dev);
 match-assigned_dev_id = assigned_dev-assigned_dev_id;
 match-host_segnr = assigned_dev-segnr;
 match-host_busnr = assigned_dev-busnr;
 @@ -546,7 +548,7 @@ out:
 mutex_unlock(kvm-lock);
 return r;
  out_list_del:
 -   pci_restore_state(dev);
 +   pci_load_and_free_saved_state(dev, match-pci_saved_state);

 Don't you need to keep the balance, ie. load_and_free, then restore?
 
 I don't see that pci_save_state() does anything more than buffer the
 hardware device state into save areas in struct pci_dev.  So by not
 doing a restore, we are leaving that valid, but I don't really see how
 that hurts anything. 

Right, I'm just unsure if we should encode this knowledge about how
pci_save/restore_state works internally into KVM.

Jan



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] kvm tools: Fix virtio console input problem

2011-04-15 Thread Asias He
On 04/16/2011 02:23 AM, Ingo Molnar wrote:
 
 * Asias He asias.he...@gmail.com wrote:
 
 Signed-off-by: Asias He asias.he...@gmail.com
 ---
  tools/kvm/term.c |4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)
 
 Mind explaining in the changelog what input problem this is about?

Sure. See attached patch.

-- 
Best Regards,
Asias He
From 6a123c8001c3c36ae5cc08d2980588491a6ef1ef Mon Sep 17 00:00:00 2001
From: Asias He asias.he...@gmail.com
Date: Fri, 15 Apr 2011 22:55:04 +0800
Subject: [PATCH] kvm tools: Fix virtio console input problem

term_getc only get one char at a time, so term_getc_iov should
send one char back to guest.

Otherwise, you will get four input chars when you only type one like bewlow:

sid login: r^@^@^@o^@^@^@o^@^@^@t^@^@^@

Signed-off-by: Asias He asias.he...@gmail.com
---
 tools/kvm/term.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/tools/kvm/term.c b/tools/kvm/term.c
index 2245c8d..689d52d 100644
--- a/tools/kvm/term.c
+++ b/tools/kvm/term.c
@@ -71,9 +71,8 @@ int term_getc_iov(int who, struct iovec *iov, int iovcnt)
 		return 0;
 
 	*((int *)iov[0].iov_base)	= c;
-	iov[0].iov_len			= sizeof(int);
 
-	return sizeof(int);
+	return sizeof(char);
 }
 
 int term_putc_iov(int who, struct iovec *iov, int iovcnt)
-- 
1.7.4.4