Re: [PATCH v1 1/1] Submit the codes for QEMU disk I/O limits.

2011-07-25 Thread Zhi Yong Wu
On Mon, Jul 25, 2011 at 1:40 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 On Mon, Jul 25, 2011 at 5:25 AM, Zhi Yong Wu zwu.ker...@gmail.com wrote:
 On Fri, Jul 22, 2011 at 6:54 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 On Fri, Jul 22, 2011 at 10:20 AM, Zhi Yong Wu wu...@linux.vnet.ibm.com 
 wrote:
 +static void bdrv_block_timer(void *opaque)
 +{
 +    BlockDriverState *bs = opaque;
 +    BlockQueue *queue = bs-block_queue;
 +    uint64_t intval = 1;
 +
 +    while (!QTAILQ_EMPTY(queue-requests)) {
 +        BlockIORequest *request;
 +        int ret;
 +
 +        request = QTAILQ_FIRST(queue-requests);
 +        QTAILQ_REMOVE(queue-requests, request, entry);
 +
 +        ret = queue-handler(request);

 Please remove the function pointer and call qemu_block_queue_handler()
 directly.  This indirection is not needed and makes it harder to
 follow the code.
 Should it keep the same style with other queue implemetations such as
 network queue? As you have known, network queue has one queue handler.

 You mean net/queue.c:queue-deliver?  There are two deliver functions,
yeah
 qemu_deliver_packet() and qemu_vlan_deliver_packet(), which is why a
 function pointer is necessary.
OK, The handler has been removed and invoked directly.

 In this case there is only one handler function so the indirection is
 not necessary.


 +        if (ret == 0) {
 +            QTAILQ_INSERT_HEAD(queue-requests, request, entry);
 +            break;
 +        }
 +
 +        qemu_free(request);
 +    }
 +
 +    qemu_mod_timer(bs-block_timer, (intval * 10) + 
 qemu_get_clock_ns(vm_clock));

 intval is always 1.  The timer should be set to the next earliest deadline.
 pls see bdrv_aio_readv/writev:
 +    /* throttling disk read I/O */
 +    if (bs-io_limits != NULL) {
 +       if (bdrv_exceed_io_limits(bs, nb_sectors, false, wait_time)) {
 +            ret = qemu_block_queue_enqueue(bs-block_queue, bs, 
 bdrv_aio_readv,
 +                               sector_num, qiov, nb_sectors, cb, opaque);
 +           qemu_mod_timer(bs-block_timer, wait_time +
 qemu_get_clock_ns(vm_clock));

 The timer has been modified when the blk request is enqueued.

 The current algorithm seems to be:
 1. Queue requests that exceed the limit and set the timer.
 2. Dequeue all requests when the timer fires.
Yeah, but for each dequeued requests, it will be still determined if
current I/O rate exceed that limits, if yes, it will still be enqueued
into block queue again.
 3. Set 1s periodic timer.

 Why is the timer set up as a periodic 1 second timer in bdrv_block_timer()?
I was aslo considering if we need to set up this type of timer before.
Since the timer has been modified after qemu_block_queue_enqueue()
function, this timer should be not modified here. I will remove this
line of line. thanks.


 +        bs-slice_start[is_write] = real_time;
 +
 +        bs-io_disps-bytes[is_write] = 0;
 +        if (bs-io_limits-bps[2]) {
 +            bs-io_disps-bytes[!is_write] = 0;
 +        }

 All previous data should be discarded when a new time slice starts:
 bs-io_disps-bytes[IO_LIMIT_READ] = 0;
 bs-io_disps-bytes[IO_LIMIT_WRITE] = 0;
 If only bps_rd is specified, bs-io_disps-bytes[IO_LIMIT_WRITE] will
 never be used; i think that it should not be cleared here. right?

 I think there is no advantage in keeping slices separate for
 read/write.  The code would be simpler and work the same if there is
 only one slice and past history is cleared when a new slice starts.
Done

 Stefan




-- 
Regards,

Zhi Yong Wu
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1 1/1] Submit the codes for QEMU disk I/O limits.

2011-07-25 Thread Zhi Yong Wu
On Fri, Jul 22, 2011 at 6:54 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 On Fri, Jul 22, 2011 at 10:20 AM, Zhi Yong Wu wu...@linux.vnet.ibm.com 
 wrote:
 +static void bdrv_block_timer(void *opaque)
 +{
 +    BlockDriverState *bs = opaque;
 +    BlockQueue *queue = bs-block_queue;
 +    uint64_t intval = 1;
 +
 +    while (!QTAILQ_EMPTY(queue-requests)) {
 +        BlockIORequest *request;
 +        int ret;
 +
 +        request = QTAILQ_FIRST(queue-requests);
 +        QTAILQ_REMOVE(queue-requests, request, entry);
 +
 +        ret = queue-handler(request);

 Please remove the function pointer and call qemu_block_queue_handler()
 directly.  This indirection is not needed and makes it harder to
 follow the code.

 +        if (ret == 0) {
 +            QTAILQ_INSERT_HEAD(queue-requests, request, entry);
 +            break;
 +        }
 +
 +        qemu_free(request);
 +    }
 +
 +    qemu_mod_timer(bs-block_timer, (intval * 10) + 
 qemu_get_clock_ns(vm_clock));

 intval is always 1.  The timer should be set to the next earliest deadline.

 +}
 +
  void bdrv_register(BlockDriver *bdrv)
  {
     if (!bdrv-bdrv_aio_readv) {
 @@ -476,6 +508,11 @@ static int bdrv_open_common(BlockDriverState *bs, const 
 char *filename,
         goto free_and_fail;
     }

 +    /* throttling disk I/O limits */
 +    if (bs-io_limits != NULL) {
 +       bs-block_queue = qemu_new_block_queue(qemu_block_queue_handler);
 +    }
 +
  #ifndef _WIN32
     if (bs-is_temporary) {
         unlink(filename);
 @@ -642,6 +679,16 @@ int bdrv_open(BlockDriverState *bs, const char 
 *filename, int flags,
             bs-change_cb(bs-change_opaque, CHANGE_MEDIA);
     }

 +    /* throttling disk I/O limits */
 +    if (bs-io_limits != NULL) {

 block_queue is allocated in bdrv_open_common() but these variables are
 initialized in bdrv_open().  Can you move them together, I don't see a
 reason to keep them apart?

 +       bs-io_disps    = qemu_mallocz(sizeof(BlockIODisp));
 +       bs-block_timer = qemu_new_timer_ns(vm_clock, bdrv_block_timer, bs);
 +       qemu_mod_timer(bs-block_timer, qemu_get_clock_ns(vm_clock));

 Why is the timer being scheduled immediately?  There are no queued requests.

 +
 +       bs-slice_start[0] = qemu_get_clock_ns(vm_clock);
 +       bs-slice_start[1] = qemu_get_clock_ns(vm_clock);
 +    }
 +
     return 0;

  unlink_and_fail:
 @@ -680,6 +727,15 @@ void bdrv_close(BlockDriverState *bs)
         if (bs-change_cb)
             bs-change_cb(bs-change_opaque, CHANGE_MEDIA);
     }
 +
 +    /* throttling disk I/O limits */
 +    if (bs-block_queue) {
 +       qemu_del_block_queue(bs-block_queue);

 3 space indent, should be 4 spaces.

 +    }
 +
 +    if (bs-block_timer) {

 qemu_free_timer() will only free() the timer memory but it will not
 cancel the timer.  Use qemu_del_timer() first to ensure that the timer
 is not pending:

 qemu_del_timer(bs-block_timer);

 +        qemu_free_timer(bs-block_timer);
 +    }
  }

  void bdrv_close_all(void)
 @@ -1312,6 +1368,13 @@ void bdrv_get_geometry_hint(BlockDriverState *bs,
     *psecs = bs-secs;
  }

 +/* throttling disk io limits */
 +void bdrv_set_io_limits(BlockDriverState *bs,
 +                           BlockIOLimit *io_limits)
 +{
 +    bs-io_limits            = io_limits;

 This function takes ownership of io_limits but never frees it.  I
 suggest not taking ownership and just copying io_limits into
 bs-io_limits so that the caller does not need to malloc() and the
 lifecycle of bs-io_limits is completely under our control.

 Easiest would be to turn BlockDriverState.io_limits into:

 BlockIOLimit io_limits;

 and just copy in bdrv_set_io_limits():

 bs-io_limits = *io_limits;

 bdrv_exceed_io_limits() returns quite quickly if no limits are set, so
 I wouldn't worry about optimizing it out yet.

 +}
 +
  /* Recognize floppy formats */
  typedef struct FDFormat {
     FDriveType drive;
 @@ -2111,6 +2174,154 @@ char *bdrv_snapshot_dump(char *buf, int buf_size, 
 QEMUSnapshotInfo *sn)
     return buf;
  }

 +bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors,
 +                           bool is_write, uint64_t *wait) {
 +    int64_t  real_time;
 +    uint64_t bps_limit   = 0;
 +    double   bytes_limit, bytes_disp, bytes_res, elapsed_time;
 +    double   slice_time = 0.1, wait_time;
 +
 +    if (bs-io_limits-bps[2]) {
 +        bps_limit = bs-io_limits-bps[2];

 Please define a constant (IO_LIMIT_TOTAL?) instead of using magic number 2.

 +    } else if (bs-io_limits-bps[is_write]) {
 +        bps_limit = bs-io_limits-bps[is_write];
 +    } else {
 +        if (wait) {
 +            *wait = 0;
 +        }
 +
 +        return false;
 +    }
 +
 +    real_time = qemu_get_clock_ns(vm_clock);
 +    if (bs-slice_start[is_write] + 1 = real_time) {

 Please define a constant for the 100 ms time slice instead of using
 1 directly.

 +        bs-slice_start[is_write] = real_time;
 +
 +        bs-io_disps-bytes[is_write] = 0;
 +        

[PATCH] qemu-kvm: Remove qemu_system_is_ready side-channel

2011-07-25 Thread Jan Kiszka
From: Jan Kiszka jan.kis...@siemens.com

kvm_add/remove_ioport_region need to know if the change is done during
init or while the system is running. We added qemu_system_is_ready for
this purpose which exported qemu_system_ready. The latter will be gone
soon, but we can also obtain the information hot-plug or not from the
callers of those services. They can retrieve it from the associated qdev
device state.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 cpus.c |5 -
 cpus.h |1 -
 hw/device-assignment.c |6 --
 qemu-kvm.c |   12 +++-
 qemu-kvm.h |6 --
 5 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/cpus.c b/cpus.c
index 007345a..3035314 100644
--- a/cpus.c
+++ b/cpus.c
@@ -687,11 +687,6 @@ void qemu_main_loop_start(void)
 qemu_cond_broadcast(qemu_system_cond);
 }
 
-bool qemu_system_is_ready(void)
-{
-return qemu_system_ready;
-}
-
 void run_on_cpu(CPUState *env, void (*func)(void *data), void *data)
 {
 struct qemu_work_item wi;
diff --git a/cpus.h b/cpus.h
index f6d37af..f42b54e 100644
--- a/cpus.h
+++ b/cpus.h
@@ -7,7 +7,6 @@ void qemu_main_loop_start(void);
 void resume_all_vcpus(void);
 void pause_all_vcpus(void);
 void cpu_stop_current(void);
-bool qemu_system_is_ready(void);
 
 void cpu_synchronize_all_states(void);
 void cpu_synchronize_all_post_reset(void);
diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index 5c24c78..4a3e9b5 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -277,7 +277,8 @@ static void assigned_dev_ioport_map(PCIDevice *pci_dev, int 
region_num,
   addr, region-u.r_baseport, type, size, region_num);
 
 if (first_map  region-region-resource_fd  0) {
-r = kvm_add_ioport_region(region-u.r_baseport, region-r_size);
+r = kvm_add_ioport_region(region-u.r_baseport, region-r_size,
+  pci_dev-qdev.hotplugged);
 if (r  0) {
 fprintf(stderr, %s: failed to enable ioport access (%m)\n,
 __func__);
@@ -676,7 +677,8 @@ static void free_assigned_device(AssignedDevice *dev)
 }
 if (pci_region-type  IORESOURCE_IO) {
 if (pci_region-resource_fd  0) {
-kvm_remove_ioport_region(region-u.r_baseport, region-r_size);
+kvm_remove_ioport_region(region-u.r_baseport, region-r_size,
+ dev-dev.qdev.hotplugged);
 }
 } else if (pci_region-type  IORESOURCE_MEM) {
 if (region-u.r_virtbase) {
diff --git a/qemu-kvm.c b/qemu-kvm.c
index 80cc077..4a10616 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -589,7 +589,8 @@ static void do_set_ioport_access(void *data)
 }
 }
 
-int kvm_add_ioport_region(unsigned long start, unsigned long size)
+int kvm_add_ioport_region(unsigned long start, unsigned long size,
+  bool is_hot_plug)
 {
 KVMIOPortRegion *region = qemu_mallocz(sizeof(KVMIOPortRegion));
 CPUState *env;
@@ -600,12 +601,12 @@ int kvm_add_ioport_region(unsigned long start, unsigned 
long size)
 region-status = 1;
 QLIST_INSERT_HEAD(ioport_regions, region, entry);
 
-if (qemu_system_is_ready()) {
+if (is_hot_plug) {
 for (env = first_cpu; env != NULL; env = env-next_cpu) {
 run_on_cpu(env, do_set_ioport_access, region);
 if (region-status  0) {
 r = region-status;
-kvm_remove_ioport_region(start, size);
+kvm_remove_ioport_region(start, size, is_hot_plug);
 break;
 }
 }
@@ -613,7 +614,8 @@ int kvm_add_ioport_region(unsigned long start, unsigned 
long size)
 return r;
 }
 
-int kvm_remove_ioport_region(unsigned long start, unsigned long size)
+int kvm_remove_ioport_region(unsigned long start, unsigned long size,
+ bool is_hot_unplug)
 {
 KVMIOPortRegion *region, *tmp;
 CPUState *env;
@@ -623,7 +625,7 @@ int kvm_remove_ioport_region(unsigned long start, unsigned 
long size)
 if (region-start == start  region-size == size) {
 region-status = 0;
 }
-if (qemu_system_is_ready()) {
+if (is_hot_unplug) {
 for (env = first_cpu; env != NULL; env = env-next_cpu) {
 run_on_cpu(env, do_set_ioport_access, region);
 }
diff --git a/qemu-kvm.h b/qemu-kvm.h
index 845880e..90bcca9 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -252,8 +252,10 @@ void kvm_tpr_access_report(CPUState *env, uint64_t rip, 
int is_write);
 
 int kvm_arch_init_irq_routing(void);
 
-int kvm_add_ioport_region(unsigned long start, unsigned long size);
-int kvm_remove_ioport_region(unsigned long start, unsigned long size);
+int kvm_add_ioport_region(unsigned long start, unsigned long size,
+  bool is_hot_plug);
+int kvm_remove_ioport_region(unsigned long start, unsigned long 

Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Pekka Enberg
Hi Anthony,

On Mon, Jul 25, 2011 at 4:19 AM, Anthony Liguori anth...@codemonkey.ws wrote:
 lguest already does this and lives in the kernel.

Does Lguest have SMP, usermode networking, and GUI support?

On Mon, Jul 25, 2011 at 4:19 AM, Anthony Liguori anth...@codemonkey.ws wrote:
 So purely from a kernel perspective, why have two tools in the tree that do
 the same thing?  Shouldn't you at least unify the userspace with the lguest
 userspace?

Are you talking about Documentation/lguest/lguest.c? How would you
suggest we unify our code with that?

Pekka
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Avi Kivity

On 07/25/2011 10:27 AM, Pekka Enberg wrote:

Hi Anthony,

On Mon, Jul 25, 2011 at 4:19 AM, Anthony Liguorianth...@codemonkey.ws  wrote:
  lguest already does this and lives in the kernel.

Does Lguest have SMP, usermode networking, and GUI support?



IIRC, yes, no, and no.


On Mon, Jul 25, 2011 at 4:19 AM, Anthony Liguorianth...@codemonkey.ws  wrote:
  So purely from a kernel perspective, why have two tools in the tree that do
  the same thing?  Shouldn't you at least unify the userspace with the lguest
  userspace?

Are you talking about Documentation/lguest/lguest.c? How would you
suggest we unify our code with that?


It should be easy to have tools/kvm drive lguest - they're both virtio 
based.  All you need to do is provide yet another ops structure to drive 
the two ABIs.


I guess lguest.c has to remain, as point of lguest was a simple teaching 
aid for virtualization (which doesn't work very well, as the techniques 
it uses are obsolete).


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Pekka Enberg
Hi Jan,

On Mon, Jul 25, 2011 at 2:12 AM, Jan Kiszka jan.kis...@web.de wrote:
 I've read several times now that developing in a single tree leads to
 better results. Can you provide some example from the QEMU/KVM projects
 where the split is preventing innovation, optimizations, or some other
 kind of progress?

I really don't follow the Qemu project well enough to comment on what
your biggest pain points are there.

As for tools/kvm, it's pretty obvious by now that we want tighter
integration with perf and the tracing facilities (and share code!),
for example so for us merging to mainline is important.

On Mon, Jul 25, 2011 at 2:12 AM, Jan Kiszka jan.kis...@web.de wrote:
 Let's assume this gets merged. What will change for people working on
 generic or x86 KVM features? Will they sooner or later be asked to
 provide corresponding bits also for the tools folder? Having worked on
 two userlands over the past years (QEMU and qemu-kvm), still
 contributing to finally unify them, I'm wondering if this will end up
 repeating the painful history.

I'm not a KVM maintainer so I really don't know how tools/kvm will
change any of the above things. I'm certainly not in a position to
*require* anyone to update tools/kvm when they add new KVM features
but it'd sure be great.

On Mon, Jul 25, 2011 at 2:12 AM, Jan Kiszka jan.kis...@web.de wrote:
 That said, I definitely appreciate the bug fixes as well as code and
 documentation improvements for KVM that originate from this effort! I'm
 just not convinced that writing a new userland and merging it into the
 kernel is the most efficient way to achieve that.

Just to make this crystal clear for everyone: if it weren't for
tools/kvm, I wouldn't be hacking on KVM at all. I've looked at Qemu in
the past (and a lot recently!) and I simply don't see myself
contributing to it, sorry. So 'most efficient' or not, I think
tools/kvm is a net win for Linux and KVM in general.

Pekka
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Pekka Enberg
Hi Avi,

On Mon, Jul 25, 2011 at 10:36 AM, Avi Kivity a...@redhat.com wrote:
 Are you talking about Documentation/lguest/lguest.c? How would you
 suggest we unify our code with that?

 It should be easy to have tools/kvm drive lguest - they're both virtio
 based.  All you need to do is provide yet another ops structure to drive the
 two ABIs.

Right, we could do that. But for what purpose? I thought Lguest ABI
was mostly for educational and testing purposes.

On Mon, Jul 25, 2011 at 10:36 AM, Avi Kivity a...@redhat.com wrote:
 I guess lguest.c has to remain, as point of lguest was a simple teaching aid
 for virtualization (which doesn't work very well, as the techniques it uses
 are obsolete).

Yup, I don't see point in merging lguest.c with tools/kvm.

Pekka
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Sasha Levin
On Mon, 2011-07-25 at 01:12 +0200, Jan Kiszka wrote:
 On 2011-07-24 22:37, Pekka Enberg wrote:
  Hi Linus,
  
  Please consider pulling from
  
ssh://master.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git
  kvm-tool-for-linus
  
  to merge the Native Linux KVM tool to Linux 3.1.
  
  [ The changes to 9p headers were already merged but show up in the pull
  request. ]
  
  The goal of this tool is to provide a clean, from-scratch, lightweight
  KVM host
  tool implementation that can boot Linux guest images with no BIOS
  dependencies
  and with only the minimal amount of legacy device emulation. The primary
  focus
  of the tool is to Linux but there are already people on working on
  supporting
  GRUB and other operating systems.
  
  We want the tool to be part of Linux kernel source tree because we
  believe that
  ‘perf’ clearly showed the benefits of a single repository for both
  kernel and
  userspace components. See Ingo Molnar’s email that started the project for
  details:
  
http://thread.gmane.org/gmane.linux.kernel/962051/focus=962620
 
 I've read several times now that developing in a single tree leads to
 better results. Can you provide some example from the QEMU/KVM projects
 where the split is preventing innovation, optimizations, or some other
 kind of progress?
 

Anthony had a talk on last years KVM forum regarding the QEMU threading
model (slide:
http://www.linux-kvm.org/wiki/images/7/70/2010-forum-threading-qemu.pdf) .

It was suggested that the KVM part of QEMU is having a hard time
achieving the ideal threading model due to its need to support TCG -
something which has nothing to do with KVM itself.


-- 

Sasha.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Pekka Enberg penb...@kernel.org wrote:

 On Mon, Jul 25, 2011 at 2:12 AM, Jan Kiszka jan.kis...@web.de wrote:
  That said, I definitely appreciate the bug fixes as well as code and
  documentation improvements for KVM that originate from this effort! I'm
  just not convinced that writing a new userland and merging it into the
  kernel is the most efficient way to achieve that.
 
 Just to make this crystal clear for everyone: if it weren't for 
 tools/kvm, I wouldn't be hacking on KVM at all. I've looked at Qemu 
 in the past (and a lot recently!) and I simply don't see myself 
 contributing to it, sorry. So 'most efficient' or not, I think 
 tools/kvm is a net win for Linux and KVM in general.

Same here - in fact i first asked Qemu to be put into tools/qemu/ so 
that it all becomes more hackable and more usable - that suggestion 
was rebuked very strongly.

So i wanted to have a lightweight tool that allows me to test KVM and 
tools/kvm/ does that very nicely: i type './kvm run' and i can test a 
native bzImage (which has some virtualization options enabled as 
well) on the _host_ distro i am running, booting to a text shell 
prompt.

I can do that without downloading any (inevitably outdated) 
virtualization images or maintaining my own ones. Maintaining host 
userspace is more than enough for me.

So, since we already have the lguest tool in the kernel tree, why 
cannot we have the much more capable tools/kvm/ in the tree?

So while it is the Qemu folks' right to oppose tools/qemu/, i don't 
see why they are opposing tools/kvm/ ...

Wrt. integration with lguest - this is a new argument that was not 
brought up before (i wish people would not come up with new 
requirements on the day of the pull request) - i don't see how it's 
relevant really: lguest was designed for legacy CPUs and tools/kvm/ 
is precisely about being simple and not doing legacy stuff.

If then Qemu should be the project that integrates lguest. Is anyone 
on the Qemu side looking at lguest integration?

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Sasha Levin
On Mon, 2011-07-25 at 10:36 +0300, Avi Kivity wrote:
 On 07/25/2011 10:27 AM, Pekka Enberg wrote:
  Hi Anthony,
 
  On Mon, Jul 25, 2011 at 4:19 AM, Anthony Liguorianth...@codemonkey.ws  
  wrote:
lguest already does this and lives in the kernel.
 
  Does Lguest have SMP, usermode networking, and GUI support?
 
 
 IIRC, yes, no, and no.

Theres no SMP support in lguest.

-- 

Sasha.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next] skbuff: clear tx zero-copy flag

2011-07-25 Thread Michael S. Tsirkin
On Mon, Jul 25, 2011 at 08:42:00AM +0800, Herbert Xu wrote:
 Shirley Ma mashi...@us.ibm.com wrote:
  
  This patch clears tx zero-copy flag as needed.
  
  Sign-off-by: Shirley Ma x...@us.ibm.com
 
 I think we also need to copy and clear this flag on the splice
 read path as that takes a direct page reference.
 
 I hope there isn't any other path that does this.
 
 Cheers,

When there's a way for an skb to get into the
host networking stack, (e.g. when tap gains zero copy
support) we'll need to handle that.

However macvtap passes an skb directly to the
lower device, so as long as macvtap is the only user
of that interface, we are fine I think - there's
no way for an skb to get from macvtap to splice
read path I think.

Right?

 -- 
 Email: Herbert Xu herb...@gondor.apana.org.au
 Home Page: http://gondor.apana.org.au/~herbert/
 PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] kvm tools, qcow: Add support for writing to zero refcount clusters

2011-07-25 Thread Kevin Wolf
Am 24.07.2011 21:45, schrieb Pekka Enberg:
 This patch adds support for writing to zero refcount clusters. Refcount blocks
 are cached in like L2 tables and flushed upon VIRTIO_BLK_T_FLUSH and when
 evicted from the LRU cache.
 
 With this patch applied, 'qemu-img check' no longer complains about referenced
 clusters with zero reference count after
 
   dd if=/dev/zero of=/mnt/tmp
 
 where '/mnt' is freshly generated QCOW2 image.
 
 Cc: Asias He asias.he...@gmail.com
 Cc: Cyrill Gorcunov gorcu...@gmail.com
 Cc: Ingo Molnar mi...@elte.hu
 Cc: Kevin Wolf kw...@redhat.com
 Cc: Prasad Joshi prasadjoshi...@gmail.com
 Cc: Sasha Levin levinsasha...@gmail.com
 Signed-off-by: Pekka Enberg penb...@kernel.org

Looks okay, except that in case of a crash you'll most likely corrupt
the image because the order in which refcounts and mapping are written
out is completely undefined.

For a reliable implementation you need to make sure that for cluster
allocation you first write out the refcount update, then fsync, then
write out the mapping. Otherwise if the mapping is written out first and
then a crash happens, you'll have clusters that are used, but marked
free, so that in the next run a second cluster can be mapped to the same
location.

Kevin
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Asias He asias.he...@gmail.com wrote:

 On Mon, Jul 25, 2011 at 3:36 PM, Avi Kivity a...@redhat.com wrote:
 
  On 07/25/2011 10:27 AM, Pekka Enberg wrote:
 
  Hi Anthony,
 
  On Mon, Jul 25, 2011 at 4:19 AM, Anthony Liguorianth...@codemonkey.ws
   wrote:
lguest already does this and lives in the kernel.
 
  Does Lguest have SMP, usermode networking, and GUI support?
 
 
  IIRC, yes, no, and no.
 
 
 And Lguest only supports 32-bit kernel.

The answer to SMP is 'no' as well, as lguest supports only a single 
CPU:

   [Documentation/virtual/lguest/lguest.c]

/* We're CPU 0.  In fact, that's the only CPU possible right now. */
cpu_id = 0;

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 09:53, Ingo Molnar wrote:

 
 * Pekka Enberg penb...@kernel.org wrote:
 
 On Mon, Jul 25, 2011 at 2:12 AM, Jan Kiszka jan.kis...@web.de wrote:
 That said, I definitely appreciate the bug fixes as well as code and
 documentation improvements for KVM that originate from this effort! I'm
 just not convinced that writing a new userland and merging it into the
 kernel is the most efficient way to achieve that.
 
 Just to make this crystal clear for everyone: if it weren't for 
 tools/kvm, I wouldn't be hacking on KVM at all. I've looked at Qemu 
 in the past (and a lot recently!) and I simply don't see myself 
 contributing to it, sorry. So 'most efficient' or not, I think 
 tools/kvm is a net win for Linux and KVM in general.
 
 Same here - in fact i first asked Qemu to be put into tools/qemu/ so 
 that it all becomes more hackable and more usable - that suggestion 
 was rebuked very strongly.

So instead of thinking a bit and trying to realize that there might be a reason 
people don't want all their user space in the kernel tree you go ahead and 
start your own crusade of creating a new user space. Great. That's how I always 
hoped Linux would be :(.

 So i wanted to have a lightweight tool that allows me to test KVM and 
 tools/kvm/ does that very nicely: i type './kvm run' and i can test a 
 native bzImage (which has some virtualization options enabled as 
 well) on the _host_ distro i am running, booting to a text shell 
 prompt.

I do that all the time.

  $ qemu-kvm -nographic -kernel arch/x86/boot/bzImage -append console=ttyS0

does the exact same thing. If that's too much typing for you, make it a bash 
alias.

 I can do that without downloading any (inevitably outdated) 
 virtualization images or maintaining my own ones. Maintaining host 
 userspace is more than enough for me.

Who would need images? I usually only run -kernel and -initrd directly to test 
out things. Or if I really want to boot into a real system I do -snapshot 
/dev/sda.

 So, since we already have the lguest tool in the kernel tree, why 
 cannot we have the much more capable tools/kvm/ in the tree?

Lguest is in Documentation/ for a reason. It's not meant as a user space tool 
that you take as-is and use. It's meant for documenting how lguest works in 
general. I admit though, that that's also the reason people don't use it :).

 So while it is the Qemu folks' right to oppose tools/qemu/, i don't 
 see why they are opposing tools/kvm/ ...

In your logic, you would put all of the GNU utils into tools/. This is just 
plain insane. There's a reason we have the split between kernel and user space. 
What does putting them into the same tree bring you? Fame? Glorious stats on 
kernel commits? Seriously, it's a separate project. It's not the kernel.

 Wrt. integration with lguest - this is a new argument that was not 
 brought up before (i wish people would not come up with new 
 requirements on the day of the pull request) - i don't see how it's 
 relevant really: lguest was designed for legacy CPUs and tools/kvm/ 
 is precisely about being simple and not doing legacy stuff.
 
 If then Qemu should be the project that integrates lguest. Is anyone 
 on the Qemu side looking at lguest integration?

I don't think lguest integration makes sense for anyone or anything. Lguest is 
a toy, so let it be the toy it wants to be.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Pekka Enberg
On Mon, Jul 25, 2011 at 4:19 AM, Anthony Liguorianth...@codemonkey.ws wrote:
    lguest already does this and lives in the kernel.

On 07/25/2011 10:27 AM, Pekka Enberg wrote:
  Does Lguest have SMP, usermode networking, and GUI support?

On Mon, Jul 25, 2011 at 3:36 PM, Avi Kivity a...@redhat.com wrote:
  IIRC, yes, no, and no.

* Asias He asias.he...@gmail.com wrote:
 And Lguest only supports 32-bit kernel.

On Mon, Jul 25, 2011 at 11:11 AM, Ingo Molnar mi...@elte.hu wrote:
 The answer to SMP is 'no' as well, as lguest supports only a single
 CPU:

   [Documentation/virtual/lguest/lguest.c]

        /* We're CPU 0.  In fact, that's the only CPU possible right now. */
        cpu_id = 0;

Uhm - so why bring up Lguest at all in this pull request? Anthony,
what am I missing here?

Pekka
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Pekka Enberg
Hi Alexander,

On Mon, Jul 25, 2011 at 11:14 AM, Alexander Graf ag...@suse.de wrote:
 Same here - in fact i first asked Qemu to be put into tools/qemu/ so
 that it all becomes more hackable and more usable - that suggestion
 was rebuked very strongly.

 So instead of thinking a bit and trying to realize that there might be a 
 reason people don't want all their user space in the kernel tree you go ahead 
 and start your own crusade of creating a new user space. Great. That's how I 
 always hoped Linux would be :(.

Damn you Ingo Molnar, I knew you'd somehow get all the credit for our
hard work! ;-)

More seriously, though, I fail to see what's bothering you Alexander.
I and Ingo already mentioned we wouldn't be hacking on Qemu even if
there wasn't no tools/kvm. It's not as if we're putting *your* user
space code into the kernel tree - we wrote our own! What's wrong with
that?

Pekka
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Pekka Enberg
Hi Alexander,

On Mon, Jul 25, 2011 at 11:14 AM, Alexander Graf ag...@suse.de wrote:
 So i wanted to have a lightweight tool that allows me to test KVM and
 tools/kvm/ does that very nicely: i type './kvm run' and i can test a
 native bzImage (which has some virtualization options enabled as
 well) on the _host_ distro i am running, booting to a text shell
 prompt.

 I do that all the time.

  $ qemu-kvm -nographic -kernel arch/x86/boot/bzImage -append console=ttyS0

 does the exact same thing. If that's too much typing for you, make it a bash 
 alias.

You know, they said the same thing about oprofile. All you needed to do was to
write few simple shell scripts to make it work. One of the key
features of tools/kvm
is 'as little configuration as possible' and I can assure you that
bash alias is really
not a solution for that.

[ Yes, yes, I know people apparently use virtio-manager for lauching Qemu so
  they don't need to figure out the setup themselves. But now you have *three*
  separate components (KVM, Qemu, virtio-manager) which is a completely
  different direction we're taking. Hell, we even went ahead and wrote our own
  mini-BIOS just to keep things in one unified tree. ]

Pekka
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 10:23, Pekka Enberg wrote:

 Hi Alexander,
 
 On Mon, Jul 25, 2011 at 11:14 AM, Alexander Graf ag...@suse.de wrote:
 Same here - in fact i first asked Qemu to be put into tools/qemu/ so
 that it all becomes more hackable and more usable - that suggestion
 was rebuked very strongly.
 
 So instead of thinking a bit and trying to realize that there might be a 
 reason people don't want all their user space in the kernel tree you go 
 ahead and start your own crusade of creating a new user space. Great. That's 
 how I always hoped Linux would be :(.
 
 Damn you Ingo Molnar, I knew you'd somehow get all the credit for our
 hard work! ;-)

Well, IIUC he's the one initiating the whole thing, no?

 More seriously, though, I fail to see what's bothering you Alexander.
 I and Ingo already mentioned we wouldn't be hacking on Qemu even if
 there wasn't no tools/kvm. It's not as if we're putting *your* user
 space code into the kernel tree - we wrote our own! What's wrong with
 that?

Nothing. I like competition. But why push it into the kernel? It's not a 
kernel, it's not a library the kernels needs for internal stuff. So why would 
it have to be in there?

In Ingo's reasoning, the next step would be to rewrite glibc and put it into 
the kernel tree, because we end up adding syscalls so adding them to the 
in-kernel libc with the same commit would be a lot easier and cleaner.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 09:50, Sasha Levin wrote:

 On Mon, 2011-07-25 at 01:12 +0200, Jan Kiszka wrote:
 On 2011-07-24 22:37, Pekka Enberg wrote:
 Hi Linus,
 
 Please consider pulling from
 
  ssh://master.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git
 kvm-tool-for-linus
 
 to merge the Native Linux KVM tool to Linux 3.1.
 
 [ The changes to 9p headers were already merged but show up in the pull
 request. ]
 
 The goal of this tool is to provide a clean, from-scratch, lightweight
 KVM host
 tool implementation that can boot Linux guest images with no BIOS
 dependencies
 and with only the minimal amount of legacy device emulation. The primary
 focus
 of the tool is to Linux but there are already people on working on
 supporting
 GRUB and other operating systems.
 
 We want the tool to be part of Linux kernel source tree because we
 believe that
 ‘perf’ clearly showed the benefits of a single repository for both
 kernel and
 userspace components. See Ingo Molnar’s email that started the project for
 details:
 
  http://thread.gmane.org/gmane.linux.kernel/962051/focus=962620
 
 I've read several times now that developing in a single tree leads to
 better results. Can you provide some example from the QEMU/KVM projects
 where the split is preventing innovation, optimizations, or some other
 kind of progress?
 
 
 Anthony had a talk on last years KVM forum regarding the QEMU threading
 model (slide:
 http://www.linux-kvm.org/wiki/images/7/70/2010-forum-threading-qemu.pdf) .
 
 It was suggested that the KVM part of QEMU is having a hard time
 achieving the ideal threading model due to its need to support TCG -
 something which has nothing to do with KVM itself.

Sure, but TCG is a really great part of Qemu. And I'm actually one of the few 
people who believe that in the KVM community ;).

Recently, I sat down for a couple of months and implemented s390x emulation 
with TCG in Qemu. You know why?

We had KVM support (device emulation, KVM backend) for s390x for more than a 
year at that point. People constantly broke stuff in the s390x code paths, 
because they couldn't test it - people seemed to not have too many mainframes 
lying around at home. With the emulation target, I could make sure that 
everyone's able to at least compile and run the s390x specific parts and make 
sure he can do regression testing.

So yes, while TCG is keeping us back in certain areas, it's tremendously useful 
in others.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Avi Kivity

On 07/25/2011 11:31 AM, Alexander Graf wrote:

In Ingo's reasoning, the next step would be to rewrite glibc and put it into 
the kernel tree, because we end up adding syscalls so adding them to the 
in-kernel libc with the same commit would be a lot easier and cleaner.


That actually makes a ton of sense.  One immediate win would be that 
klibc can be tuned to the kernel it ships with (the dynamic loader will 
pick the correct object), so less #ifdef trees.  Another would be to 
make klibc the formal kernel interface, which allows us to reimplement 
an older interface in terms of the one that supercedes it.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Pekka Enberg
Hi Alexander,

On Mon, Jul 25, 2011 at 11:31 AM, Alexander Graf ag...@suse.de wrote:
 Damn you Ingo Molnar, I knew you'd somehow get all the credit for our
 hard work! ;-)

 Well, IIUC he's the one initiating the whole thing, no?

As much as I appreciate Ingo's help and support with the project, no,
I don't consider him to have initiated the whole thing. Yes, it's his
idea and that's what got me into hacking on the thing in the first
place. But calling this Ingo's crusade is somewhat missing the point.
It's really people like Asias, Sasha, Cyrill, and Prashad who have
made all the heavy lifting.

On Mon, Jul 25, 2011 at 11:31 AM, Alexander Graf ag...@suse.de wrote:
 More seriously, though, I fail to see what's bothering you Alexander.
 I and Ingo already mentioned we wouldn't be hacking on Qemu even if
 there wasn't no tools/kvm. It's not as if we're putting *your* user
 space code into the kernel tree - we wrote our own! What's wrong with
 that?

 Nothing. I like competition. But why push it into the kernel? It's not a 
 kernel, it's not a library the kernels needs for internal stuff. So why would 
 it have to be in there?

For the same reasons we want tools/perf to be there.

On Mon, Jul 25, 2011 at 11:31 AM, Alexander Graf ag...@suse.de wrote:
 In Ingo's reasoning, the next step would be to rewrite glibc and put it into 
 the kernel tree, because we end up adding syscalls so adding them to the 
 in-kernel libc with the same commit would be a lot easier and cleaner.

It's called klibc, btw. :-)

Pekka
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 10:30, Pekka Enberg wrote:

 Hi Alexander,
 
 On Mon, Jul 25, 2011 at 11:14 AM, Alexander Graf ag...@suse.de wrote:
 So i wanted to have a lightweight tool that allows me to test KVM and
 tools/kvm/ does that very nicely: i type './kvm run' and i can test a
 native bzImage (which has some virtualization options enabled as
 well) on the _host_ distro i am running, booting to a text shell
 prompt.
 
 I do that all the time.
 
  $ qemu-kvm -nographic -kernel arch/x86/boot/bzImage -append console=ttyS0
 
 does the exact same thing. If that's too much typing for you, make it a bash 
 alias.
 
 You know, they said the same thing about oprofile. All you needed to do was to
 write few simple shell scripts to make it work. One of the key
 features of tools/kvm
 is 'as little configuration as possible' and I can assure you that
 bash alias is really
 not a solution for that.

I like perf. Really. But I still don't see why it wouldn't be able to live in 
its own tree either.

 [ Yes, yes, I know people apparently use virtio-manager for lauching Qemu so
  they don't need to figure out the setup themselves. But now you have *three*
  separate components (KVM, Qemu, virtio-manager) which is a completely

Having the split between KVM and user space is necessary. One lives in kernel 
space, the other one in user space. The split between qemu and virt-manager is 
something I don't like either. But then again I don't like virt-manager :). But 
what does this have to do with pulling something into the kernel? And 
reimplementing something that's already been there?

  different direction we're taking. Hell, we even went ahead and wrote our own
  mini-BIOS just to keep things in one unified tree. ]

Yes, making sure that you have even more non-working non-Linux OSs.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next] skbuff: clear tx zero-copy flag

2011-07-25 Thread Herbert Xu
On Mon, Jul 25, 2011 at 11:07:43AM +0300, Michael S. Tsirkin wrote:

 However macvtap passes an skb directly to the
 lower device, so as long as macvtap is the only user
 of that interface, we are fine I think - there's
 no way for an skb to get from macvtap to splice
 read path I think.
 
 Right?

Yes, as long as you can guarantee that the skb never loops back
then you should be fine.

However, does macvtap really bypass everything, including the
qdisc layer? The qdisc layer is certainly capable of looping
the skb back with the redirect action.

Cheers,
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 10:37, Pekka Enberg wrote:

 Hi Alexander,
 
 On Mon, Jul 25, 2011 at 11:31 AM, Alexander Graf ag...@suse.de wrote:
 Damn you Ingo Molnar, I knew you'd somehow get all the credit for our
 hard work! ;-)
 
 Well, IIUC he's the one initiating the whole thing, no?
 
 As much as I appreciate Ingo's help and support with the project, no,
 I don't consider him to have initiated the whole thing. Yes, it's his
 idea and that's what got me into hacking on the thing in the first
 place. But calling this Ingo's crusade is somewhat missing the point.
 It's really people like Asias, Sasha, Cyrill, and Prashad who have
 made all the heavy lifting.
 
 On Mon, Jul 25, 2011 at 11:31 AM, Alexander Graf ag...@suse.de wrote:
 More seriously, though, I fail to see what's bothering you Alexander.
 I and Ingo already mentioned we wouldn't be hacking on Qemu even if
 there wasn't no tools/kvm. It's not as if we're putting *your* user
 space code into the kernel tree - we wrote our own! What's wrong with
 that?
 
 Nothing. I like competition. But why push it into the kernel? It's not a 
 kernel, it's not a library the kernels needs for internal stuff. So why 
 would it have to be in there?
 
 For the same reasons we want tools/perf to be there.

Yeah, I want a pony too.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Pekka Enberg
Hi Alexander,

On Mon, Jul 25, 2011 at 11:37 AM, Alexander Graf ag...@suse.de wrote:
  different direction we're taking. Hell, we even went ahead and wrote our own
  mini-BIOS just to keep things in one unified tree. ]

 Yes, making sure that you have even more non-working non-Linux OSs.

You know, I've been a Linux kernel hacker for more than five years now
and I've spent way too much of my spare time to improve it. So yes, I
care about Linux. I care about it a lot, actually. It's fair to say I
care about Linux more than I care about it more than any other
operating system out there.

[ I thought the 'native Linux' part in 'native Linux KVM tool' was a
dead giveaway, really. ]

Now if people want to support other operating systems, that's cool and
I'm happy to help out where I can. But I don't understand why people
keep bringing non-Linux OSs as an argument for not merging tools/kvm
into the Linux kernel tree. I mean really, did someone actually expect
that a Linux kernel developer spends his weekends improving the state
of Windows virtualization?

And don't get this the wrong way either, I'm not hostile against other
operating systems, but I simply am not interested enough in them to
spend my time improving them.

Pekka
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Avi Kivity
qemu_malloc() is type-unsafe as it returns a void pointer.  Introduce
QEMU_NEW() (and QEMU_NEWZ()), which return the correct type.

Signed-off-by: Avi Kivity a...@redhat.com
---

This is part of my memory API patchset, but doesn't really belong there.

 qemu-common.h |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/qemu-common.h b/qemu-common.h
index ba55719..66effa3 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -186,6 +186,9 @@ void qemu_free(void *ptr);
 char *qemu_strdup(const char *str);
 char *qemu_strndup(const char *str, size_t size);
 
+#define QEMU_NEW(type) ((type *)(qemu_malloc(sizeof(type
+#define QEMU_NEWZ(type) ((type *)(qemu_mallocz(sizeof(type
+
 void qemu_mutex_lock_iothread(void);
 void qemu_mutex_unlock_iothread(void);
 
-- 
1.7.5.3

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Pekka Enberg
On Mon, Jul 25, 2011 at 11:44 AM, Alexander Graf ag...@suse.de wrote:
 For the same reasons we want tools/perf to be there.

 Yeah, I want a pony too.

I can contact the Linux Foundation to see if we can arrange that.

Seriously, though, I don't understand your point. What is it? Do you
not agree that perf benefited from being part of the Linux kernel
tree? Or do you think it does not apply here?

Pekka
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Alexander Graf ag...@suse.de wrote:

 On 25.07.2011, at 09:53, Ingo Molnar wrote:
 
  
  * Pekka Enberg penb...@kernel.org wrote:
  
  On Mon, Jul 25, 2011 at 2:12 AM, Jan Kiszka jan.kis...@web.de wrote:
 
  That said, I definitely appreciate the bug fixes as well as 
  code and documentation improvements for KVM that originate from 
  this effort! I'm just not convinced that writing a new userland 
  and merging it into the kernel is the most efficient way to 
  achieve that.
  
  Just to make this crystal clear for everyone: if it weren't for 
  tools/kvm, I wouldn't be hacking on KVM at all. I've looked at 
  Qemu in the past (and a lot recently!) and I simply don't see 
  myself contributing to it, sorry. So 'most efficient' or not, I 
  think tools/kvm is a net win for Linux and KVM in general.
  
  Same here - in fact i first asked Qemu to be put into tools/qemu/ 
  so that it all becomes more hackable and more usable - that 
  suggestion was rebuked very strongly.
 
 So instead of thinking a bit and trying to realize that there might 
 be a reason people don't want all their user space in the kernel 
 tree you go ahead and start your own crusade of creating a new user 
 space. Great. That's how I always hoped Linux would be :(.

Not all their user space but the ones that are tightly integrated 
with the kernel to begin with. How hard is that concept to 
understand?

Virtualization is very tightly bound to the kernel, like it or not. 
So is profiling, power management and a few other things.

And when you do 'ls tools/' you'll see exactly those topics 
populated:

 earth5:~/tip ls tools/
 firewire  kvm  perf  power  slub  testing  usb  virtio

[ In fact tools/virtio/ was merged upstream yesterday and putting 
  that code there was a good call IMO. ]

So just as there are good reasons to keep some user-space projects 
out of the kernel tree (while i'd love to see a usable mail client 
for Linux i dont think it belongs into tools/) there are similarly 
good reasons to keep some of them in the kernel tree.

tools/kvm/ is an excellent example for that, just look at the code - 
it shares code with the kernel, uses the kernel coding style, is in 
significant part developed by kernel developers and it is being used 
by kernel developers.

I wish there was a hackable tools/qemu/ base but there isnt and won't 
be any. The thing i *can* do is to help create a hackable 
virtualization tool.

  So i wanted to have a lightweight tool that allows me to test KVM 
  and tools/kvm/ does that very nicely: i type './kvm run' and i 
  can test a native bzImage (which has some virtualization options 
  enabled as well) on the _host_ distro i am running, booting to a 
  text shell prompt.
 
 I do that all the time.
 
   $ qemu-kvm -nographic -kernel arch/x86/boot/bzImage -append console=ttyS0
 
 does the exact same thing. If that's too much typing for you, make 
 it a bash alias.

This only boots the bzImage itself and panics when it would like to 
hit user-space.

  I can do that without downloading any (inevitably outdated) 
  virtualization images or maintaining my own ones. Maintaining 
  host userspace is more than enough for me.
 
 Who would need images? I usually only run -kernel and -initrd 
 directly to test out things. Or if I really want to boot into a 
 real system I do -snapshot /dev/sda.

that too panics here, while with tools/kvm/ i get to a prompt:

 [root@aldebaran ~]# uptime
  08:42:13 up 0 min,  0 users,  load average: 0.00, 0.00, 0.00

But i agree with you that obviously Qemu (or my usage of parameters, 
whichever is the problem) could be fixed to do this correctly.

  So, since we already have the lguest tool in the kernel tree, why 
  cannot we have the much more capable tools/kvm/ in the tree?
 
 Lguest is in Documentation/ for a reason. It's not meant as a user 
 space tool that you take as-is and use. It's meant for documenting 
 how lguest works in general. I admit though, that that's also the 
 reason people don't use it :).

There's an obvious contrast with the request to merge tools/kvm/ to 
lguest (not brought up by me) and your description of lguest only 
being for documentation purposes, not to be used really, etc.

I wanted to highlight this contrast, so if you thought we disagree 
much about lguest i don't think we do.

  So while it is the Qemu folks' right to oppose tools/qemu/, i 
  don't see why they are opposing tools/kvm/ ...
 
 In your logic, you would put all of the GNU utils into tools/. This 
 is just plain insane. There's a reason we have the split between 
 kernel and user space. What does putting them into the same tree 
 bring you? Fame? Glorious stats on kernel commits? Seriously, it's 
 a separate project. It's not the kernel.

Where did i claim that *all* user-space projects need to move into 
the kernel tree? It's all case by case. Before you argue against a 
position and ridicule it you need to understand it.

Thanks,

Ingo
--
To unsubscribe from this list: 

Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 10:51, Pekka Enberg wrote:

 On Mon, Jul 25, 2011 at 11:44 AM, Alexander Graf ag...@suse.de wrote:
 For the same reasons we want tools/perf to be there.
 
 Yeah, I want a pony too.
 
 I can contact the Linux Foundation to see if we can arrange that.
 
 Seriously, though, I don't understand your point. What is it? Do you
 not agree that perf benefited from being part of the Linux kernel
 tree? Or do you think it does not apply here?

I disagree with the assumption that putting something into the kernel tree 
benefits the project (except for maybe bumping its visibility, so it 
potentially kills competition again). If however it's true, then we should 
analyze why exactly projects benefit from being in the kernel tree, so we can 
maybe find a good solution to achieve the same differently.

The approach let's move everyting into tools/ simply doesn't scale.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Ingo Molnar mi...@elte.hu wrote:

 Virtualization is very tightly bound to the kernel, like it or not. 
 So is profiling, power management and a few other things.
 
 And when you do 'ls tools/' you'll see exactly those topics 
 populated:
 
  earth5:~/tip ls tools/
  firewire  kvm  perf  power  slub  testing  usb  virtio
 
 [ In fact tools/virtio/ was merged upstream yesterday and putting 
   that code there was a good call IMO. ]

Correction: tools/virtio/ was merged much earlier than that - in 
v2.6.37.

Still the point remains: you cannot ridicule us moving user-space 
code to tools/ while virtio itself is moving code there ...

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 10:54, Ingo Molnar wrote:

 
 * Alexander Graf ag...@suse.de wrote:
 
 On 25.07.2011, at 09:53, Ingo Molnar wrote:
 
 
 * Pekka Enberg penb...@kernel.org wrote:
 
 On Mon, Jul 25, 2011 at 2:12 AM, Jan Kiszka jan.kis...@web.de wrote:
 
 That said, I definitely appreciate the bug fixes as well as 
 code and documentation improvements for KVM that originate from 
 this effort! I'm just not convinced that writing a new userland 
 and merging it into the kernel is the most efficient way to 
 achieve that.
 
 Just to make this crystal clear for everyone: if it weren't for 
 tools/kvm, I wouldn't be hacking on KVM at all. I've looked at 
 Qemu in the past (and a lot recently!) and I simply don't see 
 myself contributing to it, sorry. So 'most efficient' or not, I 
 think tools/kvm is a net win for Linux and KVM in general.
 
 Same here - in fact i first asked Qemu to be put into tools/qemu/ 
 so that it all becomes more hackable and more usable - that 
 suggestion was rebuked very strongly.
 
 So instead of thinking a bit and trying to realize that there might 
 be a reason people don't want all their user space in the kernel 
 tree you go ahead and start your own crusade of creating a new user 
 space. Great. That's how I always hoped Linux would be :(.
 
 Not all their user space but the ones that are tightly integrated 
 with the kernel to begin with. How hard is that concept to 
 understand?

It's very hard to understand. It's similar to religion - I could easily apply 
your point to every reasonably low-level user space project out there. X for 
example. X needs to interact with KMS and DRI and whatdoiknow. So it'd be a 
perfect fit to pull into tools/, no?

 Virtualization is very tightly bound to the kernel, like it or not. 

I don't see why. The whole point of virtualization is to model simple, with KVM 
preferably even very-close-to-real-hardware interfaces that allow you to keep 
things separate.

 So is profiling, power management and a few other things.

I'm also failing to see the reasoning here TBH.

 And when you do 'ls tools/' you'll see exactly those topics 
 populated:
 
 earth5:~/tip ls tools/
 firewire  kvm  perf  power  slub  testing  usb  virtio
 
 [ In fact tools/virtio/ was merged upstream yesterday and putting 
  that code there was a good call IMO. ]
 
 So just as there are good reasons to keep some user-space projects 
 out of the kernel tree (while i'd love to see a usable mail client 
 for Linux i dont think it belongs into tools/) there are similarly 
 good reasons to keep some of them in the kernel tree.
 
 tools/kvm/ is an excellent example for that, just look at the code - 
 it shares code with the kernel, uses the kernel coding style, is in 
 significant part developed by kernel developers and it is being used 
 by kernel developers.

Then make a separate tree, add the kernel as git submodule and simply pull in 
your library/header dependencies from there. Shouldn't be too hard, no?

 I wish there was a hackable tools/qemu/ base but there isnt and won't 
 be any. The thing i *can* do is to help create a hackable 
 virtualization tool.

What's the problem with a code base that's hackable, but not in tools/?

 
 So i wanted to have a lightweight tool that allows me to test KVM 
 and tools/kvm/ does that very nicely: i type './kvm run' and i 
 can test a native bzImage (which has some virtualization options 
 enabled as well) on the _host_ distro i am running, booting to a 
 text shell prompt.
 
 I do that all the time.
 
  $ qemu-kvm -nographic -kernel arch/x86/boot/bzImage -append console=ttyS0
 
 does the exact same thing. If that's too much typing for you, make 
 it a bash alias.
 
 This only boots the bzImage itself and panics when it would like to 
 hit user-space.

Well, sure.

 
 I can do that without downloading any (inevitably outdated) 
 virtualization images or maintaining my own ones. Maintaining 
 host userspace is more than enough for me.
 
 Who would need images? I usually only run -kernel and -initrd 
 directly to test out things. Or if I really want to boot into a 
 real system I do -snapshot /dev/sda.
 
 that too panics here, while with tools/kvm/ i get to a prompt:
 
 [root@aldebaran ~]# uptime
  08:42:13 up 0 min,  0 users,  load average: 0.00, 0.00, 0.00
 
 But i agree with you that obviously Qemu (or my usage of parameters, 
 whichever is the problem) could be fixed to do this correctly.

Well, you need to make sure to pass the right partition into -append. No idea 
what ./kvm run does, but it's certainly something easily scriptable with any 
other virtualization user land.

 
 So, since we already have the lguest tool in the kernel tree, why 
 cannot we have the much more capable tools/kvm/ in the tree?
 
 Lguest is in Documentation/ for a reason. It's not meant as a user 
 space tool that you take as-is and use. It's meant for documenting 
 how lguest works in general. I admit though, that that's also the 
 reason people don't use it :).

Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 10:47, Pekka Enberg wrote:

 Hi Alexander,
 
 On Mon, Jul 25, 2011 at 11:37 AM, Alexander Graf ag...@suse.de wrote:
  different direction we're taking. Hell, we even went ahead and wrote our 
 own
  mini-BIOS just to keep things in one unified tree. ]
 
 Yes, making sure that you have even more non-working non-Linux OSs.
 
 You know, I've been a Linux kernel hacker for more than five years now
 and I've spent way too much of my spare time to improve it. So yes, I
 care about Linux. I care about it a lot, actually. It's fair to say I
 care about Linux more than I care about it more than any other
 operating system out there.
 
 [ I thought the 'native Linux' part in 'native Linux KVM tool' was a
 dead giveaway, really. ]
 
 Now if people want to support other operating systems, that's cool and
 I'm happy to help out where I can. But I don't understand why people
 keep bringing non-Linux OSs as an argument for not merging tools/kvm
 into the Linux kernel tree. I mean really, did someone actually expect
 that a Linux kernel developer spends his weekends improving the state
 of Windows virtualization?
 
 And don't get this the wrong way either, I'm not hostile against other
 operating systems, but I simply am not interested enough in them to
 spend my time improving them.

Then kvm-tool is about as useful as Mac-on-Linux. Why don't we have MoL user 
land in the kernel? I even added support for KVM to it about a year ago. So all 
I need to do is change it to the kernel coding style, add some dependencies on 
kernel headers and I'm good for a pull request?


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Pekka Enberg penb...@kernel.org wrote:

 Hi Alexander,
 
 On Mon, Jul 25, 2011 at 11:14 AM, Alexander Graf ag...@suse.de wrote:

  Same here - in fact i first asked Qemu to be put into 
  tools/qemu/ so that it all becomes more hackable and more usable 
  - that suggestion was rebuked very strongly.
 
  So instead of thinking a bit and trying to realize that there 
  might be a reason people don't want all their user space in the 
  kernel tree you go ahead and start your own crusade of creating a 
  new user space. Great. That's how I always hoped Linux would be 
  :(.
 
 Damn you Ingo Molnar, I knew you'd somehow get all the credit for 
 our hard work! ;-)

Yeah, i'm really just a user and code reviewer of tools/kvm/, in 
terms of actual commits i'm barely on the radar compared to the 
massive contributions of Pekka, Sasha, Asias, Cyrill and others.

 More seriously, though, I fail to see what's bothering you 
 Alexander. I and Ingo already mentioned we wouldn't be hacking on 
 Qemu even if there wasn't no tools/kvm. It's not as if we're 
 putting *your* user space code into the kernel tree - we wrote our 
 own! What's wrong with that?

That puzzles me as well.

If tools/kvm/ is really such a bad approach then it will just wither 
and die, like one of the thousand stale drivers we have in the kernel 
- or like lguest it ends up as a cool hack that didnt quite make it.

Judging by the many contributions tools/kvm/ has already attracted in 
its limited form an out of tree project i'd not bet on that outcome 
though.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Pekka Enberg penb...@kernel.org wrote:

 Hi Alexander,
 
 On Mon, Jul 25, 2011 at 11:31 AM, Alexander Graf ag...@suse.de wrote:
  Damn you Ingo Molnar, I knew you'd somehow get all the credit for our
  hard work! ;-)
 
  Well, IIUC he's the one initiating the whole thing, no?
 
 As much as I appreciate Ingo's help and support with the project, 
 no, I don't consider him to have initiated the whole thing. [...]

No, i havent - and if Alexander pulls the URI and looks at the 
changelogs he'll realize that fact too.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Pekka Enberg penb...@kernel.org wrote:

 [ I thought the 'native Linux' part in 'native Linux KVM tool' was 
 a dead giveaway, really. ]
 
 Now if people want to support other operating systems, that's cool 
 and I'm happy to help out where I can. But I don't understand why 
 people keep bringing non-Linux OSs as an argument for not merging 
 tools/kvm into the Linux kernel tree. I mean really, did someone 
 actually expect that a Linux kernel developer spends his weekends 
 improving the state of Windows virtualization?
 
 And don't get this the wrong way either, I'm not hostile against 
 other operating systems, but I simply am not interested enough in 
 them to spend my time improving them.

In fact one of the problems i see with Qemu is that Qemu had to make 
many compromises to support Windows and other weird platforms that 
i'm (and i'd claim most other Linux kernel developers) are personally 
not interested in.

So here's a 14 KLOC tools/kvm/ that does everything that i need from 
virtualization (easy testing of a bzImage before rebooting the host 
system into it), is clean, readable and hackable and is 1.5% of the 
nearly 1 MLOC Qemu code base.

The whole tools/kvm/ code base is (much) smaller than Qemu's IA64 
support code for example. The size difference is startling. 

tools/kvm/ does less and in my experience does it better - is that 
such a surprising thing?

So it was a no brainer for me to pull it into -tip.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 10:51, Avi Kivity wrote:

 qemu_malloc() is type-unsafe as it returns a void pointer.  Introduce
 QEMU_NEW() (and QEMU_NEWZ()), which return the correct type.

What does this buy you over

type *x = qemu_malloc(sizeof(type));

? I find the non-C++ version easier to read even.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Sasha Levin
On Mon, 2011-07-25 at 11:32 +0200, Alexander Graf wrote:
 On 25.07.2011, at 10:51, Avi Kivity wrote:
 
  qemu_malloc() is type-unsafe as it returns a void pointer.  Introduce
  QEMU_NEW() (and QEMU_NEWZ()), which return the correct type.
 
 What does this buy you over
 
 type *x = qemu_malloc(sizeof(type));
 
 ? I find the non-C++ version easier to read even.

It'll warn when you do silly things such as:

struct some_struct *k;

k = qemu_malloc(sizeof(k));

-- 

Sasha.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 11:26, Ingo Molnar wrote:

 
 * Pekka Enberg penb...@kernel.org wrote:
 
 [ I thought the 'native Linux' part in 'native Linux KVM tool' was 
 a dead giveaway, really. ]
 
 Now if people want to support other operating systems, that's cool 
 and I'm happy to help out where I can. But I don't understand why 
 people keep bringing non-Linux OSs as an argument for not merging 
 tools/kvm into the Linux kernel tree. I mean really, did someone 
 actually expect that a Linux kernel developer spends his weekends 
 improving the state of Windows virtualization?
 
 And don't get this the wrong way either, I'm not hostile against 
 other operating systems, but I simply am not interested enough in 
 them to spend my time improving them.
 
 In fact one of the problems i see with Qemu is that Qemu had to make 
 many compromises to support Windows and other weird platforms that 
 i'm (and i'd claim most other Linux kernel developers) are personally 
 not interested in.

It's what makes it so powerful. Adding a new architecture for KVM for example 
is as easy as only implementing the CPU. All device emulation is already there. 
If you want something Linux only, lguest would've been enough, no?

The only real problem I see with Qemu is that it's been born in a time of 
uniprocessor systems. Making it fully threaded as an after-thought is insanely 
hard.

 So here's a 14 KLOC tools/kvm/ that does everything that i need from 
 virtualization (easy testing of a bzImage before rebooting the host 
 system into it), is clean, readable and hackable and is 1.5% of the 
 nearly 1 MLOC Qemu code base.

Sure, we'll talk again when you can run Android for ARM on kvm-tool.

 The whole tools/kvm/ code base is (much) smaller than Qemu's IA64 
 support code for example. The size difference is startling. 

That's the host code compiler. Qemu doesn't have IA64 guest support.

 tools/kvm/ does less and in my experience does it better - is that 
 such a surprising thing?

I already stated in a few mails before and also in the last discussion that I'm 
in general in favor of having multiple user space targets for KVM. In fact, I 
wrote 2 backends (both for PPC) myself. Though adding those 2 backends to their 
respective projects probably provided a lot more value-add (features that qemu 
didn't already have) than kvm-tool does atm. Kvm-tool really just smells a lot 
like NIH. Which is fine - that's why we have KDE and Gnome, right? ;)

 So it was a no brainer for me to pull it into -tip.

The thing I don't agree with is that it should live in the kernel tree.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Alexander Graf ag...@suse.de wrote:

 On 25.07.2011, at 10:54, Ingo Molnar wrote:
 
  
  * Alexander Graf ag...@suse.de wrote:
  
  On 25.07.2011, at 09:53, Ingo Molnar wrote:
  
  
  * Pekka Enberg penb...@kernel.org wrote:
  
  On Mon, Jul 25, 2011 at 2:12 AM, Jan Kiszka jan.kis...@web.de wrote:
  
  That said, I definitely appreciate the bug fixes as well as 
  code and documentation improvements for KVM that originate from 
  this effort! I'm just not convinced that writing a new userland 
  and merging it into the kernel is the most efficient way to 
  achieve that.
  
  Just to make this crystal clear for everyone: if it weren't for 
  tools/kvm, I wouldn't be hacking on KVM at all. I've looked at 
  Qemu in the past (and a lot recently!) and I simply don't see 
  myself contributing to it, sorry. So 'most efficient' or not, I 
  think tools/kvm is a net win for Linux and KVM in general.
  
  Same here - in fact i first asked Qemu to be put into tools/qemu/ 
  so that it all becomes more hackable and more usable - that 
  suggestion was rebuked very strongly.
  
  So instead of thinking a bit and trying to realize that there might 
  be a reason people don't want all their user space in the kernel 
  tree you go ahead and start your own crusade of creating a new user 
  space. Great. That's how I always hoped Linux would be :(.
  
  Not all their user space but the ones that are tightly 
  integrated with the kernel to begin with. How hard is that 
  concept to understand?
 
 It's very hard to understand. It's similar to religion - I could 
 easily apply your point to every reasonably low-level user space 
 project out there. X for example. X needs to interact with KMS and 
 DRI and whatdoiknow. So it'd be a perfect fit to pull into tools/, 
 no?

If the graphics folks feel comfortable with that approach then yes, i 
think graphics would be a good example as well - in my experience a 
lot of the user-space pain related to graphics development comes from 
ugly version friction between various graphics components, and all 
the APIs are pretty fluid and heavily developed - which is easier to 
do within a single code repository.

But if the Xorg/graphics folks want it separate it's their decision 
and they've said it in the past that they like the current 
modularization.

If someone comes with tools/X11/ that actually *works* and provides a 
usable X11 replacement i sure would try it out and would likely 
contribute to it.

Basic infrastructure and tooling - many projects could apply but the 
most obvious choices are ones that relate to and depend on the 
kernel.

Browsers, email clients, GIMP, games, LibreOffice, not so much.

There's no clear line but that's not a problem - there are seldom any 
clear lines in life. It's a case by case issue and very much depends 
on the attributes of the project and the preferences of the 
developers who are driving it.

I can tell you my first hand experience: for tools/perf/ and 
tools/kvm/ it was highly beneficial to be developed in the kernel 
repo.

I don't see how you can validly bring religion into this discussion.

  Virtualization is very tightly bound to the kernel, like it or 
  not.
 
 I don't see why. The whole point of virtualization is to model 
 simple, with KVM preferably even very-close-to-real-hardware 
 interfaces that allow you to keep things separate.

Look at tools/kvm/ - i cannot do anything useful without a Linux 
kernel under it. It's as deeply bound to the Linux kernel as it gets.

Then look at the actual drivers and interfaces within tools/kvm/. 
It's using the same symbols and conventions for 'guest' and 'host' 
side.

Check out tools/kvm/hw/i8042.c and match it up with 
include/linux/serio.h and drivers/input/serio/i8042.c - you can 
literally walk from one side to the other and understand how guest 
and host are tightly related not just functionality but also 
implementation wise.

This is how Qemu should be doing it as well btw., to ease the 
debugging of host/guest interaction bugs and to ease development.

  So is profiling, power management and a few other things.
 
 I'm also failing to see the reasoning here TBH.

You need to quote the full argument to see the reason in it:

  Virtualization is very tightly bound to the kernel, like it or 
  not. So is profiling, power management and a few other things.

It's a very simple point and observation: tools which integrate to 
the kernel so that they wouldnt even run on another kernel obviously 
are very natural to develop in tools/.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 11:37, Sasha Levin wrote:

 On Mon, 2011-07-25 at 11:32 +0200, Alexander Graf wrote:
 On 25.07.2011, at 10:51, Avi Kivity wrote:
 
 qemu_malloc() is type-unsafe as it returns a void pointer.  Introduce
 QEMU_NEW() (and QEMU_NEWZ()), which return the correct type.
 
 What does this buy you over
 
 type *x = qemu_malloc(sizeof(type));
 
 ? I find the non-C++ version easier to read even.
 
 It'll warn when you do silly things such as:
 
 struct some_struct *k;
 
 k = qemu_malloc(sizeof(k));

Hm - is there any way to get this without adding upper case C++'ish macros?


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next] skbuff: clear tx zero-copy flag

2011-07-25 Thread Michael S. Tsirkin
On Mon, Jul 25, 2011 at 04:40:57PM +0800, Herbert Xu wrote:
 On Mon, Jul 25, 2011 at 11:07:43AM +0300, Michael S. Tsirkin wrote:
 
  However macvtap passes an skb directly to the
  lower device, so as long as macvtap is the only user
  of that interface, we are fine I think - there's
  no way for an skb to get from macvtap to splice
  read path I think.
  
  Right?
 
 Yes, as long as you can guarantee that the skb never loops back
 then you should be fine.
 
 However, does macvtap really bypass everything, including the
 qdisc layer? The qdisc layer is certainly capable of looping
 the skb back with the redirect action.
 
 Cheers,

No, I don't think macvtap bypasses the qdisc.
Is the action in question here?
static int tcf_mirred(struct sk_buff *skb,
const struct tc_action *a,
  struct tcf_result *res)

if yes that seems to always clone an skb, which in turn
does the copy so we are fine?

 -- 
 Email: Herbert Xu herb...@gondor.apana.org.au
 Home Page: http://gondor.apana.org.au/~herbert/
 PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 11:41, Ingo Molnar wrote:

 Virtualization is very tightly bound to the kernel, like it or 
 not. So is profiling, power management and a few other things.
 
 It's a very simple point and observation: tools which integrate to 
 the kernel so that they wouldnt even run on another kernel obviously 
 are very natural to develop in tools/.

Ah, very good. So all we need to do to prove the point that kvm-tool doesn't 
belong in tools/ is port KVM to another OS and make kvm-tool compile there too? 
Shouldn't be too hard. People already have working ports of (old) KVM versions 
on FreeBSD and Windows.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Peter Maydell
On 25 July 2011 10:32, Alexander Graf ag...@suse.de wrote:
 On 25.07.2011, at 10:51, Avi Kivity wrote:
 qemu_malloc() is type-unsafe as it returns a void pointer.  Introduce
 QEMU_NEW() (and QEMU_NEWZ()), which return the correct type.

 What does this buy you over

 type *x = qemu_malloc(sizeof(type));

 ? I find the non-C++ version easier to read even.

Yeah, while we're writing in C we should just stick to the C-like
APIs, it's less confusing IMHO than wrapping it up in something else.
I assume Anthony's new object model stuff will have a create me a
new foo object API anyway, so QEMU_NEW() is possibly a bit of a
namespace grab.

-- PMM
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Avi Kivity

On 07/25/2011 12:41 PM, Ingo Molnar wrote:

Look at tools/kvm/ - i cannot do anything useful without a Linux
kernel under it. It's as deeply bound to the Linux kernel as it gets.


The actual code that interacts with the kernel is pretty small, and will 
grow smaller (as a fraction) in time.



Then look at the actual drivers and interfaces within tools/kvm/.
It's using the same symbols and conventions for 'guest' and 'host'
side.

Check out tools/kvm/hw/i8042.c and match it up with
include/linux/serio.h and drivers/input/serio/i8042.c - you can
literally walk from one side to the other and understand how guest
and host are tightly related not just functionality but also
implementation wise.

This is how Qemu should be doing it as well btw., to ease the
debugging of host/guest interaction bugs and to ease development.


No.  That ties the guest and host interfaces together.  If you change 
the guest to use a feature you didn't implement previously in the host, 
it's suddenly broken when virtualized.


The right thing is to implement the hardware interface exactly per spec, 
and ignore the guest(s) completely except for testing and performance.



--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Avi Kivity

On 07/25/2011 12:43 PM, Alexander Graf wrote:

Hm - is there any way to get this without adding upper case C++'ish macros?


Switch to C++.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Avi Kivity

On 07/25/2011 12:48 PM, Peter Maydell wrote:

On 25 July 2011 10:32, Alexander Grafag...@suse.de  wrote:
  On 25.07.2011, at 10:51, Avi Kivity wrote:
  qemu_malloc() is type-unsafe as it returns a void pointer.  Introduce
  QEMU_NEW() (and QEMU_NEWZ()), which return the correct type.

  What does this buy you over

  type *x = qemu_malloc(sizeof(type));

  ? I find the non-C++ version easier to read even.

Yeah, while we're writing in C we should just stick to the C-like
APIs, it's less confusing IMHO than wrapping it up in something else.


That argument can be used to block any change.  You'll get used to it in 
time.  The question is, is the new interface better or not.



I assume Anthony's new object model stuff will have a create me a
new foo object API anyway, so QEMU_NEW() is possibly a bit of a
namespace grab.


Anthony's stuff is at a much higher level, hopefully he'll come back to 
the ground one day.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 11:52, Avi Kivity wrote:

 On 07/25/2011 12:48 PM, Peter Maydell wrote:
 On 25 July 2011 10:32, Alexander Grafag...@suse.de  wrote:
   On 25.07.2011, at 10:51, Avi Kivity wrote:
   qemu_malloc() is type-unsafe as it returns a void pointer.  Introduce
   QEMU_NEW() (and QEMU_NEWZ()), which return the correct type.
 
   What does this buy you over
 
   type *x = qemu_malloc(sizeof(type));
 
   ? I find the non-C++ version easier to read even.
 
 Yeah, while we're writing in C we should just stick to the C-like
 APIs, it's less confusing IMHO than wrapping it up in something else.
 
 That argument can be used to block any change.  You'll get used to it in 
 time.  The question is, is the new interface better or not.

I agree that it keeps you from accidently malloc'ing a struct of pointer size. 
But couldn't we also just add this to checkpatch.pl?

I sympathize with Peter on the rationale that keeping interfaces aligned with 
how C APIs usually look like helps making the code more readable. 


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next] skbuff: clear tx zero-copy flag

2011-07-25 Thread Herbert Xu
On Mon, Jul 25, 2011 at 12:44:14PM +0300, Michael S. Tsirkin wrote:

 if yes that seems to always clone an skb, which in turn
 does the copy so we are fine?

Yes you're right, it should be safe.

However, I think we should add a WARN_ON to the splice skb path
so that should a packet find its way through a path that we haven't
thought of then at least we'll know about it.

Thanks,
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next] skbuff: clear tx zero-copy flag

2011-07-25 Thread David Miller
From: Herbert Xu herb...@gondor.hengli.com.au
Date: Mon, 25 Jul 2011 17:57:11 +0800

 However, I think we should add a WARN_ON to the splice skb path
 so that should a packet find its way through a path that we haven't
 thought of then at least we'll know about it.

Good idea.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Avi Kivity

On 07/25/2011 12:56 PM, Alexander Graf wrote:


  That argument can be used to block any change.  You'll get used to it in 
time.  The question is, is the new interface better or not.

I agree that it keeps you from accidently malloc'ing a struct of pointer size. 
But couldn't we also just add this to checkpatch.pl?


Better APIs trump better patch review.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Avi Kivity a...@redhat.com wrote:

 On 07/25/2011 12:41 PM, Ingo Molnar wrote:

  Look at tools/kvm/ - i cannot do anything useful without a Linux 
  kernel under it. It's as deeply bound to the Linux kernel as it 
  gets.
 
 The actual code that interacts with the kernel is pretty small, and 
 will grow smaller (as a fraction) in time.

Look at the example i have given in my other mail, conceptual 
interaction goes beyond the strict KVM (and hardware) ABIs themselves 
- for every guest side driver there's a host side piece of 
functionality.

What we found is that there's real benefits from tools/kvm/hw/i8042.c 
using the same conventions and constants as 
drivers/input/serio/i8042.c.

The two pieces of code are obviously independent ABI-wise - but to 
developers working on both sides of the fence it's very useful to 
have the code look similar and familiar.

Some other pieces of code share library functionality between the 
kernel and tools/kvm/.

  Then look at the actual drivers and interfaces within tools/kvm/. 
  It's using the same symbols and conventions for 'guest' and 
  'host' side.
 
  Check out tools/kvm/hw/i8042.c and match it up with 
  include/linux/serio.h and drivers/input/serio/i8042.c - you can 
  literally walk from one side to the other and understand how 
  guest and host are tightly related not just functionality but 
  also implementation wise.
 
  This is how Qemu should be doing it as well btw., to ease the 
  debugging of host/guest interaction bugs and to ease development.
 
 No. That ties the guest and host interfaces together. [...]

Why would it tie the interfaces together? For a guest and host driver 
to be written in the same familiar conventions is useful for 
debugging and development easier but does not prevent both of those 
pieces to implement a precise interface. (Especially since in the 
i8042 case the host driver is used with real hardware as well.)

 [...]  If you change the guest to use a feature you didn't 
 implement previously in the host, it's suddenly broken when 
 virtualized.
 
 The right thing is to implement the hardware interface exactly per 
 spec, and ignore the guest(s) completely except for testing and 
 performance.

You can certainly do that too - and nothing prevents from the guest 
and host driver to look and feel similar.

But note that for many of these things there no such thing as a 
'hardware spec' that describes all the details and quirks - what we 
have are host PC drivers that have been implemented via many cycles 
of trial and error. That's especially true of i8042, the example i 
have given.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 12:02, Avi Kivity wrote:

 On 07/25/2011 12:56 PM, Alexander Graf wrote:
 
   That argument can be used to block any change.  You'll get used to it in 
  time.  The question is, is the new interface better or not.
 
 I agree that it keeps you from accidently malloc'ing a struct of pointer 
 size. But couldn't we also just add this to checkpatch.pl?
 
 Better APIs trump better patch review.

Only if you enforce them. The only sensible thing for QEMU_NEW (despite the 
general rule of upper case macros, I'd actually prefer this one to be lower 
case though since it's so often used) would be to remove qemu_malloc, declare 
malloc() as unusable and convert all users of qemu_malloc() to qemu_new().


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Stefan Hajnoczi
On Mon, Jul 25, 2011 at 9:51 AM, Avi Kivity a...@redhat.com wrote:
 qemu_malloc() is type-unsafe as it returns a void pointer.  Introduce
 QEMU_NEW() (and QEMU_NEWZ()), which return the correct type.

 Signed-off-by: Avi Kivity a...@redhat.com
 ---

 This is part of my memory API patchset, but doesn't really belong there.

  qemu-common.h |    3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)

 diff --git a/qemu-common.h b/qemu-common.h
 index ba55719..66effa3 100644
 --- a/qemu-common.h
 +++ b/qemu-common.h
 @@ -186,6 +186,9 @@ void qemu_free(void *ptr);
  char *qemu_strdup(const char *str);
  char *qemu_strndup(const char *str, size_t size);

 +#define QEMU_NEW(type) ((type *)(qemu_malloc(sizeof(type
 +#define QEMU_NEWZ(type) ((type *)(qemu_mallocz(sizeof(type

Does this mean we need to duplicate the type name for each allocation?

struct foo *f;

...
f = qemu_malloc(sizeof(*f));

Becomes:

struct foo *f;

...
f = QEMU_NEW(struct foo);

If you ever change the name of the type you have to search-replace
these instances.  The idomatic C way works well, I don't see a reason
to use QEMU_NEW().

Stefan
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Avi Kivity

On 07/25/2011 01:04 PM, Alexander Graf wrote:

On 25.07.2011, at 12:02, Avi Kivity wrote:

  On 07/25/2011 12:56 PM, Alexander Graf wrote:
  
 That argument can be used to block any change.  You'll get used to it 
in time.  The question is, is the new interface better or not.

  I agree that it keeps you from accidently malloc'ing a struct of pointer 
size. But couldn't we also just add this to checkpatch.pl?

  Better APIs trump better patch review.

Only if you enforce them. The only sensible thing for QEMU_NEW (despite the 
general rule of upper case macros, I'd actually prefer this one to be lower 
case though since it's so often used) would be to remove qemu_malloc, declare 
malloc() as unusable and convert all users of qemu_malloc() to qemu_new().


Some qemu_mallocs() will remain (allocating a byte array or something 
variable sized).


I agree qemu_new() will be nicer, but that will have to wait until Blue 
is several light-days away from Earth.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Avi Kivity

On 07/25/2011 01:06 PM, Stefan Hajnoczi wrote:

char *qemu_strndup(const char *str, size_t size);

  +#define QEMU_NEW(type) ((type *)(qemu_malloc(sizeof(type
  +#define QEMU_NEWZ(type) ((type *)(qemu_mallocz(sizeof(type

Does this mean we need to duplicate the type name for each allocation?

struct foo *f;

...
f = qemu_malloc(sizeof(*f));

Becomes:

struct foo *f;

...
f = QEMU_NEW(struct foo);

If you ever change the name of the type you have to search-replace
these instances.  The idomatic C way works well, I don't see a reason
to use QEMU_NEW().


It works as long as you don't make any mistakes:

  f = qemu_malloc(sizeof(*g));
  f = qemu_malloc(sizeof(f));

qemu_malloc() returns a void pointer, these are poisonous.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Alexander Graf ag...@suse.de wrote:

  In fact one of the problems i see with Qemu is that Qemu had to 
  make many compromises to support Windows and other weird 
  platforms that i'm (and i'd claim most other Linux kernel 
  developers) are personally not interested in.
 
 It's what makes it so powerful. [...]

To me and Pekka that is what made Qemu unhackable.

Really, i'm not sure why you are arguing here. We are not trying to 
merge tools/qemu/ upstream. We are trying to merge a Linux-only 
utility that lives in the kernel tree today and which we are actively 
using and developing.

 [...] Adding a new architecture for KVM for example is as easy as 
 only implementing the CPU. All device emulation is already there. 
 If you want something Linux only, lguest would've been enough, no?

That's a rather bizarre argument, we were pretty happy with the 
design of the KVM host side, what we wanted to improve was user-space 
tooling.

With lguest we'd have to write a new host implementation in essence 
...

  [...]
 
  tools/kvm/ does less and in my experience does it better - is 
  that such a surprising thing?

 [...] 
 
  So it was a no brainer for me to pull it into -tip.
 
 The thing I don't agree with is that it should live in the kernel 
 tree.

FYI, tools/kvm/ *already* lives in the kernel tree - that is how it's 
developed and used and it also shares code with the kernel.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Avi Kivity

On 07/25/2011 01:03 PM, Ingo Molnar wrote:

Then look at the actual drivers and interfaces within tools/kvm/.
It's using the same symbols and conventions for 'guest' and
'host' side.
  
Check out tools/kvm/hw/i8042.c and match it up with
include/linux/serio.h and drivers/input/serio/i8042.c - you can
literally walk from one side to the other and understand how
guest and host are tightly related not just functionality but
also implementation wise.
  
This is how Qemu should be doing it as well btw., to ease the
debugging of host/guest interaction bugs and to ease development.

  No. That ties the guest and host interfaces together. [...]

Why would it tie the interfaces together? For a guest and host driver
to be written in the same familiar conventions is useful for
debugging and development easier but does not prevent both of those
pieces to implement a precise interface. (Especially since in the
i8042 case the host driver is used with real hardware as well.)


The driver is under no requirement to use all of the functionality of 
the device.  On the other hand, the device emulation has to be complete, 
or it risks not working with some past or future version of the driver, 
or with another OS (if you support that).



  [...]  If you change the guest to use a feature you didn't
  implement previously in the host, it's suddenly broken when
  virtualized.

  The right thing is to implement the hardware interface exactly per
  spec, and ignore the guest(s) completely except for testing and
  performance.

You can certainly do that too - and nothing prevents from the guest
and host driver to look and feel similar.

But note that for many of these things there no such thing as a
'hardware spec' that describes all the details and quirks - what we
have are host PC drivers that have been implemented via many cycles
of trial and error. That's especially true of i8042, the example i
have given.


The device need not implement any quirks - it can implement the spec 
exactly.  As long as the driver can drive a standard device, all is fine.


Specs for the 8042 certainly exist.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 12:09, Avi Kivity wrote:

 On 07/25/2011 01:04 PM, Alexander Graf wrote:
 On 25.07.2011, at 12:02, Avi Kivity wrote:
 
   On 07/25/2011 12:56 PM, Alexander Graf wrote:
   
  That argument can be used to block any change.  You'll get used to 
  it in time.  The question is, is the new interface better or not.
 
   I agree that it keeps you from accidently malloc'ing a struct of pointer 
  size. But couldn't we also just add this to checkpatch.pl?
 
   Better APIs trump better patch review.
 
 Only if you enforce them. The only sensible thing for QEMU_NEW (despite the 
 general rule of upper case macros, I'd actually prefer this one to be lower 
 case though since it's so often used) would be to remove qemu_malloc, 
 declare malloc() as unusable and convert all users of qemu_malloc() to 
 qemu_new().
 
 Some qemu_mallocs() will remain (allocating a byte array or something 
 variable sized).

Right. But then we really do need a check in checkpatch.pl to see if someone's 
using qemu_new for simple structs.

 I agree qemu_new() will be nicer, but that will have to wait until Blue is 
 several light-days away from Earth.

Blue, any disagreement on adding qemu_new() as a macro? Something used that 
often in upper case would seriously disturb the reading flow :)


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 12:16, Ingo Molnar wrote:

 
 So it was a no brainer for me to pull it into -tip.
 
 The thing I don't agree with is that it should live in the kernel 
 tree.
 
 FYI, tools/kvm/ *already* lives in the kernel tree - that is how it's 
 developed and used and it also shares code with the kernel.

So you're just trying to make this a self-fulfilling prophecy now? I already 
stated before that I could easily add those 'requirements' to any project I 
liked. That still wouldn't mean it's a good idea to clutter the _kernel_ 
repository with its code. It's not a kernel.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Alexander Graf ag...@suse.de wrote:

 On 25.07.2011, at 11:41, Ingo Molnar wrote:
 
  Virtualization is very tightly bound to the kernel, like it or 
  not. So is profiling, power management and a few other things.
  
  It's a very simple point and observation: tools which integrate 
  to the kernel so that they wouldnt even run on another kernel 
  obviously are very natural to develop in tools/.
 
 Ah, very good. So all we need to do to prove the point that 
 kvm-tool doesn't belong in tools/ is port KVM to another OS and 
 make kvm-tool compile there too? Shouldn't be too hard. People 
 already have working ports of (old) KVM versions on FreeBSD and 
 Windows.

Firstly, tools/kvm/ itself only works on Linux and it's developed in 
the kernel repo and we see many benefits of that model and are happy 
with it.

Secondly, even your argument is rather inconsistent: by your argument 
what keeps KVM itself within the Linux kernel, if it's so portable to 
FreeBSD and Windows? By your argument all the virtio drivers should 
be moved out of the Linux kernel tree, to support both the FreeBSD, 
Windows and Linux KVM implementations. By your argument the arch/x86/ 
KVM disassembler should move out of the kernel tree, etc. etc.

You cannot have it both ways really.

So yes, i disagree with you rigid views about this, in reality what 
matters is the quality of the end result and the preferences of the 
developers. Just like i cannot (and should not) force you to develop 
in tools/qemu/, should you not try to force me to not develop in 
tools/kvm/.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Avi Kivity a...@redhat.com wrote:

 On 07/25/2011 01:03 PM, Ingo Molnar wrote:
 Then look at the actual drivers and interfaces within tools/kvm/.
 It's using the same symbols and conventions for 'guest' and
 'host' side.
   
 Check out tools/kvm/hw/i8042.c and match it up with
 include/linux/serio.h and drivers/input/serio/i8042.c - you can
 literally walk from one side to the other and understand how
 guest and host are tightly related not just functionality but
 also implementation wise.
   
 This is how Qemu should be doing it as well btw., to ease the
 debugging of host/guest interaction bugs and to ease development.
 
   No. That ties the guest and host interfaces together. [...]
 
  Why would it tie the interfaces together? For a guest and host 
  driver to be written in the same familiar conventions is useful 
  for debugging and development easier but does not prevent both of 
  those pieces to implement a precise interface. (Especially since 
  in the i8042 case the host driver is used with real hardware as 
  well.)
 
 The driver is under no requirement to use all of the functionality 
 of the device.  On the other hand, the device emulation has to be 
 complete, or it risks not working with some past or future version 
 of the driver, or with another OS (if you support that).

Sure, and nothing in tools/kvm/hw/i8042.c contradicts that and my 
points still remain.

You can have clean, harmonic code *and* have it work to spec, agreed?

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Alexander Graf ag...@suse.de wrote:

 
 On 25.07.2011, at 12:16, Ingo Molnar wrote:
 
  
  So it was a no brainer for me to pull it into -tip.
  
  The thing I don't agree with is that it should live in the 
  kernel tree.
  
  FYI, tools/kvm/ *already* lives in the kernel tree - that is how 
  it's developed and used and it also shares code with the kernel.
 
 So you're just trying to make this a self-fulfilling prophecy now? 

No, i'm just trying to point out to you that it's already a fact that 
the tool already lives in a kernel tree and was developed there to 
the tune of 700+ commits within a few short months.

IMO you should not be arguing:

 'You should do X because that's my preference.'

IMO you should be arguing:

 'You should do X because technical advantage Y which is important to the 
project.'

We have listed numerous technical advantages of why we have chosen 
tools/kvm/ - and you have not countered that with any convincing 
technical disadvantage so far.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Christoph Hellwig
On Mon, Jul 25, 2011 at 10:14:13AM +0200, Alexander Graf wrote:
 So instead of thinking a bit and trying to realize that there might be a 
 reason people don't want all their user space in the kernel tree you go ahead 
 and start your own crusade of creating a new user space. Great. That's how I 
 always hoped Linux would be :(.

It's not Linux in general yet, it's mostly a crusade of a few with a
political agenda.

  So i wanted to have a lightweight tool that allows me to test KVM and 
  tools/kvm/ does that very nicely: i type './kvm run' and i can test a 
  native bzImage (which has some virtualization options enabled as 
  well) on the _host_ distro i am running, booting to a text shell 
  prompt.
 
 I do that all the time.
 
   $ qemu-kvm -nographic -kernel arch/x86/boot/bzImage -append console=ttyS0

Same here.  I can't be bothered with all the stuid distro booting crap.

  I can do that without downloading any (inevitably outdated) 
  virtualization images or maintaining my own ones. Maintaining host 
  userspace is more than enough for me.
 
 Who would need images? I usually only run -kernel and -initrd directly to 
 test out things. Or if I really want to boot into a real system I do 
 -snapshot /dev/sda.

Indeed.

 
  So, since we already have the lguest tool in the kernel tree, why 
  cannot we have the much more capable tools/kvm/ in the tree?
 
 Lguest is in Documentation/ for a reason. It's not meant as a user space tool 
 that you take as-is and use. It's meant for documenting how lguest works in 
 general. I admit though, that that's also the reason people don't use it :).

I'd also say it's rather misplaced there, and at least in the storage
area that I know most it didn't help it from totally misunderstanding
kernel concepts and taking them into protocols (e.g. virtio barrier
support).  That for me is a reason why you don't want to couple thing
too tightly, at least you'll have to document and/or explain the
protocol to someone. 
tight

And another argument, calling toyvisor2 kvm is a really bad idea.  The
kvm binary has been used for the kvm-patched qemu binary for quite a
while in various distros, so re-using that name will cause utter
confusion.

I'm happy that you guys do another independent userspace for kvm, but
please:

  a) give it a useful name
  b) just develop it where it belongs, your own little git repository
 somewhere
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread malc
On Mon, 25 Jul 2011, Alexander Graf wrote:

 
 On 25.07.2011, at 12:09, Avi Kivity wrote:
 
  On 07/25/2011 01:04 PM, Alexander Graf wrote:
  On 25.07.2011, at 12:02, Avi Kivity wrote:
  
On 07/25/2011 12:56 PM, Alexander Graf wrote:

   That argument can be used to block any change.  You'll get used to 
   it in time.  The question is, is the new interface better or not.
  
I agree that it keeps you from accidently malloc'ing a struct of 
   pointer size. But couldn't we also just add this to checkpatch.pl?
  
Better APIs trump better patch review.
  
  Only if you enforce them. The only sensible thing for QEMU_NEW (despite 
  the general rule of upper case macros, I'd actually prefer this one to be 
  lower case though since it's so often used) would be to remove 
  qemu_malloc, declare malloc() as unusable and convert all users of 
  qemu_malloc() to qemu_new().
  
  Some qemu_mallocs() will remain (allocating a byte array or something 
  variable sized).
 
 Right. But then we really do need a check in checkpatch.pl to see if 
 someone's using qemu_new for simple structs.
 
  I agree qemu_new() will be nicer, but that will have to wait until Blue is 
  several light-days away from Earth.
 
 Blue, any disagreement on adding qemu_new() as a macro? Something used 
 that often in upper case would seriously disturb the reading flow :)

So do not use it then, macros should be uppercase.

-- 
mailto:av1...@comtv.ru
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Alexander Graf ag...@suse.de wrote:

  You know, they said the same thing about oprofile. All you needed 
  to do was to write few simple shell scripts to make it work. One 
  of the key features of tools/kvm is 'as little configuration as 
  possible' and I can assure you that bash alias is really not a 
  solution for that.
 
 I like perf. Really. But I still don't see why it wouldn't be able 
 to live in its own tree either.

Is the reason that the people who develop it prefer integration with 
the kernel tree not enough for you?

perf could possibly be ported to other OSs. Maybe some day someone 
will try that. But unless that project actually replaces the perf 
project, or perf developers move out of the kernel en masse due to 
difficulties with the development model i don't see why the project 
would want to move out of the kernel tree.

In fact my observations as a perf maintainer show the exact opposite: 
most perf developers are 100% happy to get their stuff merged and 
upstream ASAP. They do not buffer big patch-queues just to not have 
to deal with an integrated kernel tree. The integrated tree is a 
natural model of development to them and often perf tooling patches 
come mixed with kernel side patches such as new tracepoints or 
cleanups/fixes to related kernel code, so it's all very convenient.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next] skbuff: clear tx zero-copy flag

2011-07-25 Thread Michael S. Tsirkin
On Mon, Jul 25, 2011 at 03:02:29AM -0700, David Miller wrote:
 From: Herbert Xu herb...@gondor.hengli.com.au
 Date: Mon, 25 Jul 2011 17:57:11 +0800
 
  However, I think we should add a WARN_ON to the splice skb path
  so that should a packet find its way through a path that we haven't
  thought of then at least we'll know about it.
 
 Good idea.

Another place like this is skb_split, I think.

-- 
MST
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Markus Armbruster
Avi Kivity a...@redhat.com writes:

 On 07/25/2011 01:04 PM, Alexander Graf wrote:
 On 25.07.2011, at 12:02, Avi Kivity wrote:

   On 07/25/2011 12:56 PM, Alexander Graf wrote:
   
  That argument can be used to block any change.  You'll get used to 
  it in time.  The question is, is the new interface better or not.
 
   I agree that it keeps you from accidently malloc'ing a struct of pointer 
  size. But couldn't we also just add this to checkpatch.pl?
 
   Better APIs trump better patch review.

 Only if you enforce them. The only sensible thing for QEMU_NEW (despite the 
 general rule of upper case macros, I'd actually prefer this one to be lower 
 case though since it's so often used) would be to remove qemu_malloc, 
 declare malloc() as unusable and convert all users of qemu_malloc() to 
 qemu_new().

 Some qemu_mallocs() will remain (allocating a byte array or something
 variable sized).

Byte array: add the obvious type-safe allocator for a variable-sized
array T[N], then use it with unsigned char for T.

In fact, I find QEMU_NEW() pretty pointless without a buddy for arrays.

Still not covered: allocating a struct with a variable-size array as
final member.  I guess a solution for that can be found if we care
enough.

[...]
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Markus Armbruster
Kevin Wolf kw...@redhat.com writes:

 Am 25.07.2011 12:06, schrieb Stefan Hajnoczi:
 On Mon, Jul 25, 2011 at 9:51 AM, Avi Kivity a...@redhat.com wrote:
 qemu_malloc() is type-unsafe as it returns a void pointer.  Introduce
 QEMU_NEW() (and QEMU_NEWZ()), which return the correct type.

 Signed-off-by: Avi Kivity a...@redhat.com
 ---

 This is part of my memory API patchset, but doesn't really belong there.

  qemu-common.h |3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)

 diff --git a/qemu-common.h b/qemu-common.h
 index ba55719..66effa3 100644
 --- a/qemu-common.h
 +++ b/qemu-common.h
 @@ -186,6 +186,9 @@ void qemu_free(void *ptr);
  char *qemu_strdup(const char *str);
  char *qemu_strndup(const char *str, size_t size);

 +#define QEMU_NEW(type) ((type *)(qemu_malloc(sizeof(type
 +#define QEMU_NEWZ(type) ((type *)(qemu_mallocz(sizeof(type
 
 Does this mean we need to duplicate the type name for each allocation?
 
 struct foo *f;
 
 ...
 f = qemu_malloc(sizeof(*f));
 
 Becomes:
 
 struct foo *f;
 
 ...
 f = QEMU_NEW(struct foo);

 Maybe we should allow this and make it the usual pattern:

 f = qemu_new(typeof(*f));

 It's gcc specific, but we already don't care about portability to other
 compilers in more places.

 On the other hand, how many bugs did we have recently that were caused
 by a wrong sizeof for qemu_malloc? As far as I can say, there's no real
 reason to do it. I think it's the same kind of discussion as with
 forbidding qemu_malloc(0) (except that this time it just won't improve
 things much instead of being really stupid).

Side-stepping the stupid OMG malloc(0) is weird, therefore we must make
qemu_malloc(0) differently weird controversy would be useful all by
itself.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Christoph Hellwig h...@infradead.org wrote:

   So, since we already have the lguest tool in the kernel tree, 
   why cannot we have the much more capable tools/kvm/ in the 
   tree?
  
  Lguest is in Documentation/ for a reason. It's not meant as a 
  user space tool that you take as-is and use. It's meant for 
  documenting how lguest works in general. I admit though, that 
  that's also the reason people don't use it :).
 
 I'd also say it's rather misplaced there, and at least in the 
 storage area that I know most it didn't help it from totally 
 misunderstanding kernel concepts and taking them into protocols 
 (e.g. virtio barrier support).  That for me is a reason why you 
 don't want to couple thing too tightly, at least you'll have to 
 document and/or explain the protocol to someone.

It's funny that you bring up ABIs and virtio, as in my experience 
it's usually this pattern that creates bad ABIs in Linux:

 1) there's an out of tree user-space project, written by a group of 
developers, proposing some new feature for which they'd need 
kernel help

 2) they talk to another set of Linux kernel developers, who come up 
with something that works - still out of tree

 3) they then talk to upstream Linux and the whole review process 
starts. The ABI changes in some minor ways but often survives.

 4) nobody is risking any ABI changes because the distance from 
upstream to the actual user-space project is 3 boundaries. 
Whatever ad-hoc ABI was proposed initially is hammered through
if possible, unless it's 100% unworkable.

 5) upstream often bends for pragmatic reasons, and it's at most 
someone at the top of the maintenance hierarchy who says
'hell NO!' and forces an (expensive!) reiteration of all 5 steps.

I've also seen a couple of good ABIs, which had a very natural 
development cycle:

 1) developers working both upstream and in a user-space project 
propose an ABI and quickly iterate through several versions. Both 
the kernel side and the user-space side changes frequently and 
iteratively to perfect the ABI and the whole process is highly 
visible on lkml and gets review at every stage of this work.

Do you recognize what i'm trying to point out?

While moving a user-space tool project to tools/ certainly does not 
*guarantee* good ABIs, it *does* guarantee friction-less iterations 
of ABIs - which are much harder to achieve with out of tree projects.

We've done this numerous times for the perf ABI and while the ABI is 
far from perfect it worked very well for us - far better than it 
would have worked as a separate project.

In such an integrated tool/kernel tree bugs are also much easier to 
debug: as we can cleanly bisect across interim versions of the ABI, 
as the tools and the kernel side ABI is developed in lock-step. We do 
not have to create cross-compatibility between every interim version 
of the ABI and any future (or past) version of the tool ...

So yes, my first hand experience shows that you are 100% wrong:

 - 'slow-play', out of tree, modularized, to-the-specification ABIs 
   tend to suck

 - 'quickly iterated', in tree, unified, specified-on-lkml ABIs tend 
   to work out much better

There's exceptions, but that's the general trend.

Fact is that developing ABIs within an integrated project is 
*amazingly* powerful. You should try it one day, instead of 
criticizing it :-)

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 12:59, Markus Armbruster wrote:

 Avi Kivity a...@redhat.com writes:
 
 On 07/25/2011 01:04 PM, Alexander Graf wrote:
 On 25.07.2011, at 12:02, Avi Kivity wrote:
 
 On 07/25/2011 12:56 PM, Alexander Graf wrote:
 
  That argument can be used to block any change.  You'll get used to it 
 in time.  The question is, is the new interface better or not.
 
 I agree that it keeps you from accidently malloc'ing a struct of pointer 
 size. But couldn't we also just add this to checkpatch.pl?
 
 Better APIs trump better patch review.
 
 Only if you enforce them. The only sensible thing for QEMU_NEW (despite the 
 general rule of upper case macros, I'd actually prefer this one to be lower 
 case though since it's so often used) would be to remove qemu_malloc, 
 declare malloc() as unusable and convert all users of qemu_malloc() to 
 qemu_new().
 
 Some qemu_mallocs() will remain (allocating a byte array or something
 variable sized).
 
 Byte array: add the obvious type-safe allocator for a variable-sized
 array T[N], then use it with unsigned char for T.
 
 In fact, I find QEMU_NEW() pretty pointless without a buddy for arrays.

#define QEMU_NEW_MULTI(type, len) ((type *)(qemu_mallocz(sizeof(type) * len)))

char *arr = QEMU_NEW_MULTI(char, 1024);

 Still not covered: allocating a struct with a variable-size array as
 final member.  I guess a solution for that can be found if we care
 enough.

Yeah, but at the end of the day I'd assume most of us know C and can just open 
code this all, no?


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Christoph Hellwig
On Mon, Jul 25, 2011 at 01:08:10PM +0200, Ingo Molnar wrote:
 Fact is that developing ABIs within an integrated project is 
 *amazingly* powerful. You should try it one day, instead of 
 criticizing it :-)

I've been doing this long before you declare it the rosetta stone.  Some
of the worst ABIs I know come from that kind of development, e.g. all
the ioctls we have in xfs, inherited from the tightly integrated IRIX
development cycle.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Christoph Hellwig h...@infradead.org wrote:

 On Mon, Jul 25, 2011 at 01:08:10PM +0200, Ingo Molnar wrote:

  Fact is that developing ABIs within an integrated project is 
  *amazingly* powerful. You should try it one day, instead of 
  criticizing it :-)
 
 I've been doing this long before you declare it the rosetta stone.  
 Some of the worst ABIs I know come from that kind of development, 
 e.g. all the ioctls we have in xfs, inherited from the tightly 
 integrated IRIX development cycle.

As i said it's not a guarantee, you can mess up even under the best 
of circumstances.

Also, arguably a filesystem is constrainted wrt. ABIs: it's just one 
filesystem which pretty much pushes it towards ioctls which in turn 
is often an unstructured breeding ground for mess.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Olivier Galibert
On Mon, Jul 25, 2011 at 07:24:12AM -0400, Christoph Hellwig wrote:
 On Mon, Jul 25, 2011 at 01:08:10PM +0200, Ingo Molnar wrote:
  Fact is that developing ABIs within an integrated project is 
  *amazingly* powerful. You should try it one day, instead of 
  criticizing it :-)
 
 I've been doing this long before you declare it the rosetta stone.  Some
 of the worst ABIs I know come from that kind of development, e.g. all
 the ioctls we have in xfs, inherited from the tightly integrated IRIX
 development cycle.

You need someone with taste in the loop.  But if you do, evolved is
always better than designed before you actually know what you need.

As I'm sure you perfectly know, for the matter.

  OG.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Avi Kivity

On 07/25/2011 12:32 PM, Alexander Graf wrote:

On 25.07.2011, at 10:51, Avi Kivity wrote:

  qemu_malloc() is type-unsafe as it returns a void pointer.  Introduce
  QEMU_NEW() (and QEMU_NEWZ()), which return the correct type.

What does this buy you over

type *x = qemu_malloc(sizeof(type));

? I find the non-C++ version easier to read even.



I'm using it as

MemoryRegion *phys_flash = QEMU_NEW(MemoryRegion);

instead of

MemoryRegion *phys_flash = qemu_malloc(sizeof(*phys_flash));

I find it shorter, and if I make a mistake, the compiler shouts at me 
instead of a runtime crash.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Olivier Galibert
On Mon, Jul 25, 2011 at 11:03:32AM +0200, Alexander Graf wrote:
 It's very hard to understand. It's similar to religion - I could
 easily apply your point to every reasonably low-level user space
 project out there. X for example. X needs to interact with KMS and
 DRI and whatdoiknow. So it'd be a perfect fit to pull into tools/,
 no?

The current development methodology of X11/mesa/v*api is catastrophic,
as the state of the linux graphics attest.  Having a functional
equivalent in tools would be an immense step forward, if simply by
having all the damn related code in the same place, but also because
of the interactions you cite.

  OG.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Christoph Hellwig
On Mon, Jul 25, 2011 at 01:34:25PM +0200, Olivier Galibert wrote:
 You need someone with taste in the loop.  But if you do, evolved is
 always better than designed before you actually know what you need.
 
 As I'm sure you perfectly know, for the matter.

Neither is actually helpful.  You reall want reference implementation
on both sides of an ABI, _and_ documentation.  And yes, usually you'll
need a few iterations over it.  In theory you can do that in a tightly
integrated enviroment as well, but practice shows simply boilds down
to commiting the bloody thing.

I'm also not sure why we even bother to focus with this side-line
discussion.  It's not like the kvm (as in the kernel kvm module)
developers have written the kvm tools.  It's just another userspace for
the kvm userspace (the fifth if I count correctly), and so far the
reference and often only implementation of any new kvm module feature
is for qemu-kvm.  So no matter where kvm tools lives, if you guys one
day actually start doing major kvm core features it will still evolve
discussing them with the main consumer of those interfaces.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Avi Kivity

On 07/25/2011 02:02 PM, Markus Armbruster wrote:

Side-stepping the stupid OMG malloc(0) is weird, therefore we must make
qemu_malloc(0) differently weird controversy would be useful all by
itself.


If we all work together, we can make this thread even bigger than the 
tools/kvm pull request.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Ingo Molnar

* Christoph Hellwig h...@infradead.org wrote:

 On Mon, Jul 25, 2011 at 10:14:13AM +0200, Alexander Graf wrote:

  So instead of thinking a bit and trying to realize that there 
  might be a reason people don't want all their user space in the 
  kernel tree you go ahead and start your own crusade of creating a 
  new user space. Great. That's how I always hoped Linux would be 
  :(.
 
 It's not Linux in general yet, it's mostly a crusade of a few with 
 a political agenda.

I find it weird (and insulting) that you are calling our patches that 
implement better tolling a 'crusade' done under a 'political agenda'.

I'd rather say that the extremist position is yours: to keep 
everything separated always, all the time and ban tools from the 
kernel repo.

You have declared it a failure from the very start and if it were up 
to you it would never get a fair chance.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/5] ioeventfd: Introduce KVM_IOEVENTFD_FLAG_SOCKET

2011-07-25 Thread Sasha Levin
On Wed, 2011-07-13 at 10:07 +0300, Avi Kivity wrote:
 Second, introducing a new type of exit doesn't mean we see more exits.
 
 Third, the new type of exit is probably not needed - we can use the 
 existing mmio/pio exits, and document that in certain cases the kernel 
 will fall back to these instead of delivering the I/O via the sockets 
 (userspace can then try writing itself).

Just waking this up since I want to send a new version and just want to
cover some things before that.

The problem with the original implementation was that if we receive a
signal while we wait for the host to provide a value to be read, we must
abort the operation and exit to do the signal.

What this caused was that read operations with side effects would break
(for example, when reading a byte would change the value in that byte).

The original plan was to notify the host that we ignored his answer via
the socket, and it should provide the response again via regular MMIO
exit, but I couldn't find a good way to pass it through the MMIO exit.
Also, This would complicate this operation on the host quite a bit.

What I did instead was to assume that if the socket write notifying the
host of a read operation went through ok, we can block on the socket
read request.

Does it sound ok? I know it's not what was originally planned, but to me
it looked like the most efficient approach.


-- 

Sasha.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Anthony Liguori

On 07/25/2011 03:51 AM, Avi Kivity wrote:

qemu_malloc() is type-unsafe as it returns a void pointer.  Introduce
QEMU_NEW() (and QEMU_NEWZ()), which return the correct type.


Just use g_new() and g_new0()

Regards,

Anthony Liguori



Signed-off-by: Avi Kivitya...@redhat.com
---

This is part of my memory API patchset, but doesn't really belong there.

  qemu-common.h |3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/qemu-common.h b/qemu-common.h
index ba55719..66effa3 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -186,6 +186,9 @@ void qemu_free(void *ptr);
  char *qemu_strdup(const char *str);
  char *qemu_strndup(const char *str, size_t size);

+#define QEMU_NEW(type) ((type *)(qemu_malloc(sizeof(type
+#define QEMU_NEWZ(type) ((type *)(qemu_mallocz(sizeof(type
+
  void qemu_mutex_lock_iothread(void);
  void qemu_mutex_unlock_iothread(void);



--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/5] ioeventfd: Introduce KVM_IOEVENTFD_FLAG_SOCKET

2011-07-25 Thread Avi Kivity

On 07/25/2011 03:10 PM, Sasha Levin wrote:

On Wed, 2011-07-13 at 10:07 +0300, Avi Kivity wrote:
  Second, introducing a new type of exit doesn't mean we see more exits.

  Third, the new type of exit is probably not needed - we can use the
  existing mmio/pio exits, and document that in certain cases the kernel
  will fall back to these instead of delivering the I/O via the sockets
  (userspace can then try writing itself).

Just waking this up since I want to send a new version and just want to
cover some things before that.

The problem with the original implementation was that if we receive a
signal while we wait for the host to provide a value to be read, we must
abort the operation and exit to do the signal.

What this caused was that read operations with side effects would break
(for example, when reading a byte would change the value in that byte).

The original plan was to notify the host that we ignored his answer via
the socket, and it should provide the response again via regular MMIO
exit, but I couldn't find a good way to pass it through the MMIO exit.
Also, This would complicate this operation on the host quite a bit.

What I did instead was to assume that if the socket write notifying the
host of a read operation went through ok, we can block on the socket
read request.

Does it sound ok? I know it's not what was originally planned, but to me
it looked like the most efficient approach.


You can't block when a signal is pending.  You can block, however, after 
you've exited with -EINTR and re-entered.


We need to document that if a vcpu exited with -EINTR, then any socket 
memory transactions need to be flushed before the vcpu's state can be 
considered stable (for live migration).  In fact it's true for any kind 
of exit.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Avi Kivity

On 07/25/2011 03:11 PM, Anthony Liguori wrote:

On 07/25/2011 03:51 AM, Avi Kivity wrote:

qemu_malloc() is type-unsafe as it returns a void pointer.  Introduce
QEMU_NEW() (and QEMU_NEWZ()), which return the correct type.


Just use g_new() and g_new0()



These bypass qemu_malloc().  Are we okay with that?

I suppose so, since many library functions can allocate memory and 
bypass qemu_malloc()?


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Anthony Liguori

On 07/25/2011 06:11 AM, Alexander Graf wrote:


#define QEMU_NEW_MULTI(type, len) ((type *)(qemu_mallocz(sizeof(type) * len)))

char *arr = QEMU_NEW_MULTI(char, 1024);


Still not covered: allocating a struct with a variable-size array as
final member.  I guess a solution for that can be found if we care
enough.


Yeah, but at the end of the day I'd assume most of us know C and can just open 
code this all, no?


While it's always fun to reinvent things, glib has already solved all of 
this and we're already dependent on it in the build:


http://developer.gnome.org/glib/stable/glib-Memory-Allocation.html

It also has fancy ways to hook memory allocation for debugging.

Regards,

Anthony Liguori



Alex




--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Anthony Liguori

On 07/25/2011 07:18 AM, Avi Kivity wrote:

On 07/25/2011 03:11 PM, Anthony Liguori wrote:

On 07/25/2011 03:51 AM, Avi Kivity wrote:

qemu_malloc() is type-unsafe as it returns a void pointer. Introduce
QEMU_NEW() (and QEMU_NEWZ()), which return the correct type.


Just use g_new() and g_new0()



These bypass qemu_malloc(). Are we okay with that?


Yes.  We can just make qemu_malloc use g_malloc.


I suppose so, since many library functions can allocate memory and
bypass qemu_malloc()?


Right.

Regards,

Anthony Liguori





--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Kevin Wolf
Am 25.07.2011 10:30, schrieb Pekka Enberg:
 Hi Alexander,
 
 On Mon, Jul 25, 2011 at 11:14 AM, Alexander Graf ag...@suse.de wrote:
 So i wanted to have a lightweight tool that allows me to test KVM and
 tools/kvm/ does that very nicely: i type './kvm run' and i can test a
 native bzImage (which has some virtualization options enabled as
 well) on the _host_ distro i am running, booting to a text shell
 prompt.

 I do that all the time.

  $ qemu-kvm -nographic -kernel arch/x86/boot/bzImage -append console=ttyS0

 does the exact same thing. If that's too much typing for you, make it a bash 
 alias.
 
 You know, they said the same thing about oprofile. All you needed to do was to
 write few simple shell scripts to make it work. One of the key
 features of tools/kvm
 is 'as little configuration as possible' and I can assure you that
 bash alias is really
 not a solution for that.

You've just chosen a different default. I'd argue that most users (i.e.
not developers of the tool or the kernel) actually want to run with a
disk image and graphics. You can type qemu-kvm harddisk.img and that's
it. This is clearly superior to something as tedious as ./kvm run -d
harddisk.img --sdl and I can assure you that a bash alias is really not
a solution for that.

So, as always, which set of command line switches works better for you
depends entirely on your use case.

Kevin
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1 1/1] Submit the codes for QEMU disk I/O limits.

2011-07-25 Thread Stefan Hajnoczi
On Mon, Jul 25, 2011 at 8:08 AM, Zhi Yong Wu zwu.ker...@gmail.com wrote:
 On Fri, Jul 22, 2011 at 6:54 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 On Fri, Jul 22, 2011 at 10:20 AM, Zhi Yong Wu wu...@linux.vnet.ibm.com 
 wrote:
 +    elapsed_time  = (real_time - bs-slice_start[is_write]) / 10.0;
 +    fprintf(stderr, real_time = %ld, slice_start = %ld, elapsed_time = 
 %g\n, real_time, bs-slice_start[is_write], elapsed_time);
 +
 +    bytes_limit        = bps_limit * slice_time;
 +    bytes_disp  = bs-io_disps-bytes[is_write];
 +    if (bs-io_limits-bps[2]) {
 +        bytes_disp += bs-io_disps-bytes[!is_write];
 +    }
 +
 +    bytes_res   = (unsigned) nb_sectors * BDRV_SECTOR_SIZE;
 +    fprintf(stderr, bytes_limit = %g bytes_disp = %g, bytes_res = %g, 
 elapsed_time = %g\n, bytes_limit, bytes_disp, bytes_res, elapsed_time);
 +
 +    if (bytes_disp + bytes_res = bytes_limit) {
 +        if (wait) {
 +            *wait = 0;
 +        }
 +
 +       fprintf(stderr, bytes_disp + bytes_res = bytes_limit\n);
 +        return false;
 +    }
 +
 +    /* Calc approx time to dispatch */
 +    wait_time = (bytes_disp + bytes_res - bytes_limit) / bps_limit;
 +    if (!wait_time) {
 +        wait_time = 1;
 +    }
 +
 +    wait_time = wait_time + (slice_time - elapsed_time);
 +    if (wait) {
 +       *wait = wait_time * 10 + 1;
 +    }
 +
 +    return true;
 +}

 After a slice expires all bytes/iops dispatched data is forgotten,
 even if there are still requests queued.  This means that requests
 issued by the guest immediately after a new 100 ms period will be
 issued but existing queued requests will still be waiting.

 And since the queued requests don't get their next chance until later,
 it's possible that they will be requeued because the requests that the
 guest snuck in have brought us to the limit again.

 In order to solve this problem, we need to extend the current slice if
 Extend the current slice? like in-kernel block throttling algorithm?
 Our algorithm seems not to adopt it currently.

I'm not sure if extending the slice is necessary as long as new
requests are queued while previous requests are still queued.  But
extending slices is one way to deal with requests that span across
multiple slices.  See below.

 there are still requests pending.  To prevent extensions from growing
 the slice forever (and keeping too much old data around), it should be
 alright to cull data from 2 slices ago.  The simplest way of doing
 that is to subtract the bps/iops limits from the bytes/iops
 dispatched.
 You mean that the largest value of current_slice_time is not more than
 2 slice_time?

Yes.  If no single request is larger than the I/O limit, then the
timer value for a queued request should always be within the next
slice.  Therefore everything before last slice should be completed
already and we don't need to keep that history around.


 @@ -2129,6 +2341,19 @@ BlockDriverAIOCB *bdrv_aio_readv(BlockDriverState 
 *bs, int64_t sector_num,
     if (bdrv_check_request(bs, sector_num, nb_sectors))
         return NULL;

 +    /* throttling disk read I/O */
 +    if (bs-io_limits != NULL) {
 +       if (bdrv_exceed_io_limits(bs, nb_sectors, false, wait_time)) {
 +            ret = qemu_block_queue_enqueue(bs-block_queue, bs, 
 bdrv_aio_readv,
 +                               sector_num, qiov, nb_sectors, cb, opaque);

 5 space indent, should be 4.

 +           fprintf(stderr, bdrv_aio_readv: wait_time = %ld, timer value = 
 %ld\n, wait_time, wait_time + qemu_get_clock_ns(vm_clock));
 +           qemu_mod_timer(bs-block_timer, wait_time + 
 qemu_get_clock_ns(vm_clock));

 Imagine 3 requests that need to be queued: A, B, and C.  Since the
 timer is unconditionally set each time a request is queued, the timer
 will be set to C's wait_time.  However A or B's wait_time might be
 earlier and we will miss that deadline!
 Yeah, exactly there is this issue.

 We really need a priority queue here.  QEMU's timers solve the same
 problem with a sorted list, which might actually be faster for short
 lists where a fancy data structure has too much overhead.
 You mean the block requests should be handled in FIFO way in order?
 If the block queue is not empty, should this coming request be
 enqueued at first? right?

Yes.  If the limit was previously exceeded, enqueue new requests immediately:

/* If a limit was exceeded, immediately queue this request */
if (!QTAILQ_EMPTY(queue-requests)) {
if (limits-bps[IO_LIMIT_TOTAL]) {
/* queue any rd/wr request */
} else if (limits-bps[is_write]  another_request_is_queued[is_write]) {
/* queue if the rd/wr-specific limit was previously exceeded */
}

...same for iops...
}

This way new requests cannot skip ahead of queued requests due to the
lost history when a new slice starts.


 +BlockDriverAIOCB *qemu_block_queue_enqueue(BlockQueue *queue,
 +                       BlockDriverState *bs,
 +                       BlockRequestHandler 

Re: [PATCH 5/5] ioeventfd: Introduce KVM_IOEVENTFD_FLAG_SOCKET

2011-07-25 Thread Sasha Levin
On Mon, 2011-07-25 at 15:16 +0300, Avi Kivity wrote:
 On 07/25/2011 03:10 PM, Sasha Levin wrote:
  On Wed, 2011-07-13 at 10:07 +0300, Avi Kivity wrote:
Second, introducing a new type of exit doesn't mean we see more exits.
  
Third, the new type of exit is probably not needed - we can use the
existing mmio/pio exits, and document that in certain cases the kernel
will fall back to these instead of delivering the I/O via the sockets
(userspace can then try writing itself).
 
  Just waking this up since I want to send a new version and just want to
  cover some things before that.
 
  The problem with the original implementation was that if we receive a
  signal while we wait for the host to provide a value to be read, we must
  abort the operation and exit to do the signal.
 
  What this caused was that read operations with side effects would break
  (for example, when reading a byte would change the value in that byte).
 
  The original plan was to notify the host that we ignored his answer via
  the socket, and it should provide the response again via regular MMIO
  exit, but I couldn't find a good way to pass it through the MMIO exit.
  Also, This would complicate this operation on the host quite a bit.
 
  What I did instead was to assume that if the socket write notifying the
  host of a read operation went through ok, we can block on the socket
  read request.
 
  Does it sound ok? I know it's not what was originally planned, but to me
  it looked like the most efficient approach.
 
 You can't block when a signal is pending.  You can block, however, after 
 you've exited with -EINTR and re-entered.
 

What would happen with the MMIO then? I need to provide an answer before
I leave the read/write functions to exit with -EINTR, no?

 We need to document that if a vcpu exited with -EINTR, then any socket 
 memory transactions need to be flushed before the vcpu's state can be 
 considered stable (for live migration).  In fact it's true for any kind 
 of exit.
 

-- 

Sasha.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Anthony Liguori

On 07/25/2011 04:52 AM, Avi Kivity wrote:

On 07/25/2011 12:48 PM, Peter Maydell wrote:

On 25 July 2011 10:32, Alexander Grafag...@suse.de wrote:
 On 25.07.2011, at 10:51, Avi Kivity wrote:
 qemu_malloc() is type-unsafe as it returns a void pointer. Introduce
 QEMU_NEW() (and QEMU_NEWZ()), which return the correct type.

 What does this buy you over

 type *x = qemu_malloc(sizeof(type));

 ? I find the non-C++ version easier to read even.

Yeah, while we're writing in C we should just stick to the C-like
APIs, it's less confusing IMHO than wrapping it up in something else.


That argument can be used to block any change. You'll get used to it in
time. The question is, is the new interface better or not.


I assume Anthony's new object model stuff will have a create me a
new foo object API anyway, so QEMU_NEW() is possibly a bit of a
namespace grab.


Anthony's stuff is at a much higher level, hopefully he'll come back to
the ground one day.


The point of introducing glib was to address things like this.  We need 
to start making heavier use of what it provides.


Regards,

Anthony Liguori


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Pekka Enberg
Hi Christoph,

On Mon, 2011-07-25 at 06:38 -0400, Christoph Hellwig wrote:
 On Mon, Jul 25, 2011 at 10:14:13AM +0200, Alexander Graf wrote:
  So instead of thinking a bit and trying to realize that there might be a 
  reason people don't want all their user space in the kernel tree you go 
  ahead and start your own crusade of creating a new user space. Great. 
  That's how I always hoped Linux would be :(.
 
 It's not Linux in general yet, it's mostly a crusade of a few with a
 political agenda.

Would you mind sharing with the rest of us who exactly those few are and
what is their political agenda? I certainly don't recognize myself or
any of the development team from your description above. And to be
completely honest, I think that's one of the single most stupid comments
I've read in this thread.

On Mon, 2011-07-25 at 06:38 -0400, Christoph Hellwig wrote:
 And another argument, calling toyvisor2 kvm is a really bad idea.  The
 kvm binary has been used for the kvm-patched qemu binary for quite a
 while in various distros, so re-using that name will cause utter
 confusion.

I don't foresee 'utter confusion'. The code definitely wants to live in
tools/kvm so it's logical that the executable name is called 'kvm'. If
distros want to package it with a different name, they're free to do so.

 I'm happy that you guys do another independent userspace for kvm, but
 please:
 
   a) give it a useful name

I don't think toyvisor2 will do, really. I originally wanted to call the
hypervisor 'pvm' for selfish reasons because quite frankly 'penix' just
doesn't have the prestige 'linux' has. However, Ingo convinced me that
'kvm' is a good name and I agree with that.

I'm open to alternative suggestions, though.

   b) just develop it where it belongs, your own little git repository
  somewhere

I already explained why we want it in the kernel tree and what kind of
benefits we think there are. I also explained that we want to reuse and
share code with perf, for example. If you really want to argue against
merging, you need to cough up some actual disadvantages or refute the
suggested advantages.

Pekka

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Introduce QEMU_NEW()

2011-07-25 Thread Avi Kivity

On 07/25/2011 03:21 PM, Anthony Liguori wrote:

On 07/25/2011 07:18 AM, Avi Kivity wrote:

On 07/25/2011 03:11 PM, Anthony Liguori wrote:

On 07/25/2011 03:51 AM, Avi Kivity wrote:

qemu_malloc() is type-unsafe as it returns a void pointer. Introduce
QEMU_NEW() (and QEMU_NEWZ()), which return the correct type.


Just use g_new() and g_new0()



These bypass qemu_malloc(). Are we okay with that?


Yes.  We can just make qemu_malloc use g_malloc.



Excellent.  Patch withdrawn.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Pekka Enberg
Hi Kevin,

On Mon, 2011-07-25 at 14:24 +0200, Kevin Wolf wrote:
 You've just chosen a different default. I'd argue that most users (i.e.
 not developers of the tool or the kernel) actually want to run with a
 disk image and graphics. You can type qemu-kvm harddisk.img and that's
 it. This is clearly superior to something as tedious as ./kvm run -d
 harddisk.img --sdl and I can assure you that a bash alias is really not
 a solution for that.

I actually agree with you and we eventually want

  kvm run harddisk.img

to do the right thing.

On Mon, 2011-07-25 at 14:24 +0200, Kevin Wolf wrote:
 So, as always, which set of command line switches works better for you
 depends entirely on your use case.

I actually don't agree. I think Qemu requires way too much configuration
from the user and doesn't try hard enough to provide best possible
defaults. Dunno how much virt-manager changes all that, though.

Pekka

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Pekka Enberg
On Mon, Jul 25, 2011 at 12:06 PM, Alexander Graf ag...@suse.de wrote:
 And don't get this the wrong way either, I'm not hostile against other
 operating systems, but I simply am not interested enough in them to
 spend my time improving them.

 Then kvm-tool is about as useful as Mac-on-Linux. Why don't we have MoL user 
 land in the kernel? I even added support for KVM to it about a year ago. So 
 all I need to do is change it to the kernel coding style, add some 
 dependencies on kernel headers and I'm good for a pull request?

Oh, I dunno - have you tried that? If you're interested in sending a
patch against tools/kvm that adds KVM-based support to Mac, I'm happy
to review it and consider for inclusion. ;-)

Pekka
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Avi Kivity

On 07/25/2011 03:41 PM, Pekka Enberg wrote:

On Mon, 2011-07-25 at 14:24 +0200, Kevin Wolf wrote:
  So, as always, which set of command line switches works better for you
  depends entirely on your use case.

I actually don't agree. I think Qemu requires way too much configuration
from the user and doesn't try hard enough to provide best possible
defaults.


qemu has a huge wealth of options so the command line is incredibly 
complicated.  Changing defaults is hard due to backward compatibility 
constraints.



Dunno how much virt-manager changes all that, though.


virt-manager is a GUI interface, no command line at all.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Avi Kivity

On 07/25/2011 03:45 PM, Pekka Enberg wrote:

On Mon, Jul 25, 2011 at 12:06 PM, Alexander Grafag...@suse.de  wrote:
  And don't get this the wrong way either, I'm not hostile against other
  operating systems, but I simply am not interested enough in them to
  spend my time improving them.

  Then kvm-tool is about as useful as Mac-on-Linux. Why don't we have MoL user 
land in the kernel? I even added support for KVM to it about a year ago. So all I 
need to do is change it to the kernel coding style, add some dependencies on 
kernel headers and I'm good for a pull request?

Oh, I dunno - have you tried that? If you're interested in sending a
patch against tools/kvm that adds KVM-based support to Mac, I'm happy
to review it and consider for inclusion. ;-)


The patch will be a lot bigger than tools/kvm itself.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Fix svm unit test

2011-07-25 Thread Conny Seidel
Since the -enable-nesting option is no longer available remove the
option from the config file.

Signed-off-by: Conny Seidel conny.sei...@amd.com
---
 x86/unittests.cfg |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index 228ac1d..065020a 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -57,7 +57,7 @@ file = rmap_chain.flat
 [svm]
 file = svm.flat
 smp = 2
-extra_params = -enable-nesting -cpu qemu64,+svm
+extra_params = -cpu qemu64,+svm
 
 [svm-disabled]
 file = svm.flat
-- 
1.7.2.3


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 14:47, Avi Kivity wrote:

 On 07/25/2011 03:45 PM, Pekka Enberg wrote:
 On Mon, Jul 25, 2011 at 12:06 PM, Alexander Grafag...@suse.de  wrote:
   And don't get this the wrong way either, I'm not hostile against other
   operating systems, but I simply am not interested enough in them to
   spend my time improving them.
 
   Then kvm-tool is about as useful as Mac-on-Linux. Why don't we have MoL 
  user land in the kernel? I even added support for KVM to it about a year 
  ago. So all I need to do is change it to the kernel coding style, add some 
  dependencies on kernel headers and I'm good for a pull request?
 
 Oh, I dunno - have you tried that? If you're interested in sending a
 patch against tools/kvm that adds KVM-based support to Mac, I'm happy
 to review it and consider for inclusion. ;-)
 
 The patch will be a lot bigger than tools/kvm itself.

And it will be full of ugly hacks and code that needs to be compiled on a Mac. 
I was actually trying to pick an example of a project that you really don't 
want to have in the kernel tree.

One point stays true however with MoL. I added support for KVM a year ago and 
haven't had to touch it since. It just works. Because we have a pretty stable 
kernel/user space interface. So the whole thing about having user space and 
kernel code live together in the same tree is moot there.

And if you need to share code with perf, then that just proves the point that 
perf doesn't belong inside the kernel tree either.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Sasha Levin
On Mon, 2011-07-25 at 15:47 +0300, Avi Kivity wrote:
 On 07/25/2011 03:45 PM, Pekka Enberg wrote:
  On Mon, Jul 25, 2011 at 12:06 PM, Alexander Grafag...@suse.de  wrote:
And don't get this the wrong way either, I'm not hostile against other
operating systems, but I simply am not interested enough in them to
spend my time improving them.
  
Then kvm-tool is about as useful as Mac-on-Linux. Why don't we have MoL 
   user land in the kernel? I even added support for KVM to it about a year 
   ago. So all I need to do is change it to the kernel coding style, add 
   some dependencies on kernel headers and I'm good for a pull request?
 
  Oh, I dunno - have you tried that? If you're interested in sending a
  patch against tools/kvm that adds KVM-based support to Mac, I'm happy
  to review it and consider for inclusion. ;-)
 
 The patch will be a lot bigger than tools/kvm itself.
 

But it would be as a single patch series which includes kvm and
tools/kvm :)

-- 

Sasha.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Alexander Graf

On 25.07.2011, at 14:51, Sasha Levin wrote:

 On Mon, 2011-07-25 at 15:47 +0300, Avi Kivity wrote:
 On 07/25/2011 03:45 PM, Pekka Enberg wrote:
 On Mon, Jul 25, 2011 at 12:06 PM, Alexander Grafag...@suse.de  wrote:
 And don't get this the wrong way either, I'm not hostile against other
 operating systems, but I simply am not interested enough in them to
 spend my time improving them.
 
 Then kvm-tool is about as useful as Mac-on-Linux. Why don't we have MoL 
 user land in the kernel? I even added support for KVM to it about a year 
 ago. So all I need to do is change it to the kernel coding style, add some 
 dependencies on kernel headers and I'm good for a pull request?
 
 Oh, I dunno - have you tried that? If you're interested in sending a
 patch against tools/kvm that adds KVM-based support to Mac, I'm happy
 to review it and consider for inclusion. ;-)
 
 The patch will be a lot bigger than tools/kvm itself.
 
 
 But it would be as a single patch series which includes kvm and
 tools/kvm :)

Not sure I get what you mean here. We already have KVM support for Macs ;). And 
pSeries. And the PS3. It even works on the Wii.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Native Linux KVM tool for 3.1

2011-07-25 Thread Paolo Bonzini

On 07/25/2011 09:50 AM, Sasha Levin wrote:

Anthony had a talk on last years KVM forum regarding the QEMU threading
model (slide:
http://www.linux-kvm.org/wiki/images/7/70/2010-forum-threading-qemu.pdf) .

It was suggested that the KVM part of QEMU is having a hard time
achieving the ideal threading model due to its need to support TCG -
something which has nothing to do with KVM itself.


No, it is not having a hard time.  The foot in the door that Anthony 
mentions has been part of QEMU and qemu-kvm for a long time 
(multi-threading is necessary to support SMP!) and works quite well.



Historically, there were three main loops:

1) QEMU single-threaded;

2) QEMU multi-threaded; clean, but buggy, untested and bitrotting;

3) qemu-kvm multi-threaded, forked from (1), ugly but robust and widely 
deployed.


In 0.15 the two multi-threaded versions have been unified by Jan Kiszka. 
 I have even ported (2) to Windows with little or no pain; porting to 
Mac OS X interestingly is harder than Windows, because non-portable 
Linux assumptions about signal handling have crept in the code (to 
preempt the objections: they weren't just non-portabilities, they were 
latent bugs).  Windows just does not have signals. :)


So, right now, the only difference is that QEMU is still defaulting to 
the single-threaded main loop, while qemu-kvm enables multi-threading by 
default.  In some time even QEMU will switch.


Yes, this is of course worse than getting it right in the first place; 
Nobody is saying the opposite.


Paolo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/5] ioeventfd: Introduce KVM_IOEVENTFD_FLAG_SOCKET

2011-07-25 Thread Avi Kivity

On 07/25/2011 03:26 PM, Sasha Levin wrote:


  You can't block when a signal is pending.  You can block, however, after
  you've exited with -EINTR and re-entered.


What would happen with the MMIO then? I need to provide an answer before
I leave the read/write functions to exit with -EINTR, no?


If you exit, no result is needed until you reenter.  You just have to 
remember that on the next KVM_RUN, instead of running the vcpu, you have 
to talk to the socket (either write or read, depending on where the 
signal got you).


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   3   >