Re: Flow Control and Port Mirroring Revisited

2011-01-22 Thread Simon Horman
On Sat, Jan 22, 2011 at 11:57:42PM +0200, Michael S. Tsirkin wrote:
> On Sat, Jan 22, 2011 at 10:11:52AM +1100, Simon Horman wrote:
> > On Fri, Jan 21, 2011 at 11:59:30AM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Jan 20, 2011 at 05:38:33PM +0900, Simon Horman wrote:
> > > > [ Trimmed Eric from CC list as vger was complaining that it is too long ]
> > > > 
> > > > On Tue, Jan 18, 2011 at 11:41:22AM -0800, Rick Jones wrote:
> > > > > >So it won't be all that simple to implement well, and before we try,
> > > > > >I'd like to know whether there are applications that are helped
> > > > > >by it. For example, we could try to measure latency at various
> > > > > >pps and see whether the backpressure helps. netperf has -b, -w
> > > > > >flags which might help these measurements.
> > > > > 
> > > > > Those options are enabled when one adds --enable-burst to the
> > > > > pre-compilation ./configure of netperf (one doesn't have to
> > > > > recompile netserver).  However, if one is also looking at latency
> > > > > statistics via the -j option in the top-of-trunk, or simply at the
> > > > > histogram with --enable-histogram on the ./configure and a verbosity
> > > > > level of 2 (global -v 2), then one wants the very top-of-trunk
> > > > > netperf from:
> > > > 
> > > > Hi,
> > > > 
> > > > I have constructed a test where I run an un-paced UDP_STREAM test in
> > > > one guest and a paced omni rr test in another guest at the same time.
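(A sketch of what those two concurrent runs might look like, assuming
netperf built with the --enable-burst and --enable-histogram ./configure
options mentioned above; the host address, test length, pacing values and
the omni "-d rr" spelling are illustrative, not taken from the thread:)

  # build a netperf with burst pacing and histogram support
  ./configure --enable-burst --enable-histogram && make

  # guest 1: un-paced bulk UDP towards the receiver
  netperf -H 192.168.0.1 -t UDP_STREAM -l 60

  # guest 2: paced omni request/response test, with -j latency
  # statistics and global verbosity 2 for the histogram
  netperf -H 192.168.0.1 -t omni -j -v 2 -b 1 -w 10 -l 60 -- -d rr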
> > > 
> > > Hmm, what is this supposed to measure?  Basically each time you run an
> > > un-paced UDP_STREAM you get some random load on the network.
> > > You can't tell what it was exactly, only that it was between
> > > the send and receive throughput.
> > 
> > Rick mentioned in another email that I messed up my test parameters a bit,
> > so I will re-run the tests, incorporating his suggestions.
> > 
> > What I was attempting to measure was the effect of an unpaced UDP_STREAM
> > on the latency of more moderated traffic, because I am interested in
> > what effect an abusive guest has on other guests and how that may be
> > mitigated.
> > 
> > Could you suggest some tests that you feel are more appropriate?
> 
> Yes. To rephrase my concern in these terms: besides the malicious guest,
> you have other software in the host (netperf) that interferes with the
> traffic, and it cooperates with the malicious guest.
> Right?

Yes, that is the scenario in this test.

> IMO, to model a malicious guest you would send
> UDP packets that then get dropped by the host.
> 
> For example, block netperf in the host so that
> it does not consume packets from the socket.

I'm more interested in rate-limiting netperf than blocking it.
But in any case, do you mean using iptables or tc, based on
classification made by net_cls?
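(For concreteness, a sketch of the tc/net_cls combination being asked
about, assuming a kernel with the cgroup packet classifier (cls_cgroup);
the mount point, device, classid and rate are illustrative:)

  # put the netperf process into a net_cls cgroup
  mount -t cgroup -o net_cls net_cls /cgroup/net_cls
  mkdir /cgroup/net_cls/netperf
  echo 0x00010001 > /cgroup/net_cls/netperf/net_cls.classid
  echo $NETPERF_PID > /cgroup/net_cls/netperf/tasks

  # rate-limit traffic classified as 1:1 by the cgroup filter
  tc qdisc add dev eth0 root handle 1: htb
  tc class add dev eth0 parent 1: classid 1:1 htb rate 10mbit
  tc filter add dev eth0 parent 1: protocol ip prio 10 handle 1: cgroup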



Re: MIPS, io-thread, icount and wfi

2011-01-22 Thread Edgar E. Iglesias
On Wed, Jan 19, 2011 at 08:02:28PM +0100, Edgar E. Iglesias wrote:
> On Wed, Jan 19, 2011 at 03:02:26PM -0200, Marcelo Tosatti wrote:
> > On Tue, Jan 18, 2011 at 11:00:57AM +0100, Jan Kiszka wrote:
> > > On 2011-01-18 01:19, Edgar E. Iglesias wrote:
> > > > On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote:
> > > >> Hi,
> > > >>
> > > >> I'm running an io-thread-enabled qemu-system-mipsel with icount.
> > > >> When the guest (linux) goes to sleep through the wait insn (waiting
> > > >> to be woken up by future timer interrupts), the thing deadlocks.
> > > >>
> > > >> IIUC, this is because vm timers are driven by icount, but the CPU is
> > > >> halted so icount makes no progress and time stands still.
> > > >>
> > > >> I've locally disabled vcpu halting when icount is enabled, which
> > > >> works around my problem but of course makes qemu consume 100% host cpu.
> > > >>
> > > >> I don't know why I only see this problem with io-thread builds;
> > > >> it could be related to timing and luck.
> > > >>
> > > >> It would be interesting to know if someone has any info on how this
> > > >> was intended to work (if it was), and whether there are ideas for
> > > >> better workarounds or fixes that don't disable vcpu halting entirely.
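(For reference, a sketch of a build and invocation that exercises this
path; the --enable-io-thread configure flag matches QEMU of this era,
while the machine type, kernel image and icount setting are illustrative:)

  ./configure --target-list=mipsel-softmmu --enable-io-thread
  make
  ./mipsel-softmmu/qemu-system-mipsel -M malta -kernel vmlinux \
      -icount auto -nographic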
> > > > 
> > > > Hi,
> > > > 
> > > > I've found the problem. For some reason io-thread builds use a
> > > > static timeout for wait loops. The entire chunk of code that
> > > > makes sure qemu_icount makes forward progress when the CPUs
> > > > are idle has been ifdef'ed away...
> > > > 
> > > > This fixes the problem for me, hopefully without affecting
> > > > io-thread runs without icount.
> > > > 
> > > > commit 0f4f3a919952500b487b438c5520f07a1c6be35b
> > > > Author: Edgar E. Iglesias 
> > > > Date:   Tue Jan 18 01:01:57 2011 +0100
> > > > 
> > > > qemu-timer: Fix timeout calc for io-thread with icount
> > > > 
> > > > Make sure we always make forward progress with qemu_icount to
> > > > avoid deadlocks. For io-thread, use the static 1000 timeout
> > > > only if icount is disabled.
> > > > 
> > > > Signed-off-by: Edgar E. Iglesias 
> > > > 
> > > > diff --git a/qemu-timer.c b/qemu-timer.c
> > > > index 95814af..db1ec49 100644
> > > > --- a/qemu-timer.c
> > > > +++ b/qemu-timer.c
> > > > @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void)
> > > >  }
> > > >  }
> > > >  
> > > > -#ifndef CONFIG_IOTHREAD
> > > >  static int64_t qemu_icount_delta(void)
> > > >  {
> > > >  if (!use_icount) {
> > > > @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void)
> > > >  return cpu_get_icount() - cpu_get_clock();
> > > >  }
> > > >  }
> > > > -#endif
> > > >  
> > > >  /* enable cpu_get_ticks() */
> > > >  void cpu_enable_ticks(void)
> > > > @@ -1077,9 +1075,17 @@ void quit_timers(void)
> > > >  
> > > >  int qemu_calculate_timeout(void)
> > > >  {
> > > > -#ifndef CONFIG_IOTHREAD
> > > >  int timeout;
> > > >  
> > > > +#ifdef CONFIG_IOTHREAD
> > > > +/* When using icount, making forward progress with qemu_icount when the
> > > > +   guest CPU is idle is critical. We only use the static io-thread timeout
> > > > +   for non icount runs.  */
> > > > +if (!use_icount) {
> > > > +return 1000;
> > > > +}
> > > > +#endif
> > > > +
> > > >  if (!vm_running)
> > > >  timeout = 5000;
> > > >  else {
> > > > @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void)
> > > >  }
> > > >  
> > > >  return timeout;
> > > > -#else /* CONFIG_IOTHREAD */
> > > > -return 1000;
> > > > -#endif
> > > >  }
> > > >  
> > > > 
> > > > 
> > > 
> > > This logic and the timeout values were imported with the iothread merge.
> > > And I bet at least the timeout value of 1s (vs. 5s) can still be found in
> > > qemu-kvm. Maybe someone over there can remember the rationale behind
> > > choosing this value.
> > > 
> > > Jan
> > 
> > This timeout is for the main select() call, so there is not a lot of
> > reasoning behind it; it is just how long to wait when there is no
> > activity on the file descriptors.
> 
> OK, I suspected something like that. Thanks, both of you, for the info.
> I'll give people a couple of days to complain about the patch; if no one
> does, I'll apply it.

Silence - so I've applied this one, thanks.

Cheers


Re: [RFC PATCH 0/2] Expose available KVM free memory slot count to help avoid aborts

2011-01-22 Thread Michael S. Tsirkin
On Fri, Jan 21, 2011 at 04:48:02PM -0700, Alex Williamson wrote:
> When doing device assignment, we use cpu_register_physical_memory() to
> directly map the qemu mmap of the device resource into the address
> space of the guest.  The unadvertised feature of the register physical
> memory code path on kvm, at least for this type of mapping, is that it
> needs to allocate an index from a small, fixed array of memory slots.
> Even better, if it can't get an index, the code aborts deep in the
> kvm-specific bits, preventing the caller from having a chance to
> recover.
> 
> It's really easy to hit this by hot-adding too many assigned devices
> to a guest (pretty easy to hit with too many devices at instantiation
> time too, but the abort is slightly more bearable there).
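(As a sketch of how easy this is to hit at instantiation time, assuming
qemu-kvm's pci-assign device; the host PCI addresses and memory size are
illustrative, and hot-adding the same devices with device_add from the
monitor hits the same path:)

  qemu-kvm -m 1024 disk.img \
      -device pci-assign,host=01:00.0 \
      -device pci-assign,host=02:00.0 \
      -device pci-assign,host=03:00.0   # ...and so on, until the abort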
> 
> I'm assuming it's pretty difficult to make the memory slot array
> dynamically sized.  If that's not the case, please let me know as
> that would be a much better solution.
> 
> I'm not terribly happy with the solution in this series, it doesn't
> provide any guarantees whether a cpu_register_physical_memory() will
> succeed, only slightly better educated guesses.
> 
> Are there better ideas how we could solve this?  Thanks,
> 
> Alex

Put the table in qemu memory and make kvm access it with copy to/from user?
It can then be any size ...

> ---
> 
> Alex Williamson (2):
>   device-assignment: Count required kvm memory slots
>   kvm: Allow querying free slots
> 
> 
>  hw/device-assignment.c |   59 +++-
>  hw/device-assignment.h |3 ++
>  kvm-all.c  |   16 +
>  kvm.h  |2 ++
>  4 files changed, 79 insertions(+), 1 deletions(-)


Re: Flow Control and Port Mirroring Revisited

2011-01-22 Thread Michael S. Tsirkin
On Sat, Jan 22, 2011 at 10:11:52AM +1100, Simon Horman wrote:
> On Fri, Jan 21, 2011 at 11:59:30AM +0200, Michael S. Tsirkin wrote:
> > On Thu, Jan 20, 2011 at 05:38:33PM +0900, Simon Horman wrote:
> > > [ Trimmed Eric from CC list as vger was complaining that it is too long ]
> > > 
> > > On Tue, Jan 18, 2011 at 11:41:22AM -0800, Rick Jones wrote:
> > > > >So it won't be all that simple to implement well, and before we try,
> > > > >I'd like to know whether there are applications that are helped
> > > > >by it. For example, we could try to measure latency at various
> > > > >pps and see whether the backpressure helps. netperf has -b, -w
> > > > >flags which might help these measurements.
> > > > 
> > > > Those options are enabled when one adds --enable-burst to the
> > > > pre-compilation ./configure of netperf (one doesn't have to
> > > > recompile netserver).  However, if one is also looking at latency
> > > > statistics via the -j option in the top-of-trunk, or simply at the
> > > > histogram with --enable-histogram on the ./configure and a verbosity
> > > > level of 2 (global -v 2), then one wants the very top-of-trunk
> > > > netperf from:
> > > 
> > > Hi,
> > > 
> > > I have constructed a test where I run an un-paced UDP_STREAM test in
> > > one guest and a paced omni rr test in another guest at the same time.
> > 
> > Hmm, what is this supposed to measure?  Basically each time you run an
> > un-paced UDP_STREAM you get some random load on the network.
> > You can't tell what it was exactly, only that it was between
> > the send and receive throughput.
> 
> Rick mentioned in another email that I messed up my test parameters a bit,
> so I will re-run the tests, incorporating his suggestions.
> 
> What I was attempting to measure was the effect of an unpaced UDP_STREAM
> on the latency of more moderated traffic, because I am interested in
> what effect an abusive guest has on other guests and how that may be
> mitigated.
> 
> Could you suggest some tests that you feel are more appropriate?

Yes. To rephrase my concern in these terms: besides the malicious guest,
you have other software in the host (netperf) that interferes with the
traffic, and it cooperates with the malicious guest.
Right?

IMO, to model a malicious guest you would send
UDP packets that then get dropped by the host.

For example, block netperf in the host so that
it does not consume packets from the socket.
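(One crude way to realize this, as a sketch; the process name assumes the
standard netserver receiver, and the exact statistics field name may vary.
Stopping the receiver lets its socket buffer fill, after which the host
drops subsequent datagrams.)

  kill -STOP $(pidof netserver)    # receiver stops draining its socket
  netstat -su | grep -i error      # watch UDP "receive buffer errors" climb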





Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock

2011-01-22 Thread Rik van Riel

On 01/22/2011 01:14 AM, Srivatsa Vaddagiri wrote:


Also it may be possible for the pv-ticketlocks to track the owning vcpu and
make use of a yield-to interface as a further optimization to avoid the
"others-get-more-time" problem, but PeterZ rightly pointed out that PI
(priority inheritance) would be a better solution there than yield-to. So
overall, IMO, kvm_vcpu_on_spin + yield_to could be the best solution for
unmodified guests, while paravirtualized ticketlocks + some sort of PI would
be a better solution where we have the luxury of modifying guest sources!


Agreed, for unmodified guests (which is what people will mostly be
running for the next couple of years), we have little choice but
to use PLE + kvm_vcpu_on_spin + yield_to.

The main question that remains is whether the PV ticketlocks are
a large enough improvement to also merge those.  I expect they
will be, and we'll see in the benchmark numbers.

--
All rights reversed


[PATCH v5 01/27] kvm: stop including asm-generic/bitops/le.h directly

2011-01-22 Thread Akinobu Mita
asm-generic/bitops/le.h is only intended to be included directly from
asm-generic/bitops/ext2-non-atomic.h or asm-generic/bitops/minix-le.h,
which implement the generic ext2 and minix bit operations.

This patch stops including asm-generic/bitops/le.h directly and uses
ext2 non-atomic bit operations instead.

It seems odd to use ext2_set_bit() in kvm, but it will be replaced with
__set_bit_le() after little-endian bit operations are introduced for all
architectures.  This indirect step is necessary to maintain bisectability
for some architectures which have their own little-endian bit operations.

Signed-off-by: Akinobu Mita 
Cc: Avi Kivity 
Cc: Marcelo Tosatti 
Cc: kvm@vger.kernel.org
---

Change from v4:
 - split into two patches to fix a bisection hole

The whole series is available in the git branch at:
 git://git.kernel.org/pub/scm/linux/kernel/git/mita/linux-2.6.git le-bitops-v5

 virt/kvm/kvm_main.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f29abeb..3461001 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -52,7 +52,6 @@
 #include 
 #include 
 #include 
-#include <asm-generic/bitops/le.h>
 
 #include "coalesced_mmio.h"
 #include "async_pf.h"
@@ -1421,7 +1420,7 @@ void mark_page_dirty_in_slot(struct kvm *kvm, struct kvm_memory_slot *memslot,
if (memslot && memslot->dirty_bitmap) {
unsigned long rel_gfn = gfn - memslot->base_gfn;
 
-   generic___set_le_bit(rel_gfn, memslot->dirty_bitmap);
+   ext2_set_bit(rel_gfn, memslot->dirty_bitmap);
}
 }
 
-- 
1.7.3.4



[PATCH v5 14/27] kvm: use little-endian bitops

2011-01-22 Thread Akinobu Mita
As a preparation for removing the ext2 non-atomic bit operations from
asm/bitops.h, this converts the ext2 non-atomic bit operations to
little-endian bit operations.

Signed-off-by: Akinobu Mita 
Cc: Avi Kivity 
Cc: Marcelo Tosatti 
Cc: kvm@vger.kernel.org
---

Change from v4:
 - split into two patches to fix a bisection hole

The whole series is available in the git branch at:
 git://git.kernel.org/pub/scm/linux/kernel/git/mita/linux-2.6.git le-bitops-v5

 virt/kvm/kvm_main.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3461001..508fdb1 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1420,7 +1420,7 @@ void mark_page_dirty_in_slot(struct kvm *kvm, struct kvm_memory_slot *memslot,
if (memslot && memslot->dirty_bitmap) {
unsigned long rel_gfn = gfn - memslot->base_gfn;
 
-   ext2_set_bit(rel_gfn, memslot->dirty_bitmap);
+   __set_bit_le(rel_gfn, memslot->dirty_bitmap);
}
 }
 
-- 
1.7.3.4
