date:20110915

Re: [PATCH 2/2] kvm tools: Use host's resolv.conf within the guest

2011-09-15 Thread Sasha Levin

On Thu, 2011-09-15 at 08:44 +0300, Pekka Enberg wrote:
 On 9/15/11 8:36 AM, Sasha Levin wrote:
  On Thu, 2011-09-15 at 08:29 +0300, Pekka Enberg wrote:
  On Wed, Sep 14, 2011 at 7:28 PM, Sasha Levin <levinsasha...@gmail.com> wrote:
  Since kernel IP autoconfiguration doesn't set up /etc/resolv.conf, we'll
  use the one located within the host, since this was anyway what we
  simulated within the DHCP offer packets.
 
  Signed-off-by: Sasha Levin <levinsasha...@gmail.com>
 
  Wouldn't a symlink to /host/etc/resolv.conf be more appropriate?
  Remember, we're supposed to only need to setup the shared rootfs once.
 
  It would mean the guest can screw up the host's networking.
 
 How? You're not supposed to run the tool.

Hm? If you link it to the host's resolv.conf, a guest can edit host's file,
no?

Might even be not on purpose... For example, simply running dhcpcd on
the guest.

-- 

Sasha.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] kvm tools: Use host's resolv.conf within the guest

2011-09-15 Thread Pekka Enberg

On Thu, Sep 15, 2011 at 9:00 AM, Sasha Levin levinsasha...@gmail.com wrote:
 Hm? If you link it to the host's resolv.conf, a guest can edit host's file,
 no?

 Might even be not on purpose... For example, simply running dhcpcd on
 the guest.

How is that going to happen if you're not running kvmtool as root?

Re: [PATCH 2/2] kvm tools: Use host's resolv.conf within the guest

2011-09-15 Thread Sasha Levin

On Thu, 2011-09-15 at 09:04 +0300, Pekka Enberg wrote:
 On Thu, Sep 15, 2011 at 9:00 AM, Sasha Levin levinsasha...@gmail.com wrote:
  Hm? If you link it to the host's resolv.conf, a guest can edit host's file,
  no?
 
  Might even be not on purpose... For example, simply running dhcpcd on
  the guest.
 
 How is that going to happen if you're not running kvmtool as root?

In that case, dhcpcd in the guest will simply break because it can't
modify resolv.conf, no?

-- 

Sasha.


Re: More work on Livebackup for qemu/qemu-kvm

2011-09-15 Thread shu ming


Jagane,
  we are testing and reviewing the livebackup workspace from
git://github.com/jagane/qemu-livebackup.git

 Several questions are coming from us.
 1) It seems that the workspace has not been updated for a while. Is
    there any new update for this project?
 2) It looks like the support is tightly bound to the qcow2 image
    format. Is there any plan to support other formats, such as raw or
    qed streaming?
 3) Can we add a checksum to verify that the backup image is correct
    after the image transfer? For example, a checksum is computed
    before the snapshot is transferred and then compared with the
    checksum of the backup image after the backup is done.
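
As a rough illustration of the checksum idea in question 3, here is a minimal sketch in plain C. It is not Livebackup code: the function names are made up for this example, and FNV-1a is used only because it is short. FNV-1a is not cryptographic, so a real tool would more likely use something like SHA-256. The hash can be fed block by block while the image is transferred and compared at the end.

```c
#include <stddef.h>
#include <stdint.h>

#define FNV1A64_INIT  0xcbf29ce484222325ULL   /* FNV-1a 64-bit offset basis */
#define FNV1A64_PRIME 0x100000001b3ULL        /* FNV-1a 64-bit prime */

/* Fold `len` bytes into a running hash; start from FNV1A64_INIT. */
static uint64_t fnv1a64_update(uint64_t hash, const void *data, size_t len)
{
    const unsigned char *p = data;
    size_t i;

    for (i = 0; i < len; i++) {
        hash ^= p[i];
        hash *= FNV1A64_PRIME;
    }
    return hash;
}

/* Hypothetical check: does the backup hash to the same value as the source? */
static int images_match(const void *src, const void *backup, size_t len)
{
    return fnv1a64_update(FNV1A64_INIT, src, len) ==
           fnv1a64_update(FNV1A64_INIT, backup, len);
}
```

Because the hash is a running value, the sender can update it per block as data goes out and the receiver can do the same as data arrives; only the final 64-bit values need to be exchanged.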

Jagane Sundar:

Hello All,

I have made more progress on the proposed Livebackup feature
for qemu and qemu-kvm.

Based on Jes' feedback, I have switched over to using command
line parameters instead of specific named files. So, a typical
command line looks like this:

# ./x86_64-softmmu/qemu-system-x86_64 -drive \
file=/dev/kvm_vol_group/kvm_root_part,boot=on,if=virtio,livebackup=on \
-drive file=/dev/kvm_vol_group/kvm_disk1,if=virtio,livebackup=on \
-m 512 -net nic,model=virtio,macaddr=52:54:00:00:00:01 \
-net tap,ifname=tap0,script=no,downscript=no \
-vnc 0.0.0.0:1000 -usb -usbdevice tablet \
-livebackup_dir /root/kvm/livebackup -livebackup_port 7900

Note the new option livebackup=on in the drive parameters, and
the two new parameters -livebackup_dir and -livebackup_port

Here's my strategy for rigorous testing of this new code:
I have created two virtual disks in LVM logical volumes, and
added code in qemu livebackup to create a LVM snapshot as
soon as livebackup_client connects to qemu and creates a
livebackup snapshot. Then I binary compare the livebackup
backed up version of the virtual disk image with the
LVM snapshot that was created by using 'cmp'.
The backup images are a bit for bit match!

As always, all information is available at:

http://wiki.qemu.org/Features/Livebackup

I have also sent in my application to make a presentation at
the qemu forum 2011.

In the meantime, I invite feedback on livebackup.
Specifically, I am interested in scrutiny of my testing
methodology.

Also, I plan to add encryption (probably SSL) to the
livebackup TCP connection, and some form of authentication.

Any thoughts, feedback?

Thanks,
Jagane

Re: [PATCH 2/2] kvm tools: Use host's resolv.conf within the guest

2011-09-15 Thread Pekka Enberg


On 9/15/11 9:04 AM, Sasha Levin wrote:

On Thu, 2011-09-15 at 09:04 +0300, Pekka Enberg wrote:

On Thu, Sep 15, 2011 at 9:00 AM, Sasha Levin <levinsasha...@gmail.com> wrote:

Hm? If you link it to the host's resolv.conf, a guest can edit host's file,
no?

Might even be not on purpose... For example, simply running dhcpcd on
the guest.


How is that going to happen if you're not running kvmtool as root?


In that case, dhcpcd in the guest will simply break because it can't
modify resolv.conf, no?


Yes. Why is that a problem? You're not supposed to launch a dhcp client
when using shared rootfs because kvmtool takes care of that for you.

Pekka

RE: [PATCH 1/2] KVM: emulate lapic tsc deadline timer for guest

2011-09-15 Thread Liu, Jinsong

Marcelo Tosatti wrote:
 diff --git a/arch/x86/include/asm/apicdef.h
 b/arch/x86/include/asm/apicdef.h 
 index 34595d5..3925d80 100644
 --- a/arch/x86/include/asm/apicdef.h
 +++ b/arch/x86/include/asm/apicdef.h
 @@ -100,7 +100,9 @@
  #define APIC_TIMER_BASE_CLKIN   0x0
  #define APIC_TIMER_BASE_TMBASE  0x1
  #define APIC_TIMER_BASE_DIV 0x2
 +#define APIC_LVT_TIMER_ONESHOT  (0 << 17)
  #define APIC_LVT_TIMER_PERIODIC (1 << 17)
 +#define APIC_LVT_TIMER_TSCDEADLINE  (2 << 17)
  #define APIC_LVT_MASKED (1 << 16)
  #define APIC_LVT_LEVEL_TRIGGER  (1 << 15)
  #define APIC_LVT_REMOTE_IRR (1 << 14)
 
 Please have a separate, introductory patch for definitions that are
 not KVM specific.
 

OK, will present a separate patch. BTW, will the separate patch still be sent
to kvm@vger.kernel.org?

 +++ b/arch/x86/include/asm/kvm_host.h
 @@ -671,6 +671,8 @@ u8 kvm_get_guest_memory_type(struct kvm_vcpu
 *vcpu, gfn_t gfn); 
 
  extern bool tdp_enabled;
 
 +extern u64 vcpu_tsc_khz(struct kvm_vcpu *vcpu);
 +
 
 No need for extern.
 

Any special concern, or is it for coding style? A little curious :)

 +} else if (apic_lvtt_tscdeadline(apic)) {
 +/* lapic timer in tsc deadline mode */
 +u64 guest_tsc, guest_tsc_delta, ns = 0;
 +struct kvm_vcpu *vcpu = apic->vcpu;
 +unsigned long this_tsc_khz = vcpu_tsc_khz(vcpu);
 +unsigned long flags;
 +
 +if (unlikely(!apic->lapic_timer.tscdeadline || !this_tsc_khz))
 +return;
 +
 +local_irq_save(flags);
 +
 +now = apic->lapic_timer.timer.base->get_time();
 +kvm_get_msr(vcpu, MSR_IA32_TSC, &guest_tsc);
 
 Use kvm_x86_ops->read_l1_tsc(vcpu) instead of direct MSR read
 (to avoid reading L2 guest TSC in case of nested virt).
 

Fine. I was using an older KVM tree (Jul 22) and didn't notice Nadav's patch,
checked in on Aug 2, that adds the read_l1_tsc hook.
Thanks for telling me.

 +guest_tsc_delta = apic->lapic_timer.tscdeadline - guest_tsc;
 
 if (guest_tsc >= tscdeadline), the timer should start immediately.
 
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 6cb353c..a73c059 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -610,6 +610,16 @@ static void update_cpuid(struct kvm_vcpu *vcpu)
  if (kvm_read_cr4_bits(vcpu, X86_CR4_OSXSAVE))
  best->ecx |= bit(X86_FEATURE_OSXSAVE);
  }
 +
 +/*
 + * When cpu has tsc deadline timer capability, use bit 17/18
 + * as timer mode mask. Otherwise only use bit 17.
 + */
 +if (cpu_has_tsc_deadline_timer && best->function == 0x1) {
 +best->ecx |= bit(X86_FEATURE_TSC_DEADLINE_TIMER);
 +vcpu->arch.apic->lapic_timer.timer_mode_mask = (3 << 17);
 +} else
 +vcpu->arch.apic->lapic_timer.timer_mode_mask = (1 << 17);
  }
 
 The deadline timer is entirely emulated, whether the host CPU supports
 it or not is irrelevant.
 
 Why was this changed from previous submissions?

Hmm, will explain in next email.

Thanks,
Jinsong
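
To make the timer-mode mask discussed in this message concrete, here is a small standalone sketch. It is not the KVM code: the three mode values and the two mask values come from the quoted hunks, while the helper names and the TIMER_MODE_MASK_* macros are invented for illustration. It shows why the mask width matters: with only bit 17 in the mask, a TSC-deadline setting (bit 18) reads back as one-shot.

```c
#include <stdint.h>

#define APIC_LVT_TIMER_ONESHOT      (0u << 17)
#define APIC_LVT_TIMER_PERIODIC     (1u << 17)
#define APIC_LVT_TIMER_TSCDEADLINE  (2u << 17)

/* Mask values as in the quoted patch: one mode bit, or two. */
#define TIMER_MODE_MASK_LEGACY      (1u << 17)
#define TIMER_MODE_MASK_DEADLINE    (3u << 17)

/* Hypothetical helper: extract the timer-mode field from an LVT timer value. */
static uint32_t lvtt_timer_mode(uint32_t lvtt, uint32_t timer_mode_mask)
{
    return lvtt & timer_mode_mask;
}

/* Hypothetical helper: is the LVT timer entry in TSC-deadline mode? */
static int lvtt_is_tscdeadline(uint32_t lvtt, uint32_t timer_mode_mask)
{
    return lvtt_timer_mode(lvtt, timer_mode_mask) == APIC_LVT_TIMER_TSCDEADLINE;
}
```

With the two-bit mask the three modes are distinguishable; with the one-bit mask, only one-shot and periodic are.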


Re: kgdb hooks and kvm-tool

2011-09-15 Thread David Evensky


Thanks!
\dae

On Thu, Sep 15, 2011 at 08:39:03AM +0300, Sasha Levin wrote:
 On Thu, 2011-09-15 at 08:32 +0300, Pekka Enberg wrote:
  On Thu, Sep 15, 2011 at 2:17 AM, David Evensky
  even...@dancer.ca.sandia.gov wrote:
   Hi. Is it possible to use kvm-tool with a kernel compiled with kgdb?
   I've tried adding 'kgdbwait kgdboc=ttyS0' to -p, but that doesn't seem
   to work.
  
  I've never tried kgdb myself but I'm rather surprised it doesn't just
  work. Sasha, Cyrill, Asias, have you guys ever tried kvmtool with
  kgdb?
 
 You can use 'kgdboc=kbd' to run it over the keyboard. I also have
 a patch which uses forktty() to spawn serial consoles and redirect guest
 tty's into them, but it's somewhat ugly.
 
 Give me a day or two to make it nicer and I'll send it over.
 
 -- 
 
 Sasha.
 

Re: [PATCH 2/2] kvm tools: Use host's resolv.conf within the guest

2011-09-15 Thread Sasha Levin

On Thu, 2011-09-15 at 09:22 +0300, Pekka Enberg wrote:
 On 9/15/11 9:04 AM, Sasha Levin wrote:
  On Thu, 2011-09-15 at 09:04 +0300, Pekka Enberg wrote:
  On Thu, Sep 15, 2011 at 9:00 AM, Sasha Levin <levinsasha...@gmail.com> wrote:
  Hm? If you link it to the host's resolv.conf, a guest can edit host's file,
  no?
 
  Might even be not on purpose... For example, simply running dhcpcd on
  the guest.
 
  How is that going to happen if you're not running kvmtool as root?
 
  In that case, dhcpcd in the guest will simply break because it can't
  modify resolv.conf, no?
 
 Yes. Why is that a problem? You're not supposed to launch a dhcp client
 when using shared rootfs because kvmtool takes care of that for you.

Why? Testing a brand new dhcp client for example :)

We can't block the user from editing guest configuration files...

-- 

Sasha.


Re: Memory API code review

2011-09-15 Thread Juan Quintela

Avi Kivity a...@redhat.com wrote:
 I would like to carry out an online code review of the memory API so
 that more people are familiar with the internals, and perhaps even to
 catch some bugs or deficiency.  I'd like to use the next kvm
 conference call slot for this (Tuesday 1400 UTC) since many people
 already have it reserved in the schedule.

 It would be great if people from the wider qemu community be present,
 rather than the usual x86 is everything crowd (+Jan) that usually
 participates in the kvm weekly call.

 Juan, Chris, can we dedicate next week's call to this?

I think so.

Later, Juan.

 We'll also need a way to disseminate a few slides and an editor
 session for showing the code.  We have an elluminate account that can
 be used for this, but usually this has a 50% failure rate on Linux.
 Anthony, perhaps we can set up a view-only vnc reflector on qemu.org?


RE: [PATCH 1/2] KVM: emulate lapic tsc deadline timer for guest

2011-09-15 Thread Liu, Jinsong

Marcelo Tosatti wrote:
 +} else if (apic_lvtt_tscdeadline(apic)) {
 +/* lapic timer in tsc deadline mode */
 +u64 guest_tsc, guest_tsc_delta, ns = 0;
 +struct kvm_vcpu *vcpu = apic->vcpu;
 +unsigned long this_tsc_khz = vcpu_tsc_khz(vcpu);
 +unsigned long flags;
 +
 +if (unlikely(!apic->lapic_timer.tscdeadline || !this_tsc_khz))
 +return;
 +
 +local_irq_save(flags);
 +
 +now = apic->lapic_timer.timer.base->get_time();
 +kvm_get_msr(vcpu, MSR_IA32_TSC, &guest_tsc);
 
 Use kvm_x86_ops->read_l1_tsc(vcpu) instead of direct MSR read
 (to avoid reading L2 guest TSC in case of nested virt).
 
 +guest_tsc_delta = apic->lapic_timer.tscdeadline - guest_tsc;
 
 if (guest_tsc >= tscdeadline), the timer should start immediately.
 

Yes, in that case the timer does start immediately, with ns = 0

Thanks,
Jinsong

[PATCH] kvm tools: Allow remapping guest TTY into host PTS

2011-09-15 Thread Sasha Levin

This patch adds the '--tty' option to 'kvm run' which allows the user to
remap a guest TTY into a PTS on the host.

Usage:
'kvm run --tty [id] [other options]'

The tty will be mapped to a pts and will be printed on the screen:
'  Info: Assigned terminal 1 to pty /dev/pts/X'

At this point, it is possible to communicate with the guest using that pty.

This is useful for debugging guest kernel using KGDB:

1. Run the guest:
'kvm run -k [vmlinuz] -p "kgdboc=ttyS1 kgdbwait" --tty 1'

And see which PTY got assigned to ttyS1.

2. Run GDB on the host:
'gdb [vmlinuz]'

3. Connect to the guest (from within GDB):
'target remote /dev/pts/X'

4. Start debugging! (enter 'continue' to continue boot).

Cc: David Evensky even...@dancer.ca.sandia.gov
Signed-off-by: Sasha Levin levinsasha...@gmail.com
---
 tools/kvm/Makefile   |1 +
 tools/kvm/builtin-run.c  |   12 
 tools/kvm/hw/serial.c|   46 ++--
 tools/kvm/include/kvm/term.h |   11 ---
 tools/kvm/term.c |   60 +
 tools/kvm/virtio/console.c   |6 ++--
 6 files changed, 96 insertions(+), 40 deletions(-)

diff --git a/tools/kvm/Makefile b/tools/kvm/Makefile
index efa032d..fef624d 100644
--- a/tools/kvm/Makefile
+++ b/tools/kvm/Makefile
@@ -115,6 +115,7 @@ OBJS+= bios/bios-rom.o
 
 LIBS   += -lrt
 LIBS   += -lpthread
+LIBS   += -lutil
 
 # Additional ARCH settings for x86
 ARCH ?= $(shell echo $(uname_M) | sed -e s/i.86/i386/ -e s/sun4u/sparc64/ \
diff --git a/tools/kvm/builtin-run.c b/tools/kvm/builtin-run.c
index 5dafb15..b5c63ca 100644
--- a/tools/kvm/builtin-run.c
+++ b/tools/kvm/builtin-run.c
@@ -172,6 +172,15 @@ static int virtio_9p_rootdir_parser(const struct option *opt, const char *arg, i
return 0;
 }
 
+static int tty_parser(const struct option *opt, const char *arg, int unset)
+{
+   int tty = atoi(arg);
+
+   term_set_tty(tty);
+
+   return 0;
+}
+
 static int shmem_parser(const struct option *opt, const char *arg, int unset)
 {
const u64 default_size = SHMEM_DEFAULT_SIZE;
@@ -316,6 +325,9 @@ static const struct option options[] = {
OPT_STRING('\0', "console", &console, "serial or virtio",
"Console to use"),
OPT_STRING('\0', "dev", &dev, "device_file", "KVM device file"),
+   OPT_CALLBACK('\0', "tty", NULL, "tty id",
+"Remap guest TTY into a pty on the host",
+tty_parser),
 
OPT_GROUP("Kernel options:"),
OPT_STRING('k', "kernel", &kernel_filename, "kernel",
diff --git a/tools/kvm/hw/serial.c b/tools/kvm/hw/serial.c
index b3b233f..11fa5d4 100644
--- a/tools/kvm/hw/serial.c
+++ b/tools/kvm/hw/serial.c
@@ -14,6 +14,7 @@
 
 struct serial8250_device {
pthread_mutex_t mutex;
+   u8  id;
 
u16 iobase;
u8  irq;
@@ -42,6 +43,7 @@ static struct serial8250_device devices[] = {
[0] = {
.mutex  = PTHREAD_MUTEX_INITIALIZER,
 
+   .id = 0,
.iobase = 0x3f8,
.irq= 4,
 
@@ -51,6 +53,7 @@ static struct serial8250_device devices[] = {
[1] = {
.mutex  = PTHREAD_MUTEX_INITIALIZER,
 
+   .id = 1,
.iobase = 0x2f8,
.irq= 3,
 
@@ -60,6 +63,7 @@ static struct serial8250_device devices[] = {
[2] = {
.mutex  = PTHREAD_MUTEX_INITIALIZER,
 
+   .id = 2,
.iobase = 0x3e8,
.irq= 4,
 
@@ -69,6 +73,7 @@ static struct serial8250_device devices[] = {
[3] = {
.mutex  = PTHREAD_MUTEX_INITIALIZER,
 
+   .id = 3,
.iobase = 0x2e8,
.irq= 3,
 
@@ -111,10 +116,10 @@ static void serial8250__receive(struct kvm *kvm, struct serial8250_device *dev)
return;
}
 
-   if (!term_readable(CONSOLE_8250))
+   if (!term_readable(CONSOLE_8250, dev->id))
return;
 
-   c   = term_getc(CONSOLE_8250);
+   c = term_getc(CONSOLE_8250, dev->id);
 
if (c < 0)
return;
@@ -123,30 +128,31 @@ static void serial8250__receive(struct kvm *kvm, struct serial8250_device *dev)
dev->lsr |= UART_LSR_DR;
 }
 
-/*
- * Interrupts are injected for ttyS0 only.
- */
 void serial8250__inject_interrupt(struct kvm *kvm)
 {
-   struct serial8250_device *dev = &devices[0];
+   int i;
 
-   mutex_lock(&dev->mutex);
+   for (i = 0; i < 4; i++) {
+   struct serial8250_device *dev = &devices[i];
 
-
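
A side note on tty_parser() in the patch above: atoi() returns 0 for non-numeric input, so a mistyped '--tty abc' would silently select tty 0. A stricter conversion could look like the sketch below. This is only an illustration, not part of the patch: parse_tty_id() and NR_SERIAL_DEVICES are hypothetical names, and the 0..3 range assumes the id refers to one of the four 8250 devices listed in hw/serial.c.

```c
#include <errno.h>
#include <stdlib.h>

#define NR_SERIAL_DEVICES 4   /* assumption: devices[0..3] in hw/serial.c */

/*
 * Hypothetical stricter variant of the conversion done in tty_parser().
 * Returns 0 and stores the id on success, -1 on malformed or
 * out-of-range input.
 */
static int parse_tty_id(const char *arg, int *id)
{
    char *end;
    long val;

    if (arg == NULL || *arg == '\0')
        return -1;

    errno = 0;
    val = strtol(arg, &end, 10);
    if (errno != 0 || *end != '\0')
        return -1;              /* overflow or trailing garbage */
    if (val < 0 || val >= NR_SERIAL_DEVICES)
        return -1;              /* no such serial device */

    *id = (int)val;
    return 0;
}
```

The parser callback could then return an error instead of calling term_set_tty() when the conversion fails.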

-cpu core2duo still has no SSE4 support?

2011-09-15 Thread Jun Koi

hi,

i run kvm with -cpu core2duo option, but /proc/cpuinfo only shows
SSE and SSE2.
my host is Core i7, so i suppose that i should have SSE4 with this
option, but it seems not?

is there any way to get SSE4?

(i am on kvm-0.12.3 on Ubuntu 10.04)

thanks,
Jun

Re: -cpu core2duo still has no SSE4 support?

2011-09-15 Thread Sasha Levin

On Thu, 2011-09-15 at 17:04 +0800, Jun Koi wrote:
 hi,
 
 i run kvm with -cpu core2duo option, but /proc/cpuinfo only shows
 SSE and SSE2.
 my host is Core i7, so i suppose that i should have SSE4 with this
 option, but it seems not?
 
 is there any way to get SSE4?

How about just running it with '-cpu host'?

-- 

Sasha.


Re: -cpu core2duo still has no SSE4 support?

2011-09-15 Thread Jun Koi

On Thu, Sep 15, 2011 at 5:05 PM, Sasha Levin levinsasha...@gmail.com wrote:
 On Thu, 2011-09-15 at 17:04 +0800, Jun Koi wrote:
 hi,

 i run kvm with -cpu core2duo option, but /proc/cpuinfo only shows
 SSE and SSE2.
 my host is Core i7, so i suppose that i should have SSE4 with this
 option, but it seems not?

 is there any way to get SSE4?

 How about just running it with '-cpu host'?

hah, that works, thanks!

but then there are 2 problems:

- -cpu host should be exposed in the doc. kvm -cpu ? reports no
such option, so i missed it.
- -cpu core2duo should enable SSE4, but it doesnt. a bug?

thanks,
Jun

Re: -cpu core2duo still has no SSE4 support?

2011-09-15 Thread Dor Laor


On 09/15/2011 12:14 PM, Jun Koi wrote:

On Thu, Sep 15, 2011 at 5:05 PM, Sasha Levin <levinsasha...@gmail.com> wrote:

On Thu, 2011-09-15 at 17:04 +0800, Jun Koi wrote:

hi,

i run kvm with -cpu core2duo option, but /proc/cpuinfo only shows
SSE and SSE2.
my host is Core i7, so i suppose that i should have SSE4 with this
option, but it seems not?

is there any way to get SSE4?


How about just running it with '-cpu host'?


hah, that works, thanks!

but then there are 2 problems:

- -cpu host should be exposed in the doc. kvm -cpu ? reports no
such option, so i missed it.
- -cpu core2duo should enable SSE4, but it doesnt. a bug?


If this model is supposed to contain it, yes.
According to sysconfigs/target/target-x86_64.conf it shouldn't be in
core2duo, but it does appear in newer models.




thanks,
Jun

Re: -cpu core2duo still has no SSE4 support?

2011-09-15 Thread Sasha Levin

On Thu, 2011-09-15 at 17:14 +0800, Jun Koi wrote:
 On Thu, Sep 15, 2011 at 5:05 PM, Sasha Levin levinsasha...@gmail.com wrote:
  On Thu, 2011-09-15 at 17:04 +0800, Jun Koi wrote:
  hi,
 
  i run kvm with -cpu core2duo option, but /proc/cpuinfo only shows
  SSE and SSE2.
  my host is Core i7, so i suppose that i should have SSE4 with this
  option, but it seems not?
 
  is there any way to get SSE4?
 
  How about just running it with '-cpu host'?
 
 hah, that works, thanks!
 
 but then there are 2 problems:
 
 - -cpu host should be exposed in the doc. kvm -cpu ? reports no
 such option, so i missed it.
 - -cpu core2duo should enable SSE4, but it doesnt. a bug?

SSE4 doesn't come built in with all core2duos. See
http://download.intel.com/design/mobile/datashts/31674505.pdf for
example.

-- 

Sasha.
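
For checking from inside the guest which SSE levels a given -cpu model exposes, one option is to scan the flags line of /proc/cpuinfo for whole tokens (on Linux, SSE4.1 and SSE4.2 show up as sse4_1 and sse4_2). The sketch below is only an illustration of that check; has_cpu_flag() is a made-up helper that expects the caller to pass in the text of the flags line.

```c
#include <string.h>

/*
 * Return 1 if `flag` appears as a whole space-separated token in `flags`
 * (the text after "flags :" in /proc/cpuinfo), 0 otherwise.
 */
static int has_cpu_flag(const char *flags, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return 0;

    while ((p = strstr(p, flag)) != NULL) {
        /* A match only counts if it is delimited on both sides. */
        int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

        if (starts && ends)
            return 1;
        p += 1;
    }
    return 0;
}
```

Matching whole tokens matters here because "sse" is a prefix of "sse2" and "sse4_1", and "sse3" is a substring of "ssse3".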


Re: [PATCH] kvm tools: Allow remapping guest TTY into host PTS

2011-09-15 Thread Pekka Enberg

On Thu, Sep 15, 2011 at 11:53 AM, Sasha Levin levinsasha...@gmail.com wrote:
 This patch adds the '--tty' option to 'kvm run' which allows the user to
 remap a guest TTY into a PTS on the host.

 Usage:
        'kvm run --tty [id] [other options]'

 The tty will be mapped to a pts and will be printed on the screen:
        '  Info: Assigned terminal 1 to pty /dev/pts/X'

 At this point, it is possible to communicate with the guest using that pty.

 This is useful for debugging guest kernel using KGDB:

 1. Run the guest:
        'kvm run -k [vmlinuz] -p "kgdboc=ttyS1 kgdbwait" --tty 1'

 And see which PTY got assigned to ttyS1.

 2. Run GDB on the host:
        'gdb [vmlinuz]'

 3. Connect to the guest (from within GDB):
        'target remote /dev/pts/X'

 4. Start debugging! (enter 'continue' to continue boot).

 Cc: David Evensky even...@dancer.ca.sandia.gov
 Signed-off-by: Sasha Levin levinsasha...@gmail.com

Neat! Would a tools/kvm/Documentation/debugging.txt be helpful for
people who want to do kernel debugging with kvmtool?


Re: [PATCH 1/2] KVM: emulate lapic tsc deadline timer for guest

2011-09-15 Thread Marcelo Tosatti

On Thu, Sep 15, 2011 at 04:17:20PM +0800, Liu, Jinsong wrote:
 Marcelo Tosatti wrote:
  +  } else if (apic_lvtt_tscdeadline(apic)) {
  +  /* lapic timer in tsc deadline mode */
  +  u64 guest_tsc, guest_tsc_delta, ns = 0;
  +  struct kvm_vcpu *vcpu = apic->vcpu;
  +  unsigned long this_tsc_khz = vcpu_tsc_khz(vcpu);
  +  unsigned long flags;
  +
  +  if (unlikely(!apic->lapic_timer.tscdeadline || !this_tsc_khz))
  +  return;
  +
  +  local_irq_save(flags);
  +
  +  now = apic->lapic_timer.timer.base->get_time();
  +  kvm_get_msr(vcpu, MSR_IA32_TSC, &guest_tsc);
  
  Use kvm_x86_ops->read_l1_tsc(vcpu) instead of direct MSR read
  (to avoid reading L2 guest TSC in case of nested virt).
  
  +  guest_tsc_delta = apic->lapic_timer.tscdeadline - guest_tsc;
  
  if (guest_tsc >= tscdeadline), the timer should start immediately.
  
 
 Yes, in that case the timer does start immediately, with ns = 0

No, guest_tsc_delta is unsigned, so the < 0 comparison fails.
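
The wraparound can be shown with a small standalone sketch (plain C, not the KVM code; the helper name and the kHz-to-nanosecond conversion are assumptions made for illustration). Because the delta is an unsigned 64-bit value, subtracting a TSC reading that is already past the deadline wraps to a huge positive number, so the comparison has to happen before the subtraction:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: convert the distance between
 * the current guest TSC and the programmed deadline into nanoseconds.
 * Returns 0 when the deadline has already passed (the timer should fire
 * immediately) or when the TSC frequency is unknown.
 */
static uint64_t tsc_deadline_to_ns(uint64_t tscdeadline, uint64_t guest_tsc,
                                   uint64_t tsc_khz)
{
    uint64_t delta;

    if (tsc_khz == 0)
        return 0;

    /* Compare first: tscdeadline - guest_tsc would wrap if guest_tsc is larger. */
    if (guest_tsc >= tscdeadline)
        return 0;

    delta = tscdeadline - guest_tsc;

    /* ticks / kHz = milliseconds; split the division to limit overflow. */
    return (delta / tsc_khz) * 1000000ULL +
           (delta % tsc_khz) * 1000000ULL / tsc_khz;
}
```

For example, with a 3 GHz guest TSC (tsc_khz = 3000000) a deadline 3000000 ticks ahead corresponds to one millisecond.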


Re: [PATCH] kvm tools: Allow remapping guest TTY into host PTS

2011-09-15 Thread Sasha Levin

On Thu, 2011-09-15 at 12:32 +0300, Pekka Enberg wrote:
 On Thu, Sep 15, 2011 at 11:53 AM, Sasha Levin levinsasha...@gmail.com wrote:
  This patch adds the '--tty' option to 'kvm run' which allows the user to
  remap a guest TTY into a PTS on the host.
 
  Usage:
 'kvm run --tty [id] [other options]'
 
  The tty will be mapped to a pts and will be printed on the screen:
 '  Info: Assigned terminal 1 to pty /dev/pts/X'
 
  At this point, it is possible to communicate with the guest using that pty.
 
  This is useful for debugging guest kernel using KGDB:
 
  1. Run the guest:
  'kvm run -k [vmlinuz] -p "kgdboc=ttyS1 kgdbwait" --tty 1'
 
  And see which PTY got assigned to ttyS1.
 
  2. Run GDB on the host:
 'gdb [vmlinuz]'
 
  3. Connect to the guest (from within GDB):
  'target remote /dev/pts/X'
 
  4. Start debugging! (enter 'continue' to continue boot).
 
  Cc: David Evensky even...@dancer.ca.sandia.gov
  Signed-off-by: Sasha Levin levinsasha...@gmail.com
 
 Neat! Would a tools/kvm/Documentation/debugging.txt be helpful for
 people who want to do kernel debugging with kvmtool?

I'll write a basic doc with the details provided above.

David, does this patch allow you to properly debug guest kernels? If
so, could you mail back any issues or hacks you had to do to set it up
so I could add it to the doc and move it into 'Documentation/'?

-- 

Sasha.


RE: [PATCH 02/10] Driver core: Add iommu_ops to bus_type

2011-09-15 Thread Sethi Varun-B16395

 -Original Message-
 From: Roedel, Joerg [mailto:joerg.roe...@amd.com]
 Sent: Monday, September 12, 2011 6:06 PM
 To: Sethi Varun-B16395
 Cc: Joerg Roedel; Greg KH; io...@lists.linux-foundation.org; Alex
 Williamson; Ohad Ben-Cohen; David Woodhouse; David Brown;
 kvm@vger.kernel.org; linux-ker...@vger.kernel.org
 Subject: Re: [PATCH 02/10] Driver core: Add iommu_ops to bus_type

 On Mon, Sep 12, 2011 at 08:08:41AM -0400, Sethi Varun-B16395 wrote:
   The IOMMUs are usually devices on the bus itself, so they are
   initialized after the bus is set up and the devices on it are
   populated.  So the function can not be called on bus initialization
   because the IOMMU is not ready at this point.
  Well, at what point would the add_device_group (referring to patch set
 posted by Alex) call back be invoked?

 The details are up to Alex Williamson. One option is to register a
 notifier for the bus in the iommu_bus_init() function and react to its
 notifications.
 I think in the end we will have a number of additional call-backs in the
 iommu_ops which are called by the notifier (or from the driver-core
 directly) to handle actions like added or removed devices. All the
 infrastructure for that which is implemented in the iommu-drivers today
 will then be in the iommu-core code.
I am not sure I understand this. As per your earlier statement, the IOMMU is
a device on the bus, and its initialization happens when the bus is set up and
the devices are populated. So, when would the device notifier invoke an IOMMU
callback?

-Varun


Re: [PATCH 1/2] KVM: emulate lapic tsc deadline timer for guest

2011-09-15 Thread Marcelo Tosatti

On Thu, Sep 15, 2011 at 02:22:58PM +0800, Liu, Jinsong wrote:
 Marcelo Tosatti wrote:
  diff --git a/arch/x86/include/asm/apicdef.h
  b/arch/x86/include/asm/apicdef.h 
  index 34595d5..3925d80 100644
  --- a/arch/x86/include/asm/apicdef.h
  +++ b/arch/x86/include/asm/apicdef.h
  @@ -100,7 +100,9 @@
   #define   APIC_TIMER_BASE_CLKIN   0x0
   #define   APIC_TIMER_BASE_TMBASE  0x1
   #define   APIC_TIMER_BASE_DIV 0x2
  +#define   APIC_LVT_TIMER_ONESHOT  (0 << 17)
   #define   APIC_LVT_TIMER_PERIODIC (1 << 17)
  +#define   APIC_LVT_TIMER_TSCDEADLINE  (2 << 17)
   #define   APIC_LVT_MASKED (1 << 16)
   #define   APIC_LVT_LEVEL_TRIGGER  (1 << 15)
   #define   APIC_LVT_REMOTE_IRR (1 << 14)
  
  Please have a separate, introductory patch for definitions that are
  not KVM specific.
  
 
 OK, will present a separate patch. BTW, will the separate patch still be sent
 to kvm@vger.kernel.org?

Yes.

 
  +++ b/arch/x86/include/asm/kvm_host.h
  @@ -671,6 +671,8 @@ u8 kvm_get_guest_memory_type(struct kvm_vcpu
  *vcpu, gfn_t gfn); 
  
   extern bool tdp_enabled;
  
  +extern u64 vcpu_tsc_khz(struct kvm_vcpu *vcpu);
  +
  
  No need for extern.
  
 
 Any special concern, or is it for coding style? A little curious :)

It is not necessary.


[PATCH] [dhcp] Use random transaction ID to associate messages

2011-09-15 Thread Amos Kong

RFC2131.txt:
xid 4  Transaction ID, a random number chosen by the
   client, used by the client and server to associate
   messages and responses between a client and a
   server.
The 'xid' field is used by the client to match incoming DHCP messages
with pending requests.  A DHCP client MUST choose 'xid's in such a
way as to minimize the chance of using an 'xid' identical to one used
by another client. For example, a client may choose a different,
random initial 'xid' each time the client is rebooted, and
subsequently use sequential 'xid's until the next reboot.  Selecting
a new 'xid' for each retransmission is an implementation decision.  A
client may choose to reuse the same 'xid' or select a new 'xid' for
each retransmitted message.

This patch generates a random ID when DHCP starts and records it in the
netdev struct.

Signed-off-by: Amos Kong ak...@redhat.com
CC: Eduardo Habkost ehabk...@redhat.com
CC: Marty Connor m...@etherboot.org
---
 src/include/gpxe/netdevice.h |3 +++
 src/net/udp/dhcp.c   |   23 ---
 2 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/src/include/gpxe/netdevice.h b/src/include/gpxe/netdevice.h
index 97bf168..7272cf8 100644
--- a/src/include/gpxe/netdevice.h
+++ b/src/include/gpxe/netdevice.h
@@ -294,6 +294,9 @@ struct net_device {
/** Link-layer broadcast address */
const uint8_t *ll_broadcast;
 
+   /* DHCP Transaction ID */
+   uint32_t xid;
+
/** Current device state
 *
 * This is the bitwise-OR of zero or more NETDEV_XXX constants.
diff --git a/src/net/udp/dhcp.c b/src/net/udp/dhcp.c
index 4bfcb80..51b7150 100644
--- a/src/net/udp/dhcp.c
+++ b/src/net/udp/dhcp.c
@@ -136,23 +136,6 @@ static inline const char * dhcp_msgtype_name ( unsigned 
int msgtype ) {
}
 }
 
-/**
- * Calculate DHCP transaction ID for a network device
- *
- * @v netdev   Network device
- * @ret xidDHCP XID
- *
- * Extract the least significant bits of the hardware address for use
- * as the transaction ID.
- */
-static uint32_t dhcp_xid ( struct net_device *netdev ) {
-   uint32_t xid;
-
-   memcpy ( &xid, ( netdev->ll_addr + netdev->ll_protocol->ll_addr_len
-            - sizeof ( xid ) ), sizeof ( xid ) );
-   return xid;
-}
-
 /
  *
  * DHCP session
@@ -1070,7 +1053,7 @@ int dhcp_create_packet ( struct dhcp_packet *dhcppkt,
 
    /* Initialise DHCP packet content */
    memset ( dhcphdr, 0, max_len );
-   dhcphdr->xid = dhcp_xid ( netdev );
+   dhcphdr->xid = netdev->xid;
    dhcphdr->magic = htonl ( DHCP_MAGIC_COOKIE );
    dhcphdr->htype = ntohs ( netdev->ll_protocol->ll_proto );
    dhcphdr->op = dhcp_op[msgtype];
@@ -1313,7 +1296,8 @@ static int dhcp_deliver_iob ( struct xfer_interface *xfer,
            &server_id, sizeof ( server_id ) );
 
    /* Check for matching transaction ID */
-   if ( dhcphdr->xid != dhcp_xid ( dhcp->netdev ) ) {
+   if ( dhcphdr->xid != dhcp->netdev->xid ) {
+
        DBGC ( dhcp, "DHCP %p %s from %s:%d has bad transaction "
               "ID\n", dhcp, dhcp_msgtype_name ( msgtype ),
               inet_ntoa ( peer->sin_addr ),
@@ -1442,6 +1426,7 @@ int start_dhcp ( struct job_interface *job, struct net_device *netdev ) {
    dhcp = zalloc ( sizeof ( *dhcp ) );
    if ( ! dhcp )
        return -ENOMEM;
+   netdev->xid = random();
    ref_init ( &dhcp->refcnt, dhcp_free );
    job_init ( &dhcp->job, &dhcp_job_operations, &dhcp->refcnt );
    xfer_init ( &dhcp->xfer, &dhcp_xfer_operations, &dhcp->refcnt );

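For readers following the thread, the two ways of choosing an xid discussed
around this patch can be sketched in plain C outside gPXE. This is only an
illustration under stated assumptions: the helper names xid_from_ll_addr and
xid_random are hypothetical, and the standard rand() stands in for gPXE's own
random().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: take the last four bytes of the link-layer
 * address as the xid, mirroring what the removed dhcp_xid() did.
 */
static uint32_t xid_from_ll_addr(const uint8_t *ll_addr, size_t ll_addr_len)
{
	uint32_t xid;

	assert(ll_addr_len >= sizeof(xid));
	memcpy(&xid, ll_addr + ll_addr_len - sizeof(xid), sizeof(xid));
	return xid;
}

/*
 * Hypothetical helper: pick a fresh xid once per DHCP session.
 * rand() is used here only as a stand-in for gPXE's random(); the
 * caller is assumed to have seeded the generator.
 */
static uint32_t xid_random(void)
{
	return (uint32_t)rand();
}
```

Two interfaces that share a MAC address always get the same value from
xid_from_ll_addr(), while xid_random() gives each session its own value with
high probability but does not survive a reboot.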

Re: [PATCH v3 05/11] KVM: MMU: do not mark accessed bit on pte write path

2011-09-15 Thread Marcelo Tosatti

On Wed, Sep 14, 2011 at 12:55:09PM +0300, Avi Kivity wrote:
 On 09/13/2011 09:29 PM, Xiao Guangrong wrote:
 On 09/13/2011 06:53 PM, Avi Kivity wrote:
   On 08/30/2011 05:35 AM, Xiao Guangrong wrote:
   In current code, the accessed bit is always set when page fault occurred,
   do not need to set it on pte write path
 
   What about speculative sptes that are then only accessed via emulation?
 
 
 The gfn is read and written only via emulation? I think this case is very
 very rare?
 
 Probably...

The access information will be transferred via the host pte, via
get_user_pages, to MM layer, in that case.

 Marcelo? Can you think of another case where spte.accessed is needed?

No, an spte updated via emulation will either be accessed directly or, if
accessed via emulation, have the access to the gfn it points to transferred
via the host pte.


Re: [PATCH 02/10] Driver core: Add iommu_ops to bus_type

2011-09-15 Thread Roedel, Joerg

On Thu, Sep 15, 2011 at 08:45:35AM -0400, Sethi Varun-B16395 wrote:
  From: Roedel, Joerg [mailto:joerg.roe...@amd.com]
  The details are up to Alex Williamson. One option is to register a
  notifier for the bus in the iommu_bus_init() function and react to its
  notifications.
  I think in the end we will have a number of additional call-backs in the
  iommu_ops which are called by the notifier (or from the driver-core
  directly) to handle actions like added or removed devices. All the
  infrastructure for that which is implemented in the iommu-drivers today
  will then be in the iommu-core code.

 I am not sure If I understand this, but as per your earlier statement
 iommu is a device on the bus and its initialization would happen when
 bus is set up and devices are populated. So, when would device
 notifier call an iommu call back?

This is done in the iommu_bus_init() function. It will iterate over all
devices that are already on the bus and do the iommu specific
initialization on them. For devices added or removed later, the notifier
will do the job.

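The two phases described above (walk the devices already on the bus at init
time, then let a notifier handle later additions and removals) can be modelled
in a small userspace sketch. Everything here is hypothetical: struct bus,
struct device and the function names are simplified stand-ins, not the
driver-core or IOMMU API.

```c
#include <assert.h>
#include <stddef.h>

enum bus_event { BUS_ADD_DEVICE, BUS_DEL_DEVICE };	/* hypothetical events */

struct device {
	const char *name;
	int iommu_ready;	/* set once IOMMU-specific setup has run */
};

#define MAX_DEVICES 8

struct bus {
	struct device *devices[MAX_DEVICES];
	size_t ndevices;
	/* Called for devices added or removed after init; NULL until then. */
	void (*notifier)(struct device *dev, enum bus_event ev);
};

static void iommu_setup_device(struct device *dev) { dev->iommu_ready = 1; }
static void iommu_teardown_device(struct device *dev) { dev->iommu_ready = 0; }

static void iommu_bus_notifier(struct device *dev, enum bus_event ev)
{
	if (ev == BUS_ADD_DEVICE)
		iommu_setup_device(dev);
	else
		iommu_teardown_device(dev);
}

/* Stand-in for iommu_bus_init(): handle existing devices, then hook in. */
static void iommu_bus_init_sketch(struct bus *bus)
{
	size_t i;

	for (i = 0; i < bus->ndevices; i++)
		iommu_setup_device(bus->devices[i]);
	bus->notifier = iommu_bus_notifier;
}

/* Stand-in for the driver core adding a device to the bus. */
static void bus_add_device(struct bus *bus, struct device *dev)
{
	assert(bus->ndevices < MAX_DEVICES);
	bus->devices[bus->ndevices++] = dev;
	if (bus->notifier)
		bus->notifier(dev, BUS_ADD_DEVICE);
}

/* Stand-in for the driver core removing a device from the bus. */
static void bus_remove_device(struct bus *bus, struct device *dev)
{
	size_t i;

	for (i = 0; i < bus->ndevices; i++) {
		if (bus->devices[i] != dev)
			continue;
		bus->devices[i] = bus->devices[--bus->ndevices];
		if (bus->notifier)
			bus->notifier(dev, BUS_DEL_DEVICE);
		return;
	}
}
```

A device added before iommu_bus_init_sketch() is picked up by the initial
walk; one added afterwards is set up through the notifier.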
Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632


Re: [PATCH 0/4] Avoid soft lockup message when KVM is stopped by host

2011-09-15 Thread Marcelo Tosatti

On Tue, Sep 13, 2011 at 04:49:55PM -0400, Eric B Munson wrote:
 On Fri, 09 Sep 2011, Marcelo Tosatti wrote:
 
  On Thu, Sep 01, 2011 at 02:27:49PM -0600, emun...@mgebm.net wrote:
   On Thu, 01 Sep 2011 14:24:12 -0500, Anthony Liguori wrote:
   On 08/30/2011 07:26 AM, Marcelo Tosatti wrote:
   On Mon, Aug 29, 2011 at 05:27:11PM -0600, Eric B Munson wrote:
   Currently, when qemu stops a guest kernel that guest will
   issue a soft lockup
   message when it resumes.  This set provides the ability for
   qemu to comminucate
   to the guest that it has been stopped.  When the guest hits
   the watchdog on
   resume it will check if it was suspended before issuing the
   warning.
   
   Eric B Munson (4):
  Add flag to indicate that a vm was stopped by the host
  Add functions to check if the host has stopped the vm
  Add generic stubs for kvm stop check functions
  Add check for suspended vm in softlockup detector
   
 arch/x86/include/asm/pvclock-abi.h |1 +
 arch/x86/include/asm/pvclock.h |2 ++
 arch/x86/kernel/kvmclock.c |   14 ++
 include/asm-generic/pvclock.h  |   14 ++
 kernel/watchdog.c  |   12 
 5 files changed, 43 insertions(+), 0 deletions(-)
 create mode 100644 include/asm-generic/pvclock.h
   
   --
   1.7.4.1
   
   How is the host supposed to set this flag?
   
   As mentioned previously, if you save save/restore the offset
   added to
   kvmclock on stop/cont (and the TSC MSR, forgot to mention that), no
   paravirt infrastructure is required. Which means the issue is
   also fixed
   for older guests.
   
 
 Marcelo,
 
 I think that stopping the TSC is the wrong approach because it will break time
 between the two systems so timething that expects the monotonic clock to move
 consistently will be wrong.

In case the VM stops for whatever reason, the host system is not
supposed to adjust time related hardware state to compensate, in an
attempt to present apparent continuous time.

If you save a VM and then restore it later, it is the guest's
responsibility to adjust its time representation.

QEMU exposing continuous TSC and kvmclock state between stop and
cont should not be a reason to introduce new paravirt infrastructure.

  IMO, messing with the TSC at run time to avoid a watchdog message
 is the wrong solution, better to teach the watchdog to ignore this
 special case.

OK then, it is not a harmful addition. Can you post the QEMU patches to
set the ignore-watchdog bit?

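As a rough illustration of the approach under discussion (the host sets a
flag when it has stopped the VM, and the guest watchdog consults it before
warning), here is a minimal userspace model. All names, the flag bit and the
shared structure are hypothetical; the actual patches are not reproduced in
this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GUEST_STOPPED_FLAG (1u << 0)	/* hypothetical flag bit */

struct shared_time_info {		/* hypothetical host/guest shared data */
	uint8_t flags;
};

/* Host side: record that the guest was stopped (e.g. on stop/cont). */
static void host_mark_guest_stopped(struct shared_time_info *ti)
{
	ti->flags |= GUEST_STOPPED_FLAG;
}

/* Guest side: report whether the host stopped us, clearing the flag. */
static bool guest_check_and_clear_stopped(struct shared_time_info *ti)
{
	bool stopped = (ti->flags & GUEST_STOPPED_FLAG) != 0;

	ti->flags &= (uint8_t)~GUEST_STOPPED_FLAG;
	return stopped;
}

/*
 * Watchdog decision: warn only if the threshold was exceeded and the
 * gap is not explained by the host having stopped the guest.
 */
static bool softlockup_should_warn(struct shared_time_info *ti, uint64_t now,
				   uint64_t last_touch, uint64_t threshold)
{
	assert(now >= last_touch);
	if (now - last_touch <= threshold)
		return false;
	if (guest_check_and_clear_stopped(ti))
		return false;
	return true;
}
```

The flag is consumed by the first check after a resume, so a genuine lockup
that follows later is still reported.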

Re: [PATCH] [dhcp] Use random transaction ID to associate messages

2011-09-15 Thread Hagen Paul Pfeifer


On Thu, 15 Sep 2011 21:00:38 +0800, Amos Kong ak...@redhat.com wrote:

 + netdev->xid = random();

This will not work for reboots. The decision to use the hardware address
was not accidental. Not sure if some DHCP servers will count on the ID.
(RFC 2131: "Retain DHCP client configuration across server reboots, and,
whenever possible, a DHCP client should be assigned the same configuration
parameters despite restarts of the DHCP mechanism".) If not, I am fine
with the patch.

Hagen



Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode

2011-09-15 Thread Roopa Prabhu



The netlink patch is still in the works. I will post the patches after I
clean it up a bit and also accommodate or find answers to most questions
discussed for non-passthru case. I thought I would post the netlink interface
here to see if anyone has any early comments. I have
rtnl_link_ops->set_rx_filter defined.

[IFLA_RX_FILTER] = {
    [IFLA_ADDRESS_FILTER] = {
        [IFLA_ADDRESS_FILTER_FLAGS]
        [IFLA_ADDRESS_LIST] = {
            [IFLA_ADDRESS_LIST_ENTRY]
        }
    }
    [IFLA_VLAN_FILTER] = {
        [IFLA_VLAN_LIST] = {
            [IFLA_VLAN]
        }
    }
}

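To make the nesting concrete, here is a userspace sketch that lays out the
address-filter half of the structure above as nested type-length-value
attributes, modelled on the netlink attribute layout (16-bit length, 16-bit
type, payload padded to 4 bytes). It is only a sketch: the attribute numbers
and helper names are hypothetical, and kernel code would use the real
helpers (nla_nest_start(), nla_put(), nla_nest_end()) on an skb instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ATTR_HDRLEN 4u
#define ATTR_ALIGN(len) (((len) + 3u) & ~3u)

/* Hypothetical attribute numbers; the real values were not posted. */
enum {
	X_RX_FILTER = 1,
	X_ADDRESS_FILTER,
	X_ADDRESS_FILTER_FLAGS,
	X_ADDRESS_LIST,
	X_ADDRESS_LIST_ENTRY,
};

struct attr_buf {
	uint8_t data[256];
	size_t len;
};

/* Append one attribute (header plus payload); return the header offset. */
static size_t attr_put(struct attr_buf *b, uint16_t type,
		       const void *payload, uint16_t plen)
{
	size_t off = b->len;
	uint16_t alen = (uint16_t)(ATTR_HDRLEN + plen);

	assert(off + ATTR_ALIGN(alen) <= sizeof(b->data));
	memset(b->data + off, 0, ATTR_ALIGN(alen));
	memcpy(b->data + off, &alen, sizeof(alen));
	memcpy(b->data + off + 2, &type, sizeof(type));
	if (plen)
		memcpy(b->data + off + ATTR_HDRLEN, payload, plen);
	b->len = off + ATTR_ALIGN(alen);
	return off;
}

/* Open a nested attribute; its length is fixed up by attr_nest_end(). */
static size_t attr_nest_start(struct attr_buf *b, uint16_t type)
{
	return attr_put(b, type, NULL, 0);
}

/* Close a nested attribute: its length covers everything added since. */
static void attr_nest_end(struct attr_buf *b, size_t start)
{
	uint16_t alen = (uint16_t)(b->len - start);

	memcpy(b->data + start, &alen, sizeof(alen));
}

/* Read back the length field of the attribute at a given offset. */
static uint16_t attr_len_at(const struct attr_buf *b, size_t off)
{
	uint16_t alen;

	memcpy(&alen, b->data + off, sizeof(alen));
	return alen;
}

/* RX_FILTER { ADDRESS_FILTER { FLAGS, ADDRESS_LIST { ENTRY, ... } } } */
static void build_rx_filter(struct attr_buf *b, uint32_t flags,
			    const uint8_t (*macs)[6], size_t nmacs)
{
	size_t rx, af, al, i;

	rx = attr_nest_start(b, X_RX_FILTER);
	af = attr_nest_start(b, X_ADDRESS_FILTER);
	attr_put(b, X_ADDRESS_FILTER_FLAGS, &flags, sizeof(flags));
	al = attr_nest_start(b, X_ADDRESS_LIST);
	for (i = 0; i < nmacs; i++)
		attr_put(b, X_ADDRESS_LIST_ENTRY, macs[i], 6);
	attr_nest_end(b, al);
	attr_nest_end(b, af);
	attr_nest_end(b, rx);
}
```

A VLAN filter would be appended the same way as a sibling nest inside the
outer RX_FILTER attribute.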
Some open questions:
- The VLAN filter above shows a VLAN list. It could also be a bitmap, or
the interface could provide both a bitmap and a VLAN list for more
flexibility, like the below:

[IFLA_RX_FILTER] = {
    [IFLA_ADDRESS_FILTER] = {
        [IFLA_ADDRESS_FILTER_FLAGS]
        [IFLA_ADDRESS_LIST] = {
            [IFLA_ADDRESS_LIST_ENTRY]
        }
    }
    [IFLA_VLAN_FILTER] = {
        [IFLA_VLAN_BITMAP]
        [IFLA_VLAN_LIST] = {
            [IFLA_VLAN]
        }
    }
}

- Do you see any advantage in keeping the unicast and multicast address lists
separate? Something like the below:
[IFLA_RX_FILTER] = {
    [IFLA_ADDRESS_FILTER_FLAGS]
    [IFLA_UC_ADDRESS_FILTER] = {
        [IFLA_ADDRESS_LIST] = {
            [IFLA_ADDRESS_LIST_ENTRY]
        }
    }
    [IFLA_MC_ADDRESS_FILTER] = {
        [IFLA_ADDRESS_LIST] = {
            [IFLA_ADDRESS_LIST_ENTRY]
        }
    }
    [IFLA_VLAN_FILTER] = {
        [IFLA_VLAN_LIST] = {
            [IFLA_VLAN]
        }
    }
}

- Is there any need to keep the address and VLAN filters separate, and have
two rtnl_link_ops, set_rx_address_filter and set_rx_vlan_filter? I don't see
one.

[IFLA_RX_ADDRESS_FILTER] = {
    [IFLA_ADDRESS_FILTER_FLAGS]
    [IFLA_ADDRESS_LIST] = {
        [IFLA_ADDRESS_LIST_ENTRY]
    }
}
[IFLA_RX_VLAN_FILTER] = {
    [IFLA_VLAN_LIST] = {
        [IFLA_VLAN]
    }
}


Thanks,
Roopa



On 9/12/11 10:02 AM, Roopa Prabhu ropra...@cisco.com wrote:

 
 
 
 On 9/11/11 12:03 PM, Michael S. Tsirkin m...@redhat.com wrote:
 
 On Sun, Sep 11, 2011 at 06:18:01AM -0700, Roopa Prabhu wrote:
 
 
 
 On 9/11/11 2:44 AM, Michael S. Tsirkin m...@redhat.com wrote:
 
 
 Yes, but what I mean is, if the size of the single filter table
 is limited, we need to decide how many addresses is
 each guest allowed. If we let one guest ask for
 as many as it wants, it can lock others out.
 
 Yes true. In these cases ie when the number of unicast addresses being
 registered is more than it can handle, The VF driver will put the VF  in
 promiscuous mode (Or at least its supposed to do. I think all drivers do
 that).
 
 
 Thanks,
 Roopa
 
 Right, so that works at least but likely performs worse
 than a hardware filter. So we better allocate it in
 some fair way, as a minimum. Maybe a way for
 the admin to control that allocation is useful.
 
 Yes I think we will have to do something like that. There is a maximum that hw
 can support. Might need to consider that too. But there is no interface to get
 that today. I think the virtualization case gets a little trickier. Virtio-net
 allows upto 64 unicast addresses. But the lowerdev may allow only upto say 10
 unicast addresses (I think intel supports 10 unicast addresses on the VF). Am
 not sure if there is a good way to notify the guest of blocked addresses.
 Maybe putting the lower dev in promiscuous mode could be a policy decision too
 in this case. 
 
 One other thing, I had indicated that I will look up details on opening my
 patch for non-passthru to enable hw filtering (without adding filtering
 support in macvlan right away. Ie phase1). Turns out in current code in
 macvlan_handle_frame, for non-passthru case, it does not fwd unicast pkts
 destined to macs other than the ones in macvlan hash. So a filter or hash
 lookup there for additional unicast addresses needs to be definitely added for
 non-passthru.
 
 Thanks,
 Roopa
 
 
  


Re: [PATCH] [dhcp] Use random transaction ID to associate messages

2011-09-15 Thread Yaniv Kaul

- Original Message -
 On Thu, 15 Sep 2011 21:00:38 +0800, Amos Kong ak...@redhat.com
 wrote:
 
  + netdev-xid = random();
 
 This will not work for reboots. The decision that the hardware address
 is
 choosen was not accidental. Not sure if some DHCP server will count on
 the
 ID. (RFC 2131 Retain DHCP client configuration across server reboots,
 and,
 whenever possible, a DHCP client should be assigned the same
 configuration
 parameters despite restarts of the DHCP mechanism). If not so I am
 fine
 with the patch.
 
 Hagen

But a DHCP client should be identified by its MAC, not the xid.
Y.

 

Re: [PATCH] [dhcp] Use random transaction ID to associate messages

2011-09-15 Thread Sasha Levin

On Thu, 2011-09-15 at 10:13 -0400, Yaniv Kaul wrote:
 - Original Message -
  On Thu, 15 Sep 2011 21:00:38 +0800, Amos Kong ak...@redhat.com
  wrote:
  
   + netdev-xid = random();
  
  This will not work for reboots. The decision that the hardware address
  is
  choosen was not accidental. Not sure if some DHCP server will count on
  the
  ID. (RFC 2131 Retain DHCP client configuration across server reboots,
  and,
  whenever possible, a DHCP client should be assigned the same
  configuration
  parameters despite restarts of the DHCP mechanism). If not so I am
  fine
  with the patch.
  
  Hagen
 
 But a DHCP client should be identified by its MAC, not the xid.
 Y.

The DHCP server may not be aware of the MAC address.

-- 

Sasha.


[PATCH kvm-unit-tests] apic: test simultaneous NMIs

2011-09-15 Thread Avi Kivity

If multiple NMIs occur simultaneously, the first is handled while
the others are collapsed and queued.  But the current implementation
may collapse all NMIs into the first if timing is bad.

Signed-off-by: Avi Kivity a...@redhat.com
---
 x86/apic.c |   75 
 1 files changed, 75 insertions(+), 0 deletions(-)

diff --git a/x86/apic.c b/x86/apic.c
index 1366185..c51e6a5 100644
--- a/x86/apic.c
+++ b/x86/apic.c
@@ -198,6 +198,80 @@ static void test_sti_nmi(void)
     report("nmi-after-sti", nmi_hlt_counter == 0);
 }
 
+static volatile bool nmi_done, nmi_flushed;
+static volatile int nmi_received;
+static volatile int cpu0_nmi_ctr1, cpu1_nmi_ctr1;
+static volatile int cpu0_nmi_ctr2, cpu1_nmi_ctr2;
+
+static void multiple_nmi_handler(isr_regs_t *regs)
+{
+++nmi_received;
+}
+
+static void kick_me_nmi(void *blah)
+{
+while (!nmi_done) {
+   ++cpu1_nmi_ctr1;
+   while (cpu1_nmi_ctr1 != cpu0_nmi_ctr1 && !nmi_done) {
+   pause();
+   }
+   if (nmi_done) {
+   return;
+   }
+   apic_icr_write(APIC_DEST_PHYSICAL | APIC_DM_NMI | APIC_INT_ASSERT, 0);
+   /* make sure the NMI has arrived by sending an IPI after it */
+   apic_icr_write(APIC_DEST_PHYSICAL | APIC_DM_FIXED | APIC_INT_ASSERT
+  | 0x44, 0);
+   ++cpu1_nmi_ctr2;
+   while (cpu1_nmi_ctr2 != cpu0_nmi_ctr2 && !nmi_done) {
+   pause();
+   }
+}
+}
+
+static void flush_nmi(isr_regs_t *regs)
+{
+nmi_flushed = true;
+apic_write(APIC_EOI, 0);
+}
+
+static void test_multiple_nmi(void)
+{
+int i;
+bool ok = true;
+
+    if (cpu_count() < 2) {
+   return;
+}
+
+sti();
+handle_irq(2, multiple_nmi_handler);
+handle_irq(0x44, flush_nmi);
+on_cpu_async(1, kick_me_nmi, 0);
+    for (i = 0; i < 100; ++i) {
+   nmi_flushed = false;
+   nmi_received = 0;
+   ++cpu0_nmi_ctr1;
+   while (cpu1_nmi_ctr1 != cpu0_nmi_ctr1) {
+   pause();
+   }
+   apic_icr_write(APIC_DEST_PHYSICAL | APIC_DM_NMI | APIC_INT_ASSERT, 0);
+   while (!nmi_flushed) {
+   pause();
+   }
+   if (nmi_received != 2) {
+   ok = false;
+   break;
+   }
+   ++cpu0_nmi_ctr2;
+   while (cpu1_nmi_ctr2 != cpu0_nmi_ctr2) {
+   pause();
+   }
+}
+nmi_done = true;
+    report("multiple nmi", ok);
+}
+
 int main()
 {
 setup_vm();
@@ -215,6 +289,7 @@ int main()
 test_ioapic_intr();
 test_ioapic_simultaneous();
 test_sti_nmi();
+test_multiple_nmi();
 
     printf("\nsummary: %d tests, %d failures\n", g_tests, g_fail);
 
-- 
1.7.6.3


Re: [PATCH] [dhcp] Use random transaction ID to associate messages

2011-09-15 Thread Amos Kong

- Original Message -
 - Original Message -
  On Thu, 15 Sep 2011 21:00:38 +0800, Amos Kong ak...@redhat.com
  wrote:
 
   + netdev-xid = random();
 
  This will not work for reboots. The decision that the hardware
  address
  is
  choosen was not accidental. Not sure if some DHCP server will count
  on
  the
  ID. (RFC 2131 Retain DHCP client configuration across server
  reboots,
  and,
  whenever possible, a DHCP client should be assigned the same
  configuration
  parameters despite restarts of the DHCP mechanism). If not so I am
  fine
  with the patch.

Hi Hagen,

RFC 2131 clearly describes that we need a random xid.
I don't think the xid is part of the DHCP client configuration;
it is only used to associate messages and responses between client and server.

I will post a patch to the iPXE mailing list later if that's OK.

Thanks,
Amos

 But a DHCP client should be identified by its MAC, not the xid.
 Y.

[RFC] KVM: Fix simultaneous NMIs

2011-09-15 Thread Avi Kivity

If simultaneous NMIs happen, we're supposed to queue the second
and next (collapsing them), but currently we sometimes collapse
the second into the first.

Fix by using a counter for pending NMIs instead of a bool; collapsing
happens when the NMI window reopens.

Signed-off-by: Avi Kivity a...@redhat.com
---

Not sure whether this interacts correctly with NMI-masked-by-STI or with
save/restore.

 arch/x86/include/asm/kvm_host.h |2 +-
 arch/x86/kvm/svm.c  |1 +
 arch/x86/kvm/vmx.c  |3 ++-
 arch/x86/kvm/x86.c  |   33 +++--
 arch/x86/kvm/x86.h  |7 +++
 5 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6ab4241..3a95885 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -413,7 +413,7 @@ struct kvm_vcpu_arch {
u32  tsc_catchup_mult;
s8   tsc_catchup_shift;
 
-   bool nmi_pending;
+   atomic_t nmi_pending;
bool nmi_injected;
 
struct mtrr_state_type mtrr_state;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index e7ed4b1..d4c792f 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3609,6 +3609,7 @@ static void svm_complete_interrupts(struct vcpu_svm *svm)
    if ((svm->vcpu.arch.hflags & HF_IRET_MASK)
        && kvm_rip_read(&svm->vcpu) != svm->nmi_iret_rip) {
        svm->vcpu.arch.hflags &= ~(HF_NMI_MASK | HF_IRET_MASK);
+       kvm_collapse_pending_nmis(&svm->vcpu);
        kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
    }
 
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a0d6bd9..745dadb 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4761,6 +4761,7 @@ static int handle_nmi_window(struct kvm_vcpu *vcpu)
    cpu_based_vm_exec_control &= ~CPU_BASED_VIRTUAL_NMI_PENDING;
    vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control);
    ++vcpu->stat.nmi_window_exits;
+   kvm_collapse_pending_nmis(vcpu);
    kvm_make_request(KVM_REQ_EVENT, vcpu);
 
return 1;
@@ -5790,7 +5791,7 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
        if (vmx_interrupt_allowed(vcpu)) {
            vmx->soft_vnmi_blocked = 0;
        } else if (vmx->vnmi_blocked_time > 1000000000LL &&
-                  vcpu->arch.nmi_pending) {
+                  atomic_read(&vcpu->arch.nmi_pending)) {
/*
 * This CPU don't support us in finding the end of an
 * NMI-blocked window if the guest runs with IRQs
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6b37f18..d4f45e0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -359,8 +359,8 @@ void kvm_propagate_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
 
 void kvm_inject_nmi(struct kvm_vcpu *vcpu)
 {
+   atomic_inc(&vcpu->arch.nmi_pending);
    kvm_make_request(KVM_REQ_EVENT, vcpu);
-   vcpu->arch.nmi_pending = 1;
 }
 EXPORT_SYMBOL_GPL(kvm_inject_nmi);
 
@@ -2844,7 +2844,7 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
        KVM_X86_SHADOW_INT_MOV_SS | KVM_X86_SHADOW_INT_STI);
 
    events->nmi.injected = vcpu->arch.nmi_injected;
-   events->nmi.pending = vcpu->arch.nmi_pending;
+   events->nmi.pending = atomic_read(&vcpu->arch.nmi_pending) != 0;
    events->nmi.masked = kvm_x86_ops->get_nmi_mask(vcpu);
    events->nmi.pad = 0;
 
@@ -2878,7 +2878,7 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
 
    vcpu->arch.nmi_injected = events->nmi.injected;
    if (events->flags & KVM_VCPUEVENT_VALID_NMI_PENDING)
-       vcpu->arch.nmi_pending = events->nmi.pending;
+       atomic_set(&vcpu->arch.nmi_pending, events->nmi.pending);
    kvm_x86_ops->set_nmi_mask(vcpu, events->nmi.masked);
 
    if (events->flags & KVM_VCPUEVENT_VALID_SIPI_VECTOR)
@@ -4763,7 +4763,7 @@ int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip)
    kvm_set_rflags(vcpu, ctxt->eflags);
 
    if (irq == NMI_VECTOR)
-       vcpu->arch.nmi_pending = false;
+       atomic_set(&vcpu->arch.nmi_pending, 0);
    else
        vcpu->arch.interrupt.pending = false;
 
@@ -5570,9 +5570,9 @@ static void inject_pending_event(struct kvm_vcpu *vcpu)
    }
 
    /* try to inject new event if pending */
-   if (vcpu->arch.nmi_pending) {
+   if (atomic_read(&vcpu->arch.nmi_pending)) {
        if (kvm_x86_ops->nmi_allowed(vcpu)) {
-           vcpu->arch.nmi_pending = false;
+           atomic_dec(&vcpu->arch.nmi_pending);
            vcpu->arch.nmi_injected = true;
            kvm_x86_ops->set_nmi(vcpu);
        }
@@ -5604,10 +5604,14 @@ static void kvm_put_guest_xcr0(struct kvm_vcpu *vcpu)
    }
 }
 
+static bool

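The patch text is cut off above before the body of
kvm_collapse_pending_nmis() appears, so the following is only a guess at the
intended semantics, written as a userspace model with C11 atomics rather than
kernel code. The struct and function names are hypothetical. The idea it
illustrates is the one from the description: count pending NMIs instead of
keeping a bool, and clamp the count to one when the NMI window reopens.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct vcpu_sketch {
	atomic_int nmi_pending;	/* NMIs queued but not yet injected */
	bool nmi_blocked;	/* guest is inside an NMI handler */
	int delivered;		/* NMIs the guest has actually seen */
};

static void vcpu_sketch_init(struct vcpu_sketch *v)
{
	atomic_init(&v->nmi_pending, 0);
	v->nmi_blocked = false;
	v->delivered = 0;
}

/* Any thread: queue one NMI for this vcpu. */
static void inject_nmi(struct vcpu_sketch *v)
{
	atomic_fetch_add(&v->nmi_pending, 1);
}

/* Assumed semantics: collapse everything queued into a single NMI. */
static void collapse_pending_nmis(struct vcpu_sketch *v)
{
	int cur = atomic_load(&v->nmi_pending);

	while (cur > 1 &&
	       !atomic_compare_exchange_weak(&v->nmi_pending, &cur, 1))
		;	/* a failed exchange reloads cur; retry */
}

/* vcpu thread: the handler finished, so the NMI window reopens. */
static void nmi_window_open(struct vcpu_sketch *v)
{
	v->nmi_blocked = false;
	collapse_pending_nmis(v);
}

/* vcpu thread: deliver one NMI if allowed; returns true if delivered. */
static bool try_deliver_nmi(struct vcpu_sketch *v)
{
	if (v->nmi_blocked || atomic_load(&v->nmi_pending) == 0)
		return false;
	atomic_fetch_sub(&v->nmi_pending, 1);
	v->nmi_blocked = true;
	v->delivered++;
	return true;
}
```

With this model, an NMI that arrives while the first one is being handled is
kept and delivered once the window reopens, while any further ones that
arrive in the same window are merged into it.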
Re: [PATCH] [dhcp] Use random transaction ID to associate messages

2011-09-15 Thread Sasha Levin

On Thu, 2011-09-15 at 10:43 -0400, Amos Kong wrote:
 - Original Message -
  - Original Message -
   On Thu, 15 Sep 2011 21:00:38 +0800, Amos Kong ak...@redhat.com
   wrote:
  
+ netdev-xid = random();
  
   This will not work for reboots. The decision that the hardware
   address
   is
   choosen was not accidental. Not sure if some DHCP server will count
   on
   the
   ID. (RFC 2131 Retain DHCP client configuration across server
   reboots,
   and,
   whenever possible, a DHCP client should be assigned the same
   configuration
   parameters despite restarts of the DHCP mechanism). If not so I am
   fine
   with the patch.
 
 Hi Hagen,
 
 rfc2131 clearly describes that we need a random xid,
 I don't think xid is a port of DHCP client configuration,
 it only be used to associate messages and responses between client and server.
 
 I would post a patch to ipxe maillist later if it's ok.

RFC 2131 only requires that "A DHCP client MUST choose 'xid's in such a
way as to minimize the chance of using an 'xid' identical to one used
by another client."

The 'random xid' suggestion is listed merely as an example.

The way I see it, using an xid based on the MAC instead of a random number is
safer, since the odds of the same MAC appearing twice on the same network are
pretty slim: it would cause problems on other layers in the network.

What's the reason behind this patch? What's wrong with the current selection
of xid?

-- 

Sasha.


Re: [PATCH] [dhcp] Use random transaction ID to associate messages

2011-09-15 Thread Amos Kong

Whole Archive: http://marc.info/?l=kvm&m=131609166918121&w=2

- Original Message -
 On Thu, 2011-09-15 at 10:43 -0400, Amos Kong wrote:
  - Original Message -
   - Original Message -
On Thu, 15 Sep 2011 21:00:38 +0800, Amos Kong ak...@redhat.com
wrote:
   
 + netdev-xid = random();
   
This will not work for reboots. The decision that the hardware
address
is
choosen was not accidental. Not sure if some DHCP server will
count
on
the
ID. (RFC 2131 Retain DHCP client configuration across server
reboots,
and,
whenever possible, a DHCP client should be assigned the same
configuration
parameters despite restarts of the DHCP mechanism). If not so I
am
fine
with the patch.
 
  Hi Hagen,
 
  rfc2131 clearly describes that we need a random xid,
  I don't think xid is a port of DHCP client configuration,
  it only be used to associate messages and responses between client
  and server.
 
  I would post a patch to ipxe maillist later if it's ok.
 
 rfc2131 only required that A DHCP client MUST choose 'xid's in such a
 way as to minimize the chance of using an 'xid' identical to one used
 by another client..
 
 The 'random xid' suggestion is listed merely as an example.
 
 The way I see it using a xid based on MAC instead of a random number
 is
 safer since the odds for same MAC on the same network are pretty slim
 since it would cause problems on other layers in the network.

Users may boot QEMU guests without changing the default MAC address, so it is
easy to end up with duplicates.

Yaniv, what real problem did you hit? Only that it is not in accordance with
the RFC?

When I restart the host network, I capture random DHCP xids; they are not fixed.

Amos

 Whats the reason behind this patch? Whats wrong with current selection
 of xid?

Re: [RFC] KVM: Fix simultaneous NMIs

2011-09-15 Thread Jan Kiszka

On 2011-09-15 16:45, Avi Kivity wrote:
 If simultaneous NMIs happen, we're supposed to queue the second
 and next (collapsing them), but currently we sometimes collapse
 the second into the first.

Can you describe the race in a few more details here ("sometimes" sounds
like "I don't know when" :) )?

 
 Fix by using a counter for pending NMIs instead of a bool; collapsing
 happens when the NMI window reopens.
 
 Signed-off-by: Avi Kivity a...@redhat.com
 ---
 
 Not sure whether this interacts correctly with NMI-masked-by-STI or with
 save/restore.
 
  arch/x86/include/asm/kvm_host.h |2 +-
  arch/x86/kvm/svm.c  |1 +
  arch/x86/kvm/vmx.c  |3 ++-
  arch/x86/kvm/x86.c  |   33 +++--
  arch/x86/kvm/x86.h  |7 +++
  5 files changed, 26 insertions(+), 20 deletions(-)
 
 diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
 index 6ab4241..3a95885 100644
 --- a/arch/x86/include/asm/kvm_host.h
 +++ b/arch/x86/include/asm/kvm_host.h
 @@ -413,7 +413,7 @@ struct kvm_vcpu_arch {
   u32  tsc_catchup_mult;
   s8   tsc_catchup_shift;
  
 - bool nmi_pending;
 + atomic_t nmi_pending;
   bool nmi_injected;
  
   struct mtrr_state_type mtrr_state;
 diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
 index e7ed4b1..d4c792f 100644
 --- a/arch/x86/kvm/svm.c
 +++ b/arch/x86/kvm/svm.c
 @@ -3609,6 +3609,7 @@ static void svm_complete_interrupts(struct vcpu_svm *svm)
     if ((svm->vcpu.arch.hflags & HF_IRET_MASK)
         && kvm_rip_read(&svm->vcpu) != svm->nmi_iret_rip) {
         svm->vcpu.arch.hflags &= ~(HF_NMI_MASK | HF_IRET_MASK);
 +       kvm_collapse_pending_nmis(&svm->vcpu);
         kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
   }
  
 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index a0d6bd9..745dadb 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -4761,6 +4761,7 @@ static int handle_nmi_window(struct kvm_vcpu *vcpu)
     cpu_based_vm_exec_control &= ~CPU_BASED_VIRTUAL_NMI_PENDING;
     vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control);
     ++vcpu->stat.nmi_window_exits;
 +   kvm_collapse_pending_nmis(vcpu);
     kvm_make_request(KVM_REQ_EVENT, vcpu);
  
   return 1;
 @@ -5790,7 +5791,7 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
         if (vmx_interrupt_allowed(vcpu)) {
             vmx->soft_vnmi_blocked = 0;
         } else if (vmx->vnmi_blocked_time > 1000000000LL &&
 -                  vcpu->arch.nmi_pending) {
 +                  atomic_read(&vcpu->arch.nmi_pending)) {
   /*
* This CPU don't support us in finding the end of an
* NMI-blocked window if the guest runs with IRQs
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 6b37f18..d4f45e0 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -359,8 +359,8 @@ void kvm_propagate_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
  
  void kvm_inject_nmi(struct kvm_vcpu *vcpu)
  {
 +   atomic_inc(&vcpu->arch.nmi_pending);
     kvm_make_request(KVM_REQ_EVENT, vcpu);
 -   vcpu->arch.nmi_pending = 1;

Does the reordering matter? Do we need barriers?

  }
  EXPORT_SYMBOL_GPL(kvm_inject_nmi);
  
 @@ -2844,7 +2844,7 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
         KVM_X86_SHADOW_INT_MOV_SS | KVM_X86_SHADOW_INT_STI);
  
     events->nmi.injected = vcpu->arch.nmi_injected;
 -   events->nmi.pending = vcpu->arch.nmi_pending;
 +   events->nmi.pending = atomic_read(&vcpu->arch.nmi_pending) != 0;
     events->nmi.masked = kvm_x86_ops->get_nmi_mask(vcpu);
     events->nmi.pad = 0;
  
 @@ -2878,7 +2878,7 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
  
     vcpu->arch.nmi_injected = events->nmi.injected;
     if (events->flags & KVM_VCPUEVENT_VALID_NMI_PENDING)
 -       vcpu->arch.nmi_pending = events->nmi.pending;
 +       atomic_set(&vcpu->arch.nmi_pending, events->nmi.pending);
     kvm_x86_ops->set_nmi_mask(vcpu, events->nmi.masked);
  
     if (events->flags & KVM_VCPUEVENT_VALID_SIPI_VECTOR)
 @@ -4763,7 +4763,7 @@ int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip)
     kvm_set_rflags(vcpu, ctxt->eflags);
  
     if (irq == NMI_VECTOR)
 -       vcpu->arch.nmi_pending = false;
 +       atomic_set(&vcpu->arch.nmi_pending, 0);
     else
         vcpu->arch.interrupt.pending = false;
  
 @@ -5570,9 +5570,9 @@ static void inject_pending_event(struct kvm_vcpu *vcpu)
     }
  
     /* try to inject new event if pending */
 -   if (vcpu->arch.nmi_pending) {
 +   if (atomic_read(&vcpu->arch.nmi_pending)) {
         if (kvm_x86_ops->nmi_allowed(vcpu)) {
 -           vcpu->arch.nmi_pending = false;
 +           atomic_dec(&vcpu->arch.nmi_pending);

Here we lost NMIs in the past by

Re: [PATCH] kvm tools: Allow remapping guest TTY into host PTS

2011-09-15 Thread David Evensky


Sasha,

So far so good! I applied your patch to an older version of kvm-tool
that I had hacked on and it works for a simple test. So I think that I
can do some kernel hacking with kvm tool! Very cool.

I tested with the older version of kvm-tool because I am seeing a
bug with an old kernel (2.6.28.10) and the latest version of kvm-tool.
It is an old kernel, and now that I can debug more easily; hopefully I
won't require it.

In case this is worthwhile the error I'm seeing is below. While I do
have 9p compiled into my kernel, I'm not actually using it. I haven't
tried without the 9p compiled in.


$ sudo ~/.../unpatched/linux-kvm/tools/kvm/kvm run -c 1 -m 2048 -k 
./bzImage-2.6.28.10 \
 --console serial -p 'console=ttyS0 ip=192.168.122.2 ' -i ./initramfs-guest.img 
\
  -n tap --host-ip 192.168.122.1 --guest-ip 192.168.122.2 --shmem 
pci:0xc800:16m:create

...
[1.245232] Installing 9P2000 support
[1.246826] 9p: virtio: Maximum channels exceeded
[1.248674] [ cut here ]
[1.250291] kernel BUG at net/9p/trans_virtio.c:240!
[1.252018] invalid opcode:  [#1] SMP 
[1.252491] last sysfs file: 
[1.252491] Dumping ftrace buffer:
[1.252491](ftrace buffer empty)
[1.252491] CPU 0 
[1.252491] Modules linked in:
[1.252491] Pid: 1, comm: swapper Not tainted 2.6.28.10big_64 #6
[1.252491] RIP: 0010:[8057ee2f]  [8057ee2f] p9_virtio_probe+0xcf/0x120
[1.252491] RSP: 0018:88007ec2bc90  EFLAGS: 00010286
[1.252491] RAX: 0038 RBX: 0001 RCX: 
[1.252491] RDX: 807d6978 RSI: 0086 RDI: 0246
[1.252491] RBP: 88007d6ae800 R08:  R09: 
[1.252491] R10:  R11:  R12: 88007d6ae808
[1.252491] R13:  R14: 00013200 R15: 
[1.252491] FS:  () GS:809ee000() knlGS:
[1.252491] CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
[1.252491] CR2: 7f77866892c0 CR3: 00201000 CR4: 06e0
[1.252491] DR0:  DR1:  DR2: 
[1.252491] DR3:  DR6: 0ff0 DR7: 0400
[1.252491] Process swapper (pid: 1, threadinfo 88007ec2a000, task 88007ec29390)
[1.252491] Stack:
[1.252491]  88007d6ae808 804fe696 808106e0 88007d6ae800
[1.252491]  88007d6ae808 804fe8db 808106e0 
[1.252491]  88007d6ae808 8047ee3a 808106e0 88007d6ae808
[1.252491] Call Trace:
[1.252491]  [804fe696] ? add_status+0x26/0x50
[1.252491]  [804fe8db] ? virtio_dev_probe+0xab/0xf0
[1.252491]  [8047ee3a] ? driver_probe_device+0x9a/0x1b0
[1.252491]  [8047eff3] ? __driver_attach+0xa3/0xb0
[1.252491]  [8047ef50] ? __driver_attach+0x0/0xb0
[1.252491]  [8047e3a8] ? bus_for_each_dev+0x58/0x80
[1.252491]  [8047e632] ? bus_add_driver+0xb2/0x230
[1.252491]  [8047f1d7] ? driver_register+0x67/0x130
[1.252491]  [8059a5cf] ? _spin_lock+0xf/0x20
[1.252491]  [80578ce2] ? v9fs_register_trans+0x42/0x50
[1.252491]  [809897ea] ? p9_virtio_init+0x0/0x24
[1.252491]  [80209042] ? _stext+0x42/0x1c0
[1.252491]  [803e40db] ? ida_get_new_above+0x14b/0x220
[1.252491]  [802ebd32] ? kmem_cache_alloc+0x102/0x110
[1.252491]  [803e431b] ? idr_pre_get+0x4b/0x90
[1.252491]  [8059a5cf] ? _spin_lock+0xf/0x20
[1.252491]  [80342d72] ? proc_register+0x142/0x240
[1.252491]  [8095ad23] ? kernel_init+0x115/0x15d
[1.252491]  [8095ad1c] ? kernel_init+0x10e/0x15d
[1.252491]  [8022eaf9] ? child_rip+0xa/0x11
[1.252491]  [8095ac0e] ? kernel_init+0x0/0x15d
[1.252491]  [8022eaef] ? child_rip+0x0/0x11
[1.252491] Code: 68 fa e6 ff c6 83 e1 85 b1 80 00 c6 83 e0 85 b1 80 01 31 c0 48 83 c4 10 5b 5d 41 5c c3 48 c7 c7 e0 42 6e 80 31 c0 e8 0b 8b 01 00 0f 0b eb fe 48 8b 95 78 02 00 00 48 89 c7 48 89 44 24 08 ff 52 
[1.252491] RIP  [8057ee2f] p9_virtio_probe+0xcf/0x120
[1.252491]  RSP 88007ec2bc90
[1.360254] ---[ end trace 695d68cac3254cff ]---
[1.361863] Kernel panic - not syncing: Attempted to kill init!
[1.364043] Rebooting in 1 seconds..

*** Compatability Warning ***

virtio-9p device was not detected

While you have requested a virtio-9p device, the guest kernel didn't seem to detect it.
Please make sure that the kernel was compiled with CONFIG_NET_9P_VIRTIO.

  # KVM session ended normally.


For this kernel, CONFIG_NET_9P_VIRTIO is defined, but the kernel is old, so
there may be issues.

\dae

On Thu, Sep 15, 2011 at 03:28:46PM +0300,

Re: [RFC] KVM: Fix simultaneous NMIs

2011-09-15 Thread Avi Kivity


On 09/15/2011 07:01 PM, Jan Kiszka wrote:

On 2011-09-15 16:45, Avi Kivity wrote:
  If simultaneous NMIs happen, we're supposed to queue the second
  and next (collapsing them), but currently we sometimes collapse
  the second into the first.

Can you describe the race in a few more details here ("sometimes" sounds
like "I don't know when" :) )?


In this case it was "I'm in a hurry".



   void kvm_inject_nmi(struct kvm_vcpu *vcpu)
   {
  + atomic_inc(&vcpu->arch.nmi_pending);
kvm_make_request(KVM_REQ_EVENT, vcpu);
  - vcpu->arch.nmi_pending = 1;

Does the reordering matter?


I think so.  Suppose the vcpu enters just after kvm_make_request(); it 
sees KVM_REQ_EVENT and clears it, but doesn't see nmi_pending because it 
wasn't set yet.  Then comes a kick, the guest is reentered with 
nmi_pending set but KVM_REQ_EVENT clear and sails through the check and 
enters the guest.  The NMI is delayed until the next KVM_REQ_EVENT.
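
For readers following the ordering argument, here is a minimal userspace
sketch of it using C11 atomics.  It is not the KVM code: struct vcpu_model,
model_init(), model_inject_nmi() and model_enter_guest() are hypothetical
names, and req_event only stands in for KVM_REQ_EVENT.  The default
sequentially consistent atomics play the role of the barriers that the
real patch would have to spell out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two pieces of vcpu state discussed above. */
struct vcpu_model {
	atomic_int  nmi_pending;	/* number of queued NMIs */
	atomic_bool req_event;		/* stands in for KVM_REQ_EVENT */
};

static void model_init(struct vcpu_model *v)
{
	assert(v != NULL);
	atomic_init(&v->nmi_pending, 0);
	atomic_init(&v->req_event, false);
}

/*
 * Producer side: publish the NMI first, then raise the request.
 * A consumer that observes the request is then guaranteed to
 * observe the pending count as well.
 */
static void model_inject_nmi(struct vcpu_model *v)
{
	atomic_fetch_add(&v->nmi_pending, 1);
	atomic_store(&v->req_event, true);
}

/*
 * Consumer side: test-and-clear the request, then look at the
 * pending count.  Returns the number of NMIs injected (0 or 1).
 */
static int model_enter_guest(struct vcpu_model *v, bool nmi_allowed)
{
	if (!atomic_exchange(&v->req_event, false))
		return 0;

	if (nmi_allowed && atomic_load(&v->nmi_pending) > 0) {
		atomic_fetch_sub(&v->nmi_pending, 1);
		return 1;
	}
	return 0;
}
```

With the two statements in model_inject_nmi() swapped, a consumer could
clear req_event before the count is incremented, which is the delayed-NMI
window described above.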



Do we need barriers?


Yes.



  @@ -5570,9 +5570,9 @@ static void inject_pending_event(struct kvm_vcpu *vcpu)
}

/* try to inject new event if pending */
  - if (vcpu->arch.nmi_pending) {
  + if (atomic_read(&vcpu->arch.nmi_pending)) {
if (kvm_x86_ops->nmi_allowed(vcpu)) {
  - vcpu->arch.nmi_pending = false;
  + atomic_dec(&vcpu->arch.nmi_pending);

Here we lost NMIs in the past by overwriting nmi_pending while another
one was already queued, right?


One place, yes.  The other is kvm_inject_nmi() - if the first nmi didn't 
get picked up by the vcpu by the time the second nmi arrives, we lose 
the second nmi.



if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
inject_pending_event(vcpu);

/* enable NMI/IRQ window open exits if needed */
  - if (nmi_pending)
  + if (atomic_read(&vcpu->arch.nmi_pending)
  +   && nmi_in_progress(vcpu))

Is nmi_pending && !nmi_in_progress possible at all?


Yes, due to NMI-blocked-by-STI.  A really touchy area.


Is it rather a BUG
condition?


No.


If not, what will happen next?


The NMI window will open and we'll inject the NMI.  But I think we have 
a bug here - we should only kvm_collapse_nmis() if an NMI handler was 
indeed running, yet we do it unconditionally.




  +static inline void kvm_collapse_pending_nmis(struct kvm_vcpu *vcpu)
  +{
  + /* Collapse all NMIs queued while an NMI handler was running to one */
  + if (atomic_read(&vcpu->arch.nmi_pending))
  + atomic_set(&vcpu->arch.nmi_pending, 1);

Is it OK that NMIs injected after the collapse will increment this to >1
again? Or is that impossible?



It's possible and okay.  We're now completing execution of IRET.  Doing 
atomic_set() after atomic_inc() means the NMI happened before IRET 
completed, and vice versa.  Since these events are asynchronous, we're 
free to choose one or the other (a self-IPI-NMI just before the IRET 
must be swallowed, and a self-IPI-NMI just after the IRET would only be 
executed after the next time around the handler).
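
The collapse-then-increment case can be illustrated with a small userspace
sketch.  It uses C11 atomics in place of the kernel's atomic_t, and
model_collapse_pending_nmis() and model_queue_nmi() are hypothetical
stand-ins for the patch's helper and for the atomic_inc() in
kvm_inject_nmi(); it is an illustration only, not the KVM implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the atomic_inc() in kvm_inject_nmi(). */
static void model_queue_nmi(atomic_int *nmi_pending)
{
	assert(nmi_pending != NULL);
	atomic_fetch_add(nmi_pending, 1);
}

/*
 * Hypothetical stand-in for kvm_collapse_pending_nmis(): every NMI
 * queued while the handler was running collapses into a single one.
 * An empty queue stays empty.
 */
static void model_collapse_pending_nmis(atomic_int *nmi_pending)
{
	assert(nmi_pending != NULL);
	if (atomic_load(nmi_pending))
		atomic_store(nmi_pending, 1);
}
```

Queueing three NMIs and then collapsing leaves a count of 1; an NMI queued
after the collapse raises it to 2, which corresponds to the "NMI arrived
after IRET completed" ordering described above.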



--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


Re: [PATCH] kvm tools: Allow remapping guest TTY into host PTS

2011-09-15 Thread Sasha Levin

On Thu, 2011-09-15 at 09:52 -0700, David Evensky wrote:
 Sasha,
 
 So far so good! I applied your patch to an older version of kvm-tool
 that I had hacked on and it works for a simple test. So I think that I
 can do some kernel hacking with kvm tool! Very cool.
 

Awesome!

 I tested with the older version of kvm-tool because I am seeing a
 bug with an old kernel (2.6.28.10) and the latest version of kvm-tool.
 It is an old kernel, and now that I can debug more easily, hopefully I
 won't require it.
 
 In case this is worthwhile, the error I'm seeing is below. While I do
 have 9p compiled into my kernel, I'm not actually using it. I haven't
 tried without 9p compiled in.
 

[snip]

 For this kernel, CONFIG_NET_9P_VIRTIO is defined, but the kernel is old, so
 there may be issues.
 
 \dae
 

I've noticed that 9p/virtio-9p is a bit unstable in older versions, for
example: you can't use 9p rootfs with kernels older than 2.6.38 (which
isn't that old really).

-- 

Sasha.


Re: [RFC] KVM: Fix simultaneous NMIs

2011-09-15 Thread Jan Kiszka

On 2011-09-15 19:02, Avi Kivity wrote:
 On 09/15/2011 07:01 PM, Jan Kiszka wrote:
 On 2011-09-15 16:45, Avi Kivity wrote:
  If simultaneous NMIs happen, we're supposed to queue the second
  and next (collapsing them), but currently we sometimes collapse
  the second into the first.

 Can you describe the race in a few more details here ("sometimes" sounds
 like "I don't know when" :) )?
 
 In this case it was "I'm in a hurry".
 

   void kvm_inject_nmi(struct kvm_vcpu *vcpu)
   {
  +  atomic_inc(&vcpu->arch.nmi_pending);
 kvm_make_request(KVM_REQ_EVENT, vcpu);
  -  vcpu->arch.nmi_pending = 1;

 Does the reordering matter?
 
 I think so.  Suppose the vcpu enters just after kvm_make_request(); it 
 sees KVM_REQ_EVENT and clears it, but doesn't see nmi_pending because it 
 wasn't set yet.  Then comes a kick, the guest is reentered with 
 nmi_pending set but KVM_REQ_EVENT clear and sails through the check and 
 enters the guest.  The NMI is delayed until the next KVM_REQ_EVENT.

That makes sense - and the old code looks more strange now.

 
 Do we need barriers?
 
 Yes.
 

  @@ -5570,9 +5570,9 @@ static void inject_pending_event(struct kvm_vcpu *vcpu)
 }

 /* try to inject new event if pending */
  -  if (vcpu->arch.nmi_pending) {
  +  if (atomic_read(&vcpu->arch.nmi_pending)) {
 if (kvm_x86_ops->nmi_allowed(vcpu)) {
  -  vcpu->arch.nmi_pending = false;
  +  atomic_dec(&vcpu->arch.nmi_pending);

 Here we lost NMIs in the past by overwriting nmi_pending while another
 one was already queued, right?
 
 One place, yes.  The other is kvm_inject_nmi() - if the first nmi didn't 
 get picked up by the vcpu by the time the second nmi arrives, we lose 
 the second nmi.

Thinking this through again, it's actually not yet clear to me what we
are modeling here: If two NMI events arrive almost perfectly in
parallel, does the real hardware guarantee that they will always cause
two NMI events in the CPU? Then this is required.

Otherwise I just lost understanding again why we were losing NMIs here
and in kvm_inject_nmi (maybe elsewhere then?).

 
 if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
 inject_pending_event(vcpu);

 /* enable NMI/IRQ window open exits if needed */
  -  if (nmi_pending)
  +  if (atomic_read(&vcpu->arch.nmi_pending)
  +      && nmi_in_progress(vcpu))

 Is nmi_pending && !nmi_in_progress possible at all?
 
 Yes, due to NMI-blocked-by-STI.  A really touchy area.

And we don't need the window exit notification then? I don't understand
what nmi_in_progress is supposed to do here.

 
 Is it rather a BUG
 condition?
 
 No.
 
 If not, what will happen next?
 
 The NMI window will open and we'll inject the NMI.

How will we know this? We do not request the exit, that's my worry.

  But I think we have 
 a bug here - we should only kvm_collapse_nmis() if an NMI handler was 
 indeed running, yet we do it unconditionally.
 

  +static inline void kvm_collapse_pending_nmis(struct kvm_vcpu *vcpu)
  +{
  +  /* Collapse all NMIs queued while an NMI handler was running to one */
  +  if (atomic_read(&vcpu->arch.nmi_pending))
  +  atomic_set(&vcpu->arch.nmi_pending, 1);

 Is it OK that NMIs injected after the collapse will increment this to >1
 again? Or is that impossible?

 
 It's possible and okay.  We're now completing execution of IRET.  Doing 
 atomic_set() after atomic_inc() means the NMI happened before IRET 
 completed, and vice versa.  Since these events are asynchronous, we're 
 free to choose one or the other (a self-IPI-NMI just before the IRET 
 must be swallowed, and a self-IPI-NMI just after the IRET would only be 
 executed after the next time around the handler).

Need to think through this separately.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [RFC] KVM: Fix simultaneous NMIs

2011-09-15 Thread Avi Kivity


On 09/15/2011 08:25 PM, Jan Kiszka wrote:


  I think so.  Suppose the vcpu enters just after kvm_make_request(); it
  sees KVM_REQ_EVENT and clears it, but doesn't see nmi_pending because it
  wasn't set yet.  Then comes a kick, the guest is reentered with
  nmi_pending set but KVM_REQ_EVENT clear and sails through the check and
  enters the guest.  The NMI is delayed until the next KVM_REQ_EVENT.

That makes sense - and the old code looks more strange now.


I think it dates to the time all NMIs were synchronous.



/* try to inject new event if pending */
   -if (vcpu->arch.nmi_pending) {
   +if (atomic_read(&vcpu->arch.nmi_pending)) {
if (kvm_x86_ops->nmi_allowed(vcpu)) {
   -vcpu->arch.nmi_pending = false;
   +atomic_dec(&vcpu->arch.nmi_pending);

  Here we lost NMIs in the past by overwriting nmi_pending while another
  one was already queued, right?

  One place, yes.  The other is kvm_inject_nmi() - if the first nmi didn't
  get picked up by the vcpu by the time the second nmi arrives, we lose
  the second nmi.

Thinking this through again, it's actually not yet clear to me what we
are modeling here: If two NMI events arrive almost perfectly in
parallel, does the real hardware guarantee that they will always cause
two NMI events in the CPU? Then this is required.


It's not 100% clear from the SDM, but this is what I understood from 
it.  And it's needed - the NMI handlers are now being reworked to handle 
just one NMI source (hopefully the cheapest) in the handler, and if we 
detect a back-to-back NMI, handle all possible NMI sources.  This 
optimization is needed in turn so we can use Jeremy's paravirt spinlock 
framework, which requires a sleep primitive and a 
wake-up-even-if-the-sleeper-has-interrupts-disabled primitive.  I 
thought of using HLT and NMIs respectively, but that means we need a 
cheap handler (i.e. don't go reading PMU MSRs).



Otherwise I just lost understanding again why we were losing NMIs here
and in kvm_inject_nmi (maybe elsewhere then?).


Because of that.



if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
inject_pending_event(vcpu);

/* enable NMI/IRQ window open exits if needed */
   -if (nmi_pending)
   +if (atomic_read(&vcpu->arch.nmi_pending)
   +   && nmi_in_progress(vcpu))

  Is nmi_pending && !nmi_in_progress possible at all?

  Yes, due to NMI-blocked-by-STI.  A really touchy area.

And we don't need the window exit notification then? I don't understand
what nmi_in_progress is supposed to do here.


We need the window notification in both cases.  If we're recovering from 
STI, then we don't need to collapse NMIs.  If we're completing an NMI 
handler, then we do need to collapse NMIs (since the queue length is 
two, and we just completed one).




  If not, what will happen next?

  The NMI window will open and we'll inject the NMI.

How will we know this? We do not request the exit, that's my worry.


I think we do? Oh, but this patch breaks it.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


Re: [PATCH] [dhcp] Use random transaction ID to associate messages

2011-09-15 Thread Eduardo Habkost

On Thu, Sep 15, 2011 at 05:53:02PM +0300, Sasha Levin wrote:
[...]
 The 'random xid' suggestion is listed merely as an example.
 
 The way I see it using a xid based on MAC instead of a random number is
 safer since the odds for same MAC on the same network are pretty slim
 since it would cause problems on other layers in the network.

I would agree with you if the current code didn't use just the last 4
bytes of the MAC address. So clients could have completely different MAC
addresses (as expected), have no problems communicating in the network,
but share the same final 4 bytes in the MAC address and end up
generating the same xid.

Probably a hash function that used all bytes of the MAC address as input
would work too, but using a random number seems to be good enough (and
simpler, IMO).
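
To make the collision concrete, here is a small sketch, not taken from the
patch under discussion.  xid_from_mac_tail() is a hypothetical
reconstruction of "use the last 4 bytes of the MAC" (the byte order is an
assumption), and xid_from_mac_hash() illustrates the "hash all bytes"
alternative using 32-bit FNV-1a, which is simply one possible hash and not
something proposed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reconstruction: build the xid from the last four
 * bytes of a six-byte MAC address.  The first two bytes never
 * influence the result.
 */
static uint32_t xid_from_mac_tail(const uint8_t mac[6])
{
	assert(mac != NULL);
	return ((uint32_t)mac[2] << 24) | ((uint32_t)mac[3] << 16) |
	       ((uint32_t)mac[4] << 8)  |  (uint32_t)mac[5];
}

/*
 * Alternative: 32-bit FNV-1a over all six bytes, so every byte of
 * the MAC address contributes to the xid.
 */
static uint32_t xid_from_mac_hash(const uint8_t mac[6])
{
	uint32_t h = 2166136261u;	/* FNV offset basis */
	size_t i;

	assert(mac != NULL);
	for (i = 0; i < 6; i++) {
		h ^= mac[i];
		h *= 16777619u;		/* FNV prime */
	}
	return h;
}
```

Two addresses such as 52:54:00:12:34:56 and 02:54:00:12:34:56 differ only
in the first byte, so the tail-based xid is identical for both, while the
hash-based xid differs.  A random xid, as the patch proposes, avoids the
dependence on the MAC address altogether.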


 
 What's the reason behind this patch? What's wrong with the current
 selection of xid?

I'm not sure what issue made Amos investigate the xid generation code,
but the current selection of xid is wrong as it uses just the last 4
bytes of the MAC address.

-- 
Eduardo

Re: [PATCH] pci: clean all funcs when hot-removing multifunc device

2011-09-15 Thread Bjorn Helgaas

On Tue, Sep 13, 2011 at 10:55 PM, Amos Kong ak...@redhat.com wrote:
 'slot->funcs' is initialized in acpiphp_glue.c:register_slot() before
 hotplugging a device, and only one entry (func 0) is added to it;
 no new entry is added to the list when hotplugging devices into the slot.
 When we release the whole device, there is only one entry in the list,
 so func 1~7 cannot be released.
 I tried to add entries for all hotplugged devices in enable_device(), but
 it doesn't work, because 'slot->funcs' is used in many places where we only
 need to process func 0. This patch just tries to clean all funcs in
 disable_device().
...
 Hotplugging a multifunction device works fine with WinXP.

I'm going to ignore this patch for now.  Please consider these
questions, then repost it if you still want it:

I assume you mean that Linux and WinXP are both running on top of the
same SeaBIOS, and hot-remove of a multifunction device works in WinXP
and fails in Linux.  That sounds like Linux is broken, and we should
fix it.  We might want to make a SeaBIOS change for other reasons, but
it'd still be good to fix Linux in case there are other similar
BIOSes.

Why do we need pci_scan_single_device()?  The device should have been
scanned already when it was added, and I would think that should have
set pdev->multifunction.

Your patch needs spaces around the operators in the for loop.

In the changelog, it would be nice to have the URL of a bugzilla where
the dmesg and DSDT are attached.

Bjorn

 Signed-off-by: Amos Kong ak...@redhat.com
 ---
  drivers/pci/hotplug/acpiphp_glue.c |   27 ++-
  1 files changed, 18 insertions(+), 9 deletions(-)

 diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
 index a70fa89..3b86d1a 100644
 --- a/drivers/pci/hotplug/acpiphp_glue.c
 +++ b/drivers/pci/hotplug/acpiphp_glue.c
 @@ -880,6 +880,8 @@ static int disable_device(struct acpiphp_slot *slot)
  {
         struct acpiphp_func *func;
         struct pci_dev *pdev;
 +       struct pci_bus *bus = slot->bridge->pci_bus;
 +       int i, num = 1;

         /* is this slot already disabled? */
         if (!(slot->flags & SLOT_ENABLED))
 @@ -893,16 +895,23 @@ static int disable_device(struct acpiphp_slot *slot)
                         func->bridge = NULL;
                 }

 -               pdev = pci_get_slot(slot->bridge->pci_bus,
 -                                   PCI_DEVFN(slot->device, func->function));
 -               if (pdev) {
 -                       pci_stop_bus_device(pdev);
 -                       if (pdev->subordinate) {
 -                               disable_bridges(pdev->subordinate);
 -                               pci_disable_device(pdev);
 +               pdev = pci_scan_single_device(bus,
 +                                       PCI_DEVFN(slot->device, 0));
 +               if (!pdev)
 +                       goto err_exit;
 +               if (pdev->multifunction == 1)
 +                       num = 8;
 +                for (i=0; i<num; i++) {
 +                       pdev = pci_get_slot(bus, PCI_DEVFN(slot->device, i));
 +                       if (pdev) {
 +                               pci_stop_bus_device(pdev);
 +                               if (pdev->subordinate) {
 +                                       disable_bridges(pdev->subordinate);
 +                                       pci_disable_device(pdev);
 +                               }
 +                               pci_remove_bus_device(pdev);
 +                               pci_dev_put(pdev);
                         }
 -                       pci_remove_bus_device(pdev);
 -                       pci_dev_put(pdev);
                 }
         }

 --
 1.7.6.1

 --
 To unsubscribe from this list: send the line unsubscribe linux-pci in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] pci: clean all funcs when hot-removing multifunc device

2011-09-15 Thread Kenji Kaneshige


(2011/09/16 4:03), Bjorn Helgaas wrote:

On Tue, Sep 13, 2011 at 10:55 PM, Amos Kong ak...@redhat.com wrote:

'slot->funcs' is initialized in acpiphp_glue.c:register_slot() before
hotplugging a device, and only one entry (func 0) is added to it;
no new entry is added to the list when hotplugging devices into the slot.
When we release the whole device, there is only one entry in the list,
so func 1~7 cannot be released.
I tried to add entries for all hotplugged devices in enable_device(), but
it doesn't work, because 'slot->funcs' is used in many places where we only
need to process func 0. This patch just tries to clean all funcs in
disable_device().

...

Hotplugging a multifunction device works fine with WinXP.


I'm going to ignore this patch for now.  Please consider these
questions, then repost it if you still want it:

I assume you mean that Linux and WinXP are both running on top of the
same SeaBIOS, and hot-remove of a multifunction device works in WinXP
and fails in Linux.  That sounds like Linux is broken, and we should
fix it.  We might want to make a SeaBIOS change for other reasons, but
it'd still be good to fix Linux in case there are other similar
BIOSes.


No objection about fixing Linux.



Why do we need pci_scan_single_device()?  The device should have been
scanned already when it was added, and I would think that should have
set pdev->multifunction.


It should be pci_get_slot() instead. Note that it needs
corresponding pci_dev_put().

Regards,
Kenji Kaneshige





Your patch needs spaces around the operators in the for loop.

In the changelog, it would be nice to have the URL of a bugzilla where
the dmesg and DSDT are attached.

Bjorn


Signed-off-by: Amos Kongak...@redhat.com
---
  drivers/pci/hotplug/acpiphp_glue.c |   27 ++-
  1 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index a70fa89..3b86d1a 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -880,6 +880,8 @@ static int disable_device(struct acpiphp_slot *slot)
 {
        struct acpiphp_func *func;
        struct pci_dev *pdev;
+       struct pci_bus *bus = slot->bridge->pci_bus;
+       int i, num = 1;

        /* is this slot already disabled? */
        if (!(slot->flags & SLOT_ENABLED))
@@ -893,16 +895,23 @@ static int disable_device(struct acpiphp_slot *slot)
                        func->bridge = NULL;
                }

-               pdev = pci_get_slot(slot->bridge->pci_bus,
-                                   PCI_DEVFN(slot->device, func->function));
-               if (pdev) {
-                       pci_stop_bus_device(pdev);
-                       if (pdev->subordinate) {
-                               disable_bridges(pdev->subordinate);
-                               pci_disable_device(pdev);
+               pdev = pci_scan_single_device(bus,
+                                       PCI_DEVFN(slot->device, 0));
+               if (!pdev)
+                       goto err_exit;
+               if (pdev->multifunction == 1)
+                       num = 8;
+                for (i=0; i<num; i++) {
+                       pdev = pci_get_slot(bus, PCI_DEVFN(slot->device, i));
+                       if (pdev) {
+                               pci_stop_bus_device(pdev);
+                               if (pdev->subordinate) {
+                                       disable_bridges(pdev->subordinate);
+                                       pci_disable_device(pdev);
+                               }
+                               pci_remove_bus_device(pdev);
+                               pci_dev_put(pdev);
                        }
-                       pci_remove_bus_device(pdev);
-                       pci_dev_put(pdev);
                }
        }

--
1.7.6.1

--
To unsubscribe from this list: send the line unsubscribe linux-pci in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html








Re: [PATCH] kvm tools: Allow remapping guest TTY into host PTS

2011-09-15 Thread Asias He

On 09/15/2011 04:53 PM, Sasha Levin wrote:
 This patch adds the '--tty' option to 'kvm run' which allows the user to
 remap a guest TTY into a PTS on the host.
 
 Usage:
   'kvm run --tty [id] [other options]'
 
 The tty will be mapped to a pts and will be printed on the screen:
   '  Info: Assigned terminal 1 to pty /dev/pts/X'
 
 At this point, it is possible to communicate with the guest using that pty.
 
 This is useful for debugging guest kernel using KGDB:
 
 1. Run the guest:
   'kvm run -k [vmlinuz] -p "kgdboc=ttyS1 kgdbwait" --tty 1'
 
 And see which PTY got assigned to ttyS1.
 
 2. Run GDB on the host:
   'gdb [vmlinuz]'
 
 3. Connect to the guest (from within GDB):
   'target remote /dev/pts/X'
 
 4. Start debugging! (enter 'continue' to continue boot).
 
 Cc: David Evensky even...@dancer.ca.sandia.gov
 Signed-off-by: Sasha Levin levinsasha...@gmail.com
 ---
  tools/kvm/Makefile   |1 +
  tools/kvm/builtin-run.c  |   12 
  tools/kvm/hw/serial.c|   46 ++--
  tools/kvm/include/kvm/term.h |   11 ---
  tools/kvm/term.c |   60 +
  tools/kvm/virtio/console.c   |6 ++--
  6 files changed, 96 insertions(+), 40 deletions(-)
 
 diff --git a/tools/kvm/Makefile b/tools/kvm/Makefile
 index efa032d..fef624d 100644
 --- a/tools/kvm/Makefile
 +++ b/tools/kvm/Makefile
 @@ -115,6 +115,7 @@ OBJS  += bios/bios-rom.o
  
  LIBS += -lrt
  LIBS += -lpthread
 +LIBS += -lutil
  
  # Additional ARCH settings for x86
  ARCH ?= $(shell echo $(uname_M) | sed -e s/i.86/i386/ -e s/sun4u/sparc64/ \
 diff --git a/tools/kvm/builtin-run.c b/tools/kvm/builtin-run.c
 index 5dafb15..b5c63ca 100644
 --- a/tools/kvm/builtin-run.c
 +++ b/tools/kvm/builtin-run.c
 @@ -172,6 +172,15 @@ static int virtio_9p_rootdir_parser(const struct option *opt, const char *arg, i
   return 0;
  }
  
 +static int tty_parser(const struct option *opt, const char *arg, int unset)
 +{
 + int tty = atoi(arg);
 +
 + term_set_tty(tty);
 +
 + return 0;
 +}
 +
  static int shmem_parser(const struct option *opt, const char *arg, int unset)
  {
   const u64 default_size = SHMEM_DEFAULT_SIZE;
 @@ -316,6 +325,9 @@ static const struct option options[] = {
   OPT_STRING('\0', "console", &console, "serial or virtio",
   "Console to use"),
   OPT_STRING('\0', "dev", &dev, "device_file", "KVM device file"),
 + OPT_CALLBACK('\0', "tty", NULL, "tty id",
 +  "Remap guest TTY into a pty on the host",
 +  tty_parser),
  
   OPT_GROUP("Kernel options:"),
   OPT_STRING('k', "kernel", &kernel_filename, "kernel",
 diff --git a/tools/kvm/hw/serial.c b/tools/kvm/hw/serial.c
 index b3b233f..11fa5d4 100644
 --- a/tools/kvm/hw/serial.c
 +++ b/tools/kvm/hw/serial.c
 @@ -14,6 +14,7 @@
  
  struct serial8250_device {
   pthread_mutex_t mutex;
 + u8  id;
  
   u16 iobase;
   u8  irq;
 @@ -42,6 +43,7 @@ static struct serial8250_device devices[] = {
   [0] = {
   .mutex  = PTHREAD_MUTEX_INITIALIZER,
  
 + .id = 0,
   .iobase = 0x3f8,
   .irq= 4,
  
 @@ -51,6 +53,7 @@ static struct serial8250_device devices[] = {
   [1] = {
   .mutex  = PTHREAD_MUTEX_INITIALIZER,
  
 + .id = 1,
   .iobase = 0x2f8,
   .irq= 3,
  
 @@ -60,6 +63,7 @@ static struct serial8250_device devices[] = {
   [2] = {
   .mutex  = PTHREAD_MUTEX_INITIALIZER,
  
 + .id = 2,
   .iobase = 0x3e8,
   .irq= 4,
  
 @@ -69,6 +73,7 @@ static struct serial8250_device devices[] = {
   [3] = {
   .mutex  = PTHREAD_MUTEX_INITIALIZER,
  
 + .id = 3,
   .iobase = 0x2e8,
   .irq= 3,
  
 @@ -111,10 +116,10 @@ static void serial8250__receive(struct kvm *kvm, struct serial8250_device *dev)
   return;
   }
  
 - if (!term_readable(CONSOLE_8250))
 + if (!term_readable(CONSOLE_8250, dev->id))
   return;
  
 - c   = term_getc(CONSOLE_8250);
 + c = term_getc(CONSOLE_8250, dev->id);
  
   if (c < 0)
   return;
 @@ -123,30 +128,31 @@ static void serial8250__receive(struct kvm *kvm, struct serial8250_device *dev)
   dev->lsr |= UART_LSR_DR;
  }
  
 -/*
 - * Interrupts are injected for ttyS0 only.
 - */
  void serial8250__inject_interrupt(struct kvm *kvm)
  {
 - struct serial8250_device *dev = &devices[0];
 + int i;
  
 - mutex_lock(&dev->mutex);
 + for (i = 0; i < 4; i++)

Re: [PATCH 4/5] KVM: PPC: e500: eliminate a trap when entering idle

2011-09-15 Thread Scott Wood

On 09/05/2011 05:30 PM, Alexander Graf wrote:
 
 On 27.08.2011, at 01:31, Scott Wood wrote:
 
 +#ifdef CONFIG_E500
 +/*
 + * Skip the overhead of HID0 accesses that KVM ignores --
 + * just write MSR[WE].
 + *
 + * We don't need _TLF_NAPPING, because under KVM we know
 + * it will take effect right away.
 + */
 +if (ppc_md.power_save == e500_idle)
 +ppc_md.power_save = kvm_msrwe_idle;
 
 Why the if() here?

To avoid replacing some other power_save() implementation.
kvm_msrwe_idle() is a paravirt-optimized version of e500_idle().

However, now that e500_idle has an ifdef for e500mc, we'll need that
ifdef here as well.  e500mc doesn't use MSR[WE] (and if it did, we
couldn't trap on it).  For e500mc we'll want to make an hcall for idle
(ePAPR EV_IDLE).

-Scott

