Re: [PATCH 2/2] Add serial number support for virtio_blk, V4

2009-06-01 Thread Rusty Russell
On Fri, 29 May 2009 01:45:27 pm john cooper wrote:
 virtio_blk-serial-4.patch

Hate to ask dumb questions, but is there a scsi equivalent of this?  It'd be 
nice if we could avoid being ATA-specific in the long run...

Also, why u16?

Thanks,
Rusty.

 +/* return ATA identify data
 + */
 +static int virtblk_identify(struct gendisk *disk, void *argp)
 +{
 + struct virtio_blk *vblk = disk->private_data;
 + u16 *id;
 + int err = -ENOMEM;
 +
 + id = kmalloc(VIRTIO_BLK_ID_BYTES, GFP_KERNEL);
 + if (!id)
 + goto out;
 +
 + err = virtio_config_buf(vblk->vdev, VIRTIO_BLK_F_IDENTIFY,
 + offsetof(struct virtio_blk_config, identify), id,
 + VIRTIO_BLK_ID_BYTES);

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v3 0/2] Intel-IOMMU: source-id checking for interrupt remapping

2009-06-01 Thread Han, Weidong
Any comments on the patchset? Thanks. 

Regards,
Weidong

Han, Weidong wrote:
 Support source-id checking for interrupt remapping, and then
 isolates interrupts for guests/VMs with assigned devices.
 
 Eric raised pci rebalance issue with VT-d. Yes, it's an issue now.
 Linux needs to handle pci rebalance changes to DRHD scopes. It's
 tricky to support it. This patch just supports source-id for
 interrupt remapping, won't touch that.
 
 The patchset can be applied on linux-2.6-tip tree.
 
 v2 -> v3 changelog:
   As Ingo suggested, restructured some code and fixed some code
 style issues.
 
 v1 -> v2 changelog:
 Access PCI directly (read_pci_config_byte) to parse IOAPIC,
 instead of PCI related discovery, because PCI subsystem is not
 initialized at that time.
 
 
 Weidong Han (2):
   Intel-IOMMU, intr-remap: set the whole 128bits of irte when
 modify/free it
   Intel-IOMMU, intr-remap: source-id checking
 
  arch/x86/kernel/apic/io_apic.c |    6 ++
  drivers/pci/intr_remapping.c   |  160 +++
  drivers/pci/intr_remapping.h   |    2 +
  include/linux/dmar.h           |   11 +++
  4 files changed, 162 insertions(+), 17 deletions(-)



Re: [PATCH 0/3] RFC: virtual device as irq injection interface

2009-06-01 Thread Avi Kivity

Michael S. Tsirkin wrote:

On Sun, May 31, 2009 at 11:30:48PM +0300, Avi Kivity wrote:
  

Michael S. Tsirkin wrote:

Version N of irqfd actually had the kernel create the fd, due to   
concerns about eventfd's flexibility (thread wakeup vs function 
call).   As it turned out these concerns were misplaced (well, we 
still want the  call to happen in process context when available).



I'm afraid there are deep lifetime issues there, and the recent patch
calling eventfd_fget seems to be just papering over the worst of them.
  
  

You'll have to be more specific.



My concern is that we do fget on eventfd and keep this reference until
fput is done on vm fd. This works as long as no one else does
similar tricks. Imagine for example eventfd or another fs/ change that makes
eventfd do fget on descriptor X and keep it until fput is done on eventfd.
We'll get resource leak if kvm fd is substituted for X.

What do you think?

  


I think it's unlikely that eventfd will start hanging on to fds.  If it 
does, it will have to deal with recursion anyway (eventfd holding on to 
itself), so irqfd will be just a part of the problem.


It's better to have one big problem rather than many small problems.

I'd really like to stick with eventfd if we can solve all the 
problems  there, rather than creating yet another interface.

Especially if we want uio to communicate directly with kvm.



Actually, current irqfd might not be able to handle assigned pci devices
because of the trick it does with set_irq(1)/set_irq(0).
Guest drivers for pci devices likely assume the interrupt
is level.
  
  
Right.  I'm willing to have some userspace mediation for level-triggered  
interrupts.



In other words, you want to keep using KVM_IRQ_LINE for this, as well?
  


We'll need something more than level-triggered interrupts since we need 
to pass the acknowledge from the guest to the host somehow.


It's a corner case anyway as we don't support shared  
interrupts on the host, and PCI level-triggered interrupts are very  
likely to be shared.



If you think about virtio-net-host, there's no host interrupt there.
  


I was talking about uio, sorry.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH] qemu-kvm: Flush icache after dma operations for ia64

2009-06-01 Thread Avi Kivity

Zhang, Xiantao wrote:

Avi Kivity wrote:
  

Jes Sorensen wrote:


Ok,

Trying once more. After spending a couple of hours trying to follow
the QEMU dma codeflow, I have convinced myself Avi is right and those
two functions don't need to do the flushing as they all end up
calling dma_bdrv_cb() which calls dma_brdv_unmap(). I have added a
couple comments to the code, which will hopefully save the next
person the 'pleasure' of trying to figure out this too.

  

It looks right to me.  Xiantao?



Fine with me.  But it seems the change in qemu_iovec_from_buffer is lost in
this patch, or is that change also unnecessary?
  



I think the fixed unmap handles that case.  Can you test to make sure?

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: Qemu (host) - host userspace signaling?

2009-06-01 Thread Avi Kivity

pav wrote:

Hello,

I am looking for a simple way to get a bidirectional event notification 
interface between qemu/kvm and host userspace processes. Just a kick, 
messages/data not required.



What I basically need is a way to have an interested host process 
informed by a custom qemu device that something happened (i.e. after a 
MMIO write) and the other way around - to allow similar notifications 
from the process to the qemu device. Of course I do not want qemu to 
sleep.

Instant reaction to such events is not required.


I understand I could use a unix socket and qemu_chr_open() and friends 
for this, but isn't a full-blown socket a bit of an overkill for a simple 
kick interface?
  


Not at all.  Send a byte to have the other side wake up.
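
The one-byte kick suggested above can be sketched as follows. This is only
an illustration under stated assumptions: a socketpair() in a single process
stands in for the qemu device and the host process, and the helper names
kick() and drain() are hypothetical, not qemu functions.

```python
import select
import socket

# Hypothetical stand-ins for the two endpoints: in a real setup one end would
# live in qemu and the other in the interested host process.
qemu_end, peer_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
qemu_end.setblocking(False)
peer_end.setblocking(False)


def kick(sock):
    """Wake the other side by sending a single byte; the value is irrelevant."""
    sock.send(b"\0")


def drain(sock):
    """Consume pending kicks without blocking and return how many were seen."""
    count = 0
    while True:
        try:
            data = sock.recv(64)
        except BlockingIOError:
            return count      # nothing left to read
        if not data:
            return count      # peer closed the connection
        count += len(data)


kick(qemu_end)                                        # device -> process
readable, _, _ = select.select([peer_end], [], [], 1.0)
kicks_seen_by_peer = drain(peer_end) if readable else 0

kick(peer_end)                                        # process -> device
readable, _, _ = select.select([qemu_end], [], [], 1.0)
kicks_seen_by_qemu = drain(qemu_end) if readable else 0
```

Because both ends are non-blocking and are only read after select() reports
them readable, neither side has to sleep waiting for the other.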

From what I understand qemu would then act as a server and sleep just 
after starting (or later?), waiting for connections? Or maybe there is a 
way to reverse it, have qemu be the client, although that could still 
make qemu sleep?.
I guess it could use some kind of poll/select, but I am not sure where in 
qemu should such code be put in though...
  


You can have qemu act as a client or as a server, and wait for 
connections or not.



Or maybe there is something else for this in qemu already? I had thought 
iosignalfd or eventfd were made for that, but if I understand correctly, 
they communicate with the guest and are for something different?
  


iosignalfd and irqfd are slightly more efficient, but you need to make 
them available to another process by passing them over a unix domain 
socket with SCM_RIGHTS.  So they are more cumbersome to set up.


irqfd also requires MSI support in the guest.
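
The extra setup step mentioned above, passing a descriptor to another process
over a unix domain socket with SCM_RIGHTS, could look roughly like this. It
is a sketch under assumptions: a pipe stands in for the eventfd, both ends
live in one process for brevity, and send_fd()/recv_fd() are hypothetical
helper names built on the standard sendmsg()/recvmsg() calls.

```python
import array
import os
import socket


def send_fd(sock, fd):
    """Send one open descriptor over a unix domain socket using SCM_RIGHTS."""
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [fd]))]
    # One byte of ordinary data is sent along with the ancillary message.
    sock.sendmsg([b"F"], ancillary)


def recv_fd(sock):
    """Receive one descriptor sent by send_fd(); returns the new local fd."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing partial integer in the ancillary payload.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
    return fds[0]


# A pipe stands in for the eventfd here (an assumption made for portability).
read_end, write_end = os.pipe()
left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

send_fd(left, write_end)
received = recv_fd(right)       # a new descriptor referring to the same pipe

os.write(received, b"k")        # kick through the received descriptor
kick = os.read(read_end, 1)     # the original reader sees it
```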

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH][KVM-AUTOTEST] Check exit status of custom install script and fail if script failed.

2009-06-01 Thread Avi Kivity

Lucas Meneghel Rodrigues wrote:

On Sun, 2009-05-24 at 17:48 +0300, Avi Kivity wrote:
  

Mike Burns wrote:


Signed-off-by: Mike Burns mbu...@redhat.com
---
 client/tests/kvm_runtest_2/kvm_install.py |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/client/tests/kvm_runtest_2/kvm_install.py 
b/client/tests/kvm_runtest_2/kvm_install.py
index ebd8b7d..392ef0c 100755
--- a/client/tests/kvm_runtest_2/kvm_install.py
+++ b/client/tests/kvm_runtest_2/kvm_install.py
@@ -90,7 +90,12 @@ def run_kvm_install(test, params, env):
  kvm_log.info("Adding KVM_INSTALL_%s to Environment" % (k))
   os.putenv("KVM_INSTALL_%s" % (k), str(params[k]))
kvm_log.info("Running " + script + " to install kvm")
-os.system("cd %s; %s" % (test.bindir, script))
+install_result = os.system("cd %s; %s" % (test.bindir, script))
+   if os.WEXITSTATUS(install_result) != 0:
+  message = "Custom Script encountered an error"
+  kvm_log.error(message)
+  raise error.TestError, message
+
  
  
How about a helper that does os.system()  (or rather, 
commands.getstatusoutput()) and throws an exception on failure?  I 
imagine it could be used in many places.



utils.system() does that. If we have exit code != 0, it throws an
error.CmdError exception.
  


Well let's use it then.  Every time I see 'raise' used I'm going to 
complain, so it will be a lot more efficient as well as smaller code.
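
A helper of the kind discussed above (run a command, raise on a non-zero
exit status) might be sketched like this. It is not the autotest
utils.system() implementation: run_or_raise() and this CmdError class are
hypothetical names, and the sketch assumes a modern Python 3 with the
subprocess module rather than the Python 2 code in the patch.

```python
import subprocess


class CmdError(Exception):
    """Hypothetical stand-in for the exception type such a helper would raise."""

    def __init__(self, command, status, output):
        super().__init__("%r failed with exit status %d" % (command, status))
        self.command = command
        self.status = status
        self.output = output


def run_or_raise(command):
    """Run a shell command; return its output, or raise CmdError on failure."""
    result = subprocess.run(command, shell=True, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    if result.returncode != 0:
        raise CmdError(command, result.returncode, result.stdout)
    return result.stdout
```

With such a helper the install step reduces to a single call, and callers no
longer need their own 'raise' for a failed script.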


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [RFC] KVM-Autotest: basic parallel test execution

2009-06-01 Thread Avi Kivity

Mike Burns wrote:

Avi Kivity wrote:
  

Michael Goldish wrote:


Drawbacks:
- requires some initial work to be done by the user -- the user has
to define exactly where each test should run
  
  

For me, this is a major drawback.  I'd really like a fire-and-forget
solution.  If I have to spend my own time getting this to work, vs.
waiting longer for the tests to run on their own, I'll just be lazy.


It seems like this is adding another layer of user interaction without
really getting that much additional functionality.  An administrator of
kvm-autotest is going to know enough to be able to create control and
kvm_tests.cfg files that can mimic this already and then just create 2
jobs in the server to run on different hosts. 
  


I'm an administrator of kvm-autotest and I have no idea how to create 
control and kvm_tests.cfg files.


Also, I'm a lot more interested in spreading the load on my one host 
rather than on fictional other hosts.  And for that we need a scheduler, 
since different tests need differing amounts of memory and cores.



- test sets need to be modified when tests or hosts are
added/removed, to include/exclude them
  
  

This is also annoying -- and likely to stop me from updating.


Host definitions already happen on the server and it seems like it would
be confusing to have to make sure that the host you choose when setting
up a test is setup correctly in your config files. 
  


Host setups should be separated from test setup, so that tests can be 
continuously updated while the host setup is kept static.



I'd really like this to be automated, just specify a set of machines
and have the jobs distributed.  Furthermore, it is very important to
utilize the existing hosts better.  A 4-core 4GB server can easily run
a 2x smp 1GB guest and 2 other uniprocessor 1GB guests.  It's wasteful
to add more servers when the existing servers are underutilized.



Overall, the way I envisioned parallel test execution was having
multiple tests running on a single machine.  It seems that the server
already provides most (if not all) the functionality needed to spread
tests across multiple machines.  I really think this is putting too much
on the user for very small gains in functionality. 
  


As a user, I disagree.  I can't calculate what resources are needed by 
each test and make sure they all fit on my host.
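
The kind of scheduler asked for above could start from something as simple
as greedy first-fit packing. The sketch below is an assumption-laden
illustration, not kvm-autotest code: Guest and schedule() are hypothetical
names, and the numbers reuse the 4-core 4GB host example from earlier in the
thread.

```python
from collections import namedtuple

# Hypothetical description of what a test's guest needs.
Guest = namedtuple("Guest", "name cores mem_gb")


def schedule(guests, host_cores, host_mem_gb):
    """Greedy first-fit: split guests into batches that each fit on one host."""
    batches = []
    for guest in guests:
        if guest.cores > host_cores or guest.mem_gb > host_mem_gb:
            raise ValueError("%s can never fit on this host" % guest.name)
        for batch in batches:
            used_cores = sum(g.cores for g in batch)
            used_mem = sum(g.mem_gb for g in batch)
            if (used_cores + guest.cores <= host_cores
                    and used_mem + guest.mem_gb <= host_mem_gb):
                batch.append(guest)
                break
        else:
            batches.append([guest])  # no existing batch had room
    return batches


guests = [Guest("smp2", 2, 1), Guest("up-a", 1, 1), Guest("up-b", 1, 1),
          Guest("smp2-extra", 2, 1)]
batches = schedule(guests, host_cores=4, host_mem_gb=4)
```

Here the 2-way SMP guest and the two uniprocessor guests share the first
batch (4 cores, 3GB), and the extra SMP guest waits for the second one.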


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



[PATCH] Read MADT entries from memory in processors _MAT method.

2009-06-01 Thread Gleb Natapov
Also use the enable/disable bit from ACPI MADT entries to track CPU hot
plug. This removes assumptions about APIC ids from AML code and simplifies
CPU hotplug handling.

Signed-off-by: Gleb Natapov g...@redhat.com

diff --git a/kvm/bios/acpi-dsdt.dsl b/kvm/bios/acpi-dsdt.dsl
index c756fed..69e1426 100755
--- a/kvm/bios/acpi-dsdt.dsl
+++ b/kvm/bios/acpi-dsdt.dsl
@@ -27,36 +27,32 @@ DefinitionBlock (
 {
Scope (\_PR)
{
-   OperationRegion(PRST, SystemIO, 0xaf00, 32)
-   Field (PRST, ByteAcc, NoLock, Preserve)
+   /* pointer to fist element of MADT APIC structures */
+   OperationRegion(ATPR, SystemMemory, 0x0514, 4)
+   Field (ATPR, DwordAcc, NoLock, Preserve)
{
-   PRS, 256
+   ATP, 32 
}
 
-   Name(PRSS, Buffer(32){}) /* shadow CPU status bitmask */
-   Name(SSVL, 0)
-
-   Method(CRST, 1) {
-   If (LEqual(SSVL, 0)) {
-   Store(PRS, PRSS) /* read CPUs status bitmaks from HW */
-   Store(1, SSVL)
-}
-   ShiftRight(Arg0, 3, Local1)
-   Store(DerefOf(Index(PRSS, Local1)), Local2)
-   Return(And(Local2, ShiftLeft(1, And(Arg0, 0x7
-   }
+#define madt_addr(nr)  Add (ATP, Multiply(nr, 8))
 
 #define gen_processor(nr, name)\
Processor (CPU##name, nr, 0xb010, 0x06) {   \
-Name (PREN, Buffer(0x8) {0x0, 0x8, nr, nr, 0x1, 0x0, 0x0, 0x0}) \
-Name (PRDS, Buffer(0x8) {0x0, 0x8, nr, nr, 0x0, 0x0, 0x0, 0x0}) \
+   OperationRegion (MATR, SystemMemory, madt_addr(nr), 8)  \
+   Field (MATR, ByteAcc, NoLock, Preserve) \
+   {   \
+   MAT, 64 \
+   }   \
+   Field (MATR, ByteAcc, NoLock, Preserve) \
+   {   \
+   Offset(4),  \
+   FLG, 1  \
+   }   \
 Method(_MAT, 0) {   \
-If (CRST(nr)) { Return(PREN) }  \
-Else { Return(PRDS) }   \
+   Return(MAT) \
 }   \
 Method (_STA) { \
-If (CRST(nr)) { Return(0xF) }   \
-Else { Return(0x9) }\
+If (FLG) { Return(0xF) } Else { Return(0x9) }   \
 }   \
 }   \
 
@@ -78,9 +74,16 @@ DefinitionBlock (
gen_processor(14, E)
 
Method (NTFY, 2) {
-#define gen_ntfy(nr)  \
-   If (LEqual(Arg0, 0x##nr)) {   \
-   Notify(CPU##nr, Arg1) \
+#define gen_ntfy(nr)\
+   If (LEqual(Arg0, 0x##nr)) { \
+   If (LNotEqual(Arg1, \_PR.CPU##nr.FLG)) {\
+   Store (Arg1, \_PR.CPU##nr.FLG)  \
+   If (LEqual(Arg1, 1)) {  \
+   Notify(CPU##nr, 1)  \
+   } Else {\
+   Notify(CPU##nr, 3)  \
+   }   \
+   }   \
}
gen_ntfy(0)
gen_ntfy(1)
@@ -100,41 +103,26 @@ DefinitionBlock (
Return(One)
}
 
-   /* Works on 8 bit quentity.
- * Arg1 - Shadow status bits
- * Arg2 - Current status bits
-*/
-Method(PR1, 3) {
-   Xor(Arg1, Arg2, Local0) /* figure out what chaged */
-   ShiftLeft(Arg0, 3, Local1)
-While (LNotEqual(Local0, Zero)) {
-   If (And(Local0, 1)) {  /* if staus have changed */
-if(And(Arg2, 1)) { /* check previous status */
-   Store(3, Local3)
-   } Else {
-   Store(1, Local3)
-   }
-   NTFY(Local1, Local3)
-}
-   ShiftRight(Local0, 1, Local0)
-   ShiftRight(Arg2, 1, Arg2)
-

Re: [PATCH 0/3] RFC: virtual device as irq injection interface

2009-06-01 Thread Gregory Haskins
Michael S. Tsirkin wrote:
 On Sun, May 31, 2009 at 11:30:48PM +0300, Avi Kivity wrote:
   
 Michael S. Tsirkin wrote:
 
 Version N of irqfd actually had the kernel create the fd, due to   
 concerns about eventfd's flexibility (thread wakeup vs function 
 call).   As it turned out these concerns were misplaced (well, we 
 still want the  call to happen in process context when available).
 
 
 I'm afraid there are deep lifetime issues there, and the recent patch
 calling eventfd_fget seems to be just papering over the worst of them.
   
   
 You'll have to be more specific.
 

 My concern is that we do fget on eventfd and keep this reference until
 fput is done on vm fd.

Hi Michael,
  This is not really the full picture, and I think it might be where all
the confusion starts.  You are only covering the case where kvm is the
first to close (and if you think about it, you need to handle that case
as well just like me or the tables are turned).

We both agree that an irqfd or irqfd-like concept and kvm have a
relationship with one another, and that we have to manage that
relationship, right?  The relationship starts with an IRQFD_ASSIGN, and
it stops when either the irqfd is closed, or if the kvm is closed
(whichever comes first).  The lifetimes are actually identical with your
proposal if you think about it.  Only the mechanics of how to get there
are (slightly) different.

i.e. If the IRQFD wants to close first, you do an ioctl(kvmfd,
IRQFD_DEASSIGN)+close(irqfd).   If kvm wants to close first, you do a
close(kvmfd).  I do not think there is really any issue with lifetimes
there.

I suppose you could argue: "well, what if they do the close(irqfd) but
not the ioctl() (or vice versa)?", and to that I would say that it's no
different than if userspace forgot to do X with any other resource.  The
fact is that userspace holds a number of kernel resources, and they can
either be explicitly freed (such as with a close()), or they will be
implicitly freed when the task exits.  I think all of these requirements
are met here, so I do not see a problem.

Yes, I agree that having to do two system calls to completely close it
is not as attractive as one, but the tradeoff is to potentially not use
eventfd as the underlying basis for the construct.  There are distinct
advantages to using eventfd here, so we would like to continue to do so
unless someone can display a compelling reason not to.  So far I am not
seeing such a reason.

A potential compromise is to investigate the POLLHUP technique that
Davide mentioned so that kvmfd can get notified of the closure without
needing an additional explicit ioctl to do it.  Note that we already
have irqfd in the tree so I assume we would need to do this in an ABI
friendly way, but it's possible.
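
The general idea of learning about a close through a hangup event, rather
than an explicit call, can be illustrated from userspace. This is only an
analogy under assumptions: it uses an ordinary pipe, not an eventfd, and it
says nothing about how the in-kernel notification would be implemented.

```python
import os
import select

read_end, write_end = os.pipe()

poller = select.poll()
poller.register(read_end, select.POLLIN)

before_close = poller.poll(0)     # nothing written, writer still open
os.close(write_end)               # the "other side" goes away
after_close = poller.poll(1000)   # list of (fd, event mask) pairs

# POLLHUP is reported even though only POLLIN was requested.
hung_up = any(fd == read_end and mask & select.POLLHUP
              for fd, mask in after_close)
os.close(read_end)
```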


  This works as long as no one else does
 similar tricks. Imagine for example eventfd or another fs/ change that makes
 eventfd do fget on descriptor X and keep it until fput is done on eventfd.
 We'll get resource leak if kvm fd is substituted for X.
   

I don't think it's realistic to assume eventfd would ever be
grabbing other fds, but I think Avi answered this succinctly in his
reply to this mail so I won't rehash it.

 What do you think?

   
   
   
 I'd really like to stick with eventfd if we can solve all the 
 problems  there, rather than creating yet another interface.
 Especially if we want uio to communicate directly with kvm.
 
 
 Actually, current irqfd might not be able to handle assigned pci devices
 because of the trick it does with set_irq(1)/set_irq(0).
 Guest drivers for pci devices likely assume the interrupt
 is level.
   
   
 Right.  I'm willing to have some userspace mediation for level-triggered  
 interrupts.
 

 In other words, you want to keep using KVM_IRQ_LINE for this, as well?
   

Or more specifically, if you need something more than a basic edge
interrupt, you should use the existing interfaces.  We set the stake in
the ground during review that irqfd would only support interfaces that
can do MSI/edge like injections.

   
 It's a corner case anyway as we don't support shared  
 interrupts on the host, and PCI level-triggered interrupts are very  
 likely to be shared.
 

 If you think about virtio-net-host, there's no host interrupt there.

   
 With virt devices, what we'd do is create a virt device that attaches to
 uio driver.  This would handle interrupts and everything else that needs
 to live in kernel
   
 With irqfd, what we do is attach an eventfd to the MSI we're interested  
 in.  Given that eventfds are usable from userspace, we're adding a  
 non-virt-specific interface to uio that serves kvm well.  Both uio and  
 kvm win.
 






Re: [PATCH 0/3] RFC: virtual device as irq injection interface

2009-06-01 Thread Avi Kivity

Gregory Haskins wrote:

A potential compromise is to investigate the POLLHUP technique that
Davide mentioned so that kvmfd can get notified of the closure without
needing an additional explicit ioctl to do it.  Note that we already
have irqfd in the tree so I assume we would need to do this in an ABI
friendly way, but it's possible.
  


We don't have irqfd in any released tree.  I'm only submitting it for 
2.6.32 (exactly so we can iron these things out), so we can change it 
any way we like or even pull it out completely.


The POLLHUP stuff is something I'd like to see in.

--
error compiling committee.c: too many arguments to function



Re: [PATCH 0/3] RFC: virtual device as irq injection interface

2009-06-01 Thread Gregory Haskins
Avi Kivity wrote:
 Gregory Haskins wrote:
 A potential compromise is to investigate the POLLHUP technique that
 Davide mentioned so that kvmfd can get notified of the closure without
 needing an additional explicit ioctl to do it.  Note that we already
 have irqfd in the tree so I assume we would need to do this in an ABI
 friendly way, but it's possible.
   

 We don't have irqfd in any released tree.  I'm only submitting it for
 2.6.32 (exactly so we can iron these things out), so we can change it
 any way we like or even pull it out completely.

 The POLLHUP stuff is something I'd like to see in.

Ah, perfect.  I will submit a patch to implement this, then.

Thanks,
-Greg





Re: [KVM PATCH v4 3/3] kvm: add iosignalfd support

2009-06-01 Thread Gregory Haskins
Avi Kivity wrote:
 Gregory Haskins wrote:
 This is closer to how the original series worked, but Avi asked for a
 data-match token and thus the cookie was born.  I think the rationale is
 that we can't predict whether the same eventfd will be registered more
 than once, and thus we need a way to further qualify it.  However, to
 your point, I cannot think of a valid use case for having the same fd
 registered to the same address more than once, so perhaps your fd/addr
 tuple is sufficient and we can drop the cookie (or, really, rename it to
 trigger ;)

 Avi?
   

 This is just how virtio works.  To kick ring N of device X, it writes
 N to a port specific to X.

 If we lose N, then we don't know which ring was kicked and have to
 check them all.

 Maybe we can rename cookie to data_match to make it explicit.  If the
 data doesn't match, the eventfd isn't kicked.

 (Mark, same as we have arbitrary ring-MSI mappings (allowing one MSI
 to notify multiple rings), perhaps we should have the same capability
 for the other direction?  So the guest could kick multiple rings with
 one write, or just one ring, according to personal preference.)

I think we are all on the same page here, actually (more or less).  I
think the confusion was my own when you initially asked for a data-match
token, and I gave you "cookie".  "cookie" as I implemented it was really
only used to match up the eventfd during de-assign, not the actual
signal event.  I think what you are talking about here is the same as
what Mark and I have been calling "trigger".  I agree that we need this
interface to be able to properly sort something like virtio, and I will
be including this in the next release.
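
The data-match behaviour described above (a write only signals the eventfd
whose registered value matches) can be modelled in a few lines. This is a toy
userspace model under assumptions, not the proposed kernel interface:
IoSignalTable, its methods, and the address 0x1000 are all hypothetical, and
plain callables stand in for eventfds.

```python
class IoSignalTable:
    """Toy model: (address, data_match) pairs mapped to callables to signal."""

    def __init__(self):
        self._entries = []  # list of (addr, data_match, signal)

    def assign(self, addr, data_match, signal):
        self._entries.append((addr, data_match, signal))

    def deassign(self, addr, signal):
        # Keyed on the addr/signal pair only, mirroring the fd/addr tuple idea.
        self._entries = [e for e in self._entries
                         if not (e[0] == addr and e[2] is signal)]

    def write(self, addr, value):
        """Guest write: signal only entries whose address and data both match."""
        hits = 0
        for entry_addr, data_match, signal in self._entries:
            if entry_addr == addr and data_match == value:
                signal()
                hits += 1
        return hits


kicked = []


def ring0():
    kicked.append(0)


def ring1():
    kicked.append(1)


table = IoSignalTable()
table.assign(0x1000, 0, ring0)
table.assign(0x1000, 1, ring1)

table.write(0x1000, 1)          # only ring 1 is kicked
table.write(0x1000, 7)          # no match: nothing is kicked
table.deassign(0x1000, ring1)
table.write(0x1000, 1)          # ring 1 is no longer registered
```

Without the match value, a write to the shared address could only say that
some ring was kicked, and every ring would have to be checked.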

OTOH, what I was proposing above is that my misguided attempt at
"cookie" for deassign is redundant with simply looking at the
eventfd/addr tuple.  We can simply key off of those items to match up
the iosignalfd to close, and therefore let's get rid of the cookie field.

Sorry for the confusion.  Let me know if you still think this isn't right.

-Greg





[PATCH 0/3] Cache PDPTRs under ept/npt

2009-06-01 Thread Avi Kivity
Currently the EPT code re-loads the PDPTRs after every exit, even though
they usually are not needed.  Moreover, the PDPTRs are reloaded from memory,
even though they are supposed to be kept in processor registers independent
from memory.  The NPT case is similar.

This patchset makes the PDPTRs cacheable registers (like RSP and RIP on vmx)
so they are only stored in the VMCS if they are dirtied and copied from the
VMCS to memory on demand.  As SVM doesn't virtualize the PDPTRs, we simply
load them from memory on demand (instead of on every exit).
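
The caching scheme described above can be pictured with a small model. This
is a toy sketch under assumptions, not the kernel code: CachedRegs and its
method names are hypothetical, a dict stands in for the hardware-owned copy,
and only the availability/dirty bookkeeping is modelled.

```python
class CachedRegs:
    """Toy model of on-demand register caching with avail/dirty bitmasks."""

    PDPTR = 0  # hypothetical bit index for the PDPTR group

    def __init__(self, hw_read, hw_write):
        self._hw_read = hw_read      # slow read of the hardware-owned copy
        self._hw_write = hw_write    # slow write to the hardware-owned copy
        self._values = {}
        self.avail = 0               # bit set: cached value is current
        self.dirty = 0               # bit set: cached value must be written back

    def read(self, reg):
        if not self.avail & (1 << reg):
            self._values[reg] = self._hw_read(reg)   # fetch only on demand
            self.avail |= 1 << reg
        return self._values[reg]

    def write(self, reg, value):
        self._values[reg] = value
        self.avail |= 1 << reg
        self.dirty |= 1 << reg

    def before_entry(self):
        """Write back only what software changed since the last entry."""
        for reg in list(self._values):
            if self.dirty & (1 << reg):
                self._hw_write(reg, self._values[reg])
        self.dirty = 0

    def after_exit(self):
        """The guest may have changed the registers: drop cached copies."""
        self.avail = 0


hw = {CachedRegs.PDPTR: 0x1000}
slow_reads = []


def hw_read(reg):
    slow_reads.append(reg)
    return hw[reg]


regs = CachedRegs(hw_read, hw.__setitem__)
regs.read(CachedRegs.PDPTR)
regs.read(CachedRegs.PDPTR)          # served from the cache
regs.after_exit()
regs.read(CachedRegs.PDPTR)          # fetched again after the exit
regs.write(CachedRegs.PDPTR, 0x2000)
regs.before_entry()                  # dirty value is written back
```

An exit that never touches the registers costs nothing in this model, which
is the saving the patchset is after.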

It reduces ept vmexit costs (inb on some pic port) from 3231 cycles to 2750
cycles, a 15% reduction.
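
As a quick check of the figure quoted above, using only the two cycle counts
given in the text:

```python
cycles_before = 3231
cycles_after = 2750

cycles_saved = cycles_before - cycles_after      # 481 cycles
reduction = cycles_saved / cycles_before         # fraction of the original cost
reduction_percent = round(reduction * 100, 1)    # 14.9, i.e. roughly 15%
```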

Please review.

Avi Kivity (3):
  KVM: VMX: Avoid duplicate ept tlb flush when setting cr3
  KVM: VMX: Simplify pdptr and cr3 management
  KVM: Cache pdptrs

 arch/x86/include/asm/kvm_host.h |4 +++
 arch/x86/kvm/kvm_cache_regs.h   |   10 +
 arch/x86/kvm/mmu.c  |7 -
 arch/x86/kvm/paging_tmpl.h  |2 +-
 arch/x86/kvm/svm.c  |   24 -
 arch/x86/kvm/vmx.c  |   42 +-
 arch/x86/kvm/x86.c  |8 +++
 7 files changed, 78 insertions(+), 19 deletions(-)



[PATCH 3/3] KVM: Cache pdptrs

2009-06-01 Thread Avi Kivity
Instead of reloading the pdptrs on every entry and exit (vmcs writes on vmx,
guest memory access on svm) extract them on demand.

Signed-off-by: Avi Kivity a...@redhat.com
---
 arch/x86/include/asm/kvm_host.h |4 
 arch/x86/kvm/kvm_cache_regs.h   |   10 ++
 arch/x86/kvm/mmu.c  |7 +--
 arch/x86/kvm/paging_tmpl.h  |2 +-
 arch/x86/kvm/svm.c  |   24 ++--
 arch/x86/kvm/vmx.c  |   22 ++
 arch/x86/kvm/x86.c  |8 
 7 files changed, 64 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d2b082d..1951d39 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -120,6 +120,10 @@ enum kvm_reg {
NR_VCPU_REGS
 };
 
+enum kvm_reg_ex {
+   VCPU_EXREG_PDPTR = NR_VCPU_REGS,
+};
+
 enum {
VCPU_SREG_ES,
VCPU_SREG_CS,
diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
index 1ff819d..24fc742 100644
--- a/arch/x86/kvm/kvm_cache_regs.h
+++ b/arch/x86/kvm/kvm_cache_regs.h
@@ -29,4 +29,14 @@ static inline void kvm_rip_write(struct kvm_vcpu *vcpu, 
unsigned long val)
kvm_register_write(vcpu, VCPU_REGS_RIP, val);
 }
 
+
+static inline u64 kvm_pdptr_read(struct kvm_vcpu *vcpu, int index)
+{
+   if (!test_bit(VCPU_EXREG_PDPTR,
+ (unsigned long *)&vcpu->arch.regs_avail))
+   kvm_x86_ops->cache_reg(vcpu, VCPU_EXREG_PDPTR);
+
+   return vcpu->arch.pdptrs[index];
+}
+
 #endif
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 7030b5f..809cce0 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -18,6 +18,7 @@
  */
 
 #include "mmu.h"
+#include "kvm_cache_regs.h"
 
 #include <linux/kvm_host.h>
 #include <linux/types.h>
@@ -1930,6 +1931,7 @@ static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
gfn_t root_gfn;
struct kvm_mmu_page *sp;
int direct = 0;
+   u64 pdptr;
 
root_gfn = vcpu->arch.cr3 >> PAGE_SHIFT;
 
@@ -1957,11 +1959,12 @@ static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
 
ASSERT(!VALID_PAGE(root));
if (vcpu->arch.mmu.root_level == PT32E_ROOT_LEVEL) {
-   if (!is_present_pte(vcpu->arch.pdptrs[i])) {
+   pdptr = kvm_pdptr_read(vcpu, i);
+   if (!is_present_pte(pdptr)) {
vcpu->arch.mmu.pae_root[i] = 0;
continue;
}
-   root_gfn = vcpu->arch.pdptrs[i] >> PAGE_SHIFT;
+   root_gfn = pdptr >> PAGE_SHIFT;
} else if (vcpu->arch.mmu.root_level == 0)
root_gfn = 0;
if (mmu_check_root(vcpu, root_gfn))
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 67785f6..4cb1dbf 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -131,7 +131,7 @@ walk:
pte = vcpu->arch.cr3;
 #if PTTYPE == 64
if (!is_long_mode(vcpu)) {
-   pte = vcpu->arch.pdptrs[(addr >> 30) & 3];
+   pte = kvm_pdptr_read(vcpu, (addr >> 30) & 3);
if (!is_present_pte(pte))
goto not_present;
--walker->level;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 3ac45e3..37397f6 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -779,6 +779,18 @@ static void svm_set_rflags(struct kvm_vcpu *vcpu, unsigned 
long rflags)
to_svm(vcpu)->vmcb->save.rflags = rflags;
 }
 
+static void svm_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
+{
+   switch (reg) {
+   case VCPU_EXREG_PDPTR:
+   BUG_ON(!npt_enabled);
+   load_pdptrs(vcpu, vcpu->arch.cr3);
+   break;
+   default:
+   BUG();
+   }
+}
+
 static void svm_set_vintr(struct vcpu_svm *svm)
 {
svm->vmcb->control.intercept |= 1ULL << INTERCEPT_VINTR;
@@ -2286,12 +2298,6 @@ static int handle_exit(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
}
vcpu->arch.cr0 = svm->vmcb->save.cr0;
vcpu->arch.cr3 = svm->vmcb->save.cr3;
-   if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
-   if (!load_pdptrs(vcpu, vcpu->arch.cr3)) {
-   kvm_inject_gp(vcpu, 0);
-   return 1;
-   }
-   }
if (mmu_reload) {
kvm_mmu_reset_context(vcpu);
kvm_mmu_load(vcpu);
@@ -2642,6 +2648,11 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu, struct 
kvm_run *kvm_run)
 
svm->next_rip = 0;
 
+   if (npt_enabled) {
+   vcpu->arch.regs_avail &= ~(1 << VCPU_EXREG_PDPTR);
+   vcpu->arch.regs_dirty &= ~(1 << VCPU_EXREG_PDPTR);
+   }
+
svm_complete_interrupts(svm);
 }
 
@@ -2750,6 +2761,7 @@ 

[PATCH 2/3] KVM: VMX: Simplify pdptr and cr3 management

2009-06-01 Thread Avi Kivity
Instead of reading the PDPTRs from memory after every exit (which is slow
and wrong, as the PDPTRs are stored on the cpu), sync the PDPTRs from
memory to the VMCS before entry, and from the VMCS to memory after exit.
Do the same for cr3.

Signed-off-by: Avi Kivity a...@redhat.com
---
 arch/x86/kvm/vmx.c |   21 +++--
 1 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 5607de8..1783606 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1538,10 +1538,6 @@ static void vmx_decache_cr4_guest_bits(struct kvm_vcpu 
*vcpu)
 static void ept_load_pdptrs(struct kvm_vcpu *vcpu)
 {
if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
-   if (!load_pdptrs(vcpu, vcpu->arch.cr3)) {
-   printk(KERN_ERR "EPT: Fail to load pdptrs!\n");
-   return;
-   }
vmcs_write64(GUEST_PDPTR0, vcpu->arch.pdptrs[0]);
vmcs_write64(GUEST_PDPTR1, vcpu->arch.pdptrs[1]);
vmcs_write64(GUEST_PDPTR2, vcpu->arch.pdptrs[2]);
@@ -1549,6 +1545,16 @@ static void ept_load_pdptrs(struct kvm_vcpu *vcpu)
}
 }
 
+static void ept_save_pdptrs(struct kvm_vcpu *vcpu)
+{
+   if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
+   vcpu->arch.pdptrs[0] = vmcs_read64(GUEST_PDPTR0);
+   vcpu->arch.pdptrs[1] = vmcs_read64(GUEST_PDPTR1);
+   vcpu->arch.pdptrs[2] = vmcs_read64(GUEST_PDPTR2);
+   vcpu->arch.pdptrs[3] = vmcs_read64(GUEST_PDPTR3);
+   }
+}
+
 static void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
 
 static void ept_update_paging_mode_cr0(unsigned long *hw_cr0,
@@ -1642,7 +1648,6 @@ static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned 
long cr3)
if (enable_ept) {
eptp = construct_eptp(cr3);
vmcs_write64(EPT_POINTER, eptp);
-   ept_load_pdptrs(vcpu);
guest_cr3 = is_paging(vcpu) ? vcpu->arch.cr3 :
VMX_EPT_IDENTITY_PAGETABLE_ADDR;
}
@@ -3199,7 +3204,7 @@ static int vmx_handle_exit(struct kvm_run *kvm_run, 
struct kvm_vcpu *vcpu)
 * to sync with guest real CR3. */
if (enable_ept && is_paging(vcpu)) {
vcpu->arch.cr3 = vmcs_readl(GUEST_CR3);
-   ept_load_pdptrs(vcpu);
+   ept_save_pdptrs(vcpu);
}
 
if (unlikely(vmx->fail)) {
@@ -3376,6 +3381,10 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu, struct 
kvm_run *kvm_run)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
 
+   if (enable_ept && is_paging(vcpu)) {
+   vmcs_writel(GUEST_CR3, vcpu->arch.cr3);
+   ept_load_pdptrs(vcpu);
+   }
/* Record the guest's net vcpu time for enforced NMI injections. */
if (unlikely(!cpu_has_virtual_nmis()  vmx-soft_vnmi_blocked))
vmx-entry_time = ktime_get();
-- 
1.6.0.6



[PATCH 1/3] KVM: VMX: Avoid duplicate ept tlb flush when setting cr3

2009-06-01 Thread Avi Kivity
vmx_set_cr3() will call vmx_tlb_flush(), which will flush the ept context.
So there is no need to call ept_sync_context() explicitly.

Signed-off-by: Avi Kivity a...@redhat.com
---
 arch/x86/kvm/vmx.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 25f1239..5607de8 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1642,7 +1642,6 @@ static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
        if (enable_ept) {
                eptp = construct_eptp(cr3);
                vmcs_write64(EPT_POINTER, eptp);
-               ept_sync_context(eptp);
                ept_load_pdptrs(vcpu);
                guest_cr3 = is_paging(vcpu) ? vcpu->arch.cr3 :
                        VMX_EPT_IDENTITY_PAGETABLE_ADDR;
-- 
1.6.0.6



Re: [Autotest] [PATCH] Adding kvm test (kvm autotest upstream merge proposal take 2)

2009-06-01 Thread Uri Lublin

On 05/31/2009 10:24 PM, Lucas Meneghel Rodrigues wrote:

The test was just committed. Stage 1 of the upstream merge is complete.

Please let me know if you have any doubts. Happy hacking!


Please credit Dror Russo for his contribution (see below).
Dror has been contributing since kvm-autotest early days.

Thanks,
Uri.



On Fri, 2009-05-29 at 14:58 -0300, Lucas Meneghel Rodrigues wrote:

After some conversations, we decided to rename kvm_runtest_2 to kvm.
Also, I corrected some small mistakes I made in the first patches.

Hopefully, things are good to go. From the feedback I've got on the IRC
channel, people are happy with the current state of the test. I am going
to commit it if nobody objects :)

From: Uri Lublin (u...@redhat.com)


  Dror Russo (dru...@redhat.com)


   Michael Goldish (mgold...@redhat.com)
   David Huff (dh...@redhat.com)
   Alexey Eromenko (aerom...@redhat.com)
   Mike Burns (mbu...@redhat.com)

Signed-off-by: Lucas Meneghel Rodriguesl...@redhat.com

Index: trunk/client/tests/kvm/control
===
--- trunk/client/tests/kvm/control  (revision 0)
+++ trunk/client/tests/kvm/control  (revision 0)
@@ -0,0 +1,155 @@
+AUTHOR = 
+u...@redhat.com (Uri Lublin)


   +dru...@redhat.com (Dror Russo)


+mgold...@redhat.com (Michael Goldish)
+dh...@redhat.com (David Huff)
+aerom...@redhat.com (Alexey Eromenko)
+mbu...@redhat.com (Mike Burns)
+





Index: trunk/client/tests/kvm/make_html_report.py
===
--- trunk/client/tests/kvm/make_html_report.py  (revision 0)
+++ trunk/client/tests/kvm/make_html_report.py  (revision 0)
@@ -0,0 +1,1735 @@
+#!/usr/bin/python
+
+Script used to parse the test results and generate an HTML report.
+
+@copyright: (c)2005-2007 Matt Kruse (javascripttoolbox.com)
+@copyright: Red Hat 2008-2009


   +@author: dru...@redhat.com (Dror Russo)



Index: trunk/client/tests/kvm/kvm.py
===
--- trunk/client/tests/kvm/kvm.py   (revision 0)
+++ trunk/client/tests/kvm/kvm.py   (revision 0)
@@ -0,0 +1,109 @@
+import sys, os, time, shelve, random, resource, logging
+from autotest_lib.client.bin import test
+from autotest_lib.client.common_lib import error
+
+
+class test_routine:
+def __init__(self, module_name, routine_name):
+self.module_name = module_name
+self.routine_name = routine_name
+self.routine = None
+
+
+class kvm(test.test):
+
+Suite of KVM virtualization functional tests.
+Contains tests for testing both KVM kernel code and userspace code.
+
+@copyright: Red Hat 2008-2009
+@author: Uri Lublin (u...@redhat.com)


   +@author: Dror Russo (dru...@redhat.com)


+@author: Michael Goldish (mgold...@redhat.com)
+@author: David Huff (dh...@redhat.com)
+@author: Alexey Eromenko (aerom...@redhat.com)
+@author: Mike Burns (mbu...@redhat.com)
+




Re: configure script bug..

2009-06-01 Thread john cooper
Avi Kivity wrote:
 john cooper wrote:
 Hit this yesterday when configure hung attempting
 to pull the version from a kernel's .config.

   
 
 Looks right, but missing a signoff.


Signed-off-by: john cooper john.coo...@redhat.com

diff --git a/configure b/configure
index 493c178..1fd133c 100755
--- a/configure
+++ b/configure
@@ -126,7 +126,7 @@ if [ -n $no_uname ]; then
 elif [ -e $kerneldir/include/config/kernel.release ]; then
 depmod_version=`cat $kerneldir/include/config/kernel.release`
 elif [ -e $kerneldir/.config ]; then
-   depmod_version=$(awk '/Linux kernel version:/ { print $NF }'
+   depmod_version=$(awk '/Linux kernel version:/ { print $NF }' \
 $kerneldir/.config)
 else
 echo

-- 
john.coo...@redhat.com


how to manage KVM guests with libvirt ?

2009-06-01 Thread Riccardo Veraldi

Hello,
I have always created my guests by hand with qemu-kvm syntax.
Is there a way to control and manage KVM guests with libvirt without being
forced to create the guest with virt-manager or with virsh?

What if I port a guest from Xen to KVM, how do I manage it with libvirt?

thanks


Rick



Re: configure script bug..

2009-06-01 Thread Avi Kivity

john cooper wrote:

Avi Kivity wrote:
  

john cooper wrote:


Hit this yesterday when configure hung attempting
to pull the version from a kernel's .config.

  
  

Looks right, but missing a signoff.




Signed-off-by: john cooper john.coo...@redhat.com

diff --git a/configure b/configure
index 493c178..1fd133c 100755
--- a/configure
+++ b/configure
@@ -126,7 +126,7 @@ if [ -n $no_uname ]; then
 elif [ -e $kerneldir/include/config/kernel.release ]; then
 depmod_version=`cat $kerneldir/include/config/kernel.release`
 elif [ -e $kerneldir/.config ]; then
-   depmod_version=$(awk '/Linux kernel version:/ { print $NF }'
+   depmod_version=$(awk '/Linux kernel version:/ { print $NF }' \
 $kerneldir/.config)
 else
 echo
  


What repo and branch are you looking at exactly?  my ./configure has 
this backslash.



--
error compiling committee.c: too many arguments to function



Re: how to manage KVM guests with libvirt ?

2009-06-01 Thread Javier Guerra
On Mon, Jun 1, 2009 at 9:41 AM, Riccardo Veraldi
riccardo.vera...@cnaf.infn.it wrote:
 Hello,
 I have always created my guests by hand with qemu-kvm syntax.
 Is there a way to control and manage KVM guests with libvirt without being
 forced to create the guest with virtmanager or with virtsh ?

i'm doing this, after seeing the management tools available with
libvirt.  the easiest way is to write an XML file that describes what you
already know how to do on the command line.  there are still a couple of
missing options (most notably cache=none, and getting to the command
console); but you should be able to get it to work.

virt-install is a nice hack for installing well-behaved linux distros;
but for windows (where you have to pick exactly which features to
expose at different steps of the install), it's easier to do on the
command line and write the needed XML after that.

one tip, if you choose to allow libvirt to manage an LVM storage pool,
it's better not to create/destroy LVs manually.  use virsh for that or
just don't tell libvirt about your LVM.  i had a couple of host
crashes when the libvirt view of the storage got inconsistent with
reality.


-- 
Javier


Re: Fwd: kvm-autotest: False PASS results

2009-06-01 Thread Uri Lublin

On 05/10/2009 08:15 PM, sudhir kumar wrote:

Hi Uri,
Any comments?


-- Forwarded message --
From: sudhir kumarsmalik...@gmail.com

The kvm-autotest shows the following PASS results for migration,
although the VM had crashed and the test should have failed.

Here is the sequence of test commands and results grepped from
kvm-autotest output.

/root/sudhir/regression/test/kvm-autotest-phx/client/tests/kvm_runtest_2/qemu
-name 'vm1' -monitor
unix:/tmp/monitor-20090508-055624-QSuS,server,nowait -drive
file=/root/sudhir/regression/test/kvm-autotest-phx/client/tests/kvm_runtest_2/images/rhel5-32.raw,if=ide,boot=on
-net nic,vlan=0 -net user,vlan=0 -m 8192
-smp 4 -redir tcp:5000::22 -vnc :1


/root/sudhir/regression/test/kvm-autotest-phx/client/tests/kvm_runtest_2/qemu
-name 'dst' -monitor
unix:/tmp/monitor-20090508-055625-iamW,server,nowait -drive
file=/root/sudhir/regression/test/kvm-autotest-phx/client/tests/kvm_runtest_2/images/rhel5-32.raw,if=ide,boot=on
-net nic,vlan=0 -net user,vlan=0 -m 8192
-smp 4 -redir tcp:5001::22 -vnc :2 -incoming tcp:0:5200



2009-05-08 05:58:43,471 Configuring logger for client level
GOOD
kvm_runtest_2.raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.1
END GOOD
kvm_runtest_2.raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.1

GOOD
kvm_runtest_2.raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2
kvm_runtest_2.raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2
timestamp=1241762371
localtime=May 08 05:59:31   completed successfully
Persistent state variable __group_level now set to 1
END GOOD
kvm_runtest_2.raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2
kvm_runtest_2.raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2
timestamp=1241762371
localtime=May 08 05:59:31

 From the test output it looks like the test successfully
logged into the guest after migration:

20090508-055926 raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2: Migration
finished successfully
20090508-055926 raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2: DEBUG:
send_monitor_cmd: Sending monitor command: screendump
/root/sudhir/regression/test/kvm-autotest-phx/client/results/default/kvm_runtest_2.raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2/debug/migration_post.ppm
20090508-055926 raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2: DEBUG:
send_monitor_cmd: Sending monitor command: screendump
/root/sudhir/regression/test/kvm-autotest-phx/client/results/default/kvm_runtest_2.raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2/debug/migration_pre.ppm
20090508-055926 raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2: DEBUG:
send_monitor_cmd: Sending monitor command: quit
20090508-055926 raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2: DEBUG:
is_sshd_running: Timeout
20090508-055926 raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2: Logging into
guest after migration...
20090508-055926 raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2: DEBUG:
remote_login: Trying to login...
20090508-055927 raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2: DEBUG:
remote_login: Got 'Are you sure...'; sending 'yes'
20090508-055927 raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2: DEBUG:
remote_login: Got password prompt; sending '123456'
20090508-055928 raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2: DEBUG:
remote_login: Got shell prompt -- logged in
20090508-055928 raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2: Logged in
after migration
20090508-055928 raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2: DEBUG:
get_command_status_output: Sending command: help
20090508-055930 raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2: DEBUG:
postprocess_vm: Postprocessing VM 'vm1'...
20090508-055930 raw.8gb_mem.smp4.RHEL.5.3.i386.migrate.2: DEBUG:
postprocess_vm: VM object found in environment

When I connected over VNC, the final migrated VM had crashed with a call
trace, as shown in the attachment.
It is unlikely that the call trace appeared only after the test
finished, since migration with more than 4GB of memory
is already broken [BUG 52527]. This looks like a false PASS
to me. Any idea how we can handle
such false positive results? Shall we wait for some time after
migration, log into the VM, do some work or run some good test,
get the output, and report that the VM is alive?




I don't think it's a False PASS.
It seems the test was able to ssh into the guest, and run a command on the 
guest.

Currently we only run migration once (round-trip). I think we should run 
migration more than once (using iterations). If the guest crashes due to 
migration, it would fail following rounds of migration.


Sorry for the late reply,
Uri.
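
The repeated-migration idea above can be sketched as follows. This is only an illustration, not kvm-autotest code: `migrate` and `guest_is_alive` are hypothetical callables standing in for whatever the test uses to migrate the VM and to probe the guest (for example an ssh login followed by a command).

```python
def run_migration_rounds(migrate, guest_is_alive, iterations=3):
    """Migrate `iterations` times, probing the guest after every round.

    Both callables are hypothetical placeholders, not kvm-autotest APIs.
    Returns the number of completed rounds; raises RuntimeError as soon
    as the guest stops responding.
    """
    for round_no in range(1, iterations + 1):
        migrate()
        if not guest_is_alive():
            raise RuntimeError(
                "guest not responding after migration round %d" % round_no)
    return iterations
```

A guest that survives the first round but has crashed by the second would then fail the test in round 2, where a single-round test would have reported a pass.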


Re: [PATCH][KVM-AUTOTEST] Add custom install option for kvm_install

2009-06-01 Thread Uri Lublin

On 05/12/2009 06:34 PM, Mike Burns wrote:

From: Michael Burnsmbu...@redhat.com


Signed-off-by: Michael Burnsmbu...@redhat.com
---
  client/tests/kvm_runtest_2/control|   18 +-
  client/tests/kvm_runtest_2/kvm_install.py |   15 +++
  2 files changed, 32 insertions(+), 1 deletions(-)

diff --git a/client/tests/kvm_runtest_2/control 
b/client/tests/kvm_runtest_2/control
index fd68e94..d6e26bc 100644
--- a/client/tests/kvm_runtest_2/control
+++ b/client/tests/kvm_runtest_2/control
@@ -41,6 +41,19 @@ link_if_not_exist(pwd, qemu_img, 'qemu-img')

  # -
  # Build and install kvm
+#
+# Details of Install options
+#   Mode: custom
+#   Description:  install from custom install script
+#   Parameters needed:
+# install_script:
+#   location of script relative to the kvm-runtest_2 directory.
+#   Script will be executed from test.bindir (generally kvm_runtest_2)
+#   parameters for the script can be passed either as environment variables
+#   in the params array below or in the definition of install_script.
+#   If they are passed as part of params, then they will be accessible as
+#   KVM_INSTALL_s  in the OS Environment when your script runs.
+#


I think this, being the only explanation about kvm_install, can be confusing to 
the user. We can add a link to the wiki instead:

http://kvm.et.redhat.com/page/KVM-Autotest/ControlFile


  # -
  params = {
  "name": "kvm_install",
@@ -57,7 +70,10 @@ params = {

  ## Install from git
  "git_repo": 'git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git',
-"user_git_repo": 'git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm-userspace.git'
+"user_git_repo": 'git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm-userspace.git',
+
+## Custom install
+"install_script": 'custom_kvm_install.sh param1'
  }

  # Comment the job.run_test line if you do not want to install kvm on the host.
diff --git a/client/tests/kvm_runtest_2/kvm_install.py 
b/client/tests/kvm_runtest_2/kvm_install.py
index 8be5a93..234c77a 100755
--- a/client/tests/kvm_runtest_2/kvm_install.py
+++ b/client/tests/kvm_runtest_2/kvm_install.py
@@ -77,6 +77,21 @@ def run_kvm_install(test, params, env):
  elif install_mode == "localsrc":
  __install_kvm(test, srcdir)

+# install from custom script
+elif install_mode == "custom":
+install_script = params.get("install_script")
+script = os.path.join(test.bindir,install_script)


This line (script = ..) should be located below the following if statement.
if install_script is not in params (and install_script is None), os.path.join 
fails.



+if not install_script:
+message = "Custom script filename not specified"
+kvm_log.error(message)
+raise error.TestError, message
+for k in params.keys():


Fix white-space.


+ kvm_log.info("Adding KVM_INSTALL_%s to Environment" % (k))


kvm_log.debug


+  os.putenv("KVM_INSTALL_%s" % (k), str(params[k]))
+   kvm_log.info("Running " + script + " to install kvm")
+os.system("cd %s; %s" % (test.bindir, script))


if the script fails, quit (raise).


+   kvm_log.info("Completed %s" % (script))
+
  # invalid installation mode
  else:
  message = "Invalid installation mode: '%s'" % install_mode




Sorry for the late reply,
Uri.
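
The review comments above could be folded into the custom-install branch roughly as follows. This is a standalone sketch in Python 3 syntax (the patch itself targets Python 2), and `run_custom_install`, `InstallError`, and the injectable `run` parameter are illustrative names, not part of kvm_install.py.

```python
import os
import subprocess


class InstallError(Exception):
    """Hypothetical stand-in for autotest's error.TestError."""


def run_custom_install(bindir, params, run=subprocess.call):
    install_script = params.get("install_script")
    # Check first: os.path.join() raises TypeError when given None.
    if not install_script:
        raise InstallError("Custom script filename not specified")
    script = os.path.join(bindir, install_script)

    # Expose every parameter to the script as KVM_INSTALL_<key>.
    env = dict(os.environ)
    for key, value in params.items():
        env["KVM_INSTALL_%s" % key] = str(value)

    # install_script may carry arguments, so run it through the shell.
    status = run(script, shell=True, cwd=bindir, env=env)
    if status != 0:
        # Quit instead of continuing after a failed install.
        raise InstallError("%s failed with status %s" % (script, status))
    return script
```

Passing the environment explicitly, rather than calling os.putenv, keeps the KVM_INSTALL_* variables from leaking into the rest of the test process; that is a design choice of this sketch, not something the patch or the review asks for.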


Re: [PATCH] Fixing kvm test authorship issue

2009-06-01 Thread Lucas Meneghel Rodrigues
Ok, patch applied. Thanks for pointing this out, Uri.

On Mon, 2009-06-01 at 12:41 -0300, Lucas Meneghel Rodrigues wrote:
 When preparing the kvm test for inclusion on autotest, I did
 forget to add the name of Dror Russo (dru...@redhat.com) as
 one of the rightful authors of the test. This patch includes
 his name on the docstrings of the test.
 ---
  client/tests/kvm/control |1 +
  client/tests/kvm/kvm.py  |1 +
  client/tests/kvm/make_html_report.py |1 +
  3 files changed, 3 insertions(+), 0 deletions(-)
 
 diff --git a/client/tests/kvm/control b/client/tests/kvm/control
 index f4b2113..eb38686 100644
 --- a/client/tests/kvm/control
 +++ b/client/tests/kvm/control
 @@ -1,5 +1,6 @@
  AUTHOR = 
  u...@redhat.com (Uri Lublin)
 +dru...@redhat.com (Dror Russo)
  mgold...@redhat.com (Michael Goldish)
  dh...@redhat.com (David Huff)
  aerom...@redhat.com (Alexey Eromenko)
 diff --git a/client/tests/kvm/kvm.py b/client/tests/kvm/kvm.py
 index 91ec89a..1b9013c 100644
 --- a/client/tests/kvm/kvm.py
 +++ b/client/tests/kvm/kvm.py
 @@ -17,6 +17,7 @@ class kvm(test.test):
  
  @copyright: Red Hat 2008-2009
  @author: Uri Lublin (u...@redhat.com)
 +@author: Dror Russo (dru...@redhat.com)
  @author: Michael Goldish (mgold...@redhat.com)
  @author: David Huff (dh...@redhat.com)
  @author: Alexey Eromenko (aerom...@redhat.com)
 diff --git a/client/tests/kvm/make_html_report.py 
 b/client/tests/kvm/make_html_report.py
 index 988b2f3..6aed39e 100755
 --- a/client/tests/kvm/make_html_report.py
 +++ b/client/tests/kvm/make_html_report.py
 @@ -4,6 +4,7 @@ Script used to parse the test results and generate an HTML 
 report.
  
  @copyright: (c)2005-2007 Matt Kruse (javascripttoolbox.com)
  @copyright: Red Hat 2008-2009
 +@author: Dror Russo (dru...@redhat.com)
  
  
  import os, sys, re, getopt, time, datetime, commands
-- 
Lucas Meneghel Rodrigues
Software Engineer (QE)
Red Hat - Emerging Technologies



Re: how to manage KVM guests with libvirt ?

2009-06-01 Thread Javier Guerra
On Mon, Jun 1, 2009 at 11:02 AM, Riccardo Veraldi
riccardo.vera...@cnaf.infn.it wrote:
 thank you very much.

 How do I know all the XML tag options?

 How do I convert command-line qemu options into XML tags?

 And where do I put the XML file?

you'll have to play around a little with a test machine before you get
the hang of it.  the xml options are documented on the libvirt site.
put them in /etc/libvirt/qemu/blahblahblah.xml, and the libvirtd
daemon will pick them up at startup.

 about the console, I used to start qemu-kvm under the screen program.
 is there a better way to get a serial console?

for that i don't have any good answer.  it seems libvirt ties the qemu
monitor with a pipe to its own process, so you don't have manual
control anymore.

i'd much prefer if it used a unix socket, so you could open it with
socat or similar tools.

-- 
Javier


Re: kvm-kmod.git

2009-06-01 Thread Ryan Harper
* Avi Kivity a...@redhat.com [2009-04-28 05:53]:
 Michael S. Tsirkin wrote:
 Or maybe, add a s/CONFIG_KVM_TRACE/CONFIG_KMOD_KVM_TRACE/ to make the  
 two options independent.
 
 
 You decide. 
 
 Well, I think it's less confusing.
 
 I also wonder what happens if one tries to build on
 a machine with kvm built into kernel. Ideally one would get
 a clear error message.
   
 
 kvm-kmod is really designed for those running on pre-kvm distro kernels, 
 and for those testing newer kvm versions on distro kernels.  If you can 
 compile your own kernel, download kvm.git and run that.

So no way of having kvm-kmod sync in kvm.git bits and still build
against the current running kernel?

I wanted to test out 2.6.29-maint kernel modules and in
kvm-userspace/kernel, I could checkout maint/2.6.29 in kvm.git and then
make LINUX= sync.  I couldn't quite figure out how to do that with
kvm-kmod since when I do a ./configure --kerneldir -- it's building
modules *for* whatever kernel is at kerneldir rather than just syncing
in the kvm bits from that kerneldir and then building modules against
the running kernel.

The reason I'm digging is that I'm seeing 64-bit migration failing on
maint/2.6.29 kvm modules, but not on upstream kvm kernel bits and I
wanted to bisect to find where 64-bit migration is fixed so I can
suggest what to pull into maint/2.6.29.


-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ry...@us.ibm.com


[KVM PATCH v3 0/3] mmio/pio cleanup

2009-06-01 Thread Gregory Haskins
(This is v3 of the series, applies to kvm.git/master:7ff90748)

This is in prep for some more substantial mmio/pio work for iosignalfd,
coming later.

[ Changelog:

  v3:
  *) Addressed feedback from Chris Wright w.r.t. patch 2/3
  *) Rebased to kvm.git/master:7ff90748

]

---

Gregory Haskins (3):
  kvm: do not register i8254 PIO regions until we are initialized
  kvm: cleanup io_device code
  kvm: fix potential coalesced_mmio leak on shutdown


 arch/x86/kvm/i8254.c  |   53 +
 arch/x86/kvm/i8259.c  |   20 -
 arch/x86/kvm/lapic.c  |   22 +--
 arch/x86/kvm/x86.c|2 +-
 virt/kvm/coalesced_mmio.c |   26 ++
 virt/kvm/ioapic.c |   22 +--
 virt/kvm/iodev.h  |   29 +
 virt/kvm/kvm_main.c   |2 +-
 8 files changed, 117 insertions(+), 59 deletions(-)

-- 
Signature


[KVM PATCH v3 1/3] kvm: fix potential coalesced_mmio leak on shutdown

2009-06-01 Thread Gregory Haskins
It would appear that we are invoking kfree() on the wrong pointer in the
destructor for the coalesced_mmio device.  This could result in a potential
leak during shutdown.  This works today because the kvm_io_device is
the first element of the private structure, but this could change in
the future, so let's clean this up.

Signed-off-by: Gregory Haskins ghask...@novell.com
---

 virt/kvm/coalesced_mmio.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
index 5ae620d..03ea280 100644
--- a/virt/kvm/coalesced_mmio.c
+++ b/virt/kvm/coalesced_mmio.c
@@ -80,7 +80,10 @@ static void coalesced_mmio_write(struct kvm_io_device *this,
 
 static void coalesced_mmio_destructor(struct kvm_io_device *this)
 {
-   kfree(this);
+   struct kvm_coalesced_mmio_dev *dev =
+   (struct kvm_coalesced_mmio_dev *)this->private;
+
+   kfree(dev);
 }
 
 int kvm_coalesced_mmio_init(struct kvm *kvm)
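
Why kfree(this) happens to work today can be illustrated outside the kernel. The following is a ctypes sketch under stated assumptions, not kernel code: the structures are simplified stand-ins (only the `private` member is taken from the patch), and `DevMoved` is a hypothetical future layout in which another member precedes the embedded device.

```python
import ctypes


class KvmIoDevice(ctypes.Structure):
    # Simplified stand-in for struct kvm_io_device.
    _fields_ = [("private", ctypes.c_void_p)]


class DevFirst(ctypes.Structure):
    # Embedded device is the first member, as the message describes.
    _fields_ = [("dev", KvmIoDevice), ("nb_zones", ctypes.c_int)]


class DevMoved(ctypes.Structure):
    # Hypothetical future layout: another member precedes the device.
    _fields_ = [("nb_zones", ctypes.c_int), ("dev", KvmIoDevice)]


def container_of(member_addr, container_type, member_name):
    """Address of the container, given the address of one of its members."""
    return member_addr - getattr(container_type, member_name).offset


def same_address(obj):
    """True when the embedded device and its container share an address."""
    return ctypes.addressof(obj.dev) == ctypes.addressof(obj)
```

With `DevFirst` the two addresses coincide, so freeing the device pointer frees the whole object. With `DevMoved` they differ, and only the container address (recovered through the private pointer, or through the member offset) is valid to free.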



[KVM PATCH v3 2/3] kvm: cleanup io_device code

2009-06-01 Thread Gregory Haskins
We modernize the io_device code so that we use container_of() instead of
dev->private, and move the vtable to a separate ops structure
(theoretically allows better caching for multiple instances of the same
ops structure)

Signed-off-by: Gregory Haskins ghask...@novell.com
---

 arch/x86/kvm/i8254.c  |   40 
 arch/x86/kvm/i8259.c  |   20 ++--
 arch/x86/kvm/lapic.c  |   22 +++---
 arch/x86/kvm/x86.c|2 +-
 virt/kvm/coalesced_mmio.c |   25 +++--
 virt/kvm/ioapic.c |   22 +++---
 virt/kvm/iodev.h  |   29 -
 virt/kvm/kvm_main.c   |2 +-
 8 files changed, 109 insertions(+), 53 deletions(-)

diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 584e3d3..21301a2 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -347,10 +347,20 @@ void kvm_pit_load_count(struct kvm *kvm, int channel, u32 val)
        mutex_unlock(&kvm->arch.vpit->pit_state.lock);
 }
 
+static inline struct kvm_pit *dev_to_pit(struct kvm_io_device *dev)
+{
+       return container_of(dev, struct kvm_pit, dev);
+}
+
+static inline struct kvm_pit *speaker_to_pit(struct kvm_io_device *dev)
+{
+       return container_of(dev, struct kvm_pit, speaker_dev);
+}
+
 static void pit_ioport_write(struct kvm_io_device *this,
                             gpa_t addr, int len, const void *data)
 {
-       struct kvm_pit *pit = (struct kvm_pit *)this->private;
+       struct kvm_pit *pit = dev_to_pit(this);
        struct kvm_kpit_state *pit_state = &pit->pit_state;
        struct kvm *kvm = pit->kvm;
        int channel, access;
@@ -423,7 +433,7 @@ static void pit_ioport_write(struct kvm_io_device *this,
 static void pit_ioport_read(struct kvm_io_device *this,
                            gpa_t addr, int len, void *data)
 {
-       struct kvm_pit *pit = (struct kvm_pit *)this->private;
+       struct kvm_pit *pit = dev_to_pit(this);
        struct kvm_kpit_state *pit_state = &pit->pit_state;
        struct kvm *kvm = pit->kvm;
        int ret, count;
@@ -494,7 +504,7 @@ static int pit_in_range(struct kvm_io_device *this, gpa_t addr,
 static void speaker_ioport_write(struct kvm_io_device *this,
                                 gpa_t addr, int len, const void *data)
 {
-       struct kvm_pit *pit = (struct kvm_pit *)this->private;
+       struct kvm_pit *pit = speaker_to_pit(this);
        struct kvm_kpit_state *pit_state = &pit->pit_state;
        struct kvm *kvm = pit->kvm;
        u32 val = *(u32 *) data;
@@ -508,7 +518,7 @@ static void speaker_ioport_write(struct kvm_io_device *this,
 static void speaker_ioport_read(struct kvm_io_device *this,
                                gpa_t addr, int len, void *data)
 {
-       struct kvm_pit *pit = (struct kvm_pit *)this->private;
+       struct kvm_pit *pit = speaker_to_pit(this);
        struct kvm_kpit_state *pit_state = &pit->pit_state;
        struct kvm *kvm = pit->kvm;
        unsigned int refresh_clock;
@@ -560,6 +570,18 @@ static void pit_mask_notifer(struct kvm_irq_mask_notifier *kimn, bool mask)
        }
 }
 
+static const struct kvm_io_device_ops pit_dev_ops = {
+   .read = pit_ioport_read,
+   .write= pit_ioport_write,
+   .in_range = pit_in_range,
+};
+
+static const struct kvm_io_device_ops speaker_dev_ops = {
+   .read = speaker_ioport_read,
+   .write= speaker_ioport_write,
+   .in_range = speaker_in_range,
+};
+
 struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags)
 {
struct kvm_pit *pit;
@@ -580,17 +602,11 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags)
        spin_lock_init(&pit->pit_state.inject_lock);
 
        /* Initialize PIO device */
-       pit->dev.read = pit_ioport_read;
-       pit->dev.write = pit_ioport_write;
-       pit->dev.in_range = pit_in_range;
-       pit->dev.private = pit;
+       kvm_iodevice_init(&pit->dev, &pit_dev_ops);
        kvm_io_bus_register_dev(&kvm->pio_bus, &pit->dev);
 
        if (flags & KVM_PIT_SPEAKER_DUMMY) {
-               pit->speaker_dev.read = speaker_ioport_read;
-               pit->speaker_dev.write = speaker_ioport_write;
-               pit->speaker_dev.in_range = speaker_in_range;
-               pit->speaker_dev.private = pit;
+               kvm_iodevice_init(&pit->speaker_dev, &speaker_dev_ops);
                kvm_io_bus_register_dev(&kvm->pio_bus, &pit->speaker_dev);
        }
 
diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index 1ccb50c..2520922 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -444,10 +444,15 @@ static int picdev_in_range(struct kvm_io_device *this, 
gpa_t addr,
}
 }
 
+static inline struct kvm_pic *to_pic(struct kvm_io_device *dev)
+{
+   return container_of(dev, struct kvm_pic, dev);
+}
+
 static void picdev_write(struct kvm_io_device *this,
 gpa_t addr, int len, const void *val)
 {
-   struct kvm_pic *s = this->private;
+   

[KVM PATCH v3 3/3] kvm: do not register i8254 PIO regions until we are initialized

2009-06-01 Thread Gregory Haskins
We currently publish the i8254 resources to the pio_bus before the devices
are fully initialized.  Since we hold the pit_lock, it's probably not
a real issue.  But let's clean this up anyway.

Found-by: Avi Kivity a...@redhat.com
Signed-off-by: Gregory Haskins ghask...@novell.com
---

 arch/x86/kvm/i8254.c |   17 -
 1 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 21301a2..f91b0e3 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -601,15 +601,6 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags)
        mutex_lock(&pit->pit_state.lock);
        spin_lock_init(&pit->pit_state.inject_lock);
 
-       /* Initialize PIO device */
-       kvm_iodevice_init(&pit->dev, &pit_dev_ops);
-       kvm_io_bus_register_dev(&kvm->pio_bus, &pit->dev);
-
-       if (flags & KVM_PIT_SPEAKER_DUMMY) {
-               kvm_iodevice_init(&pit->speaker_dev, &speaker_dev_ops);
-               kvm_io_bus_register_dev(&kvm->pio_bus, &pit->speaker_dev);
-       }
-
        kvm->arch.vpit = pit;
        pit->kvm = kvm;
 
@@ -628,6 +619,14 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags)
        pit->mask_notifier.func = pit_mask_notifer;
        kvm_register_irq_mask_notifier(kvm, 0, &pit->mask_notifier);
 
+       kvm_iodevice_init(&pit->dev, &pit_dev_ops);
+       kvm_io_bus_register_dev(&kvm->pio_bus, &pit->dev);
+
+       if (flags & KVM_PIT_SPEAKER_DUMMY) {
+               kvm_iodevice_init(&pit->speaker_dev, &speaker_dev_ops);
+               kvm_io_bus_register_dev(&kvm->pio_bus, &pit->speaker_dev);
+       }
+
        return pit;
 }
 



Re: kvm-kmod.git

2009-06-01 Thread Avi Kivity

Ryan Harper wrote:

I also wonder what happens if one tries to build on
a machine with kvm built into kernel. Ideally one would get
a clear error message.
 
  
kvm-kmod is really designed for those running on pre-kvm distro kernels, 
and for those testing newer kvm versions on distro kernels.  If you can 
compile your own kernel, download kvm.git and run that.



So no way of having kvm-kmod sync in kvm.git bits and still build
against the current running kernel?
  


We must be miscommunicating, what you describe is kvm-kmod.git's sole 
purpose in life.



I wanted to test out 2.6.29-maint kernel modules and in
kvm-userspace/kernel, I could checkout maint/2.6.29 in kvm.git and then
make LINUX= sync.  I couldn't quite figure out how to do that with
kvm-kmod since when I do a ./configure --kerneldir -- it's building
modules *for* whatever kernel is at kerneldir rather than just syncing
in the kvm bits from that kerneldir and then building modules against
the running kernel.
  


'make sync' will copy the kvm bits from the linux-2.6 directory, or 
$(LINUX) if you specify that to make.


So:
 LINUX= is for the kvm sources (further controlled by whatever branch 
is checked out)

 --kerneldir= is for the host kernel


The reason I'm digging is that I'm seeing 64-bit migration failing on
maint/2.6.29 kvm modules, but not on upstream kvm kernel bits and I
wanted to bisect to find where 64-bit migration is fixed so I can
suggest what to pull into maint/2.6.29.
  


Thanks, that's helpful.  How does it fail?  maybe I can supply an 
educated guess.


--
error compiling committee.c: too many arguments to function



Re: how to manage KVM guests with libvirt ?

2009-06-01 Thread Jim Paris
Javier Guerra wrote:
 On Mon, Jun 1, 2009 at 11:02 AM, Riccardo Veraldi
 riccardo.vera...@cnaf.infn.it wrote:
  thank you very much.
 
  How do I know all the XML tag options?
 
  How do I convert from command-line qemu options into XML tags?
 
  And where do I put the XML file?
 
 you'll have to play around a little with a test machine before you get
 the hang of it.  the xml options are documented on the libvirt site.
 put them in /etc/libvirt/qemu/blahblahblah.xml, and the libvirtd
 daemon will pick them at startup.

With libvirt 0.6.4, you can use the virsh domxml-from-native tool to
convert from QEMU arguments to XML automatically:
  http://libvirt.org/drvqemu.html#xmlimport

  About the console: I used to start qemu-kvm under the screen program.
  Is there a better way to have a serial console?
 
 for that i don't have any good answer.  it seems libvirt ties the qemu
 monitor with a pipe to its own process, so you don't have manual
 control anymore.
 
 i'd much prefer if it used a unix socket, so you could open it with
 socat or similar tools.

If by serial console you mean an actual emulated serial port, use
the serial tag in the XML for this.  You can point it to a socket,
actual device, TCP server, etc.
  http://libvirt.org/formatdomain.html#elementsConsole
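
As a rough illustration of the serial element mentioned above, the Python
sketch below builds a serial device fragment backed by a Unix socket. The
socket path and the helper name are made up for this example, and the
attribute layout is my reading of the libvirt domain XML format page linked
above; check that page for what your libvirt version supports.

```python
import xml.etree.ElementTree as ET

def serial_unix_socket(path, port=0):
    """Build a <serial> element backed by a listening Unix socket."""
    serial = ET.Element("serial", {"type": "unix"})
    # mode="bind" asks the hypervisor side to create and listen on the socket.
    ET.SubElement(serial, "source", {"mode": "bind", "path": path})
    ET.SubElement(serial, "target", {"port": str(port)})
    return serial

# Hypothetical socket path, for illustration only.
fragment = ET.tostring(serial_unix_socket("/tmp/guest-serial0.sock"),
                       encoding="unicode")
print(fragment)
```

The printed fragment would go inside the devices section of the domain XML,
and a tool such as socat could then attach to the socket.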

If you want access to the Qemu monitor console, Libvirt does not
provide that directly (by design), but it does use Unix sockets to
interact with the monitor, so you could in theory kill Libvirt and
talk to Qemu directly through the monitor socket in /var/run/libvirt/qemu.

-jim


Re: [KVM PATCH v3 1/3] kvm: fix potential coalesced_mmio leak on shutdown

2009-06-01 Thread Chris Wright
* Gregory Haskins (ghask...@novell.com) wrote:
 It would appear that we are invoking kfree() on the wrong pointer in the
 destructor for the coalesced_mmio device.  This could result in a potential
 leak during shutdown.  This works today because the kvm_io_device is
 the first element of the private structure, but this could change in
 the future, so let's clean this up.
 
 Signed-off-by: Gregory Haskins ghask...@novell.com

Acked-by: Chris Wright chr...@sous-sol.org


Re: [KVM PATCH v3 2/3] kvm: cleanup io_device code

2009-06-01 Thread Chris Wright
* Gregory Haskins (ghask...@novell.com) wrote:
 We modernize the io_device code so that we use container_of() instead of
 dev->private, and move the vtable to a separate ops structure
 (theoretically allows better caching for multiple instances of the same
 ops structure)
 
 Signed-off-by: Gregory Haskins ghask...@novell.com

Acked-by: Chris Wright chr...@sous-sol.org
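
For readers less familiar with container_of(), here is a small model of the
idea in Python using ctypes. It is only a sketch: the structure and field
names are invented and are not the kvm ones. It shows why freeing the
embedded member's pointer happens to work when that member is the first
field, and why the member offset has to be subtracted once it is not.

```python
import ctypes

class IoDevice(ctypes.Structure):
    # Stand-in for the embedded generic device.
    _fields_ = [("id", ctypes.c_int)]

class DevFirst(ctypes.Structure):
    # Embedded member first: its address equals the container's address.
    _fields_ = [("dev", IoDevice), ("nb_zones", ctypes.c_int)]

class DevLater(ctypes.Structure):
    # Embedded member not first: the two addresses differ.
    _fields_ = [("nb_zones", ctypes.c_int), ("dev", IoDevice)]

def container_of(member, outer_type, field_name):
    """Recover the enclosing structure from one of its members."""
    offset = getattr(outer_type, field_name).offset
    return outer_type.from_address(ctypes.addressof(member) - offset)

first = DevFirst()
later = DevLater()
later.nb_zones = 7

same_address = ctypes.addressof(first.dev) == ctypes.addressof(first)
later_offset = DevLater.dev.offset
recovered = container_of(later.dev, DevLater, "dev")
print(same_address, later_offset, recovered.nb_zones)
```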


Re: [KVM PATCH v3 3/3] kvm: do not register i8254 PIO regions until we are initialized

2009-06-01 Thread Chris Wright
* Gregory Haskins (ghask...@novell.com) wrote:
 We currently publish the i8254 resources to the pio_bus before the devices
 are fully initialized.  Since we hold the pit_lock, it's probably not
 a real issue.  But let's clean this up anyway.
 
 Found-by: Avi Kivity a...@redhat.com
 Signed-off-by: Gregory Haskins ghask...@novell.com

Acked-by: Chris Wright chr...@sous-sol.org


Re: Qemu (host) - host userspace signaling?

2009-06-01 Thread Gregory Haskins
Avi Kivity wrote:
 pav wrote:


 I understand I could use a unix socket and qemu_chr_open() and
 friends for this, but isn't a full-blown socket a bit of an overkill
 for a simple kick interface?
   

 Not at all.  Send a byte to have the other side wake up.
FWIW: you could also use an eventfd here.
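
A minimal sketch of the one-byte kick described above, in Python for
brevity. It assumes both ends already hold a connected socket;
socket.socketpair() stands in for whatever connection setup the real
programs would use, and the helper names are invented for this example.

```python
import select
import socket

# A connected pair of sockets stands in for the QEMU side and the
# other host process.
waiter, kicker = socket.socketpair()

def kick(sock):
    """Wake the peer by sending one byte; the value is irrelevant."""
    sock.send(b"\0")

def wait_for_kick(sock, timeout):
    """Return True if a kick arrived within `timeout` seconds."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return False
    sock.recv(1)  # consume the byte so the next wait blocks again
    return True

before = wait_for_kick(waiter, 0.0)  # nothing has been sent yet
kick(kicker)
after = wait_for_kick(waiter, 1.0)
print(before, after)
```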

-Greg





Re: Qemu (host) - host userspace signaling?

2009-06-01 Thread Avi Kivity

Gregory Haskins wrote:

Avi Kivity wrote:
  

pav wrote:


I understand I could use a unix socket and qemu_chr_open() and
friends for this, but isn't a full-blown socket a bit of an overkill
for a simple kick interface?
  
  

Not at all.  Send a byte to have the other side wake up.


FWIW: you could also use an eventfd here.
  


To share the eventfd with someone else you need a unix domain socket 
anyway.  Once you do that setup, however, it works out quite nicely 
(esp. if it's connected to iosignalfd).
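
To illustrate why a Unix domain socket is needed anyway: a file descriptor
is handed to another process as ancillary data on such a socket. The sketch
below assumes Python 3.9 or later (for socket.send_fds and socket.recv_fds)
and uses a pipe as a stand-in for the eventfd, since the passing mechanism is
the same for any descriptor. Both ends live in one process only to keep the
example short.

```python
import os
import socket

# Unix domain socket pair between the two cooperating parties.
owner, peer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# A pipe stands in for the eventfd; any descriptor is passed the same way.
read_end, write_end = os.pipe()

# The owner hands the write end to the peer as ancillary data.
socket.send_fds(owner, [b"fd"], [write_end])
_, received_fds, _, _ = socket.recv_fds(peer, 16, 1)
peer_fd = received_fds[0]

# The peer now signals through its own copy of the descriptor.
os.write(peer_fd, b"\x01")
signalled = os.read(read_end, 1)
print(signalled)

for fd in (read_end, write_end, peer_fd):
    os.close(fd)
owner.close()
peer.close()
```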


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH][KVM-AUTOTEST][REPOST] Add ability to install custom kernel modules

2009-06-01 Thread Uri Lublin

On 05/21/2009 03:29 AM, Mike Burns wrote:

See comment in control file for details of implementation

Signed-off-by: Mike Burns mbu...@redhat.com
---
  client/tests/kvm_runtest_2/control|6 ++
  client/tests/kvm_runtest_2/kvm_install.py |   11 +--
  2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/client/tests/kvm_runtest_2/control 
b/client/tests/kvm_runtest_2/control
index d6e26bc..437de4c 100644
--- a/client/tests/kvm_runtest_2/control
+++ b/client/tests/kvm_runtest_2/control
@@ -74,6 +74,12 @@ params = {

  ## Custom install
  install_script: 'custom_kvm_install.sh param1'
+
+## Additional kernel modules to install
+## Must be a space separated list of values
+## Installed in the order they are listed.
+## to install mod1.ko, mod2.ko, mod3.ko, you would set like this:
+#additional_modules: 'mod1 mod2 mod3'
  }

  # Comment the job.run_test line if you do not want to install kvm on the host.
diff --git a/client/tests/kvm_runtest_2/kvm_install.py 
b/client/tests/kvm_runtest_2/kvm_install.py
index 392ef0c..80354f5 100755
--- a/client/tests/kvm_runtest_2/kvm_install.py
+++ b/client/tests/kvm_runtest_2/kvm_install.py
@@ -106,7 +106,7 @@ def run_kvm_install(test, params, env):

     # load kvm modules (unless requested not to)
     if params.get('load_modules', "yes") == "yes":
-        __load_kvm_modules()
+        __load_kvm_modules(params)
     else:
         kvm_log.info("user requested not to load kvm modules")

@@ -209,7 +209,7 @@ def __install_kvm_from_local_tarball(test, srcdir, tarball):
     __install_kvm(test, srcdir)


-def __load_kvm_modules():
+def __load_kvm_modules(params):
     kvm_log.info("Detecting CPU vendor...")
     vendor = "intel"
     if os.system("grep vmx /proc/cpuinfo 1>/dev/null") != 0:
@@ -237,6 +237,13 @@ def __load_kvm_modules():
     os.chdir("x86")
     utils.system("/sbin/insmod ./kvm.ko && sleep 1 && /sbin/insmod ./kvm-%s.ko" % vendor)

+    #Add additional modules specified in params by additional_modules
+    #Modules must be named <value>.ko and be located in the
+    #same location as kvm and kvm-vendor modules
+    for module in params.get("additional_modules","").split():
+      kvm_log.info("Installing module \"%s\"" % module)
+      utils.system("/sbin/insmod ./%s.ko" % module )
+
     #elif self.config.load_modules == "no":
     #kvm_log.info("user requested not to load kvm modules")


Hi Mike,

Can you load those kernel modules before running kvm-autotest (something like a 
setup script) ? And clean up when the run completes (cleanup script) ?


How would those modules get into the directory where kvm.ko is built?

Who is in charge of unloading those modules? What happens if we have 2 consecutive 
runs of kvm-autotest (the second insmod would fail, wouldn't it)?


This may belong to a separate patch, but I think it's a good idea to support 
module-parameters for each of those modules (specifically I think of ept/npt and 
other params of kvm*.ko)
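
One possible shape for both concerns, sketched in Python 3 rather than
the patch's Python 2. This is an assumption-laden illustration, not the
kvm-autotest code: the function names, the module_params dictionary and the
"debug=1" parameter are all hypothetical. It skips modules that already show
up in /proc/modules, so a second run does not fail on insmod, and it appends
per-module parameters to the command line.

```python
import shlex

def loaded_modules(proc_modules_text):
    """Names of loaded modules, parsed from /proc/modules content."""
    return {line.split()[0]
            for line in proc_modules_text.splitlines() if line.strip()}

def insmod_commands(additional_modules, module_params, already_loaded):
    """Build insmod command lines, skipping modules already loaded."""
    commands = []
    for name in additional_modules.split():
        # /proc/modules reports dashes in module names as underscores.
        if name.replace("-", "_") in already_loaded:
            continue
        params = module_params.get(name, "")
        commands.append(["/sbin/insmod", "./%s.ko" % name] + shlex.split(params))
    return commands

# Illustrative input; a real run would read /proc/modules from the host.
sample_proc_modules = "mod1 16384 0 - Live 0x0000000000000000\n"
plan = insmod_commands("mod1 mod2 mod3", {"mod2": "debug=1"},
                       loaded_modules(sample_proc_modules))
for cmd in plan:
    print(" ".join(cmd))
```

Unloading would still need an owner, for example a cleanup step that runs
rmmod over the same list in reverse order.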


Sorry for the late reply,
Uri.


Re: kvm-kmod.git

2009-06-01 Thread Ryan Harper
* Avi Kivity a...@redhat.com [2009-06-01 12:04]:
 Ryan Harper wrote:
 I also wonder what happens if one tries to build on
 a machine with kvm built into kernel. Ideally one would get
 a clear error message.
  
   
 kvm-kmod is really designed for those running on pre-kvm distro kernels, 
 and for those testing newer kvm versions on distro kernels.  If you can 
 compile your own kernel, download kvm.git and run that.
 
 
 So no way of having kvm-kmod sync in kvm.git bits and still build
 against the current running kernel?
   
 
 We must be miscommunicating, what you describe is kvm-kmod.git's sole 
 purpose in life.
 
 I wanted to test out 2.6.29-maint kernel modules and in
 kvm-userspace/kernel, I could checkout maint/2.6.29 in kvm.git and then
 make LINUX= sync.  I couldn't quite figure out how to do that with
 kvm-kmod since when I do a ./configure --kerneldir -- it's building
 modules *for* whatever kernel is at kerneldir rather than just syncing
 in the kvm bits from that kerneldir and then building modules against
 the running kernel.
   
 
 'make sync' will copy the kvm bits from the linux-2.6 directory, or 
 $(LINUX) if you specify that to make.
 
 So:
  LINUX= is for the kvm sources (further controlled by whatever branch 
 is checked out)
  --kerneldir= is for the host kernel

% cd kvm
% git checkout -f 2.6.29-stable origins/maint/2.6.29
% cd ../kvm-kmod.git
% ./configure
% make LINUX=../kvm sync
./sync -v kvm-devel -l ../kvm
Traceback (most recent call last):
  File "./sync", line 207, in <module>
    source_sync(arch)
  File "./sync", line 200, in source_sync
    hack(T, arch, i)
  File "./sync", line 123, in hack
    _hack(T + '/' + file, arch)
  File "./sync", line 114, in _hack
    data = file(fname).read()
IOError: [Errno 2] No such file or directory: 'source/timer.c'
make: *** [sync] Error 1

kvm.git on branch kvm-85:
% make LINUX=/home/rharper/work/git/kvm sync
./sync -v kvm-devel -l /home/rharper/work/git/kvm
Traceback (most recent call last):
  File "./sync", line 207, in <module>
    source_sync(arch)
  File "./sync", line 200, in source_sync
    hack(T, arch, i)
  File "./sync", line 123, in hack
    _hack(T + '/' + file, arch)
  File "./sync", line 114, in _hack
    data = file(fname).read()
IOError: [Errno 2] No such file or directory: 'source/eventfd.c'
make: *** [sync] Error 1

kvm.git on branch kvm-86:
% make LINUX=/home/rharper/work/git/kvm sync
./sync -v kvm-devel -l /home/rharper/work/git/kvm
Traceback (most recent call last):
  File "./sync", line 207, in <module>
    source_sync(arch)
  File "./sync", line 200, in source_sync
    hack(T, arch, i)
  File "./sync", line 123, in hack
    _hack(T + '/' + file, arch)
  File "./sync", line 114, in _hack
    data = file(fname).read()
IOError: [Errno 2] No such file or directory: 'source/eventfd.c'
make: *** [sync] Error 1


only branch master seems to work with kvm-kmod

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ry...@us.ibm.com


Re: kvm-kmod.git

2009-06-01 Thread Avi Kivity

Ryan Harper wrote:

So:
 LINUX= is for the kvm sources (further controlled by whatever branch 
is checked out)

 --kerneldir= is for the host kernel



Ah, excellent, I didn't see it documented on the Code page of the wiki
and I blindly assumed that it went away.  Thanks for correction.
  


It actually did go away, but Jan brought it back.


The reason I'm digging is that I'm seeing 64-bit migration failing on
maint/2.6.29 kvm modules, but not on upstream kvm kernel bits and I
wanted to bisect to find where 64-bit migration is fixed so I can
suggest what to pull into maint/2.6.29.
 
  
Thanks, that's helpful.  How does it fail?  maybe I can supply an 
educated guess.



Migration succeeds, but source and target are frozen (though the monitor of
the guest is still interactive).


Nothing interesting in host dmesg, and of course no output from either
guest (they seemed locked).  Reproduced this sort of hang for any number
of guests (RHEL4u7/8, SLES10 SP2, SLES11, Win2k3-r2, win2k8, etc.) as
long as they are 64-bit
  


SMP?  Uniprocessor?

Try 'info registers' on both source and target and compare.  Maybe we 
lose some bit in EFER.
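
A small Python helper along these lines could do the comparison once the two
'info registers' dumps are saved to text. It is a sketch under assumptions:
it expects a field of the form EFER=<hex> somewhere in the dump, and the two
sample dumps below are made up for illustration; they are not output from
the failing guests.

```python
import re

# Architectural EFER bit positions.
EFER_BITS = {0: "SCE", 8: "LME", 10: "LMA", 11: "NXE", 12: "SVME"}

def parse_efer(info_registers_text):
    """Extract the EFER value from monitor 'info registers' output."""
    match = re.search(r"\bEFER=([0-9a-fA-F]+)", info_registers_text)
    if match is None:
        raise ValueError("no EFER field found")
    return int(match.group(1), 16)

def efer_diff(source_text, target_text):
    """Return the names of EFER bits that differ between the two dumps."""
    changed = parse_efer(source_text) ^ parse_efer(target_text)
    return [EFER_BITS.get(bit, "bit%d" % bit)
            for bit in range(64) if (changed >> bit) & 1]

# Made-up dumps, for illustration only.
source_dump = "CR0=80050033 CR4=000006f0\nEFER=0000000000000d01\n"
target_dump = "CR0=80050033 CR4=000006f0\nEFER=0000000000000901\n"
print(efer_diff(source_dump, target_dump))
```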


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: kvm-kmod.git

2009-06-01 Thread Avi Kivity

Ryan Harper wrote:

% cd kvm
% git checkout -f 2.6.29-stable origins/maint/2.6.29
% cd ../kvm-kmod.git
% ./configure
% make LINUX=../kvm sync
./sync -v kvm-devel -l ../kvm
Traceback (most recent call last):
  File "./sync", line 207, in <module>
    source_sync(arch)
  File "./sync", line 200, in source_sync
    hack(T, arch, i)
  File "./sync", line 123, in hack
    _hack(T + '/' + file, arch)
  File "./sync", line 114, in _hack
    data = file(fname).read()
IOError: [Errno 2] No such file or directory: 'source/timer.c'
make: *** [sync] Error 1

kvm.git on branch kvm-85:
% make LINUX=/home/rharper/work/git/kvm sync
./sync -v kvm-devel -l /home/rharper/work/git/kvm
Traceback (most recent call last):
  File "./sync", line 207, in <module>
    source_sync(arch)
  File "./sync", line 200, in source_sync
    hack(T, arch, i)
  File "./sync", line 123, in hack
    _hack(T + '/' + file, arch)
  File "./sync", line 114, in _hack
    data = file(fname).read()
IOError: [Errno 2] No such file or directory: 'source/eventfd.c'
make: *** [sync] Error 1

kvm.git on branch kvm-86:
% make LINUX=/home/rharper/work/git/kvm sync
./sync -v kvm-devel -l /home/rharper/work/git/kvm
Traceback (most recent call last):
  File "./sync", line 207, in <module>
    source_sync(arch)
  File "./sync", line 200, in source_sync
    hack(T, arch, i)
  File "./sync", line 123, in hack
    _hack(T + '/' + file, arch)
  File "./sync", line 114, in _hack
    data = file(fname).read()
IOError: [Errno 2] No such file or directory: 'source/eventfd.c'


only branch master seems to work with kvm-kmod

  


Well there are differences in the source trees, so they need different 
hacking.  There is a branch for 2.6.30 in kvm-kmod.git, but 2.6.29 is 
slightly different and doesn't have a branch.  You can just remove the 
offending files from ./sync, there's a good chance it will work.
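
An alternative to editing the file list by hand would be to have the script
skip files that the checked-out branch does not provide. The following is
only a sketch of that idea in Python 3; it is not the actual ./sync
code, the helper name is invented, and kvm_main.c is used merely as an
example of a file that is present.

```python
import os
import tempfile

def existing_sources(source_dir, wanted):
    """Split `wanted` file names into those present and those missing."""
    present, missing = [], []
    for name in wanted:
        found = os.path.isfile(os.path.join(source_dir, name))
        (present if found else missing).append(name)
    return present, missing

# Throwaway directory standing in for the synced source/ tree.
with tempfile.TemporaryDirectory() as source_dir:
    open(os.path.join(source_dir, "kvm_main.c"), "w").close()
    present, missing = existing_sources(
        source_dir, ["kvm_main.c", "timer.c", "eventfd.c"])

print("hack:", present)
print("skip:", missing)
```

The matching objects would still have to be dropped from the build, since
the Kbuild file lists them separately.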


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH][KVM-AUTOTEST] Check exit status of custom install script and fail if script failed.

2009-06-01 Thread Mike Burns
Avi Kivity wrote:
 Lucas Meneghel Rodrigues wrote:
 On Sun, 2009-05-24 at 17:48 +0300, Avi Kivity wrote:
  
 Mike Burns wrote:

 Signed-off-by: Mike Burns mbu...@redhat.com
 ---
  client/tests/kvm_runtest_2/kvm_install.py |7 ++-
  1 files changed, 6 insertions(+), 1 deletions(-)

 diff --git a/client/tests/kvm_runtest_2/kvm_install.py
 b/client/tests/kvm_runtest_2/kvm_install.py
 index ebd8b7d..392ef0c 100755
 --- a/client/tests/kvm_runtest_2/kvm_install.py
 +++ b/client/tests/kvm_runtest_2/kvm_install.py
 @@ -90,7 +90,12 @@ def run_kvm_install(test, params, env):
            kvm_log.info("Adding KVM_INSTALL_%s to Environment" % (k))
            os.putenv("KVM_INSTALL_%s" % (k), str(params[k]))
          kvm_log.info("Running " + script + " to install kvm")
 -        os.system("cd %s; %s" % (test.bindir, script))
 +        install_result = os.system("cd %s; %s" % (test.bindir, script))
 +        if os.WEXITSTATUS(install_result) != 0:
 +          message = "Custom Script encountered an error"
 +          kvm_log.error(message)
 +          raise error.TestError, message
 +
 
 How about a helper that does os.system()  (or rather,
 commands.getstatusoutput()) and throws an exception on failure?  I
 imagine it could be used in many places.
 

 utils.system() does that. If we have exit code != 0, it throws an
 error.CmdError exception.
   

 Well let's use it then.  Every time I see 'raise' used I'm going to
 complain, so it will be a lot more efficient as well as smaller code.

Agreed.  I'll rework this and my other patches and get them re-posted in
the next day or two.
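
For reference, the kind of helper being discussed looks roughly like this.
It is a standalone Python 3 sketch, not the autotest implementation:
the CmdError class and system() function here are local stand-ins written
for this example, and only mimic the behaviour described in this thread
(raise an exception when the command exits with a nonzero status).

```python
import subprocess
import sys

class CmdError(Exception):
    """Local stand-in for an autotest-style command failure exception."""

def system(command):
    """Run `command` through the shell; raise CmdError on a nonzero exit."""
    status = subprocess.call(command, shell=True)
    if status != 0:
        raise CmdError("command %r exited with status %d" % (command, status))
    return status

# A command that succeeds returns normally.
ok = system('"%s" -c "pass"' % sys.executable)

# A command that fails raises, so callers need no explicit status check.
try:
    system('"%s" -c "import sys; sys.exit(3)"' % sys.executable)
    failed = False
except CmdError as exc:
    failed = True
    print(exc)
```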

Mike


Re: [PATCH][KVM-AUTOTEST] Add custom install option for kvm_install

2009-06-01 Thread Mike Burns
Uri Lublin wrote:
 On 05/12/2009 06:34 PM, Mike Burns wrote:
 From: Michael Burns mbu...@redhat.com


 Signed-off-by: Michael Burns mbu...@redhat.com
 ---
   client/tests/kvm_runtest_2/control|   18 +-
   client/tests/kvm_runtest_2/kvm_install.py |   15 +++
   2 files changed, 32 insertions(+), 1 deletions(-)

 diff --git a/client/tests/kvm_runtest_2/control
 b/client/tests/kvm_runtest_2/control
 index fd68e94..d6e26bc 100644
 --- a/client/tests/kvm_runtest_2/control
 +++ b/client/tests/kvm_runtest_2/control
 @@ -41,6 +41,19 @@ link_if_not_exist(pwd, qemu_img, 'qemu-img')

   # -
   # Build and install kvm
 +#
 +# Details of Install options
 +#   Mode: custom
 +#   Description:  install from custom install script
 +#   Parameters needed:
 +# install_script:
 +#   location of script relative to the kvm-runtest_2 directory.
 +#   Script will be executed from test.bindir (generally
 kvm_runtest_2)
 +#   parameters for the script can be passed either as
 environment variables
 +#   in the params array below or in the definition of
 install_script.
 +#   If they are passed as part of params, then they will be
 accessible as
 +#   KVM_INSTALL_s  in the OS Environment when your script runs.
 +#

 I think this, being the only explanation about kvm_install, can be
 confusing to the user. We can add a link to the wiki instead:
 http://kvm.et.redhat.com/page/KVM-Autotest/ControlFile
That is a good point.  I was debating whether to put them all there or
not, but didn't want to clutter up my patch too much.  I decided
afterwards to add the page to the wiki, so I'll link there instead when
I repost.

   # -
   params = {
   name: kvm_install,
 @@ -57,7 +70,10 @@ params = {

   ## Install from git
   git_repo:
 'git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git',
 -user_git_repo:
 'git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm-userspace.git'
 +user_git_repo:
 'git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm-userspace.git',
 +
 +## Custom install
 +install_script: 'custom_kvm_install.sh param1'
   }

   # Comment the job.run_test line if you do not want to install kvm
 on the host.
 diff --git a/client/tests/kvm_runtest_2/kvm_install.py
 b/client/tests/kvm_runtest_2/kvm_install.py
 index 8be5a93..234c77a 100755
 --- a/client/tests/kvm_runtest_2/kvm_install.py
 +++ b/client/tests/kvm_runtest_2/kvm_install.py
 @@ -77,6 +77,21 @@ def run_kvm_install(test, params, env):
       elif install_mode == "localsrc":
           __install_kvm(test, srcdir)

 +    # install from custom script
 +    elif install_mode == "custom":
 +        install_script = params.get("install_script")
 +        script = os.path.join(test.bindir,install_script)

 This line (script = ..) should be located below the following if
 statement.
 if install_script is not in params (and install_script is None),
 os.path.join fails.

Ok, will fix
 +        if not install_script:
 +            message = "Custom script filename not specified"
 +            kvm_log.error(message)
 +            raise error.TestError, message
 +        for k in params.keys():

 Fix white-space.

 +          kvm_log.info("Adding KVM_INSTALL_%s to Environment" % (k))

 kvm_log.debug

 +          os.putenv("KVM_INSTALL_%s" % (k), str(params[k]))
 +        kvm_log.info("Running " + script + " to install kvm")
 +        os.system("cd %s; %s" % (test.bindir, script))

 if the script fails, quit (raise).

We're going to change this to utils.system instead, to use the built-in
error handling.
 +        kvm_log.info("Completed %s" % (script))
 +
       # invalid installation mode
       else:
           message = "Invalid installation mode: '%s'" % install_mode



 Sorry for the late reply,
 Uri.



Re: [patch] VMX Unrestricted mode support

2009-06-01 Thread Nitin A Kamble
On Mon, 2009-06-01 at 11:06 -0700, Nitin A Kamble wrote:
 On Sun, 2009-05-31 at 01:39 -0700, Avi Kivity wrote:

  Instead of changing all the checks like this, you can make rmode.active 
  be false when unrestricted guest is enabled.  We can interpret 
  rmode.active as emulating real mode via vm86, not as guest is in real 
  mode.

Avi,
 How about renaming rmode.active to rmode.vm86_active ?

Thanks & Regards,
Nitin



Re: [patch] VMX Unrestricted mode support

2009-06-01 Thread Avi Kivity

Nitin A Kamble wrote:

Avi,
 How about renaming rmode.active to rmode.vm86_active ?
  


Sure.  But if you do that, then do it in a separate patch please.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: kvm-kmod.git

2009-06-01 Thread Ryan Harper
* Avi Kivity a...@redhat.com [2009-06-01 13:20]:
 Ryan Harper wrote:
 % cd kvm
 % git checkout -f 2.6.29-stable origins/maint/2.6.29
 % cd ../kvm-kmod.git
 % ./configure
 % make LINUX=../kvm sync
 ./sync -v kvm-devel -l ../kvm
 Traceback (most recent call last):
   File "./sync", line 207, in <module>
     source_sync(arch)
   File "./sync", line 200, in source_sync
     hack(T, arch, i)
   File "./sync", line 123, in hack
     _hack(T + '/' + file, arch)
   File "./sync", line 114, in _hack
     data = file(fname).read()
 IOError: [Errno 2] No such file or directory: 'source/timer.c'
 make: *** [sync] Error 1
 

 
 only branch master seems to work with kvm-kmod
 
   
 
 Well there are differences in the source trees, so they need different 
 hacking.  There is a branch for 2.6.30 in kvm-kmod.git, but 2.6.29 is 
 slightly different and doesn't have a branch.  You can just remove the 
 offending files from ./sync, there's a good chance it will work.

Yeah, updating sync and x86/Kbuild to not include source files/objects
works -- though it would be nice to have something to indicate the
version we synced, right now all modules that are built this way, dmesg
reports:

loaded kvm module (kvm-devel)


-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ry...@us.ibm.com


Re: [PATCH][KVM-AUTOTEST][REPOST] Add ability to install custom kernel modules

2009-06-01 Thread David Huff
Uri Lublin wrote:
 Hi Mike,
 
 Can you load those kernel modules before running kvm-autotest (something
 like a setup script) ? And clean up when the run completes (cleanup
 script) ?

This can probably be done with the pre_command and post_command
parameters that I added in my previous patch.

Come to think of it, the whole custom kvm_install method is just a large
pre_command.  I even used a lot of the same code Mike used when adding
this functionality.

Does it make sense to treat the custom install of kvm as one large
pre_command?


-D


[PATCH] Fix import of modules from inside the kvm module.

2009-06-01 Thread Lucas Meneghel Rodrigues
Import of modules present inside the kvm test is broken due to
a path that wasn't converted when the test got renamed from
kvm_runtest_2 to kvm. Fixing the problem.

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 client/tests/kvm/control |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/client/tests/kvm/control b/client/tests/kvm/control
index eb38686..154d513 100644
--- a/client/tests/kvm/control
+++ b/client/tests/kvm/control
@@ -45,8 +45,8 @@ Each test is appropriately documented on each test docstrings.
 
 import sys, os
 
-# enable modules import from current directory (tests/kvm_runtest_2)
-pwd = os.path.join(os.environ['AUTODIR'],'tests/kvm_runtest_2')
+# enable modules import from current directory (tests/kvm)
+pwd = os.path.join(os.environ['AUTODIR'],'tests/kvm')
 sys.path.append(pwd)
 
 # 
-- 
1.6.2.2



Re: [PATCH] KVM: add localversion to avoid confusion and conflicts

2009-06-01 Thread Steven Rostedt
On Fri, May 29, 2009 at 02:13:46PM +0530, Jaswinder Singh Rajput wrote:
 On Fri, 2009-05-29 at 09:48 +0200, Christian Bornträger wrote:
  Am Freitag 29 Mai 2009 09:18:14 schrieb Jaswinder Singh Rajput:
   Adding localversion avoids confusion in kernel images :
  
   like Linux version 2.6.30-rc7 does not tell whether it is linus or kvm
   kernel.
  
   By adding localversion it tells :
  
   Linux version 2.6.30-rc7-kvm , any doubt ;-)
   I am inspired by Ingo's -tip, I am sure Ingo will tell more advantages,
   if these are not enough :-)
  [...]
   diff --git a/localversion-kvm b/localversion-kvm
   new file mode 100644
   index 000..d969ff0
   --- /dev/null
   +++ b/localversion-kvm
   @@ -0,0 +1 @@
   +-kvm
  
  NAK from my side. If you need a distinction, there is always 
  CONFIG_LOCALVERSION_AUTO. If you need this kind of prefix, there is always  
  CONFIG_LOCALVERSION.
  
 
 Here is NAK for your NAK from my side.

And here is a NACK for your NACK of a NACK!

 
 This patch is only for the KVM tree and not for Linus's tree.
 
 Let's assume 100 developers are working on the kvm tree and they use it
 on 2 PCs each. So the count becomes 200.
 
 In my case I have a dozen kernel trees, so I keep swapping
 configs between kernels. I also need to test configs from various
 users. So this count is countless.
 I think this is the biggest point for adding localversion in -tip.
 It seems Ingo is busy with perfcounter stuff, otherwise he would explain
 more advantages to you.
 
 At the very least, can you differentiate between 1 and 200?
 
 So by adding this patch we can save a lot of developers' time.

No, this patch wastes a lot of developers' time. If we accept it, then any
patch that is added after it will need to be rebased before going to
Linus. Unless KVM is made up of a bunch of branches like tip is, this will
become more of a hassle than a benefit.

Ingo's tip/master is based off of a bunch of branches. It is not recommended
to develop against tip/master. I develop against tip/tracing/core, because
that is what will get pulled by Linus. tip/master is created automatically
from a bunch of branches and gets rebased all the time. One of those branches
is tip, which supplies the localversion. Thus, if you use tip/master, you get
the localversion file. But if you develop against one of the main branches
that will eventually go to Linus, then you will not have that localversion
file.
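
For readers unfamiliar with the mechanism under discussion, the effect of a
localversion file can be modelled in a few lines of Python. This is a
simplified sketch: it assumes the build appends the contents of
localversion* files found in the tree to the base version, and it ignores
CONFIG_LOCALVERSION and CONFIG_LOCALVERSION_AUTO. The function name is
invented for the example.

```python
import glob
import os
import tempfile

def kernel_release(base_version, tree):
    """Append the contents of localversion* files found in `tree`."""
    suffix = ""
    for path in sorted(glob.glob(os.path.join(tree, "localversion*"))):
        if path.endswith("~"):
            continue  # ignore editor backup files
        with open(path) as handle:
            suffix += handle.read().strip()
    return base_version + suffix

with tempfile.TemporaryDirectory() as tree:
    plain = kernel_release("2.6.30-rc7", tree)
    with open(os.path.join(tree, "localversion-kvm"), "w") as handle:
        handle.write("-kvm\n")
    tagged = kernel_release("2.6.30-rc7", tree)

print(plain, tagged)
```

Because the suffix comes from a file tracked in the tree, any branch that
carries the file reports the suffix, which is why it matters which branches
include it.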

-- Steve



Re: kvm-kmod.git

2009-06-01 Thread Avi Kivity

Ryan Harper wrote:

though it would be nice to have something to indicate the
version we synced, right now all modules that are built this way, dmesg
reports:

loaded kvm module (kvm-devel)


I committed something to use 'git describe' when appropriate.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH] KVM: add localversion to avoid confusion and conflicts

2009-06-01 Thread Avi Kivity

Steven Rostedt wrote:



This patch is only for the KVM tree and not for Linus's tree.

Let's assume 100 developers are working on the kvm tree and they use it
on 2 PCs each. So the count becomes 200.

In my case I have a dozen kernel trees, so I keep swapping
configs between kernels. I also need to test configs from various
users. So this count is countless.
I think this is the biggest point for adding localversion in -tip.
It seems Ingo is busy with perfcounter stuff, otherwise he would explain
more advantages to you.

At the very least, can you differentiate between 1 and 200?

So by adding this patch we can save a lot of developers' time.



No, this patch wastes a lot of developers' time. If we accept it, then any
patch that is added after it will need to be rebased before going to
Linus. Unless KVM is made up of a bunch of branches like tip is, this will
become more of a hassle than a benefit.
  


kvm.git for-linus branches are rebased anyway, since I fold patches that 
fix or revert other patches.  I also (rarely) delay some patches in my 
tree but submit others that came later.


localversion would show up in linux-next, not sure if that's a problem.  
On the other hand, I'm not sure what it's worth.



--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH] KVM: add localversion to avoid confusion and conflicts

2009-06-01 Thread Steven Rostedt

On Mon, 1 Jun 2009, Avi Kivity wrote:
 
 localversion would show up in linux-next, not sure if that's a problem.  On
 the other hand, I'm not sure what it's worth.

If a localversion file with -kvm showed up in linux-next, I would 
consider that a problem.

-- Steve



Re: [patch 2/4] KVM: move coalesced_mmio locking to its own device

2009-06-01 Thread Avi Kivity

Marcelo Tosatti wrote:

On Sun, May 31, 2009 at 03:14:36PM +0300, Avi Kivity wrote:
  

Marcelo Tosatti wrote:


Move coalesced_mmio locking to its own device, instead of relying on
kvm->lock.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: kvm-irqlock/virt/kvm/coalesced_mmio.c
===
--- kvm-irqlock.orig/virt/kvm/coalesced_mmio.c
+++ kvm-irqlock/virt/kvm/coalesced_mmio.c
@@ -26,9 +26,7 @@ static int coalesced_mmio_in_range(struc
         if (!is_write)
                 return 0;
-        /* kvm->lock is taken by the caller and must be not released before
-         * dev.read/write
-         */
+        spin_lock(&dev->lock);
  
  
This unbalanced locking is still very displeasing.  At a minimum you  
need a sparse annotation to indicate it.


But I think it really indicates a problem with the io_device API.

Potential solutions:
- fold in_range() into ->write and ->read.  Make those functions  
responsible for both determining whether they can handle the range and  
performing the I/O.

- have a separate rwlock for the device list.



IMO the problem is the coalesced_mmio device. The unbalanced locking is
a result of the abuse of the in_range() and read/write() methods.

  


Okay, the penny has dropped.  I understand now.


Normally you'd expect parallel accesses to in_range() to be allowed,
since it's just checking whether (aha) the access is in range, returning
a pointer to the device if positive. Now read/write() are the ones that
need serialization, since they touch the device's internal state.

coalesced_mmio abuses in_range() to do more things than it should.

Ideally we should fix coalesced_mmio, but I'm not going to do that now
(sorry, not confident in changing it without seeing it go through intense
torture testing).
  


It's not trivial since it's userspace that clears the ring, and we can't 
wait on userspace.



That said, is sparse annotation enough to convince you?
  


Let me have a look at fixing coalesced_mmio first.  We might allow
->write to fail, causing a fallback to userspace.  Or we could fail if
n_avail < MAX_VCPUS, so even the worst-case race leaves us one entry.
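
A minimal user-space sketch of that second idea, assuming a fixed-size ring
whose admission check is racy and may be passed by up to MAX_VCPUS writers at
once. Every name below (demo_ring, DEMO_RING_SIZE, DEMO_MAX_VCPUS and the
helpers) is hypothetical and not taken from coalesced_mmio.c:

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_RING_SIZE  64u  /* hypothetical ring capacity */
#define DEMO_MAX_VCPUS  16u  /* hypothetical bound on concurrent writers */

struct demo_ring {
	unsigned int first;  /* consumer index, advanced by userspace */
	unsigned int last;   /* producer index, advanced by the kernel side */
};

/* Free slots, keeping one slot unused to tell "full" from "empty". */
static unsigned int demo_ring_avail(const struct demo_ring *r)
{
	return (r->first + DEMO_RING_SIZE - r->last - 1) % DEMO_RING_SIZE;
}

/*
 * Racy admission check: refuse (i.e. fall back to userspace) unless there
 * is room for every vcpu that could sit between this check and its write.
 */
static bool demo_ring_can_accept(const struct demo_ring *r)
{
	return demo_ring_avail(r) >= DEMO_MAX_VCPUS;
}

/* Serialized write; the caller is assumed to hold the device lock. */
static void demo_ring_push(struct demo_ring *r)
{
	assert(demo_ring_avail(r) > 0);
	r->last = (r->last + 1) % DEMO_RING_SIZE;
}
```

With this rule, a writer that passed the check still finds a free slot even
if every other vcpu wrote in between, at the cost of never using the last
DEMO_MAX_VCPUS - 1 slots.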


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH -tip v8 5/7] x86: add pt_regs register and stack access APIs

2009-06-01 Thread Ingo Molnar

* Christoph Hellwig h...@infradead.org wrote:

  +   if (n < NR_REGPARMS) {
  +   switch (n) {
  +   case 0: return regs->ax;
  +   case 1: return regs->dx;
  +   case 2: return regs->cx;
  +   }
 
 Normal kernel style would be
 
   switch (n) {
   case 0:
           return regs->ax;
   case 1:
           return regs->dx;
   case 2:
           return regs->cx;
   }

the original single-line shortcut is acceptable too.

Ingo


[PATCH -tip v9 1/7] x86: instruction decoder API

2009-06-01 Thread Masami Hiramatsu
Add x86 instruction decoder to arch-specific libraries. This decoder
can decode x86 instructions used in kernel into prefix, opcode, modrm,
sib, displacement and immediates. This can also show the length of
instructions.

This version introduces instruction attributes for decoding instructions.
The instruction attribute tables are generated from the opcode map file
(x86-opcode-map.txt) by the generator script (gen-insn-attr-x86.awk).
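
As a rough illustration of what an attribute word holds, here is a user-space
sketch of packing and unpacking two bit-fields. It assumes the layout visible
in the inat.h part of this patch (a 4-bit legacy-prefix field at bit 0 with a
2-bit escape field right above it). The DEMO_* names and helper functions are
made up for the example, and DEMO_ESC_MAX is an assumption because the
corresponding definition is cut off in this archive:

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t demo_attr_t;  /* stands in for insn_attr_t */

/* Assumed layout: 4-bit prefix id at bit 0, 2-bit escape id above it. */
#define DEMO_PFX_OFFS  0
#define DEMO_PFX_BITS  4
#define DEMO_PFX_MAX   ((1u << DEMO_PFX_BITS) - 1)
#define DEMO_PFX_MASK  (DEMO_PFX_MAX << DEMO_PFX_OFFS)
#define DEMO_ESC_OFFS  (DEMO_PFX_OFFS + DEMO_PFX_BITS)
#define DEMO_ESC_BITS  2
#define DEMO_ESC_MAX   ((1u << DEMO_ESC_BITS) - 1)
#define DEMO_ESC_MASK  (DEMO_ESC_MAX << DEMO_ESC_OFFS)

/* Pack a prefix id and an escape id into one attribute word. */
static demo_attr_t demo_make_attr(unsigned int pfx, unsigned int esc)
{
	assert(pfx <= DEMO_PFX_MAX && esc <= DEMO_ESC_MAX);
	return (pfx << DEMO_PFX_OFFS) | (esc << DEMO_ESC_OFFS);
}

/* Extract the prefix id from an attribute word. */
static unsigned int demo_attr_prefix(demo_attr_t attr)
{
	return (attr & DEMO_PFX_MASK) >> DEMO_PFX_OFFS;
}

/* Extract the escape id from an attribute word. */
static unsigned int demo_attr_escape(demo_attr_t attr)
{
	return (attr & DEMO_ESC_MASK) >> DEMO_ESC_OFFS;
}
```

Going through accessor helpers rather than raw masks matches the header's own
advice not to use the bitmasks directly.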

Currently, the opcode maps are based on the opcode maps in the Intel(R) 64 and
IA-32 Architectures Software Developer's Manual Vol.2, Appendix A,
and consist of the two types of opcode tables below.

1-byte/2-byte/3-byte opcode tables, which have 256 elements each, are
written as follows:

 Table: table-name
 Referrer: escaped-name
 opcode: mnemonic|GrpXXX [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...]
  (or)
 opcode: escape # escaped-name
 EndTable

Group opcode tables, which have 8 elements each, are written as follows:

 GrpTable: GrpXXX
 reg:  mnemonic [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...]
 EndTable

These opcode maps do NOT include most SSE and FP opcodes, because
those opcodes are not used in the kernel.

Changes from v6.1:
- fix patch title.

Signed-off-by: Masami Hiramatsu mhira...@redhat.com
Signed-off-by: Jim Keniston jkeni...@us.ibm.com
Cc: H. Peter Anvin h...@zytor.com
Cc: Steven Rostedt rost...@goodmis.org
Cc: Ananth N Mavinakayanahalli ana...@in.ibm.com
Cc: Ingo Molnar mi...@elte.hu
Cc: Frederic Weisbecker fweis...@gmail.com
Cc: Andi Kleen a...@linux.intel.com
Cc: Vegard Nossum vegard.nos...@gmail.com
Cc: Avi Kivity a...@redhat.com
Cc: Przemysław Pawełczyk przemys...@pawelczyk.it
---

 arch/x86/include/asm/inat.h|  125 ++
 arch/x86/include/asm/insn.h|  134 ++
 arch/x86/lib/Makefile  |   13 +
 arch/x86/lib/inat.c|   80 
 arch/x86/lib/insn.c|  471 +
 arch/x86/lib/x86-opcode-map.txt|  711 
 arch/x86/scripts/gen-insn-attr-x86.awk |  314 ++
 7 files changed, 1848 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/inat.h
 create mode 100644 arch/x86/include/asm/insn.h
 create mode 100644 arch/x86/lib/inat.c
 create mode 100644 arch/x86/lib/insn.c
 create mode 100644 arch/x86/lib/x86-opcode-map.txt
 create mode 100644 arch/x86/scripts/gen-insn-attr-x86.awk

diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h
new file mode 100644
index 000..01e079a
--- /dev/null
+++ b/arch/x86/include/asm/inat.h
@@ -0,0 +1,125 @@
+#ifndef _ASM_INAT_INAT_H
+#define _ASM_INAT_INAT_H
+/*
+ * x86 instruction attributes
+ *
+ * Written by Masami Hiramatsu mhira...@redhat.com
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ */
+#include <linux/types.h>
+
+/* Instruction attributes */
+typedef u32 insn_attr_t;
+
+/*
+ * Internal bits. Don't use bitmasks directly, because these bits are
+ * unstable. You should add checking macros and use that macro in
+ * your code.
+ */
+
+#define INAT_OPCODE_TABLE_SIZE 256
+#define INAT_GROUP_TABLE_SIZE 8
+
+/* Legacy instruction prefixes */
+#define INAT_PFX_OPNDSZ 1   /* 0x66 */ /* LPFX1 */
+#define INAT_PFX_REPNE  2   /* 0xF2 */ /* LPFX2 */
+#define INAT_PFX_REPE   3   /* 0xF3 */ /* LPFX3 */
+#define INAT_PFX_LOCK   4   /* 0xF0 */
+#define INAT_PFX_CS     5   /* 0x2E */
+#define INAT_PFX_DS     6   /* 0x3E */
+#define INAT_PFX_ES     7   /* 0x26 */
+#define INAT_PFX_FS     8   /* 0x64 */
+#define INAT_PFX_GS     9   /* 0x65 */
+#define INAT_PFX_SS     10  /* 0x36 */
+#define INAT_PFX_ADDRSZ 11  /* 0x67 */
+
+#define INAT_LPREFIX_MAX   3
+
+/* Immediate size */
+#define INAT_IMM_BYTE  1
+#define INAT_IMM_WORD  2
+#define INAT_IMM_DWORD 3
+#define INAT_IMM_QWORD 4
+#define INAT_IMM_PTR   5
+#define INAT_IMM_VWORD32   6
+#define INAT_IMM_VWORD 7
+
+/* Legacy prefix */
+#define INAT_PFX_OFFS  0
+#define INAT_PFX_BITS  4
+#define INAT_PFX_MAX   ((1 << INAT_PFX_BITS) - 1)
+#define INAT_PFX_MASK  (INAT_PFX_MAX << INAT_PFX_OFFS)
+/* Escape opcodes */
+#define INAT_ESC_OFFS  (INAT_PFX_OFFS + INAT_PFX_BITS)
+#define INAT_ESC_BITS  2
+#define INAT_ESC_MAX   ((1  

[PATCH -tip v9 0/7] tracing: kprobe-based event tracer and x86 instruction decoder

2009-06-01 Thread Masami Hiramatsu
Hi,

Here are the patches of the kprobe-based event tracer for x86, version 9,
which allows you to probe various kernel events through the ftrace interface.
The tracer supports per-probe filtering, which allows you to set filters
on each probe and shows the format of each probe. I think this is a more
generic integration with ftrace, especially with the event tracer.

This patchset also includes x86(-64) instruction decoder which
supports non-SSE/FP opcodes and includes x86 opcode map. The decoder
is used for finding the instruction boundaries when inserting new
kprobes. I think it will be possible to share this opcode map
with KVM's decoder.
The decoder is tested when building the kernel: the test compares the
results of objdump and the decoder right after building vmlinux.
You can enable that test with CONFIG_X86_DECODER_SELFTEST=y.

This series can be applied on the latest linux-2.6-tip tree.

This supports only x86(-32/-64) (but porting it to another architecture
just needs kprobes/kretprobes and register and stack access APIs).

This patchset includes following changes:
- Add x86 instruction decoder [1/7]
- Add x86 instruction decoder selftest [2/7]
- Check insertion point safety in kprobe [3/7]
- Cleanup fix_riprel() with insn decoder [4/7]
- Add arch-dep register and stack fetching functions [5/7] (FIXED)
- Add dynamic event_call support to ftrace [6/7] (FIXED)
- Add kprobe-based event tracer [7/7] (FIXED)


Enhancement ideas will be added after merging:
- .init function tracing support.
- Support primitive types(long, ulong, int, uint, etc) for args.


Kprobe-based Event Tracer
=========================

Overview
--------
This tracer is similar to the events tracer, which is based on the Tracepoint
infrastructure. Instead of Tracepoint, this tracer is based on kprobes (kprobe
and kretprobe). It can probe anywhere kprobes can probe, which means any
function body except those of __kprobes functions.

Unlike the function tracer, this tracer can probe instructions inside
kernel functions. This lets you check which instructions have been executed.

Unlike the Tracepoint-based events tracer, this tracer can add new probe points
on the fly.

Like the events tracer, this tracer does not need to be activated via
current_tracer. Instead, just set probe points via
/sys/kernel/debug/tracing/kprobe_events. You can also set filters on each
probe event via /sys/kernel/debug/tracing/events/kprobes/EVENT/filter.


Synopsis of kprobe_events
-------------------------
  p[:EVENT] SYMBOL[+offs|-offs]|MEMADDR [FETCHARGS] : set a probe
  r[:EVENT] SYMBOL[+0] [FETCHARGS]                  : set a return probe

 EVENT               : Event name
 SYMBOL[+offs|-offs] : Symbol+offset where the probe is inserted
 MEMADDR             : Address where the probe is inserted

 FETCHARGS           : Arguments
  %REG              : Fetch register REG
  sN                : Fetch Nth entry of stack (N >= 0)
  @ADDR             : Fetch memory at ADDR (ADDR should be in kernel)
  @SYM[+|-offs]     : Fetch memory at SYM +|- offs (SYM should be a data symbol)
  aN                : Fetch function argument. (N >= 0)(*)
  rv                : Fetch return value.(**)
  ra                : Fetch return address.(**)
  +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(***)

  (*) aN may not be correct on asmlinkage functions or in the middle of a
      function body.
  (**) only for return probe.
  (***) this is useful for fetching a field of data structures.
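
To make the SYMBOL[+offs|-offs] form concrete, here is a small user-space
sketch that splits such a string into a symbol name and a signed offset. It
is only an illustration of the syntax above, not the parser used by the
tracer, and demo_split_symbol_offset is a hypothetical name:

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split "SYMBOL", "SYMBOL+offs" or "SYMBOL-offs" in place.
 * On success returns 0, leaves the NUL-terminated symbol name in 'spec'
 * and stores the signed offset (0 if absent) in '*offs'.
 * On a malformed string returns -1 and leaves 'spec' untouched.
 */
static int demo_split_symbol_offset(char *spec, long *offs)
{
	char *sign, *end;
	long val;

	assert(spec != NULL && offs != NULL);
	*offs = 0;
	sign = strpbrk(spec, "+-");
	if (sign == NULL)
		return 0;                /* bare symbol, offset 0 */
	if (sign == spec || !isdigit((unsigned char)sign[1]))
		return -1;               /* empty symbol or missing offset */

	val = strtol(sign + 1, &end, 0); /* base 0 accepts decimal and 0x.. */
	if (*end != '\0')
		return -1;               /* trailing garbage after the offset */

	*offs = (*sign == '-') ? -val : val;
	*sign = '\0';                    /* terminate the symbol name */
	return 0;
}
```

For example, "do_sys_open+0x10" yields the symbol "do_sys_open" and the
offset 16, while a bare "schedule" yields offset 0.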


Per-Probe Event Filtering
-------------------------
 The per-probe event filtering feature lets you set a different filter on each
probe and choose which arguments are shown in the trace buffer. If an event
name is specified right after 'p:' or 'r:' in kprobe_events, the tracer adds
an event under tracing/events/kprobes/EVENT. In that directory you can see
'id', 'enabled', 'format' and 'filter'.

enabled:
  You can enable/disable the probe by writing 1 or 0 to it.

format:
  It shows the format of this probe event. It also shows the aliases of the
 arguments which you specified in kprobe_events.

filter:
  You can write filtering rules for this event. You can use both alias
 names and field names to describe filters.


Usage examples
--------------
To add a probe as a new event, write a new definition to kprobe_events
as below.

  echo p:myprobe do_sys_open a0 a1 a2 a3 > /sys/kernel/debug/tracing/kprobe_events

 This sets a kprobe at the top of the do_sys_open() function, recording the
1st to 4th arguments as the myprobe event.
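
The same definition could also be written from a C program. The sketch below
builds the definition string and writes it to a caller-supplied path. It
assumes only what the echo example shows (one definition per line, written to
the kprobe_events file); both helper names are hypothetical, and writing to
the real file needs debugfs mounted and sufficient privileges:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "p:EVENT SYMBOL ARGS" (or "r:..." for a return probe) into buf.
 * Returns 0 on success, -1 if the result would not fit.
 */
static int demo_format_probe(char *buf, size_t size, int is_return,
			     const char *event, const char *symbol,
			     const char *fetchargs)
{
	int n;

	assert(buf != NULL && event != NULL && symbol != NULL &&
	       fetchargs != NULL);
	n = snprintf(buf, size, "%c:%s %s %s", is_return ? 'r' : 'p',
		     event, symbol, fetchargs);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Write one definition plus a newline to 'path', truncating it like
 * "echo ... > path" does. Returns 0 on success, -1 on any I/O error.
 */
static int demo_write_probe(const char *path, const char *def)
{
	size_t len = strlen(def);
	FILE *f = fopen(path, "w");
	int ok;

	if (f == NULL)
		return -1;
	ok = fwrite(def, 1, len, f) == len && fputc('\n', f) != EOF;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Checking the fclose() result matters here because a write that the kernel
rejects may only be reported when the buffered data is flushed.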

  echo r:myretprobe do_sys_open rv ra > /sys/kernel/debug/tracing/kprobe_events

 This sets a kretprobe on the return point of the do_sys_open() function,
recording the return value and return address as the myretprobe event.
 You can see the format of these events via
/sys/kernel/debug/tracing/events/kprobes/EVENT/format.

  cat /sys/kernel/debug/tracing/events/kprobes/myprobe/format
name: myprobe
ID: 23
format:
field:unsigned short common_type;   offset:0;   size:2;
field:unsigned char common_flags;   offset:2;   size:1;
 

[PATCH -tip v9 6/7] tracing: ftrace dynamic ftrace_event_call support

2009-06-01 Thread Masami Hiramatsu
Add dynamic ftrace_event_call support to ftrace. Trace engines can add new
ftrace_event_calls to ftrace on the fly. Each operator function of the call
takes a ftrace_event_call data structure as an argument, because these
functions may be shared among several ftrace_event_calls.
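
The reason for passing the structure can be shown with a stripped-down
user-space sketch: once a callback receives the call it operates on, one
function can serve any number of calls created at run time. The struct and
function names below are hypothetical stand-ins, not the ftrace definitions:

```c
#include <assert.h>
#include <stddef.h>

struct demo_event_call {
	const char *name;
	int enabled;
	/* Callbacks receive the call they act on, so they can be shared. */
	int  (*regfunc)(struct demo_event_call *call);
	void (*unregfunc)(struct demo_event_call *call);
};

/* One pair of callbacks serving every dynamically created call. */
static int demo_shared_reg(struct demo_event_call *call)
{
	assert(call != NULL);
	call->enabled = 1;
	return 0;
}

static void demo_shared_unreg(struct demo_event_call *call)
{
	assert(call != NULL);
	call->enabled = 0;
}

/* Fill in a call at run time, pointing it at the shared callbacks. */
static void demo_init_call(struct demo_event_call *call, const char *name)
{
	call->name = name;
	call->enabled = 0;
	call->regfunc = demo_shared_reg;
	call->unregfunc = demo_shared_unreg;
}
```

With a (void) signature, each call would need its own generated function to
know which event it belongs to, which is what the TRACE_EVENT macros do for
static events.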

Changes from v8:
 - Lock event_mutex in trace_add/remove_event_call().
 - Add __trace_add/remove_event_call() for internal use.
 - Rename dummy variables to unused.

Signed-off-by: Masami Hiramatsu mhira...@redhat.com
Cc: Steven Rostedt rost...@goodmis.org
Cc: Ingo Molnar mi...@elte.hu
Cc: Tom Zanussi tzanu...@gmail.com
Cc: Frederic Weisbecker fweis...@gmail.com
---

 include/linux/ftrace_event.h |   13 +---
 include/trace/ftrace.h   |   22 +++--
 kernel/trace/trace_events.c  |   70 --
 kernel/trace/trace_export.c  |   27 
 4 files changed, 85 insertions(+), 47 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index bbf40f6..e25f3a4 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -108,12 +108,13 @@ struct ftrace_event_call {
struct dentry   *dir;
struct trace_event  *event;
int enabled;
-   int (*regfunc)(void);
-   void(*unregfunc)(void);
+   int (*regfunc)(struct ftrace_event_call *);
+   void(*unregfunc)(struct ftrace_event_call *);
int id;
-   int (*raw_init)(void);
-   int (*show_format)(struct trace_seq *s);
-   int (*define_fields)(void);
+   int (*raw_init)(struct ftrace_event_call *);
+   int (*show_format)(struct ftrace_event_call *,
+  struct trace_seq *);
+   int (*define_fields)(struct ftrace_event_call *);
struct list_headfields;
int filter_active;
void*filter;
@@ -138,6 +139,8 @@ extern int filter_current_check_discard(struct ftrace_event_call *call,
 
 extern int trace_define_field(struct ftrace_event_call *call, char *type,
  char *name, int offset, int size, int is_signed);
+extern int trace_add_event_call(struct ftrace_event_call *call);
+extern void trace_remove_event_call(struct ftrace_event_call *call);
 
 #define is_signed_type(type)   (((type)(-1)) < 0)
 
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index b4ec83a..e163e4b 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -229,7 +229,8 @@ ftrace_raw_output_##call(struct trace_iterator *iter, int flags)\
 #undef TRACE_EVENT
 #define TRACE_EVENT(call, proto, args, tstruct, func, print)   \
 static int \
-ftrace_format_##call(struct trace_seq *s)  \
+ftrace_format_##call(struct ftrace_event_call *event_call, \
+struct trace_seq *s)   \
 {  \
struct ftrace_raw_##call field __attribute__((unused)); \
int ret = 0;\
@@ -269,10 +270,9 @@ ftrace_format_##call(struct trace_seq *s)  \
 #undef TRACE_EVENT
 #define TRACE_EVENT(call, proto, args, tstruct, func, print)   \
 int\
-ftrace_define_fields_##call(void)  \
+ftrace_define_fields_##call(struct ftrace_event_call *event_call)  \
 {  \
struct ftrace_raw_##call field; \
-   struct ftrace_event_call *event_call = event_##call;   \
int ret;\
\
__common_field(int, type, 1);   \
@@ -298,7 +298,7 @@ ftrace_define_fields_##call(void)   \
  * event_trace_printk(_RET_IP_, call:  fmt);
  * }
  *
- * static int ftrace_reg_event_call(void)
+ * static int ftrace_reg_event_call(struct ftrace_event_call *unused)
  * {
  * int ret;
  *
@@ -309,7 +309,7 @@ ftrace_define_fields_##call(void)   \
  * return ret;
  * }
  *
- * static void ftrace_unreg_event_call(void)
+ * static void ftrace_unreg_event_call(struct ftrace_event_call *unused)
  * {
  * unregister_trace_call(ftrace_event_call);
  * }
@@ -342,7 +342,7 @@ ftrace_define_fields_##call(void)   \
  * 

[PATCH -tip v9 4/7] kprobes: cleanup fix_riprel() using insn decoder on x86

2009-06-01 Thread Masami Hiramatsu
Cleanup fix_riprel() in arch/x86/kernel/kprobes.c by using x86 instruction
decoder.
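
For context, the hand-rolled scan that this patch removes begins by skipping
legacy prefix bytes. A bounded user-space sketch of that one step is shown
below. The byte values come from the switch statement deleted in the diff,
while the function names and the explicit length bound are additions for the
example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Legacy prefix bytes, as listed in the switch this patch removes. */
static int demo_is_legacy_prefix(uint8_t b)
{
	switch (b) {
	case 0x66: case 0x67:             /* operand-size, address-size */
	case 0x2e: case 0x3e: case 0x26:  /* CS, DS, ES overrides */
	case 0x64: case 0x65: case 0x36:  /* FS, GS, SS overrides */
	case 0xf0: case 0xf3: case 0xf2:  /* LOCK, REPE, REPNE */
		return 1;
	}
	return 0;
}

/* Return how many leading bytes of insn[0..len) are legacy prefixes. */
static size_t demo_count_legacy_prefixes(const uint8_t *insn, size_t len)
{
	size_t n = 0;

	assert(insn != NULL);
	while (n < len && demo_is_legacy_prefix(insn[n]))
		n++;
	return n;
}
```

The removed code then went on to handle the REX prefix and the opcode tables
by hand, which is the part the shared decoder now takes over.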

Signed-off-by: Masami Hiramatsu mhira...@redhat.com
Cc: Ananth N Mavinakayanahalli ana...@in.ibm.com
Cc: Jim Keniston jkeni...@us.ibm.com
Cc: Ingo Molnar mi...@elte.hu
---

 arch/x86/kernel/kprobes.c |  128 -
 1 files changed, 23 insertions(+), 105 deletions(-)

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index 41d524f..ebac470 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -108,50 +108,6 @@ static const u32 twobyte_is_boostable[256 / 32] = {
/*  --- */
/*  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  */
 };
-static const u32 onebyte_has_modrm[256 / 32] = {
-   /*  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  */
-   /*  --- */
-   W(0x00, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0) | /* 00 */
-   W(0x10, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0) , /* 10 */
-   W(0x20, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0) | /* 20 */
-   W(0x30, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0) , /* 30 */
-   W(0x40, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | /* 40 */
-   W(0x50, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) , /* 50 */
-   W(0x60, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0) | /* 60 */
-   W(0x70, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) , /* 70 */
-   W(0x80, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* 80 */
-   W(0x90, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) , /* 90 */
-   W(0xa0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | /* a0 */
-   W(0xb0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) , /* b0 */
-   W(0xc0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0) | /* c0 */
-   W(0xd0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1) , /* d0 */
-   W(0xe0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | /* e0 */
-   W(0xf0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1)   /* f0 */
-   /*  --- */
-   /*  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  */
-};
-static const u32 twobyte_has_modrm[256 / 32] = {
-   /*  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  */
-   /*  --- */
-   W(0x00, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1) | /* 0f */
-   W(0x10, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0) , /* 1f */
-   W(0x20, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1) | /* 2f */
-   W(0x30, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) , /* 3f */
-   W(0x40, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* 4f */
-   W(0x50, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* 5f */
-   W(0x60, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* 6f */
-   W(0x70, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1) , /* 7f */
-   W(0x80, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | /* 8f */
-   W(0x90, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* 9f */
-   W(0xa0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1) | /* af */
-   W(0xb0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1) , /* bf */
-   W(0xc0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0) | /* cf */
-   W(0xd0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* df */
-   W(0xe0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* ef */
-   W(0xf0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0)   /* ff */
-   /*  --- */
-   /*  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  */
-};
 #undef W
 
 struct kretprobe_blackpoint kretprobe_blacklist[] = {
@@ -344,68 +300,30 @@ static int __kprobes is_IF_modifier(kprobe_opcode_t *insn)
 static void __kprobes fix_riprel(struct kprobe *p)
 {
 #ifdef CONFIG_X86_64
-   u8 *insn = p->ainsn.insn;
-   s64 disp;
-   int need_modrm;
-
-   /* Skip legacy instruction prefixes.  */
-   while (1) {
-   switch (*insn) {
-   case 0x66:
-   case 0x67:
-   case 0x2e:
-   case 0x3e:
-   case 0x26:
-   case 0x64:
-   case 0x65:
-   case 0x36:
-   case 0xf0:
-   case 0xf3:
-   case 0xf2:
-   ++insn;
-   continue;
-   }
-   break;
-   }
+   struct insn insn;
+   kernel_insn_init(&insn, p->ainsn.insn);
 
-   /* Skip REX instruction prefix.  */
-   if (is_REX_prefix(insn))
-   ++insn;
-
-   if (*insn == 0x0f) {
-   /* Two-byte opcode.  */
-   ++insn;
-   

[PATCH -tip v9 2/7] x86: x86 instruction decoder build-time selftest

2009-06-01 Thread Masami Hiramatsu
Add a user-space selftest of the x86 instruction decoder at kernel build time.
When CONFIG_X86_DECODER_SELFTEST=y, Kbuild builds a test harness for the x86
instruction decoder and runs it after building vmlinux.
The test compares the results of objdump and the x86 instruction decoder
code and checks that there are no differences.

Changes from v7:
- Add data, addr, rep, lock prefixes to skip instructions list.
- Add license comments.

Signed-off-by: Masami Hiramatsu mhira...@redhat.com
Signed-off-by: Jim Keniston jkeni...@us.ibm.com
Cc: H. Peter Anvin h...@zytor.com
Cc: Steven Rostedt rost...@goodmis.org
Cc: Ananth N Mavinakayanahalli ana...@in.ibm.com
Cc: Ingo Molnar mi...@elte.hu
Cc: Frederic Weisbecker fweis...@gmail.com
Cc: Andi Kleen a...@linux.intel.com
Cc: Vegard Nossum vegard.nos...@gmail.com
Cc: Avi Kivity a...@redhat.com
Cc: Przemysław Pawełczyk przemys...@pawelczyk.it
Cc: Sam Ravnborg s...@ravnborg.org
---

 arch/x86/Kconfig.debug  |9 
 arch/x86/Makefile   |3 +
 arch/x86/include/asm/inat.h |2 +
 arch/x86/include/asm/insn.h |2 +
 arch/x86/lib/inat.c |2 +
 arch/x86/lib/insn.c |2 +
 arch/x86/scripts/Makefile   |   19 +++
 arch/x86/scripts/distill.awk|   42 +
 arch/x86/scripts/test_get_len.c |   99 +++
 arch/x86/scripts/user_include.h |   49 +++
 10 files changed, 229 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/scripts/Makefile
 create mode 100644 arch/x86/scripts/distill.awk
 create mode 100644 arch/x86/scripts/test_get_len.c
 create mode 100644 arch/x86/scripts/user_include.h

diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 9a88937..430aab4 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -179,6 +179,15 @@ config X86_DS_SELFTEST
 config HAVE_MMIOTRACE_SUPPORT
def_bool y
 
+config X86_DECODER_SELFTEST
+ bool "x86 instruction decoder selftest"
+ depends on DEBUG_KERNEL
+   ---help---
+Perform x86 instruction decoder selftests at build time.
+This option is useful for checking the sanity of x86 instruction
+decoder code.
+If unsure, say N.
+
 #
 # IO delay types:
 #
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 1b68659..7046556 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -154,6 +154,9 @@ all: bzImage
 KBUILD_IMAGE := $(boot)/bzImage
 
 bzImage: vmlinux
+ifeq ($(CONFIG_X86_DECODER_SELFTEST),y)
+   $(Q)$(MAKE) $(build)=arch/x86/scripts posttest
+endif
$(Q)$(MAKE) $(build)=$(boot) $(KBUILD_IMAGE)
$(Q)mkdir -p $(objtree)/arch/$(UTS_MACHINE)/boot
$(Q)ln -fsn ../../x86/boot/bzImage $(objtree)/arch/$(UTS_MACHINE)/boot/$@
diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h
index 01e079a..9090665 100644
--- a/arch/x86/include/asm/inat.h
+++ b/arch/x86/include/asm/inat.h
@@ -20,7 +20,9 @@
  * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
  *
  */
+#ifdef __KERNEL__
 #include <linux/types.h>
+#endif
 
 /* Instruction attributes */
 typedef u32 insn_attr_t;
diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h
index 5b50fa3..5736404 100644
--- a/arch/x86/include/asm/insn.h
+++ b/arch/x86/include/asm/insn.h
@@ -20,7 +20,9 @@
  * Copyright (C) IBM Corporation, 2009
  */
 
+#ifdef __KERNEL__
 #include <linux/types.h>
+#endif
 /* insn_attr_t is defined in inat.h */
 #include <asm/inat.h>
 
diff --git a/arch/x86/lib/inat.c b/arch/x86/lib/inat.c
index d6a34be..564ecbd 100644
--- a/arch/x86/lib/inat.c
+++ b/arch/x86/lib/inat.c
@@ -18,7 +18,9 @@
  * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
  *
  */
+#ifdef __KERNEL__
 #include <linux/module.h>
+#endif
 #include <asm/insn.h>
 
 /* Attribute tables are generated from opcode map */
diff --git a/arch/x86/lib/insn.c b/arch/x86/lib/insn.c
index 254c848..3b9451a 100644
--- a/arch/x86/lib/insn.c
+++ b/arch/x86/lib/insn.c
@@ -18,8 +18,10 @@
  * Copyright (C) IBM Corporation, 2002, 2004, 2009
  */
 
+#ifdef __KERNEL__
 #include <linux/string.h>
 #include <linux/module.h>
+#endif
 #include <asm/inat.h>
 #include <asm/insn.h>
 
diff --git a/arch/x86/scripts/Makefile b/arch/x86/scripts/Makefile
new file mode 100644
index 000..f08859e
--- /dev/null
+++ b/arch/x86/scripts/Makefile
@@ -0,0 +1,19 @@
+PHONY += posttest
+quiet_cmd_posttest = TEST    $@
+      cmd_posttest = objdump -d $(objtree)/vmlinux | awk -f $(srctree)/arch/x86/scripts/distill.awk | $(obj)/test_get_len
+
+posttest: $(obj)/test_get_len vmlinux
+   $(call cmd,posttest)
+
+test_get_len_SRC = $(srctree)/arch/x86/scripts/test_get_len.c $(srctree)/arch/x86/lib/insn.c $(srctree)/arch/x86/lib/inat.c
+test_get_len_INC = $(srctree)/arch/x86/include/asm/inat.h $(srctree)/arch/x86/include/asm/insn.h $(objtree)/arch/x86/lib/inat-tables.c
+
+quiet_cmd_test_get_len = CC  $@
+  cmd_test_get_len = $(CC) -Wall $(test_get_len_SRC) 

[PATCH -tip v9 5/7] x86: add pt_regs register and stack access APIs

2009-06-01 Thread Masami Hiramatsu
Add the following APIs for accessing registers and stack entries from pt_regs.

- regs_query_register_offset(const char *name)
   Query the offset of the register called name.

- regs_query_register_name(unsigned offset)
   Query the name of a register by its offset.

- regs_get_register(struct pt_regs *regs, unsigned offset)
   Get the value of a register by its offset.

- regs_within_kernel_stack(struct pt_regs *regs, unsigned long addr)
   Check whether the address is within the kernel stack.

- regs_get_kernel_stack_nth(struct pt_regs *reg, unsigned nth)
   Get the Nth entry of the kernel stack. (N >= 0)

- regs_get_argument_nth(struct pt_regs *reg, unsigned nth)
   Get the Nth argument at function call. (N >= 0)
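
The name/offset queries can be pictured as lookups in a table built with
offsetof(). The user-space sketch below uses a cut-down, hypothetical
struct demo_regs instead of the real pt_regs, and all demo_* names are
invented for the example. It also rejects misaligned offsets, which is an
extra safety check for the sketch rather than something the patch is stated
to do:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for struct pt_regs. */
struct demo_regs {
	unsigned long ax;
	unsigned long cx;
	unsigned long dx;
	unsigned long sp;
};

struct demo_reg_name {
	const char *name;
	unsigned int offset;
};

#define DEMO_REG(r) { #r, (unsigned int)offsetof(struct demo_regs, r) }
static const struct demo_reg_name demo_reg_table[] = {
	DEMO_REG(ax), DEMO_REG(cx), DEMO_REG(dx), DEMO_REG(sp),
};
#define DEMO_NR_REGS (sizeof(demo_reg_table) / sizeof(demo_reg_table[0]))
#define DEMO_MAX_REG_OFFSET offsetof(struct demo_regs, sp)

/* Name -> byte offset inside struct demo_regs, or -1 if unknown. */
static int demo_query_register_offset(const char *name)
{
	size_t i;

	for (i = 0; i < DEMO_NR_REGS; i++)
		if (strcmp(demo_reg_table[i].name, name) == 0)
			return (int)demo_reg_table[i].offset;
	return -1;
}

/* Byte offset -> name, or NULL if no register starts at that offset. */
static const char *demo_query_register_name(unsigned int offset)
{
	size_t i;

	for (i = 0; i < DEMO_NR_REGS; i++)
		if (demo_reg_table[i].offset == offset)
			return demo_reg_table[i].name;
	return NULL;
}

/* Read a register by offset; returns 0 for invalid offsets. */
static unsigned long demo_get_register(const struct demo_regs *regs,
				       unsigned int offset)
{
	assert(regs != NULL);
	if (offset > DEMO_MAX_REG_OFFSET || offset % sizeof(unsigned long))
		return 0;
	return *(const unsigned long *)((const char *)regs + offset);
}
```

A tracer can then resolve a register name once, when the probe is defined,
and use only the cheap offset-based read each time the probe fires.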

Changes from v8:
 - Add regs_ prefix to the APIs
 - Add kerneldoc comments.
 - Cleanup regs_get_argument_nth() code.
 - Remove REG_OFFSET macro.

Signed-off-by: Masami Hiramatsu mhira...@redhat.com
Cc: Christoph Hellwig h...@infradead.org
Cc: Steven Rostedt rost...@goodmis.org
Cc: Ananth N Mavinakayanahalli ana...@in.ibm.com
Cc: Ingo Molnar mi...@elte.hu
Cc: Frederic Weisbecker fweis...@gmail.com
Cc: Roland McGrath rol...@redhat.com
Cc: linux-a...@vger.kernel.org
---

 arch/x86/include/asm/ptrace.h |  122 +
 arch/x86/kernel/ptrace.c  |   73 +
 2 files changed, 195 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 0f0d908..2fd3ea3 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -7,6 +7,7 @@
 
 #ifdef __KERNEL__
 #include <asm/segment.h>
+#include <asm/page_types.h>
 #endif
 
 #ifndef __ASSEMBLY__
@@ -216,6 +217,127 @@ static inline unsigned long user_stack_pointer(struct pt_regs *regs)
        return regs->sp;
 }
 
+/* Query offset/name of register from its name/offset */
+extern int regs_query_register_offset(const char *name);
+extern const char *regs_query_register_name(unsigned offset);
+#define MAX_REG_OFFSET (offsetof(struct pt_regs, ss))
+
+/**
+ * regs_get_register() - get register value from its offset
+ * @regs:  pt_regs from which register value is gotten.
+ * @offset:offset number of the register.
+ *
+ * regs_get_register returns the value of a register whose offset from @regs
+ * is @offset. The @offset is the offset of the register in struct pt_regs.
+ * If @offset is bigger than MAX_REG_OFFSET, this returns 0.
+ */
+static inline unsigned long regs_get_register(struct pt_regs *regs,
+ unsigned offset)
+{
+   if (unlikely(offset > MAX_REG_OFFSET))
+   return 0;
+   return *(unsigned long *)((unsigned long)regs + offset);
+}
+
+/**
+ * regs_within_kernel_stack() - check the address in the stack
+ * @regs:  pt_regs which contains kernel stack pointer.
+ * @addr:  address which is checked.
+ *
+ * regs_within_kernel_stack() checks @addr is within the kernel stack page(s).
+ * If @addr is within the kernel stack, it returns true. If not, returns false.
+ */
+static inline int regs_within_kernel_stack(struct pt_regs *regs,
+  unsigned long addr)
+{
+   return ((addr & ~(THREAD_SIZE - 1))  ==
+   (kernel_stack_pointer(regs) & ~(THREAD_SIZE - 1)));
+}
+
+/**
+ * regs_get_kernel_stack_nth() - get Nth entry of the stack
+ * @regs:  pt_regs which contains kernel stack pointer.
+ * @n: stack entry number.
+ *
+ * regs_get_kernel_stack_nth() returns @n th entry of the kernel stack which
+ * is specified by @regs. If the @n th entry is NOT in the kernel stack,
+ * this returns 0.
+ */
+static inline unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
+ unsigned n)
+{
+   unsigned long *addr = (unsigned long *)kernel_stack_pointer(regs);
+   addr += n;
+   if (regs_within_kernel_stack(regs, (unsigned long)addr))
+   return *addr;
+   else
+   return 0;
+}
+
+/**
+ * regs_get_argument_nth() - get Nth argument at function call
+ * @regs:  pt_regs which contains registers at function entry.
+ * @n: argument number.
+ *
+ * regs_get_argument_nth() returns @n th argument of a function call.
+ * Since usually the kernel stack will be changed right after function entry,
+ * you must use this at function entry. If the @n th entry is NOT in the
+ * kernel stack or pt_regs, this returns 0.
+ */
+#ifdef CONFIG_X86_32
+#define NR_REGPARMS 3
+static inline unsigned long regs_get_argument_nth(struct pt_regs *regs,
+ unsigned n)
+{
+   if (n < NR_REGPARMS) {
+   switch (n) {
+   case 0:
+   return regs->ax;
+   case 1:
+   return regs->dx;
+   case 2:
+   return regs->cx;
+   }
+   return 0;
+   } else {
+   /*
+* The typical 

[PATCH -tip v9 7/7] tracing: add kprobe-based event tracer

2009-06-01 Thread Masami Hiramatsu
Add a kprobes-based event tracer on ftrace.

This tracer is similar to the events tracer, which is based on the Tracepoint
infrastructure. Instead of Tracepoint, this tracer is based on kprobes (kprobe
and kretprobe). It can probe anywhere kprobes can probe, which means any
function body except those of __kprobes functions.

Like the events tracer, this tracer does not need to be activated via
current_tracer. Instead, just set probe points via
/sys/kernel/debug/tracing/kprobe_events. You can also set filters on each
probe event via /sys/kernel/debug/tracing/events/kprobes/EVENT/filter.

This tracer supports the following probe arguments for each probe.

  %REG              : Fetch register REG
  sN                : Fetch Nth entry of stack (N >= 0)
  @ADDR             : Fetch memory at ADDR (ADDR should be in kernel)
  @SYM[+|-offs]     : Fetch memory at SYM +|- offs (SYM should be a data symbol)
  aN                : Fetch function argument. (N >= 0)
  rv                : Fetch return value.
  ra                : Fetch return address.
  +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.

See Documentation/trace/kprobes.txt for details.

Changes from v8:
 - Fix wrong argument offsets in format.
 - Remove EVENT_TRACING selection in Kconfig.
 - Fix debugfs file path.

Signed-off-by: Masami Hiramatsu mhira...@redhat.com
Cc: Christoph Hellwig h...@infradead.org
Cc: Steven Rostedt rost...@goodmis.org
Cc: Ananth N Mavinakayanahalli ana...@in.ibm.com
Cc: Ingo Molnar mi...@elte.hu
Cc: Frederic Weisbecker fweis...@gmail.com
Cc: Tom Zanussi tzanu...@gmail.com
---

 Documentation/trace/kprobes.txt  |  138 
 kernel/trace/Kconfig |   11 
 kernel/trace/Makefile|1 
 kernel/trace/trace.h |   22 +
 kernel/trace/trace_event_types.h |   20 +
 kernel/trace/trace_kprobe.c  | 1183 ++
 6 files changed, 1375 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/trace/kprobes.txt
 create mode 100644 kernel/trace/trace_kprobe.c

diff --git a/Documentation/trace/kprobes.txt b/Documentation/trace/kprobes.txt
new file mode 100644
index 000..3a90ebb
--- /dev/null
+++ b/Documentation/trace/kprobes.txt
@@ -0,0 +1,138 @@
+ Kprobe-based Event Tracer
+ =========================
+
+ Documentation is written by Masami Hiramatsu
+
+
+Overview
+--------
+This tracer is similar to the events tracer, which is based on the Tracepoint
+infrastructure. Instead of Tracepoint, this tracer is based on kprobes (kprobe
+and kretprobe). It can probe anywhere kprobes can probe, which means any
+function body except those of __kprobes functions.
+
+Unlike the function tracer, this tracer can probe instructions inside
+kernel functions. This lets you check which instructions have been executed.
+
+Unlike the Tracepoint-based events tracer, this tracer can add and remove
+probe points on the fly.
+
+Like the events tracer, this tracer does not need to be activated via
+current_tracer. Instead, just set probe points via
+/sys/kernel/debug/tracing/kprobe_events. You can also set filters on each
+probe event via /sys/kernel/debug/tracing/events/kprobes/EVENT/filter.
+
+
+Synopsis of kprobe_events
+-------------------------
+  p[:EVENT] SYMBOL[+offs|-offs]|MEMADDR [FETCHARGS]: set a probe
+  r[:EVENT] SYMBOL[+0] [FETCHARGS] : set a return probe
+
+ EVENT                 : Event name
+ SYMBOL[+offs|-offs]   : Symbol+offset where the probe is inserted
+ MEMADDR               : Address where the probe is inserted
+
+ FETCHARGS             : Arguments
+  %REG                 : Fetch register REG
+  sN                   : Fetch Nth entry of stack (N >= 0)
+  @ADDR                : Fetch memory at ADDR (ADDR should be in kernel)
+  @SYM[+|-offs]        : Fetch memory at SYM +|- offs (SYM should be a data symbol)
+  aN                   : Fetch function argument. (N >= 0)(*)
+  rv                   : Fetch return value.(**)
+  ra                   : Fetch return address.(**)
+  +|-offs(FETCHARG)    : Fetch memory at FETCHARG +|- offs address.(***)
+
+  (*) aN may not be correct on asmlinkage functions or in the middle of a
+  function body.
+  (**) only for return probes.
+  (***) this is useful for fetching a field of a data structure.
+
+
+Per-Probe Event Filtering
+-------------------------
+ The per-probe event filtering feature allows you to set a different filter on
+each probe and shows you which arguments will appear in the trace buffer. If an
+event name is specified right after 'p:' or 'r:' in kprobe_events, the tracer
+adds an event under tracing/events/kprobes/EVENT. In that directory you can see
+'id', 'enabled', 'format' and 'filter'.
+
+enabled:
+  You can enable/disable the probe by writing 1 or 0 to it.
+
+format:
+  It shows the format of this probe event. It also shows the aliases of the
+ arguments which you specified in kprobe_events.
+
+filter:
+  You can write filtering rules for this event. You can use both alias
+ names and field names when describing filters.
+
+
+Usage examples
+--------------
+To add a probe as a new 

[PATCH -tip v9 3/7] kprobes: checks probe address is instruction boundary on x86

2009-06-01 Thread Masami Hiramatsu
Ensure the safety of inserting kprobes by checking whether the specified
address is at the first byte of an instruction on x86.
This is done by decoding the probed function from its head to the probe point.

Signed-off-by: Masami Hiramatsu mhira...@redhat.com
Cc: Ananth N Mavinakayanahalli ana...@in.ibm.com
Cc: Jim Keniston jkeni...@us.ibm.com
Cc: Ingo Molnar mi...@elte.hu
---
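Note for readers, not part of the patch: a minimal user-space sketch of the
boundary check described above. It walks a byte stream from the function head
and reports whether the walk lands exactly on the target offset. The toy
encoding, where the first byte of each "instruction" holds its total length,
and the names toy_insn_length() and at_insn_boundary() are assumptions for
illustration; the patch itself relies on the x86 instruction decoder.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy length decoder, an assumption for illustration only: the first
 * byte of each "instruction" is its total length in bytes.
 */
size_t toy_insn_length(const unsigned char *code)
{
	return code[0];
}

/*
 * Return 1 if offset 'target' into the function body 'code' (of 'size'
 * bytes) is the first byte of an instruction, 0 otherwise. The walk
 * starts at the function head, as can_probe() does.
 */
int at_insn_boundary(const unsigned char *code, size_t size, size_t target)
{
	size_t off = 0;

	if (target >= size)
		return 0;
	while (off < target) {
		size_t len = toy_insn_length(code + off);

		/* Malformed stream: refuse instead of looping or overrunning. */
		if (len == 0 || len > size - off)
			return 0;
		off += len;
	}
	return off == target;
}
```

For a body of { 2, 0, 3, 0, 0, 1 }, instructions start at offsets 0, 2 and 5,
so only those offsets are accepted. Every other offset falls in the middle of
an instruction and is rejected, which is the case the patch reports as -EILSEQ.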

 arch/x86/kernel/kprobes.c |   69 +
 1 files changed, 69 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index 7b5169d..41d524f 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -48,12 +48,14 @@
 #include <linux/preempt.h>
 #include <linux/module.h>
 #include <linux/kdebug.h>
+#include <linux/kallsyms.h>
 
 #include <asm/cacheflush.h>
 #include <asm/desc.h>
 #include <asm/pgtable.h>
 #include <asm/uaccess.h>
 #include <asm/alternative.h>
+#include <asm/insn.h>
 
 void jprobe_return_end(void);
 
@@ -244,6 +246,71 @@ retry:
}
 }
 
+/* Recover the probed instruction at addr for further analysis. */
+static int recover_probed_instruction(kprobe_opcode_t *buf, unsigned long addr)
+{
+	struct kprobe *kp;
+	kp = get_kprobe((void *)addr);
+	if (!kp)
+		return -EINVAL;
+
+	/*
+	 *  Basically, kp->ainsn.insn has an original instruction.
+	 *  However, RIP-relative instruction can not do single-stepping
+	 *  at different place, fix_riprel() tweaks the displacement of
+	 *  that instruction. In that case, we can't recover the instruction
+	 *  from the kp->ainsn.insn.
+	 *
+	 *  On the other hand, kp->opcode has a copy of the first byte of
+	 *  the probed instruction, which is overwritten by int3. And
+	 *  the instruction at kp->addr is not modified by kprobes except
+	 *  for the first byte, we can recover the original instruction
+	 *  from it and kp->opcode.
+	 */
+	memcpy(buf, kp->addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
+	buf[0] = kp->opcode;
+	return 0;
+}
+
+/* Dummy buffers for kallsyms_lookup */
+static char __dummy_buf[KSYM_NAME_LEN];
+
+/* Check if paddr is at an instruction boundary */
+static int __kprobes can_probe(unsigned long paddr)
+{
+	int ret;
+	unsigned long addr, offset = 0;
+	struct insn insn;
+	kprobe_opcode_t buf[MAX_INSN_SIZE];
+
+	if (!kallsyms_lookup(paddr, NULL, &offset, NULL, __dummy_buf))
+		return 0;
+
+	/* Decode instructions */
+	addr = paddr - offset;
+	while (addr < paddr) {
+		kernel_insn_init(&insn, (void *)addr);
+		insn_get_opcode(&insn);
+
+		/* Check if the instruction has been modified. */
+		if (OPCODE1(&insn) == BREAKPOINT_INSTRUCTION) {
+			ret = recover_probed_instruction(buf, addr);
+			if (ret)
+				/*
+				 * Another debugging subsystem might insert
+				 * this breakpoint. In that case, we can't
+				 * recover it.
+				 */
+				return 0;
+			kernel_insn_init(&insn, buf);
+		}
+		insn_get_length(&insn);
+		addr += insn.length;
+	}
+
+	return (addr == paddr);
+}
+
 /*
  * Returns non-zero if opcode modifies the interrupt flag.
  */
@@ -359,6 +426,8 @@ static void __kprobes arch_copy_kprobe(struct kprobe *p)
 
 int __kprobes arch_prepare_kprobe(struct kprobe *p)
 {
+	if (!can_probe((unsigned long)p->addr))
+		return -EILSEQ;
 	/* insn: must be on special executable page on x86. */
 	p->ainsn.insn = get_insn_slot();
 	if (!p->ainsn.insn)


-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhira...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] powerpc/kvm: fix some init/exit annotations

2009-06-01 Thread Stephen Rothwell
Fixes a couple of warnings like this one:

WARNING: arch/powerpc/kvm/kvm-440.o(.text+0x1e8c): Section mismatch in 
reference from the function kvmppc_44x_exit() to the function 
.exit.text:kvmppc_booke_exit()
The function kvmppc_44x_exit() references a function in an exit section.
Often the function kvmppc_booke_exit() has valid usage outside the exit section
and the fix is to remove the __exit annotation of kvmppc_booke_exit.

Also add some __init annotations on obvious routines.

Signed-off-by: Stephen Rothwell s...@canb.auug.org.au
---
 arch/powerpc/kvm/44x.c   |4 ++--
 arch/powerpc/kvm/booke.c |2 +-
 arch/powerpc/kvm/e500.c  |4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/44x.c b/arch/powerpc/kvm/44x.c
index 0cef809..f4d1b55 100644
--- a/arch/powerpc/kvm/44x.c
+++ b/arch/powerpc/kvm/44x.c
@@ -138,7 +138,7 @@ void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu)
kmem_cache_free(kvm_vcpu_cache, vcpu_44x);
 }
 
-static int kvmppc_44x_init(void)
+static int __init kvmppc_44x_init(void)
 {
int r;
 
@@ -149,7 +149,7 @@ static int kvmppc_44x_init(void)
return kvm_init(NULL, sizeof(struct kvmppc_vcpu_44x), THIS_MODULE);
 }
 
-static void kvmppc_44x_exit(void)
+static void __exit kvmppc_44x_exit(void)
 {
kvmppc_booke_exit();
 }
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 642e420..e7bf4d0 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -520,7 +520,7 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
return kvmppc_core_vcpu_translate(vcpu, tr);
 }
 
-int kvmppc_booke_init(void)
+int __init kvmppc_booke_init(void)
 {
unsigned long ivor[16];
unsigned long max_ivor = 0;
diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c
index d8067fd..674e796 100644
--- a/arch/powerpc/kvm/e500.c
+++ b/arch/powerpc/kvm/e500.c
@@ -132,7 +132,7 @@ void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu)
kmem_cache_free(kvm_vcpu_cache, vcpu_e500);
 }
 
-static int kvmppc_e500_init(void)
+static int __init kvmppc_e500_init(void)
 {
int r, i;
unsigned long ivor[3];
@@ -160,7 +160,7 @@ static int kvmppc_e500_init(void)
return kvm_init(NULL, sizeof(struct kvmppc_vcpu_e500), THIS_MODULE);
 }
 
-static void kvmppc_e500_exit(void)
+static void __exit kvmppc_e500_exit(void)
 {
kvmppc_booke_exit();
 }
-- 
1.6.3.1

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/


[ kvm-Bugs-2789729 ] Destination guest will reboot when Migration

2009-06-01 Thread SourceForge.net
Bugs item #2789729, was opened at 2009-05-10 08:09
Message generated for change (Comment added) made by jiajun
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2789729&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: qemu
Group: None
Status: Open
Resolution: Fixed
Priority: 5
Private: No
Submitted By: Jiajun Xu (jiajun)
Assigned to: Nobody/Anonymous (nobody)
Summary: Destination guest will reboot when Migration

Initial Comment:
With kvm.git commit:66b0aed4a9e15a2ea3a00763f362b6ee0b28d538 and
qemu-kvm.git commit:6e57bb9a636cefdaba7decbd5ac10f1508ff64c0, when doing live 
migration, the destination guest will reboot.

Reproduce steps:

(1)qemu-img create -b /share/xvs/img/app/ia32e_SMP.img -f qcow2 
/share/xvs/var/tmp-img_CPL_LM_40_1228273473_1
(2)qemu  -m 256  -net nic,macaddr=00:16:3e:39:78:1c,model=rtl8139 -net 
tap,script=/etc/kvm/qemu-ifup
-hda /share/xvs/var/tmp-img_CPL_LM_40_1228273473_1 -incoming tcp:localhost:
(3) Press Ctrl+Alt+2 to switch to qemu monitor
(4) Run migrate tcp:localhost:


--

Comment By: Jiajun Xu (jiajun)
Date: 2009-06-01 20:47

Message:
Verified the bug with qemu-kvm.git
14cd1fdede96646992fc1f9731d3e367f0807ef0, the issue has been fixed in the
commit.



[PATCH] clean up cpu hotplug code

2009-06-01 Thread Glauber Costa
There's nothing kvm specific in the get_cpu function. Remove it from
the kvm ifdef. This buys us cleaner code, and may help with any future
attempt at integrating this.

Signed-off-by: Glauber Costa glom...@redhat.com
---
 hw/acpi.c |   12 +++-
 1 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/hw/acpi.c b/hw/acpi.c
index f4062ac..2b16437 100644
--- a/hw/acpi.c
+++ b/hw/acpi.c
@@ -770,9 +770,7 @@ static void disable_processor(struct gpe_regs *g, int cpu)
 g->cpus_sts[cpu/8] &= ~(1 << (cpu%8));
 }
 
-#if defined(TARGET_I386) || defined(TARGET_X86_64)
-#ifdef USE_KVM
-static CPUState *qemu_kvm_cpu_env(int index)
+static CPUState *qemu_get_cpu_env(int index)
 {
 CPUState *penv;
 
@@ -786,18 +784,14 @@ static CPUState *qemu_kvm_cpu_env(int index)
 
 return NULL;
 }
-#endif
 
+#if defined(TARGET_I386) || defined(TARGET_X86_64)
 
 void qemu_system_cpu_hot_add(int cpu, int state)
 {
 CPUState *env;
 
-if (state
-#ifdef USE_KVM
- (!qemu_kvm_cpu_env(cpu))
-#endif
-) {
+if (state && (!qemu_get_cpu_env(cpu))) {
 env = pc_new_cpu(cpu, model, 1);
 if (!env) {
 fprintf(stderr, "cpu %d creation failed\n", cpu);
-- 
1.6.0.6



[PATCH] clean up cpu hotplug code

2009-06-01 Thread Glauber Costa
There's nothing kvm specific in the get_cpu function. Remove it from
the kvm ifdef. This buys us cleaner code, and may help with any future
attempt at integrating this.

Signed-off-by: Glauber Costa glom...@redhat.com
---
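Note for readers, not part of the patch: a minimal sketch of the lookup that
the renamed helper performs and of the hot-add condition that uses it. The
hunk elides the body of qemu_get_cpu_env(), so the list walk below is an
assumption, and struct cpu_state with its two fields is a stand-in for QEMU's
CPUState rather than the real definition.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for QEMU's CPUState; only the fields this sketch needs. */
struct cpu_state {
	int cpu_index;
	struct cpu_state *next_cpu;
};

/* Walk the CPU list and return the entry with the given index, or NULL. */
struct cpu_state *get_cpu_env(struct cpu_state *first_cpu, int index)
{
	struct cpu_state *penv;

	for (penv = first_cpu; penv != NULL; penv = penv->next_cpu) {
		if (penv->cpu_index == index)
			return penv;
	}
	return NULL;
}

/* A CPU must be created only if it is requested and not yet present. */
int needs_hot_add(struct cpu_state *first_cpu, int cpu, int state)
{
	return state && !get_cpu_env(first_cpu, cpu);
}
```

Nothing in this lookup depends on kvm, which is the point of moving it out of
the USE_KVM ifdef.
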
 hw/acpi.c |   12 +++-
 1 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/hw/acpi.c b/hw/acpi.c
index f4062ac..7f23e4e 100644
--- a/hw/acpi.c
+++ b/hw/acpi.c
@@ -770,9 +770,7 @@ static void disable_processor(struct gpe_regs *g, int cpu)
 g->cpus_sts[cpu/8] &= ~(1 << (cpu%8));
 }
 
-#if defined(TARGET_I386) || defined(TARGET_X86_64)
-#ifdef USE_KVM
-static CPUState *qemu_kvm_cpu_env(int index)
+static CPUState *qemu_get_cpu_env(int index)
 {
 CPUState *penv;
 
@@ -786,18 +784,14 @@ static CPUState *qemu_kvm_cpu_env(int index)
 
 return NULL;
 }
-#endif
 
+#if defined(TARGET_I386) || defined(TARGET_X86_64)
 
 void qemu_system_cpu_hot_add(int cpu, int state)
 {
 CPUState *env;
 
-if (state
-#ifdef USE_KVM
- (!qemu_kvm_cpu_env(cpu))
-#endif
-) {
+if (state && (!qemu_get_cpu_env(cpu))) {
 env = pc_new_cpu(cpu, model, 1);
 if (!env) {
 fprintf(stderr, "cpu %d creation failed\n", cpu);
-- 
1.6.0.6


