Re: [PATCH] KVM test: Disable HPET on windows timedrift tests

2010-07-04 Thread Dor Laor

On 07/01/2010 07:05 PM, Lucas Meneghel Rodrigues wrote:

On Thu, 2010-07-01 at 17:42 +0300, Avi Kivity wrote:

On 06/30/2010 06:39 PM, Lucas Meneghel Rodrigues wrote:

By default, HPET is enabled on qemu and no time drift
mitigation is being made for it. So, add -no-hpet
if qemu supports it, during windows timedrift tests.




Hm, you're compensating for a qemu bug by not testing it.

Can we have an XFAIL for this test instead?


Certainly we can. In actuality, that's what's being done on our internal
autotest server - this particular test is linked to the upstream bug
https://bugs.launchpad.net/qemu/+bug/599958

We've discussed about this issue this morning, it boils down to the way
people are more comfortable with handling this issue. My first thought
was to disable HPET until someone come up with a time drift mitigation
strategy for it.

But your approach makes more sense, unless someone has something else to
say about it, I'll drop the patch from autotest shortly.


Actually we should do both - XFAIL when hpet is used and in addition 
(and even more importantly) test other clock sources by disabling hpet.




Lucas


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Alexander Graf

On 02.07.2010, at 21:10, Scott Wood wrote:

 On Fri, 2 Jul 2010 20:47:44 +0200
 Alexander Graf ag...@suse.de wrote:
 
 
 On 02.07.2010, at 19:59, Hollis Blanchard wrote:
 
 [Resending...]
 
 Please reconcile this with
 http://www.linux-kvm.org/page/PowerPC_Hypercall_ABI, which has been
 discussed in the (admittedly closed) Power.org embedded hypervisor
 working group. Bear in mind that other hypervisors are already
 implementing the documented ABI, so if you have concerns, you should
 probably raise them with that audience...
 
 We can not use sc with LV=1 because that would break the KVM in
 something else case which is KVM's strong point on PPC.
 
 The current proposal involves the hypervisor specifying the hcall opcode
 sequence in the device tree -- to allow either sc 1 or sc 0 plus
 magic GPR depending on whether you've got the hardware hypervisor
 feature (hereafter HHV).

Ah right, so you can still trap a hypercall with HHV. Makes sense.

 
 With HHV, sc 0 plus magic GPR just doesn't work, since it won't trap
 to the hypervisor.  sc 1 plus magic GPR might be problematic on some
 non-HHV implementations, especially if you *do* have HHV but the
 non-HHV hypervisor is running as an HHV guest.

Yes, that's why I need sc 0 plus magic GPR in r0 and r3 - to accomodate for all 
the non-HHV cases. And it would be clever to have a way to expose the same 
functionality when we do use the HHV features.

So, is that draft available anywhere? The wiki page Hollis pointed to is very 
vague.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Alexander Graf

On 04.07.2010, at 00:41, Benjamin Herrenschmidt wrote:

 On Fri, 2010-07-02 at 18:27 +0200, Segher Boessenkool wrote:
 +To find out if we're running on KVM or not, we overlay the PVR  
 register. Usually
 +the PVR register contains an id that identifies your CPU type. If,  
 however, you
 +pass KVM_PVR_PARA in the register that you want the PVR result in,  
 the register
 +still contains KVM_PVR_PARA after the mfpvr call.
 +
 +   LOAD_REG_IMM(r5, KVM_PVR_PARA)
 +   mfpvr   r5
 +   [r5 still contains KVM_PVR_PARA]
 
 I love this part :-)
 
 Me not :-)
 
 It should be in the device-tree instead, or something like that. Enough
 games with PVR...

My biggest concern about putting things in the device-tree is that I was trying 
to keep things as separate as possible. Why does the firmware have to know that 
it's running in KVM? Why do I have to patch 3 projects (Linux, OpenBIOS, Qemu) 
when I could go with patching a single one (Linux)?

Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH repost] sched: export sched_set/getaffinity to modules

2010-07-04 Thread Michael S. Tsirkin
On Fri, Jul 02, 2010 at 11:06:37PM +0200, Oleg Nesterov wrote:
 On 07/02, Peter Zijlstra wrote:
 
  On Fri, 2010-07-02 at 11:01 -0700, Sridhar Samudrala wrote:
  
Does  it (Tejun's kthread_clone() patch) also  inherit the
   cgroup of the caller?
 
  Of course, its a simple do_fork() which inherits everything just as you
  would expect from a similar sys_clone()/sys_fork() call.
 
 Yes. And I'm afraid it can inherit more than we want. IIUC, this is called
 from ioctl(), right?
 
 Then the new thread becomes the natural child of the caller, and it shares
 -mm with the parent. And files, dup_fd() without CLONE_FS.
 
 Signals. Say, if you send SIGKILL to this new thread, it can't sleep in
 TASK_INTERRUPTIBLE or KILLABLE after that. And this SIGKILL can be sent
 just because the parent gets SIGQUIT or abother coredumpable signal.
 Or the new thread can recieve SIGSTOP via ^Z.
 
 Perhaps this is OK, I do not know. Just to remind that kernel_thread()
 is merely clone(CLONE_VM).
 
 Oleg.


Right. Doing this might break things like flush.  The signal and exit
behaviour needs to be examined carefully. I am also unsure whether
using such threads might be more expensive than inheriting kthreadd.

-- 
MST
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Avi Kivity

On 07/04/2010 12:04 PM, Alexander Graf wrote:


My biggest concern about putting things in the device-tree is that I was trying 
to keep things as separate as possible. Why does the firmware have to know that 
it's running in KVM?


It doesn't need to know about kvm, it needs to know that a particular 
hypercall protocol is available.



Why do I have to patch 3 projects (Linux, OpenBIOS, Qemu) when I could go with 
patching a single one (Linux)?
   


That's not a valid argument.  You patch as many projects as it takes to 
get it right (not that I have an opinion in this particular discussion).


At the very least you have to patch qemu for reasons described before 
(backwards compatible live migration).


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Alexander Graf

On 04.07.2010, at 11:10, Avi Kivity wrote:

 On 07/04/2010 12:04 PM, Alexander Graf wrote:
 
 My biggest concern about putting things in the device-tree is that I was 
 trying to keep things as separate as possible. Why does the firmware have to 
 know that it's running in KVM?
 
 It doesn't need to know about kvm, it needs to know that a particular 
 hypercall protocol is available.

Considering how the parts of the draft that I read about sound like, that's not 
the inventor's idea. PPC people love to see the BIOS be part of the 
virtualization solution. I don't. That's the biggest difference here and reason 
for us going different directions.

I think what they thought of is something like

if (in_kvm()) {
  device_tree_put(/hypervisor/exit, EXIT_TYPE_MAGIC);
  device_tree_put(/hypervisor/exit_magic, EXIT_MAGIC);
}

which then the OS reads out. But that's useless, as the hypercalls are 
hypervisor specific. So why make the detection on the Linux side generic?

 
 Why do I have to patch 3 projects (Linux, OpenBIOS, Qemu) when I could go 
 with patching a single one (Linux)?
   
 
 That's not a valid argument.  You patch as many projects as it takes to get 
 it right (not that I have an opinion in this particular discussion).

If you can put code in Linux that touches 3 submaintainer's directories or one 
submaintainer's directory with both ending up being functionally equivalent, 
which way would you go?

 
 At the very least you have to patch qemu for reasons described before 
 (backwards compatible live migration).

There is no live migration on PPC (yet). That point is completely moot atm.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Alexander Graf

On 04.07.2010, at 11:17, Alexander Graf wrote:

 
 On 04.07.2010, at 11:10, Avi Kivity wrote:
 
 On 07/04/2010 12:04 PM, Alexander Graf wrote:
 
 My biggest concern about putting things in the device-tree is that I was 
 trying to keep things as separate as possible. Why does the firmware have 
 to know that it's running in KVM?
 
 It doesn't need to know about kvm, it needs to know that a particular 
 hypercall protocol is available.
 
 Considering how the parts of the draft that I read about sound like, that's 
 not the inventor's idea. PPC people love to see the BIOS be part of the 
 virtualization solution. I don't. That's the biggest difference here and 
 reason for us going different directions.
 
 I think what they thought of is something like
 
 if (in_kvm()) {
  device_tree_put(/hypervisor/exit, EXIT_TYPE_MAGIC);
  device_tree_put(/hypervisor/exit_magic, EXIT_MAGIC);
 }
 
 which then the OS reads out. But that's useless, as the hypercalls are 
 hypervisor specific. So why make the detection on the Linux side generic?

In fact, it's even worse. Right now with KVM for PPC we have 3 different ways 
of generating the device tree:

1) OpenBIOS (Mac emulation)
2) Qemu libfdt (BookE)
3) MOL OF implementation

So I'd have to touch even more projects. Just for the sake of splitting out 
something that belongs together anyway. And probably even create new interfaces 
just for that sake (qemu asking the kernel which type of hypercalls the vm 
should use) even though the guest could just query all that itself.

Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Avi Kivity

On 07/04/2010 12:17 PM, Alexander Graf wrote:

On 04.07.2010, at 11:10, Avi Kivity wrote:

   

On 07/04/2010 12:04 PM, Alexander Graf wrote:
 

My biggest concern about putting things in the device-tree is that I was trying 
to keep things as separate as possible. Why does the firmware have to know that 
it's running in KVM?
   

It doesn't need to know about kvm, it needs to know that a particular hypercall 
protocol is available.
 

Considering how the parts of the draft that I read about sound like, that's not 
the inventor's idea. PPC people love to see the BIOS be part of the 
virtualization solution. I don't. That's the biggest difference here and reason 
for us going different directions.
   


Regardless of which direction is correct, you need to go in one direction.


I think what they thought of is something like

if (in_kvm()) {
   device_tree_put(/hypervisor/exit, EXIT_TYPE_MAGIC);
   device_tree_put(/hypervisor/exit_magic, EXIT_MAGIC);
}

which then the OS reads out. But that's useless, as the hypercalls are 
hypervisor specific. So why make the detection on the Linux side generic?
   


Looks like the benefit is less magic in the detection code.  x86 has 
(more or less) standardized feature detection.  Is this an attempt to 
bring something similar to ppc-land?



Why do I have to patch 3 projects (Linux, OpenBIOS, Qemu) when I could go with 
patching a single one (Linux)?

   

That's not a valid argument.  You patch as many projects as it takes to get it 
right (not that I have an opinion in this particular discussion).
 

If you can put code in Linux that touches 3 submaintainer's directories or one 
submaintainer's directory with both ending up being functionally equivalent, 
which way would you go?
   


We would do the right thing.  Trivial examples include adding defines to 
include/asm/processor.h or include/asm/msr-index.h, more complicated 
ones are the topic for my talk in kvm forum 2010.


Yes, coordinating the acks and trees and merge windows is not as fun as 
coding.  Yes, it's even more difficult with separate trees.  No, that's 
not an excuse if we[1] determine that the right thing to do is the most 
complicated.


[1] we in this case are the powerpc Linux arch maintainers and/or 
whoever defines the hardware specification



At the very least you have to patch qemu for reasons described before 
(backwards compatible live migration).
 

There is no live migration on PPC (yet). That point is completely moot atm.
   


You still need to make that feature disableable from userspace.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Avi Kivity

On 07/04/2010 12:30 PM, Alexander Graf wrote:



Considering how the parts of the draft that I read about sound like, that's not 
the inventor's idea. PPC people love to see the BIOS be part of the 
virtualization solution. I don't. That's the biggest difference here and reason 
for us going different directions.

I think what they thought of is something like

if (in_kvm()) {
  device_tree_put(/hypervisor/exit, EXIT_TYPE_MAGIC);
  device_tree_put(/hypervisor/exit_magic, EXIT_MAGIC);
}

which then the OS reads out. But that's useless, as the hypercalls are 
hypervisor specific. So why make the detection on the Linux side generic?
 

In fact, it's even worse. Right now with KVM for PPC we have 3 different ways 
of generating the device tree:

1) OpenBIOS (Mac emulation)
2) Qemu libfdt (BookE)
3) MOL OF implementation
   


I sympathize.  But, if the arch says that's how you do things, then 
that's how you do things.



So I'd have to touch even more projects. Just for the sake of splitting out 
something that belongs together anyway. And probably even create new interfaces 
just for that sake (qemu asking the kernel which type of hypercalls the vm 
should use) even though the guest could just query all that itself.
   


qemu needs to be involved, in case one day you support more than one 
type of hypercalls (like x86 does with hyper-v) or if you want to live 
migrate from a host that has hypercall support to another host that has 
this feature removed (as has already happened on x86 with the pvmmu).


Planning for the future means a lot of boring interfaces.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/27] KVM: PPC: Magic Page Book3s support

2010-07-04 Thread Avi Kivity

On 07/02/2010 06:37 PM, Alexander Graf wrote:

Alexander Graf wrote:
   

We need to override EA as well as PA lookups for the magic page. When the guest
tells us to project it, the magic page overrides any guest mappings.

In order to reflect that, we need to hook into all the MMU layers of KVM to
force map the magic page if necessary.

Signed-off-by: Alexander Grafag...@suse.de

v1 -  v2:

   - RMO -  PAM
---
  arch/powerpc/kvm/book3s.c |7 +++
  arch/powerpc/kvm/book3s_32_mmu.c  |   16 
  arch/powerpc/kvm/book3s_32_mmu_host.c |   12 
  arch/powerpc/kvm/book3s_64_mmu.c  |   30 +-
  arch/powerpc/kvm/book3s_64_mmu_host.c |   12 
  5 files changed, 76 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 14db032..b22e608 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -554,6 +554,13 @@ mmio:

  static int kvmppc_visible_gfn(struct kvm_vcpu *vcpu, gfn_t gfn)
  {
+   ulong mp_pa = vcpu-arch.magic_page_pa;
+
+   if (unlikely(mp_pa)
+   unlikely((mp_pa  KVM_RMO)  PAGE_SHIFT == gfn)) {

 

This should be KVM_PAM :(. Should I respin the whole thing or could
whoever commits this just make that trivial change?

   


A respin followed by a bisectability test (compile each revision as it 
is applied), please.  Of course we need to resolve the detection issue 
first.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[KVM-AUTOTEST PATCH v4] [RFC] KVM test: rss.cpp: add file transfer support

2010-07-04 Thread Michael Goldish
Enable RSS to send/receive files and directory trees (recursively).

See protocol details in rss.cpp.

Changes from v3:
- Protocol change: instead of sending a file in one big packet, send it in
  multiple chunks.  The last chunk of a file is identified by being smaller.
  This solves a problem with transfers of files that happen to be growing or
  shrinking while being transferred.
- Change the way messages are displayed in the text box: instead of adding them
  immediately and redrawing the text box for each message, queue the messages
  in a buffer and flush it every 250ms or when it's full.  This makes it
  possible to log everything without using a significant amount of CPU time,
  even for transfers of thousands of small files.
- Change text box limit to 262144 chars.
- Allow resizing the main window.
- Reuse some code by putting it in an Accept() function.

Changes from v2:
- Use ports 10022 and 10023 by default instead of 22 and 23.

Changes from v1:
- Expand environment variables (e.g. %WinDir%) in all paths.
- Change text box limit to 16384 chars.
- When text box is full, clear 3/4 of old text instead of 1/2 (looks prettier).
- Use const char * instead of char * where appropriate.

Signed-off-by: Michael Goldish mgold...@redhat.com
---
 client/tests/kvm/deps/rss.cpp | 1449 -
 1 files changed, 990 insertions(+), 459 deletions(-)

diff --git a/client/tests/kvm/deps/rss.cpp b/client/tests/kvm/deps/rss.cpp
index 66d9a5b..26c5ed6 100644
--- a/client/tests/kvm/deps/rss.cpp
+++ b/client/tests/kvm/deps/rss.cpp
@@ -1,459 +1,990 @@
-// Simple remote shell server
-// Author: Michael Goldish mgold...@redhat.com
-// Much of the code here was adapted from Microsoft code samples.
-
-// Usage: rss.exe [port]
-// If no port is specified the default is 22.
-
-#define _WIN32_WINNT 0x0500
-
-#include windows.h
-#include winsock2.h
-#include stdio.h
-
-#pragma comment(lib, ws2_32.lib)
-
-int port = 22;
-
-HWND hMainWindow = NULL;
-HWND hTextBox = NULL;
-
-struct client_info {
-SOCKET socket;
-sockaddr_in addr;
-int pid;
-HWND hwnd;
-HANDLE hJob;
-HANDLE hChildOutputRead;
-HANDLE hThreadChildToSocket;
-};
-
-void ExitOnError(char *message, BOOL winsock = 0)
-{
-LPVOID system_message;
-char buffer[512];
-
-int error_code;
-if (winsock)
-error_code = WSAGetLastError();
-else
-error_code = GetLastError();
-
-WSACleanup();
-
-FormatMessage(
-FORMAT_MESSAGE_ALLOCATE_BUFFER|FORMAT_MESSAGE_FROM_SYSTEM,
-NULL, error_code, MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
-(LPTSTR)system_message, 0, NULL);
-
-sprintf(buffer,
-%s!\n
-Error code = %d\n
-Error message = %s,
-message, error_code, (char *)system_message);
-
-MessageBox(hMainWindow, buffer, Error, MB_OK | MB_ICONERROR);
-
-LocalFree(system_message);
-ExitProcess(1);
-}
-
-void AppendMessage(char *message)
-{
-int length = GetWindowTextLength(hTextBox);
-SendMessage(hTextBox, EM_SETSEL, (WPARAM)length, (LPARAM)length);
-SendMessage(hTextBox, EM_REPLACESEL, (WPARAM)FALSE, (LPARAM)message);
-}
-
-void FormatStringForPrinting(char *dst, char *src, int size)
-{
-int j = 0;
-
-for (int i = 0; i  size  src[i]; i++) {
-if (src[i] == '\n') {
-dst[j++] = '\\';
-dst[j++] = 'n';
-} else if (src[i] == '\r') {
-dst[j++] = '\\';
-dst[j++] = 'r';
-} else if (src[i] == '\t') {
-dst[j++] = '\\';
-dst[j++] = 't';
-} else if (src[i] == '\\') {
-dst[j++] = '\\';
-dst[j++] = '\\';
-} else dst[j++] = src[i];
-}
-dst[j] = 0;
-}
-
-char* GetClientIPAddress(client_info *ci)
-{
-char *address = inet_ntoa(ci-addr.sin_addr);
-if (address)
-return address;
-else
-return unknown;
-}
-
-DWORD WINAPI ChildToSocket(LPVOID client_info_ptr)
-{
-char buffer[1024], message[1024];
-client_info ci;
-DWORD bytes_read;
-int bytes_sent;
-
-memcpy(ci, client_info_ptr, sizeof(ci));
-
-while (1) {
-// Read data from the child's STDOUT/STDERR pipes
-if (!ReadFile(ci.hChildOutputRead,
-  buffer, sizeof(buffer),
-  bytes_read, NULL) || !bytes_read) {
-if (GetLastError() == ERROR_BROKEN_PIPE)
-break; // Pipe done -- normal exit path
-else
-ExitOnError(ReadFile failed); // Something bad happened
-}
-// Send data to the client
-bytes_sent = send(ci.socket, buffer, bytes_read, 0);
-/*
-// Make sure all the data was sent
-if (bytes_sent != bytes_read) {
-sprintf(message,
-ChildToSocket: bytes read (%d) != bytes sent (%d),
-bytes_read, bytes_sent);
-ExitOnError(message, 1);
-}
-*/
- 

[KVM-AUTOTEST PATCH v4] [RFC] KVM test: add python client for rss file transfer services

2010-07-04 Thread Michael Goldish
See details in docstrings in rss_file_transfer.py.
See protocol details in deps/rss.cpp.

Changes from v3:
- Protocol change: instead of sending a file as one big packet, send it in
  multiple chunks.  See details in deps/rss.cpp.

Changes from v2:
- Raise FileTransferNotFoundError if no files/dirs are transferred (due to
  a bad path or wildcard pattern)
- Make all connection related errors in the base class raise
  FileTransferConnectError

Changes from v1:
- Use glob() instead of iglob() (Python 2.4 doesn't like iglob())
- Change a few comments

Signed-off-by: Michael Goldish mgold...@redhat.com
---
 client/tests/kvm/rss_file_transfer.py |  431 +
 1 files changed, 431 insertions(+), 0 deletions(-)
 create mode 100755 client/tests/kvm/rss_file_transfer.py

diff --git a/client/tests/kvm/rss_file_transfer.py 
b/client/tests/kvm/rss_file_transfer.py
new file mode 100755
index 000..c584397
--- /dev/null
+++ b/client/tests/kvm/rss_file_transfer.py
@@ -0,0 +1,431 @@
+#!/usr/bin/python
+
+Client for file transfer services offered by RSS (Remote Shell Server).
+
+...@author: Michael Goldish (mgold...@redhat.com)
+...@copyright: 2008-2010 Red Hat Inc.
+
+
+import socket, struct, time, sys, os, glob
+
+# Globals
+CHUNKSIZE = 65536
+
+# Protocol message constants
+RSS_MAGIC   = 0x525353
+RSS_OK  = 1
+RSS_ERROR   = 2
+RSS_UPLOAD  = 3
+RSS_DOWNLOAD= 4
+RSS_SET_PATH= 5
+RSS_CREATE_FILE = 6
+RSS_CREATE_DIR  = 7
+RSS_LEAVE_DIR   = 8
+RSS_DONE= 9
+
+# See rss.cpp for protocol details.
+
+
+class FileTransferError(Exception):
+pass
+
+
+class FileTransferConnectError(FileTransferError):
+pass
+
+
+class FileTransferTimeoutError(FileTransferError):
+pass
+
+
+class FileTransferProtocolError(FileTransferError):
+pass
+
+
+class FileTransferSendError(FileTransferError):
+pass
+
+
+class FileTransferServerError(FileTransferError):
+pass
+
+
+class FileTransferNotFoundError(FileTransferError):
+pass
+
+
+class FileTransferClient(object):
+
+Connect to a RSS (remote shell server) and transfer files.
+
+
+def __init__(self, address, port, timeout=10):
+
+Connect to a server.
+
+@param address: The server's address
+@param port: The server's port
+@param timeout: Time duration to wait for connection to succeed
+@raise FileTransferConnectError: Raised if the connection fails
+@raise FileTransferProtocolError: Raised if an incorrect magic number
+is received
+
+self._socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+self._socket.settimeout(timeout)
+try:
+self._socket.connect((address, port))
+except socket.error:
+raise FileTransferConnectError(Could not connect to server)
+try:
+if self._receive_msg(timeout) != RSS_MAGIC:
+raise FileTransferConnectError(Received wrong magic number)
+except FileTransferTimeoutError:
+raise FileTransferConnectError(Timeout expired while waiting to 
+   receive magic number)
+self._send(struct.pack(=i, CHUNKSIZE))
+
+
+def __del__(self):
+self.close()
+
+
+def close(self):
+
+Close the connection.
+
+self._socket.close()
+
+
+def _send(self, str):
+try:
+self._socket.sendall(str)
+except socket.error:
+raise FileTransferSendError(Could not send data to server)
+
+
+def _receive(self, size, timeout=10):
+strs = []
+end_time = time.time() + timeout
+while size  0:
+try:
+self._socket.settimeout(max(0.0001, end_time - time.time()))
+data = self._socket.recv(size)
+except socket.timeout:
+raise FileTransferTimeoutError(Timeout expired while 
+   receiving data from server)
+except socket.error:
+raise FileTransferProtocolError(Error receiving data from 
+server)
+if not data:
+raise FileTransferProtocolError(Connection closed 
+unexpectedly)
+strs.append(data)
+size -= len(data)
+return .join(strs)
+
+
+def _send_packet(self, str):
+self._send(struct.pack(=I, len(str)))
+self._send(str)
+
+
+def _receive_packet(self, timeout=10):
+size = struct.unpack(=I, self._receive(4))[0]
+return self._receive(size, timeout)
+
+
+def _send_file_chunks(self, filename, timeout=30):
+f = open(filename, rb)
+try:
+end_time = time.time() + timeout
+while time.time()  end_time:
+data = f.read(CHUNKSIZE)
+ 

[KVM-AUTOTEST PATCH 1/3] KVM test: kvm_utils.py: add utility function which locates program files

2010-07-04 Thread Michael Goldish
find_command() is similar to os_dep.command(), but instead of searching $PATH,
it searches a few hard coded locations.  The problem with searching
$PATH is that different users may have different $PATHs.  For example, autotest
can be executed by a non-root user, or using sudo, and in some OSes this will
result in /sbin not being included in $PATH.

Signed-off-by: Michael Goldish mgold...@redhat.com
---
 client/tests/kvm/kvm_utils.py |9 +
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/client/tests/kvm/kvm_utils.py b/client/tests/kvm/kvm_utils.py
index 0372565..a57a334 100644
--- a/client/tests/kvm/kvm_utils.py
+++ b/client/tests/kvm/kvm_utils.py
@@ -322,6 +322,15 @@ def env_unregister_vm(env, name):
 
 # Utility functions for dealing with external processes
 
+def find_command(cmd):
+for dir in [/usr/local/sbin, /usr/local/bin,
+/usr/sbin, /usr/bin, /sbin, /bin]:
+file = os.path.join(dir, cmd)
+if os.path.exists(file):
+return file
+raise ValueError('Missing command: %s' % cmd)
+
+
 def pid_exists(pid):
 
 Return True if a given PID exists.
-- 
1.5.4.1

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[KVM-AUTOTEST PATCH 2/3] KVM test: use kvm_utils.find_command() where appropriate

2010-07-04 Thread Michael Goldish
Instead of hardcoding binary paths, use kvm_utils.find_command().
This should make the KVM test a little more distro independent.

Signed-off-by: Michael Goldish mgold...@redhat.com
---
 client/tests/kvm/kvm_preprocessing.py |4 ++--
 client/tests/kvm/kvm_utils.py |7 ---
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/client/tests/kvm/kvm_preprocessing.py 
b/client/tests/kvm/kvm_preprocessing.py
index ee279bd..2f6994a 100644
--- a/client/tests/kvm/kvm_preprocessing.py
+++ b/client/tests/kvm/kvm_preprocessing.py
@@ -208,8 +208,8 @@ def preprocess(test, params, env):
 env[tcpdump].close()
 del env[tcpdump]
 if tcpdump not in env and params.get(run_tcpdump, yes) == yes:
-command = /usr/sbin/tcpdump -npvi any 'dst port 68'
-logging.debug(Starting tcpdump (%s)..., command)
+cmd = %s -npvi any 'dst port 68' % kvm_utils.find_command(tcpdump)
+logging.debug(Starting tcpdump (%s)..., cmd)
 env[tcpdump] = kvm_subprocess.kvm_tail(
 command=command,
 output_func=_update_address_cache,
diff --git a/client/tests/kvm/kvm_utils.py b/client/tests/kvm/kvm_utils.py
index a57a334..4183f1c 100644
--- a/client/tests/kvm/kvm_utils.py
+++ b/client/tests/kvm/kvm_utils.py
@@ -250,19 +250,20 @@ def verify_ip_address_ownership(ip, macs, timeout=10.0):
 regex = re.compile(r\b%s\b.*\b(%s)\b % (ip, mac_regex), re.IGNORECASE)
 
 # Check the ARP cache
-o = commands.getoutput(/sbin/arp -n)
+o = commands.getoutput(%s -n % find_command(arp))
 if regex.search(o):
 return True
 
 # Get the name of the bridge device for arping
-o = commands.getoutput(/sbin/ip route get %s % ip)
+o = commands.getoutput(%s route get %s % (find_command(ip), ip))
 dev = re.findall(dev\s+\S+, o, re.IGNORECASE)
 if not dev:
 return False
 dev = dev[0].split()[-1]
 
 # Send an ARP request
-o = commands.getoutput(/sbin/arping -f -c 3 -I %s %s % (dev, ip))
+o = commands.getoutput(%s -f -c 3 -I %s %s %
+   (find_command(arping), dev, ip))
 return bool(regex.search(o))
 
 
-- 
1.5.4.1

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[KVM-AUTOTEST PATCH 3/3] KVM test: kvm_subprocess: allow garbage collection of kvm_tail instances

2010-07-04 Thread Michael Goldish
Tail threads refer to kvm_tail objects, preventing them from being garbage-
collected.

1) Before a tail thread exits, remove the reference to the thread from the
   kvm_tail object.
2) Add a function kill_tail_threads() which asks all tail threads to terminate
   and waits for them to do so.
3) Use kill_tail_threads() instead of VM.kill_tail_thread() (which was there
   for a different reason) in kvm_preprocessing.py.

Signed-off-by: Michael Goldish mgold...@redhat.com
---
 client/tests/kvm/kvm_preprocessing.py |5 +-
 client/tests/kvm/kvm_subprocess.py|  121 ++---
 client/tests/kvm/kvm_vm.py|8 --
 3 files changed, 69 insertions(+), 65 deletions(-)

diff --git a/client/tests/kvm/kvm_preprocessing.py 
b/client/tests/kvm/kvm_preprocessing.py
index 2f6994a..123928e 100644
--- a/client/tests/kvm/kvm_preprocessing.py
+++ b/client/tests/kvm/kvm_preprocessing.py
@@ -344,9 +344,8 @@ def postprocess(test, params, env):
 else:
 vm.destroy(gracefully=False)
 
-# Kill the tailing threads of all VMs
-for vm in kvm_utils.env_get_all_vms(env):
-vm.kill_tail_thread()
+# Kill all kvm_subprocess tail threads
+kvm_subprocess.kill_tail_threads()
 
 # Terminate tcpdump if no VMs are alive
 living_vms = [vm for vm in kvm_utils.env_get_all_vms(env) if vm.is_alive()]
diff --git a/client/tests/kvm/kvm_subprocess.py 
b/client/tests/kvm/kvm_subprocess.py
index 93a8429..f815069 100755
--- a/client/tests/kvm/kvm_subprocess.py
+++ b/client/tests/kvm/kvm_subprocess.py
@@ -548,6 +548,21 @@ class kvm_spawn:
 self.send(str + self.linesep)
 
 
+_thread_kill_requested = False
+
+def kill_tail_threads():
+
+Kill all kvm_tail threads.
+
+After calling this function no new threads should be started.
+
+global _thread_kill_requested
+_thread_kill_requested = True
+for t in threading.enumerate():
+if hasattr(t, name) and t.name.startswith(tail_thread):
+t.join(10)
+
+
 class kvm_tail(kvm_spawn):
 
 This class runs a child process in the background and sends its output in
@@ -608,7 +623,6 @@ class kvm_tail(kvm_spawn):
 
 # Start the thread in the background
 self.tail_thread = None
-self.__thread_kill_requested = False
 if termination_func or output_func:
 self._start_thread()
 
@@ -675,15 +689,6 @@ class kvm_tail(kvm_spawn):
 self.output_prefix = output_prefix
 
 
-def kill_tail_thread(self):
-
-Stop the tailing thread which calls output_func() and
-termination_func().
-
-self.__thread_kill_requested = True
-self._join_thread()
-
-
 def _tail(self):
 def print_line(text):
 # Pre-pend prefix and remove trailing whitespace
@@ -695,60 +700,68 @@ class kvm_tail(kvm_spawn):
 except TypeError:
 pass
 
-fd = self._get_fd(tail)
-buffer = 
-while True:
-if self.__thread_kill_requested:
+try:
+fd = self._get_fd(tail)
+buffer = 
+while True:
+global _thread_kill_requested
+if _thread_kill_requested:
+return
+try:
+# See if there's any data to read from the pipe
+r, w, x = select.select([fd], [], [], 0.05)
+except:
+break
+if fd in r:
+# Some data is available; read it
+new_data = os.read(fd, 1024)
+if not new_data:
+break
+buffer += new_data
+# Send the output to output_func line by line
+# (except for the last line)
+if self.output_func:
+lines = buffer.split(\n)
+for line in lines[:-1]:
+print_line(line)
+# Leave only the last line
+last_newline_index = buffer.rfind(\n)
+buffer = buffer[last_newline_index+1:]
+else:
+# No output is available right now; flush the buffer
+if buffer:
+print_line(buffer)
+buffer = 
+# The process terminated; print any remaining output
+if buffer:
+print_line(buffer)
+# Get the exit status, print it and send it to termination_func
+status = self.get_status()
+if status is None:
 return
+print_line((Process terminated with status %s) % status)
 try:
-# See if there's any data to read from the pipe
-r, w, x = select.select([fd], [], [], 0.05)
-except:
-break
-if fd in r:
-   

Re: [PATCH v4 5/6] KVM: MMU: combine guest pte read between walk and pte prefetch

2010-07-04 Thread Avi Kivity

On 07/03/2010 04:03 PM, Xiao Guangrong wrote:


Avi Kivity wrote:

   


Well, in the description, it looked like everything was using small
pages (in kvm, level=1 means PTE level, we need to change this one
day).  Please describe again and say exactly when the guest or host
uses large pages.


   

Oh, I see what you mean.

Regarding the patch, is it possible just to move the check before,
instead of adding the 'check' variable?

 

Umm, if we move the check before the judgment, it'll check every level,
actually, only the opened mapping and the laster level need checked, so
for the performance reason, maybe it's better to keep two check-point.
   


What exactly are the conditions when you want to check?

Perhaps we do need to check every level.  A write to a PDE (or higher 
level) will clear the corresponding spte, but a fault immediately 
afterwards can re-establish the spte to point to the new page.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 5/6] KVM: MMU: combine guest pte read between walk and pte prefetch

2010-07-04 Thread Avi Kivity

On 07/03/2010 03:57 PM, Xiao Guangrong wrote:


Avi Kivity wrote:

   

And this check is not sufficient, since it's only checked if the
mapping is zapped or not exist, for other words only when broken this
judgment:
 is_shadow_present_pte(*sptep)   !is_large_pte(*sptep)

but if the middle level is present and it's not the large mapping,
this check is skipped.

   


Well, in the description, it looked like everything was using small
pages (in kvm, level=1 means PTE level, we need to change this one
day).  Please describe again and say exactly when the guest or host uses
large pages.

 

Avi, sorry for my poor English, i not mean everything was using small, i don't
know where cause you confused :-(
   


Well, LVL1 got me to assume those are small pages.

Can you explain more precisely?  Use the processor terms like PTE and 
PDE.  Don't worry about your English :)


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Hot resizing of partitions on HVM

2010-07-04 Thread Daniel Bareiro
Hi all!

Is it possible to resize the partition of a KVM virtual machine without
it is down?

With Xen PV virtual machines, I get this by umounting the filesystem on
the VM, detaching the device and then resizing the logical volume and
file system in the VMHost. Finally, I attach the device and mount the
filesystem on virtual machine. All this without the VM is down.

The procedure that I am using at the moment to resizing a partition in a
KVM virtual machine is the following one:

* Shutdown the VM.
* Resize the logical volume which is the partition to resize.
* Boot a KVM VM with PartedMagic [1] and to extend the partition until
  it has the wished size using GParted.
* Shutdown the KVM VM with PartedMagic.
* Boot the VM with the new partition size.


There is some more efficient way to do it?


Thanks in advance for your replies.

Regards,
Daniel

[1] http://partedmagic.com/
-- 
Fingerprint: BFB3 08D6 B4D1 31B2 72B9  29CE 6696 BF1B 14E6 1D37
Powered by Debian GNU/Linux Lenny - Linux user #188.598


signature.asc
Description: Digital signature


Re: [PATCH] KVM: VMX: fix tlb flush with invalid root

2010-07-04 Thread Sheng Yang
On Saturday 03 July 2010 16:02:42 Xiao Guangrong wrote:
 Commit 341d9b535b6c simplify reload logic while entry guest mode, it
 can avoid unnecessary sync-root if KVM_REQ_MMU_RELOAD and
 KVM_REQ_MMU_SYNC both set.

Which commit you are talking about? Can't find 341d9b535b6c...

--
regards
Yang, Sheng

 
 But, it cause a issue that when we handle 'KVM_REQ_TLB_FLUSH', the
 root is invalid, it is triggered during my test:
 
 Kernel BUG at a00212b8 [verbose debug info unavailable]
 ..
 
  [810f5caf] ? fget_light+0x111/0x28e
  [81103963] sys_ioctl+0x47/0x6a
  [81002c1b] system_call_fastpath+0x16/0x1b
 Code: f0 eb 21 f7 c2 00 00 00 04 74 22 48 8d 45 f0 48 c7 45 f0 00 00 00 00
 48 c7 45 f8 00 00 00 00 b9 02 00 00 00 66 0f 38 80 08 77 02 0f 0b c9 c3
 55 48 89 e5 0f 1f 44 00 00 ba 00 68 00 00 48 8b 8f RIP 
 [a00212b8] vmx_flush_tlb+0xdf/0xe3 [kvm_intel]
  RSP 8800be269ca8
 
 Fixed by directly return if the root is not ready.
 
 Signed-off-by: Xiao Guangrong xiaoguangr...@cn.fujitsu.com
 ---
  arch/x86/include/asm/kvm_host.h |2 ++
  arch/x86/kvm/mmu.c  |2 --
  arch/x86/kvm/vmx.c  |5 -
  3 files changed, 6 insertions(+), 3 deletions(-)
 
 diff --git a/arch/x86/include/asm/kvm_host.h
 b/arch/x86/include/asm/kvm_host.h index 2bda624..8f522ec 100644
 --- a/arch/x86/include/asm/kvm_host.h
 +++ b/arch/x86/include/asm/kvm_host.h
 @@ -40,6 +40,8 @@
 0xFF00ULL)
 
  #define INVALID_PAGE (~(hpa_t)0)
 +#define VALID_PAGE(x) ((x) != INVALID_PAGE)
 +
  #define UNMAPPED_GVA (~(gpa_t)0)
 
  /* KVM Hugepage definitions for x86 */
 diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
 index a0c5c31..399ddb0 100644
 --- a/arch/x86/kvm/mmu.c
 +++ b/arch/x86/kvm/mmu.c
 @@ -92,8 +92,6 @@ module_param(oos_shadow, bool, 0644);
  #define PT_FIRST_AVAIL_BITS_SHIFT 9
  #define PT64_SECOND_AVAIL_BITS_SHIFT 52
 
 -#define VALID_PAGE(x) ((x) != INVALID_PAGE)
 -
  #define PT64_LEVEL_BITS 9
 
  #define PT64_LEVEL_SHIFT(level) \
 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index 806ab12..ebaaeaf 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -1831,8 +1831,11 @@ static void exit_lmode(struct kvm_vcpu *vcpu)
  static void vmx_flush_tlb(struct kvm_vcpu *vcpu)
  {
   vpid_sync_context(to_vmx(vcpu));
 - if (enable_ept)
 + if (enable_ept) {
 + if (!VALID_PAGE(vcpu-arch.mmu.root_hpa))
 + return;
   ept_sync_context(construct_eptp(vcpu-arch.mmu.root_hpa));
 + }
  }
 
  static void vmx_decache_cr0_guest_bits(struct kvm_vcpu *vcpu)
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KVM: VMX: fix tlb flush with invalid root

2010-07-04 Thread Marcelo Tosatti
On Sat, Jul 03, 2010 at 04:02:42PM +0800, Xiao Guangrong wrote:
 Commit 341d9b535b6c simplify reload logic while entry guest mode, it
 can avoid unnecessary sync-root if KVM_REQ_MMU_RELOAD and
 KVM_REQ_MMU_SYNC both set.
 
 But, it cause a issue that when we handle 'KVM_REQ_TLB_FLUSH', the
 root is invalid, it is triggered during my test:
 
 Kernel BUG at a00212b8 [verbose debug info unavailable]
 ..
 
  [810f5caf] ? fget_light+0x111/0x28e
  [81103963] sys_ioctl+0x47/0x6a
  [81002c1b] system_call_fastpath+0x16/0x1b
 Code: f0 eb 21 f7 c2 00 00 00 04 74 22 48 8d 45 f0 48 c7 45 f0 00 00 00 00 48 
 c7 45 f8 00 00 00 00 b9 02 00 00 00 66 0f 38 80 08 77 02 0f 0b c9 c3 55 48 
 89 e5 0f 1f 44 00 00 ba 00 68 00 00 48 8b 8f 
 RIP  [a00212b8] vmx_flush_tlb+0xdf/0xe3 [kvm_intel]
  RSP 8800be269ca8
 
 Fixed by directly return if the root is not ready.
 
 Signed-off-by: Xiao Guangrong xiaoguangr...@cn.fujitsu.com

Applied, thanks.


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 5/6] KVM: MMU: combine guest pte read between walk and pte prefetch

2010-07-04 Thread Xiao Guangrong


Avi Kivity wrote:

 Umm, if we move the check before the judgment, it'll check every level,
 actually, only the opened mapping and the laster level need checked, so
 for the performance reason, maybe it's better to keep two check-point.

 
 What exactly are the conditions when you want to check?
 
 Perhaps we do need to check every level.  A write to a PDE (or higher
 level) will clear the corresponding spte, but a fault immediately
 afterwards can re-establish the spte to point to the new page.
 

Looks into the code more carefully, maybe this code is wrong:


 if (!direct) {
 r = kvm_read_guest_atomic(vcpu-kvm,
-  gw-pte_gpa[level - 2],
+  gw-pte_gpa[level - 1],
  curr_pte, sizeof(curr_pte));
-if (r || curr_pte != gw-ptes[level - 2]) {
+if (r || curr_pte != gw-ptes[level - 1]) {
kvm_mmu_put_page(sp, sptep);
kvm_release_pfn_clean(pfn);
sptep = NULL;

It should check the 'level' mapping not 'level - 1', in the later description
i'll explain it.

Since the 'walk_addr' functions is out of mmu_lock protection, we should
handle the guest modify its mapping between 'walk_addr' and 'fetch'.
Now, let's consider every case may be happened while handle guest #PF.

One case is host handle guest written after 'fetch', just like this order:

VCPU 0  VCPU 1
walk_addr
guest modify its mapping
fetch
host handle this written(pte_write or invlpg)

This case is not broken anything, even if 'fetch' setup the wrong mapping, the
later written handler will fix it.

Another case is host handle guest written before 'fetch', like this:

CPU 0   VCPU 1
walk_addr
guest modify its mapping
host handle this written(pte_write or invlpg)
fetch

We should notice this case, since the later 'fetch' will setup the wrong 
mapping.

For example, the guest mapping which 'walk_addr' got is:

GPML4E - GPDPE - GPDE - GPTE - GFNA
(Just take small page for example, other mapping way is also applied)

And, for good to describe, we abstract 'fetch''s work:

for_each_shadow_entry(vcpu, addr, iterator) {
if (iterator.level == hlevel)
Mapping the later level

if (is_shadow_present_pte(*sptep) 
!is_large_pte(*sptep)) -- Point A
continue;

/* handle the non-present/wrong  middle level */

find/create the corresponding sp- Point B

if (the guest mapping is modified)  - Point C
break;
setup this level's mapping  - Point D

}

[
 Note: the later level means PTE, PDE if 2M page size, PDPE if 1G page size
 the middle level means PDE, PDPE if not using large page size / PML4E
]

There are two cases:

1: Guest modify the middle level, for example, guest modify the GPDE.
   a: the GPDE has corresponding sp entry, after VCPU1 handle this written,
  the corresponding host mapping is like this:
  HPML4E - HPDPE - HPDE
  [ HPDE.P = 0 since VCPU1 written handler zapped it in pte_wirte ]

  Under this case, it can broke Point A's judgment and Point C can detect
  the guest mapping is modified, so it exits this function without setup the
  mapping.

  And, we should check the guest mapping at GPDE not GPTE since it's GPDE
  modified not GPTE, it's the explanation for the front fix.

  b: the GPDE not has corresponding sp entry(the area is firstly accessed),
 corresponding host mapping is like this:
 HPML4E - HPDPE
 [ HPDPE.P = 0]

 under this case, VCPU1 happily write GPDE without #PF since the GPDE not 
has shadow
 page, it's not write-protected.

 while we handle HPDPE, we will create the shadow page for GPDE
 at Point B, then the host mapping is like this:
 HPML4E - HPDPE - HPDE
 [ HPDE.P = 0 ]
 it's the same as 1.a, so do the same work as 1.a

 Note: there is a trick: we should put 'Point C' behind of 'Point B', since 
after we
   create sp for GPDE, the later modify GPDE will cause #PF, it can 
ensure later
   check is valid

2: Guest modify the later level.
   Form 'fetch''s abstract, we can see the late level is not checked by 'Point 
C', if
   guest modified the GPTE's mapping, the wrong-mapping will be setup by 
'fetch', this
   is just this path does




 
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KVM: VMX: fix tlb flush with invalid root

2010-07-04 Thread Xiao Guangrong


Sheng Yang wrote:
 On Saturday 03 July 2010 16:02:42 Xiao Guangrong wrote:
 Commit 341d9b535b6c simplify reload logic while entry guest mode, it
 can avoid unnecessary sync-root if KVM_REQ_MMU_RELOAD and
 KVM_REQ_MMU_SYNC both set.
 
 Which commit you are talking about? Can't find 341d9b535b6c...
 

Please checkout to the 'next' branch.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] KVM: IOAPIC: only access APIC registers one dword at a time

2010-07-04 Thread Xiao Guangrong


Avi Kivity wrote:
 On 07/02/2010 11:00 AM, Xiao Guangrong wrote:
 The IOAPIC spec says:

 When accessing these registers, accesses must be done one dword at a
 time.
 For example, software should never access byte 2 from the Data
 register before
 accessing bytes 0 and 1. The hardware will not attempt to recover from
 a bad
 programming model in this case.

 So, this patch removes other width access


 
 The ioapic code also implements the ia64 iosapic.  I'm guessing that
 does support 64-bit accesses.  Please check the iosapic documentation.
 

The iosapic also using 32-bit to access registers:

All registers are accessed using 32-bit uncacheable loads and stores to a 
reserved memory location
in system memory. This implies that to modify a field (e.g., a bit or a byte) 
in any register, the
whole 32-bit register must be read, the field modified, and the 32 bits written 
back. Partial register
access, or non-aligned register access, are implementation-defined by the I/O 
xAPIC and will not
be compatible across different implementations. Also, registers that are 
described as 64 bits wide
are accessed as multiple independent 32-bit registers.

[ From  IntelĀ® ItaniumĀ® Processor Family Interrupt Architecture Guide , 
P2-6 ]

 There might be guests that use incorrect access despite the
 documentation; if real hardware supports it, it should work.  So we need
 to start with just a warning, and allow the access.  Later we can drop
 the invalid access.

If the OS contravene the spec, i thinks it's the OS's bug, also, i have tested 
some versions
windows/linux guests, it's no broken, can we directly drop the other wide 
access?

 
 @@ -288,6 +288,11 @@ static int ioapic_mmio_read(struct kvm_io_device
 *this, gpa_t addr, int len,
   ioapic_debug(addr %lx\n, (unsigned long)addr);
   ASSERT(!(addr  0xf));/* check alignment */

 +if (len != 4) {
 +printk(KERN_WARNING ioapic: wrong length %d\n, len);
 +return 0;
 +}
 +

 
 Guest triggered, so needs to be rate limited.

Yeah, will using printk_ratelimit cooperate with it.


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Alexander Graf

On 02.07.2010, at 21:10, Scott Wood wrote:

 On Fri, 2 Jul 2010 20:47:44 +0200
 Alexander Graf ag...@suse.de wrote:
 
 
 On 02.07.2010, at 19:59, Hollis Blanchard wrote:
 
 [Resending...]
 
 Please reconcile this with
 http://www.linux-kvm.org/page/PowerPC_Hypercall_ABI, which has been
 discussed in the (admittedly closed) Power.org embedded hypervisor
 working group. Bear in mind that other hypervisors are already
 implementing the documented ABI, so if you have concerns, you should
 probably raise them with that audience...
 
 We can not use sc with LV=1 because that would break the KVM in
 something else case which is KVM's strong point on PPC.
 
 The current proposal involves the hypervisor specifying the hcall opcode
 sequence in the device tree -- to allow either sc 1 or sc 0 plus
 magic GPR depending on whether you've got the hardware hypervisor
 feature (hereafter HHV).

Ah right, so you can still trap a hypercall with HHV. Makes sense.

 
 With HHV, sc 0 plus magic GPR just doesn't work, since it won't trap
 to the hypervisor.  sc 1 plus magic GPR might be problematic on some
 non-HHV implementations, especially if you *do* have HHV but the
 non-HHV hypervisor is running as an HHV guest.

Yes, that's why I need sc 0 plus magic GPR in r0 and r3 - to accomodate for all 
the non-HHV cases. And it would be clever to have a way to expose the same 
functionality when we do use the HHV features.

So, is that draft available anywhere? The wiki page Hollis pointed to is very 
vague.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Alexander Graf

On 04.07.2010, at 00:41, Benjamin Herrenschmidt wrote:

 On Fri, 2010-07-02 at 18:27 +0200, Segher Boessenkool wrote:
 +To find out if we're running on KVM or not, we overlay the PVR  
 register. Usually
 +the PVR register contains an id that identifies your CPU type. If,  
 however, you
 +pass KVM_PVR_PARA in the register that you want the PVR result in,  
 the register
 +still contains KVM_PVR_PARA after the mfpvr call.
 +
 +   LOAD_REG_IMM(r5, KVM_PVR_PARA)
 +   mfpvr   r5
 +   [r5 still contains KVM_PVR_PARA]
 
 I love this part :-)
 
 Me not :-)
 
 It should be in the device-tree instead, or something like that. Enough
 games with PVR...

My biggest concern about putting things in the device-tree is that I was trying 
to keep things as separate as possible. Why does the firmware have to know that 
it's running in KVM? Why do I have to patch 3 projects (Linux, OpenBIOS, Qemu) 
when I could go with patching a single one (Linux)?

Alex

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Alexander Graf

On 04.07.2010, at 11:17, Alexander Graf wrote:

 
 On 04.07.2010, at 11:10, Avi Kivity wrote:
 
 On 07/04/2010 12:04 PM, Alexander Graf wrote:
 
 My biggest concern about putting things in the device-tree is that I was 
 trying to keep things as separate as possible. Why does the firmware have 
 to know that it's running in KVM?
 
 It doesn't need to know about kvm, it needs to know that a particular 
 hypercall protocol is available.
 
 Considering how the parts of the draft that I read about sound like, that's 
 not the inventor's idea. PPC people love to see the BIOS be part of the 
 virtualization solution. I don't. That's the biggest difference here and 
 reason for us going different directions.
 
 I think what they thought of is something like
 
 if (in_kvm()) {
  device_tree_put(/hypervisor/exit, EXIT_TYPE_MAGIC);
  device_tree_put(/hypervisor/exit_magic, EXIT_MAGIC);
 }
 
 which then the OS reads out. But that's useless, as the hypercalls are 
 hypervisor specific. So why make the detection on the Linux side generic?

In fact, it's even worse. Right now with KVM for PPC we have 3 different ways 
of generating the device tree:

1) OpenBIOS (Mac emulation)
2) Qemu libfdt (BookE)
3) MOL OF implementation

So I'd have to touch even more projects. Just for the sake of splitting out 
something that belongs together anyway. And probably even create new interfaces 
just for that sake (qemu asking the kernel which type of hypercalls the vm 
should use) even though the guest could just query all that itself.

Alex

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Avi Kivity

On 07/04/2010 12:17 PM, Alexander Graf wrote:

On 04.07.2010, at 11:10, Avi Kivity wrote:

   

On 07/04/2010 12:04 PM, Alexander Graf wrote:
 

My biggest concern about putting things in the device-tree is that I was trying 
to keep things as separate as possible. Why does the firmware have to know that 
it's running in KVM?
   

It doesn't need to know about kvm, it needs to know that a particular hypercall 
protocol is available.
 

Considering how the parts of the draft that I read about sound like, that's not 
the inventor's idea. PPC people love to see the BIOS be part of the 
virtualization solution. I don't. That's the biggest difference here and reason 
for us going different directions.
   


Regardless of which direction is correct, you need to go in one direction.


I think what they thought of is something like

if (in_kvm()) {
   device_tree_put(/hypervisor/exit, EXIT_TYPE_MAGIC);
   device_tree_put(/hypervisor/exit_magic, EXIT_MAGIC);
}

which then the OS reads out. But that's useless, as the hypercalls are 
hypervisor specific. So why make the detection on the Linux side generic?
   


Looks like the benefit is less magic in the detection code.  x86 has 
(more or less) standardized feature detection.  Is this an attempt to 
bring something similar to ppc-land?



Why do I have to patch 3 projects (Linux, OpenBIOS, Qemu) when I could go with 
patching a single one (Linux)?

   

That's not a valid argument.  You patch as many projects as it takes to get it 
right (not that I have an opinion in this particular discussion).
 

If you can put code in Linux that touches 3 submaintainer's directories or one 
submaintainer's directory with both ending up being functionally equivalent, 
which way would you go?
   


We would do the right thing.  Trivial examples include adding defines to 
include/asm/processor.h or include/asm/msr-index.h, more complicated 
ones are the topic for my talk in kvm forum 2010.


Yes, coordinating the acks and trees and merge windows is not as fun as 
coding.  Yes, it's even more difficult with separate trees.  No, that's 
not an excuse if we[1] determine that the right thing to do is the most 
complicated.


[1] we in this case are the powerpc Linux arch maintainers and/or 
whoever defines the hardware specification



At the very least you have to patch qemu for reasons described before 
(backwards compatible live migration).
 

There is no live migration on PPC (yet). That point is completely moot atm.
   


You still need to make that feature disableable from userspace.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Avi Kivity

On 07/04/2010 12:30 PM, Alexander Graf wrote:



Considering how the parts of the draft that I read about sound like, that's not 
the inventor's idea. PPC people love to see the BIOS be part of the 
virtualization solution. I don't. That's the biggest difference here and reason 
for us going different directions.

I think what they thought of is something like

if (in_kvm()) {
  device_tree_put(/hypervisor/exit, EXIT_TYPE_MAGIC);
  device_tree_put(/hypervisor/exit_magic, EXIT_MAGIC);
}

which then the OS reads out. But that's useless, as the hypercalls are 
hypervisor specific. So why make the detection on the Linux side generic?
 

In fact, it's even worse. Right now with KVM for PPC we have 3 different ways 
of generating the device tree:

1) OpenBIOS (Mac emulation)
2) Qemu libfdt (BookE)
3) MOL OF implementation
   


I sympathize.  But, if the arch says that's how you do things, then 
that's how you do things.



So I'd have to touch even more projects. Just for the sake of splitting out 
something that belongs together anyway. And probably even create new interfaces 
just for that sake (qemu asking the kernel which type of hypercalls the vm 
should use) even though the guest could just query all that itself.
   


qemu needs to be involved, in case one day you support more than one 
type of hypercalls (like x86 does with hyper-v) or if you want to live 
migrate from a host that has hypercall support to another host that has 
this feature removed (as has already happened on x86 with the pvmmu).


Planning for the future means a lot of boring interfaces.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/27] KVM: PPC: Magic Page Book3s support

2010-07-04 Thread Avi Kivity

On 07/02/2010 06:37 PM, Alexander Graf wrote:

Alexander Graf wrote:
   

We need to override EA as well as PA lookups for the magic page. When the guest
tells us to project it, the magic page overrides any guest mappings.

In order to reflect that, we need to hook into all the MMU layers of KVM to
force map the magic page if necessary.

Signed-off-by: Alexander Grafag...@suse.de

v1 -  v2:

   - RMO -  PAM
---
  arch/powerpc/kvm/book3s.c |7 +++
  arch/powerpc/kvm/book3s_32_mmu.c  |   16 
  arch/powerpc/kvm/book3s_32_mmu_host.c |   12 
  arch/powerpc/kvm/book3s_64_mmu.c  |   30 +-
  arch/powerpc/kvm/book3s_64_mmu_host.c |   12 
  5 files changed, 76 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 14db032..b22e608 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -554,6 +554,13 @@ mmio:

  static int kvmppc_visible_gfn(struct kvm_vcpu *vcpu, gfn_t gfn)
  {
+   ulong mp_pa = vcpu-arch.magic_page_pa;
+
+   if (unlikely(mp_pa)
+   unlikely((mp_pa  KVM_RMO)  PAGE_SHIFT == gfn)) {

 

This should be KVM_PAM :(. Should I respin the whole thing or could
whoever commits this just make that trivial change?

   


A respin followed by a bisectability test (compile each revision as it 
is applied), please.  Of course we need to resolve the detection issue 
first.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html