[libvirt] [RFC] driver's persistent connection closed event

2015-06-06 Thread Nikolay Shirokovskiy
Hello.

I have a question concerning the hypervisor driver "persistent connection
closed" event mechanics.

Currently the only driver that has such a connection (or at least notifies
of such an event) is the remote driver. Accordingly, we have no interface
for this event at the driver level.

The Parallels driver actually has a persistent connection too. Imagine we
have a libvirt connection to the Parallels driver through the remote
driver. If the internal Parallels connection is closed, we can catch that
in the daemon, but the only way to notify the remote driver of the event
would be to close the remote-driver-to-daemon connection, which would in
turn trigger connection-closed event delivery to the client. The event
would thus be delivered, but at the cost of a forcibly closed connection
to the daemon.

I suspect the described solution has architectural flaws, so I suggest
moving the "persistent connection closed" event interface to the driver
level, which would give us a chance to deliver this event without side
effects. We would not need to implement this interface for most drivers,
and clients could handle that well, since in this case "unimplemented"
would be a synonym for "has no persistent connection". Alternatively, we
could silently do nothing if a driver does not support this interface, so
clients would not break.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [CFT][PATCH 00/10] Making new mounts of proc and sysfs as safe as bind mounts (take 2)

2015-06-06 Thread Eric W. Biederman
Richard Weinberger  writes:

> [CC'ing libvirt-lxc folks]
>
> Am 28.05.2015 um 23:32 schrieb Eric W. Biederman:
>> Richard Weinberger  writes:
>> 
>>> Am 28.05.2015 um 21:57 schrieb Eric W. Biederman:
> FWIW, it breaks also libvirt-lxc:
> Error: internal error: guest failed to start: Failed to re-mount /proc/sys on /proc/sys flags=1021: Operation not permitted

 Interesting.  I had not anticipated a failure there?  And it is failing
 in remount?  Oh that is interesting.

 That implies that there is some flag of the original mount of /proc that
 the remount of /proc/sys is clearing, and that previously went unenforced.

 The flags specified are current rdonly,remount,bind so I expect there
 are some other flags on proc that libvirt-lxc is clearing by accident
 and we did not fail before because the kernel was not enforcing things.
>>>
>>> Please see:
>>> http://libvirt.org/git/?p=libvirt.git;a=blob;f=src/lxc/lxc_container.c;h=9a9ae5c2aaf0f90ff472f24fda43c077b44998c7;hb=HEAD#l933
>>> lxcContainerMountBasicFS()
>>>
>>> and:
>>> http://libvirt.org/git/?p=libvirt.git;a=blob;f=src/lxc/lxc_container.c;h=9a9ae5c2aaf0f90ff472f24fda43c077b44998c7;hb=HEAD#l850
>>> lxcBasicMounts
>>>
 What are the mount flags in a working libvirt-lxc?
>>>
>>> See:
>>> test1:~ # cat /proc/self/mountinfo
>>> 149 147 0:56 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw
>>> 150 149 0:56 /sys /proc/sys ro,nodev,relatime - proc proc rw
>> 
>>> If you need more info, please let me know. :-)
>> 
>> Oh interesting I had not realized libvirt-lxc had grown an unprivileged
>> mode using user namespaces.
>> 
>> This does appear to be a classic remount bug, where you are not
>> preserving the permissions.  It appears the fact that the code
>> failed to enforce locked permissions on the fresh mount of proc
>> was hiding this bug until now.
>> 
>> I expect what you actually want is the code below:
>> 
>> diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
>> index 9a9ae5c2aaf0..f008a7484bfe 100644
>> --- a/src/lxc/lxc_container.c
>> +++ b/src/lxc/lxc_container.c
>> @@ -850,7 +850,7 @@ typedef struct {
>>  
>>  static const virLXCBasicMountInfo lxcBasicMounts[] = {
>>  { "proc", "/proc", "proc", MS_NOSUID|MS_NOEXEC|MS_NODEV, false, false, false },
>> -{ "/proc/sys", "/proc/sys", NULL, MS_BIND|MS_RDONLY, false, false, false },
>> +{ "/proc/sys", "/proc/sys", NULL, MS_BIND|MS_NOSUID|MS_NOEXEC|MS_NODEV|MS_RDONLY, false, false, false },
>>  { "/.oldroot/proc/sys/net/ipv4", "/proc/sys/net/ipv4", NULL, MS_BIND, false, false, true },
>>  { "/.oldroot/proc/sys/net/ipv6", "/proc/sys/net/ipv6", NULL, MS_BIND, false, false, true },
>>  { "sysfs", "/sys", "sysfs", MS_NOSUID|MS_NOEXEC|MS_NODEV|MS_RDONLY, false, false, false },
>> 
>> Or possibly just:
>> 
>> diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
>> index 9a9ae5c2aaf0..a60ccbd12bfc 100644
>> --- a/src/lxc/lxc_container.c
>> +++ b/src/lxc/lxc_container.c
>> @@ -850,7 +850,7 @@ typedef struct {
>>  
>>  static const virLXCBasicMountInfo lxcBasicMounts[] = {
>>  { "proc", "/proc", "proc", MS_NOSUID|MS_NOEXEC|MS_NODEV, false, false, false },
>> -{ "/proc/sys", "/proc/sys", NULL, MS_BIND|MS_RDONLY, false, false, false },
>> +{ "/proc/sys", "/proc/sys", NULL, MS_BIND|MS_RDONLY, true, false, false },
>>  { "/.oldroot/proc/sys/net/ipv4", "/proc/sys/net/ipv4", NULL, MS_BIND, false, false, true },
>>  { "/.oldroot/proc/sys/net/ipv6", "/proc/sys/net/ipv6", NULL, MS_BIND, false, false, true },
>>  { "sysfs", "/sys", "sysfs", MS_NOSUID|MS_NOEXEC|MS_NODEV|MS_RDONLY, false, false, false },
>> 
>> There is little point in making /proc/sys read-only in a
>> user-namespace, as the permission checks are uid based and no-one
>> should have the global uid 0 in your container. That makes mounting
>> /proc/sys read-only rather pointless.
>
> Eric, using the patch below I was able to spawn a user-namespace enabled 
> container
> using libvirt-lxc. :-)
>
> I had to:
> 1. Disable the read-only mount of /proc/sys which is anyway useless in the 
> user-namespace case.
> 2. Disable the /proc/sys/net/ipv{4,6} bind mounts, this ugly hack is only 
> needed for the non user-namespace case.
> 3. Remove MS_RDONLY from the sysfs mount (For the non user-namespace case 
> we'd have to keep this, though).
>
> Daniel, I'd take this as a chance to disable all the MS_RDONLY games if
> user-namespaces are configured. With Eric's fixes they hurt us. And as I
> wrote many times before, if root within the user-namespace is able to do
> nasty things in /sys and /proc, that's a plain kernel bug which needs
> fixing. There is no point in mounting these read-only, except for the
> case when no user-namespace is used.
>

For clarity the patch below appears to be the minimal change needed to
fix this security issue.

I.e., add mnt_mflags back in when remounting something read-only.

/p

[libvirt] ANNOUNCE: virt-manager 1.2.1 released

2015-06-06 Thread Cole Robinson
I'm happy to announce the release of virt-manager 1.2.1!

virt-manager is a desktop application for managing KVM, Xen, and LXC
virtualization via libvirt.

The release can be downloaded from:

http://virt-manager.org/download/

The direct download links are:

http://virt-manager.org/download/sources/virt-manager/virt-manager-1.2.1.tar.gz

This release is mostly about scooping up some bugfixes, including:

- Fix connecting to older libvirt versions (Michał Kępień)
- Fix connecting to VM console with non-IP hostname (Giuseppe Scrivano)
- Fix addhardware/create wizard errors when a nodedev disappears
- Fix adding a second cdrom via customize dialog

Thanks to everyone who has contributed to this release through testing,
bug reporting, submitting patches, and otherwise sending in feedback!

Thanks,
Cole


Re: [libvirt] [PATCH 3/3] qemu: caps: Advertise arm 32-on-64 KVM option

2015-06-06 Thread Cole Robinson
On 05/28/2015 07:31 AM, Daniel P. Berrange wrote:
> On Thu, May 21, 2015 at 07:03:28PM -0400, Cole Robinson wrote:
>> We need to use qemu-system-aarch64 to run armv7l KVM VMs on an aarch64
>> host.
>> ---
>>  src/qemu/qemu_capabilities.c | 42 ++
>>  1 file changed, 22 insertions(+), 20 deletions(-)
>>
>> diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
>> index 1e7bddb..7181865 100644
>> --- a/src/qemu/qemu_capabilities.c
>> +++ b/src/qemu/qemu_capabilities.c
>> @@ -723,19 +723,6 @@ virQEMUCapsFindBinaryForArch(virArch hostarch,
>>  return ret;
>>  }
>>  
>> -
>> -static bool
>> -virQEMUCapsIsValidForKVM(virArch hostarch,
>> - virArch guestarch)
>> -{
>> -if (hostarch == guestarch)
>> -return true;
>> -if (hostarch == VIR_ARCH_X86_64 &&
>> -guestarch == VIR_ARCH_I686)
>> -return true;
>> -return false;
>> -}
>> -
>>  static int
>>  virQEMUCapsInitGuest(virCapsPtr caps,
>>   virQEMUCapsCachePtr cache,
>> @@ -747,6 +734,7 @@ virQEMUCapsInitGuest(virCapsPtr caps,
>>  char *binary = NULL;
>>  virQEMUCapsPtr qemubinCaps = NULL;
>>  virQEMUCapsPtr kvmbinCaps = NULL;
>> +bool native_kvm, x86_32on64_kvm, arm_32on64_kvm;
>>  int ret = -1;
>>  
>>  /* Check for existence of base emulator, or alternate base
>> @@ -764,16 +752,30 @@ virQEMUCapsInitGuest(virCapsPtr caps,
>>  
>>  /* qemu-kvm/kvm binaries can only be used if
>>   *  - host & guest arches match
>> - * Or
>> - *  - hostarch is x86_64 and guest arch is i686
>> - * The latter simply needs "-cpu qemu32"
>> + *  - hostarch is x86_64 and guest arch is i686 (needs -cpu qemu32)
>> + *  - hostarch is aarch64 and guest arch is armv7l (needs -cpu aarch64=off)
>>   */
>> -if (virQEMUCapsIsValidForKVM(hostarch, guestarch)) {
>> -const char *const kvmbins[] = { "/usr/libexec/qemu-kvm", /* RHEL */
>> -"qemu-kvm", /* Fedora */
>> -"kvm" }; /* Upstream .spec */
>> +native_kvm = (hostarch == guestarch);
>> +x86_32on64_kvm = (hostarch == VIR_ARCH_X86_64 &&
>> +guestarch == VIR_ARCH_I686);
>> +arm_32on64_kvm = (hostarch == VIR_ARCH_AARCH64 &&
>> +guestarch == VIR_ARCH_ARMV7L);
>> +
>> +if (native_kvm || x86_32on64_kvm || arm_32on64_kvm) {
>> +const char *kvmbins[] = {
>> +"/usr/libexec/qemu-kvm", /* RHEL */
>> +"qemu-kvm", /* Fedora */
>> +"kvm", /* Debian/Ubuntu */
>> +NULL,
>> +};
>> +
>> +if (arm_32on64_kvm)
>> +kvmbins[3] = "qemu-system-aarch64";
> 
> I'm unclear why you need to be adding this. We don't need it for
> the equivalent i686 with qemu-system-x86_64, as the earlier call
> to virQEMUCapsFindBinaryForArch() will already return the binary
> qemu-system-x86_64. IIUC, it should have returned the binary
> qemu-system-aarch64 too, so this just seems to duplicate the
> check for that binary.

We need this in the case you are running on an aarch64 host and have both
qemu-system-arm and qemu-system-aarch64 installed. In this case, when you want
to use KVM for arm32, you _have_ to use qemu-system-aarch64, qemu-system-arm
does not work. What you suggest would mean that qemu-system-arm is grabbed
from the caps cache.

x86 doesn't have this problem because qemu-system-i386, qemu-system-x86_64 and
by extension qemu-kvm can all be used to do 32-on-64 KVM.

Thanks,
Cole



Re: [libvirt] [PATCH 0/3] qemu: Allow arm 32-on-64 KVM

2015-06-06 Thread Cole Robinson
On 05/28/2015 07:26 AM, John Ferlan wrote:
> 
> 
> On 05/21/2015 07:03 PM, Cole Robinson wrote:
>> qemu 2.3.0 added support for enabling armv7l VMs to run on aarch64 hosts
>> with KVM. First patch handles the special command line bit, last two
>> patches are about advertising things in virsh capabilities.
>>
>> After these patches, 'virt-install --arch armv7l ...' on an aarch64
>> host with new enough qemu will automatically use KVM and generate a
>> working config.
>>
>> Cole Robinson (3):
>>   qemu: command: Support arm 32-on-64 KVM with -cpu aarch64=off
>>   qemu: caps: qemu-system-aarch64 supports armv7l
>>   qemu: caps: Advertise arm 32-on-64 KVM option
>>
>>  src/qemu/qemu_capabilities.c   | 57 ++
>>  src/qemu/qemu_capabilities.h   |  1 +
>>  src/qemu/qemu_command.c| 13 +
>>  .../qemuxml2argv-aarch64-kvm-32-on-64.args | 10 
>>  .../qemuxml2argv-aarch64-kvm-32-on-64.xml  | 35 +
>>  tests/qemuxml2argvtest.c   |  9 
>>  6 files changed, 105 insertions(+), 20 deletions(-)
>>  create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-aarch64-kvm-32-on-64.args
>>  create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-aarch64-kvm-32-on-64.xml
>>
> 
> Perhaps lost by having the tweak dir perms handling patch sneak in...
> 
> Seems reasonable to me (ACK)...

Thanks, I'll wait to see if Dan has any more responses before pushing.

> Should there be some way to document how
> this would be added/usable in the passthrough description in
> formatdomain? Or somewhere else? Not sure how anyone would know the
> magic incantation to define/use this... Even though this was posted
> before freeze, not 100% clear whether it's push-able for this release cycle.
> 

The way it's meant to be consumed is just parsing the capabilities output to
find supported virt-type + arch + emulator + os-type tuples, which apps
already have to do. So this doesn't require any specific steps. Maybe how to
_use_ the capabilities XML isn't documented anywhere but if so it's outside
the scope of this patch set.

Thanks,
Cole



[libvirt] [PATCH] rpc: RH1026137: Fix slow volume download (virsh vol-download)

2015-06-06 Thread Ossi Herrala
Use I/O vector (iovec) instead of one huge memory buffer as suggested
in https://bugzilla.redhat.com/show_bug.cgi?id=1026137#c7. This avoids
doing memmove() to big buffers and performance doesn't degrade if
source (virNetClientStreamQueuePacket()) is faster than sink
(virNetClientStreamRecvPacket()).
---
 src/rpc/virnetclientstream.c |  134 +
 1 files changed, 82 insertions(+), 52 deletions(-)

diff --git a/src/rpc/virnetclientstream.c b/src/rpc/virnetclientstream.c
index b428f4b..18c6e8b 100644
--- a/src/rpc/virnetclientstream.c
+++ b/src/rpc/virnetclientstream.c
@@ -49,9 +49,9 @@ struct _virNetClientStream {
  * time by stopping consuming any incoming data
  * off the socket
  */
-char *incoming;
-size_t incomingOffset;
-size_t incomingLength;
+struct iovec *incomingVec; /* I/O Vector to hold data */
+size_t writeVec;   /* Vectors produced */
+size_t readVec;/* Vectors consumed */
 bool incomingEOF;
 
 virNetClientStreamEventCallback cb;
@@ -86,9 +86,9 @@ virNetClientStreamEventTimerUpdate(virNetClientStreamPtr st)
 if (!st->cb)
 return;
 
-VIR_DEBUG("Check timer offset=%zu %d", st->incomingOffset, st->cbEvents);
+VIR_DEBUG("Check timer readVec %zu writeVec %zu %d", st->readVec, st->writeVec, st->cbEvents);
 
-if (((st->incomingOffset || st->incomingEOF) &&
+if ((((st->readVec < st->writeVec) || st->incomingEOF) &&
  (st->cbEvents & VIR_STREAM_EVENT_READABLE)) ||
 (st->cbEvents & VIR_STREAM_EVENT_WRITABLE)) {
 VIR_DEBUG("Enabling event timer");
@@ -110,13 +110,14 @@ virNetClientStreamEventTimer(int timer ATTRIBUTE_UNUSED, void *opaque)
 
 if (st->cb &&
 (st->cbEvents & VIR_STREAM_EVENT_READABLE) &&
-(st->incomingOffset || st->incomingEOF))
+((st->readVec < st->writeVec) || st->incomingEOF))
 events |= VIR_STREAM_EVENT_READABLE;
 if (st->cb &&
 (st->cbEvents & VIR_STREAM_EVENT_WRITABLE))
 events |= VIR_STREAM_EVENT_WRITABLE;
 
-VIR_DEBUG("Got Timer dispatch %d %d offset=%zu", events, st->cbEvents, st->incomingOffset);
+VIR_DEBUG("Got Timer dispatch %d %d readVec %zu writeVec %zu", events, st->cbEvents,
+  st->readVec, st->writeVec);
 if (events) {
 virNetClientStreamEventCallback cb = st->cb;
 void *cbOpaque = st->cbOpaque;
@@ -161,7 +162,7 @@ void virNetClientStreamDispose(void *obj)
 virNetClientStreamPtr st = obj;
 
 virResetError(&st->err);
-VIR_FREE(st->incoming);
+VIR_FREE(st->incomingVec);
 virObjectUnref(st->prog);
 }
 
@@ -265,38 +266,49 @@ int virNetClientStreamQueuePacket(virNetClientStreamPtr st,
   virNetMessagePtr msg)
 {
 int ret = -1;
-size_t need;
+struct iovec iov;
+char *base;
+size_t piece, pieces, length, offset = 0, size = 1024*1024;
 
 virObjectLock(st);
-need = msg->bufferLength - msg->bufferOffset;
-if (need) {
-size_t avail = st->incomingLength - st->incomingOffset;
-if (need > avail) {
-size_t extra = need - avail;
-if (VIR_REALLOC_N(st->incoming,
-  st->incomingLength + extra) < 0) {
-VIR_DEBUG("Out of memory handling stream data");
-goto cleanup;
-}
-st->incomingLength += extra;
-}
 
-memcpy(st->incoming + st->incomingOffset,
-   msg->buffer + msg->bufferOffset,
-   msg->bufferLength - msg->bufferOffset);
-st->incomingOffset += (msg->bufferLength - msg->bufferOffset);
-} else {
+length = msg->bufferLength - msg->bufferOffset;
+
+if (length == 0) {
 st->incomingEOF = true;
+goto end;
 }
 
-VIR_DEBUG("Stream incoming data offset %zu length %zu EOF %d",
-  st->incomingOffset, st->incomingLength,
-  st->incomingEOF);
+pieces = (length + size - 1) / size;
+for (piece = 0; piece < pieces; piece++) {
+if (size > length - offset)
+size = length - offset;
+
+if (VIR_ALLOC_N(base, size)) {
+VIR_DEBUG("Allocation failed");
+goto cleanup;
+}
+
+memcpy(base, msg->buffer + msg->bufferOffset + offset, size);
+iov.iov_base = base;
+iov.iov_len = size;
+offset += size;
+
+if (VIR_APPEND_ELEMENT(st->incomingVec, st->writeVec, iov) < 0) {
+VIR_DEBUG("Append failed");
+VIR_FREE(base);
+goto cleanup;
+}
+VIR_DEBUG("Wrote piece of vector. readVec %zu, writeVec %zu size %zu", st->readVec, st->writeVec, size);
+}
+
+ end:
 virNetClientStreamEventTimerUpdate(st);
-
 ret = 0;
 
  cleanup:
+VIR_DEBUG("Stream incoming data readVec %zu writeVec %zu EOF %d",
+  st->readVec, st->writeVec, st->incomingEOF);
 virObjectUnlock(st);
 return ret;
 }
@@ -