Re: [libvirt] [RFC] require for suggestions on support for ivshmem device

2014-07-20 Thread Wang Rui
On 2014/7/17 17:37, Martin Kletzander wrote:
> On Tue, May 20, 2014 at 11:17:32AM +0200, Martin Kletzander wrote:
>> On Wed, May 14, 2014 at 08:23:21AM +, Wangrui (K) wrote:
>>> Hi,
>>>
>>> Libvirt does not currently support the ivshmem (Inter-VM Shared Memory) device,
>>> so I would like to know whether there is any plan to support it in the future.
>>> If not, I would like to contribute a series of patches to do so.
>>>
> 
> I came back to this mail right now because I need to have this
> implemented.  Is there any progress on your side with this or should I
> try hitting this?
> 

There's some experimental progress, but it's not good enough to send patches yet.
Sure, you can have a try. I will keep an eye on your patches.

You mentioned shm unlink below. If I understand correctly, QEMU does have code
to clean up shm.
Libvirt should do the cleanup job.
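
For reference, whichever component ends up owning the cleanup, removing a
POSIX shared memory object comes down to a shm_unlink() call on its name.
The following is only a minimal sketch of that create/cleanup pairing. The
helper names shm_region_create() and shm_region_cleanup() are hypothetical
and are not taken from QEMU or libvirt.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create (or open) a POSIX shared memory object
 * and set its size. Returns a file descriptor, or -1 with errno set. */
static int shm_region_create(const char *name, size_t size)
{
    int fd;

    assert(name != NULL);

    fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    if (ftruncate(fd, (off_t)size) < 0) {
        /* Do not leave a half-initialized object behind. */
        close(fd);
        shm_unlink(name);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: remove the object's name. Processes that already
 * mapped the region keep their mappings until they unmap or exit. */
static int shm_region_cleanup(const char *name)
{
    assert(name != NULL);
    return shm_unlink(name);
}
```

On older glibc versions this needs to be linked with -lrt.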
> [...]
>>> There are two ways to use ivshmem with qemu
>>> (please refer to 
>>> http://qemu.weilnetz.de/qemu-doc.html#pcsys_005fother_005fdevs ):
>>> 1. Guests map a POSIX shared memory region into the guest as a PCI device
>>> that enables zero-copy communication to the application level of the
>>> guests. The basic syntax is:
>>>
>>>  qemu-system-i386 -device ivshmem,size= [,shm= ]
>>>
>>> 2. If desired, interrupts can be sent between guest VMs accessing the same
>>> shared memory region.
>>> Interrupt support requires using a shared memory server and using a chardev
>>> socket to connect to it.
>>> An example syntax when using the shared memory server is:
>>>
>>>  qemu-system-i386 -device ivshmem,size= [,chardev= ] [,msi=on]
>>>   [,ioeventfd=on] [,vectors=n] [,role=peer|master]
>>>  qemu-system-i386 -chardev socket,path= ,id=
>>>
>>> The corresponding XML configurations for the two QEMU command lines above
>>> are shown below:
>>>
>>> Example1: automatically attach device with KVM
>>>
>>>  
>>>
>>>  
>>>
>>>  
>>>
>>> NOTE: "size" is the ivshmem size in MB; "name" is the shm name.
>>>  "role" is optional and may be set to "master" or "peer"; the default
>>> is "master".
>>>
>>
>> What do these roles mean, I mean what's the difference between master
>> and peer and why is it only used with the chardev?  Does it mean
>> master can only send interrupts or...?  Just curious.
>>
> 
> @Cam (Cc'd) I was wondering about the role= options, so I looked into
> the code.  It looks like role=peer just effectively disables
> migration.  Did I miss any other difference?
> 
> From the libvirt's POV I'd have a few more questions if I may.  How
> does the migration work (if there's role=master) WRT other guests
> using the same shm?  I found no shm_unlink call in QEMU sources (but
> again, I'm not experienced in QEMU's internals), does that mean that
> cleanup should be done by libvirt?
> 
> Thank you for any info provided.
> 
> Martin


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] LXC: create a bind mount for sysfs when enable userns but disable netns

2014-07-20 Thread Gao feng
On 07/14/2014 06:01 PM, Chen Hanxiao wrote:
> Kernel commit 7dc5dbc879bd0779924b5132a48b731a0bc04a1e
> forbids doing a fresh mount of sysfs
> when userns is enabled but netns is disabled.
> This patch creates a bind mount in this scenario.
> 
> Signed-off-by: Chen Hanxiao 
> ---

Looks good to me, ACK

>  src/lxc/lxc_container.c | 44 +---
>  1 file changed, 33 insertions(+), 11 deletions(-)
> 
> diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
> index 4d89677..8a27215 100644
> --- a/src/lxc/lxc_container.c
> +++ b/src/lxc/lxc_container.c
> @@ -815,10 +815,13 @@ static int lxcContainerSetReadOnly(void)
>  }
>  
>  
> -static int lxcContainerMountBasicFS(bool userns_enabled)
> +static int lxcContainerMountBasicFS(bool userns_enabled,
> +bool netns_disabled)
>  {
>  size_t i;
>  int rc = -1;
> +char* mnt_src = NULL;
> +int mnt_mflags;
>  
>  VIR_DEBUG("Mounting basic filesystems");
>  
> @@ -826,8 +829,25 @@ static int lxcContainerMountBasicFS(bool userns_enabled)
>  bool bindOverReadonly;
>  virLXCBasicMountInfo const *mnt = &lxcBasicMounts[i];
>  
> +/* When enable userns but disable netns, kernel will
> + * forbid us doing a new fresh mount for sysfs.
> + * So we had to do a bind mount for sysfs instead.
> + */
> +if (userns_enabled && netns_disabled &&
> +STREQ(mnt->src, "sysfs")) {
> +if (VIR_STRDUP(mnt_src, "/sys") < 0) {
> +goto cleanup;
> +}
> +mnt_mflags = MS_NOSUID|MS_NOEXEC|MS_NODEV|MS_RDONLY|MS_BIND;
> +} else {
> +if (VIR_STRDUP(mnt_src, mnt->src) < 0) {
> +goto cleanup;
> +}
> +mnt_mflags = mnt->mflags;
> +}
> +
>  VIR_DEBUG("Processing %s -> %s",
> -  mnt->src, mnt->dst);
> +  mnt_src, mnt->dst);
>  
>  if (mnt->skipUnmounted) {
>  char *hostdir;
> @@ -856,7 +876,7 @@ static int lxcContainerMountBasicFS(bool userns_enabled)
>  if (virFileMakePath(mnt->dst) < 0) {
>  virReportSystemError(errno,
>   _("Failed to mkdir %s"),
> - mnt->src);
> + mnt_src);
>  goto cleanup;
>  }
>  
> @@ -867,24 +887,24 @@ static int lxcContainerMountBasicFS(bool userns_enabled)
>   * we mount the filesystem in read-write mode initially, and then do a
>   * separate read-only bind mount on top of that.
>   */
> -bindOverReadonly = !!(mnt->mflags & MS_RDONLY);
> +bindOverReadonly = !!(mnt_mflags & MS_RDONLY);
>  
>  VIR_DEBUG("Mount %s on %s type=%s flags=%x",
> -  mnt->src, mnt->dst, mnt->type, mnt->mflags & ~MS_RDONLY);
> -if (mount(mnt->src, mnt->dst, mnt->type, mnt->mflags & ~MS_RDONLY, NULL) < 0) {
> +  mnt_src, mnt->dst, mnt->type, mnt_mflags & ~MS_RDONLY);
> +if (mount(mnt_src, mnt->dst, mnt->type, mnt_mflags & ~MS_RDONLY, NULL) < 0) {
>  virReportSystemError(errno,
>   _("Failed to mount %s on %s type %s flags=%x"),
> - mnt->src, mnt->dst, NULLSTR(mnt->type),
> - mnt->mflags & ~MS_RDONLY);
> + mnt_src, mnt->dst, NULLSTR(mnt->type),
> + mnt_mflags & ~MS_RDONLY);
>  goto cleanup;
>  }
>  
>  if (bindOverReadonly &&
> -mount(mnt->src, mnt->dst, NULL,
> +mount(mnt_src, mnt->dst, NULL,
>MS_BIND|MS_REMOUNT|MS_RDONLY, NULL) < 0) {
>  virReportSystemError(errno,
>   _("Failed to re-mount %s on %s flags=%x"),
> - mnt->src, mnt->dst,
> + mnt_src, mnt->dst,
>   MS_BIND|MS_REMOUNT|MS_RDONLY);
>  goto cleanup;
>  }
> @@ -893,6 +913,7 @@ static int lxcContainerMountBasicFS(bool userns_enabled)
>  rc = 0;
>  
>   cleanup:
> +VIR_FREE(mnt_src);
>  VIR_DEBUG("rc=%d", rc);
>  return rc;
>  }
> @@ -1643,7 +1664,8 @@ static int lxcContainerSetupPivotRoot(virDomainDefPtr vmDef,
>  goto cleanup;
>  
>  /* Mounts the core /proc, /sys, etc filesystems */
> -if (lxcContainerMountBasicFS(vmDef->idmap.nuidmap) < 0)
> +if (lxcContainerMountBasicFS(vmDef->idmap.nuidmap,
> + !vmDef->nnets) < 0)
>  goto cleanup;
>  
>  /* Ensure entire root filesystem (except /.oldroot) is readonly */
> 
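
To make the decision in the quoted hunk easier to follow, here is a small
standalone sketch of just the source/flags selection, without the
allocation and the actual mount() calls. The struct mount_choice type and
the pick_basic_mount() function are hypothetical names used for
illustration only; they are not part of the patch or of libvirt.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/mount.h>

/* Hypothetical result type: which source to mount and with which flags. */
struct mount_choice {
    const char *src;
    unsigned long flags;
};

/* With userns enabled and netns disabled, a fresh sysfs mount is refused
 * by the kernel, so fall back to a read-only bind mount of the host /sys.
 * In every other case, keep the table entry's own source and flags. */
static struct mount_choice
pick_basic_mount(const char *src, unsigned long flags,
                 bool userns_enabled, bool netns_disabled)
{
    struct mount_choice choice = { src, flags };

    assert(src != NULL);

    if (userns_enabled && netns_disabled && strcmp(src, "sysfs") == 0) {
        choice.src = "/sys";
        choice.flags = MS_NOSUID | MS_NOEXEC | MS_NODEV | MS_RDONLY | MS_BIND;
    }
    return choice;
}
```

Because the fallback only points at a string literal, this variant would
not need the VIR_STRDUP()/VIR_FREE() pair, but that is a design choice of
the sketch rather than a change requested for the patch.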
