Re: qemu:///embed and isolation from global components

2020-03-19 Thread Daniel P. Berrangé
On Thu, Mar 19, 2020 at 10:21:39AM +0100, Andrea Bolognani wrote:
> On Wed, 2020-03-18 at 18:01 +0100, Michal Prívozník wrote:
> > On 18. 3. 2020 16:47, Andrea Bolognani wrote:
> > > if I use either one of
> > > 
> > >   
> > > 
> > >   
> > > 
> > >   
> > > 
> > >   
> > > 
> > > both qemu:///embed instances try to use the same paths:
> > > 
> > >   /dev/hugepages/libvirt/qemu/$domid-$domname/
> > > 
> > >   /dev/shm/libvirt/qemu/$domid-$domname/
> > 
> > This is because the qemu driver uses virDomainDefGetShortName() to construct
> > the path, which is not embed-driver aware. In fact, we use it in a few
> > other places:
> > 
> > libvirt.git $ git grep -C0 virDomainDefGetShortName  -- src/qemu/ \
> >  | wc -l
> > 15
> > 
> > so we probably have more broken places. I will investigate and post a
> > patch, probably something along the lines of the fix for machined name
> > generation.
> 
> I'm not sure changing virDomainDefGetShortName() to include the embed
> root hash all the time, which I assume is what you had in mind, is a
> good enough solution.
> 
> That would work very well for /dev/shm and friends, but if the path
> you're building is *inside* the embed root, then you've just added 15
> characters to the path of any file. This wouldn't be too bad if not
> for the fact that the path to a UNIX socket can only be up to 108
> characters long, which is limiting enough already...
> 
> Can we make the whole thing smart enough that the embed root hash
> will appear in paths that are outside of the embed root, but not in
> those that are inside it?

This is probably a matter of having 2 methods for building a unique
name. One for use cases interacting with global resources, and one
for use cases interacting with isolated resources. Then we'll need
to make sure the driver calls the right one in each context.
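
As an illustration, a minimal sketch of how such a split could look,
assuming GLib helpers; the function names are hypothetical and not
existing libvirt API:

  #include <glib.h>

  /* For resources shared with the rest of the host (/dev/shm, machined
   * names, ...): mix a hash of the embed root into the name so that two
   * embedded instances can never collide. */
  static char *
  exampleGlobalShortName(const char *embed_root, int id, const char *name)
  {
      g_autofree char *hash = NULL;

      if (!embed_root)   /* qemu:///system or qemu:///session */
          return g_strdup_printf("%d-%s", id, name);

      hash = g_compute_checksum_for_string(G_CHECKSUM_SHA256, embed_root, -1);
      return g_strdup_printf("%.8s-%d-%s", hash, id, name);
  }

  /* For paths that already live inside the embed root: keep the short
   * form, since the root itself provides the isolation and UNIX socket
   * paths are length-limited. */
  static char *
  exampleLocalShortName(int id, const char *name)
  {
      return g_strdup_printf("%d-%s", id, name);
  }

  int main(void)
  {
      g_autofree char *global = exampleGlobalShortName("/var/lib/app", 1, "guest");
      g_autofree char *local = exampleLocalShortName(1, "guest");
      g_print("global: %s\nlocal:  %s\n", global, local);
      return 0;
  }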

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|



Re: qemu:///embed and isolation from global components

2020-03-19 Thread Andrea Bolognani
On Wed, 2020-03-18 at 18:01 +0100, Michal Prívozník wrote:
> On 18. 3. 2020 16:47, Andrea Bolognani wrote:
> > if I use either one of
> > 
> >   
> > 
> >   
> > 
> >   
> > 
> >   
> > 
> > both qemu:///embed instances try to use the same paths:
> > 
> >   /dev/hugepages/libvirt/qemu/$domid-$domname/
> > 
> >   /dev/shm/libvirt/qemu/$domid-$domname/
> 
> This is because the qemu driver uses virDomainDefGetShortName() to construct
> the path, which is not embed-driver aware. In fact, we use it in a few
> other places:
> 
> libvirt.git $ git grep -C0 virDomainDefGetShortName  -- src/qemu/ \
>  | wc -l
> 15
> 
> so we probably have more broken places. I will investigate and post a
> patch, probably something along the lines of the fix for machined name
> generation.

I'm not sure changing virDomainDefGetShortName() to include the embed
root hash all the time, which I assume is what you had in mind, is a
good enough solution.

That would work very well for /dev/shm and friends, but if the path
you're building is *inside* the embed root, then you've just added 15
characters to the path of any file. This wouldn't be too bad if not
for the fact that the path to a UNIX socket can only be up to 108
characters long, which is limiting enough already...
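
(For reference, the 108 comes from the fixed size of sun_path in
struct sockaddr_un; a trivial check, assuming a Linux host:)

  #include <stdio.h>
  #include <sys/un.h>

  int main(void)
  {
      /* sun_path is a fixed-size buffer, and the terminating NUL has to
       * fit in it too, so every extra prefix character eats directly
       * into the usable socket path length. */
      struct sockaddr_un addr;
      printf("sun_path is %zu bytes\n", sizeof(addr.sun_path));
      return 0;
  }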

Can we make the whole thing smart enough that the embed root hash
will appear in paths that are outside of the embed root, but not in
those that are inside it?

-- 
Andrea Bolognani / Red Hat / Virtualization



Re: qemu:///embed and isolation from global components

2020-03-18 Thread Michal Prívozník
On 18. 3. 2020 16:47, Andrea Bolognani wrote:
> On Mon, 2020-03-09 at 18:04 +, Daniel P. Berrangé wrote:
>> At a high level the embedded QEMU driver
>>
>>  - Isolated from any other instance of the QEMU driver
> 
> Replying here because it looks as good a place as any.
> 
> Now that Michal has made it so that identically-named domains defined
> under different qemu:///embed roots no longer clash at the machined
> level (thanks!), I have immediately bumped into the next issue: if
> I use either one of
> 
>   
> 
>   
> 
>   
> 
>   
> 
> both qemu:///embed instances try to use the same paths:
> 
>   /dev/hugepages/libvirt/qemu/$domid-$domname/
> 
>   /dev/shm/libvirt/qemu/$domid-$domname/

This is because the qemu driver uses virDomainDefGetShortName() to construct
the path, which is not embed-driver aware. In fact, we use it in a few
other places:

libvirt.git $ git grep -C0 virDomainDefGetShortName  -- src/qemu/ \
 | wc -l
15

so we probably have more broken places. I will investigate and post a
patch, probably something along the lines of the fix for machined name
generation.

> 
> which obviously breaks everything.
> 
> In this case, is the application expected to set
> 
>   hugetlbfs_mount = "/dev/hugepages/$random"
> 
>   memory_backing_dir = "/dev/shm/$random"
> 
> in qemu.conf beforehand (I'm not sure the former setting would even
> work), or should libvirt take care of it automatically in a similar
> way it now does for machine names?
> 

The latter is true.

Michal



Re: qemu:///embed and isolation from global components

2020-03-18 Thread Andrea Bolognani
On Mon, 2020-03-09 at 18:04 +, Daniel P. Berrangé wrote:
> At a high level the embedded QEMU driver
> 
>  - Isolated from any other instance of the QEMU driver

Replying here because it looks as good a place as any.

Now that Michal has made it so that identically-named domains defined
under different qemu:///embed roots no longer clash at the machined
level (thanks!), I have immediately bumped into the next issue: if
I use either one of

  

  

  

  

both qemu:///embed instances try to use the same paths:

  /dev/hugepages/libvirt/qemu/$domid-$domname/

  /dev/shm/libvirt/qemu/$domid-$domname/

which obviously breaks everything.

In this case, is the application expected to set

  hugetlbfs_mount = "/dev/hugepages/$random"

  memory_backing_dir = "/dev/shm/$random"

in qemu.conf beforehand (I'm not sure the former setting would even
work), or should libvirt take care of it automatically, in a similar
way to what it now does for machine names?

-- 
Andrea Bolognani / Red Hat / Virtualization



Re: qemu:///embed and isolation from global components

2020-03-12 Thread Andrea Bolognani
On Wed, 2020-03-11 at 09:53 +, Daniel P. Berrangé wrote:
> On Tue, Mar 10, 2020 at 07:25:46PM +0100, Andrea Bolognani wrote:
> > In your scenario, when you don't specify a scope you get the same
> > one as the primary driver is using (this matches the current
> > behavior): so if you are using qemu:///session, every <interface>
> > will use network:///session unless you explicitly specify
> > scope='system', at which point network:///system will be used; in
> > the same fashion, if you're connected to qemu:///embed, the default
> > for <interface>s should be network:///embed, with the possibility
> > to use either network:///session (with scope='session') or, most
> > likely, network:///system (with scope='system').
> 
> No, I'm not talking about using the same URI for the secondary
> drivers, I'm talking about using the same *privilege* level.
> ie, qemu:///system and a qemu:///embed running as root should
> both use the privileged network/storage driver. The qemu:///session
> and qemu:///embed running as non-root should both default to
> the  unprivileged network/storage drivers.

What's the advantage of conflating URI and privilege level? It seems
to me like it only makes things complicated when they could be
absolutely straightforward instead.

In fact, I believe libguestfs developers have long lamented the fact
that it's not really possible to have qemu:///session for the root
user, which is caused by the same kind of logic... It would be
preferable, I think, not to introduce even more instances of it.

> > > With such functionality present, it logically then also extends to
> > > cover connections to daemons running in different namespaces.
> > 
> > I'm still unclear on how this scenario, which would apparently have
> > multiple eg. privileged virtnetworkd, running at the same time but in
> > different network namespaces, would work.
> 
> There would need to be some selection method, most likely a way
> to explicitly specify the socket path, but this is a longish way
> into the future.

Got it!

-- 
Andrea Bolognani / Red Hat / Virtualization



Re: qemu:///embed and isolation from global components

2020-03-12 Thread Daniel P. Berrangé
On Thu, Mar 12, 2020 at 01:50:49PM +0100, Andrea Bolognani wrote:
> On Thu, 2020-03-12 at 12:09 +, Daniel P. Berrangé wrote:
> > On Thu, Mar 12, 2020 at 12:57:36PM +0100, Andrea Bolognani wrote:
> > > Honestly, so far I haven't been able to figure out the use case for
> > > registering libvirt VMs with machined either :)
> > > 
> > > Most of the operations are either not supported (login, shell, start)
> > > or do not work as expected (list --all, reboot), so all you can
> > > really do is list the subset of libvirt VMs that happen to be running
> > > and power them off. I can't really imagine that being very useful to
> > > anyone... Am I missing something?
> > 
> > Yeah, pretty much all you get is a way to report & terminate VMs via
> > systemd commands. A few other things could be wired up, but no one
> > ever made an effort to do so and I don't think it is worth it.
> > 
> > So I'm getting inclined to consider machined a failed experiment from
> > POV of VMs - still makes sense for containers. That said I'd still
> > keep using it, because we need systemd to deal with cgroups creation
> > no matter what, and it's no worse to talk to systemd via machined
> > than directly.
> 
> Would it make sense to default to not registering with machined? It
> would probably be more honest of us, considering how severely limited
> the functionality is... Set expectations right and all that. The fact
> that not even reboot, one of the only three operations available
> through machinectl, works correctly (it will shut down the VM instead
> of restarting it) leads me to believe that nobody is actually using
> this anyway.
> 
> > > Of course it's not about secrecy, but for the same reasons
> > > qemu:///embed VMs don't show up in the output of 'virsh list' it
> > > also makes sense for them to be omitted from that of 'machinectl
> > > list', I think.
> > 
> > Yes, I agree with this.
> > 
> > The only even slightly plausible use case for machinectl is to list
> > the full set of guest OSes running on the host. This just about makes
> > sense for the traditional data center / cloud virt use case. I don't
> > think it makes sense when KVM is merely used as an infrastructure
> > building block embedded in applications. As such, I think we should
> > NOT register with machined or systemd at all, for embedded VMs, without
> > an explicit opt-in. We should flip to just inheriting the calling
> > process's cgroups context, to align with the goal that the embedded driver
> > should generally aim to inherit all process context.
> 
> Cool.
> 
> Let's just make sure there is a way for qemu:///embed users to
> explicitly opt-in into qemu:///system-style CGroup handling,
> hopefully without machined getting in the way, before we flip the
> switch. Are you going to revive the old patch you linked to a few
> days ago?

Dunno how well it will still apply, but yeah, something approximately
the same as that would suit us.


Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|



Re: qemu:///embed and isolation from global components

2020-03-12 Thread Andrea Bolognani
On Thu, 2020-03-12 at 12:09 +, Daniel P. Berrangé wrote:
> On Thu, Mar 12, 2020 at 12:57:36PM +0100, Andrea Bolognani wrote:
> > Honestly, so far I haven't been able to figure out the use case for
> > registering libvirt VMs with machined either :)
> > 
> > Most of the operations are either not supported (login, shell, start)
> > or do not work as expected (list --all, reboot), so all you can
> > really do is list the subset of libvirt VMs that happen to be running
> > and power them off. I can't really imagine that being very useful to
> > anyone... Am I missing something?
> 
> Yeah, pretty much all you get is a way to report & terminate VMs via
> systemd commands. A few other things could be wired up, but no one
> ever made an effort to do so and I don't think it is worth it.
> 
> So I'm getting inclined to consider machined a failed experiment from
> POV of VMs - still makes sense for containers. That said I'd still
> keep using it, because we need systemd to deal with cgroups creation
> no matter what, and it's no worse to talk to systemd via machined
> than directly.

Would it make sense to default to not registering with machined? It
would probably be more honest of us, considering how severely limited
the functionality is... Set expectations right and all that. The fact
that not even reboot, one of the only three operations available
through machinectl, works correctly (it will shut down the VM instead
of restarting it) leads me to believe that nobody is actually using
this anyway.

> > Of course it's not about secrecy, but for the same reasons
> > qemu:///embed VMs don't show up in the output of 'virsh list' it
> > also makes sense for them to be omitted from that of 'machinectl
> > list', I think.
> 
> Yes, I agree with this.
> 
> The only even slightly plausible use case for machinectl is to list
> the full set of guest OSes running on the host. This just about makes
> sense for the traditional data center / cloud virt use case. I don't
> think it makes sense when KVM is merely used as an infrastructure
> building block embedded in applications. As such, I think we should
> NOT register with machined or systemd at all, for embedded VMs, without
> an explicit opt-in. We should flip to just inheriting the calling
> process's cgroups context, to align with the goal that the embedded driver
> should generally aim to inherit all process context.

Cool.

Let's just make sure there is a way for qemu:///embed users to
explicitly opt-in into qemu:///system-style CGroup handling,
hopefully without machined getting in the way, before we flip the
switch. Are you going to revive the old patch you linked to a few
days ago?

-- 
Andrea Bolognani / Red Hat / Virtualization



Re: qemu:///embed and isolation from global components

2020-03-12 Thread Daniel P. Berrangé
On Thu, Mar 12, 2020 at 12:57:36PM +0100, Andrea Bolognani wrote:
> On Wed, 2020-03-11 at 17:32 +0100, Michal Privoznik wrote:
> > I still don't quite see the value in machinectl (maybe because I'm not 
> > using systemd :-D)
> 
> Honestly, so far I haven't been able to figure out the use case for
> registering libvirt VMs with machined either :)
> 
> Most of the operations are either not supported (login, shell, start)
> or do not work as expected (list --all, reboot), so all you can
> really do is list the subset of libvirt VMs that happen to be running
> and power them off. I can't really imagine that being very useful to
> anyone... Am I missing something?

Yeah, pretty much all you get is a way to report & terminate VMs via
systemd commands. A few other things could be wired up, but no one
ever made an effort to do so and I don't think it is worth it.

So I'm getting inclined to consider machined a failed experiment from
POV of VMs - still makes sense for containers. That said I'd still
keep using it, because we need systemd to deal with cgroups creation
no matter what, and it's no worse to talk to systemd via machined
than directly.

> > but anyway - it's a system-wide monitor of virtual 
> > machines. Therefore it makes sense to register a domain started under 
> > qemu:///embed there. I don't view embed mode as a way of starting VMs 
> > secretly. It's a way of starting VMs privately and that's a different 
> > thing. Other users might learn that my app is running a VM (plain 'ps' 
> would give it away), but they cannot tamper with it in any way, e.g.
> > change its XML.
> 
> Of course it's not about secrecy, but for the same reasons
> qemu:///embed VMs don't show up in the output of 'virsh list' it
> also makes sense for them to be omitted from that of 'machinectl
> list', I think.

Yes, I agree with this.

The only even slightly plausible use case for machinectl is to list
the full set of guest OSes running on the host. This just about makes
sense for the traditional data center / cloud virt use case. I don't
think it makes sense when KVM is merely used as an infrastructure
building block embedded in applications. As such, I think we should
NOT register with machined or systemd at all, for embedded VMs, without
an explicit opt-in. We should flip to just inheriting the calling
process's cgroups context, to align with the goal that the embedded driver
should generally aim to inherit all process context.


Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|



Re: qemu:///embed and isolation from global components

2020-03-12 Thread Andrea Bolognani
On Wed, 2020-03-11 at 17:32 +0100, Michal Privoznik wrote:
> I still don't quite see the value in machinectl (maybe because I'm not 
> using systemd :-D)

Honestly, so far I haven't been able to figure out the use case for
registering libvirt VMs with machined either :)

Most of the operations are either not supported (login, shell, start)
or do not work as expected (list --all, reboot), so all you can
really do is list the subset of libvirt VMs that happen to be running
and power them off. I can't really imagine that being very useful to
anyone... Am I missing something?

> but anyway - it's a system-wide monitor of virtual 
> machines. Therefore it makes sense to register a domain started under 
> qemu:///embed there. I don't view embed mode as a way of starting VMs 
> secretly. It's a way of starting VMs privately and that's a different 
> thing. Other users might learn that my app is running a VM (plain 'ps' 
> would give it away), but they cannot tamper with it in any way, e.g.
> change its XML.

Of course it's not about secrecy, but for the same reasons
qemu:///embed VMs don't show up in the output of 'virsh list' it
also makes sense for them to be omitted from that of 'machinectl
list', I think.

It's easier to make the case for virsh because the expectation is
that, if they show up, you can perform various operations such as
edit or dumpxml or whatever on them; for machinectl, since the only
operation that seems to work is poweroff anyway, and that's
apparently achieved not by calling libvirt APIs but by killing the
QEMU process directly, the case is not as obvious - but I think
the same rationale still applies.

> > > > Right now we're already doing
> > > > 
> > > >qemu-$domid-$domname
> > > > 
> > > > where I'm not entirely sure how much $domid actually buys us.
> > > 
> > > IIRC $domid was a hack because at one time we had problems with
> > > systemd not cleaning up the transient scope when QEMU was killed.
> > > This would prevent libvirt starting the guest again thereafter.
> > > I can't remember now if this was a bug we fixed in systemd or
> > > libvirt or both or neither.
> 
> It was introduced by bd773e74f0d1d1b9ebbfcaa645178316b4f2265c and the 
> commit message links to this bug: 
> https://bugs.freedesktop.org/show_bug.cgi?id=68370 which looks like it
> was fixed in systemd.

That doesn't look right.

From what I can gather, $domid was added much more recently,
specifically in c3bd0019c0e3f080dbf0d4bd08245ffb2daa2765, in order
to address https://bugzilla.redhat.com/show_bug.cgi?id=1282846.

While I get the overall idea behind the changes, I don't quite
understand the rationale behind adding $domid specifically...

> > I see! It would be neat if we could get rid of it, assuming of course
> > it's no longer needed on the platforms we target.
> 
> I don't think it's that simple. Machinectl imposes some limitations on the
> name: either it has to be a FQDN or a simple name without any dots. And 
> according to our code the strlen() must be <= 64 (don't know if that 
> comes from machinectl or is just our own limitation). Therefore, if you 
> have two domains whose names would clash after our processing, the
> domain ID guarantees unique strings are passed to machined.

... but here it is! Makes sense now :)

[...]
> > > There's no question that we must fix the machined problem.
> 
> I will try to post patches for this.

That'd be awesome! Thanks in advance :)

-- 
Andrea Bolognani / Red Hat / Virtualization



Re: qemu:///embed and isolation from global components

2020-03-11 Thread Michal Privoznik

On 3/10/20 4:42 PM, Andrea Bolognani wrote:

> On Mon, 2020-03-09 at 18:04 +, Daniel P. Berrangé wrote:
> > On Mon, Mar 09, 2020 at 06:09:13PM +0100, Andrea Bolognani wrote:
> > > On Fri, 2020-03-06 at 17:49 +, Daniel P. Berrangé wrote:
> > > > On Fri, Mar 06, 2020 at 06:24:15PM +0100, Andrea Bolognani wrote:
> [...]
> > > Aside: instead of a per-VM setting, would it make sense for this to
> > > be a connection-level setting? That way, even on traditional libvirt
> > > deployments, the hypervisor admin could eg. opt out of machinectl
> > > registration if they so desire; at the same time, you'd avoid the
> > > situation where most VMs are confined using CGroups but a few are
> > > not, which is probably not a desirable scenario.


> > Yes, functionally it would work fine as a connection level setting
> > too, though this hides the behaviour from anyone looking at the
> > guest config. We've previously punted quite a few things to the
> > qemu.conf because we didn't want to go through the process of representing
> > them in the domain XML. This was OK when the qemu.conf settings were
> > something done once at host deployment time.
> >
> > With the embedded driver, I think this is not so desirable, as it means
> > that to get the configuration they want for a VM, they need to deal with
> > two distinct config files. The ideal would be that everything that
> > is commonly needed can be achieved solely in the domain XML, and
> > I think resource control backend is one such common tunable.


> I don't have a strong opinion either way, and as far as my current
> use is concerned it doesn't bother me to have to deal with a second
> configuration file. The reason why I thought a per-VM setting might
> not be desirable is that applications would then be able to
> override it, and so maybe VMs created with virt-manager would be
> registered against machinectl but VMs created using GNOME Boxes
> would not, and if the sysadmin likes to use machinectl to get a
> comprehensive view of the system they'd no longer be guaranteed
> that. But if that's not the kind of scenario we think we should
> prevent, then a per-VM setting is fine by me :)


I still don't quite see the value in machinectl (maybe because I'm not 
using systemd :-D), but anyway - it's a system-wide monitor of virtual 
machines. Therefore it makes sense to register a domain started under 
qemu:///embed there. I don't view embed mode as a way of starting VMs 
secretly. It's a way of starting VMs privately and that's a different 
thing. Other users might learn that my app is running a VM (plain 'ps' 
would give it away), but they cannot tamper with it in any way, e.g.
change its XML.




> [...]
> > > Right now we're already doing
> > >
> > >    qemu-$domid-$domname
> > >
> > > where I'm not entirely sure how much $domid actually buys us.


> > IIRC $domid was a hack because at one time we had problems with
> > systemd not cleaning up the transient scope when QEMU was killed.
> > This would prevent libvirt starting the guest again thereafter.
> > I can't remember now if this was a bug we fixed in systemd or
> > libvirt or both or neither.


It was introduced by bd773e74f0d1d1b9ebbfcaa645178316b4f2265c and the
commit message links to this bug:
https://bugs.freedesktop.org/show_bug.cgi?id=68370 which looks like it
was fixed in systemd.




> I see! It would be neat if we could get rid of it, assuming of course
> it's no longer needed on the platforms we target.


I don't think it's that simple. Machinectl imposes some limitations on the
name: either it has to be a FQDN or a simple name without any dots. And
according to our code the strlen() must be <= 64 (don't know if that
comes from machinectl or is just our own limitation). Therefore, if you
have two domains whose names would clash after our processing, the
domain ID guarantees unique strings are passed to machined.
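
A purely illustrative sketch of why the ID matters, assuming GLib; this
is not the actual libvirt sanitisation code, just the shape of the
problem:

  #include <glib.h>
  #include <string.h>

  /* Squeeze a domain name into something machined will accept (no
   * characters it rejects, bounded length) and prefix it with the
   * driver name and domain ID, so that two names which collapse to the
   * same sanitised form still stay unique. */
  static char *
  exampleMachineName(const char *drv, int id, const char *name)
  {
      g_autofree char *clean = g_strdup(name);
      char *full;

      for (char *p = clean; *p; p++) {
          if (!g_ascii_isalnum(*p) && *p != '-')
              *p = '-';
      }

      full = g_strdup_printf("%s-%d-%s", drv, id, clean);
      if (strlen(full) > 64)
          full[64] = '\0';
      return full;
  }

  int main(void)
  {
      g_autofree char *a = exampleMachineName("qemu", 1, "web.front.end");
      g_autofree char *b = exampleMachineName("qemu", 2, "web_front_end");
      /* both names sanitise to "web-front-end"; only the domain ID keeps
       * the resulting machine names distinct */
      g_print("%s\n%s\n", a, b);
      return 0;
  }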




> [...]

> > > > Of course you can turn off virtlogd via qemu.conf


> > > That's what I'm doing right now and it works well enough, but I'm
> > > afraid that requiring a bunch of setup will discourage developers
> > > from using the embedded driver. We should aim for a reasonable out
> > > of the box experience instead.


> > Why do you need to turn it off though ?  It should already
> > "do the right thing" as the log files should appear under a
> > different location and not have any clash. Turning it off
> > immediately creates a denial of service CVE in your application.


> I was getting the same SELinux denial that Michal reported a few
> days back: virtlogd wants to verify it's being connected to by a
> process running as root, but it's only allowed by the policy to
> look into libvirtd's /proc/$pid for this information. So, for the
> same reason virtqemud can't currently connect to virtlogd when
> SELinux is in enforcing mode, neither can my qemu:///embed-using
> application.
>
> Besides that, there is the fact that a lot of people, mainly those
> coming from a containers background, are not happy with having extra
> daemons running. I'm not saying they would prefer being hit by a DoS
> than having virtlogd running :) but they really, really don't like
> daemons :)


> > None the less, as above I think we need common things

Re: qemu:///embed and isolation from global components

2020-03-11 Thread Christophe de Dinechin
On 9 March 2020 at 14:03, Michal Privoznik wrote:
> 
> On 3/6/20 6:49 PM, Daniel P. Berrangé wrote:
>>> On Fri, Mar 06, 2020 at 06:24:15PM +0100, Andrea Bolognani wrote:
>>> Recently I've been working on integrating qemu:///embed into an
>>> application. It's been reasonably smooth so far :)
>>> 
>>> There's one thing, however, that has caused a bit of confusion, and
>>> I would like to clarify whether my expectations are incorrect, there
>>> are genuine bugs in the implementation that need to be addressed, or
>>> maybe the experimental nature of embedded driver support results in
>>> not all scenarios having yet been considered and catered for.
>>> 
>>> Using virt-qemu-run as an example, when I run it as root I see that
>>> 
>>>   * the domain doesn't show up in the output of 'virsh list' for
>>> qemu:///system;
>> This is good.
>>>   * it does, however, show up in the output of 'machinectl', with
>>> class=vm and service=libvirt-qemu;
>> This is bad. It is one of the gaps we need to deal with.
>> A long while back I proposed a domain XML option for this:
>>   https://www.redhat.com/archives/libvir-list/2017-December/msg00729.html
>> 
>> The "none" case meant inherit the cgroups placement of the parent
>> libvirtd (or now the app using the embedded driver), and would be
>> a reasonable default for the embedded case.
>> For the other cases, we certainly need to do something to ensure
>> uniqueness. This is possibly as simple as including a fixed
>> prefix like "qemu-$PID" where $PID is the app embedding the
>> driver. That said, we support closing the app, and re-starting
>> using the same embedded driver directory, which would change
>> PID.
> 
> Instead of PID we can derive the value from the root path, e.g. i-node of the 
> root dir or some substring of the path hash, say first 6 letters of 
> md5($root_path). This would guarantee the stability of the prefix across app 
> restarts (if we assume all root dirs to be on one FS so we don't have 
> clashing i-node numbers).
> 

A root path hash is also more robust in other cases, like file
copy/rename, which is often used for atomic updates.

> Michal
> 




Re: qemu:///embed and isolation from global components

2020-03-11 Thread Daniel P. Berrangé
On Tue, Mar 10, 2020 at 07:25:46PM +0100, Andrea Bolognani wrote:
> On Tue, 2020-03-10 at 16:00 +, Daniel P. Berrangé wrote:
> > The split daemon model is intended to allow us to address this
> > long standing design flaw, by allowing the QEMU session driver
> > to optionally talk to a secondary driver running with different
> > privileges, instead of the instance running with the same privs.
> > 
> > So currently we have
> > 
> >   
> > 
> >   
> > 
> > This refers to a secondary driver running at the same privilege
> > level.
> > 
> > I expected we'd extend it to allow
> > 
> >   
> > 
> >   
> > 
> > This explicitly requests the secondary driver running with elevated
> > privileges.
> 
> This makes perfect sense to me, and in fact I believe you're
> basically suggesting what I was arguing for earlier :)
> 
> In your scenario, when you don't specify a scope you get the same
> one as the primary driver is using (this matches the current
> behavior): so if you are using qemu:///session, every <interface>
> will use network:///session unless you explicitly specify
> scope='system', at which point network:///system will be used; in
> the same fashion, if you're connected to qemu:///embed, the default
> for <interface>s should be network:///embed, with the possibility
> to use either network:///session (with scope='session') or, most
> likely, network:///system (with scope='system').

No, I'm not talking about using the same URI for the secondary
drivers, I'm talking about using the same *privilege* level.
ie, qemu:///system and a qemu:///embed running as root should
both use the privileged network/storage driver. The qemu:///session
and qemu:///embed running as non-root should both default to
the  unprivileged network/storage drivers.

> > With such functionality present, it logically then also extends to
> > cover connections to daemons running in different namespaces.
> 
> I'm still unclear on how this scenario, which would apparently have
> multiple eg. privileged virtnetworkd, running at the same time but in
> different network namespaces, would work.

There would need to be some selection method, most likely a way
to explicitly specify the socket path, but this is a longish way
into the future.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|



Re: qemu:///embed and isolation from global components

2020-03-10 Thread Andrea Bolognani
On Tue, 2020-03-10 at 16:00 +, Daniel P. Berrangé wrote:
> The split daemon model is intended to allow us to address this
> long standing design flaw, by allowing the QEMU session driver
> to optionally talk to a secondary driver running with different
> privileges, instead of the instance running with the same privs.
> 
> So currently we have
> 
>   
> 
>   
> 
> This refers to a secondary driver running at the same privilege
> level.
> 
> I expected we'd extend it to allow
> 
>   
> 
>   
> 
> This explicitly requests the secondary driver running with elevated
> privileges.

This makes perfect sense to me, and in fact I believe you're
basically suggesting what I was arguing for earlier :)

In your scenario, when you don't specify a scope you get the same
one as the primary driver is using (this matches the current
behavior): so if you are using qemu:///session, every <interface>
will use network:///session unless you explicitly specify
scope='system', at which point network:///system will be used; in
the same fashion, if you're connected to qemu:///embed, the default
for <interface>s should be network:///embed, with the possibility
to use either network:///session (with scope='session') or, most
likely, network:///system (with scope='system').

virtlogd would probably have to be treated a bit differently since
it needs to be running even before the first guest is started, but
other than that the same general principle should apply.

> With such functionality present, it logically then also extends to
> cover connections to daemons running in different namespaces.

I'm still unclear on how this scenario, which would apparently have
multiple eg. privileged virtnetworkd, running at the same time but in
different network namespaces, would work.

-- 
Andrea Bolognani / Red Hat / Virtualization



Re: qemu:///embed and isolation from global components

2020-03-10 Thread Daniel P. Berrangé
On Tue, Mar 10, 2020 at 04:42:57PM +0100, Andrea Bolognani wrote:
> On Mon, 2020-03-09 at 18:04 +, Daniel P. Berrangé wrote:
> > Of course when we do connect to virtnetworkd, we MUST ensure that
> > anything we do preserves isolation from other QEMU driver instances.
> > 
> > I would also note that the virtnetworkd daemon connected to, may
> > or may not actually be the same as the host OS one. It is entirely
> > possible that the application has opened the embedded QEMU driver
> > from within a namespace, that results in a connection to the
> > /var/run/libvirt/virtnetworkd being serviced by a completely isolated
> > virtnetworkd running inside a different network namespace from the
> > host OS virtnetworkd. This is of no concern to the QEMU driver
> > though - it just needs to honour what is in the domain XML.
> 
> This kind of setup sounds extremely confusing.
> 
> Would you have multiple instances of virtnetworkd, one per network
> namespace, all running at the same time? How would the application
> pick the correct one to connect to?

I forgot to mention that this is actually a scenario we'd like to
support even ignoring namespaces.  The big pain point for desktop
virt apps like Boxes or Virt-manager is that the QEMU session
driver lacked any sane networking connectivity. For this reason
we invented the QEMU setuid network helper which runs as root
and opens a TAP device connected to a bridge, passing it back to
unprivileged libvirtd. This is really crude and not a general
solution to the problem.

The key design flaw in session libvirtd was tying together the
privileges of the virt driver and the secondary drivers. So we
had a 1-1 relation between them. What we really need to have is
an N-N relation between them.

The virtual network driver as it exists today, requires elevated
privileges, but if we had a NetworkManager backed impl it could
work unprivileged. The nwfilter driver requires privileges.
The storage driver is a little different because some backends
(directory backend) work when unprivileged, but other backends
(SCSI, LVM, disk) only ever work when privileged.

The split daemon model is intended to allow us to address this
long standing design flaw, by allowing the QEMU session driver
to optionally talk to a secondary driver running with different
privileges, instead of the instance running with the same privs.

So currently we have

  

  

This refers to a secondary driver running at the same privilege
level.

I expected we'd extend it to allow

  

  

This explicitly requests the secondary driver running with elevated
privileges.

The same concept for disk storage

 

 

Would be extended to allow

 

 

The same for host devices, and for nwfilter.
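
(The XML snippets in this mail did not survive the archive. A hedged
reconstruction of the idea, reusing today's <interface>/<disk> syntax
plus the scope attribute discussed elsewhere in the thread; exactly
where the attribute would live is an assumption:)

  <!-- current form: secondary driver at the caller's own privilege level -->
  <interface type='network'>
    <source network='default'/>
  </interface>

  <!-- proposed form: explicitly request the privileged network driver -->
  <interface type='network'>
    <source network='default' scope='system'/>
  </interface>

  <!-- the same concept applied to a storage volume -->
  <disk type='volume' device='disk'>
    <source pool='default' volume='guest.img' scope='system'/>
    <target dev='vda' bus='virtio'/>
  </disk>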

With such functionality present, it logically then also extends to
cover connections to daemons running in different namespaces.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|



Re: qemu:///embed and isolation from global components

2020-03-10 Thread Andrea Bolognani
On Mon, 2020-03-09 at 18:04 +, Daniel P. Berrangé wrote:
> On Mon, Mar 09, 2020 at 06:09:13PM +0100, Andrea Bolognani wrote:
> > On Fri, 2020-03-06 at 17:49 +, Daniel P. Berrangé wrote:
> > > On Fri, Mar 06, 2020 at 06:24:15PM +0100, Andrea Bolognani wrote:
[...]
> > Aside: instead of a per-VM setting, would it make sense for this to
> > be a connection-level setting? That way, even on traditional libvirt
> > deployments, the hypervisor admin could eg. opt out of machinectl
> > registration if they so desire; at the same time, you'd avoid the
> > situation where most VMs are confined using CGroups but a few are
> > not, which is probably not a desirable scenario.
> 
> Yes, functionally it would work fine as a connection level setting
> too, though this hides the behaviour from anyone looking at the
> guest config. We've previously punted quite a few things to the
> qemu.conf because we didn't want to go through the process of representing
> them in the domain XML. This was OK when the qemu.conf settings were
> something done once at host deployment time.
> 
> With the embedded driver, I think this is not so desirable, as it means
> that to get the configuration they want for a VM, they need to deal with
> two distinct config files. The ideal would be that everything that
> is commonly needed can be achieved solely in the domain XML, and
> I think resource control backend is one such common tunable.

I don't have a strong opinion either way, and as far as my current
use is concerned it doesn't bother me to have to deal with a second
configuration file. The reason why I thought a per-VM setting might
not be desirable is that applications would then be able to
override it, and so maybe VMs created with virt-manager would be
registered against machinectl but VMs created using GNOME Boxes
would not, and if the sysadmin likes to use machinectl to get a
comprehensive view of the system they'd no longer be guaranteed
that. But if that's not the kind of scenario we think we should
prevent, then a per-VM setting is fine by me :)

[...]
> > Right now we're already doing
> > 
> >   qemu-$domid-$domname
> > 
> > where I'm not entirely sure how much $domid actually buys us.
> 
> IIRC $domid was a hack because at one time we had problems with
> systemd not cleaning up the transient scope when QEMU was killed.
> This would prevent libvirt starting the guest again thereafter.
> I can't remember now if this was a bug we fixed in systemd or
> libvirt or both or neither.

I see! It would be neat if we could get rid of it, assuming of course
it's no longer needed on the platforms we target.

[...]
> > > Of course you can turn off virtlogd via qemu.conf
> > 
> > That's what I'm doing right now and it works well enough, but I'm
> > afraid that requiring a bunch of setup will discourage developers
> > from using the embedded driver. We should aim for a reasonable out
> > of the box experience instead.
> 
> Why do you need to turn it off though ?  It should already
> "do the right thing" as the log files should appear under a
> different location and not have any clash. Turning it off
> immediately creates a denial of service CVE in your application.

I was getting the same SELinux denial that Michal reported a few
days back: virtlogd wants to verify it's being connected to by a
process running as root, but it's only allowed by the policy to
look into libvirtd's /proc/$pid for this information. So, for the
same reason virtqemud can't currently connect to virtlogd when
SELinux is in enforcing mode, neither can my qemu:///embed-using
application.

Besides that, there is the fact that a lot of people, mainly those
coming from a containers background, are not happy with having extra
daemons running. I'm not saying they would prefer being hit by a DoS
than having virtlogd running :) but they really, really don't like
daemons :)

> None the less, as above I think we need common things to be
> controllable via the domain XML. So either we need to make a
> tunable there for use of logd or not, or we need to implement
> the FIFO idea to avoid need for logd at all.

It seems like the FIFO idea (though I'll admit I don't fully
understand it) would be the best of both worlds.

[...]
> > > If you don't want to use virtnetworkd, then you won't be
> > > creating such an  config in the first place.
> > > The app will have the option to open an embedded seconary
> > > driver if desired. Some of the drivers won't really make
> > > sense as embedded things though, at least not without
> > > extra work. ie an embedded network or nwfilter driver has
> > > no value unless your app has moved into a new network
> > > namespace, as otherwise it will just fight with the
> > > global network driver.
> > 
> > Again, I think our defaults for qemu:///embed should be consistent
> > and conservative: instead of having developers opt out of using
> > network:///system, they should have to opt in before global
> > resources are involved.
> 
> They already opt-in 

Re: qemu:///embed and isolation from global components

2020-03-09 Thread Daniel P. Berrangé
On Mon, Mar 09, 2020 at 06:09:13PM +0100, Andrea Bolognani wrote:
> On Fri, 2020-03-06 at 17:49 +, Daniel P. Berrangé wrote:
> > On Fri, Mar 06, 2020 at 06:24:15PM +0100, Andrea Bolognani wrote:
> > >   * it does, however, show up in the output of 'machinectl', with
> > > class=vm and service=libvirt-qemu;
> > 
> > This is bad. It is one of the gaps we need to deal with.
> > 
> > A long while back I proposed a domain XML option for this:
> > 
> >   https://www.redhat.com/archives/libvir-list/2017-December/msg00729.html
> > 
> > 
> > 
> > The "none" case meant inherit the cgroups placement of the parent
> > libvirtd (or now the app using the embedded driver), and would be
> > a reasonable default for the embedded case.
> 
> Agreed. In my case I'll want to use "direct" instead, but "none"
> would indeed be a sane default for the embedded driver.

Yes, "none" makes sense for the emedded driver, as QEMU is intended
to be able to inherit process execution context from the calling
application. IOW, we must inherit cgroups placement, and not create
things under /machine.slice by default.

> Aside: instead of a per-VM setting, would it make sense for this to
> be a connection-level setting? That way, even on traditional libvirt
> deployments, the hypervisor admin could eg. opt out of machinectl
> registration if they so desire; at the same time, you'd avoid the
> situation where most VMs are confined using CGroups but a few are
> not, which is probably not a desirable scenario.

Yes, functionally it would work fine as a connection level setting
too, though this hides the behaviour from anyone looking at the
guest config. We've previously punted quite a few things to the
qemu.conf because we didn't want to go through the process of representing
them in the domain XML. This was OK when the qemu.conf settings were
something done once at host deployment time.

With the embedded driver, I think this is not so desirable, as it means
that to get the configuration they want for a VM, they need to deal with
two distinct config files. The ideal would be that everything that
is commonly needed can be achieved solely in the domain XML, and
I think resource control backend is one such common tunable.

> > For the other cases, we certainly need to do something to ensure
> > uniqueness. This is possibly as simple as including a fixed
> > prefix like "qemu-$PID" where $PID is the app embedding the
> > driver. That said, we support closing the app, and re-starting
> > using the same embedded driver directory, which would change
> > PID.
> 
> Right now we're already doing
> 
>   qemu-$domid-$domname
> 
> where I'm not entirely sure how much $domid actually buys us.

IIRC $domid was a hack because at one time we had problems with
systemd not cleaning up the transient scope when QEMU was killed.
This would prevent libvirt starting the guest again thereafter.
I can't remember now if this was a bug we fixed in systemd or
libvirt or both or neither.

> I think machine ids need to be unique per host, not per service,
> which is kind of a bummer because the obvious choice would be to
> generate a service name based on the embedded root... Michal already
> suggested another solution, perhaps that one is more viable.

Yes, they definitely need to be unique globally, but as agreed
above, we should not create cgroups by default at all - we must
inherit cgroups placement.


> > >   * virtlogd is automatically spawned, if not already active, to
> > > take care of the domain's log file.
> > 
> > This is trickier. The use of virtlogd was to fix a security DoS
> > where malicious QEMU can write to serial console, or trigger
> > QEMU to write to stdout/err, such that it exhausts the host
> > filesystem.  IIUC, virtlogd should still end up writing to
> > the logfile path associated with the embedded  driver root
> > directory prefix, so there shouldn't be a filename clash
> > unless I screwed up.
> > 
> > Since introducing virtlogd, however, I did think of a different
> > strategy, which would be to connect a FIFO to QEMU as the log
> > file FD. The QEMU driver itself can own the other end of the FIFO
> > and do rollover.
> > 
> > Of course you can turn off virtlogd via qemu.conf
> 
> That's what I'm doing right now and it works well enough, but I'm
> afraid that requiring a bunch of setup will discourage developers
> from using the embedded driver. We should aim for a reasonable out
> of the box experience instead.

Why do you need to turn it off though ?  It should already
"do the right thing" as the log files should appear under a
different location and not have any clash. Turning it off
immediately creates a denial of service CVE in your application.

None the less, as above I think we need common things to be
controllable via the domain XML. So either we need to make a
tunable there for use of logd or not, or we need to implement
the FIFO idea to avoid need for logd at all.

> > > The first one is expected, but the other two were a 

Re: qemu:///embed and isolation from global components

2020-03-09 Thread Andrea Bolognani
On Fri, 2020-03-06 at 17:49 +, Daniel P. Berrangé wrote:
> On Fri, Mar 06, 2020 at 06:24:15PM +0100, Andrea Bolognani wrote:
> >   * it does, however, show up in the output of 'machinectl', with
> > class=vm and service=libvirt-qemu;
> 
> This is bad. It is one of the gaps we need to deal with.
> 
> A long while back I proposed a domain XML option for this:
> 
>   https://www.redhat.com/archives/libvir-list/2017-December/msg00729.html
> 
> 
> 
> The "none" case meant inherit the cgroups placement of the parent
> libvirtd (or now the app using the embedded driver), and would be
> a reasonable default for the embedded case.

Agreed. In my case I'll want to use "direct" instead, but "none"
would indeed be a sane default for the embedded driver.

Aside: instead of a per-VM setting, would it make sense for this to
be a connection-level setting? That way, even on traditional libvirt
deployments, the hypervisor admin could eg. opt out of machinectl
registration if they so desire; at the same time, you'd avoid the
situation where most VMs are confined using CGroups but a few are
not, which is probably not a desirable scenario.

> For the other cases, we certainly need to do something to ensure
> uniqueness. This is possibly as simple as including a fixed
> prefix like "qemu-$PID" where $PID is the app embedding the
> driver. That said, we support closing the app, and re-starting
> using the same embedded driver directory, which would change
> PID.

Right now we're already doing

  qemu-$domid-$domname

where I'm not entirely sure how much $domid actually buys us.

I think machine ids need to be unique per host, not per service,
which is kind of a bummer because the obvious choice would be to
generate a service name based on the embedded root... Michal already
suggested another solution, perhaps that one is more viable.

Anyway, I think it's reasonable to expect that, when it comes to VMs
created via the embedded driver, just as you'd not be able to
control them via virsh, you'd also not be able to do so via
machinectl, so I'm not too concerned about this once we flip the
default to "none" as discussed above.

> >   * virtlogd is automatically spawned, if not already active, to
> > take care of the domain's log file.
> 
> This is trickier. The use of virtlogd was to fix a security DoS
> where malicious QEMU can write to serial console, or trigger
> QEMU to write to stdout/err, such that it exhausts the host
> filesystem.  IIUC, virtlogd should still end up writing to
> the logfile path associated with the embedded  driver root
> directory prefix, so there shouldn't be a filename clash
> unless I screwed up.
> 
> Since introducing virtlogd, however, I did think of a different
> strategy, which would be to connect a FIFO to QEMU as the log
> file FD. The QEMU driver itself can own the other end of the FIFO
> and do rollover.
> 
> Of course you can turn off virtlogd via qemu.conf

That's what I'm doing right now and it works well enough, but I'm
afraid that requiring a bunch of setup will discourage developers
from using the embedded driver. We should aim for a reasonable out
of the box experience instead.

> > The first one is expected, but the other two were a surprise, at
> > least to me. Basically, what I would expect is that qemu:///embed
> > would work in a completely isolated way, only interacting with
> > system-wide components when that's explicitly requested.
> 
> The goal wasn't explicitly to avoid system-wide components,
> but it certainly was intended to avoid clashing resources.
> 
> IOW, machined, virtlogd are both valid to use, as long as
> the resource clashes are avoided. We should definitely have
> a way to disable these too.

I'd argue that most users of the embedded driver would probably
prefer it didn't interact with system-wide components: if that
wasn't the case, they'd just use qemu:///system or qemu:///session
instead.

Having a way to turn off those behaviors would certainly be a step
in the right direction, but I think ultimately we want to be in a
situation where developers opt in rather than out of them.

> > In other words, I would expect virtlogd not to be spawned, and the
> > domain not to be registered with machinectl; at the same time, if
> > the domain configuration is such that it contains for example
> > 
> >   
> > 
> > 
> >   
> > 
> > then I would expect to see a failure unless a connection to
> > network:///system had been explicitly and pre-emptively opened, and
> > not the current behavior which apparently is to automatically open
> > that connection and spawning virtnetworkd as a result.
> 
> The QEMU embedded driver is explicitly intended to be able to
> interact with other global secondary drivers.
> 
> If you don't want to use virtnetworkd, then you won't be
> creating such an <interface> config in the first place.
> The app will have the option to open an embedded secondary
> driver if desired. Some of the drivers won't really make
> sense as embedded 

Re: qemu:///embed and isolation from global components

2020-03-09 Thread Michal Privoznik

On 3/6/20 6:49 PM, Daniel P. Berrangé wrote:

> On Fri, Mar 06, 2020 at 06:24:15PM +0100, Andrea Bolognani wrote:
> > Recently I've been working on integrating qemu:///embed into an
> > application. It's been reasonably smooth so far :)
> >
> > There's one thing, however, that has caused a bit of confusion, and
> > I would like to clarify whether my expectations are incorrect, there
> > are genuine bugs in the implementation that need to be addressed, or
> > maybe the experimental nature of embedded driver support results in
> > not all scenarios having yet been considered and catered for.
> >
> > Using virt-qemu-run as an example, when I run it as root I see that
> >
> >   * the domain doesn't show up in the output of 'virsh list' for
> >     qemu:///system;
>
> This is good.
>
> >   * it does, however, show up in the output of 'machinectl', with
> >     class=vm and service=libvirt-qemu;


> This is bad. It is one of the gaps we need to deal with.
>
> A long while back I proposed a domain XML option for this:
>
>   https://www.redhat.com/archives/libvir-list/2017-December/msg00729.html
>
> The "none" case meant inherit the cgroups placement of the parent
> libvirtd (or now the app using the embedded driver), and would be
> a reasonable default for the embedded case.
>
> For the other cases, we certainly need to do something to ensure
> uniqueness. This is possibly as simple as including a fixed
> prefix like "qemu-$PID" where $PID is the app embedding the
> driver. That said, we support closing the app, and re-starting
> using the same embedded driver directory, which would change
> PID.


Instead of PID we can derive the value from the root path, e.g. i-node 
of the root dir or some substring of the path hash, say first 6 letters 
of md5($root_path). This would guarantee the stability of the prefix 
across app restarts (if we assume all root dirs to be on one FS so we 
don't have clashing i-node numbers).
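
A minimal sketch of that idea, assuming GLib; the function name is made
up for illustration:

  #include <glib.h>

  /* Derive a short, stable per-instance prefix from the embed root
   * path: the first 6 hex characters of md5(root), as suggested above.
   * The same root always yields the same prefix, so it survives app
   * restarts. */
  static char *
  examplePrefixFromRoot(const char *root_path)
  {
      g_autofree char *sum =
          g_compute_checksum_for_string(G_CHECKSUM_MD5, root_path, -1);
      return g_strndup(sum, 6);
  }

  int main(void)
  {
      g_autofree char *p1 = examplePrefixFromRoot("/var/lib/myapp/instance1");
      g_autofree char *p2 = examplePrefixFromRoot("/var/lib/myapp/instance2");
      g_print("%s %s\n", p1, p2);   /* different roots, different prefixes */
      return 0;
  }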


Michal



Re: qemu:///embed and isolation from global components

2020-03-06 Thread Daniel P. Berrangé
On Fri, Mar 06, 2020 at 06:24:15PM +0100, Andrea Bolognani wrote:
> Recently I've been working on integrating qemu:///embed into an
> application. It's been reasonably smooth so far :)
> 
> There's one thing, however, that has caused a bit of confusion, and
> I would like to clarify whether my expectations are incorrect, there
> are genuine bugs in the implementation that need to be addressed, or
> maybe the experimental nature of embedded driver support results in
> not all scenarios having yet been considered and catered for.
> 
> Using virt-qemu-run as an example, when I run it as root I see that
> 
>   * the domain doesn't show up in the output of 'virsh list' for
> qemu:///system;

This is good.

>   * it does, however, show up in the output of 'machinectl', with
> class=vm and service=libvirt-qemu;

This is bad. It is one of the gaps we need to deal with.

A long while back I proposed a domain XML option for this:

  https://www.redhat.com/archives/libvir-list/2017-December/msg00729.html



The "none" case meant inherit the cgroups placement of the parent
libvirtd (or now the app using the embedded driver), and would be
a reasonable default for the embedded case.

For the other cases, we certainly need to do something to ensure
uniqueness. This is possibly as simple as including a fixed
prefix like "qemu-$PID" where $PID is the app embedding the
driver. That said, we support closing the app, and re-starting
using the same embedded driver directory, which would change
PID.

>   * virtlogd is automatically spawned, if not already active, to
> take care of the domain's log file.

This is trickier. The use of virtlogd was to fix a security DoS
where malicious QEMU can write to serial console, or trigger
QEMU to write to stdout/err, such that it exhausts the host
filesystem.  IIUC, virtlogd should still end up writing to
the logfile path associated with the embedded driver root
directory prefix, so there shouldn't be a filename clash
unless I screwed up.

Since introducing virtlogd, however, I did think of a different
strategy, which would be to connect a FIFO to QEMU as the log
file FD. The QEMU driver itself can own the other end of the FIFO
and do rollover.
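
A rough sketch of that FIFO idea, with a fork()ed child standing in for
QEMU; the paths, the rollover threshold and the (absent) error handling
are all illustrative only:

  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/stat.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>

  #define ROLLOVER_SIZE 64   /* tiny threshold so the demo actually rotates */

  int main(void)
  {
      const char *fifo = "demo-qemu.fifo";   /* illustrative paths only */
      const char *logname = "demo-qemu.log";
      char buf[4096];
      ssize_t n;
      off_t written = 0;

      unlink(fifo);
      if (mkfifo(fifo, 0600) < 0) {
          perror("mkfifo");
          return 1;
      }

      pid_t child = fork();
      if (child == 0) {
          /* stand-in for QEMU: write some "console output" to the FIFO */
          int wfd = open(fifo, O_WRONLY);
          for (int i = 0; i < 10; i++) {
              char line[64];
              int len = snprintf(line, sizeof(line), "guest console line %d\n", i);
              write(wfd, line, len);
          }
          close(wfd);
          _exit(0);
      }

      /* driver side: own the read end of the FIFO and roll the log file
       * over once it grows past the threshold, so a chatty guest cannot
       * fill the disk with one ever-growing file */
      int rfd = open(fifo, O_RDONLY);
      int log = open(logname, O_WRONLY | O_CREAT | O_APPEND, 0600);

      while ((n = read(rfd, buf, sizeof(buf))) > 0) {
          if (written + n > ROLLOVER_SIZE) {
              close(log);
              rename(logname, "demo-qemu.log.1");
              log = open(logname, O_WRONLY | O_CREAT | O_TRUNC, 0600);
              written = 0;
          }
          written += write(log, buf, n);
      }

      close(rfd);
      close(log);
      waitpid(child, NULL, 0);
      return 0;
  }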

Of course you can turn off virtlogd via qemu.conf

> The first one is expected, but the other two were a surprise, at
> least to me. Basically, what I would expect is that qemu:///embed
> would work in a completely isolated way, only interacting with
> system-wide components when that's explicitly requested.

The goal wasn't explicitly to avoid system-wide components,
but it certainly was intended to avoid clashing resources.

IOW, machined, virtlogd are both valid to use, as long as
the resource clashes are avoided. We should definitely have
a way to disable these too.

> In other words, I would expect virtlogd not to be spawned, and the
> domain not to be registered with machinectl; at the same time, if
> the domain configuration is such that it contains for example
> 
>   
> 
> 
>   
> 
> then I would expect to see a failure unless a connection to
> network:///system had been explicitly and pre-emptively opened, and
> not the current behavior which apparently is to automatically open
> that connection and spawning virtnetworkd as a result.

The QEMU embedded driver is explicitly intended to be able to
interact with other global secondary drivers.

If you don't want to use virtnetworkd, then you won't be
creating such an <interface> config in the first place.
The app will have the option to open an embedded secondary
driver if desired. Some of the drivers won't really make
sense as embedded things though, at least not without
extra work. ie an embedded network or nwfilter driver has
no value unless your app has moved into a new network
namespace, as otherwise it will just fight with the
global network driver.

> As a corollary to this, I would expect two qemu:///embed
> connections using different roots to be able to define domains with
> the same name without any problem, but as it stands this is not
> possible because machinectl will refuse to register the second one.
> There might be other components that freak out as well, that's just
> the first one that blocked me.

Yes, that is definitely needed to work for resources created by
those drivers, so we must solve the name clashes for example.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|



qemu:///embed and isolation from global components

2020-03-06 Thread Andrea Bolognani
Recently I've been working on integrating qemu:///embed into an
application. It's been reasonably smooth so far :)

There's one thing, however, that has caused a bit of confusion, and
I would like to clarify whether my expectations are incorrect, there
are genuine bugs in the implementation that need to be addressed, or
maybe the experimental nature of embedded driver support results in
not all scenarios having yet been considered and catered for.

Using virt-qemu-run as an example, when I run it as root I see that

  * the domain doesn't show up in the output of 'virsh list' for
qemu:///system;

  * it does, however, show up in the output of 'machinectl', with
class=vm and service=libvirt-qemu;

  * virtlogd is automatically spawned, if not already active, to
take care of the domain's log file.

The first one is expected, but the other two were a surprise, at
least to me. Basically, what I would expect is that qemu:///embed
would work in a completely isolated way, only interacting with
system-wide components when that's explicitly requested.

In other words, I would expect virtlogd not to be spawned, and the
domain not to be registered with machinectl; at the same time, if
the domain configuration is such that it contains for example

  


  

then I would expect to see a failure unless a connection to
network:///system had been explicitly and pre-emptively opened, and
not the current behavior, which apparently is to automatically open
that connection and spawn virtnetworkd as a result.
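
(The configuration snippet stripped from the example above was
presumably a plain network interface definition along these lines,
shown here only to make the scenario concrete:)

  <interface type='network'>
    <source network='default'/>
  </interface>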

As a corollary to this, I would expect two qemu:///embed
connections using different roots to be able to define domains with
the same name without any problem, but as it stands this is not
possible because machinectl will refuse to register the second one.
There might be other components that freak out as well, that's just
the first one that blocked me.

Are my expectations unreasonable? Or should we work towards making
sure uses of qemu:///embed don't touch global resources and don't
clash with each other?

-- 
Andrea Bolognani / Red Hat / Virtualization