[libvirt] Using external ceph.conf for RBD pools and disks

2013-11-01 Thread Michael Chapman

Hi all,

At the moment, RBD storage pools in libvirt must be supplied with a list 
of Ceph monitor addresses, using <host> elements in the pool's source 
definition. Ceph itself has a configuration file, and this is used by 
default by all Ceph command-line utilities. This file can contain the 
monitor addresses for the cluster, as well as a bunch of other useful 
options (e.g. for tuning and debugging).
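
As a rough illustration of the kind of file meant here, the following Python sketch pulls monitor addresses out of a small ceph.conf-style snippet. It is only a sketch under assumptions: the sample contents and addresses are made up, the helper name monitor_hosts is hypothetical, and it assumes the file is simple enough for the standard configparser module, which a real ceph.conf may not be.

```python
import configparser

# Made-up sample contents; the addresses are documentation-only examples.
SAMPLE_CEPH_CONF = """
[global]
mon host = 192.0.2.10:6789, 192.0.2.11:6789
debug ms = 1
"""

def monitor_hosts(conf_text):
    """Return the monitor addresses listed in the [global] section."""
    # Only '=' is treated as a delimiter so that host:port values stay intact.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(conf_text)
    # Ceph accepts "mon host" and "mon_host" as the same option name,
    # so spaces are normalised to underscores before the lookup.
    options = {key.replace(" ", "_"): value
               for key, value in parser["global"].items()}
    raw = options.get("mon_host", "")
    return [addr.strip() for addr in raw.split(",") if addr.strip()]

print(monitor_hosts(SAMPLE_CEPH_CONF))
```

The other keys in the file (here "debug ms") are the tuning and debugging options that a list of monitor addresses alone cannot carry.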


I think it would be nice if libvirt were able to load in this file when 
starting RBD storage pools. Before I send some patches through, however, I 
thought I'd better check to see whether my approach is sound.


First, I am not keen on having libvirt get librados to load the 
configuration file automatically. librados actually uses a search path to 
find the configuration file, and that path includes silly things like the 
current working directory. Since it can be told to load a single file, I 
think it would be better if it were made explicit in the storage pool XML, 
i.e.:


  
rbd

  rbd
  
  

  

  

The new configuration-file element would be able to be used in addition
to, or as an alternative to, a list of <host> elements. Would something
along these lines be suitable? Would it be better to use the element's
text content as the filename, rather than use an attribute? I'm not sure
what style guidelines there are for something like this.


The second part is of course to make a similar change to RBD-based domain 
disk definitions, i.e.:


  ...
  


  



  

  
  ...

Again, the new configuration-file element could be used instead of or
alongside some <host> elements.


This is where it gets a little tricky. At the moment, <host> in a disk's 
source definition is entirely optional. Furthermore, QEMU _always_ loads a 
Ceph configuration file -- either one supplied as a "conf" argument for 
the block device, or one found through the search path mentioned earlier. 
The only way to suppress this is to pass conf=/dev/null... but for 
backwards-compatibility (users may be relying on QEMU's use of the search 
path), I don't think we can do this now.


There's one final gotcha in all of this: if QEMU is given both a "conf" 
argument and a "mon_addr" argument, only the latter will take effect. This 
means if both the configuration-file element and <host> elements are
supplied, then the <host> elements will override any monitor addresses
from the configuration file.
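
The precedence described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this mail, not code from QEMU or libvirt; the function name effective_monitors and the addresses are hypothetical.

```python
def effective_monitors(conf_monitors, xml_hosts):
    """Return the monitors a client would use under the stated precedence."""
    # Hosts given explicitly in the XML win outright: the list from the
    # configuration file is ignored, not merged.
    if xml_hosts:
        return list(xml_hosts)
    # With no explicit hosts, the configuration file is the only source.
    return list(conf_monitors)

# Both sources present: only the XML hosts take effect.
print(effective_monitors(["192.0.2.10:6789"], ["198.51.100.5:6789"]))
# Only the configuration file present: its monitors are used.
print(effective_monitors(["192.0.2.10:6789"], []))
```

Other options from the configuration file would still apply in both cases; only the monitor list is replaced.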


For consistency, I intend to make an RBD storage pool have the same 
behaviour. However, would it perhaps be better if the user could only 
choose _either_ the configuration-file element or a list of <host>
elements? Personally, I don't 
think it's a big deal if the behaviour is clearly documented -- being able 
to load options from a config file while still defining hosts in the 
libvirt XML could be useful.


Anyway, before I send my patches through I'm interested in hearing 
people's thoughts on this. All sound sane? Too intrusive? A waste of time? 
:-)


- Michael

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] Using external ceph.conf for RBD pools and disks

2013-11-01 Thread Daniel P. Berrange
On Sat, Nov 02, 2013 at 12:18:17AM +1100, Michael Chapman wrote:
> Hi all,
> 
> At the moment, RBD storage pools in libvirt must be supplied with a
> list of Ceph monitor addresses, using <host> elements in the pool's
> source definition. Ceph itself has a configuration file, and this is
> used by default by all Ceph command-line utilities. This file can
> contain the monitor addresses for the cluster, as well as a bunch of
> other useful options (e.g. for tuning and debugging).
> 
> I think it would be nice if libvirt were able to load in this file
> when starting RBD storage pools. Before I send some patches through,
> however, I thought I'd better check to see whether my approach is
> sound.
> 
> First, I am not keen on having libvirt get librados to load the
> configuration file automatically. librados actually uses a search
> path to find the configuration file, and that path includes silly
> things like the current working directory. Since it can be told to
> load a single file, I think it would be better if it were made
> explicit in the storage pool XML, i.e.:
> 
>   
> rbd
> 
>   rbd
>   
>   
> 
>   
> 
>   
> 
> The new configuration-file element would be able to be used in
> addition to, or as an alternative to, a list of <host> elements.
> Would something along these lines be suitable? Would it be better to
> use the element's text content as the filename, rather than use an
> attribute? I'm not sure what style guidelines there are for
> something like this.
> 
> The second part is of course to make a similar change to RBD-based
> domain disk definitions, i.e.:
> 
>   ...
>   
> 
> 
>   
> 
> 
> 
>   
> 
>   
>   ...
> 
> Again, the new configuration-file element could be used instead of
> or alongside some <host> elements.
> 
> This is where it gets a little tricky. At the moment, <host> in a
> disk's source definition is entirely optional. Furthermore, QEMU
> _always_ loads a Ceph configuration file -- either one supplied as a
> "conf" argument for the block device, or one found through the
> search path mentioned earlier. The only way to suppress this is to
> pass conf=/dev/null... but for backwards-compatibility (users may be
> relying on QEMU's use of the search path), I don't think we can do
> this now.
> 
> There's one final gotcha in all of this: if QEMU is given both a
> "conf" argument and a "mon_addr" argument, only the latter will take
> effect. This means if both the configuration-file element and <host>
> elements are supplied, then the <host> elements will override any
> monitor addresses from the configuration file.
> 
> For consistency, I intend to make an RBD storage pool have the same
> behaviour. However, would it perhaps be better if the user could
> only choose _either_ the configuration-file element or a list of
> <host> elements?
> Personally, I don't think it's a big deal if the behaviour is
> clearly documented -- being able to load options from a config file
> while still defining hosts in the libvirt XML could be useful.
> 
> Anyway, before I send my patches through I'm interested in hearing
> people's thoughts on this. All sound sane? Too intrusive? A waste of
> time? :-)

We have always taken the position that we do not want to rely on host
configuration in this way. The goal of the XML configs is that they
fully describe the functional setup of the resource in question. This
is to ensure that if you put the same XML config on two different hosts
you can be sure that they will operate in the same way. If you leave out
a bunch of config information and rely on the host ceph.conf file, then
you can no longer ever be sure if two hosts are configured the same way
with libvirt. 

This is why we do not support use of the dnsmasq.conf file for configuring
virtual networks, and why we disable use of the /etc/qemu configuration
files for configuring guests. I don't think ceph is special here, so I'd
be against relying on an external ceph.conf file too.

Regards,
Daniel
-- 
|: http://berrange.com  -o-  http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [libvirt] Using external ceph.conf for RBD pools and disks

2013-11-01 Thread Michael Chapman

On Fri, 1 Nov 2013, Daniel P. Berrange wrote:

> We have always taken the position that we do not want to rely on host
> configuration in this way. The goal of the XML configs is that they
> fully describe the functional setup of the resource in question. This
> is to ensure that if you put the same XML config on two different hosts
> you can be sure that they will operate in the same way. If you leave out
> a bunch of config information and rely on the host ceph.conf file, then
> you can no longer ever be sure if two hosts are configured the same way
> with libvirt.


I suspected that might be the case -- half the reason I sent my email, 
really!


If it's desirable to not rely on any host configuration at all, should we 
be explicitly passing conf=/dev/null to QEMU when setting up an RBD 
device? As I mentioned before, without that QEMU will implicitly try to 
find a system ceph.conf file using a built-in librados search path. Would 
this actually be a backwards-incompatible change given it was never 
documented by libvirt?
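
To make the conf=/dev/null idea concrete, here is a minimal Python sketch of building an rbd drive string with the configuration file pinned explicitly. It is not libvirt's code: the helper name rbd_drive_spec and the image name are hypothetical, and the "rbd:pool/image:key=value" layout with backslash-escaped colons is my understanding of QEMU's option syntax, so it should be checked against the QEMU documentation.

```python
def rbd_drive_spec(pool, image, options):
    """Build an 'rbd:pool/image:key=value...' string from opaque options."""
    parts = [f"rbd:{pool}/{image}"]
    for key, value in options.items():
        # ':' separates options in this syntax, so a literal colon inside
        # a value is escaped with a backslash (assumed escaping rule).
        parts.append(f"{key}={value.replace(':', chr(92) + ':')}")
    return ":".join(parts)

# Pinning the configuration file to /dev/null suppresses the search path.
print(rbd_drive_spec("rbd", "guest-disk", {"conf": "/dev/null"}))
# prints: rbd:rbd/guest-disk:conf=/dev/null
```

With the file pinned like this, every setting the client needs has to be passed as an explicit option.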


- Michael



Re: [libvirt] Using external ceph.conf for RBD pools and disks

2013-11-01 Thread Eric Blake
On 11/01/2013 08:31 AM, Michael Chapman wrote:
> On Fri, 1 Nov 2013, Daniel P. Berrange wrote:
>> We have always taken the position that we do not want to rely on host
>> configuration in this way. The goal of the XML configs is that they
>> fully describe the functional setup of the resource in question. This
>> is to ensure that if you put the same XML config on two different hosts
>> you can be sure that they will operate in the same way. If you leave out
>> a bunch of config information and rely on the host ceph.conf file, then
>> you can no longer ever be sure if two hosts are configured the same way
>> with libvirt.
> 
> I suspected that might be the case -- half the reason I sent my email,
> really!
> 
> If it's desirable to not rely on any host configuration at all, should
> we be explicitly passing conf=/dev/null to QEMU when setting up an RBD
> device?

Sure sounds like it to me.

> As I mentioned before, without that QEMU will implicitly try to
> find a system ceph.conf file using a built-in librados search path.
> Would this actually be a backwards-incompatible change given it was never
> documented by libvirt?

The old behavior is broken, so we can bill this as a bug fix
(previously, qemu would behave differently than what the XML defined,
which is not supposed to happen) rather than a backwards-incompatible
change.  Can you propose a patch in time for inclusion in 1.1.4?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [libvirt] Using external ceph.conf for RBD pools and disks

2013-11-01 Thread Michael Chapman

On Fri, 1 Nov 2013, Eric Blake wrote:

> The old behavior is broken, so we can bill this as a bug fix
> (previously, qemu would behave differently than what the XML defined,
> which is not supposed to happen) rather than a backwards-incompatible
> change.  Can you propose a patch in time for inclusion in 1.1.4?


I can hammer out a patch quickly, but I won't have a chance to run it on a 
Ceph-enabled test machine before Monday. It's not critical, so I suggest 
it goes in after 1.1.4 is released.


- Michael



Re: [libvirt] Using external ceph.conf for RBD pools and disks

2013-11-01 Thread Josh Durgin

On 11/01/2013 07:42 AM, Eric Blake wrote:

> On 11/01/2013 08:31 AM, Michael Chapman wrote:
>> As I mentioned before, without that QEMU will implicitly try to
>> find a system ceph.conf file using a built-in librados search path.
>> Would this actually be a backwards-incompatible change given it was never
>> documented by libvirt?
>
> The old behavior is broken, so we can bill this as a bug fix
> (previously, qemu would behave differently than what the XML defined,
> which is not supposed to happen) rather than a backwards-incompatible
> change.  Can you propose a patch in time for inclusion in 1.1.4?


This will break OpenStack's usage of libvirt + rbd in Grizzly and
earlier releases, which relied on loading ceph.conf for the monitor
addresses. This is fixed in OpenStack Havana, but I wanted to note that
applications are relying on this behavior.

Passing conf=/dev/null removes the last remaining way of specifying
arbitrary ceph options for rbd devices, which is backwards-incompatible
in some setups even with well-behaved applications.

In general it may break setups using non-default options that libvirt
is not aware of. For example, ceph has an option to require messages
to be signed. This is off by default for backwards compatibility with
older ceph clients, but it can be enabled for qemu right now by adding
an option to /etc/ceph/ceph.conf. If libvirt passes conf=/dev/null,
guests are less secure since they may get their data from an untrusted
source that does not sign messages.

Ceph is a fast-moving complex project, and there are many options (and
will be more in the future) that affect security, performance tuning,
run-time introspection, logging, etc. I don't think libvirt should
remove the ability to configure these settings without having a way to
add them via xml. It doesn't seem feasible to make libvirt (and all
applications using it) aware of all existing and new options,
especially since many of them are quite ceph-specific.

Instead, I'd like to propose a mechanism for passing through generic
key/value pairs to configure block devices. Concretely, this could be
something like:


  
  









  


I don't care about the particular format, just that there's a way to
set these kinds of settings. It's much easier for users of libvirt
and ceph if these are treated as opaque strings by libvirt, since
they can upgrade ceph and use new options without upgrading libvirt
and any applications using it as well. I'm happy to provide patches
if this approach is acceptable.
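
A rough Python sketch of what treating the settings as opaque strings could look like on the consuming side follows. The <option name=... value=...> markup is invented purely for illustration (it is not the format proposed in this mail, and not existing libvirt XML), and the helper name passthrough_options and the image name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the <option> element and its attributes are
# invented for this example only.
SAMPLE_SOURCE = """
<source protocol="rbd" name="rbd/guest-disk">
  <option name="rbd_cache" value="true"/>
  <option name="debug_ms" value="1"/>
</source>
"""

def passthrough_options(source_xml):
    """Collect key/value pairs without interpreting or validating them."""
    root = ET.fromstring(source_xml.strip())
    # Keys and values are kept as opaque strings, in document order, so
    # options unknown to the caller pass through unchanged.
    return [(opt.get("name"), opt.get("value"))
            for opt in root.findall("option")]

print(passthrough_options(SAMPLE_SOURCE))
```

Because nothing in the helper knows what the keys mean, a newer ceph option would need no change here.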

Josh
