Re: [openstack-dev] [devstack] openstack client slowness / client-as-a-service

2016-04-19 Thread Daniel P. Berrange
On Tue, Apr 19, 2016 at 09:57:56AM -0500, Dean Troyer wrote:
> On Tue, Apr 19, 2016 at 9:06 AM, Adam Young  wrote:
> 
> > I wonder how much of that is Token caching.  In a typical CLI use patter,
> > a new token is created each time a client is called, with no passing of a
> > token between services.  Using a session can greatly decrease the number of
> > round trips to Keystone.
> >
> 
> Not as much as you think (or hope?).  Persistent token caching to disk will
> help some, at other expenses though.  Using --timing on OSC will show how
> much time the Identity auth round trip cost.
> 
> I don't have current numbers, the last time I instrumented OSC there were
> significant load times for some modules, so we went a good distance to
> lazy-load as much as possible.
> 
> What Dan sees WRT a persistent client process, though, is a combination of
> those two things: saving the Python loading and the Keystone round trips.

The 1.5sec overhead I eliminated doesn't actually have anything to do
with network round trips at all. Even if you turn off all network
services and just run an 'openstack' subcommand, letting it fail due to
its inability to connect, it'll still have that 1.5 sec overhead. It
is all related to Python runtime loading and work done during module
importing.
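
A quick way to see the module loading cost in isolation is just to time
the import of the client shell (a rough sketch; the openstackclient.shell
module path for the OpenStackShell class is my assumption):

  import time

  t0 = time.time()
  from openstackclient import shell  # this import is what we're timing
  print("import cost: %.2f seconds" % (time.time() - t0))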

eg run 'unstack.sh' and then compare the main openstack client:

$ time /usr/bin/openstack server list
Discovering versions from the identity service failed when creating the 
password plugin. Attempting to determine version from URL.
Unable to establish connection to http://192.168.122.156:5000/v2.0/tokens

real    0m1.555s
user    0m1.407s
sys     0m0.147s

Against my client-as-a-service version:

$ time $HOME/bin/openstack server list
[Errno 111] Connection refused

real    0m0.045s
user    0m0.029s
sys     0m0.016s


I'm sure there is scope for also optimizing network traffic / round
trips, but I didn't investigate that at all.

> I have (had!) a version of DevStack that put OSC into a subprocess and
> called it via pipes to do essentially what Dan suggests.  It saves some
> time, at the expense of complexity that may or may not be worth the effort.

devstack doesn't really need any significant changes beyond making
sure $PATH points to the replacement client programs and that the
server is running - the latter could be automated as a launch-on-demand
thing, which would limit devstack changes.
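
The launch-on-demand part could be as small as something like this in the
replacement client wrapper (just a sketch - 'openstack-server' and the
/tmp/openstack.sock path refer to the proof-of-concept attached to my
original mail):

  import os
  import socket
  import subprocess
  import time

  SOCK_PATH = "/tmp/openstack.sock"

  def ensure_server():
      # Spawn the daemon the first time any client runs, then wait
      # until its UNIX socket accepts connections
      if not os.path.exists(SOCK_PATH):
          subprocess.Popen(["openstack-server"])
      for _ in range(50):
          s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
          try:
              s.connect(SOCK_PATH)
              s.close()
              return
          except socket.error:
              s.close()
              time.sleep(0.1)
      raise RuntimeError("openstack-server did not start in time")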

It doesn't technically need any devstack change at all - these
replacement clients could simply be put in some 3rd party git repo,
and developers who want the speed benefit could put them in their
$PATH before running devstack.

> One thing missing is any sort of transactional control in the I/O with the
> subprocess, ie, an EOT marker.  I planned to add a -0 option (think xargs)
> to handle that but it's still down a few slots on my priority list.  Error
> handling is another problem, and at this point (for DevStack purposes
> anyway) I stopped the investigation, concluding that reliability trumped a
> few seconds saved here.

For I/O I simply replaced stdout + stderr with a new StringIO handle to
capture the data when running each command, and for error handling I
ensured the exit status was fed back & likewise stderr printed.
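
That capture wrapper is essentially just this pattern (a trimmed-down
sketch of what the server does; 'func' stands in for whichever shell entry
point is being invoked, e.g. OpenStackShell().run):

  import sys
  from StringIO import StringIO

  def capture(func, argv):
      # Swap stdout/stderr for StringIO while the shell runs, then hand
      # back (exit status, stdout text, stderr text) over the socket
      old_out, old_err = sys.stdout, sys.stderr
      sys.stdout, sys.stderr = StringIO(), StringIO()
      try:
          status = func(argv) or 0
      except SystemExit as e:
          status = e.code or 0
      finally:
          out, err = sys.stdout.getvalue(), sys.stderr.getvalue()
          sys.stdout, sys.stderr = old_out, old_err
      return status, out, err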

It is more than just a few seconds saved - almost 4 minutes, or
nearly 20% of the entire time to run stack.sh on my machine.


> Ultimately, this is one of the two giant nails in the coffin of continuing
> to pursue CLIs in Python.  The other is co-installability. (See that
> current thread on the ML for pain points).  Both are easily solved with
> native-code-generating languages.  Go and Rust are at the top of my
> personal list here...

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [devstack] openstack client slowness / client-as-a-service

2016-04-18 Thread Daniel P. Berrange
There have been threads in the past about the slowness of the "openstack"
client tool such as this one by Sean last year:

  http://lists.openstack.org/pipermail/openstack-dev/2015-April/061317.html

Sean mentioned a 1.5s fixed overhead on openstack client, and mentions it
is significantly slower than the equivalent nova command. In my testing
I don't see any real speed difference between openstack & nova client
programs, so maybe that differential has been addressed since Sean's
original thread, or maybe nova has got slower.

Overall though, I find it is way too sluggish considering it is running
on a local machine with 12 cpus and 30 GB of RAM.

I had a quick go at trying to profile the tools with cProfile and analyse
with KCacheGrind as per this blog:

  https://julien.danjou.info/blog/2015/guide-to-python-profiling-cprofile-concrete-case-carbonara

And noticed that in profiling 'nova help', for example, the big sink appears
to come from the 'pkg_resources' module and its use of pyparsing. I didn't
spend any real time digging into this in detail, because it got me wondering
whether we can easily just avoid the big startup penalty by not having to
start up a new Python interpreter for each command we run.
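
For anyone wanting to repeat that, the profiling boils down to something
like this (a sketch; I'm assuming the OpenStackComputeShell class lives in
novaclient.shell, and the resulting .prof file can be loaded into
KCacheGrind via pyprof2calltree):

  import cProfile
  import pstats
  from novaclient.shell import OpenStackComputeShell

  prof = cProfile.Profile()
  try:
      prof.runcall(OpenStackComputeShell().main, ['help'])
  except SystemExit:
      pass
  prof.dump_stats('nova-help.prof')
  pstats.Stats('nova-help.prof').sort_stats('cumulative').print_stats(20)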

I traced devstack and saw it run 'openstack' and 'neutron' commands approx
140 times in my particular configuration. If each one of those has a 1.5s
overhead, we could potentially save 3 & 1/2 minutes off devstack execution
time.
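
(That's 140 invocations x 1.5 sec per invocation = 210 seconds, i.e.
roughly 3.5 minutes.)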

So as a proof of concept I have created an 'openstack-server' command
which listens on a unix socket for requests and then invokes the
OpenStackShell.run / OpenStackComputeShell.main / NeutronShell.run
methods as appropriate.
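
The dispatch in that server is essentially just mapping the client name
onto the right shell class - roughly this (a sketch; the module paths and
constructor details are my assumption, the method names are as above):

  from openstackclient.shell import OpenStackShell
  from novaclient.shell import OpenStackComputeShell
  from neutronclient.shell import NeutronShell

  def run_command(app, argv):
      # 'app' is the basename the replacement client was invoked as
      if app == "openstack":
          return OpenStackShell().run(argv)
      elif app == "nova":
          return OpenStackComputeShell().main(argv)
      elif app == "neutron":
          return NeutronShell("2.0").run(argv)
      raise ValueError("unknown client %s" % app)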

I then replaced the 'openstack', 'nova' and 'neutron' commands with
versions that simply call out to the 'openstack-server' service over the
UNIX socket. Since devstack will always recreate these commands in
/usr/bin, I simply put my replacements in $HOME/bin and then made
sure $HOME/bin was first in the $PATH.

You might call this 'command line as a service' :-)

Anyhow, with my devstack setup a traditional install takes

  real  21m34.050s
  user  7m8.649s
  sys   1m57.865s

And when using openstack-server it only takes

  real  17m47.059s
  user  3m51.087s
  sys   1m42.428s

So that has cut 18% off the total running time for devstack, which
is quite considerable really.

I'm attaching the openstack-server & replacement openstack commands
so you can see what I did. You have to manually run the openstack-server
command ahead of time and it'll print out details of every command run
on stdout.

Anyway, I'm not personally planning to take this experiment any further.
I'll probably keep using this wrapper in my own local dev env since it
does cut down on devstack time significantly. This mail is just to see
if it'll stimulate any interesting discussion or motivate someone to
explore things further.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|
#!/usr/bin/python

import socket
import sys
import os
import os.path
import json

server_address = "/tmp/openstack.sock"

sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)

try:
    sock.connect(server_address)
except socket.error, msg:
    print >>sys.stderr, msg
    sys.exit(1)


def send(sock, doc):
    # frame each JSON document with a decimal length prefix + newline
    jdoc = json.dumps(doc)
    sock.send('%d\n' % len(jdoc))
    sock.sendall(jdoc)

def recv(sock):
    # read the length prefix a byte at a time until the newline
    length_str = ''

    char = sock.recv(1)
    if len(char) == 0:
        print >>sys.stderr, "Unexpected end of file"
        sys.exit(1)

    while char != '\n':
        length_str += char
        char = sock.recv(1)
        if len(char) == 0:
            print >>sys.stderr, "Unexpected end of file"
            sys.exit(1)

    total = int(length_str)

    # use a memoryview to receive the data chunk by chunk efficiently
    jdoc = memoryview(bytearray(total))
    next_offset = 0
    while total - next_offset > 0:
        recv_size = sock.recv_into(jdoc[next_offset:], total - next_offset)
        next_offset += recv_size
    try:
        doc = json.loads(jdoc.tobytes())
    except (TypeError, ValueError), e:
        raise Exception('Data received was not in JSON format')
    return doc

try:
    env = {}
    # whitelist of environment variables forwarded to openstack-server
    passenv = ["CINDER_VERSION",
               "OS_AUTH_URL",
               "OS_IDENTITY_API_VERSION",
               "OS_NO_CACHE",
               "OS_PASSWORD",
               "OS_PROJECT_NAME",
               "OS_REGION_NAME",
               "OS_TENANT_NAME",
               "OS_USERNAME",
               "OS_VOLUME_API_VERSION"]
    for name in passenv:
        if name in os.environ:
            env[name] = os.environ[name]

    cmd = {
        "app": os.path.basename(sys.argv[0]),
        "env": env,
        "argv": 

Re: [openstack-dev] Nova hook

2016-04-14 Thread Daniel P. Berrange
On Thu, Apr 14, 2016 at 03:15:42PM +0800, Kenny Ji-work wrote:
> Hi all,​
> The nova hooks facility will be removed in the future, now
> what's the recommended method to add custom code into the nova's
> internal APIs? Thank you for answer!

The point of removing it is that we do *not* want people to add custom
code into nova's internal APIs, so there is explicitly no replacement
for this functionality.

If you have a use case that Nova does not currently address and that is
broadly useful, then you can propose a blueprint/spec to explicitly
support this in Nova, rather than doing it out of tree via a hook.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova][neutron] os-vif status report

2016-04-13 Thread Daniel P. Berrange
I won't be present at the forthcoming Austin summit, so to prepare other
people in case there are f2f discussions, this is a rough status report
on the os-vif progress


os-vif core
-----------

NB by os-vif core, I mean the python packages in the os_vif/ namespace.

The object model for describing the various different VIF backend
configurations is defined well enough that it should cover all the
VIF types currently used by Nova libvirt driver, and probably all
those needed by other virt drivers. The only exception is that we
do not have a representation for the vmware 'dvs' VIF type. There's
no real reason why not, other than the fact that we're concentrating
on converting the libvirt nova driver first. These are dealt with
by the os_vif.objects.VIFBase object and its subclasses.


We now have an object model for describing client host capabilities.
This is dealt with by the os_vif.objects.HostInfo versioned object.
Currently this object provides details of all
the os-vif plugins that are installed on the host, and which VIF
config objects each supports.  The intent is that the HostInfo
object is serialized to JSON, and passed to Neutron by Nova when
creating a port.  This allows Neutron to dynamically decide which
plugin and which VIF config it wants to use for creating the port.


The os_vif.PluginBase class which all plugins must inherit from
has been enhanced so that plugins can declare configuration
parameters they wish to support. This allows config options for
the plugins to be included directly in the nova.conf file in
a dedicated section per plugin. For example, the linux bridge
plugin will have its parameters in a "[os_vif_linux_bridge]"
section in nova.conf.  This lets us set up the deprecations
correctly, so that when upgrading from older Nova, existing
settings in nova.conf still apply to the plugins provided
by os-vif.
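
To illustrate, a plugin skeleton now looks roughly like this (a hedged
sketch; the exact module path for PluginBase and the CONFIG_OPTS attribute
name are my assumptions, and the option shown is just an example):

  from oslo_config import cfg
  from os_vif import plugin

  class LinuxBridgePlugin(plugin.PluginBase):

      # Declared here, these options end up in a dedicated
      # [os_vif_linux_bridge] section of nova.conf
      CONFIG_OPTS = (
          cfg.IntOpt('network_device_mtu',
                     default=1500,
                     help='MTU setting for network interface.'),
      )

      def describe(self):
          # advertise which VIF object types this plugin supports
          raise NotImplementedError()

      def plug(self, vif, instance_info):
          pass   # create the bridge/tap devices for this VIF

      def unplug(self, vif, instance_info):
          pass   # tear the devices back down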


os-vif reference plugins


Originally the intention was that all plugins would live outside
of the os-vif package. During discussions at the Nova mid-cycle
meeting there was a strong preference to have the linux bridge
and openvswitch plugin implementations be distributed as part of
the os-vif package directly.

As such we now have 'vif_plug_linux_bridge' and 'vif_plug_ovs'
python packages as part of the os-vif module. Note that these
are *not* under the os_vif python namespace, as the intention
was to keep their code structured as if they were separate,
so we can easily split them out again in future if we need to.

Both the linux bridge and ovs plugins have now been converted
over to use oslo.privsep instead of rootwrap for all the places
where they need to run privileged commands.
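
The conversion follows the usual oslo.privsep pattern - roughly this (a
hedged sketch; the context arguments, the capability choice and the brctl
call are illustrative rather than copied from the real plugins):

  from oslo_concurrency import processutils
  from oslo_privsep import capabilities as caps
  from oslo_privsep import priv_context

  # privsep context scoped to just the capability the plugin needs,
  # instead of a blanket rootwrap command filter
  vif_plug = priv_context.PrivContext(
      __name__,
      cfg_section="vif_plug_linux_bridge_privileged",
      capabilities=[caps.CAP_NET_ADMIN],
  )

  @vif_plug.entrypoint
  def ensure_bridge(bridge):
      # runs inside the privileged helper spawned by oslo.privsep
      processutils.execute('brctl', 'addbr', bridge,
                           check_exit_code=[0, 1])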


os-vif extra plugins


Jay has had GIT repositories created to hold the plugins for all
the other VIF types the libvirt driver needs to support to have
feature parity with Mitaka and earlier. AFAIK, no one has done
any work to actually get the code for these working. This is not
a blocker, since the way the Nova integration is written allows
us to incrementally convert each VIF type over to use os-vif, so
we avoid the need for a "big bang".


os-vif Nova integration
-----------------------

I have a patch up for review against Nova that converts the libvirt
driver to use os-vif. It only does the conversion for linux bridge
and openvswitch; all other vif types fall back to using the current
code, as mentioned above.  The unit tests for this pass locally,
but I've not been able to verify it's working correctly when run for
real. There are almost certainly privsep-related integration tasks to
shake out - possibly as little as just installing the rootwrap filter
needed to allow use of privsep. My focus right now is ironing this
out so that I can verify linux bridge + ovs work with os-vif.


There is a new job defined in the experimental queue that can verify
Nova against os-vif git master, so we can get forewarning
if something in os-vif will cause Nova to break. This should also
let us verify that the integration is actually working in Nova CI
before allowing it to actually merge.


os-vif Neutron integration
--------------------------

As mentioned earlier we now have a HostInfo versioned object defined
in os-vif which Nova will populate. We need to extend the Neutron API
to accept this object when nova creates a port. This lets Neutron know
which VIF plugins are available and the configs they require.

Once Neutron has this information, instead of sending back the current
unstructured port binding dict, it will be able to send back a serialized
os_vif.objects.VIFBase subclass which formally describes the VIF it wants
Nova to use. This might be possible by just defining a VIF_TYPE_OS_VIF
and putting the VIFBase serialized data in another port binding metadata
field. Alternatively it might be desirable to extend the Neutron API to
more explicitly represent os-vif.

None of the Neutron integration has been started, or even written 

Re: [openstack-dev] [nova] FYI: Removing default flavors from nova

2016-04-06 Thread Daniel P. Berrange
On Wed, Apr 06, 2016 at 04:29:00PM +, Fox, Kevin M wrote:
> It feels kind of like a defcore issue though. Its harder for app
> developers to create stuff like heat templates intended for cross
> cloud that recommend a size, m1.small, without a common reference.

Even with Nova defining these default flavours, it didn't do anything
to help solve this problem as all the public cloud operators were
just deleting these flavours & creating their own. So it just gave
people a false sense of standardization where none actually existed.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Minimal secure identification of a new VM

2016-04-06 Thread Daniel P. Berrange
On Wed, Apr 06, 2016 at 04:03:18PM +, Hayes, Graham wrote:
> On 06/04/2016 16:54, Gary Kotton wrote:
> >
> >
> > On 4/6/16, 12:42 PM, "Daniel P. Berrange" <berra...@redhat.com> wrote:
> >
> >> On Tue, Apr 05, 2016 at 06:00:55PM -0400, Adam Young wrote:
> >>> We have a use case where we want to register a newly spawned Virtual
> >>> machine
> >>> with an identity provider.
> >>>
> >>> Heat also has a need to provide some form of Identity for a new VM.
> >>>
> >>>
> >>> Looking at the set of utilities right now, there does not seem to be a
> >>> secure way to do this.  Injecting files does not provide a path that
> >>> cannot
> >>> be seen by other VMs or machines in the system.
> >>>
> >>> For our use case, a short lived One-Time-Password is sufficient, but for
> >>> others, I think asymmetric key generation makes more sense.
> >>>
> >>> Is the following possible:
> >>>
> >>> 1.  In cloud-init, the VM generates a Keypair, then notifies the Nova
> >>> infrastructure (somehow) that it has done so.
> >>
> >> There's currently no secure channel for the guest to push information
> >> to Nova. The best we have is the metadata service, but we'd need to
> >> secure that with https, because the metadata server cannot be assumed
> >> to be running on the same host as the VM & so the channel is not protected
> >> against MITM attacks.
> 
> I thought the metadata API traffic was taken off the network by the
> compute node? Or is that just under the old nova-network?

Nope, there's no guarantee that the metadata server will be on the
local compute node - it might be co-located, but it equally might
be anywhere else.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] FPGA as a resource

2016-04-06 Thread Daniel P. Berrange
On Wed, Apr 06, 2016 at 12:28:02PM +0200, Roman Dobosz wrote:
> On Wed, 6 Apr 2016 16:12:16 +0800
> Zhipeng Huang  wrote:
> 
> > You are actually touching on something we have been working on. There is a
> > team in OPNFV DPACC project has been working acceleration related topics,
> > including folks from CMCC, Intel, ARM, Freescale, Huawei. We found out that
> > in order to have acceleration working under NFV scenarios, other than Nova
> > and Neutron's support, we also need a standalone service that manage
> > accelerators itself.
> > 
> > That means we want to treat accelerators, and FPGA being an important part
> > of it, as a first class resource citizen and we want to be able to do life
> > cycle management and scheduling on acceleration resources.
> > 
> > Based upon that requirement we started a new project called Nomad [1] on
> > Jan this year, to serve as an OpenStack service for distributed
> > acceleration management.
> > 
> > We've just started the project, and currently discussing the first BP [2].
> > We have a team working on IP-SEC based accelerator mgmt, and would love to
> > have more people to work on topics like FPGA.
> > 
> > We also have a topic on introducing Nomad accepted in Austin Summit [3].
> > 
> > You are more than welcomed to join the conversation : )
> 
> Thanks! I'll try to attend.
> 
> Nevertheless, I've briefly looked at the project Nomad page, and don't
> quite get how this might be related to the cases described in this
> thread - i.e. providing attachable, non-trivial devices as new
> resources in Nova.

I don't think it is really relevant in the immediate term. Most of the
work to support FPGA will be internal to nova, to deal with modelling
of assignable devices and their scheduling / allocation.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Minimal secure identification of a new VM

2016-04-06 Thread Daniel P. Berrange
On Tue, Apr 05, 2016 at 06:00:55PM -0400, Adam Young wrote:
> We have a use case where we want to register a newly spawned Virtual machine
> with an identity provider.
> 
> Heat also has a need to provide some form of Identity for a new VM.
> 
> 
> Looking at the set of utilities right now, there does not seem to be a
> secure way to do this.  Injecting files does not provide a path that cannot
> be seen by other VMs or machines in the system.
> 
> For our use case, a short lived One-Time-Password is sufficient, but for
> others, I think asymmetric key generation makes more sense.
> 
> Is the following possible:
> 
> 1.  In cloud-init, the VM generates a Keypair, then notifies the Nova
> infrastructure (somehow) that it has done so.

There's currently no secure channel for the guest to push information
to Nova. The best we have is the metadata service, but we'd need to
secure that with https, because the metadata server cannot be assumed
to be running on the same host as the VM & so the channel is not protected
against MITM attacks.

Also currently the metadata server is readonly with the guest pulling
information from it - it doesn't currently allow guests to push information
into it. This is nice because the metadata servers could theoretically be
locked down to prevent many interactions with the rest of nova - it should
only need read-only access to info about the guests it is serving. If we
turn the metadata server into a bi-directional service which can update
information about guests, then it opens it up as a more attractive avenue
of attack for a guest OS trying to breach the host infra. This is a fairly
general concern with any approach where the guest has to have the ability
to push information back into Nova.

> 2.  Nova Compute reads the public Key off the device and sends it to
> conductor, which would then associate the public key with the server?
> 
> 3.  A third party system could then validate the association of the public
> key and the server, and build a work flow based on some signed document from
> the VM?

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] FPGA as a resource

2016-04-06 Thread Daniel P. Berrange
On Wed, Apr 06, 2016 at 07:34:46AM +0200, Roman Dobosz wrote:
> On Tue, 5 Apr 2016 13:58:44 +0100
> "Daniel P. Berrange" <berra...@redhat.com> wrote:
> 
> > Along similar lines we have proposals to add vGPU support to Nova,
> > where the vGPUs may or may not be exposed using SR-IOV. We also want
> > to be able to on the fly decide whether any physical GPU is assigned
> > entirely to a guest as a full PCI device, or whether we only assign
> > individual "virtual functions" of the GPU. This means that even if
> > the GPU in question does *not* use SR-IOV, we still need to track
> > the GPU and vGPUs in the same way as we track PCI devices, so that
> > we can avoid assigning a vGPU to the guest, if the underlying physical
> > PCI device is already assigned to the guest.
> 
> That's correct. I'd like to mention that FPGAs can also be exposed
> in ways other than PCI (like in Xeon+FPGA). Not sure if this also applies
> to GPU.
> 
> > I can see we will have much the same issue with FPGAs, where we may
> > either want to assign the entire physical PCI device to a guest, or
> > just assign a particular slot in the FPGA to the guest. So even if
> > the FPGA is not using SR-IOV, we need to tie this all into the PCI
> > device tracking code, as we are intending for vGPUs.
> > 
> > All in all, I think we probably ought to generalize the PCI device
> > assignment modelling so that we're actually modelling generic
> > hardware devices which may or may not be PCI based, so that we can
> > accurately track the relationships between the devices.
> > 
> > With NIC devices we're also seeing a need to expose capabilities
> > against the PCI devices, so that the scheduler can be more selective
> > in deciding which particular devices to assign. eg so we can distinguish
> > between NICs which support RDMA and those which don't, or identify NIC
> > with hardware offload features, and so on. I can see this need to
> > associate capabilities with devices is something that will likely
> > be needed for the FPGA scenario, and vGPUs too. So again this points
> > towards more general purpose modelling of assignable hardware devices
> > beyond the limited PCI device modelling we've got today.
> > 
> > Looking to the future I think we'll see more usecases for device
> > assignment appearing for other types of device.
> > 
> > IOW, I think it would be a mistake to model FPGAs as a distinct
> > object type on their own. Generalization of assignable devices
> > is the way to go
> 
> That's why I've brought the topic up here on the list, so we can think about
> similar devices which could be generalized into one common accelerator
> type or even think about modeling PCI as such.

I wouldn't specialize it to "accelerators" as we'll inevitably come
across a need for other types of device assignment. We should just
generalize it to track *any* type of host hardware device that is
potentially assignable.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] post-copy live migration

2016-04-05 Thread Daniel P. Berrange
On Tue, Apr 05, 2016 at 05:17:41PM +0200, Luis Tomas wrote:
> Hi,
> 
> We are working on the possibility of including post-copy live migration into
> Nova (https://review.openstack.org/#/c/301509/)
> 
> At libvirt level, post-copy live migration works as follow:
> - Start live migration with a post-copy enabler flag
> (VIR_MIGRATE_POSTCOPY). Note this does not mean the migration is performed
> in post-copy mode, just that you can switch it to post-copy at any given
> time.
> - Change the migration from pre-copy to post-copy mode.
> 
> However, we are not sure what's the most convenient way of providing this
> functionality at Nova level.
> The current specs, propose to include an optional flag at the live migration
> API to include the VIR_MIGRATE_POSTCOPY flag when starting the live
> migration. Then we propose a second API to actually switch the migration
> from pre-copy to post-copy mode similarly to how it is done in LibVirt. This
> is also similar to how the new "force-migrate" option works to ensure
> migration completion. In fact, this method could be an extension of the
> force-migrate, by switching to postcopy if the migration was started with
> the VIR_MIGRATE_POSTCOPY libvirt flag, or pause it otherwise.
> 
> The cons of this approach are that we expose a too specific mechanism
> through the API. To alleviate this, we could remove the "switch" API, and
> automatize the switch based on data transferred, available bandwidth or
> other related metrics. However we will still need the extension to the
> live-migration API to include the proper libvirt postcopy flag.

No we absolutely don't want to expose that in the API as a concept, as it
is a private technical implementation detail of the KVM migration code.

> The other solution is to start all the migrations with the
> VIR_MIGRATE_POSTCOPY mode, and therefore no new APIs would be needed. The
> system could automatically detect the migration is taking too long (or is
> dirting memory faster than the sending rate), and automatically switch to
> post-copy.

Yes this is what we should be doing as default behaviour with new enough
QEMU IMHO.
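
For reference, at the libvirt level that default behaviour amounts to
something like this (a hedged sketch using the libvirt-python post-copy
API added in libvirt 1.3.3; the domain name and URI are illustrative, and
in Nova the migration call would run in its own thread while a monitor
loop decides when to switch):

  import libvirt

  conn = libvirt.open("qemu:///system")
  dom = conn.lookupByName("instance-00000001")

  flags = (libvirt.VIR_MIGRATE_LIVE |
           libvirt.VIR_MIGRATE_PEER2PEER |
           libvirt.VIR_MIGRATE_POSTCOPY)   # only enables the ability to switch

  # starts out as a normal pre-copy migration (this call blocks)
  dom.migrateToURI3("qemu+tcp://dest/system", {}, flags)

  # issued from the monitoring thread if the guest dirties memory faster
  # than it can be transferred
  dom.migrateStartPostCopy()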

> The cons of this is that including the VIR_MIGRATE_POSTCOPY flag has an
> overhead, and it will not be desirable to include it for all migrations,
> especially if they can be nicely migrated with pre-copy mode. In addition, if
> the migration fails after the switching, the VM will be lost. Therefore,
> admins may want to ensure that post-copy is not used for some specific VMs.

We shouldn't be trying to run before we can walk. Even if post-copy
hurts some guests, it'll still be a net win overall because it will
give a guarantee that migration can complete without needing to stop
guest CPUs entirely. All we need to start with is a nova.conf setting
to let the admin turn off use of post-copy for the host for cases where
we want to prioritize performance over the ability to migrate successfully.

Any plan wrt changing migration behaviour on a per-VM basis needs to
consider a much broader set of features than just post-copy. For example,
compression, autoconverge and max-downtime settings all have an overhead
or impact on the guest too. We don't want to end up exposing API flags to
turn any of these on/off individually. So any solution to this will have
to look at a combination of usage context and some kind of SLA marker on
the guest. eg if the migration is in the context of host-evacuate which
absolutely must always complete in finite time, we should always use
post-copy. If the migration is in the context of load-balancing workloads
across hosts, then some aspect of guest SLA must inform whether Nova chooses
to use post-copy, or compression or auto-converge, etc.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] FPGA as a resource

2016-04-05 Thread Daniel P. Berrange
On Tue, Apr 05, 2016 at 02:27:30PM +0200, Roman Dobosz wrote:
> Hey all,
> 
> At yesterday's scheduler meeting I raised the idea of bringing 
> FPGAs into OpenStack as a resource, which then might be exposed 
> to the VMs.
> 
> The use cases and motivations for why one would want to do this are pretty broad -
> having such a chip ready on the computes might be beneficial both for
> consumers of the technology and data center administrators. The
> utilization of the hardware is very broad - the only limitations are
> human imagination and hardware capability - since it might be used for
> accelerating execution of algorithms from compression and cryptography,
> through pattern recognition, transcoding to voice/video analysis and
> processing and all the others in between. Using FPGA to perform data
> processing may significantly reduce CPU utilization, the time and power
> consumption, which is a benefit on its own.
> 
> On the OpenStack side, unlike utilizing the CPU or memory, for actually 
> using a specific algorithm with FPGAs, the FPGA has to be programmed first. So 
> in a simplified scenario, it might go like this:
> 
> * User selects VM with image which supports acceleration,
> * Scheduler selects the appropriate compute host with FPGA available,
> * Compute gets request, program IP into FPGA and then boot up the 
>   VM with accelerator attached.
> * If VM is removed, it may optionally erase the FPGA.
> 
> As you can see, it seems not complicated at this point, however it 
> becomes more complex due to the following things we also have to take into 
> consideration:
> 
> * recent FPGAs are divided into regions (or slots) and each of them 
>   can be programmed separately
> * slots may or may not fit the same bitstream (the program which FPGA
>   is fed, the IP)
> * There are several products around (Altera, Xilinx, others), which makes 
>   bitstreams incompatible, even between the products of the same company
> * libraries which abstract the hardware layer like AAL[1] and their 
>   versions
> * for some products, there is a need for tracking memory usage, which 
>   is located on PCI boards
> * some of the FPGAs can be exposed using SR-IOV, while some others cannot, 
>   which implies multiple usage abilities

Along similar lines we have proposals to add vGPU support to Nova,
where the vGPUs may or may not be exposed using SR-IOV. We also want
to be able to on the fly decide whether any physical GPU is assigned
entirely to a guest as a full PCI device, or whether we only assign
individual "virtual functions" of the GPU. This means that even if
the GPU in question does *not* use SR-IOV, we still need to track
the GPU and vGPUs in the same way as we track PCI devices, so that
we can avoid assigning a vGPU to the guest, if the underlying physical
PCI device is already assigned to the guest.

I can see we will have much the same issue with FPGAs, where we may
either want to assign the entire physical PCI device to a guest, or
just assign a particular slot in the FPGA to the guest. So even if
the FPGA is not using SR-IOV, we need to tie this all into the PCI
device tracking code, as we are intending for vGPUs.

All in all, I think we probably ought to generalize the PCI device
assignment modelling so that we're actually modelling generic
hardware devices which may or may not be PCI based, so that we can
accurately track the relationships between the devices.

With NIC devices we're also seeing a need to expose capabilities
against the PCI devices, so that the scheduler can be more selective
in deciding which particular devices to assign. eg so we can distinguish
between NICs which support RDMA and those which don't, or identify NIC
with hardware offload features, and so on. I can see this need to
associate capabilities with devices is something that will likely
be needed for the FPGA scenario, and vGPUs too. So again this points
towards more general purpose modelling of assignable hardware devices
beyond the limited PCI device modelling we've got today.

Looking to the future I think we'll see more usecases for device
assignment appearing for other types of device.

IOW, I think it would be a mistake to model FPGAs as a distinct
object type on their own. Generalization of assignable devices
is the way to go

> In other words, it may be necessary to incorporate other actions:
> 
> * properly discover FPGA and its capabilities
> * schedule right bitstream with corresponding matching unoccupied FPGA
>   device/slot
> * actually program the FPGA
> * provide libraries into VM, which are necessary for interacting between
>   user program and the exposed FPGA (or AAL) (this may be optional, 
>   since user can upload complete image with everything in place)
> * bitstream images have to be kept in some kind of service (Glance?) 
>   with some kind of way of identifying which image matches which FPGA
> 
> All of that makes modelling the resource extremely complicated, contrary to 
> the CPU resource for example. I'd like to discuss how the goal 

[openstack-dev] KVM Forum 2016: Call For Participation

2016-03-10 Thread Daniel P. Berrange
=
KVM Forum 2016: Call For Participation
August 24-26, 2016 - Westin Harbour Castle - Toronto, Canada

(All submissions must be received before midnight May 1, 2016)
=

KVM Forum is an annual event that presents a rare opportunity
for developers and users to meet, discuss the state of Linux
virtualization technology, and plan for the challenges ahead. 
We invite you to lead part of the discussion by submitting a speaking
proposal for KVM Forum 2016.

At this highly technical conference, developers driving innovation
in the KVM virtualization stack (Linux, KVM, QEMU, libvirt) can
meet users who depend on KVM as part of their offerings, or to
power their data centers and clouds.

KVM Forum will include sessions on the state of the KVM
virtualization stack, planning for the future, and many
opportunities for attendees to collaborate. As we celebrate ten years
of KVM development in the Linux kernel, KVM continues to be a
critical part of the FOSS cloud infrastructure.

This year, KVM Forum is joining LinuxCon and ContainerCon in Toronto, 
Canada. Selected talks from KVM Forum will be presented on Wednesday
August 24 to the full audience of LinuxCon and ContainerCon. Also,
attendees of KVM Forum will have access to all of the LinuxCon and
ContainerCon talks on Wednesday.

http://events.linuxfoundation.org/cfp

Suggested topics:

KVM and Linux
* Scaling and optimizations
* Nested virtualization
* Linux kernel performance improvements
* Resource management (CPU, I/O, memory)
* Hardening and security
* VFIO: SR-IOV, GPU, platform device assignment
* Architecture ports

QEMU
* Management interfaces: QOM and QMP
* New devices, new boards, new architectures
* Scaling and optimizations
* Desktop virtualization and SPICE
* Virtual GPU
* virtio and vhost, including non-Linux or non-virtualized uses
* Hardening and security
* New storage features
* Live migration and fault tolerance
* High availability and continuous backup
* Real-time guest support
* Emulation and TCG
* Firmware: ACPI, UEFI, coreboot, u-Boot, etc.
* Testing

Management and infrastructure
* Managing KVM: Libvirt, OpenStack, oVirt, etc.
* Storage: glusterfs, Ceph, etc.
* Software defined networking: Open vSwitch, OpenDaylight, etc.
* Network Function Virtualization
* Security
* Provisioning
* Performance tuning


========================
SUBMITTING YOUR PROPOSAL
========================
Abstracts due: May 1, 2016

Please submit a short abstract (~150 words) describing your presentation
proposal. Slots vary in length up to 45 minutes. Also include the proposal
type -- one of:
- technical talk
- end-user talk

Submit your proposal here:
http://events.linuxfoundation.org/cfp
Please only use the categories "presentation" and "panel discussion"

You will receive a notification whether or not your presentation proposal
was accepted by May 27, 2016.

Speakers will receive a complimentary pass for the event. In the instance
that your submission has multiple presenters, only the primary speaker for a
proposal will receive a complementary event pass. For panel discussions, all
panelists will receive a complimentary event pass.

TECHNICAL TALKS

A good technical talk should not just report on what has happened over
the last year; it should present a concrete problem and how it impacts
the user and/or developer community. Whenever applicable, focus on
work that needs to be done, difficulties that haven't yet been solved,
and on decisions that other developers should be aware of. Summarizing
recent developments is okay but it should not be more than a small
portion of the overall talk.

END-USER TALKS

One of the big challenges as developers is to know what, where and how
people actually use our software. We will reserve a few slots for end
users talking about their deployment challenges and achievements.

If you are using KVM in production you are encouraged to submit a speaking
proposal. Simply mark it as an end-user talk. As an end user, this is a
unique opportunity to get your input to developers.

HANDS-ON / BOF SESSIONS

We will reserve some time for people to get together and discuss
strategic decisions as well as other topics that are best solved within
smaller groups.

These sessions will be announced during the event. If you are interested
in organizing such a session, please add it to the list at

  http://www.linux-kvm.org/page/KVM_Forum_2016_BOF

Let people you think might be interested know about it, and encourage
them to add their names to the wiki page as well. Please try to
add your ideas to the list before KVM Forum starts.


PANEL DISCUSSIONS

If you are proposing a panel discussion, please make sure that you list
all of your potential panelists in your abstract. We will request full
biographies if a panel is accepted.


==============
HOTEL / TRAVEL
==============

This year's event will take place at the Westin Harbour Castle Toronto.
For 

Re: [openstack-dev] [nova] nova hooks - document & test or deprecate?

2016-03-03 Thread Daniel P. Berrange
On Thu, Mar 03, 2016 at 09:09:03AM -0600, Sam Matzek wrote:
> On Wed, Mar 2, 2016 at 8:25 PM, Adam Young  wrote:
> > On 02/29/2016 01:49 PM, Andrew Laski wrote:
> >>
> >>
> >> On Mon, Feb 29, 2016, at 01:18 PM, Dan Smith wrote:
> 
>  Forgive my ignorance or for playing devil's advocate, but wouldn't the
>  main difference between notifications and hooks be that notifications
>  are asynchronous and hooks aren't?
> >>>
> >>> The main difference is that notifications are external and intended to
> >>> be stable (especially with the versioned notifications effort). The
> >>> hooks are internal and depend wholly on internal data structures.
> >>>
>  In the case of how Rdo was using it,
>  they are adding things to the injected_files list before the instance is
>  created in the compute API.  You couldn't do that with notifications as
>  far as I know.
> >>>
> >>> Nope, definitely not, but I see that as a good thing. Injecting files
> >>> like that is likely to be very fragile and I think mostly regarded as
> >>> substantially less desirable than the alternatives, regardless of how it
> >>> happens.
> >>>
> >>> I think that Laski's point was that the most useful and least dangerous
> >>> thing that hooks can be used for is the use case that is much better
> >>> served by notifications.
> >
> >
> > I did the original proof-of-concept for this prior to the impl using the
> > hooks, just by modifying the metadata.
> >
> > http://adam.younglogic.com/2013/09/register-vm-freeipa/
> >
> > That works for a CLI based approach, but not for auto-registering VMs
> > created from the WebUI, and also only works if the user crafts the Metadata
> > properly.  It was not a secure approach.
> >
> > What we need is a way to be able to generate a secret and share that between
> > the new VM and the enrolling server.  The call does not strictly have to be
> > synchronous.  The enrolling server can wait if the VM is not up, and the VM
> > can wait if the enrolling server does not have the secret when the VM is
> > ready to enroll.
> >
> > We had discussed having a seperate service listen to notifications on the
> > bus and inject the data necessary into the IdM server.  The hooks were a
> > much better solution.
> >
> > I had seriously thought about using the Keystone token as the symmetric
> > shared secret.  It is a horrible solution, but so are all the rest.
> >
> > There is no security on the message bus at the moment, and until we get
> > some, we can't put a secret on the bus.
> >
> > So, don't deprecate until you have a solution.  All you will be doing is
> > putting people in a tight spot where they will have to fork the code base,
> > and that is downright antisocial.
> >
> > Let's plan this out in the Newton Summit and have a plan moving forward.
> 
> Deprecate isn't the same as remove unless I'm missing something on how
> this works.  I think we want to deprecate it to discourage further
> use, to gather current use cases, and to drive approved specs for
> those use cases.   Hooks should not be removed from tree until we have
> the replacements in tree.

Yes & no. We're not committing to supporting every single thing people
do via hooks today. We're saying that we want people to use notifications
primarily as a way to trigger async execution of hooks outside of nova.
Notifications already exist, but we're lacking an official way to trigger
scripts based off the receipt of a notification. So once that's done we
will have an official replacement available, and we'll consider removing
the hooks mechanism at this point. This will certainly /not/ cover all
possible use cases that hooks are used for today, and that is by design,
since we're explicitly not wanting people to be able to arbitrarily
change nova functionality.
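
As a strawman, such a notification-triggered script runner could be little
more than this (a hedged sketch using the existing oslo.messaging
notification listener API; the topic, the event filter and the hook script
path are all just illustrative):

  import subprocess
  import sys

  from oslo_config import cfg
  import oslo_messaging

  class RunScriptEndpoint(object):
      # only react to compute instance lifecycle notifications
      filter_rule = oslo_messaging.NotificationFilter(
          event_type=r'^compute\.instance\..*')

      def info(self, ctxt, publisher_id, event_type, payload, metadata):
          # hand the event off to an external hook script, asynchronously
          subprocess.Popen(['/etc/nova/notify-hook', event_type])

  def main():
      cfg.CONF(sys.argv[1:], project='notify-hooks')
      transport = oslo_messaging.get_notification_transport(cfg.CONF)
      targets = [oslo_messaging.Target(topic='notifications')]
      listener = oslo_messaging.get_notification_listener(
          transport, targets, [RunScriptEndpoint()], executor='threading')
      listener.start()
      listener.wait()

  if __name__ == '__main__':
      main()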

We will however consider requests to add more data fields to notifications
if some key info is missing. Also if there are things that can't be done
via async notifications, we'll consider specs to add explicit features to
achieve a similar goal in nova. That doesn't mean we're going to accept
every request, simply that we will evaluate them on merit, as we would
with any other Nova feature. Whether we wait for all these potential
feature requests to be implemented before we remove hooks is an open
question. I think it'll depend on the scope of the proposed features and
their importance to the Nova ecosystem, vs other features we have to deal
with implementing. IOW this will be a case by case decision, and we're
not going to wait an arbitrarily long time for people to propose replacement
features before deleting hooks.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|


Re: [openstack-dev] [nova] nova hooks - document & test or deprecate?

2016-03-02 Thread Daniel P. Berrange
On Tue, Mar 01, 2016 at 01:53:18PM -0500, Rob Crittenden wrote:
> Daniel P. Berrange wrote:
> > On Mon, Feb 29, 2016 at 12:36:03PM -0700, Rich Megginson wrote:
> >> On 02/29/2016 12:19 PM, Chris Friesen wrote:
> >>> On 02/29/2016 12:22 PM, Daniel P. Berrange wrote:
> >>>
> >>>> There's three core scenarios for hooks
> >>>>
> >>>>  1. Modifying some aspect of the Nova operation
> >>>>  2. Triggering an external action synchronously to some Nova operation
> >>>>  3. Triggering an external action asynchronously to some Nova operation
> >>>>
> >>>> The Rdo example is falling in scenario 1 since it is modifying the
> >>>> injected files. I think this is absolutely the kind of thing
> >>>> we should explicitly *never* support. When external code can arbitrarily
> >>>> modify some aspect of Nova operation we're in totally uncharted
> >>>> territory as to the behaviour of Nova. To support that we'd need to
> >>>> provide a stable internal API which is just not something we want to
> >>>> tie ourselves into. I don't know just what the Rdo example is trying
> >>>> to achieve, but whatever it is, it should be via some supportable API
> >>>> and not a hook.
> >>>>
> >>>> Scenarios 2 and 3 are both valid to consider. Using the notifications
> >>>> system gets us an asynchronous trigger mechanism, which is probably
> >>>> fine for many scenarios.  The big question is whether there's a
> >>>> compelling need for scenario two, where the external action blocks
> >>>> execution of the Nova operation until it has completed its hook.
> >>>
> >>> Even in the case of scenario two it is possible in some cases to use a
> >>> proxy to intercept the HTTP request, take action, and then forward it or
> >>> reject it as appropriate.
> >>>
> >>> I think the real question is whether there's a need to trigger an external
> >>> action synchronously from down in the guts of the nova code.
> >>
> >> The hooks do the following: 
> >> https://github.com/rcritten/rdo-vm-factory/blob/use-centos/rdo-ipa-nova/novahooks.py#L271
> >>
> >> We need to generate a token (ipaotp) and call ipa host-add with that token
> >> _before_ the new machine has a chance to call ipa-client-install.  We have
> >> to guarantee that the client cannot call ipa-client-install until we get
> >> back the response from ipa that the host has been added with the token.
> >> Additionally, we store the token in an injected_file in the new machine, so
> >> the file can be removed as soon as possible.  We tried storing the token in
> >> the VM metadata, but there is apparently no way to delete it?  Can the
> >> machine do
> >>
> >> curl -XDELETE http://169.254.169.254/openstack/latest/metadata?key=value ?
> >>
> >> Using the build_instance.pre hook in Nova makes this simple and
> >> straightforward.  It's also relatively painless to use the 
> >> network_info.post
> >> hook to handle the floating ip address assignment.
> >>
> >> Is it possible to do the above using notifications without jumping through
> >> too many hoops?
> > 
> > So from a high level POV, you are trying to generate a security token
> > which will be provided to the guest OS before it is booted.
> > 
> > I think that is a pretty clearly useful feature, and something that
> > should really be officially integrated into Nova as a concept rather
> > than done behind nova's back as a hook.
> 
> Note that the reason the file was injected the way it was is so that
> Nova would have no idea there even is a token. We didn't want someone
> later peeking at the metadata, or a notification, to get the token.

That's just security through obscurity

Anyway, if you wish to be able to continue to support your use case,
I'd strongly recommend that you look at proposing a mechanism to officially
support provision of tokens to the guest OS in Nova. It is pretty clear from
this thread that the hooks mechanism you rely on is going to be deleted
in the future, likely as soon as the Newton release.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] nova hooks - document & test or deprecate?

2016-03-01 Thread Daniel P. Berrange
On Mon, Feb 29, 2016 at 12:36:03PM -0700, Rich Megginson wrote:
> On 02/29/2016 12:19 PM, Chris Friesen wrote:
> >On 02/29/2016 12:22 PM, Daniel P. Berrange wrote:
> >
> >>There's three core scenarios for hooks
> >>
> >>  1. Modifying some aspect of the Nova operation
> >>  2. Triggering an external action synchronously to some Nova operation
> >>  3. Triggering an external action asynchronously to some Nova operation
> >>
> >>The Rdo example is falling in scenario 1 since it is modifying the
> >>injected files. I think this is absolutely the kind of thing
> >>we should explicitly *never* support. When external code can arbitrarily
> >>modify some aspect of Nova operation we're in totally uncharted
> >>territory as to the behaviour of Nova. To support that we'd need to
> >>provide a stable internal API which is just not something we want to
> >>tie ourselves into. I don't know just what the Rdo example is trying
> >>to achieve, but whatever it is, it should be via some supportable API
> >>and not a hook.
> >>
> >>Scenarios 2 and 3 are both valid to consider. Using the notifications
> >>system gets us an asynchronous trigger mechanism, which is probably
> >>fine for many scenarios.  The big question is whether there's a
> >>compelling need for scenario two, where the external action blocks
> >>execution of the Nova operation until it has completed its hook.
> >
> >Even in the case of scenario two it is possible in some cases to use a
> >proxy to intercept the HTTP request, take action, and then forward it or
> >reject it as appropriate.
> >
> >I think the real question is whether there's a need to trigger an external
> >action synchronously from down in the guts of the nova code.
> 
> The hooks do the following: 
> https://github.com/rcritten/rdo-vm-factory/blob/use-centos/rdo-ipa-nova/novahooks.py#L271
> 
> We need to generate a token (ipaotp) and call ipa host-add with that token
> _before_ the new machine has a chance to call ipa-client-install.  We have
> to guarantee that the client cannot call ipa-client-install until we get
> back the response from ipa that the host has been added with the token.
> Additionally, we store the token in an injected_file in the new machine, so
> the file can be removed as soon as possible.  We tried storing the token in
> the VM metadata, but there is apparently no way to delete it?  Can the
> machine do
> 
> curl -XDELETE http://169.254.169.254/openstack/latest/metadata?key=value ?
> 
> Using the build_instance.pre hook in Nova makes this simple and
> straightforward.  It's also relatively painless to use the network_info.post
> hook to handle the floating ip address assignment.
> 
> Is it possible to do the above using notifications without jumping through
> too many hoops?

So from a high level POV, you are trying to generate a security token
which will be provided to the guest OS before it is booted.

I think that is a pretty clearly useful feature, and something that
should really be officially integrated into Nova as a concept rather
than done behind nova's back as a hook.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] nova hooks - document & test or deprecate?

2016-02-29 Thread Daniel P. Berrange
On Mon, Feb 29, 2016 at 12:03:00PM -0600, Matt Riedemann wrote:
> 
> 
> On 2/29/2016 11:23 AM, Andrew Laski wrote:
> >
> >
> >On Mon, Feb 29, 2016, at 12:12 PM, Dan Smith wrote:
> >>>In our continued quest on being more explicit about plug points it feels
> >>>like we should either document the interface (which means creating
> >>>stability on the hook parameters) or we should deprecate this construct
> >>>as part of a bygone era.
> >>>
> >>>I lean on deprecation because it feels like a thing we don't really want
> >>>to support going forward, but I can go either way.
> >>
> >>Deprecate and remove, please. We've been removing these sorts of things
> >>over time, and nova hooks have been ignored in that process. But really,
> >>making them more rigid is going to get in the way over time, trying to
> >>continue to honor an interface that codifies internals at a certain
> >>point in time, and leaving them as-is will just continue to generate
> >>issues like the quoted bug.
> >>
> >>I don't "lean" on deprecation, I feel strongly that these should go away.
> >
> >I've worked on a deployment that uses them heavily and would be impacted
> >by their removal. They are a very convenient place to put code that
> >should run based on Nova events but I have yet to see a usage that
> >couldn't have been implemented by having a service listen to
> >notifications and run that same code. However there is no service that
> >does this. So the only argument I can see for keeping them is that it's
> >more convenient to put that code into Nova rather than implement
> >something that listens for notifications. And that's not a convincing
> >argument to me.
> >
> >So I agree with moving forward on deprecation and think that
> >notifications provide a suitable replacement for the functionality
> >provided.
> 
> Forgive my ignorance or for playing devil's advocate, but wouldn't the main
> difference between notifications and hooks be that notifications are
> asynchronous and hooks aren't?  In the case of how Rdo was using it, they
> are adding things to the injected_files list before the instance is created
> in the compute API.  You couldn't do that with notifications as far as I
> know.

Sure, notifications do *not* replace all possible use case of the current
hooks, but I think that is actually desirable.

There's three core scenarios for hooks

 1. Modifying some aspect of the Nova operation
 2. Triggering an external action synchronously to some Nova operation
 3. Triggering an external action asynchronously to some Nova operation

The Rdo example falls into scenario 1 since it is modifying the
injected files. I think this is absolutely the kind of thing
we should explicitly *never* support. When external code can arbitrarily
modify some aspect of Nova operation we're in totally uncharted
territory as to the behaviour of Nova. To support that we'd need to
provide a stable internal API, which is just not something we want to
tie ourselves into. I don't know just what the Rdo example is trying
to achieve, but whatever it is, it should be done via some supportable
API and not a hook.

Scenarios 2 and 3 are both valid to consider. Using the notifications
system gives us an asynchronous trigger mechanism, which is probably
fine for many scenarios. The big question is whether there's a
compelling need for scenario 2, where the external action blocks
execution of the Nova operation until it has completed its hook.
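
As a rough illustration of scenario 3, here is a minimal sketch of an
external service consuming Nova's unversioned notifications via
oslo.messaging; the topic name, event type and executor choice are
assumptions that would need adjusting for a real deployment:

    import sys

    from oslo_config import cfg
    import oslo_messaging


    class InstanceCreateEndpoint(object):
        # Only react to instance creation events; everything else is ignored.
        filter_rule = oslo_messaging.NotificationFilter(
            event_type='compute.instance.create.end')

        def info(self, ctxt, publisher_id, event_type, payload, metadata):
            # Whatever used to live in the hook runs here, outside Nova.
            print('instance created: %s' % payload.get('instance_id'))
            return oslo_messaging.NotificationResult.HANDLED


    def main():
        cfg.CONF(sys.argv[1:], project='hook-replacement-demo')
        transport = oslo_messaging.get_notification_transport(cfg.CONF)
        targets = [oslo_messaging.Target(topic='notifications')]
        listener = oslo_messaging.get_notification_listener(
            transport, targets, [InstanceCreateEndpoint()],
            executor='threading')
        listener.start()
        listener.wait()


    if __name__ == '__main__':
        main()

Such a service runs entirely outside Nova's process, which is exactly the
decoupling that makes it supportable compared to in-process hooks.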

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] nova hooks - document & test or deprecate?

2016-02-29 Thread Daniel P. Berrange
On Mon, Feb 29, 2016 at 12:23:09PM -0500, Andrew Laski wrote:
> 
> 
> On Mon, Feb 29, 2016, at 12:12 PM, Dan Smith wrote:
> > > In our continued quest on being more explicit about plug points it feels
> > > like we should either document the interface (which means creating
> > > stability on the hook parameters) or we should deprecate this construct
> > > as part of a bygone era.
> > > 
> > > I lean on deprecation because it feels like a thing we don't really want
> > > to support going forward, but I can go either way.
> > 
> > Deprecate and remove, please. We've been removing these sorts of things
> > over time, and nova hooks have been ignored in that process. But really,
> > making them more rigid is going to get in the way over time, trying to
> > continue to honor an interface that codifies internals at a certain
> > point in time, and leaving them as-is will just continue to generate
> > issues like the quoted bug.
> > 
> > I don't "lean" on deprecation, I feel strongly that these should go away.
> 
> I've worked on a deployment that uses them heavily and would be impacted
> by their removal. They are a very convenient place to put code that
> should run based on Nova events but I have yet to see a usage that
> couldn't have been implemented by having a service listen to
> notifications and run that same code. However there is no service that
> does this. So the only argument I can see for keeping them is that it's
> more convenient to put that code into Nova rather than implement
> something that listens for notifications. And that's not a convincing
> argument to me.

Yes, that's a prime example of a use case where we should be offering
a formally supportable solution instead of an unreliable hack using
hooks.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] nova hooks - document & test or deprecate?

2016-02-29 Thread Daniel P. Berrange
On Mon, Feb 29, 2016 at 11:59:06AM -0500, Sean Dague wrote:
> The nova/hooks.py infrastructure has been with us since early Nova. It's
> currently only annotated on a few locations - 'build_instance',
> 'create_instance', 'delete_instance', and 'instance_network_info'. It's
> got a couple of unit tests on it, but nothing that actually tests real
> behavior of the hooks we have specified.
> 
> It does get used in the wild, and we do break it with changes we didn't
> ever anticipate would impact it -
> https://bugs.launchpad.net/nova/+bug/1518321
> 
> However, when you look into how that is used, it's really really odd and
> fragile -
> https://github.com/richm/rdo-vm-factory/blob/master/rdo-ipa-nova/novahooks.py#L248
> 
> 
> def pre(self, *args, **kwargs):
> # args[7] is the injected_files parameter array
> # the value is ('filename', 'base64 encoded contents')
> ipaotp = str(uuid.uuid4())
> ipainject = ('/tmp/ipaotp', base64.b64encode(ipaotp))
> args[7].extend(self.inject_files)
> args[7].append(ipainject)
> 
> In our continued quest on being more explicit about plug points it feels
> like we should either document the interface (which means creating
> stability on the hook parameters) or we should deprecate this construct
> as part of a bygone era.
> 
> I lean on deprecation because it feels like a thing we don't really want
> to support going forward, but I can go either way.

As it is designed, I think the hooks mechanism is really unsupportable
long term. It is exposing users to arbitrary internal Nova data structures
which we have changed at will and we cannot possibly ever consider them
to be a stable consumable API. I'm rather surprised we've not seen more
bugs like the one you've shown above - most likely that's a reflection
of how few people actually use this facility.

I'd be strongly in favour of deprecation & removal of this hooking
mechanism, as it's unsupportable in any sane manner when it exposes
code to our internal unstable API & object model.

If there's stuff people are doing in hooks that's impossible any other
way, we should really be looking at what we need to provide in our
public API to achieve the same goal, if it is a use case we wish to be
able to support.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [all] A proposal to separate the design summit

2016-02-26 Thread Daniel P. Berrange
On Fri, Feb 26, 2016 at 08:55:52AM -0800, James Bottomley wrote:
> On Fri, 2016-02-26 at 16:03 +0000, Daniel P. Berrange wrote:
> > On Fri, Feb 26, 2016 at 10:39:08AM -0500, Rich Bowen wrote:
> > > 
> > > 
> > > On 02/22/2016 10:14 AM, Thierry Carrez wrote:
> > > > Hi everyone,
> > > > 
> > > > TL;DR: Let's split the events, starting after Barcelona.
> > > > 
> > > > 
> > > > 
> > > > Comments, thoughts ?
> > > 
> > > Thierry (and Jay, who wrote a similar note much earlier in 
> > > February, and Lauren, who added more clarity over on the marketing 
> > > list, and the many, many of you who have spoken up in this thread
> > > ...),
> > > 
> > > as a community guy, I have grave concerns about what the long-term
> > > effect of this move would be. I agree with your reasons, and the
> > > problems, but I worry that this is not the way to solve it.
> > > 
> > > Summit is one time when we have an opportunity to hold community up 
> > > to the folks that think only product - to show them how critical it 
> > > is that the people that are on this mailing list are doing the 
> > > awesome things that they're doing, in the upstream, in cooperation 
> > > and collaboration with their competitors.
> > > 
> > > I worry that splitting the two events would remove the community 
> > > aspect from the conference. The conference would become more 
> > > corporate, more product, and less project.
> > > 
> > > My initial response was "crap, now I have to go to four events 
> > > instead of two", but as I thought about it, it became clear that 
> > > that wouldn't happen. I, and everyone else, would end up picking 
> > > one event or the other, and the division between product and
> > > project would deepen.
> > > 
> > > Summit, for me specifically, has frequently been at least as much 
> > > about showing the community to the sales/marketing folks in my own
> > > company, as showing our wares to the customer.
> > 
> > I think what you describe is a prime reason for why separating the
> > events would be *beneficial* for the community contributors. The
> > conference has long ago become so corporate focused that its sessions
> > offer little to no value to me as a project contributor. What you
> > describe as a benefit of being able to put community people in front
> > of business people is in fact a significant negative for the design
> > summit productivity. It causes key community contributors to be 
> > pulled out of important design sessions to go talk to business 
> > people, making the design sessions significantly less productive.
> 
> It's Naïve to think that something is so sacrosanct that it will be
> protected come what may.  Everything eventually has to justify itself
> to the funders.  Providing quid pro quo to sales and marketing helps
> enormously with that justification and it can be managed so it's not a
> huge drain on productive time.  OpenStack may be the new shiny now, but
> one day it won't be and then you'll need the support of the people
> you're currently disdaining.
> 
> I've said this before in the abstract, but let me try to make it
> specific and personal: once the kernel was the new shiny and money was
> poured all over us; we were pure and banned management types from the
> kernel summit and other events, but that all changed when the dot com
> bust came.  You're from Red Hat, if you ask the old timers about the
> Ottawa Linux Symposium and allied Kernel Summit I believe they'll
> recall that in 2005(or 6) the Red Hat answer to a plea to fund travel
> was here's $25 a head, go and find a floor to crash on.  As the
> wrangler for the new Linux Plumbers Conference I had to come up with
> all sorts of convoluted schemes for getting Red Hat to fund developer
> travel most of which involved embarrassing Brian Stevens into approving
> it over the objections of his managers.  I don't want to go into detail
> about how Red Hat reached this situation; I just want to remind you
> that it happened before and it could happen again.

The proposal to split the design summit off actually aims to reduce
the travel cost burden. Currently we have a conference+design summit
at the wrong time, which is fairly unproductive due to people being
pulled out of the design summit for other tasks. So we "fixed" that
by introducing mid-cycles to get real design work done. IOW contributors
end up with 4 events to travel to each year. With the proposed split
of the conference from the design summit

Re: [openstack-dev] [all] A proposal to separate the design summit

2016-02-26 Thread Daniel P. Berrange
On Fri, Feb 26, 2016 at 10:39:08AM -0500, Rich Bowen wrote:
> 
> 
> On 02/22/2016 10:14 AM, Thierry Carrez wrote:
> > Hi everyone,
> > 
> > TL;DR: Let's split the events, starting after Barcelona.
> > 
> > 
> > 
> > Comments, thoughts ?
> 
> Thierry (and Jay, who wrote a similar note much earlier in February, and
> Lauren, who added more clarity over on the marketing list, and the many,
> many of you who have spoken up in this thread ...),
> 
> as a community guy, I have grave concerns about what the long-term
> effect of this move would be. I agree with your reasons, and the
> problems, but I worry that this is not the way to solve it.
>
> Summit is one time when we have an opportunity to hold community up to
> the folks that think only product - to show them how critical it is that
> the people that are on this mailing list are doing the awesome things
> that they're doing, in the upstream, in cooperation and collaboration
> with their competitors.
> 
> I worry that splitting the two events would remove the community aspect
> from the conference. The conference would become more corporate, more
> product, and less project.
> 
> My initial response was "crap, now I have to go to four events instead
> of two", but as I thought about it, it became clear that that wouldn't
> happen. I, and everyone else, would end up picking one event or the
> other, and the division between product and project would deepen.
> 
> Summit, for me specifically, has frequently been at least as much about
> showing the community to the sales/marketing folks in my own company, as
> showing our wares to the customer.

I think what you describe is a prime reason why separating the
events would be *beneficial* for the community contributors. The
conference has long ago become so corporate focused that its sessions
offer little to no value to me as a project contributor. What you
describe as a benefit of being able to put community people in front
of business people is in fact a significant negative for the design
summit productivity. It causes key community contributors to be pulled
out of important design sessions to go talk to business people, making
the design sessions significantly less productive.

> Now, I know you guys put on awesome events, and you have probably
> thought about this already. The proposal to have the events be
> back-to-back across a weekend may indeed address some of these concerns,
> at the cost of the "less expensive city and venue" part of the proposal,
> and at the cost of being away from my family over yet another weekend.

Back-to-back events crossing over a weekend are just a complete
non-starter of an idea, due to the increased time away & giving up
personal time at the weekends for work.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [all] A proposal to separate the design summit

2016-02-25 Thread Daniel P. Berrange
On Thu, Feb 25, 2016 at 12:40:27PM +0100, Thierry Carrez wrote:
> Qiming Teng wrote:
> >[...]
> >Week 1:
> >   Wednesday-Friday: 3 days Summit.
> > * Primarily an event for marketing, sales, CTOs, architects,
> >   operators, journalists, ...
> > * Contributors can decide whether they want to attend this.
> >   Saturday-Sunday:
> > * Social activities: contributors meet-up, hang outs ...
> >
> >Week 2:
> >   Monday-Wednesday: 3 days Design Summit
> > * Primarily an event for developers.
> > * Operators can hold meetups during these days, or join project
> >   design summits.
> >
> >If you need to attend both events, you don't need two trips. Scheduling
> >both events by the end of a release cycle can help gather more
> >meaningful feedbacks, experiences or lessons from previous releases and
> >ensure a better plan for the coming release.
> >
> >If you want to attend just the main Summit or only the Design Summit,
> >you can plan your trip accordingly.
> 
> This was an option we considered. The main objection was that we are pretty
> burnt out and ready to go home when comes Friday on a single-week event, so
> the prospect of doing two consecutive weeks looked a bit like madness
> (especially considering ancillary events like upstream training, the board
> meeting etc. which tend to happen on the weekend before summit already). It
> felt like a good way to reduce our productivity and not make the most of the
> limited common time together. Furthermore it doesn't solve the issue of
> suboptimal timing as described in my original email.

I'd wager a sizeable number of contributors would outright refuse to attend
an event for 2 weeks. 6-7 days away from family is already a long time. As
such, I would certainly never do any event which spanned 2 weeks, even if
both weeks were relevant to my work.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] nova-compute blocking main thread under heavy disk IO

2016-02-22 Thread Daniel P. Berrange
On Mon, Feb 22, 2016 at 12:07:37PM -0500, Sean Dague wrote:
> On 02/22/2016 10:43 AM, Chris Friesen wrote:
> > Hi all,
> > 
> > We've recently run into some interesting behaviour that I thought I
> > should bring up to see if we want to do anything about it.
> > 
> > Basically the problem seems to be that nova-compute is doing disk I/O
> > from the main thread, and if it blocks then it can block all of
> > nova-compute (since all eventlets will be blocked).  Examples that we've
> > found include glance image download, file renaming, instance directory
> > creation, opening the instance xml file, etc.  We've seen nova-compute
> > block for upwards of 50 seconds.
> > 
> > Now the specific case where we hit this is not a production
> > environment.  It's only got one spinning disk shared by all the guests,
> > the guests were hammering on the disk pretty hard, the IO scheduler for
> > the instance disk was CFQ which seems to be buggy in our kernel.
> > 
> > But the fact remains that nova-compute is doing disk I/O from the main
> > thread, and if the guests push that disk hard enough then nova-compute
> > is going to suffer.
> > 
> > Given the above...would it make sense to use eventlet.tpool or similar
> > to perform all disk access in a separate OS thread?  There'd likely be a
> > bit of a performance hit, but at least it would isolate the main thread
> > from IO blocking.
> 
> Making nova-compute more robust is fine, though the reality is once you
> IO starve a system, a lot of stuff is going to fall over weird.
> 
> So there has to be a tradeoff of the complexity of any new code vs. what
> it gains. I think individual patches should be evaluated as such, or a
> spec if this is going to get really invasive.

There are OS level mechanisms (e.g. the cgroups blkio controller) for doing
I/O prioritization that you could use to give Nova higher priority over
the VMs, to reduce (if not eliminate) the possibility that a busy VM
can inflict a denial of service on the mgmt layer. Of course figuring
out how to use that mechanism correctly is not entirely trivial.

I think it is probably worth focusing effort in that area before jumping
into making all the I/O related code in Nova more complicated, e.g. have
someone investigate & write up a recommendation in the Nova docs for how to
configure the host OS & Nova such that VMs cannot inflict an I/O denial
of service attack on the mgmt service.
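
To make that concrete, here is a minimal sketch of the kind of host-side
tuning meant above, using the cgroup v1 blkio controller (which only has
an effect with the CFQ scheduler); the cgroup paths and weight values are
assumptions that vary by distro and deployment:

    import os

    # Proportional I/O weights for CFQ, valid range 100-1000. On a typical
    # systemd layout, host services (including nova-compute) live under
    # system.slice while libvirt/KVM guests live under machine.slice.
    WEIGHTS = {
        '/sys/fs/cgroup/blkio/system.slice': 800,
        '/sys/fs/cgroup/blkio/machine.slice': 200,
    }


    def apply_blkio_weights(weights):
        for cgroup, weight in weights.items():
            with open(os.path.join(cgroup, 'blkio.weight'), 'w') as f:
                f.write(str(weight))


    if __name__ == '__main__':
        apply_blkio_weights(WEIGHTS)

libvirt's <blkiotune> element offers similar per-guest weighting if finer
grained control is wanted.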

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [all] A proposal to separate the design summit

2016-02-22 Thread Daniel P. Berrange
On Mon, Feb 22, 2016 at 04:14:06PM +0100, Thierry Carrez wrote:
> Hi everyone,
> 
> TL;DR: Let's split the events, starting after Barcelona.

Yes, please. Your proposal addresses the big issue I have with current
summits which is the really poor timing wrt start of each dev cycle.

> The idea would be to split the events. The first event would be for upstream
> technical contributors to OpenStack. It would be held in a simpler,
> scaled-back setting that would let all OpenStack project teams meet in
> separate rooms, but in a co-located event that would make it easy to have
> ad-hoc cross-project discussions. It would happen closer to the centers of
> mass of contributors, in less-expensive locations.

The idea that we can choose less expensive locations is great, but I'm a
little wary of focusing too much on "centers of mass of contributors", as
it can easily become an excuse to have it in roughly the same places each
time. As a non-USA based contributor, I really value the fact that the
summits rotate around different regions instead of spending all the time
in the USA, as was the case in OpenStack's early days. Minimizing travel
costs is no doubt a welcome aim for companies' budgets, but it should not
be allowed to dominate to such a large extent that we miss representation
of different regions. I.e. if we never went back to Asia because it is
cheaper for the /current/ majority of contributors to go to the US, we'll
make it harder to attract new contributors from those regions we avoid on
cost grounds. The "center of mass of contributors" could become a self-
fulfilling prophecy.

IOW, I'm onboard with choosing less expensive locations, but would like
to see us still make the effort to reach out across different regions
for the events, and not become too US focused once again.

> The split should ideally reduce the needs to organize separate in-person
> mid-cycle events. If some are still needed, the main conference venue and
> time could easily be used to provide space for such midcycle events (given
> that it would end up happening in the middle of the cycle).

The obvious risk with suggesting that current mid-cycle events could take
place alongside the business conference is that the "business conference"
ends up being just as large as our combined conference is today. IOW we
risk actually creating 4 big official developer events a year, instead of
the current 2 events + small unofficial mid-cycles. You'd need to find some
way to limit the scope of any "mid cycle" events co-located with the
business conference to prevent them growing out of hand. We really want to
make sure we keep the mid-cycles portrayed as optional small-scale
"hackathons", and not something that contributors feel obligated to
attend. IMHO they're already risking getting out of hand - it is hard to
feel well connected to development plans if you miss the mid-cycle events.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [Kuryr] will we use os-vif in kuryr

2016-02-18 Thread Daniel P. Berrange
On Thu, Feb 18, 2016 at 09:01:35AM +, Liping Mao (limao) wrote:
> Hi Kuryr team,
> 
> I see couple of commits to add support for vif plug.
> https://review.openstack.org/#/c/280411/
> https://review.openstack.org/#/c/280878/
> 
> Do we have plan to use os-vif?
> https://github.com/openstack/os-vif

FYI, we're trying reasonably hard to *not* make any assumptions about
what compute or network services are using os-vif. I.e., we want os-vif
as a framework to be usable from Nova, or any other compute manager,
and likewise be usable from Neutron or any other network manager.
Obviously the actual implementations may be different, but the general
os-vif framework tries to be agnostic.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] ram size and huge page number for VM

2016-02-09 Thread Daniel P. Berrange
On Mon, Feb 08, 2016 at 07:47:56PM +, Serguei Bezverkhi (sbezverk) wrote:
> Hello,
> 
> I using the latest Liberty and I am trying to bring up a VM with some numa
> options configured using flavor. Specifically I need to give this VM 16GB
> of RAM and in addition  it will need to use 12x1GB huge pages. What I found
> out, there is no way in a flavor to specify RAM size and separately number
> of huge pages, only size is avaialbe, so to achieve what I needed I had to
> specify RAM size of 30GB. It would not be that bad, but instead of taken
> 16GB from server's RAM and 12 GB from Huge pages it has taken whole 30GB
> out of the server's total huge pages pool. Huge waist!!!

It is not actually a huge waste - in fact it is likely to be beneficial to the
overall performance of your VM. You need to remember that there is no direct
relationship between allocation of huge pages to KVM and usage of huge pages
inside the guest OS - any combination is valid. In particular, if you give
huge pages to KVM, then this has a performance benefit to the guest, even if
the guest OS doesn't use huge pages itself, because it increases the TLB hit
rate in the host for memory accesses by the guest. Using huge pages inside
the guest OS too increases the TLB hit rate of the guest hardware, further
extending this performance benefit. As such, hosts which are intended to use
huge pages should have all their host memory allocated to the huge page pool,
except for what is required to run the various host OS services.
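
For reference, a minimal sketch of how a flavor would ask for 1G host huge
pages to back all of the guest RAM; the auth details, flavor name and sizes
below are made-up placeholders:

    from keystoneauth1.identity import v3
    from keystoneauth1 import session
    from novaclient import client

    auth = v3.Password(auth_url='http://keystone.example.com:5000/v3',
                       username='admin', password='secret',
                       project_name='admin',
                       user_domain_id='default', project_domain_id='default')
    nova = client.Client('2', session=session.Session(auth=auth))

    # 16GB of RAM, all of it backed by 1G huge pages on the host.
    flavor = nova.flavors.create('m1.hugepages', ram=16384, vcpus=8, disk=40)
    flavor.set_keys({'hw:mem_page_size': '1GB'})

There is deliberately no way to say "16GB of RAM but only 12GB from huge
pages" - the requested page size applies to the whole of the guest's RAM.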

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [all][tc] Proposal: Separate design summits from OpenStack conferences

2016-02-08 Thread Daniel P. Berrange
On Sun, Feb 07, 2016 at 03:07:20PM -0500, Jay Pipes wrote:
> I would love to see the OpenStack contributor community take back the design
> summit to its original format and purpose and decouple it from the OpenStack
> Summit's conference portion.
> 
> I believe the design summits should be organized by the OpenStack
> contributor community, not the OpenStack Foundation and its marketing and
> event planning staff. This will allow lower-cost venues to be chosen that
> meet the needs only of the small group of active contributors, not of huge
> masses of conference attendees. This will allow contributor companies to
> send *more* engineers to *more* design summits, which is something that
> really needs to happen if we are to grow our active contributor pool.
> 
> Once this decoupling occurs, I think that the OpenStack Summit should be
> renamed to the OpenStack Conference and Expo to better fit its purpose and
> focus. This Conference and Expo event really should be held once a year, in
> my opinion, and continue to be run by the OpenStack Foundation.
> 
> I, for one, would welcome events that have no conference check-in area, no
> evening parties with 2000 people, no keynote and powerpoint-as-a-service
> sessions, and no getting pulled into sales meetings.
> 
> OK, there, I said it.
> 
> Thoughts? Criticism? Support? Suggestions welcome.

I really agree with everything you say, except for the bit about the
community doing the organization - I think it's fine to let Foundation event
staff continue to carry the burden of planning, as long as their goals are
directed by the community's needs.

I might suggest that we could be a bit more radical with the developer
event and decouple the timing from the release cycle. The design summits
are portrayed as events where we plan the next 6 months of work, but the
release has already been open for a good 2-3 or more weeks before we meet
in the design summit. This always makes the first month of each development
cycle pretty inefficient as decisions are needlessly postponed until the
summit. The bulk of specs approval then doesn't happen until after the
summit, leaving even less time until feature freeze to get the work done.

In nova at least, many of the major "priority themes" we decide upon
tend to span multiple development cycles, and we broadly seem
to have a good understanding of what the upcoming themes will be before
we get to the summit. The other problem with the design summit is that
since we have often not yet started the bulk of the dev work, we don't yet
know all the problems we're going to encounter. So we can talk forever
about theoretical stuff which never becomes an issue, while the actual
problems we uncover during implementation have to wait until the mid-cycle
for the real problem solving work. IOW I'm not really convinced we actually
need to have the design summit as a forum for "planning the next release",
nor is it enormously useful for general problem solving, since it comes
too early in the dev process.

I think that our processes would become more efficient if we were to
decouple the design summit from the release cycle. We would be able to
focus on release planning right from the start of the dev cycle and not
pointlessly postpone decisions to a design summit, which would give us
more time to actually get the planned work written earlier in the cycle.

This would in turn let us make the developer summits into something which
strongly focuses on problem solving, where f2f collaboration is of maximum
benefit. IOW, it would be kind of like merging the design summit & midcycle
concepts into one - we'd have the benefits of the mid-cycle's focus on
explicit problem solving, combined with the ability to have cross-project
collaboration by being co-located with other projects. Instead of having
4 travel events a year, due to the need to fix them at 6-month intervals to
align with the release schedules, we could cut down to 2 or 3 developer events
a year, which are more productive overall.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] Migration progress

2016-02-03 Thread Daniel P. Berrange
On Wed, Feb 03, 2016 at 10:44:36AM +, Daniel P. Berrange wrote:
> On Wed, Feb 03, 2016 at 10:37:24AM +, Koniszewski, Pawel wrote:
> > Hello everyone,
> > 
> > On the yesterday's live migration meeting we had concerns that interval of
> > writing migration progress to the database is too short.
> > 
> > Information about migration progress will be stored in the database and
> > exposed through the API (/servers//migrations/). In current
> > proposition [1] migration progress will be updated every 2 seconds. It
> > basically means that every 2 seconds a call through RPC will go from compute
> > to conductor to write migration data to the database. In case of parallel
> > live migrations each migration will report progress by itself.
> > 
> > Isn't 2 seconds interval too short for updates if the information is exposed
> > through the API and it requires RPC and DB call to actually save it in the
> > DB?
> > 
> > Our default configuration allows only for 1 concurrent live migration [2],
> > but it might vary between different deployments and use cases as it is
> > configurable. Someone might want to trigger 10 (or even more) parallel live
> > migrations and each might take even a day to finish in case of block
> > migration. Also if deployment is big enough rabbitmq might be fully-loaded.
> > I'm not sure whether updating each migration every 2 seconds makes sense in
> > this case. On the other hand it might be hard to observe fast enough that
> > migration is stuck if we increase this interval...
> 
> Do we have any actual data that this is a real problem. I have a pretty hard
> time believing that a database update of a single field every 2 seconds is
> going to be what pushes Nova over the edge into a performance collapse, even
> if there are 20 migrations running in parallel, when you compare it to the
> amount of DB queries & updates done across other areas of the code for pretty
> much every single API call and background job.

Also note that progress is rounded to the nearest integer. So even if the
migration runs all day, there is a maximum of 100 possible changes in value
for the progress field, so most of the updates should turn into no-ops at
the database level.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] Migration progress

2016-02-03 Thread Daniel P. Berrange
On Wed, Feb 03, 2016 at 10:37:24AM +, Koniszewski, Pawel wrote:
> Hello everyone,
> 
> On the yesterday's live migration meeting we had concerns that interval of
> writing migration progress to the database is too short.
> 
> Information about migration progress will be stored in the database and
> exposed through the API (/servers//migrations/). In current
> proposition [1] migration progress will be updated every 2 seconds. It
> basically means that every 2 seconds a call through RPC will go from compute
> to conductor to write migration data to the database. In case of parallel
> live migrations each migration will report progress by itself.
> 
> Isn't 2 seconds interval too short for updates if the information is exposed
> through the API and it requires RPC and DB call to actually save it in the
> DB?
> 
> Our default configuration allows only for 1 concurrent live migration [2],
> but it might vary between different deployments and use cases as it is
> configurable. Someone might want to trigger 10 (or even more) parallel live
> migrations and each might take even a day to finish in case of block
> migration. Also if deployment is big enough rabbitmq might be fully-loaded.
> I'm not sure whether updating each migration every 2 seconds makes sense in
> this case. On the other hand it might be hard to observe fast enough that
> migration is stuck if we increase this interval...

Do we have any actual data that this is a real problem? I have a pretty hard
time believing that a database update of a single field every 2 seconds is
going to be what pushes Nova over the edge into a performance collapse, even
if there are 20 migrations running in parallel, when you compare it to the
amount of DB queries & updates done across other areas of the code for pretty
much every single API call and background job.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] Migration progress

2016-02-03 Thread Daniel P. Berrange
On Wed, Feb 03, 2016 at 11:27:16AM +, Paul Carlton wrote:
> On 03/02/16 10:49, Daniel P. Berrange wrote:
> >On Wed, Feb 03, 2016 at 10:44:36AM +0000, Daniel P. Berrange wrote:
> >>On Wed, Feb 03, 2016 at 10:37:24AM +, Koniszewski, Pawel wrote:
> >>>Hello everyone,
> >>>
> >>>On the yesterday's live migration meeting we had concerns that interval of
> >>>writing migration progress to the database is too short.
> >>>
> >>>Information about migration progress will be stored in the database and
> >>>exposed through the API (/servers//migrations/). In current
> >>>proposition [1] migration progress will be updated every 2 seconds. It
> >>>basically means that every 2 seconds a call through RPC will go from 
> >>>compute
> >>>to conductor to write migration data to the database. In case of parallel
> >>>live migrations each migration will report progress by itself.
> >>>
> >>>Isn't 2 seconds interval too short for updates if the information is 
> >>>exposed
> >>>through the API and it requires RPC and DB call to actually save it in the
> >>>DB?
> >>>
> >>>Our default configuration allows only for 1 concurrent live migration [2],
> >>>but it might vary between different deployments and use cases as it is
> >>>configurable. Someone might want to trigger 10 (or even more) parallel live
> >>>migrations and each might take even a day to finish in case of block
> >>>migration. Also if deployment is big enough rabbitmq might be fully-loaded.
> >>>I'm not sure whether updating each migration every 2 seconds makes sense in
> >>>this case. On the other hand it might be hard to observe fast enough that
> >>>migration is stuck if we increase this interval...
> >>Do we have any actual data that this is a real problem. I have a pretty hard
> >>time believing that a database update of a single field every 2 seconds is
> >>going to be what pushes Nova over the edge into a performance collapse, even
> >>if there are 20 migrations running in parallel, when you compare it to the
> >>amount of DB queries & updates done across other areas of the code for 
> >>pretty
> >>much every single API call and background job.
> >Also note that progress is rounded to the nearest integer. So even if the
> >migration runs all day, there is a maximum of 100 possible changes in value
> >for the progress field, so most of the updates should turn in to no-ops at
> >the database level.
> >
> >Regards,
> >Daniel
> I agree with Daniel, these rpc and db access ops are a tiny percentage
> of the overall load on rabbit and mysql and properly configured these
> subsystems should have no issues with this workload.
> 
> One correction, unless I'm misreading it, the existing
> _live_migration_monitor code updates the progress field of the instance
> record every 5 seconds.  However this value can go up and down so
> an infinite number of updates are possible?

Oh yes, you are in fact correct. Technically you could have an unbounded
number of updates if migration goes backwards. Some mitigation against
this is that if we see progress going backwards we'll actually abort the
migration if it gets stuck for too long. We'll also be progressively
increasing the permitted downtime. So except in pathological scenarios
I think the number of updates should still be relatively small.

> However, the issue raised here is not with the existing implementation
> but with the proposed change
> https://review.openstack.org/#/c/258813/5/nova/virt/libvirt/driver.py
> This add a save() operation on the migration object every 2 seconds

Ok, that is more heavyweight since it is recording the raw byte values
and so it is guaranteed to do a database update pretty much every time.
It still shouldn't be an unreasonable load though. FWIW I think
it is worth being consistent in the update frequency between the
progress value & the migration object save, so switching to every
5 seconds probably makes more sense, so that we know both objects are
reflecting the same point in time.
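
To illustrate the cadence being discussed, here is a minimal sketch of a
monitor loop that persists both records on the same 5-second interval, but
only writes the instance progress when the rounded value has changed (per
the earlier point about rounding); the function names are illustrative
stand-ins, not the actual driver code:

    import time


    def monitor_migration(get_progress, get_bytes_remaining,
                          save_instance_progress, save_migration, interval=5):
        """Poll a running migration and persist its state every `interval` secs."""
        last_progress = None
        while True:
            progress = get_progress()
            if progress is None:
                # Migration finished or aborted; stop monitoring.
                break
            rounded = int(round(progress))
            if rounded != last_progress:
                # The rounded percentage rarely changes, so most iterations
                # skip this DB write entirely.
                save_instance_progress(rounded)
                last_progress = rounded
            # Raw byte counters change on every pass, so this is a real DB
            # write each time - hence keeping the interval at 5 seconds.
            save_migration(get_bytes_remaining())
            time.sleep(interval)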

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova][virt] rebuild action not in support-matrix

2016-02-02 Thread Daniel P. Berrange
On Mon, Feb 01, 2016 at 05:04:37PM -0700, Matt Riedemann wrote:
> 
> 
> On 2/1/2016 12:39 PM, Chen CH Ji wrote:
> >Hi
> >   We have been trying to enablement of our CI work for our nova
> >virt layer code ,so we need to configure the tempest cases based on our
> >nova driver capability
> >   I found that rebuild action is not listed in [1] (only talk
> >about rebuild in evacuate), but code [2] seems support virt layer
> >abstraction
> >   can someone point the rebuild action in [1] or it's missing
> >on purpose ? Thanks
> >
> >[1]http://docs.openstack.org/developer/nova/support-matrix.html
> >[2]https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2920
> >
> >
> >
> >
> 
> Only the Ironic driver overrides the rebuild method, otherwise the compute
> manager has a default impl, so it's technically implemented for all virt
> drivers. There is also confusion around rebuild vs evacuate since both
> operations go through the rebuild_instance method in the compute manager,
> they are just separated by the 'recreate' parameter.
> 
> danpb might have reasons for not listing rebuild in the hypervisor support
> matrix - it might have just never been on the original wiki matrix. It'd be
> worth asking him.

The hypervisor matrix just copied the data from the original wiki. It is
certainly not a complete list of all features that are relevant. You could
make the matrix 10x bigger and it still wouldn't cover all interesting facts
across virt drivers. If anyone has things they want shown they should submit
patches.

> But at the same time, since there is a default implementation, I'm also not
> sure if it's worth listing separately in the support matrix (but is also
> confusing I suppose to not list it at all).

That there is a default impl is really just an impl detail - if it is an
interesting feature from the user POV it is worth listing IMHO.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [glance][ironic][cinder][nova] 'tar' as an image disk_format

2016-01-27 Thread Daniel P. Berrange
On Wed, Jan 27, 2016 at 08:32:58AM -0430, Flavio Percoco wrote:
> On 27/01/16 08:20 -0430, Flavio Percoco wrote:
> >On 26/01/16 09:11 +0000, Daniel P. Berrange wrote:
> >>On Sun, Jan 24, 2016 at 12:00:16AM +0200, Duncan Thomas wrote:
> >>>I guess my wisdom would be 'why'? What does this enable you to do that you
> >>>couldn't do with similar ease with the formats we have and are people
> >>>trying to do that frequently.
> >>>
> >>>We've seen in cinder that image formats have a definite security surface to
> >>>them, and with glance adding arbitrary conversion pipelines, that surface
> >>>is going to increase with every format we add. This should mean we tend
> >>>towards being increasingly conservative I think.
> >>
> >>Safely extracting tar file contents to create a disk image to run the VM
> >>from is particularly non-trivial. There have been many security flaws in
> >>the past with apps doing tar file unpacking in this kind of scenario. For
> >>example, Docker has had not one, but *three* vulnerabilities in this area
> >>CVE-2014-6407, CVE-2014-9356, and CVE-2014-9357. So unless there is a
> >>pretty compelling reason, I'd suggest we stay away from supporting tar
> >>as an image format, and require traditional image formats where we can
> >>treat the file payload as an opaque blob and thus avoid all these file
> >>processing risks.
> >
> >++
> >
> >From a Glance perspective, there wouldn't be much to do and most of the 
> >security
> >issues would live in the Ironic side. However, as a community, I think we 
> >should
> >send a clear message and protect our users and, in this case, the best way 
> >is to
> >avoid adding this format as supported.
> >
> >In future works (image conversions and whatnot) this could impact Glance as 
> >well.
> 
> It was brought to my attention (thanks Erno) that we support OVA already. This
> means we're basically exposed to the above already as the OVA container is a
> tarball anyway.
> 
> Glance protects itself from this by either not doing anything to the image or
> isolating operations on the image to specific workers (of course, this goes in
> addition to other security measures).
> 
> The difference, though, is that OVA files are a known container format for
> images, whereas tar.gz isn't.

NB nova doesn't do anything with OVA files either. IIRC, the only virt driver
that supports them is VMware, and Nova just passes the file through as-is
to VMware for processing. For libvirt / KVM we don't support OVA files at
all, partly because we don't want to be in the business of unpacking them
and turning them into disk images ourselves.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [glance][ironic][cinder][nova] 'tar' as an image disk_format

2016-01-26 Thread Daniel P. Berrange
On Sun, Jan 24, 2016 at 12:00:16AM +0200, Duncan Thomas wrote:
> I guess my wisdom would be 'why'? What does this enable you to do that you
> couldn't do with similar ease with the formats we have and are people
> trying to do that frequently.
> 
> We've seen in cinder that image formats have a definite security surface to
> them, and with glance adding arbitrary conversion pipelines, that surface
> is going to increase with every format we add. This should mean we tend
> towards being increasingly conservative I think.

Safely extracting tar file contents to create a disk image to run the VM
from is particularly non-trivial. There have been many security flaws in
the past with apps doing tar file unpacking in this kind of scenario. For
example, Docker has had not one, but *three* vulnerabilities in this area:
CVE-2014-6407, CVE-2014-9356, and CVE-2014-9357. So unless there is a
pretty compelling reason, I'd suggest we stay away from supporting tar
as an image format, and require traditional image formats where we can
treat the file payload as an opaque blob and thus avoid all these file
processing risks.
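
To give a flavour of why it is non-trivial, here is a minimal sketch of the
kind of checks an extractor would need before trusting a tar payload; it
only covers path traversal and special members, and real hardening needs
much more (symlink tricks, hard links, resource exhaustion, ...):

    import os
    import tarfile


    def safe_extract(archive_path, dest_dir):
        dest_dir = os.path.realpath(dest_dir)
        with tarfile.open(archive_path) as tar:
            for member in tar.getmembers():
                target = os.path.realpath(
                    os.path.join(dest_dir, member.name))
                # Reject members that would escape the destination directory.
                if not target.startswith(dest_dir + os.sep):
                    raise ValueError('blocked path traversal: %s'
                                     % member.name)
                # Reject symlinks, hard links, devices, fifos, etc.
                if not (member.isreg() or member.isdir()):
                    raise ValueError('blocked special member: %s'
                                     % member.name)
            tar.extractall(dest_dir)

Treating the payload as an opaque blob sidesteps this entire class of
problem, which is the point being made above.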

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [all][tc] Stabilization cycles: Elaborating on the idea to move it forward

2016-01-21 Thread Daniel P. Berrange
On Wed, Jan 20, 2016 at 01:23:02PM -0430, Flavio Percoco wrote:
> Greetings,
> 
> At the Tokyo summit, we discussed OpenStack's development themes in a
> cross-project session. In this session a group of folks started discussing 
> what
> topics the overall community could focus on as a shared effort. One of the
> things that was raised during this session is the need of having cycles to
> stabilize projects. This was brought up by Robert Collins again in a 
> meeting[0]
> the TC had right after the summit and no much has been done ever since.
> 
> Now, "stabilization Cycles" are easy to dream about but really hard to do and
> enforce. Nonetheless, they are still worth a try or, at the very least, a
> thought. I'll try to go through some of the issues and benefits a 
> stabilization
> cycle could bring but bear in mind that the lists below are not exhaustive. In
> fact, I'd love for other folks to chime in and help building a case in favor 
> or
> against this.
> 
> Negative(?) effects
> ===
> 
> - Project won't get new features for a period of time Economic impact on
>  developers(?)
> - It was mentioned that some folks receive bonuses for landed features
> - Economic impact on companies/market because no new features were added (?)
> - (?)

It will push more development into non-upstream vendor private
branches.

> 
> Positive effects
> 
> 
> - Focus on bug fixing
> - Reduce review backlog
> - Refactor *existing* code/features with cleanups
> - Focus on multi-cycle features (if any) and complete those
> - (?)

I don't think the idea of stabilization cycles would really have
such a positive effect, certainly not while our release cycle is
6 months in length.

If you say the next cycle is primarily stabilization, then what
you are in effect saying is that people have to wait 12 months
for their desired new feature. In the fast-moving world of
cloud, I don't think that is a very credible approach. Even
with our current workflow, where we selectively approve features
for cycles, we have this impact of forcing people to wait 12
months, or more, for their features.

In the non-stabilization cycle, we're not going to be able to
merge a larger number of features than we already do today.
So in effect we'll have 2 cycles' worth of features being
proposed for 1 cycle. When we inevitably reject many of
those features they'll have to wait for the next non-stabilization
cycle, which means an 18-24 month delay.

Of course in reality this kind of delay won't happen. What will
instead happen is that various vendors will get pressure from
their customers/partners and their local branches of OpenStack
packages will fork & diverge even further from upstream than
they already do today.

So while the upstream branch will be "stabilized", most users will
probably get a *less* stable release because they'll be using
a branch from vendors with a tonne of non-upstream stuff added.


In addition, having a stabilization cycle will give the impression
that the following cycle is a non-stable one and likely cause
more disruption by pushing lots of features in at one time.
Instead of having a master branch which has an approximately
constant level of stability, you'll create a situation
where it fluctuates significantly, which is clearly worse for
people doing continuous deployment.

I think it is important to have the mindset that master should
*always* be considered stable - we already have this in general
and it is one of the success points of OpenStack's development
model IMHO. The idea of stabilization cycles is a step backwards.

I still believe that if you want to improve the stability of the
codebase, we'd be better off moving to a shorter development
cycle. Even the 6-month cycle we have today is quite "lumpy"
in terms of what kind of work happens from month to month. If
we moved to a 2-month cycle, I think it would relieve pressure
to push in features quickly before freeze, because people would
know they'd have another opportunity very soon, instead of having
to wait 6+ months. I've previously suggested that here:

  http://lists.openstack.org/pipermail/openstack-dev/2015-February/057614.html

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



[openstack-dev] [nova][docker][powervm] out of tree virt driver breakage

2016-01-19 Thread Daniel P. Berrange
This is an alert for anyone who maintains an out-of-tree virt driver
for Nova (docker & powervm are the 2 I know of).

The following change has just merged, changing the nova/virt/driver.py
API, and as such it will break any out-of-tree virt drivers until they
are updated:

  commit fbe31e461ac3f16edb795993558a2314b4c16b52
  Author: Daniel P. Berrange <berra...@redhat.com>
  Date:   Mon Jun 8 17:58:09 2015 +0100

compute: convert manager to use nova.objects.ImageMeta

Update the virt driver API so that all methods with an
'image_meta' parameter take a nova.objects.ImageMeta
instance instead of a dict.

NB, this will break out of tree virt drivers until they
convert their code to use the new object.

Blueprint: mitaka-objects
Change-Id: I75465a2029b53aa4d338b80619ed7380e0d19e6a

Anywhere in your virt driver impl that uses the 'image_meta' parameter
should be updated to use the nova.objects.ImageMeta instance rather
than assuming it is a dict.
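
For out-of-tree driver authors, here is a minimal before/after sketch of the
sort of change required; the property name used is just illustrative:

    # Before: image_meta was a plain dict, properties read as nested keys.
    def get_disk_bus_old(image_meta):
        return image_meta.get('properties', {}).get('hw_disk_bus')


    # After: image_meta is a nova.objects.ImageMeta with typed fields; its
    # 'properties' field supports dict-like lookups via get().
    def get_disk_bus(image_meta):
        return image_meta.properties.get('hw_disk_bus')

nova.objects.ImageMeta.from_dict() is also handy in unit tests for building
the object from the legacy dict form.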

If you have any trouble understanding how to update the code, reply
to this message or find me on IRC for guidance, or look at changes
made to the libvirt/xenapi/vmware drivers in tree.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova][infra] Ability to run newer QEMU in Gate jobs

2016-01-19 Thread Daniel P. Berrange
On Tue, Jan 19, 2016 at 05:47:58PM +, Jeremy Stanley wrote:
> On 2016-01-19 18:32:38 +0100 (+0100), Kashyap Chamarthy wrote:
> [...]
> > Matt Riedemann tells me on IRC that multi-node live migration job is
> > currently Ubuntu only, and to get a newer QEMU, it has to be added to
> > Ubuntu Cloud Archive.
> [...]
> 
> As discussed recently on another thread[1], we're not currently
> using UCA in jobs either. We can discuss it, but generally by the
> time people start actively wanting newer whatever we're only a few
> months away from the next LTS anyway. In this case I have hopes that
> in a few months we'll be able to start running jobs on Ubuntu 16.04
> LTS, which looks like it's going to ship with QEMU 2.5.

We'll almost certainly need to be able to test QEMU 2.6 in the N
release cycle, since that'll (hopefully) include support for TLS
encrypted migration & NBD traffic. So I don't think waiting for
LTS releases is a viable strategy in general - we'll need UCA to
be available for at least some set of jobs we run. Alternatively,
stick with the LTS release for Ubuntu, and run other jobs with Fedora
and the virt-preview repository to give us coverage of the cutting
edge QEMU/libvirt stack.

> Alternatively, look into getting a live migration job running on
> CentOS 7 or Fedora 23 if it can't wait until after Mitaka.

CentOS 7 might be a nice target, since I think it'll likely have
more reliable migration support at the QEMU level than any distros
shipping close-to-upstream QEMU versions.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Proposal: copyright-holders file in each project, or copyright holding forced to the OpenStack Foundation

2016-01-15 Thread Daniel P. Berrange
On Fri, Jan 15, 2016 at 12:53:49PM +, Chris Dent wrote:
> On Fri, 15 Jan 2016, Thomas Goirand wrote:
> 
> >Whatever we choose, I think we should ban having copyright holding text
> >within our source code. While licensing is a good idea, as it is
> >accurate, the copyright holding information isn't and it's just missleading.
> 
> I think we should not add new copyright notifications in files.
> 
> I'd also be happy to see all the existing ones removed, but that may
> be a bigger problem.

Only the copyright holder who added the notice is permitted to
remove it. ie you can't unilaterally remove Copyright notices
added by other copyright holders. See LICENSE term (4)(c)

While you could undertake an exercise to get agreement from
every copyright holder to remove their notices, it is honestly
not worth the work IMHO.

> >If I was the only person to choose, I'd say let's go for 1/, but
> >probably managers of every company wont agree.
> 
> I think option one is correct.

Copyright assignment is never the correct answer.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Proposal: copyright-holders file in each project, or copyright holding forced to the OpenStack Foundation

2016-01-15 Thread Daniel P. Berrange
On Fri, Jan 15, 2016 at 08:48:21PM +0800, Thomas Goirand wrote:
> This isn't the first time I'm calling for it. Let's hope this time, I'll
> be heard.
> 
> Randomly, contributors put their company names into source code. When
> they do, then effectively, this tells that a given source file copyright
> holder is whatever is claimed, even though someone from another company
> may have patched it.
> 
> As a result, we have a huge mess. It's impossible for me, as a package
> maintainer, to accurately set the copyright holder names in the
> debian/copyright file, which is a required by the Debian FTP masters.

I don't think OpenStack is in a different situation to the vast
majority of open source projects I've worked with or seen. Except
for those projects requiring copyright assignment to a single
entity, it is normal for source files to contain an unreliable
random splattering of Copyright notices. This hasn't seemed to
create a blocking problem for their maintenance in Debian. Looking
at the debian/copyright files I see most of them have just done a
grep for the 'Copyright' statements & included them as is - IOW just
ignored the fact that this is essentially worthless info and included
it regardless.

> I see 2 ways forward:
> 1/ Require everyone to give-up copyright holding, and give it to the
> OpenStack Foundation.
> 2/ Maintain a copyright-holder file in each project.

3/ Do nothing, just populate debian/copyright with the random
   set of 'Copyright' lines that happen to be in the source files,
   as appears to be common practice across many debian packages

   eg the kernel package


http://metadata.ftp-master.debian.org/changelogs/main/l/linux/linux_3.16.7-ckt11-1+deb8u3_copyright

"Copyright: 1991-2012 Linus Torvalds and many others"

   if it's good enough for the Debian kernel package, it should be
   good enough for openstack packages too IMHO.
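
ie a minimal machine-readable debian/copyright written that way would
just be something like (the dates & names are purely illustrative):

    Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
    Upstream-Name: nova

    Files: *
    Copyright: 2010-2016 OpenStack Foundation and many others
    License: Apache-2.0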


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][neutron][os-vif] os-vif core review team membership

2016-01-13 Thread Daniel P. Berrange
On Tue, Jan 12, 2016 at 10:28:49PM +, Mooney, Sean K wrote:
> > -Original Message-
> > From: Moshe Levi [mailto:mosh...@mellanox.com]
> > Sent: Tuesday, January 12, 2016 4:23 PM
> > To: Russell Bryant; Daniel P. Berrange; openstack-
> > d...@lists.openstack.org
> > Cc: Jay Pipes; Mooney, Sean K; Sahid Orentino Ferdjaoui; Maxime Leroy
> > Subject: RE: [nova][neutron][os-vif] os-vif core review team membership
> > 
> > 
> > 
> > > -Original Message-
> > > From: Russell Bryant [mailto:rbry...@redhat.com]
> > > Sent: Tuesday, January 12, 2016 5:24 PM
> > > To: Daniel P. Berrange <berra...@redhat.com>; openstack-
> > > d...@lists.openstack.org
> > > Cc: Jay Pipes <jaypi...@gmail.com>; Sean Mooney
> > > <sean.k.moo...@intel.com>; Moshe Levi <mosh...@mellanox.com>; Sahid
> > > Orentino Ferdjaoui <sahid.ferdja...@redhat.com>; Maxime Leroy
> > > <maxime.le...@6wind.com>
> > > Subject: Re: [nova][neutron][os-vif] os-vif core review team
> > > membership
> > >
> > > On 01/12/2016 10:15 AM, Daniel P. Berrange wrote:
> > > > So far myself & Jay Pipes have been working on the initial os-vif
> > > > prototype and setting up infrastructure for the project. Obviously
> > > > we need more then just 2 people on a core team, and after looking at
> > > > those who've expressed interest in os-vif, we came up with a
> > > > cross-section of contributors across the Nova, Neutron and NFV
> > > > spaces to be the initial core team:
> > > >
> > > >   Jay Pipes
> > > >   Daniel Berrange
> > > >   Sean Mooney
> > > >   Moshe Levi
> > > >   Russell Bryant
> > > >   Sahid Ferdjaoui
> > > >   Maxime Leroy
> > > >
> > > > So unless anyone wishes to decline the offer, once infra actually
> > > > add me to the os-vif-core team I'll be making these people os-vif
> > > > core, so we can move forward with the work on the library...
> > >
> > > Thanks, I'm happy to help.
> > Same here.
> I would be happy to help work on moving os-vif forward in whatever way I can.
> Thank you for the invitation. I will not be able to travel to the nova 
> midcycle to discuss 
> os-vif however one of the engineers I work with (sfinucan) will be there.

That's no problem - mid-cycles are strictly optional.

> Will the status of the oslo.privsep work be discussed?

I don't believe there's any agenda set for the nova mid-cycle right now

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova][neutron][os-vif] os-vif core review team membership

2016-01-12 Thread Daniel P. Berrange
So far myself & Jay Pipes have been working on the initial os-vif
prototype and setting up infrastructure for the project. Obviously
we need more than just 2 people on a core team, and after looking
at those who've expressed interest in os-vif, we came up with a
cross-section of contributors across the Nova, Neutron and NFV
spaces to be the initial core team:

  Jay Pipes
  Daniel Berrange
  Sean Mooney
  Moshe Levi
  Russell Bryant
  Sahid Ferdjaoui
  Maxime Leroy

So unless anyone wishes to decline the offer, once infra actually add
me to the os-vif-core team I'll be making these people os-vif core, so
we can move forward with the work on the library...

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Glance] Nova + Glance_v2 = Love

2016-01-08 Thread Daniel P. Berrange
On Fri, Jan 08, 2016 at 10:01:45AM -0430, Flavio Percoco wrote:
> On 29/12/15 07:41 -0600, Sam Matzek wrote:
> >On Thu, Dec 24, 2015 at 7:49 AM, Mikhail Fedosin  
> >wrote:
> >>Hello, it's another topic about glance v2 adoption in Nova, but it's
> >>different from the others. I want to declare that there is a set of commits,
> >>that make Nova version agnostic and allow to work with both glance apis. The
> >>idea of the solution is to determine the current api version in the
> >>beginning and make appropriate requests after.
> >>(https://review.openstack.org/#/c/228578/,
> >>https://review.openstack.org/#/c/238309/,
> >>https://review.openstack.org/#/c/259097/)
> >>
> >>Indeed, it requires some additional (and painful) work, but now all tempest
> >>tests pass in Jenkins.
> >>
> >>Note: this thread is not about xenplugin, there is another topic, called
> >>'Xenplugin + Glance_v2 = Hate'
> >>
> >>Here the main issues we faced and how we've solved them:
> >>
> >>1. "changes-since" filter for image-list is not supported in v2 api. Steve
> >>Lewis made a great job and implemented a set of filters with comparison
> >>operators https://review.openstack.org/#/c/197388/. Filtering by
> >>'changes-since' is completely similar to 'gte:updated_at'.
> >>
> >>2. Filtering by custom properties must have prefix 'property-'. In v2 it's
> >>not required.
> >>
> >>3. V2 states that all custom properties must be image attributes, but Nova
> >>assumes that they are included in 'properties' dictionary. It's fixed with
> >>glanceclient's method 'is_base_property(prop_name)', that returns False in
> >>case of custom property.
> >>
> >>4. is_public=True/False is visibility="public"/"private" respectively.
> >>
> >>5. Deleting custom image properties in Nova is performed with 'purge_props'
> >>flag. If it's set to True, then all prop names, that are not included in
> >>updated data, will be removed. In case of v2 we have to explicitly specify
> >>prop names in the list param 'remove_props'. To implement this behaviour, if
> >>'purge_props' is set, we make additional 'get' request to determine the
> >>existing properties and put in 'remove_prop' list only those, that are not
> >>in updated_data.
> >>
> >>6. My favourite:
> >>There is an ability to activate an empty image by just providing 'size = 0':
> >>https://review.openstack.org/#/c/9715/, in this case image will be a
> >>collection of metadata. Glance v2 doesn't support this "feature" and that's
> >>why we have to implement a very dirty workaround:
> >>* v2 requires, that disk_format and container-format must be set before
> >>the activation. if these params are not provided to 'create' method then we
> >>hardcode it to 'qcow2' and 'bare'.
> >>* we call 'upload' method with empty data (data = '') to activate image.
> >>Me (and the rest glance team) think that this image activation with
> >>zero-size is inconsistent and we won't implement it in v2. So, I'm going to
> >>ask if Nova really needs this feature and maybe it's possible to make it
> >>work without it.
> >
> >Nova uses this functionality when it creates snapshots of volume
> >backed instances, that is, instances that only have Cinder volumes
> >attached and do not have an ephemeral disk.
> >In this case Nova API creates Cinder snapshots for the Cinder volumes
> >and builds the block_device_mapping property with block devices that
> >reference the Cinder snapshots.  Nova activates this image with size=0
> >because this image does not have a disk and simply refers to the
> >collection of Cinder snapshots that collectively comprise the binary
> >image.  Callers of Glance outside of Nova may also use the APIs to
> >create "block device mapping" images as well that contain references
> >to Cinder volumes to attach, blank volumes to create, snapshots to
> >create volumes from, etc during the server creation.  Not having the
> >ability to create these images with V2 is a loss of function.
> 
> I disagree. Being able to activate empty images breaks the consistency
> of Glances API and I don't think it should be considered a feature but
> a bug. An active image is an image that can be used to *boot* a VM. If
> the image is empty, you simply can't do that.

NB if an empty image is associated with a kernel/initrd image then it
could conceptually still be bootable, as the VM could run entirely from
content contained in the initrd, or the initrd might have activated
some network accessed root device. Whether this works in practice
with glance empty images though I don't know. Just from a conceptual
POV, images don't need to contain any content at all.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|


Re: [openstack-dev] [nova][libvirt] Deprecating the live_migration_flag and block_migration_flag config options

2016-01-08 Thread Daniel P. Berrange
On Thu, Jan 07, 2016 at 09:07:00PM +, Mark McLoughlin wrote:
> On Thu, 2016-01-07 at 12:23 +0100, Sahid Orentino Ferdjaoui wrote:
> > On Mon, Jan 04, 2016 at 09:12:06PM +, Mark McLoughlin wrote:
> > > Hi
> > > 
> > > commit 8ecf93e[1] got me thinking - the live_migration_flag config
> > > option unnecessarily allows operators choose arbitrary behavior of the
> > > migrateToURI() libvirt call, to the extent that we allow the operator
> > > to configure a behavior that can result in data loss[1].
> > > 
> > > I see that danpb recently said something similar:
> > > 
> > >   https://review.openstack.org/171098
> > > 
> > >   "Honestly, I wish we'd just kill off  'live_migration_flag' and
> > >   'block_migration_flag' as config options. We really should not be
> > >   exposing low level libvirt API flags as admin tunable settings.
> > > 
> > >   Nova should really be in charge of picking the correct set of flags
> > >   for the current libvirt version, and the operation it needs to
> > >   perform. We might need to add other more sensible config options in
> > >   their place [..]"
> > 
> > Nova should really handle internal flags and this serie is running in
> > the right way.
> > 
> > > ...
> > 
> > >   4) Add a new config option for tunneled versus native:
> > > 
> > >    [libvirt]
> > >    live_migration_tunneled = true
> > > 
> > >  This enables the use of the VIR_MIGRATE_TUNNELLED flag. We have 
> > >      historically defaulted to tunneled mode because it requires the 
> > >      least configuration and is currently the only way to have a 
> > >      secure migration channel.
> > > 
> > >      danpb's quote above continues with:
> > > 
> > >        "perhaps a "live_migration_secure_channel" to indicate that 
> > >         migration must use encryption, which would imply use of 
> > >         TUNNELLED flag"
> > > 
> > >      So we need to discuss whether the config option should express the
> > >      choice of tunneled vs native, or whether it should express another
> > >      choice which implies tunneled vs native.
> > > 
> > >        https://review.openstack.org/263434
> > 
> > We probably have to consider that operator does not know much about
> > internal libvirt flags, so options we are exposing for him should
> > reflect benefice of using them. I commented on your review we should
> > at least explain benefice of using this option whatever the name is.
> 
> As predicted, plenty of discussion on this point in the review :)
> 
> You're right that we don't give the operator any guidance in the help
> message about how to choose true or false for this:
> 
>   Whether to use tunneled migration, where migration data is 
>   transported over the libvirtd connection. If True,
>   we use the VIR_MIGRATE_TUNNELLED migration flag
> 
> libvirt's own docs on this are here:
> 
>   https://libvirt.org/migration.html#transport
> 
> which emphasizes:
> 
>   - the data copies involved in tunneling
>   - the extra configuration steps required for native
>   - the encryption support you get when tunneling
> 
> The discussions I've seen on this topic wrt Nova have revolved around:
> 
>   - that tunneling allows for an encrypted transport[1]
>   - that qemu's NBD based drive-mirror block migration isn't supported
>     using tunneled mode, and that danpb is working on fixing this
>     limitation in libvirt
>   - "selective" block migration[2] won't work with the fallback qemu
>     block migration support, and so won't currently work in tunneled
>     mode

I'm not working on fixing it, but IIRC some other dev had proposed
patches.

> 
> So, the advise to operators would be:
> 
>   - You may want to choose tunneled=False for improved block migration 
>     capabilities, but this limitation will go away in future.
>   - You may want to choose tunneled=False if you wish to trade and
>     encrypted transport for a (potentially negligible) performance
>     improvement.
> 
> Does that make sense?
> 
> As for how to name the option, and as I said in the review, I think it
> makes sense to be straightforward here and make it clearly about
> choosing to disable libvirt's tunneled transport.
> 
> If we name it any other way, I think our explanation for operators will
> immediately jump to explaining (a) that it influences the TUNNELLED
> flag, and (b) the differences between the tunneled and native
> transports. So, if we're going to have to talk about tunneled versus
> native, why obscure that detail?

Ultimately we need to recognise that libvirt's tunnelled mode was
added as a hack, to work around the fact that QEMU lacked any kind of
native security capabilities & didn't appear likely to ever get
them at that time.  As well as not working with modern NBD based
block device migration, it really sucks for performance because
it introduces many extra data copies. So it is going to be quite
poor for large VMs with a heavy rate of data dirtying.

The only long term relative "benefit" of tunnelled mode is that

Re: [openstack-dev] [nova] Testing concerns around boot from UEFI spec

2015-12-04 Thread Daniel P. Berrange
On Fri, Dec 04, 2015 at 07:43:41AM -0500, Sean Dague wrote:
> Can someone explain the licensing issue here? The Fedora comments make
> this sound like this is something that's not likely to end up in distros.

The EDK codebase contains a FAT driver which has a license that forbids
reusing the code outside of the EDK project.

[quote]
Additional terms: In addition to the forgoing, redistribution and use
of the code is conditioned upon the FAT 32 File System Driver and all
derivative works thereof being used for and designed only to read
and/or write to a file system that is directly managed by Intel's
Extensible Firmware Initiative (EFI) Specification v. 1.0 and later
and/or the Unified Extensible Firmware Interface (UEFI) Forum's UEFI
Specifications v.2.0 and later (together the "UEFI Specifications");
only as necessary to emulate an implementation of the UEFI Specifications;
and to create firmware, applications, utilities and/or drivers.
[/quote]

So while the code is open source, it is under a non-free license,
hence Fedora will not ship it. For RHEL we're reluctantly choosing
to ship it as an exception to our normal policy, since it's the only
immediate way to make UEFI support available on x86 & aarch64

So I don't think the license is a reason to refuse to allow the UEFI
feature into Nova though, nor should it prevent us using the current
EDK bios in CI for testing purposes. It is really just an issue for
distros which only want 100% free software.

Unless the license on the existing code gets resolved, some Red Hat
maintainers have a plan to replace the existing FAT driver with an
alternative impl likely under GPL. At that time, it'll be acceptable
for inclusion in Fedora.

> That seems weird enough that I'd rather push back on our Platinum Board
> member to fix the licensing before we let this in. Especially as this
> feature is being drive by Intel.

As the copyright holder, Intel could choose to change the license of their
code to make it free software, avoiding all these problems. None the less,
as above, I don't think this is a blocker for inclusion of the feature
in Nova, nor our testing of it.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] [RFC] how to enable xbzrle compress for live migration

2015-11-27 Thread Daniel P. Berrange
On Fri, Nov 27, 2015 at 07:37:50PM +0800, 少合冯 wrote:
> 2015-11-27 2:19 GMT+08:00 Daniel P. Berrange <berra...@redhat.com>:
> 
> > On Thu, Nov 26, 2015 at 05:39:04PM +, Daniel P. Berrange wrote:
> > > On Thu, Nov 26, 2015 at 11:55:31PM +0800, 少合冯 wrote:
> > > > 3.  dynamically choose when to activate xbzrle compress for live
> > migration.
> > > >  This is the best.
> > > >  xbzrle really wants to be used if the network is not able to keep
> > up
> > > > with the dirtying rate of the guest RAM.
> > > >  But how do I check the coming migration fit this situation?
> > >
> > > FWIW, if we decide we want compression support in Nova, I think that
> > > having the Nova libvirt driver dynamically decide when to use it is
> > > the only viable approach. Unfortunately the way the QEMU support
> > > is implemented makes it very hard to use, as QEMU forces you to decide
> > > to use it upfront, at a time when you don't have any useful information
> > > on which to make the decision :-(  To be useful IMHO, we really need
> > > the ability to turn on compression on the fly for an existing active
> > > migration process. ie, we'd start migration off and let it run and
> > > only enable compression if we encounter problems with completion.
> > > Sadly we can't do this with QEMU as it stands today :-(
> > >
> >
> [Shaohe Feng]
> Add more guys working on kernel/hypervisor in our loop.
> Wonder whether there will be any good solutions to improve it in QEMU in
> future.
> 
> 
> > > Oh and of course we still need to address the issue of RAM usage and
> > > communicating that need with the scheduler in order to avoid OOM
> > > scenarios due to large compression cache.
> > >
> > > I tend to feel that the QEMU compression code is currently broken by
> > > design and needs rework in QEMU before it can be pratically used in
> > > an autonomous fashion :-(
> >
> > Actually thinking about it, there's not really any significant
> > difference between Option 1 and Option 3. In both cases we want
> > a nova.conf setting live_migration_compression=on|off to control
> > whether we want to *permit* use  of compression.
> >
> > The only real difference between 1 & 3 is whether migration has
> > compression enabled always, or whether we turn it on part way
> > though migration.
> >
> > So although option 3 is our desired approach (which we can't
> > actually implement due to QEMU limitations), option 1 could
> > be made fairly similar if we start off with a very small
> > compression cache size which would have the effect of more or
> > less disabling compression initially.
> >
> > We already have logic in the code for dynamically increasing
> > the max downtime value, which we could mirror here
> >
> > eg something like
> >
> >  live_migration_compression=on|off
> >
> >   - Whether to enable use of compression
> >
> >  live_migration_compression_cache_ratio=0.8
> >
> >   - The maximum size of the compression cache relative to
> > the guest RAM size. Must be less than 1.0
> >
> >  live_migration_compression_cache_steps=10
> >
> >   - The number of steps to take to get from initial cache
> > size to the maximum cache size
> >
> >  live_migration_compression_cache_delay=75
> >
> >   - The time delay in seconds between increases in cache
> > size
> >
> >
> > In the same way that we do with migration downtime, instead of
> > increasing cache size linearly, we'd increase it in ever larger
> > steps until we hit the maximum. So we'd start off fairly small
> > a few MB, and monitoring the cache hit rates, we'd increase it
> > periodically.  If the number of steps configured and time delay
> > between steps are reasonably large, that would have the effect
> > that most migrations would have a fairly small cache and would
> > complete without needing much compression overhead.
> >
> > Doing this though, we still need a solution to the host OOM scenario
> > problem. We can't simply check free RAM at start of migration and
> > see if there's enough to spare for compression cache, as the schedular
> > can spawn a new guest on the compute host at any time, pushing us into
> > OOM. We really need some way to indicate that there is a (potentially
> > very large) extra RAM overhead for the guest during migration.
> >
> > ie if live_migration_compression_cache_ratio is 0.8 and w

Re: [openstack-dev] [nova] [RFC] how to enable xbzrle compress for live migration

2015-11-27 Thread Daniel P. Berrange
On Fri, Nov 27, 2015 at 12:17:06PM +, Koniszewski, Pawel wrote:
> > -Original Message-
> > > > Doing this though, we still need a solution to the host OOM scenario
> > > > problem. We can't simply check free RAM at start of migration and
> > > > see if there's enough to spare for compression cache, as the
> > > > schedular can spawn a new guest on the compute host at any time,
> > > > pushing us into OOM. We really need some way to indicate that there
> > > > is a (potentially very large) extra RAM overhead for the guest during
> > migration.
> 
> What about CPU? We might end up with live migration that degrades
> performance of other VMs on source and/or destination node. AFAIK
> CPUs are heavily oversubscribed in many cases and this does not help.
> I'm not sure that this thing fits into Nova as it requires resource
> monitoring.

Nova already has the ability to set CPU usage tuning rules against
each VM. Since the CPU overhead is attributed to the QEMU process,
these existing tuning rules will apply. So there would only be an
impact on other VMs if you do not have any CPU tuning rules set
in Nova.
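
Eg a rough sketch with python-novaclient (the credentials, endpoint and
flavor name below are placeholders; quota:cpu_shares maps to the libvirt
<cputune> shares value, ie a CPU weight for the guest's whole QEMU
process relative to other guests on the host):

    from novaclient import client as nova_client

    # placeholder credentials / endpoint
    nova = nova_client.Client('2', 'admin', 'secret', 'admin',
                              'http://keystone.example.com:5000/v2.0')

    # weight this flavor's guests at half the default CPU shares;
    # the setting covers the QEMU process as a whole, so it bounds
    # the emulator/migration threads as well as the vCPU threads
    flavor = nova.flavors.find(name='m1.medium')
    flavor.set_keys({'quota:cpu_shares': '512'})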


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] [RFC] how to enable xbzrle compress for live migration

2015-11-27 Thread Daniel P. Berrange
On Fri, Nov 27, 2015 at 01:01:15PM +, Koniszewski, Pawel wrote:
> > -Original Message-
> > From: Daniel P. Berrange [mailto:berra...@redhat.com]
> > Sent: Friday, November 27, 2015 1:24 PM
> > To: Koniszewski, Pawel
> > Cc: OpenStack Development Mailing List (not for usage questions); ???; Feng,
> > Shaohe; Xiao, Guangrong; Ding, Jian-feng; Dong, Eddie; Wang, Yong Y; Jin,
> > Yuntong
> > Subject: Re: [openstack-dev] [nova] [RFC] how to enable xbzrle compress for
> > live migration
> >
> > On Fri, Nov 27, 2015 at 12:17:06PM +, Koniszewski, Pawel wrote:
> > > > -Original Message-
> > > > > > Doing this though, we still need a solution to the host OOM
> > > > > > scenario problem. We can't simply check free RAM at start of
> > > > > > migration and see if there's enough to spare for compression
> > > > > > cache, as the schedular can spawn a new guest on the compute
> > > > > > host at any time, pushing us into OOM. We really need some way
> > > > > > to indicate that there is a (potentially very large) extra RAM
> > > > > > overhead for the guest during
> > > > migration.
> > >
> > > What about CPU? We might end up with live migration that degrades
> > > performance of other VMs on source and/or destination node. AFAIK CPUs
> > > are heavily oversubscribed in many cases and this does not help.
> > > I'm not sure that this thing fits into Nova as it requires resource
> > > monitoring.
> >
> > Nova already has the ability to set CPU usage tuning rules against each VM.
> > Since the CPU overhead is attributed to the QEMU process, these existing
> > tuning rules will apply. So there would only be impact on other VMs, if you
> > do
> > not have any CPU tuning rules set in Nova.
> 
> Not sure I understand it correctly, I assume that you are talking about CPU
> pinning. Does it mean that compression/decompression runs as part of VM
> threads?
> 
> If not then, well, it will require all VMs to be pinned on both hosts, source
> and destination (and in the whole cluster because of static configuration...).
> Also what about operating system performance? Will QEMU distinct OS processes
> somehow and won't affect them?

The compression runs in the migration thread of QEMU. This is not a vCPU
thread, but one of the QEMU emulator threads. So CPU usage policy set
against the QEMU emulator threads applies to the compression CPU overhead.

> Also, nova can reserve some memory for the host. Will QEMU also respect it?

No, it's not QEMU's job to respect that. If you want to reserve resources
for only the host OS, then you need to set up suitable cgroup partitions
to separate VM from non-VM processes. The Nova reserved memory setting
is merely a hint to the scheduler - it has no functional effect on its
own.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] [RFC] how to enable xbzrle compress for live migration

2015-11-26 Thread Daniel P. Berrange
On Thu, Nov 26, 2015 at 11:55:31PM +0800, 少合冯 wrote:
> Hi all,
> We want to support xbzrle compress for live migration.
> 
> Now there are 3 options,
> 1. add the enable flag in nova.conf.
> such as a dedicated 'live_migration_compression=on|off" parameter in
> nova.conf.
> And nova simply enable it.
> seems not good.

Just having a live_migration_compression=on|off parameter that
unconditionally turns it on for all VMs is not really a solution
on its own, as it leaves out the problem of compression cache
memory size, which is at the root of the design problem.

Without a sensible choice of cache size, the compression is
either useless (too small and it won't get a useful number of
cache hits and so won't save any data transfer bandwidth) or it
is hugely wasteful of resources (too large and you're just sucking
host RAM for no benefit). The QEMU migration code maintainers'
guideline is that the cache size should be approximately
equal to the guest RAM working set. IOW for a 4 GB guest
you potentially need a 4 GB cache for migration, so we're
doubling the memory usage of a guest without the scheduler
being any the wiser, which will inevitably cause the host
to hit out-of-memory at some point.


> 2.  add a parameters in live migration API.
> 
> A new array compress will be added as optional, the json-schema as below::
> 
>   {
> 'type': 'object',
> 'properties': {
>   'os-migrateLive': {
> 'type': 'object',
> 'properties': {
>   'block_migration': parameter_types.boolean,
>   'disk_over_commit': parameter_types.boolean,
>   'compress': {
> 'type': 'array',
> 'items': ["xbzrle"],
>   },
>   'host': host
> },
> 'additionalProperties': False,
>   },
> },
> 'required': ['os-migrateLive'],
> 'additionalProperties': False,
>   }

I really don't think we want to expose this kind of hypervisor
specific detail in the live migration API of Nova. It just leaks
too many low level details. It still leaves the problem of deciding
the compression cache size unsolved, and likewise the problem of the
scheduler knowing about the memory usage for this cache in order to
avoid OOM.

> 3.  dynamically choose when to activate xbzrle compress for live migration.
>  This is the best.
>  xbzrle really wants to be used if the network is not able to keep up
> with the dirtying rate of the guest RAM.
>  But how do I check the coming migration fit this situation?

FWIW, if we decide we want compression support in Nova, I think that
having the Nova libvirt driver dynamically decide when to use it is
the only viable approach. Unfortunately the way the QEMU support
is implemented makes it very hard to use, as QEMU forces you to decide
to use it upfront, at a time when you don't have any useful information
on which to make the decision :-(  To be useful IMHO, we really need
the ability to turn on compression on the fly for an existing active
migration process. ie, we'd start migration off and let it run and
only enable compression if we encounter problems with completion.
Sadly we can't do this with QEMU as it stands today :-(

Oh and of course we still need to address the issue of RAM usage and
communicating that need with the scheduler in order to avoid OOM
scenarios due to large compression cache.

I tend to feel that the QEMU compression code is currently broken by
design and needs rework in QEMU before it can be practically used in
an autonomous fashion :-(

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] [RFC] how to enable xbzrle compress for live migration

2015-11-26 Thread Daniel P. Berrange
On Thu, Nov 26, 2015 at 05:49:50PM +, Paul Carlton wrote:
> Seems to me the prevailing view is that we should get live migration to
> figure out the best setting for
> itself where possible.  There was discussion of being able have a default
> policy setting that will allow
> the operator to define balance between speed of migration and impact on the
> instance.  This could be
> a global default for the cloud with overriding defaults per aggregate,
> image, tenant and instance as
> well as the ability to vary the setting during the migration operation.
> 
> Seems to me that items like compression should be set in configuration files
> based on what works best
> given the cloud operator's environment?

Merely turning on use of compression is the "easy" bit - there needs to be
a way to deal with compression cache size allocation, which needs to have
some smarts in Nova, as there's no usable "one size fits all" value for
the compression cache size. If we did want to hardcode a compression cache
size, you'd have to set it as a scaling factor against the guest RAM
size. This is going to be very heavy on memory usage, so there needs to be
careful design work to solve the problem of migration compression triggering host
OOM scenarios, particularly since we can have multiple concurrent
migrations.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] [RFC] how to enable xbzrle compress for live migration

2015-11-26 Thread Daniel P. Berrange
On Thu, Nov 26, 2015 at 05:39:04PM +, Daniel P. Berrange wrote:
> On Thu, Nov 26, 2015 at 11:55:31PM +0800, 少合冯 wrote:
> > 3.  dynamically choose when to activate xbzrle compress for live migration.
> >  This is the best.
> >  xbzrle really wants to be used if the network is not able to keep up
> > with the dirtying rate of the guest RAM.
> >  But how do I check the coming migration fit this situation?
> 
> FWIW, if we decide we want compression support in Nova, I think that
> having the Nova libvirt driver dynamically decide when to use it is
> the only viable approach. Unfortunately the way the QEMU support
> is implemented makes it very hard to use, as QEMU forces you to decide
> to use it upfront, at a time when you don't have any useful information
> on which to make the decision :-(  To be useful IMHO, we really need
> the ability to turn on compression on the fly for an existing active
> migration process. ie, we'd start migration off and let it run and
> only enable compression if we encounter problems with completion.
> Sadly we can't do this with QEMU as it stands today :-(
> 
> Oh and of course we still need to address the issue of RAM usage and
> communicating that need with the scheduler in order to avoid OOM
> scenarios due to large compression cache.
> 
> I tend to feel that the QEMU compression code is currently broken by
> design and needs rework in QEMU before it can be pratically used in
> an autonomous fashion :-(

Actually thinking about it, there's not really any significant
difference between Option 1 and Option 3. In both cases we want
a nova.conf setting live_migration_compression=on|off to control
whether we want to *permit* use  of compression.

The only real difference between 1 & 3 is whether migration has
compression enabled always, or whether we turn it on part way
through migration.

So although option 3 is our desired approach (which we can't
actually implement due to QEMU limitations), option 1 could
be made fairly similar if we start off with a very small
compression cache size which would have the effect of more or
less disabling compression initially.

We already have logic in the code for dynamically increasing
the max downtime value, which we could mirror here

eg something like

 live_migration_compression=on|off

  - Whether to enable use of compression

 live_migration_compression_cache_ratio=0.8

  - The maximum size of the compression cache relative to
the guest RAM size. Must be less than 1.0

 live_migration_compression_cache_steps=10

  - The number of steps to take to get from initial cache
size to the maximum cache size

 live_migration_compression_cache_delay=75

  - The time delay in seconds between increases in cache
size


In the same way that we do with migration downtime, instead of
increasing cache size linearly, we'd increase it in ever larger
steps until we hit the maximum. So we'd start off fairly small
a few MB, and monitoring the cache hit rates, we'd increase it
periodically.  If the number of steps configured and time delay
between steps are reasonably large, that would have the effect
that most migrations would have a fairly small cache and would
complete without needing much compression overhead.
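
Eg, very roughly (an illustrative sketch only - the geometric growth
curve and the numbers are made up, not a worked out design):

    # sketch: a schedule of xbzrle cache sizes growing in ever larger
    # steps, from a small initial value up to cache_ratio * guest RAM
    def cache_size_steps_mb(guest_ram_mb, cache_ratio=0.8, steps=10,
                            initial_mb=64):
        max_cache_mb = guest_ram_mb * cache_ratio
        growth = (max_cache_mb / float(initial_mb)) ** (1.0 / steps)
        return [min(max_cache_mb, initial_mb * (growth ** i))
                for i in range(steps + 1)]

    # a 4096 MB guest with the defaults tops out at ~3277 MB of cache,
    # ie up to ~7.2 GB of host RAM for that guest while migrating
    print(cache_size_steps_mb(4096))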

Doing this though, we still need a solution to the host OOM scenario
problem. We can't simply check free RAM at start of migration and
see if there's enough to spare for the compression cache, as the scheduler
can spawn a new guest on the compute host at any time, pushing us into
OOM. We really need some way to indicate that there is a (potentially
very large) extra RAM overhead for the guest during migration.

ie if live_migration_compression_cache_ratio is 0.8 and we have a
4 GB guest, we need to make sure the scheduler knows that we are
potentially going to be using 7.2 GB of memory during migration

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [cinder][nova]Move encryptors to os-brick

2015-11-23 Thread Daniel P. Berrange
On Fri, Nov 20, 2015 at 02:44:17PM -0500, Ben Swartzlander wrote:
> On 11/20/2015 01:19 PM, Daniel P. Berrange wrote:
> >On Fri, Nov 20, 2015 at 02:45:15PM +0200, Duncan Thomas wrote:
> >>Brick does not have to take over the decisions in order to be a useful
> >>repository for the code. The motivation for this work is to avoid having
> >>the dm setup code copied wholesale into cinder, where it becomes difficult
> >>to keep in sync with the code in nova.
> >>
> >>Cinder needs a copy of this code since it is on the data path for certain
> >>operations (create from image, copy to image, backup/restore, migrate).
> >
> >A core goal of using volume encryption in Nova to provide protection for
> >tenant data, from a malicious storage service. ie if the decryption key
> >is only ever used by Nova on the compute node, then cinder only ever sees
> >ciphertext, never plaintext.  Thus if cinder is compromised, then it can
> >not compromise any data stored in any encrypted volumes.
> 
> There is a difference between the cinder service and the storage controller
> (or software system) that cinder manages. You can give the decryption keys
> to the cinder service without allowing the storage controller to see any
> plaintext.
> 
> As Walt says in the relevant patch [1], expecting cinder to do data
> management without ever performing I/O is unrealistic. The scenario where
> the compute admin doesn't trust the storage admin is understandable
> (although less important than other potential types of attacks IMO) but the
> scenario where the guy managing nova doesn't trust the guy managing cinder
> makes no sense at all.

So you are implicitly saying here that the cinder admin is different from
the storage admin. While that certainly may often be true, I struggle to
categorically say it is always going to be true.

Furthermore it is not only about trust of the cinder administrator,
but rather trust in the integrity of the cinder service. OpenStack has
a great many components that are open to attack, and it is prudent to
design the system such that successful security attacks are contained
as far as possible. From this POV I think it is entirely
reasonable & indeed sensible for Nova to have minimal trust of Cinder
as a whole when it comes to tenant data security.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [cinder][nova]Move encryptors to os-brick

2015-11-23 Thread Daniel P. Berrange
On Fri, Nov 20, 2015 at 11:34:29AM -0800, Walter A. Boring IV wrote:
> On 11/20/2015 10:19 AM, Daniel P. Berrange wrote:
> >On Fri, Nov 20, 2015 at 02:45:15PM +0200, Duncan Thomas wrote:
> >>Brick does not have to take over the decisions in order to be a useful
> >>repository for the code. The motivation for this work is to avoid having
> >>the dm setup code copied wholesale into cinder, where it becomes difficult
> >>to keep in sync with the code in nova.
> >>
> >>Cinder needs a copy of this code since it is on the data path for certain
> >>operations (create from image, copy to image, backup/restore, migrate).
> >A core goal of using volume encryption in Nova to provide protection for
> >tenant data, from a malicious storage service. ie if the decryption key
> >is only ever used by Nova on the compute node, then cinder only ever sees
> >ciphertext, never plaintext.  Thus if cinder is compromised, then it can
> >not compromise any data stored in any encrypted volumes.
> >
> >If cinder is looking to get access to the dm-seutp code, this seems to
> >imply that cinder will be getting access to the plaintext data, which
> >feels to me like it de-values the volume encryption feature somewhat.
> >
> >I'm fuzzy on the details of just what code paths cinder needs to be
> >able to convert from plaintext to ciphertext or vica-verca, but in
> >general I think it is desirable if we can avoid any such operation
> >in cinder, and keep it so that only Nova compute nodes ever see the
> >decrypted data.
> Being able to limit the number of points where an encrypted volume can be
> used unencrypted
> is obviously a good goal.
> Unfortunately, it's entirely unrealistic to expect Cinder to never be able
> to have that access.
>
> Cinder currently needs access to write data to volumes that are encrypted
> for several operations.
> 
> 1) copy volume to image

If a volume is encrypted and it is being copied to an image, IMHO we
should not aim to decrypt it. We should copy the data as is and mark
the image as encrypted in glance, and then use it as is next time the
image is needed.

FYI, Nova is already aiming to consider both the glance data storage
and the glance service as a whole, as untrustworthy. The first step
in this is using cryptographic signatures to detect unauthorized
image data modification by a compromised glance. Encryption will be
a later step in the process.

> 2) copy image to volume

This is semi-plausible as a place where Cinder needs to go from
unencrypted image data to encrypted volume data, when a user is
creating a volume from an image ahead of time, distinct from any
VM boot attempt. In such a case it is desirable that Cinder not
be able to request any existing volume keys from the key server,
merely have the ability to upload new keys and throw away its
local copy thereafter.

> 3) backup

Cinder should really not try to decrypt volumes when backing them
up. If it conversely wants to encrypt volumes during backup, it
can do so with separate backup keys, distinct from those used for
primary volume encryption for use at runtime.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Migration progress

2015-11-23 Thread Daniel P. Berrange
On Mon, Nov 23, 2015 at 08:36:32AM +, Paul Carlton wrote:
> John
> 
> At the live migration sub team meeting I undertook to look at the issue
> of progress reporting.
> 
> The use cases I'm envisaging are...
> 
> As a user I want to know how much longer my instance will be migrating
> for.
> 
> As an operator I want to identify any migration that are making slow
>  progress so I can expedite their progress or abort them.
> 
> The current implementation reports on the instance's migration with
> respect to memory transfer, using the total memory and memory remaining
> fields from libvirt to report the percentage of memory still to be
> transferred.  Due to the instance writing to pages already transferred
> this percentage can go up as well as down.  Daniel has done a good job
> of generating regular log records to report progress and highlight lack
> of progress but from the API all a user/operator can see is the current
> percentage complete.  By observing this periodically they can identify
> instance migrations that are struggling to migrate memory pages fast
> enough to keep pace with the instance's memory updates.
> 
> The problem is that at present we have only one field, the instance
> progress, to record progress.  With a live migration there are measures
> of progress, how much of the ephemeral disks (not needed for shared
> disk setups) have been copied and how much of the memory has been
> copied. Both can go up and down as the instance writes to pages already
> copied causing those pages to need to be copied again.  As Daniel says
> in his comments in the code, the disk size could dwarf the memory so
> reporting both in single percentage number is problematic.
> 
> We could add an additional progress item to the instance object, i.e.
> disk progress and memory progress but that seems odd to have an
> additional progress field only for this operation so this is probably
> a non starter!
> 
> For operations staff with access to log files we could report disk
> progress as well as memory in the log file, however that does not
> address the needs of users and whilst log files are the right place for
> support staff to look when investigating issues operational tooling
> is much better served by notification messages.
> 
> Thus I'd recommend generating periodic notifications during a migration
> to report both memory and disk progress would be useful?  Cloud
> operators are likely to manage their instance migration activity using
> some orchestration tooling which could consume these notifications and
> deduce what challenges the instance migration is encountering and thus
> determine how to address any issues.
> 
> The use cases are only partially addressed by the current
> implementation, they can repeatedly get the server details and look at
> the progress percentage to see how quickly (or even if) it is
> increasing and determine how long the instance is likely to be
> migrating for.  However for an instance that has a large disk and/or
> is doing a high rate of disk i/o they may see the percentage complete
> (i.e. memory) repeatedly showing 90%+ but the instance migration does
> not complete.
> 
> The nova spec https://review.openstack.org/#/c/248472/ suggests making
> detailed information available via the os-migrations object.  This is
> not a bad idea but I have some issues with the implementation that I
> will share on that spec.

As I mentioned in the spec, I won't support exposing anything other
than disk total + remaining via the API. All the other stats are
low level QEMU specific implementation details that I feel the public
API users have no business knowing about.

In general I think we need to be wary of exposing lots of info + knobs
via the API, as that direction essentially ends up forcing the problem
onto the client application. The focus should really be on ensuring that
Nova consumes all these stats exposed by QEMU and makes decisions
itself based on that.

At most an external application should have information on the data
transfer progress. I'm not even convinced that applications should
need to be able to figure out if a live migration is stuck. I generally
think that any scenario in which a live migration can get stuck is a
bug in Nova's management of the migration process. IOW, the focus of
our efforts should be on ensuring Nova does the right thing to guarantee
that live migration will never get stuck. At which point a Nova client
user / application should really only care about the overall progress
of a live migration.
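
That coarse progress is already visible by polling the server details,
eg (a rough sketch; 'nova' is an authenticated python-novaclient handle
and 'server_id' an instance UUID):

    import time

    def wait_for_live_migration(nova, server_id, poll_secs=5):
        # poll the coarse per-instance 'progress' field until the
        # instance leaves the MIGRATING state
        while True:
            server = nova.servers.get(server_id)
            if server.status != 'MIGRATING':
                return server.status
            print('migration progress: %s%%' % server.progress)
            time.sleep(poll_secs)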

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [cinder][glance]Upload encrypted volumes to images

2015-11-23 Thread Daniel P. Berrange
On Mon, Nov 23, 2015 at 03:45:55AM +, Li, Xiaoyan wrote:
> Hi all,
> More help about volume encryption is needed. 
> 
> About uploading encrypted volumes to image, there are three options:
> 1. Glance only keeps non-encrypted images. So when uploading encrypted
> volumes to image, cinder de-crypts the data and upload.

This may be desirable in some cases, but for people wanting to provide
end to end encryption of all tenant data, unencrypting volumes when
converting them to images to store is glance is really the last thing
we want to do. Once tenant data is encrypted, the goal should be to
never decrypt it again except when booting an instance with the volume
or image.

> 2. Glance maintain encrypted images. Cinder just upload the encrypted
> data to image.

That is highly desirable as an option, since it allows glance to remain a
relatively untrusted component. The image signature work will soon allow
Nova to consider glance as untrusted, by allowing Nova to verify that Glance
has not tampered with the data that was provided by the user, nor tried to
serve Nova data from a different user.  Following this lead, I think the ability
to prevent Glance seeing any plaintext data from the image is an obvious
beneficial step forwards.

> 3. Just prevent the function to upload encrypted volumes to images.

That's obviously fairly limiting.

> Option 1 No changes needed in Glance. But it may be not safe. As we
> decrypt the data, and upload it to images.

s/may be not safe/is not safe/.

> Option 2 This imports encryption to Glance which needs to manage the
> encryption metadata.

Glance doesn't need to do all that much besides recording a few
bits of metadata, so that doesn't seem unreasonable to do.
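
Eg something along these lines with the glance v2 API (a rough sketch;
'glance' is an authenticated python-glanceclient v2 handle and the
encryption property names are invented for illustration, not a
proposed schema):

    def upload_encrypted_volume(glance, key_id, ciphertext_stream):
        # record just a few bits of metadata; the volume data itself
        # stays as ciphertext end to end
        image = glance.images.create(name='encrypted-vol-snap',
                                     disk_format='raw',
                                     container_format='bare',
                                     encrypted='True',
                                     encryption_format='luks',
                                     encryption_key_id=key_id)
        glance.images.upload(image['id'], ciphertext_stream)
        return image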

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [cinder][glance]Upload encrypted volumes to images

2015-11-23 Thread Daniel P. Berrange
On Mon, Nov 23, 2015 at 07:05:05AM +0100, Philipp Marek wrote:
> > About uploading encrypted volumes to image, there are three options:
> > 1. Glance only keeps non-encrypted images. So when uploading encrypted 
> >volumes to image, cinder de-crypts the data and upload.
> > 2. Glance maintain encrypted images. Cinder just upload the encrypted 
> >data to image. 
> > 3. Just prevent the function to upload encrypted volumes to images.
> >
> > Option 1 No changes needed in Glance. But it may be not safe. As we decrypt 
> > the data, and upload it to images. 
> > Option 2 This imports encryption to Glance which needs to manage the 
> > encryption metadata.
> > 
> > Please add more if you have other suggestions. How do you think which one 
> > is preferred.
> Well, IMO only option 1 is useful.
> 
> Option 2 means that the original volume, the image, and all derived volumes 
> will share the same key, right?

That depends on how you implement it really. If you directly upload the
encrypted volume as-is, and then directly boot later VMs of the same
image, they'll obviously share the same key. It is possible though
for cinder to re-encrypt the volume with a different key before uploading
it, or more likely for Nova to re-encrypt the image with a different key
after downloading it to boot an instance.

> That's not good. (Originally: "unacceptable")

If the images and all volumes are all owned by a single tenant user it
is not a big deal if they have the same key. Depending on what threats
you are protecting against, it may be more desirable than having the
data stored unencrypted in glance.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [cinder][nova]Move encryptors to os-brick

2015-11-20 Thread Daniel P. Berrange
On Fri, Nov 20, 2015 at 03:22:04AM +, Li, Xiaoyan wrote:
> Hi all,
> 
> To fix bug [1][2] in Cinder, Cinder needs to use nova/volume/encryptors[3]
> to attach/detach encrypted volumes. 
> 
> To decrease the code duplication, I raised a BP[4] to move encryptors to
> os-brick[5].
> 
> Once it is done, Nova needs to update to use the common library. This
> is BP raised. [6]

You need to proposal a more detailed spec for this, not merely a blueprint
as there are going to be significant discussion points here.

In particular for the QEMU/KVM nova driver, this proposal is not really
moving in a direction that is aligned with our long term desire/plan for
volume encryption and/or storage management in Nova with KVM.  While we
currently use dm-crypt with volumes that are backed by block devices,
this is not something we wish to use long term. Increasingly the storage
used is network based, and while we use in-kernel network clients for
iSCSI/NFS, we use an in-QEMU client for RBD/Gluster storage. QEMU also
has support for in-QEMU clients for iSCSI/NFS and it is likely we'll use
them in Nova in future too.

Now encryption throws a (small) spanner in the works as the only way to
access encrypted data right now is via dm-crypt, which obviously doesn't
fly when there's no kernel block device to attach it to. Hence we are
working in enhancement to QEMU to let it natively handle LUKS format
volumes. At which point we'll stop using dm-crypt for for anything and
do it all in QEMU.

Nova currently decides whether it wants to use the in-kernel network
client, or an in-QEMU network client for the various network backed
storage drivers. If os-brick takes over encryption setup with dm-crypt,
then it would potentially be taking the decision away from Nova about
whether to use in-kernel or in-QEMU clients, which is not desirable.
Nova must retain control over which configuration approach is best
for the hypervisor it is using.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [cinder][nova]Move encryptors to os-brick

2015-11-20 Thread Daniel P. Berrange
On Fri, Nov 20, 2015 at 02:45:15PM +0200, Duncan Thomas wrote:
> Brick does not have to take over the decisions in order to be a useful
> repository for the code. The motivation for this work is to avoid having
> the dm setup code copied wholesale into cinder, where it becomes difficult
> to keep in sync with the code in nova.
> 
> Cinder needs a copy of this code since it is on the data path for certain
> operations (create from image, copy to image, backup/restore, migrate).

A core goal of using volume encryption in Nova to provide protection for
tenant data, from a malicious storage service. ie if the decryption key
is only ever used by Nova on the compute node, then cinder only ever sees
ciphertext, never plaintext.  Thus if cinder is compromised, then it can
not compromise any data stored in any encrypted volumes.

If cinder is looking to get access to the dm-seutp code, this seems to
imply that cinder will be getting access to the plaintext data, which
feels to me like it de-values the volume encryption feature somewhat.

I'm fuzzy on the details of just what code paths cinder needs to be
able to convert from plaintext to ciphertext or vica-verca, but in
general I think it is desirable if we can avoid any such operation
in cinder, and keep it so that only Nova compute nodes ever see the
decrypted data.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Infra] Remove .mailmap files from OpenStack repos

2015-11-19 Thread Daniel P. Berrange
On Thu, Nov 19, 2015 at 07:28:03PM +0300, Mikhail Fedosin wrote:
> Currently we have .mailmap files in the root of almost all OpenStack repos:
> https://github.com/openstack/glance/blob/master/.mailmap
> https://github.com/openstack/horizon/blob/master/.mailmap
> https://github.com/openstack/nova/blob/master/.mailmap
> https://github.com/openstack/cinder/blob/master/.mailmap
> etc.
> 
> But it seems that they're very outdated (last time many of them were
> updated about 2 years ago). So, do we still need these files or it's better
> to remove them?

I think that the mailmap files provide pretty useful feature when
looking at the git history to see who wrote what. They ensure you
are given the best current email address for the author, not whatever
(now bouncing) email address they had 4 years ago. So deleting them
is not a great idea IMHO.

There could be better ways to maintain them though. Given that it
is not unusual for people to work across multiple different openstack
projects, it seems silly to manually update .mailmap in each project
individually.

We could have a single globally maintained mailmap file which we
automatically sync out to each project's GIT repo. It probably
wouldn't be too hard to write a script that looked at the list
of GIT authors in history and identify a large portion of the
cases where someone has changed their email addr and so let us
semi-automatically update the central mailmap file.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][infra] Getting a bleeding edge libvirt gate job running

2015-11-18 Thread Daniel P. Berrange
On Wed, Nov 18, 2015 at 05:18:28PM +1100, Tony Breeds wrote:
> On Tue, Nov 17, 2015 at 03:32:45PM -0800, Jay Pipes wrote:
> > On 11/17/2015 11:10 AM, Markus Zoeller wrote:
> > >Background
> > >==
> > >The blueprint [1] wants to utilize the *virtlogd* logging deamon from
> > >libvirt. Among others to solve bug [2], one of our oldest ones. The
> > >funny part is, that this libvirt feature is still in development. This
> > >was a trigger to see if we can create a gate job which utilizes the
> > >latest, bleeding edge, version of libvirt to test such features. We
> > >discussed it shortly in IRC [3] (tonyb, bauzas, markus_z) and wanted to
> > >get some feedback here. The summary of the idea is:
> > >* Create a custom repo which contains the latest libvirt version
> > >* Enhance Devstack so that it can point to a custom repo to install
> > >   the built libvirt packages
> > >* Have a nodepool image which is compatible with the libvirt packages
> > >* In case of [1]: check if tempest needs further/changed tests
> > >
> > >Open questions
> > >==
> > >* Is already someone working on something like that and I missed it?
> > 
> > Sean (cc'd) might have some information on what he's doing in the OVS w/
> > DPDK build environment, which AFAIK requires a later build of libvirt than
> > available in most distros.
> > 
> > >* If 'no', is there already a repo which contains the very latest
> > >   libvirt builds which we can utilize?
> > >
> > >I haven't done anything with the gates before, which means there is a
> > >very high chance I'm missing something or missunderstanding a concept.
> > >Please let me know what you think.
> > 
> > A generic "build libvirt or OVS from this source repo" dsvm job would be
> > great I think. That would allow overrides in ENV variables to point the job
> > to a URI for grabbing sources of OVS (DPDK OVS, mainline OVS) or libvirt
> > that would be built into the target nodepool images.
> 
> I was really hoping to decouple the build from the dsvm jobs.  My initial
> thoughts were a add a devstack plugin that add $repo and then upgrade
> $packages.  I wanted to decouple the build from install as I assumed that the
> delays in building libvirt (etc) would be problematic *and* provide another
> failure mode for devstack that we really don't want to deal with.
> 
> I was only thinking of having libvirt and qemu in there but if the plug-in was
> abstract enough it could easily provide packages for other help utils (like 
> OVS
> and DPDK).
> 
> When I started looking at this Ubuntu was the likely candidate as Fedora in 
> the gate
> wasn't really a stable thing.  I see a little more fedora in nodepool so 
> perhaps a
> really quick win would be to just use the lib-virt preview on F22.

Trying to build from bleeding edge is just a can of worms as you'll need to
have someone baby-sitting the job to fix it up on new releases when the
list of build deps changes or build options alter. As an example, next
QEMU release will require you to pull in 3rd party libcacard library
for SPICE build, since it was split out, so there's already a build
change pending that would cause a regression in the gate.

So, my recommendation would really be to just use Fedora with virt-preview
for the bleeding edge and avoid trying to compile stuff in the gate. The
virt-preview repository tracks upstream releases of QEMU+Libvirt+libguestfs
with minimal delay and is built with the same configuration as future Fedora
releases will use. So such testing is good evidence that Nova won't break on
the next Fedora release.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [glance] [nova] Image Signature Verification

2015-11-17 Thread Daniel P. Berrange
On Tue, Nov 17, 2015 at 02:09:42PM -0300, Flavio Percoco wrote:
> On 13/11/15 09:35 +0000, Daniel P. Berrange wrote:
> >On Thu, Nov 12, 2015 at 08:30:53PM +, Poulos, Brianna L. wrote:
> >>Hello,
> >>
> >>There has recently been additional discussion about the best way to handle
> >>image signature verification in glance and nova [1].  There are two
> >>options being discussed for the signature (the examples below using
> >>'RSA-PSS' as the type, and SHA-256 as the hash method):
> >>
> >>1. The signature is of the glance checksum of the image data (currently a
> >>hash which is hardcoded to be MD5)
> >>signature = RSA-PSS(SHA-256(MD5(IMAGE-CONTENT)))
> >>
> >>2. The signature of the image data directly
> >>signature = RSA-PSS(SHA-256(IMAGE-CONTENT))
> >>
> >>The 1st option is what is currently in glance's liberty release [2].  This
> >>approach was chosen with the understanding that the glance checksum would
> >>be updated to be configurable [3].  Although the 2nd option was initially
> >>proposed, the glance community opposed it during the pre-Liberty virtual
> >>mini-summit in May 2015 (due to the performance impact of doing two hashes
> >>of the image data--one for the 'checksum' and the other for the
> >>signature), and it was decided to proceed with the 1st option during the
> >>Liberty summit [4].
> >>
> >>During the Mitaka Summit, making the glance checksum configurable was
> >>discussed during a design session [5].  It was decided that instead of
> >>making the 'checksum' image field configurable, it would be preferable to
> >>compute a separate, configurable (on a per-image basis, with a site-wide
> >>default) hash, and then use that hash when MD5 wasn't sufficient (such as
> >>in the case of signature verification). This second hash would be computed
> >>at the same time the MD5 'checksum' was computed.
> >>
> >>Which brings us to the nova spec which is under discussion [1], which is
> >>to add the ability to verify signatures in nova.  The nova community has
> >>made it clear that the promise of providing a configurable hash in glance
> >>is not good enough--they never want to support any signatures that use MD5
> >>in any way, shape, or form; nor do they want to rely on asking glance for
> >>what hash option was used.  To that end, the push is to use the 2nd option
> >>to verify signatures in nova from the start.
> >
> >As well as not wanting MD5, I believe that computing signatures based
> >on a configurable checksum in glance provides a bad user experiance.
> >The user generating the signature of their image, now has to have a
> >way to query glance to find out what checksum it used, in order to
> >generate their signature. Further if the glance admin ever wants to
> >change their checksum algorithm, they'd break all existing signatures
> >by doing so. These are just as important reasons why I want Nova
> >to use the 2nd option and compute signatures directly on the image
> >content.
> 
> This is a very good point. Thanks for bringing it up, Dan.
> 
> >
> >>Since the glance community no longer seems opposed to the idea of
> >>computing two hashes (the second hash being optional, of course), the 2nd
> >>option has now become valid from the glance perspective.  This would
> >>require modifying the existing implementation in glance to verify a
> >>signature of the image data, rather than verifying a checksum of the image
> >>data, but would have no additional performance hit beyond the cost to
> >>compute the second hash.  Note that the image data would still only be
> >>read once -- the checksum update (for the MD5 hash) and the signature
> >>verification update (for the signature hash) would occur in the same loop.
> >>Although this would mean that signatures generated using option 1 would no
> >>longer verify, since signatures generated using option 1 are based on an
> >>MD5 hash (and were waiting for the checksum configurability before
> >>becoming a viable cryptographic option anyway), this does not pose a
> >>significant issue.
> >
> >A second point about the current proposal from Nova's POV is that
> >we do not like the image property names currently used. In Liberty
> >Nova standardized on the property naming scheme it uses to have 3
> >naming prefixes
> >
> > https://github.com/openstack/nova/blob/master/nova/objects/image_meta.py#L166
> >
> >- 'hw_' - properties that affe

Re: [openstack-dev] [glance] [nova] Image Signature Verification

2015-11-13 Thread Daniel P. Berrange
On Thu, Nov 12, 2015 at 08:30:53PM +, Poulos, Brianna L. wrote:
> Hello,
> 
> There has recently been additional discussion about the best way to handle
> image signature verification in glance and nova [1].  There are two
> options being discussed for the signature (the examples below using
> 'RSA-PSS' as the type, and SHA-256 as the hash method):
> 
> 1. The signature is of the glance checksum of the image data (currently a
> hash which is hardcoded to be MD5)
> signature = RSA-PSS(SHA-256(MD5(IMAGE-CONTENT)))
> 
> 2. The signature of the image data directly
> signature = RSA-PSS(SHA-256(IMAGE-CONTENT))
> 
> The 1st option is what is currently in glance's liberty release [2].  This
> approach was chosen with the understanding that the glance checksum would
> be updated to be configurable [3].  Although the 2nd option was initially
> proposed, the glance community opposed it during the pre-Liberty virtual
> mini-summit in May 2015 (due to the performance impact of doing two hashes
> of the image data--one for the 'checksum' and the other for the
> signature), and it was decided to proceed with the 1st option during the
> Liberty summit [4].
> 
> During the Mitaka Summit, making the glance checksum configurable was
> discussed during a design session [5].  It was decided that instead of
> making the 'checksum' image field configurable, it would be preferable to
> compute a separate, configurable (on a per-image basis, with a site-wide
> default) hash, and then use that hash when MD5 wasn't sufficient (such as
> in the case of signature verification). This second hash would be computed
> at the same time the MD5 'checksum' was computed.
> 
> Which brings us to the nova spec which is under discussion [1], which is
> to add the ability to verify signatures in nova.  The nova community has
> made it clear that the promise of providing a configurable hash in glance
> is not good enough--they never want to support any signatures that use MD5
> in any way, shape, or form; nor do they want to rely on asking glance for
> what hash option was used.  To that end, the push is to use the 2nd option
> to verify signatures in nova from the start.

As well as not wanting MD5, I believe that computing signatures based
on a configurable checksum in glance provides a bad user experiance.
The user generating the signature of their image, now has to have a
way to query glance to find out what checksum it used, in order to
generate their signature. Further if the glance admin ever wants to
change their checksum algorithm, they'd break all existing signatures
by doing so. These are just as important reasons why I want Nova
to use the 2nd option and compute signatures directly on the image
content.

> Since the glance community no longer seems opposed to the idea of
> computing two hashes (the second hash being optional, of course), the 2nd
> option has now become valid from the glance perspective.  This would
> require modifying the existing implementation in glance to verify a
> signature of the image data, rather than verifying a checksum of the image
> data, but would have no additional performance hit beyond the cost to
> compute the second hash.  Note that the image data would still only be
> read once -- the checksum update (for the MD5 hash) and the signature
> verification update (for the signature hash) would occur in the same loop.
> Although this would mean that signatures generated using option 1 would no
> longer verify, since signatures generated using option 1 are based on an
> MD5 hash (and were waiting for the checksum configurability before
> becoming a viable cryptographic option anyway), this does not pose a
> significant issue.

A second point about the current proposal from Nova's POV is that
we do not like the image property names currently used. In Liberty
Nova standardized on the property naming scheme it uses to have 3
naming prefixes

  https://github.com/openstack/nova/blob/master/nova/objects/image_meta.py#L166

 - 'hw_' - properties that affect virtual hardware configuration
 - 'os_' - properties that affect guest OS setup / configuration
 - 'img_' - properties that affect handling of images by the host

The signature properties are obviously all related to the handling
of images by the host, so from Nova's POV we should have an 'img_'
prefix on all their names.

We probably should have alerted glance devs to this naming convention
before now to avoid this problem, but I guess we forgot. It would be
great if glance devs could bear this preferred naming convention in
mind if there are any future cases where there is a property that
needs to be used by both Nova & Glance code.

Anyway since the change in the way we calculate signatures on images
is a non-backwards compatible change for users of the current glance
impl, changing these property names at this point is reasonable todo.

Glance could use the property name to determine whether it is
getting an old or new style signature. ie if the 

Re: [openstack-dev] [nova] hackathon day

2015-11-12 Thread Daniel P. Berrange
On Thu, Nov 12, 2015 at 11:15:09AM +, Rosa, Andrea (HP Cloud Services) 
wrote:
> Hi
> 
> I knew that people in China had a 3 days hackathon  few months ago I was 
> thinking to have a similar thing in Europe.
> My original idea was to propose to add an extra day after the mid-cycle but I 
> am not sure if that is a good idea anymore:

The day after the mid-cycle is the main day for travelling to FOSDEM, so
a bad choice for people who want to attend FOSDEM too.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack-operators] [tc][all][osprofiler] OSprofiler is dead, long live OSprofiler

2015-11-09 Thread Daniel P. Berrange
On Mon, Nov 09, 2015 at 02:57:37AM -0800, Boris Pavlovic wrote:
> Hi stackers,
> 
> Intro
> ---
> 
> It's not a big secret that OpenStack is huge and complicated ecosystem of
> different
> services that are working together to implement OpenStack API.
> 
> For example booting VM is going through many projects and services:
> nova-api, nova-scheduler, nova-compute, glance-api, glance-registry,
> keystone, cinder-api, neutron-api... and many others.
> 
> The question is how to understand what part of the request takes the most
> of the time and should be improved. It's especially interested to get such
> data under the load.
> 
> To make it simple, I wrote OSProfiler which is tiny library that should be
> added to all OpenStack
> projects to create cross project/service tracer/profiler.
> 
> Demo (trace of CLI command: nova boot) can be found here:
> http://boris-42.github.io/ngk.html
> 
> This library is very simple. For those who wants to know how it works and
> how it's integrated with OpenStack take a look here:
> https://github.com/openstack/osprofiler/blob/master/README.rst
> 
> What is the current status?
> ---
> 
> Good news:
> - OSprofiler is mostly done
> - OSprofiler is integrated with Cinder, Glance, Trove & Ceilometer
> 
> Bad news:
> - OSprofiler is not integrated in a lot of important projects: Keystone,
> Nova, Neutron
> - OSprofiler can use only Ceilometer + oslo.messaging as a backend
> - OSprofiler stores part of arguments in api-paste.ini part in project.conf
> which is terrible thing
> - There is no DSVM job that check that changes in OSprofiler don't break
> the projects that are using it
> - It's hard to enable OSprofiler in DevStack
> 
> Good news:
> I spend some time and made 4 specs that should address most of issues:
> https://github.com/openstack/osprofiler/tree/master/doc/specs
> 
> Let's make it happen in Mitaka!
> 
> Thoughts?
> By the way somebody would like to join this effort?)

I'm very interested in seeing this kind of capability integrated across
openstack. I've really needed it in working in Nova for many times.
6 months back or so (when I didn't know osprofiler existed), I hacked up
a roughly equivalent library for openstack:

   https://github.com/berrange/oslo.devsupport

I never had time to persue this further and then found out about osprofiler
so it dropped in priority for me.

Some notable things I think I did differently

 - Used oslo.versionedobjects for recording the data to provide a
   structured data model with easy support for future extension,
   and well defined serialization format

 - Structured data types for all the various common types of operation
   to instrument (database request, RPC call, RPC dispatch, REST call
   REST dispatch, native library call, thread spawn, thread main,
   external command spawn). This is to ensure all apps using the library
   provide the same set of data for each type of operation.

 - Ability to capture stack traces against each profile point to
   allow creation of control flow graphs showing which code paths
   consumed significant time.

 - Abstract "consumer" API for different types of persistence backend.
   Rather than ceilometer, my initial consumer just serialized to
   plain files in well known directories, using oslo.versionedobjects
   serialization format. I can see ceilometer might be nice for
   production deployments, but plain files was simpler for developer
   environments which might not even be running ceilometer

 - Explicit tracking of nesting of instrunmented operation had a parent
   operation. At the top level was things like thread main, RPC dispatch
   and REST dispatch. IIUC, with osprofiler you could potentially infer
   the nesting by sorting based on start/end timestamps, but I felt an
   explicit representation was nicer to work with from the object
   model POV.

My approach would require oslo.devsupport to be integrated into the
oslo.concurrency, oslo.db, oslo.messaging components, as well as the
various apps. I did quick hacks to enable this for Nova & pieces it
uses you can see from these (non-upstreamed) commits:

  
https://github.com/berrange/oslo.concurrency/commit/2fbbc9cf4f23c5c2b30ff21e9e06235a79edbc20
  
https://github.com/berrange/oslo.messaging/commit/5a1fd87650e56a01ae9d8cc773e4d030d84cc6d8
  
https://github.com/berrange/nova/commit/3320b8957728a1acf786296eadf0bb40cb4df165

I see you already intend to make ceilometer optional and allow other
backends, so that's nice. I would love to see Osprofiler take on some
of the other ideas I had in my alternative, particularly the data model
based around oslo.versionedobjects to provide standard format for various
core types of operation, and the ability to record stack traces.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- 

Re: [openstack-dev] [nova][policy] Exposing hypervisor details to users

2015-11-06 Thread Daniel P. Berrange
On Fri, Nov 06, 2015 at 05:08:59PM +1100, Tony Breeds wrote:
> Hello all,
> I came across [1] which is notionally an ironic bug in that horizon 
> presents
> VM operations (like suspend) to users.  Clearly these options don't make sense
> to ironic which can be confusing.
> 
> There is a horizon fix that just disables migrate/suspened and other 
> functaions
> if the operator sets a flag say ironic is present.  Clealy this is sub optimal
> for a mixed hv environment.
> 
> The data needed (hpervisor type) is currently avilable only to admins, a quick
> hack to remove this policy restriction is functional.
> 
> There are a few ways to solve this.
> 
>  1. Change the default from "rule:admin_api" to "" (for 
> os_compute_api:os-extended-server-attributes and
> os_compute_api:os-hypervisors), and set a list of values we're
> comfortbale exposing the user (hypervisor_type and
> hypervisor_hostname).  So a user can get the hypervisor_name as part of
> the instance deatils and get the hypervisor_type from the
> os-hypervisors.  This would work for horizon but increases the API load
> on nova and kinda implies that horizon would have to cache the data and
> open-code assumptions that hypervisor_type can/can't do action $x
> 
>  2. Include the hypervisor_type with the instance data.  This would place the 
> burdon on nova.  It makes the looking up instance details slightly more
> complex but doesn't result in additional API queries, nor caching
> overhead in horizon.  This has the same opencoding issues as Option 1.
> 
>  3. Define a service user and have horizon look up the hypervisors details 
> via 
> that role.  Has all the drawbacks as option 1 and I'm struggling to
> think of many benefits.
> 
>  4. Create a capabilitioes API of some description, that can be queried so 
> that
> consumers (horizon) can known
> 
>  5. Some other way for users to know what kind of hypervisor they're on, 
> Perhaps
> there is an established image property that would work here?
> 
> If we're okay with exposing the hypervisor_type to users, then #2 is pretty
> quick and easy, and could be done in Mitaka.  Option 4 is probably the best
> long term solution but I think is best done in 'N' as it needs lots of
> discussion.

I think that exposing hypervisor_type is very much the *wrong* approach
to this problem. The set of allowed actions varies based on much more than
just the hypervisor_type. The hypervisor version may affect it, as may
the hypervisor architecture, and even the version of Nova. If horizon
restricted its actions based on hypevisor_type alone, then it is going
to inevitably prevent the user from performing otherwise valid actions
in a number of scenarios.

IMHO, a capabilities based approach is the only viable solution to
this kind of problem.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Proposal to add Sylvain Bauza to nova-core

2015-11-06 Thread Daniel P. Berrange
On Fri, Nov 06, 2015 at 03:32:00PM +, John Garbutt wrote:
> Hi,
> 
> I propose we add Sylvain Bauza[1] to nova-core.
> 
> Over the last few cycles he has consistently been doing great work,
> including some quality reviews, particularly around the Scheduler.
> 
> Please respond with comments, +1s, or objections within one week.

+1 from me, I think Sylvain will be a valuable addition to the team
for his scheduler expertize.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Proposal to add Alex Xu to nova-core

2015-11-06 Thread Daniel P. Berrange
On Fri, Nov 06, 2015 at 03:32:04PM +, John Garbutt wrote:
> Hi,
> 
> I propose we add Alex Xu[1] to nova-core.
> 
> Over the last few cycles he has consistently been doing great work,
> including some quality reviews, particularly around the API.
> 
> Please respond with comments, +1s, or objections within one week.

+1 from me, the tireless API patch & review work has been very helpful
to our efforts in this area.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][policy] Exposing hypervisor details to users

2015-11-06 Thread Daniel P. Berrange
On Fri, Nov 06, 2015 at 07:09:59AM -0500, Sean Dague wrote:
> On 11/06/2015 04:49 AM, Daniel P. Berrange wrote:
> > 
> > I think that exposing hypervisor_type is very much the *wrong* approach
> > to this problem. The set of allowed actions varies based on much more than
> > just the hypervisor_type. The hypervisor version may affect it, as may
> > the hypervisor architecture, and even the version of Nova. If horizon
> > restricted its actions based on hypevisor_type alone, then it is going
> > to inevitably prevent the user from performing otherwise valid actions
> > in a number of scenarios.
> > 
> > IMHO, a capabilities based approach is the only viable solution to
> > this kind of problem.
> 
> Right, we just had a super long conversation about this in #openstack-qa
> yesterday with mordred, jroll, and deva around what it's going to take
> to get upgrade tests passing with ironic.
> 
> Capabilities is the right approach, because it means we're future
> proofing our interface by telling users what they can do, not some
> arbitrary string that they need to cary around a separate library to
> figure those things out.
> 
> It seems like capabilities need to exist on flavor, and by proxy instance.
> 
> GET /flavors/bm.large/capabilities
> 
> {
>  "actions": {
>  'pause': False,
>  'unpause': False,
>  'rebuild': True
>  ..
>   }
> 
> A starting point would definitely be the set of actions that you can
> send to the flavor/instance. There may be features beyond that we'd like
> to classify as capabilities, but actions would be a very concrete and
> attainable starting point. With microversions we don't have to solve
> this all at once, start with a concrete thing and move forward.

I think there are two distinct use cases for capabilities we need to
consider.

 1. Before I launch an instance, does the cloud provide features XYZ

 2. Given this running instance, am I able to perform operation XYZ

Having capabilities against the flavour /might/ be sufficient for
#1, but it isn't sufficient for #2.

For example, the ability to hotplug disks to a running instance will
depend on what disk controller the instance is using. The choice of
disk controller used will vary based on image metadata properties,
eg ide vs scsi vs virtio-blk. IDE does not support hotplug, but
scsi & virtio-blk do. So we can't answer the question "does hotplug
disk work for this instance" simply based on the flavour - we need
to ask it against the instance.

What we can answer against the flavour is whether the hypervisor
driver is able to support hotplug in principle, given a suitably
configured instance. That said, even that is not an exact science
if you take into account fact that the cloud could be running
compute nodes with different versions, and the flavour does not
directly control which version of a compute node we'll run against.

Having capabilities against the flavour would certainly allow for
an improvement in Horizon UI vs its current state, but to be able
to perfectly represent what is possible for an instance, Horizon
would ultimately require capabilities against the isntance,

So I think we'll likely end up having to implement both capabilities
against a flavour and against an instance. So you'd end up with a
flow something like

 - Check to see if cloud provider supports hotplug by checking
   flavour capability==disk-hotplug

 - When booting an instance mark it as requiring capability==disk-hotplug
   to ensure its scheduled to a node which supports that capability

 - When presenting UI for operations against an instance, check
   that the running instance supports capability==disk-hotplug


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] Upgrade to Gerrit 2.11

2015-10-21 Thread Daniel P. Berrange
On Tue, Oct 13, 2015 at 05:08:29PM -0700, Zaro wrote:
> Hello All,
> 
> The openstack-infra team would like to upgrade from our Gerrit 2.8 to
> Gerrit 2.11.  We are proposing to do the upgrade shortly after the
> Mitaka summit.  The main motivation behind the upgrade is to allow us
> to take advantage of some of the new REST api, ssh commands, and
> stream events features.  Also we wanted to stay closer to upstream so
> it will be easier to pick up more recent features and fixes.

Looking at the release notes I see my most wanted new feature, keyword
tagging of changes, is available

[quote]
Hashtags.

Hashtags can be added to changes. The hashtags are stored in git notes and
are indexed in the secondary index.

This feature requires the notedb to be enabled.
[/quote]

It is listed as an experimental feature, but I'd really love to see this
enabled if at all practical.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra][all] Reviews with a prio label?

2015-10-20 Thread Daniel P. Berrange
On Tue, Oct 20, 2015 at 06:09:51PM +0200, Markus Zoeller wrote:
> In ML post [1] I wondered if it would be possible to introduce a new
> "prio" label in Gerrit which could help in focusing review efforts to
> increase the throughput. With this new post I'd like to discuss if we
> think this could be useful. For example, this would allow to create this
> query in Gerrit:
> 
> "status:open label:Prio=3" 
> 
> I was curious how this could look like in Gerrit, which resulted in the
> screenshots available at [2]. This would minimize the gap between the 
> prio of the blueprints/bugs and their commit reviews.
> 
> I'm mostly active in Nova, so here a short example of how we currently
> try to speed up the merges of trivial fixes:
> 
> * contributor "A" spots a review which looks trivial
> * contributor "A" copies the review ID into an etherpad
> * core reviewer "B" reads the etherpad when possible
> * core reviewer "B" does a review too and eventually gives a +W
> * core reviewer "B" removes that review from the Etherpad when it merges
> 
> This workflow is only necessary because Gerrit does not allow to 
> categorize reviews, e.g. into a group of "trivial fixes". 
> 
> I noticed in my "mini poc" that it would be possible to set permissions
> to specific label values. Which allows us to have a "trivialfix" prio 
> which can be set by everyone, but also a "high" prio which can be set 
> by an automated entity which reuses the priority of the blueprint or 
> bug report.
> 
> Do you think this would speed things up? Or is this just nitpicking on
> an already good enough workflow?

What you're describing is really just a special-case of allowing
arbitrary user tagging of changes. If gerrit had a free-format
keyword tag facility that users could use & query, it'd open up many
possible options for improving our general workflow, and letting
users customize their own workflow.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] live migration in Mitaka

2015-10-08 Thread Daniel P. Berrange
On Wed, Oct 07, 2015 at 03:54:29PM -0600, Chris Friesen wrote:
> On 10/07/2015 03:14 AM, Daniel P. Berrange wrote:
> 
> >For suspended instances, the scenario is really the same as with completely
> >offline instances. The only extra step is that you need to migrate the saved
> >image state file, as well as the disk images. This is trivial once you have
> >done the code for migrating disk images offline, since its "just one more 
> >file"
> >to care about.  Officially apps aren't supposed to know where libvirt keeps
> >the managed save files, but I think it is fine for Nova to peek behind the
> >scenes to get them. Alternatively I'd be happy to see an API added to libvirt
> >to allow the managed save files to be uploaded & downloaded via a libvirt
> >virStreamPtr object, in the same way we provide APIs to  upload & download
> >disk volumes. This would avoid the need to know explicitly about the file
> >location for the managed save image.
> 
> Assuming we were using libvirt with the storage pools API could we currently
> (with existing libvirt) migrate domains that have been suspended with
> virDomainSave()?  Or is the only current option to have nova move the file
> over using passwordless access?

If you used virDomainSave() instead of virDomainManagedSave() then you control
the file location, so you could create a directory based storage pool and
save the state into that directory, at which point you can use the storag
pool APIs to upload/download that data.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tc] naming N and O releases nowish

2015-10-08 Thread Daniel P. Berrange
On Wed, Oct 07, 2015 at 02:57:59PM +0200, Thierry Carrez wrote:
> Sean Dague wrote:
> > We're starting to make plans for the next cycle. Long term plans are
> > getting made for details that would happen in one or two cycles.
> > 
> > As we already have the locations for the N and O summits I think we
> > should do the naming polls now and have names we can use for this
> > planning instead of letters. It's pretty minor but it doesn't seem like
> > there is any real reason to wait and have everyone come up with working
> > names that turn out to be confusing later.
> 
> That sounds fair. However the release naming process currently states[1]:
> 
> """
> The process to chose the name for a release begins once the location of
> the design summit of the release to be named is announced and no sooner
> than the opening of development of the previous release.
> """
> 
> ...which if I read it correctly means we could pick N now, but not O. We
> might want to change that (again) first.

Since changing the naming process may take non-negligible time, could
we parallelize, so we can at least press ahead with picking a name for
N asap which is permitted by current rules.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] live migration in Mitaka

2015-10-07 Thread Daniel P. Berrange
On Tue, Oct 06, 2015 at 11:43:52AM -0600, Chris Friesen wrote:
> On 10/06/2015 11:27 AM, Paul Carlton wrote:
> >
> >
> >On 06/10/15 17:30, Chris Friesen wrote:
> >>On 10/06/2015 08:11 AM, Daniel P. Berrange wrote:
> >>>On Tue, Oct 06, 2015 at 02:54:21PM +0100, Paul Carlton wrote:
> >>>>https://review.openstack.org/#/c/85048/ was raised to address the
> >>>>migration of instances that are not running but people did not warm to
> >>>>the idea of bringing a stopped/suspended instance to a paused state to
> >>>>migrate it.  Is there any work in progress to get libvirt enhanced to
> >>>>perform the migration of non active virtual machines?
> >>>
> >>>Libvirt can "migrate" the configuration of an inactive VM, but does
> >>>not plan todo anything related to storage migration. OpenStack could
> >>>already solve this itself by using libvirt storage pool APIs to
> >>>copy storage volumes across, but the storage pool worked in Nova
> >>>is stalled
> >>>
> >>>https://review.openstack.org/#/q/status:abandoned+project:openstack/nova+branch:master+topic:bp/use-libvirt-storage-pools,n,z
> >>>
> >>
> >>What is the libvirt API to migrate a paused/suspended VM? Currently nova 
> >>uses
> >>dom.managedSave(), so it doesn't know what file libvirt used to save the
> >>state.  Can libvirt migrate that file transparently?
> >>
> >>I had thought we might switch to virDomainSave() and then use the cold
> >>migration framework, but that requires passwordless ssh.  If there's a way 
> >>to
> >>get libvirt to handle it internally via the storage pool API then that would
> >>be better.
> 
> 
> >So my reading of this is the issue could be addressed in Mitaka by
> >implementing
> >http://specs.openstack.org/openstack/nova-specs/specs/kilo/approved/use-libvirt-storage-pools.html
> >
> >and
> >https://review.openstack.org/#/c/126979/4/specs/kilo/approved/migrate-libvirt-volumes.rst
> >
> >
> >is there any prospect of this being progressed?
> 
> Paul, that would avoid the need for cold migrations to use passwordless ssh
> between nodes.  However, I think there may be additional work to handle
> migrating paused/suspended instances--still waiting for Daniel to address
> that bit.

Migrating paused VMs should "just work" - certainly at the libvirt/QEMU
level there's no distinction between a paused & running VM wrt migration.
I know that historically Nova has blocked migration if the VM is paused
and I recall patches to remove that pointless restriction. I can't
remember if they ever merged.

For suspended instances, the scenario is really the same as with completely
offline instances. The only extra step is that you need to migrate the saved
image state file, as well as the disk images. This is trivial once you have
done the code for migrating disk images offline, since its "just one more file"
to care about.  Officially apps aren't supposed to know where libvirt keeps
the managed save files, but I think it is fine for Nova to peek behind the
scenes to get them. Alternatively I'd be happy to see an API added to libvirt
to allow the managed save files to be uploaded & downloaded via a libvirt
virStreamPtr object, in the same way we provide APIs to  upload & download
disk volumes. This would avoid the need to know explicitly about the file
location for the managed save image.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] live migration in Mitaka

2015-10-07 Thread Daniel P. Berrange
On Wed, Oct 07, 2015 at 10:26:05AM +0100, Paul Carlton wrote:
> I'd be happy to take this on in Mitaka

Ok, first step would be to re-propose the old Kilo spec against Mitaka and
we should be able to fast-approve it.

> >>>So my reading of this is the issue could be addressed in Mitaka by
> >>>implementing
> >>>http://specs.openstack.org/openstack/nova-specs/specs/kilo/approved/use-libvirt-storage-pools.html
> >>>
> >>>and
> >>>https://review.openstack.org/#/c/126979/4/specs/kilo/approved/migrate-libvirt-volumes.rst

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] revisiting minimum libvirt version

2015-10-07 Thread Daniel P. Berrange
On Wed, Oct 07, 2015 at 06:32:53AM -0400, Sean Dague wrote:
> The following review https://review.openstack.org/#/c/171098 attempts to
> raise the minimum libvirt version to 1.0.3.
> 
> In May that was considered a no go -
> http://lists.openstack.org/pipermail/openstack-operators/2015-May/007012.html
> 
> Can we reconsider that decision and up this to 1.2 for what we're
> regularly testing with. It would also allow some cleaning out of a lot
> of conditional pathing, which is getting pretty deep in ifdefs -
> https://github.com/openstack/nova/blob/251e09ab69e5dd1ba2c917175bb408c708843f6e/nova/virt/libvirt/driver.py#L359-L424

I've actually just sent a thread suggesting we pick 1.1.1:

  http://lists.openstack.org/pipermail/openstack-dev/2015-October/076302.html

It is possible we could decide to pick a 1.2.x release, if we're willing to
drop further distros. Lets continue the discussion in that other thread
I created.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Min libvirt for Mitaka is 0.10.2 and suggest Nxxx uses 1.1.1

2015-10-07 Thread Daniel P. Berrange
On Wed, Oct 07, 2015 at 07:17:09AM -0400, Sean Dague wrote:
> On 10/07/2015 07:02 AM, Daniel P. Berrange wrote:
> > On Wed, Oct 07, 2015 at 06:55:44AM -0400, Sean Dague wrote:
> >> Isn't RHEL 7.1 just an update stream on RHEL 7.0? It seems a little
> >> weird to keep the 1.1.1 support instead of just going up to 1.2.2.
> > 
> > Yes & no. There are in fact two different streams users can take
> > with RHEL. They can stick on a bugfix only stream, which would be
> > 7.0.1, 7.0.2, etc, or they can take the bugfix + features stream
> > which is 7.1, 7.2, etc. They can't stick on the bugfix only
> > stream forever though, so given that by time Nxx is released
> > 7.2 will also be available, we are probably justified in dropping
> > 7.0 support.
> > 
> > The next oldest distro libvirt would be Debian Wheezy-backports at 1.2.1.
> > If we are happy to force Debian users to Jessie, then next oldest after
> > that is Ubuntu 14.04 LTS with 1.2.2.
> 
> 1.2.1 seems reasonable, it's also probably worth asking the Debian folks
> if they can put 1.2.2 into the backport stream.
> 
> I think it might also be worth pre-declaring the O minimum as well so
> that instead of just following the distros we are signaling what we'd
> like in there. Because in the O time frame it feels like 1.2.8 would be
> a reasonable minimum, and that would give distros a year of warning to
> ensure they got things there.

FYI I extended the distro support wiki page with details of the min
libvirt we have required in each Nova release:

  
https://wiki.openstack.org/wiki/LibvirtDistroSupportMatrix#Nova_release_min_version

We could just add a row for O release with an educated guess
as to a possible target, to give people an idea of where we're
likely to go.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova] FYI: Updated Mitaka specs template

2015-10-07 Thread Daniel P. Berrange
FYI anyone who is pushing specs for review against Mitaka should be aware
that yesterday we merged a change to the spec template. Specifically we
have removed the "Project priority" section of the template, since it has
been a source of much confusion, cannot be filled out until after the
summit decides on priorities and priority specs are already tracked via
etherpad.

So if anyone has a spec up for review, simply delete the "Project priority"
section of your template when pushing your next update of it. It should
have already only contained the word "None" in any case :-)

Once priorities are decided we will track priority specs via this page:

  https://etherpad.openstack.org/p/mitaka-nova-priorities-tracking

Regards,
Daniel

[1] https://review.openstack.org/#/c/230916/
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova] Min libvirt for Mitaka is 0.10.2 and suggest Nxxx uses 1.1.1

2015-10-07 Thread Daniel P. Berrange
In the Liberty version of OpenStack we had a min libvirt of 0.9.11 and
printed a warning on startup if you had < 0.10.2, to the effect that
Mitaka will required 0.10.2

This mail is a reminder that we will[1] mandate libvirt >= 0.10.2 when
Mitaka is released.


Looking forward to the N release, I am suggesting that we target
a new min libvirt of 1.1.1 for that cycle.

Based on info in

   https://wiki.openstack.org/wiki/LibvirtDistroSupportMatrix

this will exclude the following distros running running Nova Nxxx
release:

 - Fedora 20 - it will be end-of-life way before Nxxx is released

 - RHEL 6 - Red Hat stopped shipping Nova on RHEL-6 after Icehouse
and base distro only supports Python 2.6

 - OpenSUSE 12 - this was end-of-life about 6 months ago now

 - SLES 11 - base distro only supports Python 2.6

 - Debian Wheezy - Debian Jessie is current stable, and Wheezy-backports
   provides new enough libvirt for people who wish to
   stay on Wheezy

The min distros required would thus be Fedora 21, RHEL 7.0, OpenSUSE 13
SLES 12, Debian Wheezy and Ubuntu 14.04 (Trusty LTS)

Regards,
Daniel

[1] https://review.openstack.org/#/c/231917/
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Min libvirt for Mitaka is 0.10.2 and suggest Nxxx uses 1.1.1

2015-10-07 Thread Daniel P. Berrange
On Wed, Oct 07, 2015 at 06:55:44AM -0400, Sean Dague wrote:
> On 10/07/2015 06:46 AM, Daniel P. Berrange wrote:
> > In the Liberty version of OpenStack we had a min libvirt of 0.9.11 and
> > printed a warning on startup if you had < 0.10.2, to the effect that
> > Mitaka will required 0.10.2
> > 
> > This mail is a reminder that we will[1] mandate libvirt >= 0.10.2 when
> > Mitaka is released.
> > 
> > 
> > Looking forward to the N release, I am suggesting that we target
> > a new min libvirt of 1.1.1 for that cycle.
> > 
> > Based on info in
> > 
> >https://wiki.openstack.org/wiki/LibvirtDistroSupportMatrix
> > 
> > this will exclude the following distros running running Nova Nxxx
> > release:
> > 
> >  - Fedora 20 - it will be end-of-life way before Nxxx is released
> > 
> >  - RHEL 6 - Red Hat stopped shipping Nova on RHEL-6 after Icehouse
> > and base distro only supports Python 2.6
> > 
> >  - OpenSUSE 12 - this was end-of-life about 6 months ago now
> > 
> >  - SLES 11 - base distro only supports Python 2.6
> > 
> >  - Debian Wheezy - Debian Jessie is current stable, and Wheezy-backports
> >provides new enough libvirt for people who wish to
> >stay on Wheezy
> > 
> > The min distros required would thus be Fedora 21, RHEL 7.0, OpenSUSE 13
> > SLES 12, Debian Wheezy and Ubuntu 14.04 (Trusty LTS)
> > 
> > Regards,
> > Daniel
> > 
> > [1] https://review.openstack.org/#/c/231917/
> 
> Isn't RHEL 7.1 just an update stream on RHEL 7.0? It seems a little
> weird to keep the 1.1.1 support instead of just going up to 1.2.2.

Yes & no. There are in fact two different streams users can take
with RHEL. They can stick on a bugfix only stream, which would be
7.0.1, 7.0.2, etc, or they can take the bugfix + features stream
which is 7.1, 7.2, etc. They can't stick on the bugfix-only
stream forever though, so given that by the time Nxxx is released
7.2 will also be available, we are probably justified in dropping
7.0 support.

The next oldest distro libvirt would be Debian Wheezy-backports at 1.2.1.
If we are happy to force Debian users to Jessie, then next oldest after
that is Ubuntu 14.04 LTS with 1.2.2.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] live migration in Mitaka

2015-10-07 Thread Daniel P. Berrange
On Tue, Oct 06, 2015 at 06:27:12PM +0100, Paul Carlton wrote:
> 
> 
> On 06/10/15 17:30, Chris Friesen wrote:
> >On 10/06/2015 08:11 AM, Daniel P. Berrange wrote:
> >>On Tue, Oct 06, 2015 at 02:54:21PM +0100, Paul Carlton wrote:
> >>>https://review.openstack.org/#/c/85048/ was raised to address the
> >>>migration of instances that are not running but people did not warm to
> >>>the idea of bringing a stopped/suspended instance to a paused state to
> >>>migrate it.  Is there any work in progress to get libvirt enhanced to
> >>>perform the migration of non active virtual machines?
> >>
> >>Libvirt can "migrate" the configuration of an inactive VM, but does
> >>not plan todo anything related to storage migration. OpenStack could
> >>already solve this itself by using libvirt storage pool APIs to
> >>copy storage volumes across, but the storage pool worked in Nova
> >>is stalled
> >>
> >>https://review.openstack.org/#/q/status:abandoned+project:openstack/nova+branch:master+topic:bp/use-libvirt-storage-pools,n,z
> >>
> >
> >What is the libvirt API to migrate a paused/suspended VM? Currently nova
> >uses dom.managedSave(), so it doesn't know what file libvirt used to save
> >the state.  Can libvirt migrate that file transparently?
> >
> >I had thought we might switch to virDomainSave() and then use the cold
> >migration framework, but that requires passwordless ssh.  If there's a way
> >to get libvirt to handle it internally via the storage pool API then that
> >would be better.
> >
> >Chris
> >
> >__
> >
> >OpenStack Development Mailing List (not for usage questions)
> >Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> >http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> So my reading of this is the issue could be addressed in Mitaka by
> implementing
> http://specs.openstack.org/openstack/nova-specs/specs/kilo/approved/use-libvirt-storage-pools.html
> and
> https://review.openstack.org/#/c/126979/4/specs/kilo/approved/migrate-libvirt-volumes.rst
> 
> is there any prospect of this being progressed?

The guy who started that work, Solly Ross, is no longer involved in the
Nova project. The overall idea is still sound, but the patches need more
work to get them into a state suitable for serious review & potential
merge. So it is basically waiting for someone motivated to take over
the existing patches Solly did...

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack-operators] [nova] Min libvirt for Mitaka is 0.10.2 and suggest Nxxx uses 1.1.1

2015-10-07 Thread Daniel P. Berrange
On Wed, Oct 07, 2015 at 11:46:58AM +0100, Daniel P. Berrange wrote:
> In the Liberty version of OpenStack we had a min libvirt of 0.9.11 and
> printed a warning on startup if you had < 0.10.2, to the effect that
> Mitaka will required 0.10.2
> 
> This mail is a reminder that we will[1] mandate libvirt >= 0.10.2 when
> Mitaka is released.
> 
> 
> Looking forward to the N release, I am suggesting that we target
> a new min libvirt of 1.1.1 for that cycle.
> 
> Based on info in
> 
>https://wiki.openstack.org/wiki/LibvirtDistroSupportMatrix
> 
> this will exclude the following distros running running Nova Nxxx
> release:
> 
>  - Fedora 20 - it will be end-of-life way before Nxxx is released
> 
>  - RHEL 6 - Red Hat stopped shipping Nova on RHEL-6 after Icehouse
> and base distro only supports Python 2.6
> 
>  - OpenSUSE 12 - this was end-of-life about 6 months ago now
> 
>  - SLES 11 - base distro only supports Python 2.6
> 
>  - Debian Wheezy - Debian Jessie is current stable, and Wheezy-backports
>provides new enough libvirt for people who wish to
>  stay on Wheezy
> 
> The min distros required would thus be Fedora 21, RHEL 7.0, OpenSUSE 13
> SLES 12, Debian Wheezy and Ubuntu 14.04 (Trusty LTS)

If we want to be slightly more aggressive and target 1.2.1 we would
additionally lose RHEL-7.0 and OpenSUSE 13.1.  This is probably
not the end of the world, since by the time Nxxx is released, I
expect people will have moved to a newer minor update of those
distros (RHEL-7.1 / OpenSUSE 13.2).

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack-operators] [nova] Min libvirt for Mitaka is 0.10.2 and suggest Nxxx uses 1.1.1

2015-10-07 Thread Daniel P. Berrange
On Wed, Oct 07, 2015 at 11:13:12AM +, Tim Bell wrote:
> 
> Although Red Hat is no longer supporting RHEL 6 after Icehouse, a number of
> users such as GoDaddy and CERN are using Software Collections to run the
> Python 2.7 code.

Do you have any educated guess as to when you might switch to deploying
new OpenStack versions exclusively on RHEL 7? I understand such a switch
is likely to take a while so you can test its performance and reliability
and so on, but I'm assuming you'll eventually switch?

> However, since this modification would only take place when Mitaka gets
> released, this would realistically give those sites a year to complete
> migration to RHEL/CentOS 7 assuming they are running from one of the
> community editions.
> 
> What does the 1.1.1 version bring that is the motivation for raising the
> limit ?

If we require 1.1.1 we could have unconditional support for

 - Hot-unplug of PCI devices (needs 1.1.1)
 - Live snapshots (needs 1.0.0)
 - Live volume snapshotting (needs 1.1.1)
 - Disk sector discard support (needs 1.0.6)
 - Hyper-V clock tunables (needs 1.0.0 & 1.1.0)

If you lack those versions then, in the case of hot-unplug and live volume
snapshots, we just refuse the corresponding API call. With live
snapshots we fall back to non-live snapshots. For disk discard and the
Hyper-V clock we just run with degraded functionality. The lack of
Hyper-V clock tunables means Windows guests will have unreliable
timekeeping and are likely to suffer random BSODs, which I think
is a particularly important issue.


And of course we remove a bunch of conditional logic from Nova
which simplifies the code paths and removes code paths which
rarely get testing coverage.
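
To illustrate the kind of version conditional involved (a rough sketch,
not Nova's actual code, with an illustrative feature and threshold):

    import libvirt

    # libvirt encodes versions as major*1,000,000 + minor*1,000 + release
    MIN_LIBVIRT_LIVE_SNAPSHOT = 1 * 1000000   # 1.0.0

    def can_live_snapshot(conn):
        # Raising the minimum libvirt above 1.0.0 would let checks like
        # this, and the rarely tested fallback path behind them, be deleted.
        return conn.getLibVersion() >= MIN_LIBVIRT_LIVE_SNAPSHOT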

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tc] naming N and O releases nowish

2015-10-07 Thread Daniel P. Berrange
On Wed, Oct 07, 2015 at 07:47:31AM -0400, Sean Dague wrote:
> We're starting to make plans for the next cycle. Long term plans are
> getting made for details that would happen in one or two cycles.
> 
> As we already have the locations for the N and O summits I think we
> should do the naming polls now and have names we can use for this
> planning instead of letters. It's pretty minor but it doesn't seem like
> there is any real reason to wait and have everyone come up with working
> names that turn out to be confusing later.

Yep, it would be nice to have names decided further in advance than
we have done in the past. It saves having to refer to N, O
all the time, or having people invent their own temporary names like
Lemming and Muppet...

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] live migration in Mitaka

2015-10-06 Thread Daniel P. Berrange
On Tue, Oct 06, 2015 at 02:54:21PM +0100, Paul Carlton wrote:
> https://review.openstack.org/#/c/85048/ was raised to address the
> migration of instances that are not running but people did not warm to
> the idea of bringing a stopped/suspended instance to a paused state to
> migrate it.  Is there any work in progress to get libvirt enhanced to
> perform the migration of non active virtual machines?

Libvirt can "migrate" the configuration of an inactive VM, but does
not plan to do anything related to storage migration. OpenStack could
already solve this itself by using the libvirt storage pool APIs to
copy storage volumes across, but the storage pool work in Nova
is stalled:

https://review.openstack.org/#/q/status:abandoned+project:openstack/nova+branch:master+topic:bp/use-libvirt-storage-pools,n,z
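
For reference, the rough shape such a volume copy could take using the
storage volume download/upload stream APIs - purely a sketch, assuming
both hosts expose a pool of the same name and ignoring the XML fixups a
real implementation would need:

    import libvirt

    CHUNK = 1024 * 1024

    def copy_volume(src_uri, dst_uri, pool_name, vol_name):
        src = libvirt.open(src_uri)
        dst = libvirt.open(dst_uri)
        try:
            src_vol = src.storagePoolLookupByName(
                pool_name).storageVolLookupByName(vol_name)
            # Create a matching volume on the destination from the source
            # volume's XML description (paths would need adjusting in reality).
            dst_vol = dst.storagePoolLookupByName(pool_name).createXML(
                src_vol.XMLDesc(0), 0)

            src_stream = src.newStream(0)
            dst_stream = dst.newStream(0)
            src_vol.download(src_stream, 0, 0, 0)   # length 0 = whole volume
            dst_vol.upload(dst_stream, 0, 0, 0)
            while True:
                data = src_stream.recv(CHUNK)
                if not data:
                    break
                dst_stream.send(data)
            src_stream.finish()
            dst_stream.finish()
        finally:
            src.close()
            dst.close()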

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Fwd: [nova] live migration in Mitaka

2015-10-05 Thread Daniel P. Berrange
On Fri, Oct 02, 2015 at 04:38:30PM -0400, Mathieu Gagné wrote:
> On 2015-10-02 4:18 PM, Pavel Boldin wrote:
> > 
> > You have to pass device names from /dev/, e.g., if a VM has
> > ephemeral disk
> > attached at /dev/vdb you need to pass in 'vdb'. Format expected by
> > migrate_disks is ",...".
> > 
> > 
> > This is the format expected by the `virsh' utility and will not work in
> > Python.
> > 
> > The libvirt-python has now support for passing lists to a parameter [1].
> > 
> > [1]
> > http://libvirt.org/git/?p=libvirt-python.git;a=commit;h=9896626b8277e2ffba1523d2111c96b08fb799e8
> >  
> 
> Thanks for the info. I was banging my head against the wall, trying to
> understand why it didn't accept my list of strings.
> 
> Now the next challenge is with Ubuntu packages, only python-libvirt
> 1.2.15 is available in Ubuntu Willy. :-/

You already need a runtime version check for libvirt to see if the new
migration API is available. You just need to extend that to also check
the libvirt client version.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Fwd: [nova] live migration in Mitaka

2015-10-05 Thread Daniel P. Berrange
On Mon, Oct 05, 2015 at 01:10:45PM +0300, Pavel Boldin wrote:
> Daniel,
> 
> It is done already in the proposed patch.
> 
> But this one is about Wily having libvirt 1.2.16 and libvirt-python 1.2.15.

Assuming you are referring to this patch:

https://review.openstack.org/#/c/227278/3/nova/virt/libvirt/driver.py

that code is only checking the libvirt daemon version; it is not checking
the libvirt-python version.
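
i.e. something along these lines - a sketch only, with an illustrative
minimum version, and using pkg_resources as just one way of finding the
binding's own version:

    import libvirt
    import pkg_resources

    MIN_VERSION = (1, 2, 17)   # illustrative threshold only

    def _split(ver):
        # libvirt encodes versions as major*1,000,000 + minor*1,000 + release
        return (ver // 1000000, (ver // 1000) % 1000, ver % 1000)

    def new_migrate_api_usable(conn):
        daemon_ok = _split(conn.getLibVersion()) >= MIN_VERSION
        binding_ver = pkg_resources.get_distribution('libvirt-python').version
        binding_ok = tuple(int(x) for x in binding_ver.split('.')[:3]) >= MIN_VERSION
        return daemon_ok and binding_ok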


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack-operators] [cinder] [all] The future of Cinder API v1

2015-09-30 Thread Daniel P. Berrange
On Wed, Sep 30, 2015 at 08:10:43AM -0400, Sean Dague wrote:
> On 09/30/2015 07:29 AM, Ivan Kolodyazhny wrote:
> > Sean,
> > 
> > openstack client supports Cinder API v2 since Liberty. What it the right
> > way ti fix grenade?
> 
> Here's the thing.
> 
> With this change: Rally doesn't work, novaclient doesn't work, grenade
> doesn't work. Apparently nearly all the libraries in the real world
> don't work.
> 
> I feel like that list of incompatibilities should have been collected
> before this change. Managing a major API transition is a big deal, and
> having a pretty good idea who you are going to break before you do it is
> important. Just putting it out there and watching fallout isn't the
> right approach.

I have to agree: breaking APIs is a very big deal for consumers of
those APIs. When you break an API you are trading less work for
maintainers against extra pain for users. IMHO intentionally creating
pain for users is something that should be avoided unless there is
no practical alternative. I'd go as far as to say we should never
break the API at all, which would mean keeping v1 around forever,
albeit recommending people use v2. If we really do want to kill
v1 and inflict pain on consumers, then we need to ensure that pain
is as close to zero as possible. This means we should not kill v1
until we've verified that all known current client implementations
of v1 have a v2 implementation available.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] -1 due to line length violation in commit messages

2015-09-25 Thread Daniel P. Berrange
On Fri, Sep 25, 2015 at 11:05:09AM -0400, Doug Hellmann wrote:
> git tools such as git log and git show indent the commit message in
> their output, so you don't actually have the full 79/80 character width
> to work with. That's where the 72 comes from.

It is also commonly done so that when you copy commits into
email, the commit message doesn't get further line breaks inserted.
This isn't a big deal with OpenStack, as we don't use an email workflow
for patch review.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] live migration in Mitaka

2015-09-23 Thread Daniel P. Berrange
On Wed, Sep 23, 2015 at 01:48:17PM +0100, Paul Carlton wrote:
> 
> 
> On 22/09/15 16:44, Daniel P. Berrange wrote:
> >On Tue, Sep 22, 2015 at 09:29:46AM -0600, Chris Friesen wrote:
> >>>>There is also work on post-copy migration in QEMU. Normally with live
> >>>>migration, the guest doesn't start executing on the target host until
> >>>>migration has transferred all data. There are many workloads where that
> >>>>doesn't work, as the guest is dirtying data too quickly, With post-copy 
> >>>>you
> >>>>can start running the guest on the target at any time, and when it faults
> >>>>on a missing page that will be pulled from the source host. This is
> >>>>slightly more fragile as you risk loosing the guest entirely if the source
> >>>>host dies before migration finally completes. It does guarantee that
> >>>>migration will succeed no matter what workload is in the guest. This is
> >>>>probably N cycle material.
> >>It seems to me that the ideal solution would be to start doing pre-copy
> >>migration, then if that doesn't converge with the specified downtime value
> >>then maybe have the option to just cut over to the destination and do a
> >>post-copy migration of the remaining data.
> >Yes, that is precisely what the QEMU developers working on this
> >featue suggest we should do. The lazy page faulting on the target
> >host has a performance hit on the guest, so you definitely need
> >to give a little time for pre-copy to start off with, and then
> >switch to post-copy once some benchmark is reached, or if progress
> >info shows the transfer is not making progress.
> >
> >Regards,
> >Daniel
> I'd be a bit concerned about automatically switching to the post copy
> mode.  As Daniel commented perviously, if something goes wrong on the
> source node the customer's instance could be lost.  Many cloud operators
> will want to control the use of this mode.  As per my previous message
> this could be something that could be set on or off by default but
> provide a PUT operation on os-migration to update setting on for a
> specific migration

NB, if you are concerned about the source host going down while
migration is still taking place, you will lose the VM even with
pre-copy mode too, since the VM will of course still be running
on the source.

The new failure scenario is essentially about the network
connection between the source & target hosts - if the network
layer fails while post-copy is running, then you lose the
VM.

In some sense post-copy will reduce the window of failure,
because it should ensure that the VM migration completes
in a faster & finite amount of time. I think this is
probably particularly important for host evacuation so
the admin can guarantee to get all the VMs off a host in
a reasonable amount of time.

As such I don't think you need to expose post-copy as a concept in the
API, but I could see a nova.conf value to say whether use of post-copy
was acceptable, so those who want to have stronger resilience against
network failure can turn off post-copy.
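
e.g. something like the following - the option name and default here are
purely illustrative, not an existing Nova option:

    from oslo_config import cfg

    CONF = cfg.CONF

    CONF.register_opt(
        cfg.BoolOpt('live_migration_permit_post_copy',
                    default=False,
                    help='Allow a live migration to switch to post-copy '
                         'mode. Post-copy guarantees completion, but a '
                         'network failure between the hosts while it is '
                         'active will lose the guest.'),
        group='libvirt')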

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] live migration in Mitaka

2015-09-22 Thread Daniel P. Berrange
On Tue, Sep 22, 2015 at 09:05:11AM -0600, Chris Friesen wrote:
> On 09/21/2015 02:56 AM, Daniel P. Berrange wrote:
> >On Fri, Sep 18, 2015 at 05:47:31PM +, Carlton, Paul (Cloud Services) 
> >wrote:
> >>However the most significant impediment we encountered was customer
> >>complaints about performance of instances during migration.  We did a little
> >>bit of work to identify the cause of this and concluded that the main issues
> >>was disk i/o contention.  I wonder if this is something you or others have
> >>encountered?  I'd be interested in any idea for managing the rate of the
> >>migration processing to prevent it from adversely impacting the customer
> >>application performance.  I appreciate that if we throttle the migration
> >>processing it will take longer and may not be able to keep up with the rate
> >>of disk/memory change in the instance.
> >
> >I would not expect live migration to have an impact on disk I/O, unless
> >your storage is network based and using the same network as the migration
> >data. While migration is taking place you'll see a small impact on the
> >guest compute performance, due to page table dirty bitmap tracking, but
> >that shouldn't appear directly as disk I/O problem. There is no throttling
> >of guest I/O at all during migration.
> 
> Technically if you're doing a lot of disk I/O couldn't you end up with a
> case where you're thrashing the page cache enough to interfere with
> migration?  So it's actually memory change that is the problem, but it might
> not be memory that the application is modifying directly but rather memory
> allocated by the kernel.
> 
> >>Could you point me at somewhere I can get details of the tuneable setting
> >>relating to cutover down time please?  I'm assuming that at these are
> >>libvirt/qemu settings?  I'd like to play with them in our test environment
> >>to see if we can simulate busy instances and determine what works.  I'd also
> >>be happy to do some work to expose these in nova so the cloud operator can
> >>tweak if necessary?
> >
> >It is already exposed as 'live_migration_downtime' along with
> >live_migration_downtime_steps, and live_migration_downtime_delay.
> >Again, it shouldn't have any impact on guest performance while
> >live migration is taking place. It only comes into effect when
> >checking whether the guest is ready to switch to the new host.
> 
> Has anyone given thought to exposing some of these new parameters to the
> end-user?  I could see a scenario where an image might want to specify the
> acceptable downtime over migration.  (On the other hand that might be tricky
> from the operator perspective.)

I'm of the opinion that we should really try to avoid exposing *any*
migration tunables to the tenant user. All the tunables are pretty
hypervisor specific and low level and not very friendly to expose
to tenants. Instead our focus should be on ensuring that it will
always "just work" from the tenants POV.

When QEMU gets 'post copy' migration working, we'll want to adopt
that asap, as that will give us the means to guarantee that migration
will always complete with very little need for tuning.

At most I could see the users being able to give some high-level
indication as to whether their images tolerate some level of
latency, so Nova can decide what migration characteristic is
acceptable.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] live migration in Mitaka

2015-09-22 Thread Daniel P. Berrange
On Tue, Sep 22, 2015 at 09:29:46AM -0600, Chris Friesen wrote:
> >>There is also work on post-copy migration in QEMU. Normally with live
> >>migration, the guest doesn't start executing on the target host until
> >>migration has transferred all data. There are many workloads where that
> >>doesn't work, as the guest is dirtying data too quickly, With post-copy you
> >>can start running the guest on the target at any time, and when it faults
> >>on a missing page that will be pulled from the source host. This is
> >>slightly more fragile as you risk loosing the guest entirely if the source
> >>host dies before migration finally completes. It does guarantee that
> >>migration will succeed no matter what workload is in the guest. This is
> >>probably N cycle material.
> 
> It seems to me that the ideal solution would be to start doing pre-copy
> migration, then if that doesn't converge with the specified downtime value
> then maybe have the option to just cut over to the destination and do a
> post-copy migration of the remaining data.

Yes, that is precisely what the QEMU developers working on this
feature suggest we should do. The lazy page faulting on the target
host has a performance hit on the guest, so you definitely need
to give a little time for pre-copy to start off with, and then
switch to post-copy once some benchmark is reached, or if progress
info shows the transfer is not making progress.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] live migration in Mitaka

2015-09-21 Thread Daniel P. Berrange
On Fri, Sep 18, 2015 at 05:47:31PM +, Carlton, Paul (Cloud Services) wrote:
> However the most significant impediment we encountered was customer
> complaints about performance of instances during migration.  We did a little
> bit of work to identify the cause of this and concluded that the main issues
> was disk i/o contention.  I wonder if this is something you or others have
> encountered?  I'd be interested in any idea for managing the rate of the
> migration processing to prevent it from adversely impacting the customer
> application performance.  I appreciate that if we throttle the migration
> processing it will take longer and may not be able to keep up with the rate
> of disk/memory change in the instance.

I would not expect live migration to have an impact on disk I/O, unless
your storage is network based and using the same network as the migration
data. While migration is taking place you'll see a small impact on the
guest compute performance, due to page table dirty bitmap tracking, but
that shouldn't appear directly as disk I/O problem. There is no throttling
of guest I/O at all during migration.

> Could you point me at somewhere I can get details of the tuneable setting
> relating to cutover down time please?  I'm assuming that at these are
> libvirt/qemu settings?  I'd like to play with them in our test environment
> to see if we can simulate busy instances and determine what works.  I'd also
> be happy to do some work to expose these in nova so the cloud operator can
> tweak if necessary?

It is already exposed as 'live_migration_downtime' along with
live_migration_downtime_steps, and live_migration_downtime_delay.
Again, it shouldn't have any impact on guest performance while
live migration is taking place. It only comes into effect when
checking whether the guest is ready to switch to the new host.
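
Roughly speaking the permitted downtime is ramped up in stages; as an
illustration only (the exact algorithm in the driver may differ):

    def downtime_steps(max_downtime_ms, steps, delay_s):
        """Yield (wait_seconds, downtime_ms) pairs, ramping up to the max."""
        base = max(1, max_downtime_ms // steps)
        for i in range(1, steps + 1):
            yield (delay_s, min(max_downtime_ms, base * i))

    # e.g. live_migration_downtime=500, _steps=10, _delay=75 would give ten
    # stages, waiting 75 seconds between each and ending at 500ms downtime.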

> I understand that you have added some functionality to the nova compute
> manager to collect data on migration progress and emit this to the log file.
> I'd like to propose that we extend this to emit notification message
> containing progress information so a cloud operator's orchestration can
> consume these events and use them to monitor progress of individual
> migrations.  This information could be used to generate alerts or tickets so
> that support staff can intervene.  The smarts in qemu to help it make
> progress are very welcome and necessary but in my experience the cloud
> operator needs to be able to manage these and if it is necessary to slow
> down or even pause a customer's instance to complete the migration the cloud
> operator may need to gain customer consent before proceeding.

We already update the Nova instance object's 'progress' value with the
info on the migration progress. IIRC, this is visible via 'nova show '
or something like that.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] live migration in Mitaka

2015-09-18 Thread Daniel P. Berrange
On Fri, Sep 18, 2015 at 11:53:05AM +, Murray, Paul (HP Cloud) wrote:
> Hi All,
> 
> There are various efforts going on around live migration at the moment:
> fixing up CI, bug fixes, additions to cover more corner cases, proposals
> for new operations
> 
> Generally live migration could do with a little TLC (see: [1]), so I am
> going to suggest we give some of that care in the next cycle.
> 
> Please respond to this post if you have an interest in this and what you
> would like to see done. Include anything you are already getting on with
> so we get a clear picture. If there is enough interest I'll put this
> together as a proposal for a work stream. Something along the lines of
> "robustify live migration".

We merged some robustness improvements for migration during Liberty.
Specifically, with KVM we now track the progress of data transfer
and if it is not making forward progress during a set window of
time, we will abort the migration. This ensures you don't get a
migration that never ends. We also now have code which dynamically
increases the max permitted downtime during switchover, to try and
make it more likely to succeed. We could do with getting feedback
on how well the various tunable settings work in practice for real-world
deployments, to see if we need to change any defaults.
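
The shape of that progress check is roughly as below - an illustration
only, not the merged code; the field name comes from libvirt's job
statistics API:

    import time

    def watch_migration(dom, window_s=150, interval_s=5):
        """Abort the migration job if the remaining data stops shrinking."""
        best = None
        best_at = time.time()
        while dom.isActive():
            stats = dom.jobStats()
            remaining = stats.get('data_remaining')
            if remaining is None:        # no migration job in flight
                return
            if best is None or remaining < best:
                best, best_at = remaining, time.time()
            elif time.time() - best_at > window_s:
                dom.abortJob()           # stuck: give up rather than run forever
                return
            time.sleep(interval_s)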

There was a proposal to nova to allow the 'pause' operation to be
invoked while migration was happening. This would turn a live
migration into a coma-migration, thereby ensuring it succeeds.
I can't remember if this merged or not, as I can't find the review
offhand, but it's important to have this ASAP IMHO, as when
evacuating VMs from a host admins need a knob to use to force
successful evacuation, even at the cost of pausing the guest
temporarily.

In libvirt upstream we now have the ability to filter what disks are
migrated during block migration. We need to leverage that new feature
to fix the long standing problems of block migration when non-local
images are attached - eg cinder volumes. We definitely want this
in Mitaka.

We should look at what we need to do to isolate the migration data
network from the main management network. Currently we live
migrate over whatever network is associated with the compute hosts
primary hostname / IP address. This is not necessarily the fastest
NIC on the host. We ought to be able to record an alternative
hostname / IP address against each compute host to indicate the
desired migration interface.
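
i.e. something as simple as the following, with a hypothetical per-host
migration address recorded alongside the existing hostname:

    def migration_uri(dest_host, dest_migration_addr=None):
        # Prefer the dedicated migration address recorded for the destination
        # host, falling back to its primary hostname / IP address.
        return 'qemu+tcp://%s/system' % (dest_migration_addr or dest_host)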

Libvirt/KVM have the ability to turn on compression for migration
which again improves the chances of convergence & thus success.
We would look at leveraging that.

QEMU has a crude "auto-converge" flag you can turn on, which limits
guest CPU execution time, in an attempt to slow down the data dirtying
rate and again improve the chance of successful convergence.
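
Both of those are just extra migration flags as far as libvirt is
concerned, e.g. (a sketch only - how or whether Nova exposes them as
config is a separate question):

    import libvirt

    flags = (libvirt.VIR_MIGRATE_LIVE |
             libvirt.VIR_MIGRATE_PEER2PEER |
             libvirt.VIR_MIGRATE_COMPRESSED |     # compress migration data
             libvirt.VIR_MIGRATE_AUTO_CONVERGE)   # throttle guest CPU to converge

    # ... which would then be passed in via the migration call, e.g.
    # dom.migrateToURI2(dconnuri, miguri, dxml, flags, dname, bandwidth)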

I'm working on enhancements to QEMU itself to support TLS encryption
for migration. This will enable OpenStack to have a secure migration
data stream, without having to tunnel via libvirtd. This is useful
as tunnelling via libvirtd doesn't work with block migration. It will
also be much faster than tunnelling. This might be merged in QEMU
before the Mitaka cycle ends, but more likely it is Nxxx cycle material.

There is also work on post-copy migration in QEMU. Normally with
live migration, the guest doesn't start executing on the target
host until migration has transferred all data. There are many
workloads where that doesn't work, as the guest is dirtying data
too quickly. With post-copy you can start running the guest on the
target at any time, and when it faults on a missing page that will
be pulled from the source host. This is slightly more fragile as
you risk losing the guest entirely if the source host dies before
migration finally completes. It does guarantee that migration will
succeed no matter what workload is in the guest. This is probably
N cycle material.

Testing. Testing. Testing.

Lots more I can't think of right now

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][neutron][SR-IOV] Hardware changes and shifting PCI addresses

2015-09-15 Thread Daniel P. Berrange
On Mon, Sep 14, 2015 at 09:34:31PM -0400, Jay Pipes wrote:
> On 09/10/2015 05:23 PM, Brent Eagles wrote:
> >Hi,
> >
> >I was recently informed of a situation that came up when an engineer
> >added an SR-IOV nic to a compute node that was hosting some guests that
> >had VFs attached. Unfortunately, adding the card shuffled the PCI
> >addresses causing some degree of havoc. Basically, the PCI addresses
> >associated with the previously allocated VFs were no longer valid.
> >
> >I tend to consider this a non-issue. The expectation that hosts have
> >relatively static hardware configuration (and kernel/driver configs for
> >that matter) is the price you pay for having pets with direct hardware
> >access. That being said, this did come as a surprise to some of those
> >involved and I don't think we have any messaging around this or advice
> >on how to deal with situations like this.
> >
> >So what should we do? I can't quite see altering OpenStack to deal with
> >this situation (or even how that could work). Has anyone done any
> >research into this problem, even if it is how to recover or extricate
> >a guest that is no longer valid? It seems that at the very least we
> >could use some stern warnings in the docs.
> 
> Hi Brent,
> 
> Interesting issue. We have code in the PCI tracker that ostensibly handles
> this problem:
> 
> https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L145-L164
> 
> But the note from yjiang5 is telling:
> 
> # Pci properties may change while assigned because of
> # hotplug or config changes. Although normally this should
> # not happen.
> # As the devices have been assigned to a instance, we defer
> # the change till the instance is destroyed. We will
> # not sync the new properties with database before that.
> # TODO(yjiang5): Not sure if this is a right policy, but
> # at least it avoids some confusion and, if
> # we can add more action like killing the instance
> # by force in future.
> 
> Basically, if the PCI device tracker notices that an instance is assigned a
> PCI device with an address that no longer exists in the PCI device addresses
> returned from libvirt, it will (eventually, in the _free_instance() method)
> remove the PCI device assignment from the Instance object, but it will make
> no attempt to assign a new PCI device that meets the original PCI device
> specification in the launch request.
> 
> Should we handle this case and attempt a "hot re-assignment of a PCI
> device"? Perhaps. Is it high priority? Not really, IMHO.

Hotplugging new PCI devices to a running host should not have any impact
on existing PCI device addresses - it'll merely add new addresses for
new devices - existing devices are unchanged. So everything should "just
work" in that case. IIUC, Brent's Q was around turning off the host and
cold-plugging/unplugging hardware, which /is/ liable to arbitrarily
re-arrange existing PCI device addresses.

> If you'd like to file a bug against Nova, that would be cool, though.

I think it is explicitly out of scope for Nova to deal with this
scenario.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][neutron][SR-IOV] Hardware changes and shifting PCI addresses

2015-09-15 Thread Daniel P. Berrange
On Thu, Sep 10, 2015 at 06:53:06PM -0230, Brent Eagles wrote:
> Hi,
> 
> I was recently informed of a situation that came up when an engineer
> added an SR-IOV nic to a compute node that was hosting some guests that
> had VFs attached. Unfortunately, adding the card shuffled the PCI
> addresses causing some degree of havoc. Basically, the PCI addresses
> associated with the previously allocated VFs were no longer valid.

This seems to be implying that they took the host offline to make
hardware changes, and then tried to re-start the originally running
guests directly, without letting the scheduler re-run.

If correct, then IMHO that is an unsupported approach. After making
any hardware changes you should essentially consider that to be a
new compute host. There is no expectation that previously running
guests on that host can be restarted. You must let the compute
host report its new hardware capabilities, and let the scheduler
place guests on it from scratch, using the new PCI address info.

> I tend to consider this a non-issue. The expectation that hosts have
> relatively static hardware configuration (and kernel/driver configs for
> that matter) is the price you pay for having pets with direct hardware
> access. That being said, this did come as a surprise to some of those
> involved and I don't think we have any messaging around this or advice
> on how to deal with situations like this.
> 
> So what should we do? I can't quite see altering OpenStack to deal with
> this situation (or even how that could work). Has anyone done any
> research into this problem, even if it is how to recover or extricate
> a guest that is no longer valid? It seems that at the very least we
> could use some stern warnings in the docs.

Taking a host offline for maintenance should be considered
equivalent to throwing away the existing host and deploying a new
host. There should be zero state carry-over from an OpenStack POV,
since both the software and hardware changes can potentially
invalidate previous information used by the scheduler for deploying
on that host.  The idea of recovering a previously running guest
should be explicitly unsupported.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] What is the no_device flag for in block device mapping?

2015-09-09 Thread Daniel P. Berrange
On Tue, Sep 08, 2015 at 05:32:29PM +, Murray, Paul (HP Cloud) wrote:
> Hi All,
> 
> I'm wondering what the "no_device" flag is used for in the block device
> mappings. I had a dig around in the code but couldn't figure out why it
> is there. The name suggests an obvious meaning, but I've learnt not to
> guess too much from names.
> 
> Any pointers welcome.

I was going to suggest reading the docs

  http://docs.openstack.org/developer/nova/block_device_mapping.html

but they don't mention 'no_device' at all :-(

When we find out what it actually means we should document it there :-)

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] testing for setting the admin password via the libvirt driver

2015-08-28 Thread Daniel P. Berrange
On Tue, Aug 25, 2015 at 09:14:33AM -0500, Matt Riedemann wrote:
 Support to change the admin password on an instance via the libvirt driver
 landed in liberty [1] but the hypervisor support matrix wasn't updated [2].
 There is a version restriction in the driver that it won't work unless
 you're using at least libvirt 1.2.16.
 
 We should be able to at least update the hypervisor support matrix that this
 is supported for libvirt with the version restriction.  markus_z actually
 pointed that out in the review of the change to add the support but it was
 ignored.

Yes, in that case, it'd be appropriate to update the support matrix and
add in a footnote against it mentioning the min required version.

 The other thing I was wondering about was testing.  The check/gate queue
 jobs with ubuntu 14.04 only have libvirt 1.2.2.
 
 There is the fedora 21 job that runs on the experimental queue and I've
 traditionally considered this a place to test out libvirt driver features
 that need something newer than 1.2.2, but that only goes up to libvirt
 1.2.9.3 [3].
 
 It looks like you have to get up to fedora 23 to be able to test this
 set-admin-password function [4].  In fact it looks like the only major
 distro out there right now that supports this new enough version of libvirt
 is fc23 [5].
 
 Does anyone fancy getting a f23 job setup in the experimental queue for
 nova?  It would be nice to actually be able to test the bleeding edge
 features that we put into the driver code.

F23 is not released yet so it may have instability which will hamper
running gate jobs. The other alternative is to set up a stable Fedora
release like F22, and then enable the VirtPreview repository, which gives
you a newer set of the virt toolchain from F23/rawhide. This should be
more stable than running the entire F23/rawhide distro.

https://fedoraproject.org/wiki/Virtualization_Preview_Repository

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] should we allow overcommit for a single VM?

2015-08-24 Thread Daniel P. Berrange
On Mon, Aug 17, 2015 at 01:22:28PM -0600, Chris Friesen wrote:
 
 I tried bringing this up on the irc channel, but nobody took the bait.
 Hopefully this will generate some discussion.
 
 I just filed bug 1485631.  Nikola suggested one way of handling it, but
 there are some complications that I thought I should highlight so we're all
 on the same page.
 
 The basic question is, if a host has X CPUs in total for VMs, and a single
 instance wants X+1 vCPUs, should we allow it?  (Regardless of overcommit
 ratio.)  There is also an equivalent question for RAM.
 
 Currently we have two different answers depending on whether numa topology
 is involved or not.  Should we change one of them to make it consistent with
 the other?  If so, a) which one should we change, and b) how would we do
 that given that it results in a user-visible behaviour change?  (Maybe a
 microversion, even though the actual API doesn't change, just whether the
 request passes the scheduler filter or not?)

I agree with Nikola that the NUMA implementation is the correct one. The
existence of overcommit is motivated by the idea that most users will not
in fact consume all the resources allocated to their VM all of the time,
and thus on average you don't need to reserve 100% of resources for every
single VM. Users will usually be able to burst up to 100% of their
allocation when needed, if some portion of other users are mostly inactive.

If you allow a single VM to overcommit against itself though, this breaks
down. It is never possible for that single VM to burst to consume 100%
of the resources allocated to it, since the host is physically incapable
of providing that much resource.

On that basis, I think the correct behaviour is to consider overcommit
to be a factor that applies across a set of VMs only: never allow a
single VM to overcommit against itself. This is what the NUMA code
in the libvirt driver currently implements. I think we should align the
non-NUMA codepath with this too.
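
To put that policy in concrete terms (a sketch of the filter check with
illustrative names, not the actual scheduler code):

    def fits_on_host(requested_vcpus, host_pcpus, used_vcpus, allocation_ratio):
        # Overcommit only applies across the set of VMs on a host. A single
        # VM asking for more than the host physically has can never burst
        # to 100% of its allocation, so reject it regardless of the ratio.
        if requested_vcpus > host_pcpus:
            return False
        return used_vcpus + requested_vcpus <= host_pcpus * allocation_ratio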

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [openstack][nova] Streamlining of config options in nova

2015-08-13 Thread Daniel P. Berrange
On Wed, Aug 12, 2015 at 07:20:24PM +0200, Markus Zoeller wrote:
 Another thing which makes it hard to understand the impact of the config
 options is, that it's not clear how the interdependency to other config 
 options is. As an example, the serial_console.base_url has a 
 dependency to DEFAULT.cert and DEFAULT.key if you want to use 
 secured websockets (base_url=wss://...). Another one is the option
 serial_console.serialproxy_port. This port number must be the same
 as it is in serial_console.base_url. I couldn't find an explanation to
 this.
 
 The three questions I have with every config option:
 1) which service(s) access this option?
 2) what does it do? / what's the impact? 
 3) which other options do I need to tweek to get the described impact?
 
 Would it make sense to stage the changes?
 M cycle: move the config options out of the modules to another place
  (like the approach Sean proposed) and annotate them with
  the services which uses them
 N cycle: inject the options into the drivers and eliminate the global
  variables this way (like Daniel et al. proposed)

The problem I see is that as long as we're using config options as
global variables, figuring out which services use which options is
a major, non-trivial effort. Some may be easy to figure out, but
with many it gets into quite involved call path analysis, and the
usage is changing under your feet as new reviews are posted. So
personally I think it would be more practical to do the reverse,
i.e. stop using the config options as global variables, and then
split up the config file so that we have a separate one for each
service:

i.e. an /etc/nova/nova-compute.conf, and get rid of /etc/nova/nova.conf.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Hyper-V 2008 R2 support

2015-08-04 Thread Daniel P. Berrange
On Tue, Aug 04, 2015 at 02:34:19PM +, Alessandro Pilotti wrote:
 Hi guys,
 
 Just a quick note on the Windows editions support matrix updates for the Nova
 Hyper-V driver and Neutron networking-hyperv ML2 agent:  
 
 We are planning to drop legacy Windows Server / Hyper-V Server 2008 R2 support
 starting with Liberty.
 
 Windows Server / Hyper-V Server 2012 and above will continue to be supported.

What do you mean precisely by "drop support" here? Are you merely no longer
testing it, or is Nova actually broken with Hyper-V 2k8 R2 in Liberty?

Generally if we intend to drop a hypervisor platform we'd expect to have a
deprecation period of one cycle, where Nova would print out a warning message
on startup to alert administrators who are using the platform that is intended
to be dropped. This gives them time to plan a move to a newer platform
before we drop the support.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [openstack][nova] Streamlining of config options in nova

2015-07-27 Thread Daniel P. Berrange
On Fri, Jul 24, 2015 at 09:48:15AM +0100, Daniel P. Berrange wrote:
 On Thu, Jul 23, 2015 at 05:55:36PM +0300, mhorban wrote:
  Hi all,
  
  During development process in nova I faced with an issue related with config
  options. Now we have lists of config options and registering options mixed
  with source code in regular files.
  From one side it can be convenient: to have module-encapsulated config
  options. But problems appear when we need to use some config option in
  different modules/packages.
  
  If some option is registered in module X and module X imports module Y for
  some reasons...
  and in one day we need to import this option in module Y we will get
  exception
  NoSuchOptError on import_opt in module Y.
  Because of circular dependency.
  To resolve it we can move registering of this option in Y module(in the
  inappropriate place) or use other tricks.
  
  I offer to create file options.py in each package and move all package's
  config options and registration code there.
  Such approach allows us to import any option in any place of nova without
  problems.
  
  Implementations of this refactoring can be done piece by piece where piece
  is
  one package.
  
  What is your opinion about this idea?
 
 I tend to think that focusing on problems with dependancy ordering when
 modules import each others config options is merely attacking a symptom
 of the real root cause problem.
 
 The way we use config options is really entirely wrong. We have gone
 to the trouble of creating (or trying to create) structured code with
 isolated functional areas, files and object classes, and then we throw
 in these config options which are essentially global variables which are
 allowed to be accessed by any code anywhere. This destroys the isolation
 of the various classes we've created, and means their behaviour often
 based on side effects of config options from unrelated pieces of code.
 It is total madness in terms of good design practices to have such use
 of global variables.
 
 So IMHO, if we want to fix the real big problem with config options, we
 need to be looking to solution where we stop using config options as
 global variables. We should change our various classes so that the
 neccessary configurable options as passed into object constructors
 and/or methods as parameters.
 
 As an example in the libvirt driver.
 
 I would set it up so that /only/ the LibvirtDriver class in driver.py
 was allowed to access the CONF config options. In its constructor it
 would load all the various config options it needs, and either set
 class attributes for them, or pass them into other methods it calls.
 So in the driver.py, instead of calling CONF.libvirt.libvirt_migration_uri
 everywhere in the code,  in the constructor we'd save that config param
 value to an attribute 'self.mig_uri = CONF.libvirt.libvirt_migration_uri'
 and then where needed, we'd just call self.mig_uri.
 
 Now in the various other libvirt files, imagebackend.py, volume.py
 vif.py, etc. None of those files would /ever/ access CONF.*. Any time
 they needed a config parameter, it would be passed into their constructor
 or method, by the LibvirtDriver or whatever invoked them.
 
 Getting rid of the global CONF object usage in all these files trivially
 now solves the circular dependancy import problem, as well as improving
 the overall structure and isolation of our code, freeing all these methods
 from unexpected side-effects from global variables.

Another significant downside of using CONF objects as global variables
is that it is largely impossible to say which nova.conf setting is
used by which service. Figuring out whether a setting affects nova-compute
or nova-api or nova-conductor, or ... largely comes down to guesswork or
reliance on tribal knowledge. It would make life significantly easier for
both developers and administrators if we could clear this up and in fact
have separate configuration files for each service, holding only the
options that are relevant for that service.  Such a cleanup is not going
to be practical though as long as we're using global variables for config
as it requires control-flow analysis to find out what affects what :-(

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [openstack][nova] Streamlining of config options in nova

2015-07-24 Thread Daniel P. Berrange
On Thu, Jul 23, 2015 at 05:55:36PM +0300, mhorban wrote:
 Hi all,
 
 During the development process in nova I ran into an issue related to
 config options. Currently we have lists of config options, and the code
 that registers them, mixed in with the source code in regular files.
 On one hand this can be convenient: config options are encapsulated in
 their modules. But problems appear when we need to use some config
 option in different modules/packages.
 
 If some option is registered in module X, and module X imports module Y
 for some reason, and one day we need to import this option in module Y,
 we will get a NoSuchOptError exception on import_opt in module Y because
 of the circular dependency.
 To resolve it we can move the registration of this option into module Y
 (an inappropriate place) or use other tricks.
 
 I propose to create a file options.py in each package and move all of
 the package's config options and registration code there.
 Such an approach allows us to import any option anywhere in nova
 without problems.
 
 Implementation of this refactoring can be done piece by piece, where
 a piece is one package.
 
 What is your opinion about this idea?
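
To make the failure mode in that message concrete, here is a minimal
sketch of two hypothetical modules using oslo.config (module and option
names are invented for illustration; this is not actual nova code):

# nova/foo.py
from oslo_config import cfg

from nova import bar  # foo genuinely needs bar for unrelated reasons

CONF = cfg.CONF
foo_opts = [cfg.StrOpt('widget_name', default='widget')]
CONF.register_opts(foo_opts)

# nova/bar.py
from oslo_config import cfg

CONF = cfg.CONF
# bar later grows a need for foo's option. import_opt() imports
# nova.foo to make sure the option is registered, but nova.foo is
# already mid-import (it imports nova.bar near the top, before its
# register_opts() call has run), so the option is missing and this
# raises NoSuchOptError.
CONF.import_opt('widget_name', 'nova.foo')

The per-package options.py proposal aims to sidestep this by putting
the registration in a module that imports nothing else from nova, so
import_opt can always find the option already registered.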

I tend to think that focusing on problems with dependency ordering when
modules import each other's config options is merely attacking a
symptom of the real root-cause problem.

The way we use config options is really entirely wrong. We have gone
to the trouble of creating (or trying to create) structured code with
isolated functional areas, files and object classes, and then we throw
in these config options, which are essentially global variables that
any code anywhere is allowed to access. This destroys the isolation
of the various classes we've created, and means their behaviour is
often based on side effects of config options from unrelated pieces
of code. In terms of good design practice it is total madness to have
such use of global variables.

So IMHO, if we want to fix the real, big problem with config options,
we need to be looking at a solution where we stop using config options
as global variables. We should change our various classes so that the
necessary configurable options are passed into object constructors
and/or methods as parameters.

As an example, take the libvirt driver.

I would set it up so that /only/ the LibvirtDriver class in driver.py
was allowed to access the CONF config options. In its constructor it
would load all the various config options it needs, and either set
class attributes for them or pass them into other methods it calls.
So in driver.py, instead of calling CONF.libvirt.libvirt_migration_uri
everywhere in the code, in the constructor we'd save that config param
value to an attribute, 'self.mig_uri = CONF.libvirt.libvirt_migration_uri',
and then use self.mig_uri wherever it is needed.

Now, in the various other libvirt files (imagebackend.py, volume.py,
vif.py, etc.), none of them would /ever/ access CONF.*. Any time they
needed a config parameter, it would be passed into their constructor
or method by the LibvirtDriver or whatever invoked them.
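
To make that concrete, here is a minimal, self-contained sketch of the
pattern (the option definitions and the VifPlugger class are invented
for illustration; only the migration URI name comes from the text
above, and none of this is actual nova code):

from oslo_config import cfg

# Illustrative option definitions; in real nova these live elsewhere.
libvirt_opts = [
    cfg.StrOpt('libvirt_migration_uri', default='qemu+tcp://%s/system',
               help='URI template used when migrating to another host.'),
    cfg.IntOpt('network_device_mtu', default=1500,
               help='MTU to configure on guest network devices.'),
]

CONF = cfg.CONF
CONF.register_opts(libvirt_opts, group='libvirt')


class VifPlugger(object):
    """Never touches CONF; everything it needs is passed in."""

    def __init__(self, mtu):
        self.mtu = mtu

    def plug(self, vif_name):
        return '%s configured with mtu=%d' % (vif_name, self.mtu)


class LibvirtDriver(object):
    """The only class allowed to read the global CONF."""

    def __init__(self, conf=CONF):
        # Read config once, here, and hand plain values downwards.
        self.mig_uri = conf.libvirt.libvirt_migration_uri
        self.vif_plugger = VifPlugger(mtu=conf.libvirt.network_device_mtu)

    def migration_uri_for(self, host):
        return self.mig_uri % host


if __name__ == '__main__':
    CONF([])  # use defaults; no config file or command line needed
    driver = LibvirtDriver()
    print(driver.migration_uri_for('dest-host'))
    print(driver.vif_plugger.plug('vnet0'))

A nice side effect is testability: VifPlugger can be unit tested by
passing mtu=9000 directly, with no config fixtures or global state
involved.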

Getting rid of the global CONF object usage in all these files now
trivially solves the circular dependency import problem, as well as
improving the overall structure and isolation of our code, freeing
all these methods from unexpected side effects of global variables.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [openstack][nova] Streamlining of config options in nova

2015-07-24 Thread Daniel P. Berrange
On Thu, Jul 23, 2015 at 11:57:01AM -0500, Michael Still wrote:
 In fact, I did an example of what I thought it would look like already:
 
 https://review.openstack.org/#/c/205154/
 
 I welcome discussion on this, especially from people who couldn't make
 it to the mid-cycle. It's up to y'all if you do that on this thread or
 in that review.

I think this kind of thing needs to have a spec proposed for it, so we
can go through the details of the problem and the design considerations.
This is especially true considering this proposal comes out of an f2f
meeting where the majority of the community was not present to
participate in the discussion.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

