Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary

Alon Bar-Lev Tue, 27 Nov 2012 00:54:08 -0800


----- Original Message -----
> From: "Livnat Peer" <lp...@redhat.com>
> To: "Adam Litke" <a...@us.ibm.com>
> Cc: "Alon Bar-Lev" <abar...@redhat.com>, "VDSM Project Development" 
> <vdsm-devel@lists.fedorahosted.org>
> Sent: Tuesday, November 27, 2012 10:42:00 AM
> Subject: Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary
> 
> On 26/11/12 16:59, Adam Litke wrote:
> > On Mon, Nov 26, 2012 at 02:57:19PM +0200, Livnat Peer wrote:
> >> On 26/11/12 03:15, Shu Ming wrote:
> >>> Livnat,
> >>>
> >>> Thanks for your summary.  I got comments below.
> >>>
> >>> 2012-11-25 18:53, Livnat Peer:
> >>>> Hi All,
> >>>> We have been discussing $subject for a while and I'd like to
> >>>> summarize
> >>>> what we agreed and disagreed on thus far.
> >>>>
> >>>> The way I see it there are two related discussions:
> >>>>
> >>>>
> >>>> 1. Getting VDSM networking stack to be distribution agnostic.
> >>>> - We are all in agreement that VDSM API should be generic enough
> >>>> to
> >>>> incorporate multiple implementations. (discussed on this thread:
> >>>> Alon's
> >>>> suggestion, Mark's patch for adding support for netcf etc.)
> >>>>
> >>>> - We would like to maintain at least one implementation as the
> >>>> working/up-to-date implementation for our users, this
> >>>> implementation
> >>>> should be distribution agnostic (as we all acknowledge this is
> >>>> an
> >>>> important goal for VDSM).
> >>>> I also think that with the agreement of this community we can
> >>>> choose to
> >>>> change our focus, from time to time, from one implementation to
> >>>> another
> >>>> as we see fit (today it can be OVS+netcf and in a few months
> >>>> we'll use
> >>>> the quantum based implementation if we agree it is better)
> >>>>
> >>>> 2. The second discussion is about persisting the network
> >>>> configuration
> >>>> on the host vs. dynamically retrieving it from a centralized
> >>>> location
> >>>> like the engine. Danken raised a concern that even if going with
> >>>> the
> >>>> dynamic approach the host should persist the management network
> >>>> configuration.
> >>>
> >>> About dynamically retrieving from a centralized location, when
> >>> will the
> >>> retrieving start? Just in the very early stage of host booting
> >>> before
> >>> network functions?  Or after the host startup and in the normal
> >>> running
> >>> state of the host?  Before retrieving the configuration,  how
> >>> does the
> >>> host network connect to the engine? I think we need a basic
> >>> well
> >>> known network between hosts and the engine first.  Then after the
> >>> retrieving, hosts should reconfigure the network for later
> >>> management.
> >>> However, the timing of the retrieval and reconfiguration is challenging.
> >>>
> >>
> >> We did not discuss the dynamic approach in detail on the list so
> >> far
> >> and I think this is a good opportunity to start this discussion...
> >>
> >> From what was discussed previously I can say that the need for a
> >> well
> >> known network was raised by danken, it was referred to as the
> >> management
> >> network, this network would be used for pulling the full host
> >> network
> >> configuration from the centralized location, at this point the
> >> engine.
> >>
> >> About the timing for retrieving the configuration, there are
> >> several
> >> approaches. One of them was described by Alon, and I think he'll
> >> join
> >> this discussion and maybe put it in his own words, but the idea
> >> was to
> >> 'keep' the network synchronized at all times. When the host has a
> >> communication channel to the engine and the engine detects there
> >> is a
> >> mismatch in the host configuration, the engine initiates an 'apply
> >> network
> >> configuration' action on the host.
> >>
> >> Using this approach we'll have a single path of code to maintain
> >> and
> >> that would reduce code complexity and bugs - That's quoting Alon
> >> Bar Lev
> >> (Alon, I hope I did not twist your words/idea).
> >>
> >> On the other hand the above approach makes local tweaks on the
> >> host
> >> (done manually by the administrator) much harder.
> > 
> > I worry a lot about the above if we take the dynamic approach.  It
> > seems we'd
> > need to introduce before/after 'apply network configuration' hooks
> > where the
> > admin could add custom config commands that aren't yet modeled by
> > engine.
> > 
> 
> yes, and I'm not sure the administrators would like the fact that we
> are
> 'forcing' them to write everything in a script and get familiar
> with
> the VDSM hooking mechanism (which in some cases requires the use of custom
> properties on the engine level) instead of running a simple command
> line.


In which case would we be forcing them? Please be more specific.
If we can pass most of the iproute2, brctl and bonding parameters as key/value
pairs via the API, what, in your view, is commonly or even seldom needed that
would still require a hook?
The hook mechanism is only a fallback, provided to calm people down.

> 
> >> Any other approaches ?
> > 
> > Static configuration has the advantage of allowing a host to bring
> > itself back
> > online independent of the engine.  This is also useful for anyone
> > who may want
> > to deploy a vdsm node in standalone mode.
> > 
> > I think it would be possible to easily support a quasi-static
> > configuration mode
> > simply by extending the design of the dynamic approach slightly.
> >  In dynamic
> > mode, the network configuration is passed down as a well-defined
> > data structure.
> > When a particular configuration has been committed, vdsm could
> > write a copy of
> > that configuration data structure to
> > /var/run/vdsm/network-config.json.  During
> > a subsequent boot, if the engine cannot be contacted after
> > activating the
> > management network, the cached configuration can be applied using
> > the same code
> > as for dynamic mode.  We'd have to flesh out the circumstances
> > under which this
> > would happen.
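
To make the quasi-static idea above concrete, here is a minimal Python sketch.
It is only an illustration under assumptions, not VDSM code: the function
names, the fetch_from_engine callable and the shape of the configuration
(a dict keyed by network name) are all hypothetical, and the example writes
to a temporary directory rather than to the path mentioned above.

```python
import json
import os
import tempfile


def save_config(config, path):
    """Write the committed configuration to `path` atomically."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(config, f, sort_keys=True)
    # Rename over the old copy so a crash never leaves a half-written file.
    os.replace(tmp_path, path)


def load_cached_config(path):
    """Return the cached configuration, or None if it is missing or corrupt."""
    try:
        with open(path) as f:
            return json.load(f)
    except (OSError, ValueError):
        return None


def config_for_boot(fetch_from_engine, path):
    """Prefer the engine's configuration; fall back to the cached copy.

    `fetch_from_engine` is a hypothetical callable that raises
    ConnectionError when the engine cannot be reached.
    """
    try:
        return fetch_from_engine()
    except ConnectionError:
        return load_cached_config(path)


# Example: the engine is unreachable, so the cached copy is used.
cache = os.path.join(tempfile.mkdtemp(), "network-config.json")
save_config({"mgmt": {"nic": "eth0"}}, cache)


def engine_down():
    raise ConnectionError("engine unreachable")


recovered = config_for_boot(engine_down, cache)
```

Whichever source the data comes from, the result is the same data structure,
so the same 'apply' code path can consume it.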
> 
> I like this approach a lot but we need to consider that network
> configuration is an accumulated state, for example -
> 
> 1. The engine sends a setup-network command with the full host
> network
> configuration
> 2. The user configures new network on the host, the engine sends a
> new
> setup-network request to VDSM which includes only the delta requested
> by
> the user (adding the required network)
> 3. VDSM adds the new network

THIS IS COMPLEX!!!!!!!
Almost AI.
You need to complete the network settings with what you already know.

> and this can go on and on, for dealing with this issue:
> 
> We can either hold network-config.json per setup-network command and
> then for recovering the network configuration state we need to
> execute
> a chain of setup-network commands.
> 
> Or we can move the logic of calculating the delta from engine to VDSM
> and on each setup network have the engine pass the full
> configuration.
> The problem with that approach is that the analysis logic of the
> delta
> has to be done on the engine anyway to give a quick feedback to the
> user
> on the validity of his action.
> Maintaining this logic/code twice is not something we want (it's bad
> enough to do it once....)

I don't understand how the two algorithms are the same...
The UI is more verbose in some aspects and less verbose in others, while taking
the full configuration and converting it to actual settings is a completely
different sequence.
What feedback does the user need? As far as I understand, the user is only
interested in the end result: building his own network and expecting it to be
applied.
 
> A third option is to extend the current API of setup network to
> include
> the full configuration in addition to the delta that is sent today.
> The
> full configuration would be used for creating network-config.json and
> for that alone, VDSM would change network configuration according to
> the
> delta sent as it does today.

Always pass the full configuration; why deal with two cases?
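
As a rough illustration of why the full configuration is enough, the host can
derive the delta itself by comparing the desired state with what it currently
has. The sketch below is Python under stated assumptions: compute_delta, the
network names and the attribute keys are hypothetical and do not reflect the
actual setup-network API.

```python
def compute_delta(current, desired):
    """Compare two full configurations (dicts keyed by network name).

    Returns the networks to add, to update and to remove in order to
    move the host from `current` to `desired`.
    """
    to_add = {name: attrs for name, attrs in desired.items()
              if name not in current}
    to_update = {name: attrs for name, attrs in desired.items()
                 if name in current and current[name] != attrs}
    to_remove = sorted(name for name in current if name not in desired)
    return {"add": to_add, "update": to_update, "remove": to_remove}


# Example: one network is unchanged, one is dropped, one is new.
current = {"mgmt": {"nic": "eth0"}, "old": {"nic": "eth1", "vlan": 10}}
desired = {"mgmt": {"nic": "eth0"}, "storage": {"nic": "eth1", "vlan": 20}}
delta = compute_delta(current, desired)
```

With this shape there is a single case to handle: the sender always states the
end result, and the receiver works out the steps.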

> The problem with that approach is that I'm sure someone on the list
> would say it is a contamination to the API, and we should 'never'
> pass
> 'duplicate' information. Personally I find this option the easiest
> one
> to deal with the above issue.
> 

Livnat, I don't see any argument here for persistence vs. non-persistence, as the
above is common to any approach taken.

Only this "manual configuration" argument keeps popping up, which, as I wrote, is
irrelevant at large scale, and we do want to go to large scale.

Alon

> 
> Livnat
> 
> > 
> >>
> >> I'd like to add a more general question to the discussion what are
> >> the
> >> advantages of taking the dynamic approach?
> >> So far I collected two reasons:
> >>
> >> -It is a 'cleaner' design, removes complexity on VDSM code, easier
> >> to
> >> maintain going forward, and less bug prone (I agree with that one,
> >> as
> >> long as we keep the retrieving configuration mechanism/algorithm
> >> simple).
> >>
> >> -It adheres to the idea of having a stateless hypervisor - some
> >> more
> >> input on this point would be appreciated
> >>
> >> Any other advantages?
> >>
> >> discussing the benefits of having the persisted
> > 
> > As I mentioned above, the main benefit I see of having some sort of
> > persistent
> > configuration is:
> > 
> > - To allow the host to operate independently of the engine in
> > either a failure
> >   scenario or in a standalone configuration.
> > 
> 
> _______________________________________________
> vdsm-devel mailing list
> vdsm-devel@lists.fedorahosted.org
> https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
> 