----- Original Message -----
> From: "Livnat Peer" <lp...@redhat.com>
> To: "Adam Litke" <a...@us.ibm.com>
> Cc: "Alon Bar-Lev" <abar...@redhat.com>, "VDSM Project Development"
> <vdsm-devel@lists.fedorahosted.org>
> Sent: Tuesday, November 27, 2012 10:42:00 AM
> Subject: Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary
>
> On 26/11/12 16:59, Adam Litke wrote:
> > On Mon, Nov 26, 2012 at 02:57:19PM +0200, Livnat Peer wrote:
> >> On 26/11/12 03:15, Shu Ming wrote:
> >>> Livnat,
> >>>
> >>> Thanks for your summary. I got comments below.
> >>>
> >>> 2012-11-25 18:53, Livnat Peer:
> >>>> Hi All,
> >>>> We have been discussing $subject for a while and I'd like to
> >>>> summarize what we agreed and disagreed on thus far.
> >>>>
> >>>> The way I see it there are two related discussions:
> >>>>
> >>>>
> >>>> 1. Getting VDSM networking stack to be distribution agnostic.
> >>>> - We are all in agreement that VDSM API should be generic enough
> >>>> to incorporate multiple implementations. (discussed on this
> >>>> thread: Alon's suggestion, Mark's patch for adding support for
> >>>> netcf etc.)
> >>>>
> >>>> - We would like to maintain at least one implementation as the
> >>>> working/up-to-date implementation for our users, this
> >>>> implementation should be distribution agnostic (as we all
> >>>> acknowledge this is an important goal for VDSM).
> >>>> I also think that with the agreement of this community we can
> >>>> choose to change our focus, from time to time, from one
> >>>> implementation to another as we see fit (today it can be
> >>>> OVS+netcf and in a few months we'll use the quantum based
> >>>> implementation if we agree it is better)
> >>>>
> >>>> 2. The second discussion is about persisting the network
> >>>> configuration on the host vs. dynamically retrieving it from a
> >>>> centralized location like the engine.
> >>>> Danken raised a concern that even if going with the dynamic
> >>>> approach the host should persist the management network
> >>>> configuration.
> >>>
> >>> About dynamically retrieving from a centralized location, when
> >>> will the retrieving start? Just in the very early stage of host
> >>> booting before network functions? Or after the host startup and in
> >>> the normal running state of the host? Before retrieving the
> >>> configuration, how does the host network connect to the engine? I
> >>> think we need a basic well known network between hosts and the
> >>> engine first. Then after the retrieving, hosts should reconfigure
> >>> the network for later management.
> >>> However, the timing to retrieve and reconfigure is challenging.
> >>>
> >>
> >> We did not discuss the dynamic approach in detail on the list so
> >> far and I think this is a good opportunity to start this
> >> discussion...
> >>
> >> From what was discussed previously I can say that the need for a
> >> well known network was raised by danken, it was referred to as the
> >> management network, this network would be used for pulling the full
> >> host network configuration from the centralized location, at this
> >> point the engine.
> >>
> >> About the timing for retrieving the configuration, there are
> >> several approaches. One of them was described by Alon, and I think
> >> he'll join this discussion and maybe put it in his own words, but
> >> the idea was to 'keep' the network synchronized at all times. When
> >> the host has a communication channel to the engine and the engine
> >> detects there is a mismatch in the host configuration, the engine
> >> initiates an 'apply network configuration' action on the host.
> >>
> >> Using this approach we'll have a single path of code to maintain
> >> and that would reduce code complexity and bugs - That's quoting
> >> Alon Bar Lev (Alon I hope I did not twist your words/idea).
> >>
> >> On the other hand the above approach makes local tweaks on the host
> >> (done manually by the administrator) much harder.
> >
> > I worry a lot about the above if we take the dynamic approach. It
> > seems we'd need to introduce before/after 'apply network
> > configuration' hooks where the admin could add custom config
> > commands that aren't yet modeled by engine.
> >
>
> yes, and I'm not sure the administrators would like the fact that we
> are 'forcing' them to write everything in a script and getting
> familiar with VDSM hooking mechanism (which in some cases requires the
> use of custom properties on the engine level) instead of running a
> simple command line.

In which case will we force? Please be more specific.
If we can pass most of the iproute2, brctl, bond parameters as
key/value pairs via the API, what, in your view, is common or even
seldom used that would still require a hook?
This hook mechanism is only a fallback, provided to calm people down.

>
> >> Any other approaches ?
> >
> > Static configuration has the advantage of allowing a host to bring
> > itself back online independent of the engine. This is also useful
> > for anyone who may want to deploy a vdsm node in standalone mode.
> >
> > I think it would be possible to easily support a quasi-static
> > configuration mode simply by extending the design of the dynamic
> > approach slightly. In dynamic mode, the network configuration is
> > passed down as a well-defined data structure. When a particular
> > configuration has been committed, vdsm could write a copy of that
> > configuration data structure to /var/run/vdsm/network-config.json.
> > During a subsequent boot, if the engine cannot be contacted after
> > activating the management network, the cached configuration can be
> > applied using the same code as for dynamic mode. We'd have to flesh
> > out the circumstances under which this would happen.
>
> I like this approach a lot but we need to consider that network
> configuration is an accumulated state, for example -
>
> 1. The engine sends a setup-network command with the full host
> network configuration
> 2. The user configures new network on the host, the engine sends a
> new setup-network request to VDSM which includes only the delta
> requested by the user (adding the required network)
> 3. VDSM adds the new network

THIS IS COMPLEX!!!!!!! Almost AI.
As you need to complete the network setting with what you know.

> and this can go on and on, for dealing with this issue:
>
> We can either hold network-config.json per setup-network command and
> then for recovering the network configuration state we need to
> execute a chain of setup-network commands.
>
> Or we can move the logic of calculating the delta from engine to VDSM
> and on each setup network have the engine pass the full
> configuration.
> The problem with that approach is that the analysis logic of the
> delta has to be done on the engine anyway to give a quick feedback to
> the user on the validity of his action.
> Maintaining this logic/code twice is not something we want (it's bad
> enough to do it once....)

I don't understand how the two algorithms are the same...
The UI is more or less verbose in different aspects, while taking the
full configuration and converting it to actual settings is a completely
different sequence.
What is the feedback to the user? As far as I understand the user is
only interested in the end-result... building his own network and
expecting it to be applied.

> A third option is to extend the current API of setup network to
> include the full configuration in addition to the delta that is sent
> today. The full configuration would be used for creating
> network-config.json and for that alone, VDSM would change network
> configuration according to the delta sent as it does today.

Always pass full configuration, why deal with two cases?

> The problem with that approach is that I'm sure someone on the list
> would say it is a contamination to the API, and we should 'never'
> pass 'duplicate' information. Personally I find this option the
> easiest one to deal with the above issue.
>

Livnat, I don't see any argument of persistence vs non persistence as
the above is common to any approach taken.

Only this "manual configuration" argument keeps popping up, which as I
wrote is irrelevant in large scale and we do want to go into large
scale.

Alon

>
> Livnat
>
> >
> >>
> >> I'd like to add a more general question to the discussion what are
> >> the advantages of taking the dynamic approach?
> >> So far I collected two reasons:
> >>
> >> -It is a 'cleaner' design, removes complexity on VDSM code, easier
> >> to maintain going forward, and less bug prone (I agree with that
> >> one, as long as we keep the retrieving configuration
> >> mechanism/algorithm simple).
> >>
> >> -It adheres to the idea of having a stateless hypervisor - some
> >> more input on this point would be appreciated
> >>
> >> Any other advantages?
> >>
> >> discussing the benefits of having the persisted
> >
> > As I mentioned above, the main benefit I see of having some sort of
> > persistent configuration is:
> >
> > - To allow the host to operate independently of the engine in
> > either a failure scenario or in a standalone configuration.
> >
>
> _______________________________________________
> vdsm-devel mailing list
> vdsm-devel@lists.fedorahosted.org
> https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
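
For readers following the thread, here is a rough sketch of the
quasi-static mode and the delta problem discussed above. It is only an
illustration under stated assumptions, not VDSM code: the function
names, the delta format ("add"/"remove" keys) and the network names are
all hypothetical. Only the idea of caching the committed configuration
as JSON (for example in /var/run/vdsm/network-config.json) and falling
back to it when the engine cannot be reached comes from the messages
above.

```python
import json
import os
import tempfile

# All names below are hypothetical illustrations, not VDSM's actual API.


def apply_delta(config, delta):
    """Return a new full configuration with one delta folded in.

    Assumed delta format: {"add": {name: attrs}, "remove": [name, ...]}.
    """
    new_config = dict(config)
    for name in delta.get("remove", []):
        new_config.pop(name, None)
    new_config.update(delta.get("add", {}))
    return new_config


def persist_config(config, path):
    """Atomically write the committed full configuration as JSON."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(config, f, sort_keys=True)
    os.replace(tmp_path, path)


def load_cached_config(path):
    """Return the cached configuration, or None if no cache exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return None


def boot_config(fetch_from_engine, cache_path):
    """Prefer the engine's configuration; fall back to the cache.

    fetch_from_engine is a caller-supplied callable that returns the
    full configuration, or raises ConnectionError when the engine is
    unreachable over the management network.
    """
    try:
        return fetch_from_engine()
    except ConnectionError:
        return load_cached_config(cache_path)
```

If the host folds each delta into the full configuration before
persisting it, a single file is enough to recover the accumulated state
and no chain of setup-network commands has to be replayed. That is also
what "always pass the full configuration" would give for free; the
sketch does not address where the validation logic should live.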