Re: [netmod] I-D Action: draft-kwatsen-netmod-opstate-02.txt

Kent Watsen Fri, 12 Feb 2016 08:51:54 -0800

[As a contributor]




>>>>   If not, does it become a configuration error if a
>>>> line card is inserted to which the configuration can not be applied?
>>> As above, this question doesn't directly apply.
>>>
>>> But a similar question might be: what would happen if the configuration
>>> had previously been accepted and then a linecard removed?  In this case
>>> the affected interface configuration would stay in intended but be
>>> removed from applied by the system.
>> Looks like the semantics you describe are somewhat inconsistent.
>OK, they were not meant to be inconsistent at all:
>
>- The intended configuration is what the operator wants, and can only be 
>changed by the operator.
>- The applied configuration is what state the system is in, and should 
>be continuously kept up to date with the current system state.
>
>- If a config change completes without any errors that means that for 
>every node in the changeset, the applied config node matches the 
>intended config node.
>- This is why I would say that the cleanest semantics are that 
>attempting to configure an interface that doesn't exist should be 
>regarded as a config apply error.  (If the operator wants to push this 
>config in then I think that they should be using the continue-on-error 
>option - or perhaps something similar).

I disagree.  JUNOS specifically allows for preconfiguration of hardware that 
does not exist yet.  It’s a well-regarded feature.  It is not an error, but 
may be reported as a warning.  This is why we defined the leaf “apply-warning” 
in this draft (see subject line).  That said, I do agree that the applied 
configuration would not have the config for the missing hardware, as would be 
evident in a diff between it and the intended configuration.
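
To make the behavior above concrete, here is a minimal sketch in Python.  It 
is only an illustration: the function names, the dict-based config model, and 
the warning text are all hypothetical and come from neither the draft nor 
JUNOS.  It assumes config is keyed by interface name and that the system knows 
which interfaces have hardware present.

```python
# Hypothetical model (assumption): config is a dict keyed by interface name.
# None of these names come from the draft or from any real implementation.

def apply_config(intended, present_hardware):
    """Return (applied, warnings) for an intended configuration.

    Entries whose hardware is absent stay in intended, are left out of
    applied, and produce a warning rather than an error.
    """
    applied = {}
    warnings = []
    for name, settings in intended.items():
        if name in present_hardware:
            applied[name] = dict(settings)
        else:
            warnings.append(f"{name}: hardware not present, config not applied")
    return applied, warnings


def diff(intended, applied):
    """Names configured in intended that are missing from applied."""
    return sorted(set(intended) - set(applied))


intended = {
    "ge-0/0/0": {"mtu": 1500},
    "ge-1/0/0": {"mtu": 9000},  # preconfigured; line card not inserted yet
}
applied, warnings = apply_config(intended, present_hardware={"ge-0/0/0"})
missing = diff(intended, applied)
```

In this sketch the request succeeds, one warning is recorded, and the diff 
between intended and applied lists only the preconfigured interface.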



>>   BTW,
>> I have been told that 'some operators' do provision configuration for
>> hardware not yet present. RFC 7223 has been designed to allow this to
>> happen. Anyway, this is an example. The point is that it is not clear
>Yes.  I'm not opposed to this in principle, but if a choice of semantics 
>is supported then it should be down to the operator's request to choose the 
>semantics.  Having rules that give flexibility in how a device is 
>allowed to behave just makes them more difficult for the operator to manage.

I think that this doesn’t scale.  By this token, servers should support 
everything the IETF defines.  But that’s not how it works; instead, we have 
features and capabilities that enable devices to advertise what they support.  
This ability is important when a brown-field device is moving to support 
NETCONF/RESTCONF but can’t break compatibility with its legacy APIs.



>>>>Note that NETCONF together with YANG provides a rather clear
>>>> definition what validation of a configuration datastore means. When we
>>>> talk about applied config and the difference between intended and
>>>> applied config, the notion of what is a configuration error is not
>>>> clear cut anymore.
>>> Personally, I agree that having tighter semantics are a good thing here
>>> (and should be covered by the solution draft).
>> Then I note that the requirement is at least not well defined.
>Perhaps, but we had already spent a long time discussing the 
>requirements draft, hence personally I think that it is OK to 
>specify/agree the precise semantics/behaviour in a solution draft.

Not just that, but it was the decision we made as a WG, and it is why the 
requirements say “The configuration protocol MUST specify how configuration 
errors are handled.” - right?



>>>> Second, in order to rollback, there needs to be a configuration that
>>>> can be safely rolled back into. The only robust way I can imagine to
>>>> implement this rollback requirement is to use locks until the whole
>>>> new config has been applied, thereby turning an asynchronous system
>>>> into a synchronous system. Otherwise, I fail to see how I can ensure
>>>> that I have a configuration that can be safely rolled back into.
>>> Locks are the simplest way of implementing this, but I agree that they
>>> defeat the point of async configuration updates.
>>>
>>> I don't think that locking is the only way to solve this.  I would have
>>> thought that one of the optimistic transactional locking approaches
>>> could be used to process multiple requests concurrently.
>> As long as it is clear to which configuration to roll back to. Once
>> you process multiple requests concurrently, this is not at all clear,
>> at least not to me.
>Logically, the system must revert to a state where the failed 
>configuration request was never applied.  I think this is the only 
>guarantee that is being made to the client.
>
>One way to achieve this would be rollback the configuration (intended 
>and applied) back to exactly the point before that failed request (or 
>any other request) was processed.
>
>Any concurrent configuration requests that were previously in progress 
>would then need to be re-applied one by one (in the same order).  Each 
>of these could fail in a similar way.
>
>It may be possible to optimize this behavior, but I'm not sure whether 
>that would be particularly useful - the working assumption here is that 
>in the mainline case you would expect that the vast majority of 
>configuration updates shouldn't fail.

Agreed, and that is why this draft (see subject) says:

  In order to implement the rollback behavior, for <edit-config> or
  <commit>, it is necessary for the server to maintain a global lock
  until the processing is complete.  That is, either a 'sync' request
  returns or an 'async' request's 'sync-complete' notification has been
  sent.  Any attempts to read or write either intended or applied
  configuration will be blocked until the request completes.

Perhaps “global lock” is overreaching: if the system supports partial locks, 
then only locks over the impacted config would be needed.  Are we mixing up 
supporting Asynchronous Configuration Operation with parallel processing?
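
As a rough illustration of the lock-until-complete behavior quoted above, 
here is a minimal Python sketch.  It is an assumption-laden toy, not the 
draft's mechanism: the Datastore class, its edit() method, and the apply_fn 
callback are hypothetical, a single threading.Lock stands in for the global 
lock, and a RuntimeError from apply_fn stands in for an apply failure.

```python
import threading


class Datastore:
    """Toy datastore (hypothetical): one lock guards intended and applied."""

    def __init__(self):
        self._lock = threading.Lock()
        self.intended = {}
        self.applied = {}

    def edit(self, changes, apply_fn):
        """Apply `changes`; on failure restore both configs and return False.

        The lock is held until processing completes, so no other request can
        read or write either config in the middle of the operation.
        """
        with self._lock:
            saved_intended = dict(self.intended)
            saved_applied = dict(self.applied)
            try:
                self.intended.update(changes)
                for key, value in changes.items():
                    apply_fn(key, value)      # may raise RuntimeError
                    self.applied[key] = value
                return True
            except RuntimeError:
                # Roll back to exactly the state before this request.
                self.intended = saved_intended
                self.applied = saved_applied
                return False


def failing_apply(key, value):
    # Assumption for the demo: applying key "c" always fails.
    if key == "c":
        raise RuntimeError("cannot apply " + key)


ds = Datastore()
first_ok = ds.edit({"a": 1}, lambda key, value: None)
second_ok = ds.edit({"b": 2, "c": 3}, failing_apply)
```

Because the lock covers the whole request, the rollback target is 
unambiguous.  A partial-lock variant would take locks only over the keys in 
`changes`, at the cost of more bookkeeping.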




>But I think that there is a clear separation between whether a
>configuration is semantically valid and whether it can be successfully 
>applied:
>
>Whether the configuration is semantically valid must only rely on the 
>configuration.
>
>But whether or not the configuration is able to be applied can and does 
>depend on the operational state of the device (memory, what LCs are 
>present, correct functioning of the code, etc).

FWIW, when working on solution #2, we originally defined a secondary 
error-option called “applied-error-option” that could control the behavior for 
the “apply” phase of processing a config request.  This became difficult to 
specify, so we backed off and decided to simply leverage the existing 
error-option parameter, by extending it to also cover the apply phase, with 
what we felt was an intuitive interpretation of the error-option values.
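
For illustration only, here is one way the three standard NETCONF 
error-option values (stop-on-error, continue-on-error, rollback-on-error) 
could be read for the apply phase.  This Python sketch is an assumption about 
what such an interpretation might look like, not the text of the draft; 
apply_edits and can_apply are hypothetical names.

```python
ERROR_OPTIONS = ("stop-on-error", "continue-on-error", "rollback-on-error")


def apply_edits(applied, edits, can_apply, error_option="stop-on-error"):
    """Apply (key, value) edits to a copy of `applied`.

    Returns (new_applied, failed_keys).  `can_apply` is a hypothetical
    predicate that says whether the system is able to apply a given key.
    """
    if error_option not in ERROR_OPTIONS:
        raise ValueError("unknown error-option: " + error_option)

    snapshot = dict(applied)
    result = dict(applied)
    failed = []
    for key, value in edits:
        if can_apply(key):
            result[key] = value
            continue
        failed.append(key)
        if error_option == "stop-on-error":
            break                    # keep earlier edits, skip the rest
        if error_option == "rollback-on-error":
            return snapshot, failed  # discard everything from this request
        # continue-on-error: record the failure and carry on
    return result, failed


edits = [("a", 1), ("b", 2), ("c", 3)]


def can_apply(key):
    # Assumption for the demo: only "b" cannot be applied.
    return key != "b"


stop_result = apply_edits({}, edits, can_apply, "stop-on-error")
cont_result = apply_edits({}, edits, can_apply, "continue-on-error")
roll_result = apply_edits({}, edits, can_apply, "rollback-on-error")
```

With continue-on-error, the failed node is simply absent from applied, which 
matches the preconfiguration case discussed earlier in this thread.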


Kent



_______________________________________________
netmod mailing list
netmod@ietf.org
https://www.ietf.org/mailman/listinfo/netmod
