Re: [netmod] I-D Action: draft-kwatsen-netmod-opstate-02.txt

Robert Wilton Fri, 12 Feb 2016 06:07:50 -0800


On 12/02/2016 08:19, Juergen Schoenwaelder wrote:

On Thu, Feb 11, 2016 at 02:52:27PM +0000, Robert Wilton wrote:

On 11/02/2016 09:29, Juergen Schoenwaelder wrote:

We are discussing this text:

        D.  The configuration protocol MUST specify how configuration
            errors are handled.  Errors SHOULD be handled by semantics
            similar to NETCONF's error-options for the <edit-config>
            operation (stop-on-error, continue-on-error, rollback-on-
            error), as described in Section 7.2 in [RFC6241], but
            extended to incorporate both the intended and applied
            configurations.  Support for "rollback on error" semantics
            SHOULD be provided.

First, let me observe that it is underspecified what a 'configuration
error' is in the context of this requirement statement. Is the
configuration of an interface that is currently not present a
configuration error?

My opinion is yes, this is an error and hence would cause configuration
to fail and rollback if "rollback-on-error" semantics had been requested.

If the request had used continue-on-error semantics then I would expect
that you would see this configuration in intended, but not applied.

  If not, does it become a configuration error if a
line card is inserted to which the configuration can not be applied?

As above, this question doesn't directly apply.

But a similar question might be: what would happen if the configuration
had previously been accepted and then a linecard removed?  In this case
the affected interface configuration would stay in intended but be
removed from applied by the system.

Looks like the semantics you describe are somewhat inconsistent.

OK, they were not meant to be inconsistent at all:

- The intended configuration is what the operator wants, and can only be
  changed by the operator.
- The applied configuration is what state the system is in, and should
  be continuously kept up to date with the current system state.

- If a config change completes without any errors that means that for
  every node in the changeset, the applied config node matches the
  intended config node.
- This is why I would say that the cleanest semantics are that
  attempting to configure an interface that doesn't exist should be
  regarded as a config apply error.  (If the operator wants to push this
  config in then I think that they should be using the continue-on-error
  option - or perhaps something similar).
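To make the semantics above concrete, here is a minimal Python sketch.
It is only an illustration and not taken from any draft: the Device
class, its field names, and the rule "an interface can be applied only
if it is currently present" are assumptions made for the example.  Only
the rollback-on-error and continue-on-error options are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    # Hypothetical model: names of interfaces whose hardware is present.
    present_interfaces: set
    intended: dict = field(default_factory=dict)  # what the operator wants
    applied: dict = field(default_factory=dict)   # what the system is doing

    def edit_config(self, changes, error_option="rollback-on-error"):
        """Apply per-interface changes; return the names that failed to apply."""
        if error_option not in ("rollback-on-error", "continue-on-error"):
            raise ValueError("unsupported error option: " + error_option)
        saved_intended = dict(self.intended)
        saved_applied = dict(self.applied)
        errors = []
        for name, value in changes.items():
            self.intended[name] = value
            if name in self.present_interfaces:
                self.applied[name] = value
            else:
                errors.append(name)
                if error_option == "rollback-on-error":
                    # Restore both datastores as if the request never happened.
                    self.intended = saved_intended
                    self.applied = saved_applied
                    return errors
        return errors

    def remove_interface(self, name):
        """Hardware removal: applied tracks the system, intended is untouched."""
        self.present_interfaces.discard(name)
        self.applied.pop(name, None)
```

With continue-on-error, a change for a missing interface ends up in
intended but not in applied; with rollback-on-error neither datastore
changes.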

  BTW,
I have been told that 'some operators' do provision configuration for
hardware not yet present. RFC 7223 has been designed to allow this to
happen. Anyway, this is an example. The point is that it is not clear

Yes.  I'm not opposed to this on principle, but if a choice of semantics
is supported then it should be down to the operator's request to choose
the semantics.  Having rules that give flexibility in how a device is
allowed to behave just makes devices more difficult for the operator to
manage.

cut whether something that can't be immediately applied really is an
error.

Note that NETCONF together with YANG provides a rather clear
definition what validation of a configuration datastore means. When we
talk about applied config and the difference between intended and
applied config, the notion of what is a configuration error is not
clear cut anymore.

Personally, I agree that having tighter semantics is a good thing here
(and should be covered by the solution draft).

Then I note that the requirement is at least not well defined.

Perhaps, but we had already spent a long time discussing the
requirements draft, hence personally I think that it is OK to
specify/agree the precise semantics/behaviour in a solution draft.

Second, in order to rollback, there needs to be a configuration that
can be safely rolled back into. The only robust way I can imagine to
implement this rollback requirement is to use locks until the whole
new config has been applied, thereby turning an asynchronous system
into a synchronous system. Otherwise, I fail to see how I can ensure
that I have a configuration that can be safely rolled back into.

Locks are the simplest way of implementing this, but I agree that they
defeat the point of async configuration updates.

I don't think that locking is the only way to solve this.  I would have
thought that one of the optimistic transactional locking approaches
could be used to process multiple requests concurrently.

As long as it is clear which configuration to roll back to. Once
you process multiple requests concurrently, this is not at all clear,
at least not to me.

Logically, the system must revert to a state where the failed
configuration request was never applied.  I think this is the only
guarantee that is being made to the client.

One way to achieve this would be to roll back the configuration
(intended and applied) to exactly the point before that failed request
(or any other request) was processed.

Any concurrent configuration requests that were previously in progress
would then need to be re-applied one by one (in the same order).  Each
of these could fail in a similar way.

It may be possible to optimize this behavior, but I'm not sure whether
that would be particularly useful - the working assumption here is that
in the mainline case you would expect that the vast majority of
configuration updates shouldn't fail.
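The rollback-and-replay approach described above could be sketched in
Python roughly as follows.  This is an illustration under stated
assumptions, not a proposed implementation: ApplyError,
rollback_and_replay, and make_apply_fn are hypothetical names, and a
single dictionary stands in for the combined intended/applied state.

```python
class ApplyError(Exception):
    """Hypothetical error raised when a request cannot be applied."""


def make_apply_fn(present_interfaces):
    """Build an apply function that rejects changes to absent interfaces."""
    def apply_fn(state, changes):
        missing = [name for name in changes if name not in present_interfaces]
        if missing:
            raise ApplyError(missing)
        new_state = dict(state)   # never mutate the caller's state
        new_state.update(changes)
        return new_state
    return apply_fn


def rollback_and_replay(snapshot, in_flight, failed_id, apply_fn):
    """Revert to the snapshot taken before any in-flight request was
    processed, then re-apply the other requests in their original order.

    in_flight is a list of (request_id, changes) pairs.
    Returns (final_state, outcome_by_request_id).
    """
    state = dict(snapshot)
    outcome = {failed_id: "rolled-back"}
    for request_id, changes in in_flight:
        if request_id == failed_id:
            continue
        try:
            state = apply_fn(state, changes)
            outcome[request_id] = "applied"
        except ApplyError:
            # A replayed request can fail in the same way; it leaves
            # the state as it was before that request.
            outcome[request_id] = "rolled-back"
    return state, outcome
```

The client of the failed request sees a state in which its request was
never applied, while the other requests are replayed in order and may
each succeed or fail independently.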

Hence, I think that the decision as to whether or not to use a global
lock should probably be specified during the client operation (if at
all), and the solution draft shouldn't enforce always using a server
side lock.  Instead it should specify the exact semantics that a client
can expect when interacting with the server during configuration
requests.  This is to allow for flexibility in server implementations -
i.e. to trade off complexity for performance.

If it is unclear from the specification what the result of a rollback
is, then I believe this serves little value. And see the previous

The result after rollback does need to be specified and included in the
solution draft.

examples; what can be applied at time t in general depends on the
overall state of the system at time t. YANG does not allow config true
nodes to depend on config false nodes and the reason is that
configuration validity must not depend on the time varying operational
state of a device.


Yes, I agree with that.

But I think that there is a clear separation between whether a
configuration is semantically valid and whether it can be successfully
applied:

Whether the configuration is semantically valid must only rely on the
configuration.

But whether or not the configuration is able to be applied can and does
depend on the operational state of the device (memory, what LCs are
present, correct functioning of the code, etc).
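As a rough Python illustration of that separation (the function names
and the MTU bounds are assumptions chosen for the example, not taken
from any YANG module): the validity check takes only the configuration
as input, while the apply check also takes the operational state.

```python
def is_semantically_valid(config):
    """Validity depends on the configuration alone.

    The MTU range used here is an illustrative constraint only.
    """
    return all(
        isinstance(entry.get("mtu"), int) and 68 <= entry["mtu"] <= 9216
        for entry in config.values()
    )


def unappliable_interfaces(config, oper_state):
    """Applicability also depends on operational state, here the set of
    interfaces whose hardware is currently present."""
    present = oper_state["present_interfaces"]
    return sorted(name for name in config if name not in present)
```

The same configuration stays valid while the list of interfaces that
cannot be applied changes as line cards are inserted or removed.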

Rob

/js


_______________________________________________
netmod mailing list
netmod@ietf.org
https://www.ietf.org/mailman/listinfo/netmod
