On 12/02/2016 08:19, Juergen Schoenwaelder wrote:
On Thu, Feb 11, 2016 at 02:52:27PM +0000, Robert Wilton wrote:
On 11/02/2016 09:29, Juergen Schoenwaelder wrote:
We are discussing this text:

        D.  The configuration protocol MUST specify how configuration
            errors are handled.  Errors SHOULD be handled by semantics
            similar to NETCONF's error-options for the <edit-config>
            operation (stop-on-error, continue-on-error, rollback-on-
            error), as described in Section 7.2 in [RFC6241], but
            extended to incorporate both the intended and applied
            configurations.  Support for "rollback on error" semantics
            SHOULD be provided.

First, let me observe that it is underspecified what a 'configuration
error' is in the context of this requirement statement. Is the
configuration of an interface that is currently not present a
configuration error?
My opinion is yes, this is an error and hence would cause configuration
to fail and rollback if "rollback-on-error" semantics had been requested.

If the request had used continue-on-error semantics then I would expect
that you would see this configuration in intended, but not applied.

  If not, does it become a configuration error if a
line card is inserted to which the configuration can not be applied?
As above, this question doesn't directly apply.

But a similar question might be: what would happen if the configuration
had previously been accepted and then a linecard removed?  In this case
the affected interface configuration would stay in intended but be
removed from applied by the system.
Looks like the semantics you describe are somewhat inconsistent.
OK, they were not meant to be inconsistent at all:

- The intended configuration is what the operator wants, and can only be changed by the operator.
- The applied configuration is the state the system is actually in, and should be continuously kept up to date with the current system state.

- If a config change completes without any errors, that means that for every node in the changeset, the applied config node matches the intended config node.
- This is why I would say that the cleanest semantics are that attempting to configure an interface that doesn't exist should be regarded as a config apply error. (If the operator wants to push this config in then I think that they should be using the continue-on-error option, or perhaps something similar.)
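
A minimal sketch of the semantics in the list above, as a toy model rather than anything a solution draft defines. The Device class, its field names and the edit()/remove_hardware() methods are all hypothetical, and only the rollback-on-error and continue-on-error options are modelled:

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Toy device: 'present' is the set of interfaces whose hardware exists."""
    present: set
    intended: dict = field(default_factory=dict)  # what the operator asked for
    applied: dict = field(default_factory=dict)   # what the system is running

    def edit(self, changes, error_option="rollback-on-error"):
        """Apply changes; return the names that could not be applied."""
        saved = (dict(self.intended), dict(self.applied))
        failed = []
        for name, cfg in changes.items():
            self.intended[name] = cfg
            if name in self.present:
                self.applied[name] = cfg
            else:
                # Assumption: missing hardware counts as a config apply error.
                failed.append(name)
                if error_option == "rollback-on-error":
                    # Undo the whole request in both intended and applied.
                    self.intended, self.applied = saved
                    break
        return failed

    def remove_hardware(self, name):
        """Only the system changes applied; intended stays as the operator left it."""
        self.present.discard(name)
        self.applied.pop(name, None)
```

With continue-on-error, a request that names a missing interface leaves that interface in intended but not in applied. With rollback-on-error, neither datastore changes.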


  BTW,
I have been told that 'some operators' do provision configuration for
hardware not yet present. RFC 7223 has been designed to allow this to
happen. Anyway, this is an example. The point is that it is not clear
Yes. I'm not opposed to this in principle, but if a choice of semantics is supported then it should be down to the operator's request to choose the semantics. Having rules that give flexibility in how a device is allowed to behave just makes devices more difficult for the operator to manage.


cut whether something that can't be immediately applied really is an
error.
Note that NETCONF together with YANG provides a rather clear
definition what validation of a configuration datastore means. When we
talk about applied config and the difference between intended and
applied config, the notion of what is a configuration error is not
clear cut anymore.
Personally, I agree that having tighter semantics is a good thing here
(and should be covered by the solution draft).
Then I note that the requirement is at least not well defined.
Perhaps, but we had already spent a long time discussing the requirements draft, hence personally I think that it is OK to specify/agree the precise semantics/behaviour in a solution draft.



Second, in order to rollback, there needs to be a configuration that
can be safely rolled back into. The only robust way I can imagine to
implement this rollback requirement is to use locks until the whole
new config has been applied, thereby turning an asynchronous system
into a synchronous system. Otherwise, I fail to see how I can ensure
that I have a configuration that can be safely rolled back into.
Locks are the simplest way of implementing this, but I agree that they
defeat the point of async configuration updates.

I don't think that locking is the only way to solve this.  I would have
thought that one of the optimistic transactional locking approaches
could be used to process multiple requests concurrently.
As long as it is clear which configuration to roll back to. Once
you process multiple requests concurrently, this is not at all clear,
at least not to me.
Logically, the system must revert to a state where the failed configuration request was never applied. I think this is the only guarantee that is being made to the client.

One way to achieve this would be to roll back the configuration (intended and applied) to exactly the point before that failed request (or any other request) was processed.

Any concurrent configuration requests that were previously in progress would then need to be re-applied one by one (in the same order). Each of these could fail in a similar way.
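
The rollback-and-replay approach described above can be sketched roughly as follows. This is only an illustration under assumptions: the function name, the (request id, changes) tuples and the can_apply callback are hypothetical, and the checkpoint is taken to be the configuration before any in-flight request was processed:

```python
def rollback_and_replay(checkpoint, in_flight, failed_id, can_apply):
    """Return (config, rejected_ids) after undoing one failed request.

    checkpoint: configuration before any in-flight request was processed
    in_flight:  ordered list of (request_id, changes) applied optimistically
    failed_id:  the request that failed to apply
    can_apply:  callback deciding whether a candidate config can be applied
    """
    config = dict(checkpoint)          # revert to the known-good state
    rejected = [failed_id]
    for req_id, changes in in_flight:  # replay the rest in original order
        if req_id == failed_id:
            continue
        candidate = {**config, **changes}
        if can_apply(candidate):
            config = candidate
        else:
            rejected.append(req_id)    # a replayed request may fail too
    return config, rejected
```

For example, if a later request only makes sense on top of the failed one, the replay rejects it as well, and the client sees a state in which neither request was ever applied.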


It may be possible to optimize this behavior, but I'm not sure whether that would be particularly useful - the working assumption here is that in the mainline case you would expect that the vast majority of configuration updates shouldn't fail.


Hence, I think that the decision as to whether or not to use a global
lock should probably be specified during the client operation (if at
all), and the solution draft shouldn't enforce always using a server
side lock.  Instead it should specify the exact semantics that a client
can expect when interacting with the server during configuration
requests.  This is to allow for flexibility in server implementations -
i.e. to trade off complexity for performance.
If it is unclear from the specification what the result of a rollback
is, then I believe this serves little value. And see the previous
The result after rollback does need to be specified, and that specification should be included in the solution draft.

examples; what can be applied at time t in general depends on the
overall state of the system at time t. YANG does not allow config true
nodes to depend on config false nodes and the reason is that
configuration validity must not depend on the time varying operational
state of a device.

Yes, I agree with that.

But I think that there is a clear separation between whether a configuration is semantically valid and whether it can be successfully applied:

Whether the configuration is semantically valid must only rely on the configuration.

But whether or not the configuration is able to be applied can and does depend on the operational state of the device (memory, what LCs are present, correct functioning of the code, etc).
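
To make the separation concrete, here is a small sketch. The function names, the "mtu" and "line_card" keys and the MTU bounds are made up for illustration and do not come from any YANG module:

```python
def is_valid(config):
    """Semantic validity: a function of the configuration alone."""
    return all(68 <= iface["mtu"] <= 9216 for iface in config.values())


def can_apply(config, present_line_cards):
    """Applicability: also depends on operational state, here which
    line cards are currently installed."""
    return all(iface["line_card"] in present_line_cards
               for iface in config.values())
```

The same configuration always gives the same is_valid() answer, whereas can_apply() can change from false to true when a line card is inserted, without the configuration changing at all.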

Rob



/js


_______________________________________________
netmod mailing list
netmod@ietf.org
https://www.ietf.org/mailman/listinfo/netmod
