On 12/02/2016 08:19, Juergen Schoenwaelder wrote:
On Thu, Feb 11, 2016 at 02:52:27PM +0000, Robert Wilton wrote:
On 11/02/2016 09:29, Juergen Schoenwaelder wrote:
We are discussing this text:
D. The configuration protocol MUST specify how configuration
errors are handled. Errors SHOULD be handled by semantics
similar to NETCONF's error-options for the <edit-config>
operation (stop-on-error, continue-on-error, rollback-on-
error), as described in Section 7.2 in [RFC6241], but
extended to incorporate both the intended and applied
configurations. Support for "rollback on error" semantics
SHOULD be provided.
First, let me observe that it is underspecified what a 'configuration
error' is in the context of this requirement statement. Is the
configuration of an interface that is currently not present a
configuration error?
In my opinion, yes: this is an error, and hence would cause the configuration
to fail and roll back if "rollback-on-error" semantics had been requested.
If the request had used continue-on-error semantics then I would expect
that you would see this configuration in intended, but not applied.
If not, does it become a configuration error if a
line card is inserted to which the configuration can not be applied?
As above, this question doesn't directly apply.
But a similar question might be: what would happen if the configuration
had previously been accepted and then a linecard removed? In this case
the affected interface configuration would stay in intended but be
removed from applied by the system.
Looks like the semantics you describe are somewhat inconsistent.
OK, they were not meant to be inconsistent at all:
- The intended configuration is what the operator wants, and can only be
changed by the operator.
- The applied configuration is what state the system is in, and should
be continuously kept up to date with the current system state.
- If a config change completes without any errors that means that for
every node in the changeset, the applied config node matches the
intended config node.
- This is why I would say that the cleanest semantics are that
attempting to configure an interface that doesn't exist should be
regarded as a config apply error. (If the operator wants to push this
config in then I think that they should be using the continue-on-error
option - or perhaps something similar).
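
To make the semantics listed above concrete, here is a minimal sketch in
Python. It is only an illustration under my own assumptions: the Device
class, its field names, and the remove_interfaces helper are hypothetical
and do not come from any draft or RFC. Only the rollback-on-error and
continue-on-error options are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    # Hypothetical model; every name here is illustrative only.
    present_interfaces: set                       # operational state: hardware present
    intended: dict = field(default_factory=dict)  # what the operator wants
    applied: dict = field(default_factory=dict)   # what the system is actually running

    def edit_config(self, changes, error_option="rollback-on-error"):
        """Apply {ifname: config}; return the interfaces that failed to apply."""
        if error_option not in ("rollback-on-error", "continue-on-error"):
            raise ValueError(f"unsupported error option: {error_option}")
        saved_intended = dict(self.intended)
        saved_applied = dict(self.applied)
        failed = []
        for ifname, cfg in changes.items():
            self.intended[ifname] = cfg
            if ifname in self.present_interfaces:
                self.applied[ifname] = cfg
                continue
            # Configuring an interface that does not exist is an apply error.
            failed.append(ifname)
            if error_option == "rollback-on-error":
                # Restore both datastores to their state before the request.
                self.intended = saved_intended
                self.applied = saved_applied
                return failed
            # continue-on-error: the node stays in intended but not in applied.
        return failed

    def remove_interfaces(self, ifnames):
        """Model a line card removal: applied changes, intended does not."""
        self.present_interfaces -= set(ifnames)
        for ifname in ifnames:
            self.applied.pop(ifname, None)
```

With this model, a request that completes with an empty failure list leaves
every node in the changeset identical in intended and applied, which is the
property described in the third bullet.
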
BTW,
I have been told that 'some operators' do provision configuration for
hardware not yet present. RFC 7223 has been designed to allow this to
happen. Anyway, this is an example. The point is that it is not clear
Yes. I'm not opposed to this in principle, but if a choice of semantics
is supported then it should be down to the operator's request to choose the
semantics. Having rules that give flexibility in how a device is
allowed to behave just makes devices more difficult for the operator to manage.
cut whether something that can't be immediately applied really is an
error.
Note that NETCONF together with YANG provides a rather clear
definition of what validation of a configuration datastore means. When we
talk about applied config and the difference between intended and
applied config, the notion of what is a configuration error is not
clear cut anymore.
Personally, I agree that having tighter semantics is a good thing here
(and should be covered by the solution draft).
Then I note that the requirement is at least not well defined.
Perhaps, but we had already spent a long time discussing the
requirements draft, hence personally I think that it is OK to
specify/agree the precise semantics/behaviour in a solution draft.
Second, in order to rollback, there needs to be a configuration that
can be safely rolled back into. The only robust way I can imagine to
implement this rollback requirement is to use locks until the whole
new config has been applied, thereby turning an asynchronous system
into a synchronous system. Otherwise, I fail to see how I can ensure
that I have a configuration that can be safely rolled back into.
Locks are the simplest way of implementing this, but I agree that they
defeat the point of async configuration updates.
I don't think that locking is the only way to solve this. I would have
thought that one of the optimistic transactional locking approaches
could be used to process multiple requests concurrently.
As long as it is clear which configuration to roll back to. Once
you process multiple requests concurrently, this is not at all clear,
at least not to me.
Logically, the system must revert to a state where the failed
configuration request was never applied. I think this is the only
guarantee that is being made to the client.
One way to achieve this would be to roll back the configuration (intended
and applied) back to exactly the point before that failed request (or
any other request) was processed.
Any concurrent configuration requests that were previously in progress
would then need to be re-applied one by one (in the same order). Each
of these could fail in a similar way.
It may be possible to optimize this behavior, but I'm not sure whether
that would be particularly useful - the working assumption here is that
in the mainline case you would expect that the vast majority of
configuration updates shouldn't fail.
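
As a rough sketch of the unoptimized approach just described, assuming the
server keeps a snapshot of the configuration taken before the failed
request and an ordered list of the requests that followed it. The function
name, the ApplyError exception, and the representation of a request as a
(name, callable) pair are all hypothetical:

```python
import copy


class ApplyError(Exception):
    """Hypothetical error raised when a request cannot be applied."""


def rollback_and_replay(snapshot_before_failure, later_requests):
    """Revert to the snapshot, then re-apply later requests in order.

    snapshot_before_failure: config (dict) just before the failed request.
    later_requests: list of (name, apply_fn); apply_fn mutates a config dict
        in place or raises ApplyError.
    Returns (resulting_config, names_of_requests_that_failed_on_replay).
    """
    config = copy.deepcopy(snapshot_before_failure)
    failed = []
    for name, apply_fn in later_requests:
        # Work on a trial copy so that a failing request leaves no trace.
        trial = copy.deepcopy(config)
        try:
            apply_fn(trial)
        except ApplyError:
            failed.append(name)
            continue
        config = trial
    return config, failed
```

A replayed request can itself fail, for example because it depended on
something the original failed request would have created. In that case it
is dropped in the same way and reported back to its own client.
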
Hence, I think that the decision as to whether or not to use a global
lock should probably be specified during the client operation (if at
all), and the solution draft shouldn't enforce always using a server
side lock. Instead it should specify the exact semantics that a client
can expect when interacting with the server during configuration
requests. This is to allow for flexibility in server implementations -
i.e. to trade off complexity for performance.
If it is unclear from the specification what the result of a rollback
is, then I believe this serves little value. And see the previous
The result after rollback does need to be specified
and included in the solution draft.
examples; what can be applied at time t in general depends on the
overall state of the system at time t. YANG does not allow config true
nodes to depend on config false nodes and the reason is that
configuration validity must not depend on the time varying operational
state of a device.
Yes, I agree with that.
But I think that there is a clear separation between whether a
configuration is semantically valid and whether it can be successfully
applied:
Whether the configuration is semantically valid must only rely on the
configuration.
But whether or not the configuration is able to be applied can and does
depend on the operational state of the device (memory, what LCs are
present, correct functioning of the code, etc).
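
The separation can be sketched as two independent checks. This is only an
illustration: the function names, the exception classes, the layout of the
config and operational-state dictionaries, and the MTU bounds are my own
assumptions, not anything defined by YANG or NETCONF.

```python
class ValidationError(Exception):
    """Hypothetical: the configuration is not semantically valid."""


class ApplyError(Exception):
    """Hypothetical: the configuration is valid but cannot be applied now."""


def validate(config):
    """Semantic validity: looks only at the configuration itself."""
    for ifname, ifcfg in config.get("interfaces", {}).items():
        mtu = ifcfg.get("mtu", 1500)
        if not 68 <= mtu <= 9216:  # illustrative bounds only
            raise ValidationError(f"{ifname}: mtu {mtu} out of range")


def try_apply(config, oper_state):
    """Applicability: depends on operational state, e.g. which LCs are present."""
    applied = {}
    for ifname, ifcfg in config.get("interfaces", {}).items():
        if ifname not in oper_state["present_interfaces"]:
            raise ApplyError(f"{ifname}: hardware not present")
        applied[ifname] = ifcfg
    return applied
```

Note that validate() takes no operational state at all, so its result
cannot vary over time, whereas try_apply() can succeed or fail for the same
configuration depending on when it is called.
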
Rob
/js