Re: [netmod] netmod-opstate-reqs and error option terms (rollback on error)

Gert Grammel Mon, 21 Dec 2015 12:18:12 -0800

Hi Rob,

Seems we are pretty close in our understanding, so I am snipping everything
except the one discussion point left, with further comments:


Gert


<snip>
The real problem is what you called in another email 'hybrid' or
cheating-synchronous implementations. These lead to a situation where the
client is made to believe the intended config is applied, but the server has
not applied it yet. Take the case where the server runs into trouble after the
synchronous commit (which lets the client believe that the intended config is
applied) and decides to roll back. From a client perspective this would look
like a node randomly losing its committed configuration. A ton of code is
required on the client side to cope with that situation. So what was the
purpose of implementing it that way in the first place, instead of just
using an asynchronous implementation?
Yes, I agree that handling rollback could be a problem in this scenario,
and hence I would propose that the behaviour in such a scenario is
explicitly documented as being undefined in whatever solution is agreed
upon :-)
Gert> sadly, we can’t roll time back. So the best option is indeed to document
the issue and move on to improve things.

Otherwise assuming that all requests are strictly synchronous or
asynchronous then I think that we should be OK with the following two
rules on the server:

1) All edit-config requests must strictly be processed in order.

Gert> yes, although there are corner cases we need to hash out. For example, a
client may set a leaf to <x> and then set it to <y>. A server may then skip
setting it to <x>, because it is overwritten by <y> anyway.

2) You cannot tell a client that a request has been fully applied unless
all previous requests specifying rollback-on-error semantics with any
overlapping nodes with the current request have either been applied or
aborted (i.e. rolled back).

Gert> with “has been fully applied”, did you mean the state that I tentatively
named ‘validated’ in my earlier email? I am a bit sensitive to naming here
because those ‘cheating synchronous nodes’ would return ‘applied’ without
actually doing so in full. It would be great if we could quickly converge on a
naming convention to remove the ambiguity.
The rule holds true, but I don’t see a dependency on “rollback-on-error”
semantics. In my view it also applies to “continue-on-error” and
“stop-on-error”, in the sense that unless an error has been reported, the
client still needs to wait for the ‘validated’ state before it can reliably
assume the config was applied.
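
As a rough illustration of rule 2 only (a sketch, not taken from any draft or
implementation), the check could look like the Python below. The names
Request, State and can_report_applied, and the idea of representing an edit as
a set of node paths, are assumptions made for this example. The
only_rollback_on_error flag switches between Rob's wording (only earlier
rollback-on-error requests block the report) and the broader reading suggested
in the comment above (any earlier overlapping request blocks it).

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    PENDING = "pending"
    APPLIED = "applied"
    ABORTED = "aborted"  # i.e. rolled back


@dataclass
class Request:
    seq: int                        # arrival order on the server (rule 1)
    nodes: frozenset                # hypothetical: node paths touched by the edit
    rollback_on_error: bool = True  # error option chosen by the client
    state: State = State.PENDING


def can_report_applied(current, history, only_rollback_on_error=True):
    """Rule 2: may the server tell the client that `current` is fully applied?"""
    if current.state is not State.APPLIED:
        return False
    for earlier in history:
        if earlier.seq >= current.seq:
            continue  # only requests received before `current` matter
        if only_rollback_on_error and not earlier.rollback_on_error:
            continue  # Rob's wording: ignore other error options
        overlaps = bool(earlier.nodes & current.nodes)
        if overlaps and earlier.state is State.PENDING:
            return False  # an earlier overlapping edit is still in flight
    return True
```

With only_rollback_on_error=False the same function expresses the view that the
rule is independent of the error option.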

<snip>

From: Robert Wilton <rwil...@cisco.com>
Date: Monday 21 December 2015 19:55
To: Gert Grammel <ggram...@juniper.net>, Jason Sterne
<jason.ste...@alcatel-lucent.com>, "netmod@ietf.org" <netmod@ietf.org>
Subject: Re: [netmod] netmod-opstate-reqs and error option terms (rollback on
error)

Hi Gert,

Please see inline ...

On 05/11/2015 03:53, Gert Grammel wrote:
Jason,

A synchronous config basically contains two pieces of information in the commit:
1) the intended configuration is valid (i.e. is syntactically correct) and
2) the intended config has been applied
Any error that would affect the config before the commit could be rolled back 
to the old config and a suitable notification sent to the client. After the 
commit, there is no roll-back.
I agree.

Similarly for asynchronous, however here the information needs to be split into 
two messages:
1) a commit that the intended config is valid
2) another message when the intended config is fully applied (let's call this 
'validated').
A rollback can happen before the intended config is fully applied i.e. before 
the 'validated' state is reached.
I agree.
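
The two-message asynchronous flow described above can be sketched as a small
client-side state machine. This is only an illustration under assumptions: the
state names and the event strings ("commit-ok", "validated", "rollback",
"error") are invented for the example and are not NETCONF operations or
notifications.

```python
from enum import Enum, auto


class EditState(Enum):
    SENT = auto()
    COMMITTED = auto()    # message 1: intended config accepted as valid
    VALIDATED = auto()    # message 2: intended config fully applied
    ROLLED_BACK = auto()


# Hypothetical events a client might see, and the transitions they allow.
TRANSITIONS = {
    (EditState.SENT, "commit-ok"): EditState.COMMITTED,
    (EditState.SENT, "error"): EditState.ROLLED_BACK,
    (EditState.COMMITTED, "validated"): EditState.VALIDATED,
    (EditState.COMMITTED, "rollback"): EditState.ROLLED_BACK,
}


def next_state(state, event):
    """Advance the client's view of one edit; reject impossible events."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} not allowed in state {state.name}") from None


def may_assume_applied(state):
    """The client can rely on the config only once 'validated' is reached."""
    return state is EditState.VALIDATED
```

Note that there is no transition out of VALIDATED: a rollback can only happen
before that state is reached, which is the property being agreed on here.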


The real problem is what you called in another email 'hybrid' or
cheating-synchronous implementations. These lead to a situation where the
client is made to believe the intended config is applied, but the server has
not applied it yet. Take the case where the server runs into trouble after the
synchronous commit (which lets the client believe that the intended config is
applied) and decides to roll back. From a client perspective this would look
like a node randomly losing its committed configuration. A ton of code is
required on the client side to cope with that situation. So what was the
purpose of implementing it that way in the first place, instead of just
using an asynchronous implementation?
Yes, I agree that handling rollback could be a problem in this scenario,
and hence I would propose that the behaviour in such a scenario is
explicitly documented as being undefined in whatever solution is agreed
upon :-)

Otherwise assuming that all requests are strictly synchronous or
asynchronous then I think that we should be OK with the following two
rules on the server:
1) All edit-config requests must strictly be processed in order.
2) You cannot tell a client that a request has been fully applied unless
all previous requests specifying rollback-on-error semantics with any
overlapping nodes with the current request have either been applied or
aborted (i.e. rolled back).

Thanks,
Rob



Gert




-----Original Message-----
From: netmod [mailto:netmod-boun...@ietf.org] On Behalf Of Sterne, Jason (Jason)
Sent: 03 November 2015 08:24
To: netmod@ietf.org
Subject: [netmod] netmod-opstate-reqs and error option terms (rollback on error)

Hi all,

The term "rollback on error" (and other error options) has been used during 
these discussions around the opstate requirements.

That term already has some meaning in RFC6241 (or at least rollback-on-error 
does and that is pretty close) and IMO it (today) has nothing to do with 
"applied" config.  It is an error option that has the scope of the contents of 
a single edit-config request and how those contents get applied (all or 
nothing) to the candidate DS (which is neither intended nor applied config) or 
to the running DS (intended) if the <target> is <running/>.

I think we need to clarify this "all or nothing" concept and how it is related 
to "applied" config.  We may also want to use slightly different terminology so 
we don't get confused with today's meaning of rollback-on-error.

There are a few transitions to consider when editing a config and applying it 
to a device (I'll give the example of using the candidate DS):
(A) config changes   ---> candidate DS   (<edit-config>)
(B) candidate DS  ----> running (intended)  (<commit>)
(C) intended ----> applied  (internally processed in the device)

Today rollback-on-error is only applicable to transition (A).

Transition (B) does have all-or-nothing properties (as described in RFC6241) 
but that isn't related to "rollback-on-error".
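
To make the three transitions concrete, here is a toy Python model. It is a
sketch under assumptions, not how any device works: the Device class, the flat
path-to-value dictionaries and the validate callback are all invented for
illustration. It shows rollback-on-error acting only on transition (A), while
(B) copies the candidate as a whole and (C) proceeds piecemeal inside the
device.

```python
import copy


class Device:
    def __init__(self):
        self.candidate = {}  # candidate DS
        self.running = {}    # running DS, i.e. intended config
        self.applied = {}    # what the device has actually put into effect

    def edit_config(self, changes, validate):
        """Transition (A) with rollback-on-error: all of `changes` or none."""
        snapshot = copy.deepcopy(self.candidate)
        try:
            for path, value in changes:
                validate(path, value)  # hypothetical check, raises ValueError
                self.candidate[path] = value
        except ValueError:
            self.candidate = snapshot  # undo the partial edit
            raise

    def commit(self):
        """Transition (B): candidate becomes running (intended) as a whole."""
        self.running = copy.deepcopy(self.candidate)

    def apply_one(self, path):
        """Transition (C): one internal step from intended towards applied."""
        self.applied[path] = self.running[path]
```

Between commit() and the last apply_one() call, running and applied differ,
which is exactly the window the questions below are about.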

Is there some intention in the opstate requirements to add some sort of 
all-or-nothing behavior to transition (C) ?  i.e. if some part of an edit fails 
during the transition from intended->applied we should "rollback" the other 
parts that may have already been applied ?

Would we then remove it all from intended as well ?

I'm not sure how that would work for an async/hybrid (read "real") system.  
We've already done an "ack" back to the client before transition (C) so the 
client may have already sent some additional new config that depends on the 
previous edit.  That would mean that new config isn't valid.

Jason

_______________________________________________
netmod mailing list
netmod@ietf.org<mailto:netmod@ietf.org>
https://www.ietf.org/mailman/listinfo/netmod

