Re: [netmod] netmod-opstate-reqs and error option terms (rollback on error)

Robert Wilton Tue, 22 Dec 2015 12:32:40 -0800

Hi Gert,

On 21/12/2015 20:17, Gert Grammel wrote:

Hi Rob,

Seems we are pretty close with our understanding, so snipping to the only discussion point left, with further comments:


Gert


<snip>

    The real problem is what you called in another email  'hybrid' or
    cheating-synchronous implementations. This leads to a situation
    where the client is made to believe the intended config is
    applied, but the server still didn't apply it yet. Take the case
    where the server runs into trouble after the synchronous-commit
    (which lets the client believe that the intended config is
    applied) and decides to roll-back. From a client perspective this
    would look like a node randomly losing its committed
    configuration. There is tons of code required on the client side
    to cope with that situation. So what was the purpose of
    implementing it that way in the first place - instead of just
    applying an asynchronous implementation?

Yes, I agree that handling rollback could be a problem in this scenario,
and hence I would propose that the behaviour in such a scenario is
explicitly documented as being undefined in whatever solution is agreed
upon :-)

Gert> sadly, we can’t roll time back. So the best option is indeed to document the issue and move on to improve things.


Otherwise assuming that all requests are strictly synchronous or
asynchronous then I think that we should be OK with the following two
rules on the server:

1) All edit-config requests must be processed strictly in order.

Gert> yes, although there are corner cases we need to hash out. E.g. a client may set a leaf to <x> followed by setting it to <y>. Then a server may skip setting it to <x>, because it is overwritten by <y> anyhow.

Yes, it might, but from a correctness point of view it will need to be able to fail the subsequent request to set the value to <y>, and hence set it to <x> instead.
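To make the corner case concrete, here is a minimal Python sketch. It assumes a single leaf with a queue of pending writes; the names apply_coalesced and try_apply are hypothetical and not taken from any draft or implementation. The server tries the newest value first and falls back to older ones only if the newer write fails.

```python
def apply_coalesced(current, values, try_apply):
    """Apply queued writes to one leaf, newest first (hypothetical helper).

    values    -- pending values in arrival order, e.g. ["x", "y"]
    try_apply -- callable returning True if the value could be applied
    Returns (final_value, failed_values).
    """
    failed = []
    for value in reversed(values):
        if try_apply(value):
            # Older writes are skipped: they would have been overwritten anyway.
            return value, failed
        # This write failed, so its request must be reported as failed
        # and the server falls back to the previous write.
        failed.append(value)
    # Every queued write failed: the leaf keeps its old value.
    return current, failed
```

With writes ["x", "y"] and a server that cannot apply "y", the result is the value "x" plus a failure report for "y", which matches what strictly in-order processing would have produced.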

2) You cannot tell a client that a request has been fully applied unless
all previous requests specifying rollback-on-error semantics with any
overlapping nodes with the current request have either been applied or
aborted (i.e. rolled back).

Gert> with “has been fully applied” you meant a state that I tentatively named ‘validated’ in my earlier email? I am a bit sensitive to naming here because those ‘cheating synchronous nodes’ would return ‘applied’ without actually doing so in full. It would be great if we could quickly converge on some naming convention to remove ambiguity.

Yes, it means the same state as your 'validated'. I basically just mean that the synchronous operation has completed (as per the definition of 'Synchronous Configuration Operation' in draft-ietf-netmod-opstate-reqs-01).

Gert> The rule holds true but I don’t see a dependency on “rollback-on-error” semantics. In my view it is also applicable in cases of “continue-on-error” and “stop-on-error”, in the sense that unless an error has been reported, the client still needs to wait for the ‘validated’ state before it can reliably assume the config was applied.

I would see that a "continue-on-error" configuration operation is processed best effort. I.e. even if one or more config nodes failed to be applied, the rest of the configuration contained in a best effort operation would always be applied. As such, subsequent operations would not need to wait for a best effort request to complete before they can complete.
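As an illustration of rule 2 and of the best effort distinction, a minimal Python sketch follows. It assumes the server keeps requests in arrival order and tracks the set of nodes each one touches; Request, its fields and can_report_applied are hypothetical names for this sketch only.

```python
from dataclasses import dataclass


@dataclass
class Request:
    req_id: int
    nodes: frozenset                      # config nodes touched by the request
    error_option: str = "rollback-on-error"
    state: str = "pending"                # "pending", "applied" or "aborted"


def can_report_applied(queue, req):
    """Return True if the server may tell the client that req is fully applied.

    queue holds requests in arrival order (rule 1). An earlier request blocks
    the report only if it uses rollback-on-error, is still pending, and
    overlaps req on at least one node (rule 2).
    """
    for earlier in queue:
        if earlier is req:
            break
        blocks = (earlier.error_option == "rollback-on-error"
                  and earlier.state == "pending"
                  and earlier.nodes & req.nodes)
        if blocks:
            return False
    return req.state == "applied"
```

A pending continue-on-error request never blocks a later one in this sketch, which reflects the best effort reading above; whether stop-on-error should block is left open here.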


Thanks,
Rob


<snip>

From: Robert Wilton <rwil...@cisco.com <mailto:rwil...@cisco.com>>
Date: Monday 21 December 2015 19:55

To: Gert Grammel <ggram...@juniper.net <mailto:ggram...@juniper.net>>, Jason Sterne <jason.ste...@alcatel-lucent.com <mailto:jason.ste...@alcatel-lucent.com>>, "netmod@ietf.org <mailto:netmod@ietf.org>" <netmod@ietf.org <mailto:netmod@ietf.org>>
Subject: Re: [netmod] netmod-opstate-reqs and error option terms (rollback on error)


Hi Gert,

Please see inline ...

On 05/11/2015 03:53, Gert Grammel wrote:

    Jason,

    A synchronous config basically contains two pieces of information
    in the commit:
    1) the intended configuration is valid (i.e. is syntactically
    correct) and
    2) the intended config has been applied
    Any error that would affect the config before the commit could be
    rolled back to the old config and a suitable notification sent to
    the client. After the commit, there is no roll-back.

I agree.

    Similarly for asynchronous, however here the information needs to
    be split into two messages:
    1) a commit that the intended config is valid
    2) another message when the intended config is fully applied
    (let's call this 'validated').
    A rollback can happen before the intended config is fully applied
    i.e. before the 'validated' state is reached.

I agree.
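The difference between the two models can be sketched as a small state machine in Python. This is only an illustration under assumptions: the state and event names ("commit-ok", "applied", "error") are hypothetical, with 'validated' used in the sense proposed above.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    COMMITTED = "committed"      # commit reply: intended config is valid
    VALIDATED = "validated"      # intended config is fully applied
    ROLLED_BACK = "rolled-back"


# Asynchronous: two messages, rollback possible until 'validated'.
ASYNC_TRANSITIONS = {
    (State.PENDING, "commit-ok"): State.COMMITTED,
    (State.PENDING, "error"): State.ROLLED_BACK,
    (State.COMMITTED, "applied"): State.VALIDATED,
    (State.COMMITTED, "error"): State.ROLLED_BACK,
}

# Synchronous: the single commit reply means both valid and applied.
SYNC_TRANSITIONS = {
    (State.PENDING, "commit-ok"): State.VALIDATED,
    (State.PENDING, "error"): State.ROLLED_BACK,
}


def advance(transitions, state, event):
    """Return the next state, or raise ValueError for a disallowed event."""
    try:
        return transitions[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed in state {state.value}") from None
```

In both tables there is no transition out of VALIDATED, so a rollback after that point is rejected; a 'hybrid' server is one that reports the synchronous reply while internally still following the asynchronous table.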


    The real problem is what you called in another email  'hybrid' or
    cheating-synchronous implementations. This leads to a situation
    where the client is made to believe the intended config is
    applied, but the server still didn't apply it yet. Take the case
    where the server runs into trouble after the synchronous-commit
    (which lets the client believe that the intended config is
    applied) and decides to roll-back. From a client perspective this
    would look like a node randomly losing its committed
    configuration. There is tons of code required on the client side
    to cope with that situation. So what was the purpose of
    implementing it that way in the first place - instead of just
    applying an asynchronous implementation?

Yes, I agree that handling rollback could be a problem in this scenario,
and hence I would propose that the behaviour in such a scenario is
explicitly documented as being undefined in whatever solution is agreed
upon :-)

Otherwise assuming that all requests are strictly synchronous or
asynchronous then I think that we should be OK with the following two
rules on the server:
1) All edit-config requests must strictly be processed in order.
2) You cannot tell a client that a request has been fully applied unless
all previous requests specifying rollback-on-error semantics with any
overlapping nodes with the current request have either been applied or
aborted (i.e. rolled back).

Thanks,
Rob



    Gert




    -----Original Message-----
    From: netmod [mailto:netmod-boun...@ietf.org] On Behalf Of Sterne,
    Jason (Jason)
    Sent: 03 November 2015 08:24
    To: netmod@ietf.org <mailto:netmod@ietf.org>
    Subject: [netmod] netmod-opstate-reqs and error option terms
    (rollback on error)

    Hi all,

    The term "rollback on error" (and other error options) has been
    used during these discussions around the opstate requirements.

    That term already has some meaning in RFC6241 (or at least
    rollback-on-error does and that is pretty close) and IMO it
    (today) has nothing to do with "applied" config.  It is an error
    option that has the scope of the contents of a single edit-config
    request and how those contents get applied (all or nothing) to the
    candidate DS (which is neither intended nor applied config) or to
    the running DS (intended) if the <target> is <running/>.

    I think we need to clarify this "all or nothing" concept and how
    it is related to "applied" config.  We may also want to use
    slightly different terminology so we don't get confused with
    today's meaning of rollback-on-error.

    There are a few transitions to consider when editing a config and
    applying it to a device (I'll give the example of using the
    candidate DS):
    (A) config changes   ---> candidate DS (<edit-config>)
    (B) candidate DS  ----> running (intended)  (<commit>)
    (C) intended ----> applied  (internally processed in the device)

    Today rollback-on-error is only applicable to transition (A).

    Transition (B) does have all-or-nothing properties (as described
    in RFC6241) but that isn't related to "rollback-on-error".
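For reference, transition (A) with the rollback-on-error option could be built as follows. This is a minimal Python sketch using only the standard library; the helper name edit_config_rpc and the example payload (with its urn:example:if namespace) are hypothetical, while the element names and the base namespace are those of RFC6241.

```python
import xml.etree.ElementTree as ET

NC = "urn:ietf:params:xml:ns:netconf:base:1.0"  # NETCONF base namespace


def edit_config_rpc(message_id, config_xml, target="candidate"):
    """Build an <edit-config> RPC that asks for rollback-on-error.

    config_xml is the payload placed under <config>; target names the
    datastore element (<candidate/> or <running/>).
    """
    rpc = ET.Element(f"{{{NC}}}rpc", {"message-id": str(message_id)})
    edit = ET.SubElement(rpc, f"{{{NC}}}edit-config")
    tgt = ET.SubElement(edit, f"{{{NC}}}target")
    ET.SubElement(tgt, f"{{{NC}}}{target}")
    # The error option only governs how this one request is applied
    # to the target datastore, i.e. transition (A).
    opt = ET.SubElement(edit, f"{{{NC}}}error-option")
    opt.text = "rollback-on-error"
    cfg = ET.SubElement(edit, f"{{{NC}}}config")
    cfg.append(ET.fromstring(config_xml))
    return ET.tostring(rpc, encoding="unicode")


# Hypothetical payload, for illustration only.
example = edit_config_rpc(
    101, '<interfaces xmlns="urn:example:if"><mtu>1500</mtu></interfaces>')
```

Nothing in this request says anything about transition (C); the option is scoped to the contents of the single edit-config.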

    Is there some intention in the opstate requirements to add some
    sort of all-or-nothing behavior to transition (C) ?  i.e. if some
    part of an edit fails during the transition from intended->applied
    we should "rollback" the other parts that may have already been
    applied ?

    Would we then remove it all from intended as well ?

    I'm not sure how that would work for an async/hybrid (read "real")
    system.  We've already done an "ack" back to the client before
    transition (C) so the client may have already sent some additional
    new config that depends on the previous edit.  That would mean
    that new config isn't valid.

    Jason

    _______________________________________________
    netmod mailing list
    netmod@ietf.org <mailto:netmod@ietf.org>
    https://www.ietf.org/mailman/listinfo/netmod
