Hi Carlos,

> On 04.05.2016 at 17:13, Carlos Pignataro (cpignata)
> <[email protected]> wrote:
>
> Hi, Mirja,
>
>> On May 4, 2016, at 10:41 AM, Mirja Kuehlewind (IETF) <[email protected]>
>> wrote:
>>
>> Hi Carlos,
>>
>> below.
>>
>>> On 04.05.2016 at 16:33, Carlos Pignataro (cpignata)
>>> <[email protected]> wrote:
>>>
>>> Thanks much for the response, Mirja!
>>>
>>> I think we are converging, please see inline.
>>>
>>>> On May 4, 2016, at 10:13 AM, Mirja Kuehlewind (IETF) <[email protected]>
>>>> wrote:
>>>>
>>>> Hi Carlos,
>>>>
>>>> see below.
>>>>
>>>>> On 03.05.2016 at 19:24, Carlos Pignataro (cpignata)
>>>>> <[email protected]> wrote:
>>>>>
>>>>> Hi, Mirja,
>>>>>
>>>>>> On May 3, 2016, at 12:31 PM, Mirja Kuehlewind (IETF)
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> Hi Carlos,
>>>>>>
>>>>>>
>>>>>>> On 03.05.2016 at 15:40, Carlos Pignataro (cpignata)
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi, Mirja,
>>>>>>>
>>>>>>> What is an uncontrolled packet in an IP network, and what entity
>>>>>>> controls controlled ones? :-)
>>>>>>
>>>>>> Questions over questions… :-)
>>>>>>
>>>>>> See below...
>>>>>>
>>>>>>>
>>>>>>> More seriously, please see inline.
>>>>>>>
>>>>>>>> On May 3, 2016, at 5:35 AM, Mirja Kuehlewind <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Mirja Kühlewind has entered the following ballot position for
>>>>>>>> draft-ietf-bfd-seamless-base-09: Discuss
>>>>>>>>
>>>>>>>> When responding, please keep the subject line intact and reply to all
>>>>>>>> email addresses included in the To and CC lines. (Feel free to cut this
>>>>>>>> introductory paragraph, however.)
>>>>>>>>
>>>>>>>>
>>>>>>>> Please refer to
>>>>>>>> https://www.ietf.org/iesg/statement/discuss-criteria.html
>>>>>>>> for more information about IESG DISCUSS and COMMENT positions.
>>>>>>>>
>>>>>>>>
>>>>>>>> The document, along with other ballot positions, can be found here:
>>>>>>>> https://datatracker.ietf.org/doc/draft-ietf-bfd-seamless-base/
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ----------------------------------------------------------------------
>>>>>>>> DISCUSS:
>>>>>>>> ----------------------------------------------------------------------
>>>>>>>>
>>>>>>>> As S-BFD has no initiation process anymore, it is not guaranteed that
>>>>>>>> the receiver/responder actually exists. That means that packets could
>>>>>>>> float (uncontrolled) in the network or even outside of the
>>>>>>>> administrative domain (e.g. due to configuration mistakes). From my
>>>>>>>> point of view this document should recommend/require two things:
>>>>>>>>
>>>>>>>
>>>>>>> We have called out the misconfiguration; however:
>>>>>>>
>>>>>>>> 1) A maximum number of S-BFD packets that are allowed to be sent
>>>>>>>> without getting a response (maybe leading to a local error report).
>>>>>>>>
>>>>>>>
>>>>>>> This can result in a deadlock situation, if an S-BFD Reflector is
>>>>>>> enabled much later. I’m very hesitant to cap the packets sent. We can,
>>>>>>> and I think it is useful to, say that an implementation MAY log an
>>>>>>> error for multiple timeouts.
>>>>>>
>>>>>> Okay, I understand that a hard limit probably does not make sense. An
>>>>>> error log seems definitely useful.
>>>>>
>>>>> OK, that sounds good. See below.
>>>>>
>>>>>> Another proposal for consideration: Currently the draft says an
>>>>>> initiator should only send one packet per second if the target is in
>>>>>> ADMINDOWN state. In this case the state is explicitly announced.
>>>>>> However, if the other end just disappears or was never/not yet there,
>>>>>> one could use an exponential back off instead, starting with a smaller
>>>>>> interval than one second and then increasing it exponentially. Just an
>>>>>> idea...
>>>>>
>>>>> Thanks for the proposal. Please keep in mind, however, that this is a
>>>>> protocol for detecting liveness (and lack of it), so increasing
>>>>> exponentially defeats the purpose.
>>>>>
>>>>> Further, exponential back off may not be the best choice when interacting
>>>>> with routing protocols.
>>>>>
>>>>> What we currently say is:
>>>>>    The criteria for declaring loss of
>>>>>    reachability and the action that would be triggered as a result
>>>>>    are outside the scope of this document.
>>>>>
>>>>> As many of these are implementation choices.
>>>>>
>>>>> But we can add at the end “, and MAY include logging an error.”
>>>>
>>>> Please do so.
>>>
>>> Done.
>>>
>>>>
>>>>>>
>>>>>>>
>>>>>>>> 2) Egress filtering at the administrative border of the domain that
>>>>>>>> uses S-BFD to make sure that no S-BFD packets leave the domain.
>>>>>>>>
>>>>>>>
>>>>>>> This is no different than any other application that uses UDP; a
>>>>>>> misconfigured DNS server will result in the same, and a traceroute is
>>>>>>> also not too different. This seems too onerous a requirement. An
>>>>>>> administrative domain filters at ingress.
>>>>>>
>>>>>> First of all, just because other protocols might have such a problem,
>>>>>> that does not mean it’s okay.
>>>>>
>>>>> I agree with this. I had a different point in mind though: trying to
>>>>> specify this on every UDP application might not be the most effective
>>>>> way. Perhaps there’s a UDP guideline you are uncovering.
>>>>>
>>>>>> However, correct me if I’m wrong, but here the situation seems
>>>>>> slightly different because there is no termination criterion at all,
>>>>>> which means an S-BFD node would send useless data forever (… until
>>>>>> manual change of the config).
>>>>>>
>>>>>
>>>>> But as far as this doc is concerned, let me try to make some
>>>>> clarifications (and a correction).
>>>>>
>>>>> There are termination criteria; the document says:
>>>>>
>>>>>    An SBFDInitiator may be a persistent session on the initiator with a
>>>>>    timer for S-BFD control packet transmissions (stateful
>>>>>    SBFDInitiator). An SBFDInitiator may also be a module, a script or a
>>>>>    tool on the initiator that transmits one or more S-BFD control
>>>>>    packets "when needed" (stateless SBFDInitiator).
>>>>>
>>>>> For the case in which you have a “when needed” SBFDInitiator, the
>>>>> workflow is like a “ping”.
>>>>>
>>>>> For the case in which you have a “persistent” SBFDInitiator, in theory
>>>>> this can run forever (for some value of ever). However, please don’t
>>>>> lose track of why this protocol exists. Having OAM failures and doing
>>>>> nothing about them defeats the purpose of having OAM. Meaning, a red
>>>>> alarm will blink, a horn will honk, and the config state will be changed
>>>>> (manually or by some support system).
>>>>>
>>>>> In other words, I do not think this is such a unique case (insofar as
>>>>> running ad infinitum).
>>>>
>>>> I still believe that the case where you have a misconfiguration and the
>>>> initiator sends packets (forever) but never ever gets a reply (and never
>>>> has seen a reply in the past) is a different case and can be detected and
>>>> handled separately.
>>>>
>>>
>>> Again, I would not qualify this as ‘forever’, but I understand what you
>>> mean.
>>>
>>>>>
>>>>>> I still believe that egress filtering would be more appropriate here
>>>>>> (than ingress) because the domain that is using S-BFD knows about it and
>>>>>> therefore can set up the respective filters and should not spam others
>>>>>> while hoping that filters are in place.
>>>>>>
>>>>>
>>>>> To me, there is a not insignificant operational complexity in requiring
>>>>> the addition of filters throughout, for one particular application not
>>>>> leaking (where the leak does not cause anything special), and when the
>>>>> leak might happen because of a misconfiguration (or bug) but will be
>>>>> detected by the operational support systems. The ROI does not seem to add
>>>>> up.
>>>>
>>>> Okay, the document should probably not require egress filtering in any
>>>> case, but what about saying something like:
>>>>
>>>> “If S-BFD is used it SHOULD be ensured that S-BFD control packets do not
>>>> propagate outside of the administrative domain that uses it.”
>>>>
>>>
>>> We can add an additional explanation of the problem before a statement, but
>>> I do not think that particular SHOULD is actionable. How’s something like:
>>>
>>> Explain that without a handshake, a persistent initiator can send blindly,
>>> to then add “In such a case, operational measures SHOULD be taken to
>>> identify if S-BFD packets are not responded to for an extended period of
>>> time, and remediate the situation”
>>>
>>>> This is not an uncommon thing to specify also for other protocols.
>>>>
>>>
>>> I agree. Frankly, I am happy with either statement, but I think the latter
>>> might be more operationally actionable.
>>>
>>> Preference?
>>
>> I still would prefer something along the lines of what I proposed. I think
>> there could effectively be different actions to be taken here, e.g. egress
>> filtering or measurements to detect failure, as well as no action if for
>> some other reason it can be ensured that such a misconfiguration cannot
>> happen (or is at least very unlikely to happen), e.g. because things are
>> automated and there are additional checks before applying a config.
>>
>
> Perhaps I can add “for an extended period of time” to the first statement (or
> similar wording of your liking)?
>
> Your main concern is the “forever”. Let’s ensure it is not “forever”.
> However, I’m concerned that a single packet out (like a ping to the wrong
> address) would violate “it SHOULD be ensured that S-BFD control packets
> do not propagate outside”.

The concern is not “forever” but putting (unnecessary) load on other
networks (by accident). So I agree, one or a few packets are not a problem.
So yes, adding “for an extended period of time” is fine. We could
also/instead replace the word “ensure” with something else (maybe
“control”…?).

Mirja

>
> Would that work?
>
> Thanks,
>
> -- Carlos.
>
>> The second SHOULD that you proposed is, from my point of view, actually an
>> additional point that I would also be happy to see in the doc.
>>
>> Mirja
>>
>>
>>>
>>> Thanks,
>>>
>>> -- Carlos.
>>>
>>>> Mirja
>>>>
>>>>
>>>>>
>>>>> Does the explanation of the termination criteria help?
>>>>>
>>>>>>>
>>>>>>> Seems to me the logging will alert someone/something to take action,
>>>>>>> and should be enough.
>>>>>>
>>>>>> Logging plus alerts is definitely a good thing.
>>>>>>
>>>>>
>>>>> I agree.
>>>>>
>>>>> Will add “, and MAY include logging an error.” as per above.
>>>>>
>>>>> Do you think we should expand on this somewhere else in the document?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -- Carlos.
>>>>>
>>>>>> Mirja
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thoughts?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> -- Carlos.
>>>
>>
>
