Hi Carlos,

> On 04.05.2016 at 17:13, Carlos Pignataro (cpignata) 
> <[email protected]> wrote:
> 
> Hi, Mirja,
> 
>> On May 4, 2016, at 10:41 AM, Mirja Kuehlewind (IETF) <[email protected]> 
>> wrote:
>> 
>> Hi Carlos,
>> 
>> below.
>> 
>>> On 04.05.2016 at 16:33, Carlos Pignataro (cpignata) 
>>> <[email protected]> wrote:
>>> 
>>> Thanks much for the response, Mirja!
>>> 
>>> I think we are converging, please see inline.
>>> 
>>>> On May 4, 2016, at 10:13 AM, Mirja Kuehlewind (IETF) <[email protected]> 
>>>> wrote:
>>>> 
>>>> Hi Carlos,
>>>> 
>>>> see below.
>>>> 
>>>>> On 03.05.2016 at 19:24, Carlos Pignataro (cpignata) 
>>>>> <[email protected]> wrote:
>>>>> 
>>>>> Hi, Mirja,
>>>>> 
>>>>>> On May 3, 2016, at 12:31 PM, Mirja Kuehlewind (IETF) 
>>>>>> <[email protected]> wrote:
>>>>>> 
>>>>>> Hi Carlos,
>>>>>> 
>>>>>> 
>>>>>>> On 03.05.2016 at 15:40, Carlos Pignataro (cpignata) 
>>>>>>> <[email protected]> wrote:
>>>>>>> 
>>>>>>> Hi, Mirja,
>>>>>>> 
>>>>>>> What is an uncontrolled packet in an IP network, and what entity 
>>>>>>> controls controlled ones? :-)
>>>>>> 
>>>>>> Questions over questions… :-)
>>>>>> 
>>>>>> See below...
>>>>>> 
>>>>>>> 
>>>>>>> More seriously, please see inline.
>>>>>>> 
>>>>>>>> On May 3, 2016, at 5:35 AM, Mirja Kuehlewind <[email protected]> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> Mirja Kühlewind has entered the following ballot position for
>>>>>>>> draft-ietf-bfd-seamless-base-09: Discuss
>>>>>>>> 
>>>>>>>> When responding, please keep the subject line intact and reply to all
>>>>>>>> email addresses included in the To and CC lines. (Feel free to cut this
>>>>>>>> introductory paragraph, however.)
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Please refer to 
>>>>>>>> https://www.ietf.org/iesg/statement/discuss-criteria.html
>>>>>>>> for more information about IESG DISCUSS and COMMENT positions.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> The document, along with other ballot positions, can be found here:
>>>>>>>> https://datatracker.ietf.org/doc/draft-ietf-bfd-seamless-base/
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> ----------------------------------------------------------------------
>>>>>>>> DISCUSS:
>>>>>>>> ----------------------------------------------------------------------
>>>>>>>> 
>>>>>>>> As S-BFD no longer has an initiation process, it is not guaranteed that
>>>>>>>> the receiver/responder actually exists. That means that packets could
>>>>>>>> float (uncontrolled) in the network or even outside of the
>>>>>>>> administrative domain (e.g. due to configuration mistakes). From my
>>>>>>>> point of view this document should recommend/require two things:
>>>>>>>> 
>>>>>>> 
>>>>>>> We have called out the misconfiguration — however:
>>>>>>> 
>>>>>>>> 1) A maximum number of S-BFD packets that are allowed to be sent without
>>>>>>>> getting a response (maybe leading to a local error report).
>>>>>>>> 
>>>>>>> 
>>>>>>> This can result in a deadlock situation if an S-BFD Reflector is 
>>>>>>> enabled much later. I’m very hesitant to cap the packets sent. We can 
>>>>>>> say, and I think it is useful, that an error MAY be logged for multiple 
>>>>>>> timeouts.
>>>>>> 
>>>>>> Okay, I understand that a hard limit probably does not make sense. An 
>>>>>> error log definitely seems useful.
>>>>> 
>>>>> OK, that sounds good. See below.
>>>>> 
>>>>>> Another proposal for consideration: Currently the draft says an 
>>>>>> initiator should only send one packet per second if the target is in 
>>>>>> ADMINDOWN state. In this case the state is explicitly announced. 
>>>>>> However, if the other end just disappears or was never/not yet there, 
>>>>>> one could use an exponential backoff instead, starting with an interval 
>>>>>> smaller than one second and then increasing it exponentially. Just an 
>>>>>> idea...
>>>>> 
>>>>> Thanks for the proposal. Please have in mind however that this is a 
>>>>> protocol for detecting liveness (and lack of it), so increasing 
>>>>> exponentially defeats the purpose.
>>>>> 
>>>>> Further, exponential back off may not be the best choice when interacting 
>>>>> with routing protocols.
>>>>> 
>>>>> What we currently say is:
>>>>>   The criteria for declaring loss of
>>>>>   reachability and the action that would be triggered as a result
>>>>>   are outside the scope of this document.
>>>>> 
>>>>> As many of these are implementation choices.
>>>>> 
>>>>> But we can add at the end “, and MAY include logging an error.“
>>>> 
>>>> Please do so.
>>> 
>>> Done.
>>> 
>>>> 
>>>>>> 
>>>>>>> 
>>>>>>>> 2) Egress filtering at the administrative border of the domain that uses
>>>>>>>> S-BFD to make sure that no S-BFD packets leave the domain.
>>>>>>>> 
>>>>>>> 
>>>>>>> This is no different than any other application that uses UDP; a 
>>>>>>> misconfigured DNS server will result in the same, and a traceroute is 
>>>>>>> also not too different. This seems too onerous of a requirement. An 
>>>>>>> administrative domain filters at ingress.
>>>>>> 
>>>>>> First of all, just because other protocols might have such a problem, 
>>>>>> that does not mean it’s okay.
>>>>> 
>>>>> I agree with this. I had a different point in mind though — trying to 
>>>>> specify this on every UDP application might not be the most effective 
>>>>> way. Perhaps there’s a UDP guideline you are uncovering.
>>>>> 
>>>>>> However, correct me if I’m wrong, but here the situation seems 
>>>>>> slightly different because there is no termination criterion at all, 
>>>>>> which means an S-BFD node would send useless data forever (… until a 
>>>>>> manual change of the config).
>>>>>> 
>>>>> 
>>>>> But as far as this doc is concerned, let me try to make some 
>>>>> clarifications (and a correction).
>>>>> 
>>>>> There are termination criteria — the document says:
>>>>> 
>>>>> An SBFDInitiator may be a persistent session on the initiator with a
>>>>> timer for S-BFD control packet transmissions (stateful
>>>>> SBFDInitiator).  An SBFDInitiator may also be a module, a script or a
>>>>> tool on the initiator that transmits one or more S-BFD control
>>>>> packets "when needed" (stateless SBFDInitiator).
>>>>> 
>>>>> For the case in which you have a “when needed” SBFDInitiator, the 
>>>>> workflow is like a “ping”.
>>>>> 
>>>>> For the case in which you have a “persistent” SBFDInitiator, in theory 
>>>>> this can run forever (for some value of ever). However, please don’t 
>>>>> lose track of why this protocol exists. Having OAM failures and doing 
>>>>> nothing about them defeats the purpose of having OAM. Meaning, a red 
>>>>> alarm will blink, a honk will horn, and the config state will be changed 
>>>>> (manually or by some support system).
>>>>> 
>>>>> In other words, I do not think this is such a unique case (insofar as 
>>>>> running ad infinitum).
>>>> 
>>>> I still believe that the case where you have a misconfiguration and the 
>>>> initiator sends packets (forever) but never gets a reply (and has never 
>>>> seen a reply in the past) is a different case and can be detected and 
>>>> handled separately.
>>>> 
>>> 
>>> Again, I would not qualify this as ‘forever’, but I understand what you 
>>> mean.
>>> 
>>>>> 
>>>>>> I still believe that egress filtering would be more appropriate here 
>>>>>> (than ingress) because the domain that is using S-BFD knows about it and 
>>>>>> can therefore set up the respective filters and should not spam others 
>>>>>> while hoping that filters are in place.
>>>>>> 
>>>>> 
>>>>> To me, there is non-trivial operational complexity in requiring 
>>>>> the addition of filters throughout, for one particular application not 
>>>>> leaking (where the leak does not cause anything special), and when the 
>>>>> leak might happen because of a misconfiguration (or bug) but will be 
>>>>> detected by the operational support systems. The ROI does not seem to add 
>>>>> up.
>>>> 
>>>> Okay, the document should probably not require egress filtering in any 
>>>> case, but what about saying something like:
>>>> 
>>>> „If S-BFD is used it SHOULD be ensured that S-BFD control packets do not 
>>>> propagate outside of the administrative domain that uses it.“
>>>> 
>>> 
>>> We can add an additional explanation of the problem before a statement, but 
>>> I do not think that particular SHOULD is actionable. How’s something like:
>>> 
>>> Explain that without a handshake, a persistent initiator can send blindly, 
>>> and then add “In such a case, operational measures SHOULD be taken to 
>>> identify if S-BFD packets are not responded to for an extended period of 
>>> time, and remediate the situation.”
>>> 
>>>> This is not an uncommon thing to specify also for other protocols.
>>>> 
>>> 
>>> I agree. Frankly, I am happy with either statement, but I think the latter 
>>> might be more operationally actionable.
>>> 
>>> Preference?
>> 
>> I would still prefer something along the lines of what I proposed. I think 
>> different actions could effectively be taken here, e.g. egress 
>> filtering or measurement to detect failure, as well as no action if for some 
>> other reason it can be ensured that such a misconfiguration cannot happen 
>> (or is at least very unlikely to happen), e.g. because things are automated 
>> and there are additional checks before applying a config.
>> 
> 
> Perhaps I can add “for an extended period of time” to the first statement (or 
> similar wording of your liking)?
> 
> Your main concern is the “forever”. Let’s ensure it is not “forever”. 
> However, I’m concerned that a single packet out (like a ping to the wrong 
> address) will be violating “it SHOULD be ensured that S-BFD control packets 
> do not propagate outside”.

The concern is not „forever“ but putting (unnecessary) load on other networks 
(by accident). So I agree, one or a few packets are not a problem. So yes, 
adding “for an extended period of time” is fine. We could also/instead replace 
the word „ensure“ with something else (maybe „control“…?).
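
To make the "extended period of time" idea a bit more concrete, here is a 
rough sketch of the kind of operational check we have been discussing. It is 
not taken from the draft: the class name, the threshold parameter and the log 
message are all hypothetical, and the draft leaves the actual criteria to 
implementations. Note that it only logs an error; it does not cap or stop 
transmissions, in line with the concern about deadlocks.

```python
import logging
from dataclasses import dataclass
from typing import Optional

log = logging.getLogger("sbfd-monitor")  # hypothetical logger name


@dataclass
class UnansweredTracker:
    """Tracks how long a persistent initiator has sent without any reply."""

    max_silence_s: float  # hypothetical threshold for "extended period of time"
    first_unanswered_at: Optional[float] = None
    alerted: bool = False

    def on_sent(self, now: float) -> bool:
        """Record a transmission; return True once the threshold is exceeded."""
        if self.first_unanswered_at is None:
            self.first_unanswered_at = now
        silent_for = now - self.first_unanswered_at
        if silent_for >= self.max_silence_s and not self.alerted:
            # Log once per silent period; sending itself is never blocked.
            self.alerted = True
            log.error("no S-BFD response for %.0f s; check configuration",
                      silent_for)
        return self.alerted

    def on_reply(self) -> None:
        """Any reply ends the silent period and re-arms the alert."""
        self.first_unanswered_at = None
        self.alerted = False
```

An initiator would call on_sent() with a timestamp for each control packet it 
transmits and on_reply() whenever a response arrives; what happens after the 
alert (operator action, config change) stays outside this sketch.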

Mirja



> 
> Would that work?
> 
> Thanks,
> 
> — Carlos.
> 
>> The second SHOULD that you proposed is from my point of view actually an 
>> additional point that I would also be happy to see in the doc.
>> 
>> Mirja
>> 
>> 
>>> 
>>> Thanks,
>>> 
>>> — Carlos.
>>> 
>>>> Mirja
>>>> 
>>>> 
>>>>> 
>>>>> Does the explanation of the termination criteria help?
>>>>> 
>>>>>>> 
>>>>>>> Seems to me the logging will alert someone/something to take action, 
>>>>>>> and should be enough.
>>>>>> 
>>>>>> Logging plus alerts is definitely a good thing.
>>>>>> 
>>>>> 
>>>>> I agree.
>>>>> 
>>>>> Will add “, and MAY include logging an error.” as per above.
>>>>> 
>>>>> Do you think we should expand on this somewhere else in the document?
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> — Carlos.
>>>>> 
>>>>>> Mirja
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> Thoughts?
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> — Carlos.
>>> 
>> 
> 
