Eduard,

On Mon, Aug 2, 2021 at 13:45, Vasilenko Eduard <[email protected]> wrote:

> The key point in this presentation is “This behavior MUST be switched off
> by default”.
>
> It has been shown on slides 7-10 that flow label change on RTO is enabled
> by default for BSD and LINUX.
>
> It needs regulation. It needs a standard RFC. Because it kills Anycast
> otherwise.
>
As I'm partially responsible for the key points of the presentation, I should
stress that our position is a bit different:

   - We have an opportunity for self-healing TCP on top of IPv6, and it
   should be preserved;
   - The Linux defaults should be changed to a safe mode to prevent session
   timeouts;
   - The hash recalculation behavior should be documented.
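
To illustrate why both the self-healing and the anycast concern follow from
the same mechanism, here is a minimal sketch of flow-label-based ECMP
selection. It is an assumption-laden toy, not Linux or router code: the
function names (ecmp_pick, paths_for_labels), the SHA-256 hash, the
documentation-prefix addresses, and the path count are all hypothetical
stand-ins for whatever a real device uses.

```python
import hashlib


def ecmp_pick(src, dst, sport, dport, flow_label, n_paths):
    """Map a flow key to a next-hop index (hypothetical hash, not a vendor's)."""
    key = f"{src}|{dst}|{sport}|{dport}|{flow_label:05x}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths


def paths_for_labels(labels, n_paths=4):
    """Return the next hop chosen for one TCP session under each flow label."""
    return [
        ecmp_pick("2001:db8::1", "2001:db8:ffff::53", 40000, 443, fl, n_paths)
        for fl in labels
    ]


# Same 20-bit flow label on every packet: the session stays on one path.
stable = paths_for_labels([0x12345] * 3)

# Flow label recalculated (e.g. after an RTO): the session can land on
# another path, which may bypass a gray failure or reach a different
# anycast instance.
rehashed = paths_for_labels(range(0x12345, 0x12345 + 64))

print("stable paths:", sorted(set(stable)))
print("paths seen after rehash:", sorted(set(rehashed)))
```

With an unchanged label the selection is deterministic, so the session is
pinned to one next hop; once the label changes, the same 5-tuple may map to
a different next hop, which is the property we want inside a fabric and the
one that hurts stateful anycast.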

I'm not sure what you mean by the term 'regulation'.


> The story of how to use RTO to work around vendors' “silent drop” bugs
> could be a good informational RFC.
>
> Maybe people developing iOAM would pay more attention to this use case.
>
>
>
> IMHO: these are 2 separate drafts.
>
I'm not sure about that. We'll try to provide a -00 draft before the next
IETF meeting; let's see how it progresses.


> Eduard
>
> *From:* Alexander Azimov [mailto:[email protected]]
> *Sent:* Monday, August 2, 2021 1:20 PM
> *To:* Vasilenko Eduard <[email protected]>; Jeff Tantsura <
> [email protected]>
> *Cc:* routing WG <[email protected]>
> *Subject:* Re: Self-healing Networking with Flow Label
>
>
>
> Eduard,
>
>
>
> Please see the quote from slide 28. My suggestion was:
>
>
>
> Client – sends SYN, Server – responds with SYN&ACK
>
>    - In case of SYN_RTO or RTO events Server SHOULD recalculate its TCP
>    socket hash, thus change Flow Label. This behavior MAY be switched on by
>    default;
>    - In case of SYN_RTO or RTO events Client MAY recalculate its TCP
>    socket hash, thus change Flow Label. This behavior MUST be switched off by
>    default;
>
> This looks like a safe default behavior that preserves part of the
> improvements while keeping work with stateful anycast services safe.
>
>
>
> And yes, IMO it's ok to have a knob to enable it in the controlled
> environment. If you ask how to enable it in the presence of internal
> anycast services - there was also a suggestion in the slides: eBPF. It
> gives a good way to make this kind of separation.
>
>
>
> 02.08.2021, 11:48, "Vasilenko Eduard" <[email protected]>:
>
> Hi Jeff,
>
> A situation in which the Control Plane does not understand what the
> Forwarding Plane is doing is a bug.
>
> Yes, RTO in TCP helps to find a work-around for this bug. And yes, Anycast
> is typically absent inside DC – it does not create the problem in the DC
> environment.
>
>
>
> But the same LINUX is used outside DC. RTO Flow Label change here would
> create even more problems if Anycast is present on the traffic path (which
> the client can hardly predict).
>
> Do we need a separate LINUX distribution for DC and a separate distribution
> for other environments?
>
> Or should we rely on the proper non-default configuration for different
> environments? (The admin must not forget to change it.)
>
> What if Anycast would become needed in DC?
>
>
>
> RTO flow label recalculation is mutually exclusive with Anycast on the
> traffic path.
>
> What is more valuable for the public?
>
>
>
> IMHO: It is better to fight this type of bug with iOAM than to cancel
> Anycast.
>
>
>
> IMHO: It is better to have Flow Label recalculation on RTO as “off” by
> default.
>
> A DC admin is better qualified to activate it in a controlled environment
> than every client worldwide is to remember to disable it.
>
>
>
> Eduard
>
> *From:* Jeff Tantsura [mailto:[email protected]]
> *Sent:* Monday, August 2, 2021 6:56 AM
> *To:* Vasilenko Eduard <[email protected]>
> *Cc:* [email protected]; routing WG <[email protected]>
> *Subject:* Re: Self-healing Networking with Flow Label
>
>
>
> Eduard,
>
>
>
> The issue is not the plain link/device failure case: if well implemented,
> fast rehash takes care of updating forwarding within a number of ms. The
> problem is with “gray” failures, when the link in question is UP from a
> routing/forwarding perspective but drops traffic (mostly occasionally and,
> when a HW bug occurs, for flows with distinct attributes).
>
>
>
> In many large DC fabrics, the majority of the traffic is east-west,
> between end-points that aren’t anycast. In such deployments, the solution
> solves issues rather elegantly and without any intervention from the
> operator.
>
> The issues/side effects are well understood and will be documented.
>
>
>
> The best way to receive RTGWG emails is well… to subscribe to RTGWG ;-)
>
> Cheers,
>
> Jeff
>
>
>
>
> On Aug 1, 2021, at 09:47, Vasilenko Eduard <[email protected]>
> wrote:
>
>
>
> Hi  Alexander,
>
>
>
> Have I understood your presentation right?
>
> The client SHOULD change IPv6 flow label after SYN RTO to have a chance to
> be moved to the working path inside DC fabric (if DC fabric supports flow
> label for hash calculation)
>
> But at the same time
>
> The client SHOULD NOT change the IPv6 flow label after SYN RTO to avoid
> being switched to a different TCP proxy engine.
>
>
>
> Looks like a deadlock, especially if both things should happen for the
> same traffic:
>
> it should reach DC fabric
>
> and
>
> it should be hash load-balanced between different TCP proxy engines (or
> applications) inside DC Fabric.
>
>
>
> I see one bad solution (“Disable Flow Label”):
>
> Routers up to TCP proxy engine SHOULD be configured not to use flow label
> (by the way these are all routers on the Internet),
>
> TCP flow engines SHOULD be outside of the DC Fabric (CLOS) – probably in
> front of it.
>
> Routers/Switches inside DC Fabric SHOULD use flow labels.
>
>
>
> I see another bad solution (“Disable Anycast”):
>
> Disable anycast on routers in principle, use only stateful LB.
>
>
>
>
>
> It has been commented in the chat that Anycast is not possible in
> principle for stateful connections. That is too general a statement.
>
> Anycast is just not compatible with Flow Label change. It is not a problem
> for IPv4 anycast even if the connection is stateful (TCP) because the
> 5-tuple used for the hash does not change.
>
> Hence, IPv6 anycast became dead the moment Flow Label change was added to
> LINUX for active TCP sessions.
>
>
>
> Among 3 things:
>
> - Anycast
>
> - Flow Label load balancing (basic Flow Label functionality)
>
> - Flow Label change on the active session, so that the application can be
> more active in searching for a new path
>
> You have to choose which one to kill – all 3 are not compatible with each
> other at the same time.
>
> I vote to disable Flow Label change in LINUX. Then wait until the network
> fixes itself.
>
> We have so many fancy TE tools these days. A broken link or a broken node
> could be excluded from routing within 50ms.
>
>
>
> PS: I am not subscribed to the RTGWG alias, please keep me on a copy of
> this thread.
>
> Best Regards
>
> Eduard Vasilenko
>
> Senior Architect
>
> Europe Standardization & Industry Development Department
>
> Tel: +7(985) 910-1105, +7(916) 800-5506
>
>
>
> _______________________________________________
> rtgwg mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/rtgwg
>
>
>
>
>
> --
>
> Best regards,
>
> Alexander Azimov
>
>


-- 
Best regards,
Alexander Azimov