These types of issues are endemic to TCP anycast, unfortunately. If your 
services depend on anycast, you really want to make sure that your hashing algo 
network-wide uses the TCP/IP five-tuple, and the five-tuple *only*; I’m 
assuming that’s what changing to the “original” load-sharing CEF algorithm 
accomplished in your case.
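
To make the five-tuple point concrete, here is a rough Python sketch (not any vendor's actual hash). The names FiveTuple, pick_next_hop and count_split_flows are made up for illustration, SHA-256 stands in for whatever the ASIC really computes, and the addresses are documentation prefixes. The `extra` argument represents any non-five-tuple input a platform might mix into the hash, such as the ECN bits or an inbound MAC.

```python
import hashlib
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


def pick_next_hop(flow: FiveTuple, n_paths: int, extra: bytes = b"") -> int:
    """Map a flow to one of n_paths equal-cost next hops.

    `extra` is a hypothetical stand-in for non-five-tuple hash inputs
    (ECN/DSCP bits, inbound MAC). Leave it empty for five-tuple-only hashing.
    """
    key = "|".join(map(str, flow)).encode() + b"|" + extra
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths


def count_split_flows(n_flows: int, n_paths: int,
                      syn_extra: bytes, data_extra: bytes) -> int:
    """Count flows whose SYN and later packets pick different next hops."""
    split = 0
    for i in range(n_flows):
        flow = FiveTuple("2001:db8::1", "2001:db8:ffff::53", 6, 1024 + i, 443)
        syn_hop = pick_next_hop(flow, n_paths, syn_extra)
        data_hop = pick_next_hop(flow, n_paths, data_extra)
        if syn_hop != data_hop:
            split += 1
    return split


if __name__ == "__main__":
    # Five-tuple only: header bits outside the tuple never reach the hash.
    print(count_split_flows(1000, 4, b"", b""))
    # ECN bits mixed in: SYN is Not-ECT (0b00), data packets are ECT(0) (0b10).
    print(count_split_flows(1000, 4, b"\x00", b"\x02"))
```

The first call reports zero split flows, since nothing outside the five-tuple is hashed. In the second, every flow whose SYN and data packets hash to different buckets would reach a different anycast instance mid-connection and get a RST; with a well-mixed hash over n paths that is roughly (n-1)/n of the flows.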

My personal war story involved connections to a database server that would 
change DSCP values between the TCP handshake and subsequent packets, resulting 
in RSVP-TE putting the packets on different paths... to a PE router that 
incorporated the inbound destination MAC (read: inbound interface) into the 
hash algo, with no option to disable. Different inbound interfaces = different 
dest MAC = different anycast destinations = immediate SEV1 impact as all our 
database connections broke at once.

The solution*? Reconfigure each L3 interface on the router to the same MAC 
address. :P

-Chris

* An alternate solution involved screaming into the vendor support void, which 
eventually did get results... a year or so later.
** As I’m typing this, I’ve realized that an iptables rule on the server to 
undo the DSCP change would have fixed it; putting that thought on the shelf in 
case it’s needed later.

> On Dec 2, 2025, at 09:12, Tim Durack via NANOG <[email protected]> wrote:
> 
> In case it is useful for anyone else, underlying issue looks to be this:
> 
> Cisco CSCws27022: ECN bits being included as part of ECMP hash on IPv6 TCP
> flows (Workaround: Do not use ECMP)
> 
> Appears to be platform specific, affecting Cisco Catalyst C9K UADP ASIC
> (C9500-32C)
> 
> Another work-around might be to configure "ip cef load-sharing algorithm
> original"
> 
> Tim:>
> 
> On Tue, Mar 25, 2025 at 4:33 PM Tim Durack <[email protected]> wrote:
> 
>> Very helpful, thanks! Will post my own short story once complete...
>> 
>> On Tue, Mar 25, 2025 at 4:24 PM Toke Høiland-Jørgensen <[email protected]>
>> wrote:
>> 
>>> Tim Durack <[email protected]> writes:
>>> 
>>>> Toke,
>>>> 
>>>> Resurrecting an old thread, did you ever write this one up?
>>> 
>>> Hi Tim
>>> 
>>> Thank you for the reminder! No, I never did get around to writing
>>> anything at the time. However, now that you reminded me, I collected my
>>> old notes and posted this:
>>> 
>>> 
>>> https://blog.tohojo.dk/2025/03/ecn-ecmp-and-anycast-a-cocktail-of-broken-connections.html
>>> 
>>>> I believe I have a customer reporting a similar problem with IPv6 TCP ECN
>>>> probably ECMP resulting in RST coming back from anycast services
>>>> (Cloudflare).
>>>> 
>>>> Tricky one to debug, looking for similar reports...
>>> 
>>> Hoping the above is helpful :)
>>> 
>>> -Toke
>>> 
>> 
>> 
>> --
>> Tim:>
>> 
> 
> 
> -- 
> Tim:>
> 
> _______________________________________________
> NANOG mailing list 
> https://lists.nanog.org/archives/list/[email protected]/message/KSVJBYJYTIEXCHF66JBWR3WBLJT7QX5J/

_______________________________________________
NANOG mailing list 
https://lists.nanog.org/archives/list/[email protected]/message/UW4PU2VVAJVUPAMHKVDM4ON5TKMIWZ2W/
