Hi Everyone,

Quick synopsis of our network: multiple POPs, all connected via various third-party 
carriers, who all use differing MTUs that can also "change" unexpectedly 
(unavoidable, unfortunately). Hence we have a few options: disable transport 
path-mtu-discovery and run with the small 536-byte default MSS, try setting a 
larger MTU and hope the inter-POP link MTU doesn't drop below it, or use a 
"dynamic" approach via transport path-mtu-discovery.
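
For anyone following along, the knob in question looks like this (just a sketch - 
the 192.0.2.1 neighbour address and AS 64500 are placeholders, not our real values):

router bgp 64500
 ! dynamic approach: let TCP discover the usable segment size for this peering
 neighbor 192.0.2.1 transport path-mtu-discovery
 ! ...or turn it off, in which case the session falls back to the
 ! conservative 536-byte default MSS
 neighbor 192.0.2.1 transport path-mtu-discovery disable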


We hit an unusual issue last night. Two ME3600s, connected to each other and to an 
ASR1006 (POPD), each peer with two RRs (ASR1Ks). Both had transport 
path-mtu-discovery enabled and had happily peered with the two RRs for ~50 weeks. 
Last night one of our engineers attempted to peer with an MS/360 on "ME01", which 
caused the peering sessions from this ME to the RRs to flap, and I assume transport 
path-mtu-discovery was then triggered to re-calculate the optimum MTU to the RRs. 
(This is one piece of info I'm not sure on: when does transport path-mtu-discovery 
actually calculate the MTU, and what are the triggers for it to re-calculate?)
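
My rough understanding - and I'd welcome correction - is that the discovery runs 
when the TCP session to the peer is (re)established, seeded from the outgoing 
interface MTU / negotiated MSS, and only shrinks afterwards if an ICMP 
"fragmentation needed" comes back. There's also an age-timer on the global command 
(default 10 minutes) that periodically tries to grow the segment size again, though 
I'm not certain how it interacts with the per-neighbour BGP knob:

ip tcp path-mtu-discovery age-timer 30
! or stop it ever re-probing upwards:
ip tcp path-mtu-discovery age-timer infinite

If that's right, a session flap is exactly the sort of event that kicks off a fresh 
discovery, which would fit what we saw.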


Anyway, the value it ended up with was too large, and the BGP sessions to the two 
RRs would establish, stay up for about 3 minutes, fail, then re-establish ~9 seconds 
later. (The 3 minutes lines up with the default 180-second hold time, which would 
fit large UPDATEs being black-holed and the keepalives queued behind them in TCP 
never getting delivered.) Disabling transport path-mtu-discovery "fixed" this.


What concerns/confuses me about transport path-mtu-discovery (and whether it's 
simply too unreliable on the MEs to use in our network) is that on the two MEs, 
both with the same path to the two RRs, transport path-mtu-discovery came up with 
two completely different MTU sizes.


PE01 (when it failed) - 1954 bytes

PE02 (which it is apparently still using) - 2936 bytes


Now, ping tests from both of these MEs show that 2936 bytes is absolutely not 
achievable (where it got this number from, I don't know), but BGP is still up and 
running, and has been for 50 weeks, so it can't actually be using this MTU size, 
can it?
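
One thing that might explain the 50 weeks of uptime: BGP keepalives are tiny (19 
bytes), so a bogus discovered value can sit there unused until something forces 
full-size UPDATE packets - like a re-convergence after a flap. What a session has 
actually settled on can be checked with something like (192.0.2.1 standing in for 
the RR address):

show ip bgp neighbors 192.0.2.1 | include max data segment

which should return the "Datagrams (max data segment is ... bytes)" line for that 
session.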


The maximum I can get through from both PEs is 1552 bytes (output below from both 
MEs, with a note on that number afterwards), so I'm guessing that if PE02 has any 
flap/issue, we'll be hit with something similar to what occurred last night on PE01.


PE01-EQ-SY3-L1H500160-R1803-RU38#ping xxx.xxx.xxx.213 size 1552 df-bit
Type escape sequence to abort.
Sending 5, 1552-byte ICMP Echos to xxx.xxx.xxx.213, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/18/20 ms 
PE01-EQ-SY3-L1H500160-R1803-RU38#ping xxx.xxx.xxx.213 size 1553 df-bit
Type escape sequence to abort.
Sending 5, 1553-byte ICMP Echos to xxx.xxx.xxx.213, timeout is 2 seconds:
Packet sent with the DF bit set
.....
Success rate is 0 percent (0/5)


PE02-EQ-SY3-L1H500160-R1803-RU37#ping xxx.xxx.xxx.213 size 1552 df-bit
Type escape sequence to abort.
Sending 5, 1552-byte ICMP Echos to xxx.xxx.xxx.213, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/19/20 ms 
PE02-EQ-SY3-L1H500160-R1803-RU37#ping xxx.xxx.xxx.213 size 1553 df-bit
Type escape sequence to abort.
Sending 5, 1553-byte ICMP Echos to xxx.xxx.xxx.213, timeout is 2 seconds:
Packet sent with the DF bit set
.....
Success rate is 0 percent (0/5)
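

As an aside on that 1552 figure: IOS ping size is the full IP datagram, so 1552 
clearing with DF set means the path MTU is at least 1552, which looks more like a 
configured MTU somewhere on the inter-POP path than a coincidence. Probably worth 
walking the path and checking, e.g. (GigabitEthernet0/1 is just a placeholder 
interface):

show interfaces GigabitEthernet0/1 | include MTU
show ip interface GigabitEthernet0/1 | include MTU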



Any insight/recommendations would be highly appreciated. As it stands, I don't 
think we have any choice other than to completely remove transport 
path-mtu-discovery and run with the small 536-byte default. Not ideal, but I'm at a 
loss as to how transport path-mtu-discovery actually "works out" the MTU it decides 
on; from my limited experience with it, lol, it appears to pick a number at random 
(I know this can't be the case).
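
One possible middle ground, rather than dropping all the way back to 536: leave 
path-mtu-discovery off for these peers but clamp the MSS to something comfortably 
under the 1552 we've verified end-to-end - assuming the ip tcp mss global command 
is available and behaves the same way on the ME3600 code train (the 1400, AS 64500 
and 192.0.2.1 below are just illustrative values):

ip tcp mss 1400
router bgp 64500
 neighbor 192.0.2.1 transport path-mtu-discovery disable

That way BGP would still get reasonably large segments without trusting the 
discovery.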



Cheers

