JNPR sends the NOTIFICATION because the hold timer expired (meaning no BGP
messages were received from the neighbor). This is correct behavior from the
BGP perspective.
Do you have logs on the CSCO side for the same event? I assume you will see
retransmission of an UPDATE message (not a Keepalive message). This UPDATE
message is dropped somewhere on the path between CSCO and JNPR, and CSCO keeps
retransmitting it. Since the UPDATE message is sent within the Keepalive timer,
no Keepalives are sent.
The most common cause of such drops is a mismatch of MPLS MTU, or an L2 device
with a misconfigured MTU somewhere in between.
You have to figure out (debugs, traceoptions, tcpdumps, whatever) which device
on the path is dropping it.
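
To illustrate the mechanism described above, here is a rough Python sketch (a toy model, not a packet-accurate simulation). It assumes a 90-second hold time and treats the BGP session as an in-order TCP stream: once one packet is too big for the path and keeps being dropped, nothing sent after it reaches the receiver either, so the hold timer runs out. The function name, the message sizes and the MTU values are made up for illustration.

```python
def first_hold_expiry(messages, path_mtu, hold_time=90):
    """Return the time (s) at which the receiver's hold timer expires.

    messages: list of (send_time_s, packet_size_bytes) in send order.
    path_mtu: smallest MTU on the path (assumed value).
    Returns None if the timer never expires within the given trace.
    """
    last_heard = 0  # time the receiver last got a BGP message
    for send_time, size in messages:
        if send_time - last_heard >= hold_time:
            # Nothing arrived for a whole hold time.
            return last_heard + hold_time
        if size > path_mtu:
            # This packet is dropped on every retransmission, and TCP
            # delivers in order, so later messages are stuck behind it.
            return last_heard + hold_time
        last_heard = send_time
    return None


# Keepalives at t=0 and t=30, an oversized UPDATE at t=45, then a keepalive.
trace = [(0, 61), (30, 61), (45, 1502), (60, 61)]
print(first_hold_expiry(trace, path_mtu=1500))  # 120
print(first_hold_expiry(trace, path_mtu=1504))  # None
```

With the smaller path MTU the last message heard is the keepalive at t=30, so the timer expires at t=120 even though the sender is still transmitting.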
//Krzysztof
-----Original Message-----
From: Tima Maryin [mailto:t...@transtelecom.net]
Sent: Thursday, 12 November, 2009 9:07
To: kszarkow...@gmail.com
Cc: juniper-nsp@puck.nether.net
Subject: Re: [j-nsp] MX960 JunOS recommendations
First of all, thanks to all who care :)
I'll reply one by one
Derick Winkworth wrote:
> How about some debugs or traceoptions?
>
>
traceoptions on the last Jun say that the box sometimes doesn't receive bgp
notifications. Haven't tried anything more yet
sth...@nethelp.no wrote:
>
> Make sure that your IP MTU is the same on both Cisco and Juniper sides.
> If you run IS-IS, make sure your CLNS MTU is the same on both Cisco and
> Juniper sides.
IP MTUs are the same, otherwise OSPF would not come up
> People have been running interoperable Cisco and Juniper networks for
> many years. This is not rocket science.
Yeah, we installed several Juns in our network several months ago, and this is
the only problem we couldn't solve, so we rolled back to the previous software
(well, I do not count some rpd crashes on a box with aggregated interfaces, which
we can avoid for now. JTAC eventually said that it's PR439627. I can't read this
hidden PR, but it's supposed to be fixed in 10.x and 9.3Rnextrelease)
Krzysztof Szarkowicz wrote:
With MTUs around 9000 configured on ALL links in the network, there should be no
problem with BGP, since as per RFC 4271, section 4:
"The maximum message size is 4096 octets. All implementations are required to
support this maximum message size."
So even if the MPLS and IP MTUs differ slightly, with sizes around 9000 it
doesn't matter from the BGP perspective.
The only thing that comes to mind is that there are some L2 switches in between
and something is wrong with the MTU on those switches. Worth checking.
There are no switches between them.
It's
7301-geoptic-7606-tengig-t1600-tengig-mx960
It's a lab setup. On the real network it was slightly different, but it's
actually the same from the point of view of this problem
Could you paste from the log the Notification message generated when the BGP
session is torn down?
I didn't find any dependence on interface load or anything else.
It can be 3-4 gig of load (as on the real network) or empty (as in the lab);
the bgp session may drop once per minute or stay up for 30 - 60 mins.
Cisco can be either GSR or 7301, Juniper can be MX or T.
There is nothing special in the logs.
That's the one from the mx960:
Nov 12 06:18:31 mskl04ra rpd[1080]: bgp_hold_timeout:3571: NOTIFICATION sent to
10.136.0.13 (Internal AS 20485): code 4 (Hold Timer Expired Error), Reason:
holdtime expired for 10.136.0.13 (Internal AS 20485), socket buffer sndcc: 0
rcvcc: 0 TCP state: 4, snd_una: 307818660 snd_nxt: 307818660 snd_wnd: 16230
rcv_nxt: 614682635 rcv_adv: 614699019, hold timer 0
Nov 12 06:20:48 mskl04ra rpd[1080]: bgp_hold_timeout:3571: NOTIFICATION sent to
10.136.0.13 (Internal AS 20485): code 4 (Hold Timer Expired Error), Reason:
holdtime expired for 10.136.0.13 (Internal AS 20485), socket buffer sndcc: 0
rcvcc: 0 TCP state: 4, snd_una: 1301747029 snd_nxt: 1301747029 snd_wnd: 16211
rcv_nxt: 732160622 rcv_adv: 732177006, hold timer 0
Nov 12 06:22:53 mskl04ra rpd[1080]: bgp_hold_timeout:3571: NOTIFICATION sent to
10.136.0.13 (Internal AS 20485): code 4 (Hold Timer Expired Error), Reason:
holdtime expired for 10.136.0.13 (Internal AS 20485), socket buffer sndcc: 0
rcvcc: 0 TCP state: 4, snd_una: 2024212109 snd_nxt: 2024212109 snd_wnd: 16230
rcv_nxt: 3950965686 rcv_adv: 3950982070, hold timer 0
Nov 12 06:24:56 mskl04ra rpd[1080]: bgp_hold_timeout:3571: NOTIFICATION sent to
10.136.0.13 (Internal AS 20485): code 4 (Hold Timer Expired Error), Reason:
holdtime expired for 10.136.0.13 (Internal AS 20485), socket buffer sndcc: 0
rcvcc: 0 TCP state: 4, snd_una: 2363347692 snd_nxt: 2363347692 snd_wnd: 16230
rcv_nxt: 1449362513 rcv_adv: 1449378897, hold timer 0
Nov 12 06:59:09 mskl04ra rpd[1080]: bgp_hold_timeout:3571: NOTIFICATION sent to
10.136.0.13 (Internal AS 20485): code 4 (Hold Timer Expired Error), Reason:
holdtime expired for 10.136.0.13 (Internal AS 20485), socket buffer sndcc: 0
rcvcc: 0 TCP state: 4, snd_una: 3704141975 snd_nxt: 3704141975 snd_wnd: 15985
rcv_nxt: 2261397920 rcv_adv: 2261414304, hold timer 0
Nov 12 07:01:19 mskl04ra rpd[1080]: bgp_hold_timeout:3571: NOTIFICATION sent to
10.136.0.13 (Internal AS 20485): code 4 (Hold Timer Expired Error), Reason:
holdtime expired for 10.136.0.13 (Internal AS 20485), socket buffer sndcc: 0
rcvcc: 0 TCP state: 4, snd_una: 1379635866 snd_nxt: 1379635866 snd_wnd: 16230
rcv_nxt: 612357774 rcv_adv: 612374158, hold timer 0
Nov 12 07:04:06 mskl04ra rpd[1080]: bgp_hold_timeout:3571: NOTIFICATION sent to
10.136.0.13 (Internal AS 20485): code 4 (Hold Timer Expired Error), Reason:
holdtime expired for 10.136.0.13 (Internal AS 20485), socket buffer sndcc: 0
rcvcc: 0 TCP state: 4, snd_una: 3377139997 snd_nxt: 3377139997 snd_wnd: 16211
rcv_nxt: 544711184 rcv_adv: 544727568, hold timer 0
Nov 12 07:20:37 mskl04ra rpd[1080]: bgp_hold_timeout:3571: NOTIFICATION sent to
10.136.0.13 (Internal AS 20485): code 4 (Hold Timer Expired Error), Reason:
holdtime expired for 10.136.0.13 (Internal AS 20485), socket buffer sndcc: 0
rcvcc: 0 TCP state: 4, snd_una: 3633708680 snd_nxt: 3633708680 snd_wnd: 16175
rcv_nxt: 1216109422 rcv_adv: 1216125806, hold timer 0
Nov 12 07:22:54 mskl04ra rpd[1080]: bgp_hold_timeout:3571: NOTIFICATION sent to
10.136.0.13 (Internal AS 20485): code 4 (Hold Timer Expired Error), Reason:
holdtime expired for 10.136.0.13 (Internal AS 20485), socket buffer sndcc: 0
rcvcc: 0 TCP state: 4, snd_una: 4034247055 snd_nxt: 4034247055 snd_wnd: 16211
rcv_nxt: 2010186633 rcv_adv: 2010203017, hold timer 0
Nov 12 07:25:00 mskl04ra rpd[1080]: bgp_hold_timeout:3571: NOTIFICATION sent to
10.136.0.13 (Internal AS 20485): code 4 (Hold Timer Expired Error), Reason:
holdtime expired for 10.136.0.13 (Internal AS 20485), socket buffer sndcc: 38
rcvcc: 0 TCP state: 4, snd_una: 3122195868 snd_nxt: 3122195868 snd_wnd: 16268
rcv_nxt: 209999860 rcv_adv: 210016244, hold timer 0
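
For anyone who wants to look at the timing of these drops, here is a small Python sketch that pulls the timestamp and the TCP send counters out of such entries. It assumes the wrapped format shown above and uses the first two entries as sample input; the year 2009 is taken from the mail headers, and the reading of snd_una/snd_nxt (oldest unacknowledged byte / next byte to send, as in BSD-style TCP) is an assumption about what rpd prints.

```python
import re
from datetime import datetime

# First two entries copied from the mx960 log above (wrapped as in the mail).
LOG = """\
Nov 12 06:18:31 mskl04ra rpd[1080]: bgp_hold_timeout:3571: NOTIFICATION sent to
10.136.0.13 (Internal AS 20485): code 4 (Hold Timer Expired Error), Reason:
holdtime expired for 10.136.0.13 (Internal AS 20485), socket buffer sndcc: 0
rcvcc: 0 TCP state: 4, snd_una: 307818660 snd_nxt: 307818660 snd_wnd: 16230
rcv_nxt: 614682635 rcv_adv: 614699019, hold timer 0
Nov 12 06:20:48 mskl04ra rpd[1080]: bgp_hold_timeout:3571: NOTIFICATION sent to
10.136.0.13 (Internal AS 20485): code 4 (Hold Timer Expired Error), Reason:
holdtime expired for 10.136.0.13 (Internal AS 20485), socket buffer sndcc: 0
rcvcc: 0 TCP state: 4, snd_una: 1301747029 snd_nxt: 1301747029 snd_wnd: 16211
rcv_nxt: 732160622 rcv_adv: 732177006, hold timer 0
"""

# One match per entry: timestamp, snd_una, snd_nxt (entries span several lines).
ENTRY = re.compile(
    r"(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ rpd\[\d+\]: bgp_hold_timeout"
    r".*?snd_una: (\d+)\s+snd_nxt: (\d+)",
    re.DOTALL,
)


def parse(log, year=2009):
    """Return a list of (timestamp, unacknowledged_bytes) per log entry."""
    events = []
    for stamp, una, nxt in ENTRY.findall(log):
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        # Sequence numbers are 32-bit, so take the difference modulo 2**32.
        events.append((when, (int(nxt) - int(una)) % 2**32))
    return events


events = parse(LOG)
intervals = [int((b[0] - a[0]).total_seconds())
             for a, b in zip(events, events[1:])]
print(intervals)                 # [137]
print([u for _, u in events])    # [0, 0]
```

If that reading of the counters is right, snd_una equal to snd_nxt would mean everything the Juniper sent had been acknowledged, i.e. small packets still get through in both directions while the session times out.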
Thanks,
Krzysztof
-----Original Message-----
From: Tima Maryin [mailto:t...@transtelecom.net]
Sent: Wednesday, 11 November, 2009 15:12
To: kszarkow...@gmail.com
Cc: juniper-nsp@puck.nether.net
Subject: Re: [j-nsp] MX960 JunOS recommendations
Uhm, I see your point here.
We indeed have a cisco - cisco - Jun - Jun setup.
My cisco interface mtu = ip mtu = mpls mtu = 9000
But I really doubt that a bgp keepalive packet size can come close to that mtu.
On Juniper I set interface mtu = cisco mtu + 14 and it works fine!
And! As you say, it reports a different mpls mtu value:
> show interfaces xe-1/0/0 | match MTU
Link-level type: Ethernet, MTU: 9014, LAN-PHY mode, Speed: 10Gbps, Loopback:
None, Source filtering: Disabled,
Protocol inet, MTU: 9000
Protocol mpls, MTU: 8988
Protocol multiservice, MTU: Unlimited
As far as I understand the "default mpls mtu" term (not sure that I _fully_
understand it though), it seems Juniper assumes 3 labels maximum.
I don't see any reason for a device to drop packets which have 1 or 2 labels and
are bigger than the mpls mtu but still OK from the interface mtu point of view.
By your logic, the device should drop all traffic that matches such criteria, but
it seems to affect only bgp session keepalives, and I didn't see any other problems
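
The numbers in the show output above fit the "3 labels" reading: 9014 - 14 (Ethernet header) - 3 x 4 (label stack entries) = 8988. A small Python sketch of that arithmetic, assuming this is how the default is derived; the function and parameter names are made up for illustration and are not taken from Juniper documentation.

```python
ETHERNET_HEADER_BYTES = 14   # dst MAC + src MAC + ethertype, no VLAN tag
MPLS_LABEL_BYTES = 4         # one label stack entry


def default_mpls_mtu(physical_mtu, reserved_labels=3,
                     l2_overhead=ETHERNET_HEADER_BYTES):
    """Default MPLS MTU under the assumption that room is reserved
    for `reserved_labels` label stack entries."""
    return physical_mtu - l2_overhead - reserved_labels * MPLS_LABEL_BYTES


def inet_mtu(physical_mtu, l2_overhead=ETHERNET_HEADER_BYTES):
    """Protocol inet MTU: physical MTU minus the L2 header."""
    return physical_mtu - l2_overhead


print(inet_mtu(9014))          # 9000, matches "Protocol inet, MTU: 9000"
print(default_mpls_mtu(9014))  # 8988, matches "Protocol mpls, MTU: 8988"
```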
But still, I ran an experiment on a Juniper and a cisco which have a bgp
session between them.
cisco:
#sh mpls interfaces g 0/0 detail | i MTU
MTU = 9000
#sh ip int g 0/0 | i MTU
MTU is 9000 bytes
#sh run int g 0/0
Building configuration...
Current configuration : 212 bytes
!
interface GigabitEthernet0/0
description --- to 7606-2 ---
mtu 9000
ip address 10.3.13.2 255.255.255.0
load-interval 30
duplex full
speed 1000
media-type gbic
no negotiation auto
tag-switching ip
end
If I set mtu 9000 under family mpls and commit it, it looks like this:
> show interfaces xe-1/0/0 | match MTU
Link-level type: Ethernet, MTU: 9014, LAN-PHY mode, Speed: 10Gbps, Loopback:
None, Source filtering: Disabled,
Protocol inet, MTU: 9000
Protocol mpls, MTU: 9000
Flags: Is-Primary, User-MTU
Protocol multiservice, MTU: Unlimited
and the problem still persists
please let me know if you have any other ideas :)
P.S. It's the same effect if I set tag-sw mtu 8988 on cisco and leave it
'default' (=8988) on juniper
Krzysztof Szarkowicz wrote:
Let me guess.
Your network is a multivendor network (JNPR and CSCO) and some transit devices
are CSCO?
CSCO and JNPR use different algorithms to calculate the default MPLS MTU (if the
MPLS MTU is not explicitly configured), which results in a 4-byte difference in
MPLS MTU between the CSCO side and the JNPR side of the same link (the IP MTU is
equal on both ends, so no problem with OSPF).
If on the JNPR side your MPLS MTU is, say, 1500 and on the CSCO side the MPLS MTU
is 1504, then when the CSCO device sends a BGP update packet of size 1502 towards
the JNPR device, this packet is dropped by the JNPR device (as it is too big), and
no TCP ACK is sent back. CSCO keeps resending this 1502-byte packet, and JNPR
keeps dropping it. Thus, after the hold timer expires, the Notification message
is sent.
I assume that with 9.3R3.8 you didn't hit the '1502' packet sizes.
Could you check with some show commands what the MPLS MTU is on both ends of the
link (which is terminated on CSCO on one side and JNPR on the other side)?
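
The 1500/1504 example above can be written down as a tiny Python sketch: any packet larger than the receiver's MPLS MTU but still within the sender's is sent and then silently dropped. The function name is made up, and the values are just the ones from the example, not measured ones.

```python
def blackholed_sizes(sender_mpls_mtu, receiver_mpls_mtu):
    """Packet sizes the sender will transmit but the receiver will drop."""
    # Empty when the receiver accepts at least as much as the sender sends.
    return range(receiver_mpls_mtu + 1, sender_mpls_mtu + 1)


# CSCO side 1504, JNPR side 1500 (values from the example above).
dropped = blackholed_sizes(sender_mpls_mtu=1504, receiver_mpls_mtu=1500)
print(list(dropped))     # [1501, 1502, 1503, 1504]
print(1502 in dropped)   # True
print(61 in dropped)     # False: small packets such as keepalives still pass

# In the other direction nothing is black-holed.
print(list(blackholed_sizes(sender_mpls_mtu=1500, receiver_mpls_mtu=1504)))  # []
```

This is why only the occasional full-size update would be affected while smaller packets on the same link look fine.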
//Krzysztof
-----Original Message-----
From: Tima Maryin [mailto:t...@transtelecom.net]
Sent: Wednesday, 11 November, 2009 9:57
To: kszarkow...@gmail.com
Cc: juniper-nsp@puck.nether.net
Subject: Re: [j-nsp] MX960 JunOS recommendations
What did you mean by "inappropriately configured"?
There are the same mtu settings everywhere and traffic passes quite well.
And the ospf session comes up without problems.
And how come "inappropriately configured IP and MPLS MTU" work well on
9.3R3.8?
Krzysztof Szarkowicz wrote:
It is not a nasty bug, but a problem of inappropriately configured IP and MPLS
MTUs on transit nodes.
//Krzysztof
-----Original Message-----
From: juniper-nsp-boun...@puck.nether.net
[mailto:juniper-nsp-boun...@puck.nether.net] On Behalf Of Tima Maryin
Sent: Wednesday, 11 November, 2009 8:28
To: juniper-nsp@puck.nether.net
Subject: Re: [j-nsp] MX960 JunOS recommendations
9.3R4.4 has a nasty bug which occurs in a setup where you have a bgp session
over a chain of a few routers/links with ospf/ldp.
The bgp session occasionally goes down with a notification timeout, even when
there is no traffic at all and no physical errors.
Rollback to 9.3R3 helps, though.
JTAC still hasn't confirmed it, but it can easily be reproduced in the lab