Re: [j-nsp] [c-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread james list via juniper-nsp
Hi,
I'd like to test with LACP slow; then we can see if the physical interface still
flaps...
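
For reference, a minimal sketch of that test, using the interface names from the
logs in this thread (ae49/et-0/1/5 on the MX, Ethernet1/44 on the Nexus); which
side actually needs the change depends on whose PDUs are being missed:

Junos (MX1):
  set interfaces ae49 aggregated-ether-options lacp periodic slow

NX-OS (NEXUS01), on the member port:
  interface Ethernet1/44
    no lacp rate fast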

Thanks for your support

On Sun, 11 Feb 2024 at 18:02, Saku Ytti wrote:

> On Sun, 11 Feb 2024 at 17:52, james list  wrote:
>
> > - why does the physical interface flap in DC1 if the problem is LACP-related?
>
> 16:39:35.813 Juniper reports LACP timeout (so the problem started at
> 16:39:32; was traffic passing at 32, 33, 34 seconds?)
> 16:39:36.xxx Cisco reports interface down, long after the problem had
> already started
>
> Why Cisco reports the physical interface down, I'm not sure. But clearly
> the problem was already happening before the interface went down, and the first
> log entry is the LACP timeout, which occurs 3 s after the problem starts.
> Perhaps Juniper asserts RFI for some reason? Perhaps Cisco resets the
> physical interface once it is removed from the LACP bundle?
>
> > - why does the same setup in DC2 not report issues?
>
> If this is an LACP-related software issue, there could be a difference we have
> not identified. You need to gather more information, like how ping
> looks throughout this event, particularly before the syslog entries. And if
> ping still works up until the syslog entries, you almost certainly have a software
> issue with LACP injection at the Cisco end, or more likely LACP punt at the Juniper end.
>
> --
>   ++ytti
>


Re: [j-nsp] [c-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread james list via juniper-nsp
Hi,
I have a couple of questions related to your idea:
- why does the physical interface flap in DC1 if the problem is LACP-related?
- why does the same setup in DC2 not report issues?

NEXUS01# sh logging | in  Initia | last 15
2024 Jan 17 22:37:49 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 18 23:54:25 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 19 00:58:13 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 19 07:15:04 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 22 16:03:13 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 25 21:32:29 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 26 18:41:12 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 28 05:07:20 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 29 04:06:52 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 30 03:09:44 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Feb  5 18:13:20 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Feb  6 02:17:25 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Feb  6 22:00:24 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Feb  9 09:29:36 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Feb  9 16:39:36 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)

On Sun, 11 Feb 2024 at 14:36, Saku Ytti wrote:

> On Sun, 11 Feb 2024 at 15:24, james list  wrote:
>
> > While on Juniper when the issue happens I always see:
> >
> > show log messages | last 440 | match LACPD_TIMEOUT
> > Jan 25 21:32:27.948 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5:
> lacp current while timer expired current Receive State: CURRENT
> 
> > Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5:
> lacp current while timer expired current Receive State: CURRENT
>
> OK, so the problem always starts with Juniper seeing 3 seconds without an LACP
> PDU, i.e. missing 3 consecutive LACP PDUs. It would be good to ping
> while this problem is happening, to see whether ping stops 3 s before the
> syslog lines, or at the same time as the syslog lines.
> If ping stops 3 s before, it's a link problem from Cisco to Juniper.
> If ping stops at syslog time (my guess), it's a software problem.
>
> There is unfortunately a lot of bug surface here, both on the inject and on the
> punt path. You could be hitting PR1541056 on the Juniper end. You
> could test for this by removing distributed LACP handling with 'set
> routing-options ppm no-delegate-processing'.
> You could also do a packet capture for LACP on both ends, to try to see
> whether LACP was sent by Cisco and received by the capture, but not by the system.
>
>
> --
>   ++ytti
>
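
For reference, a rough sketch of the three checks suggested above, using names
from this thread (the Linux-host ping and the capture filter syntax are my
assumptions, not something confirmed here):

  # 1. Timestamped ping across the link while waiting for the next event;
  #    -D prints a UNIX timestamp per reply, to line up with the syslog entries.
  ping -D -i 0.2 172.16.6.18 | tee ping_dc1.log

  # 2. Juniper: take LACP handling off the distributed PPM process (PR1541056 test).
  set routing-options ppm no-delegate-processing

  # 3. Capture LACP (slow-protocols ethertype 0x8809) on each end.
  monitor traffic interface et-0/1/5 layer2-headers matching "ether proto 0x8809"    (Junos)
  ethanalyzer local interface inband display-filter lacp limit-captured-frames 0     (NX-OS)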


Re: [j-nsp] [c-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread james list via juniper-nsp
On the Cisco side I see the physical interface go down (Initializing); what does that mean?

While on Juniper when the issue happens I always see:

show log messages | last 440 | match LACPD_TIMEOUT
Jan 25 21:32:27.948 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Jan 26 18:41:12.514 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Jan 28 05:07:20.283 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Jan 29 04:06:51.768 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Jan 30 03:09:43.923 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Feb  5 18:13:20.158 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Feb  6 02:17:23.703 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Feb  6 22:00:23.758 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Feb  9 09:29:35.728 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT

On Sun, 11 Feb 2024 at 14:10, Saku Ytti wrote:

> Hey James,
>
> You shared this off-list, I think it's sufficiently material to share.
>
> 2024 Feb  9 16:39:36 NEXUS1
> %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: Interface
> port-channel101 is down (No operational members)
> 2024 Feb  9 16:39:36 NEXUS1 %ETH_PORT_CHANNEL-5-PORT_DOWN:
> port-channel101: Ethernet1/44 is down
> Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5:
> lacp current while timer expired current Receive State: CURRENT
> Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACP_INTF_DOWN: ae49:
> Interface marked down due to lacp timeout on member et-0/1/5
>
> We can't know the order of events here, due to no subsecond precision
> enabled on Cisco end.
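
For reference, NX-OS can log with sub-second timestamps, which would make this
correlation easier on the next occurrence; a one-line sketch, assuming the
platform/version supports it:

  NEXUS1(config)# logging timestamp milliseconds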
>
> But if the failure started with the interface going down, it would take 3 seconds
> for Juniper to realise the LACP failure. However, we can see that it
> happens in less than 1 s, so we can determine the interface was not
> down first; the first problem was Juniper not receiving 3 consecutive
> LACP PDUs, 1 s apart, prior to noticing any kind of interface-state
> problem.
>
> Is this always the order of events? Does it always happen with Juniper
> noticing problems receiving LACP PDU first?
>
>
> On Sun, 11 Feb 2024 at 14:55, james list via juniper-nsp
>  wrote:
> >
> > Hi
> >
> > 1) the cable has been replaced with a brand-new one; they said that checking an
> > MPO 100 Gbs cable is not that easy
> >
> > 3) no errors reported on either side
> >
> > 2) here the output of cisco and juniper
> >
> > NEXUS1# sh interface eth1/44 transceiver details
> > Ethernet1/44
> > transceiver is present
> > type is QSFP-100G-SR4
> > name is CISCO-INNOLIGHT
> > part number is TR-FC85S-NC3
> > revision is 2C
> > serial number is INL27050TVT
> > nominal bitrate is 25500 MBit/sec
> > Link length supported for 50/125um OM3 fiber is 70 m
> > cisco id is 17
> > cisco extended id number is 220
> > cisco part number is 10-3142-03
> > cisco product id is QSFP-100G-SR4-S
> > cisco version id is V03
> >
> > Lane Number:1 Network Lane
> >SFP Detail Diagnostics Information (internal calibration)
> >
> >
> 
> >                     Current          Alarms                  Warnings
> >   Measurement                   High        Low         High        Low
> >   Temperature     30.51 C      75.00 C     -5.00 C     70.00 C      0.00 C
> >   Voltage          3.28 V       3.63 V      2.97 V      3.46 V      3.13 V
> >   Current          6.40 mA     12.45 mA     3.25 mA    12.45 mA     3.25 mA
> >   Tx Power         0.98 dBm     5.39 dBm  -12.44 dBm    2.39 dBm   -8.41 dBm
> >   Rx Power        -1.60 dBm     5.39 dBm  -14.31 dBm    2.39 dBm  -10.31 dBm
> >   Transmit Fault Count = 0
> >
> >
> 
> >   Note: ++  high-alarm; +  high-warning; --  low-alarm; -  low-warning

Re: [j-nsp] [c-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread james list via juniper-nsp
Hi

1) the cable has been replaced with a brand-new one; they said that checking an
MPO 100 Gbs cable is not that easy

3) no errors reported on either side

2) here is the output from Cisco and Juniper

NEXUS1# sh interface eth1/44 transceiver details
Ethernet1/44
transceiver is present
type is QSFP-100G-SR4
name is CISCO-INNOLIGHT
part number is TR-FC85S-NC3
revision is 2C
serial number is INL27050TVT
nominal bitrate is 25500 MBit/sec
Link length supported for 50/125um OM3 fiber is 70 m
cisco id is 17
cisco extended id number is 220
cisco part number is 10-3142-03
cisco product id is QSFP-100G-SR4-S
cisco version id is V03

Lane Number:1 Network Lane
   SFP Detail Diagnostics Information (internal calibration)


                    Current          Alarms                  Warnings
  Measurement                   High        Low         High        Low
  Temperature     30.51 C      75.00 C     -5.00 C     70.00 C      0.00 C
  Voltage          3.28 V       3.63 V      2.97 V      3.46 V      3.13 V
  Current          6.40 mA     12.45 mA     3.25 mA    12.45 mA     3.25 mA
  Tx Power         0.98 dBm     5.39 dBm  -12.44 dBm    2.39 dBm   -8.41 dBm
  Rx Power        -1.60 dBm     5.39 dBm  -14.31 dBm    2.39 dBm  -10.31 dBm
  Transmit Fault Count = 0


  Note: ++  high-alarm; +  high-warning; --  low-alarm; -  low-warning

Lane Number:2 Network Lane
   SFP Detail Diagnostics Information (internal calibration)


                    Current          Alarms                  Warnings
  Measurement                   High        Low         High        Low
  Temperature     30.51 C      75.00 C     -5.00 C     70.00 C      0.00 C
  Voltage          3.28 V       3.63 V      2.97 V      3.46 V      3.13 V
  Current          6.40 mA     12.45 mA     3.25 mA    12.45 mA     3.25 mA
  Tx Power         0.62 dBm     5.39 dBm  -12.44 dBm    2.39 dBm   -8.41 dBm
  Rx Power        -1.18 dBm     5.39 dBm  -14.31 dBm    2.39 dBm  -10.31 dBm
  Transmit Fault Count = 0


  Note: ++  high-alarm; +  high-warning; --  low-alarm; -  low-warning

Lane Number:3 Network Lane
   SFP Detail Diagnostics Information (internal calibration)


                    Current          Alarms                  Warnings
  Measurement                   High        Low         High        Low
  Temperature     30.51 C      75.00 C     -5.00 C     70.00 C      0.00 C
  Voltage          3.28 V       3.63 V      2.97 V      3.46 V      3.13 V
  Current          6.40 mA     12.45 mA     3.25 mA    12.45 mA     3.25 mA
  Tx Power         0.87 dBm     5.39 dBm  -12.44 dBm    2.39 dBm   -8.41 dBm
  Rx Power         0.01 dBm     5.39 dBm  -14.31 dBm    2.39 dBm  -10.31 dBm
  Transmit Fault Count = 0


  Note: ++  high-alarm; +  high-warning; --  low-alarm; -  low-warning

Lane Number:4 Network Lane
   SFP Detail Diagnostics Information (internal calibration)


                    Current          Alarms                  Warnings
  Measurement                   High        Low         High        Low
  Temperature     30.51 C      75.00 C     -5.00 C     70.00 C      0.00 C
  Voltage          3.28 V       3.63 V      2.97 V      3.46 V      3.13 V
  Current          6.40 mA     12.45 mA     3.25 mA    12.45 mA     3.25 mA
  Tx Power         0.67 dBm     5.39 dBm  -12.44 dBm    2.39 dBm   -8.41 dBm
  Rx Power         0.11 dBm     5.39 dBm  -14.31 dBm    2.39 dBm  -10.31 dBm
  Transmit Fault Count = 0


  Note: ++  high-alarm; +  high-warning; --  low-alarm; -  low-warning



MX1> show interfaces diagnostics optics et-1/0/5
Physical interface: et-1/0/5
Module temperature:  38 degrees C / 100 degrees F
Module voltage:  3.2740 V
Module temperature high alarm :  Off
Module temperature low alarm  :  Off
Module temperature high warning   :  Off
Module temperature low warning:  Off
Module voltage high alarm   

Re: [j-nsp] [c-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread james list via juniper-nsp
Hi
there are no errors on either interface (Cisco or Juniper).

Here are the logs of one event from both sides, plus the config and LACP stats.

Logs of one event, at 16:39:

CISCO
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN:
Interface port-channel101 is down (No operational members)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PARENT_DOWN: Interface
port-channel101.2303 is down (Parent interface is down)
2024 Feb  9 16:39:36 NEXUS1 %BGP-5-ADJCHANGE:  bgp- [xxx] (xxx) neighbor
172.16.6.17 Down - sent:  other configuration change
2024 Feb  9 16:39:36 NEXUS1 %ETH_PORT_CHANNEL-5-FOP_CHANGED:
port-channel101: first operational port changed from Ethernet1/44 to none
2024 Feb  9 16:39:36 NEXUS1 %ETH_PORT_CHANNEL-5-PORT_DOWN: port-channel101:
Ethernet1/44 is down
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel101,bandwidth changed to 10 Kbit
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN:
Interface port-channel101 is down (No operational members)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-SPEED: Interface port-channel101,
operational speed changed to 100 Gbps
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DUPLEX: Interface
port-channel101, operational duplex mode changed to Full
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_RX_FLOW_CONTROL: Interface
port-channel101, operational Receive Flow Control state changed to off
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_TX_FLOW_CONTROL: Interface
port-channel101, operational Transmit Flow Control state changed to off
2024 Feb  9 16:39:39 NEXUS1 %ETH_PORT_CHANNEL-5-PORT_UP: port-channel101:
Ethernet1/44 is up
2024 Feb  9 16:39:39 NEXUS1 %ETH_PORT_CHANNEL-5-FOP_CHANGED:
port-channel101: first operational port changed from none to Ethernet1/44
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel101,bandwidth changed to 1 Kbit
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface Ethernet1/44 is up
in Layer3
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface port-channel101 is
up in Layer3
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface
port-channel101.2303 is up in Layer3
2024 Feb  9 16:39:43 NEXUS1 %BGP-5-ADJCHANGE:  bgp- [xxx] (xxx) neighbor
172.16.6.17 Up


JUNIPER

Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACP_INTF_DOWN: ae49: Interface
marked down due to lacp timeout on member et-0/1/5
Feb  9 16:39:35.819 2024  MX1 kernel: lag_bundlestate_ifd_change: bundle
ae49: bundle IFD minimum bandwidth or minimum links not met, Bandwidth
(Current : Required) 0 : 1000 Number of links (Current : Required)
0 : 1
Feb  9 16:39:35.815 2024  MX1 lacpd[31632]: LACP_INTF_MUX_STATE_CHANGED:
ae49: et-0/1/5: Lacp state changed from COLLECTING_DISTRIBUTING to
ATTACHED, actor port state : |EXP|-|-|-|IN_SYNC|AGG|SHORT|ACT|, partner
port state : |-|-|DIS|COL|OUT_OF_SYNC|AGG|SHORT|ACT|
Feb  9 16:39:35.869 2024  MX1 rpd[31866]: bgp_ifachange_group:10697:
NOTIFICATION sent to 172.16.6.18 (External AS xxx): code 6 (Cease) subcode
6 (Other Configuration Change), Reason: Interface change for the peer-group
Feb  9 16:39:35.909 2024  MX1 mib2d[31909]: SNMP_TRAP_LINK_DOWN: ifIndex
684, ifAdminStatus up(1), ifOperStatus down(2), ifName ae49
Feb  9 16:39:36.083 2024  MX1 lacpd[31632]: LACP_INTF_MUX_STATE_CHANGED:
ae49: et-0/1/5: Lacp state changed from ATTACHED to
COLLECTING_DISTRIBUTING, actor port state :
|-|-|DIS|COL|IN_SYNC|AGG|SHORT|ACT|, partner port state :
|-|-|DIS|COL|IN_SYNC|AGG|SHORT|ACT|
Feb  9 16:39:36.089 2024  MX1 kernel: lag_bundlestate_ifd_change: bundle
ae49 is now Up. uplinks 1 >= min_links 1
Feb  9 16:39:36.089 2024  MX1 kernel: lag_bundlestate_ifd_change: bundle
ae49: bundle IFD minimum bandwidth or minimum links not met, Bandwidth
(Current : Required) 0 : 1000 Number of links (Current : Required)
0 : 1
Feb  9 16:39:36.085 2024  MX1 lacpd[31632]: LACP_INTF_MUX_STATE_CHANGED:
ae49: et-0/1/5: Lacp state changed from COLLECTING_DISTRIBUTING to
ATTACHED, actor port state : |-|-|-|-|IN_SYNC|AGG|SHORT|ACT|, partner port
state : |-|-|-|-|OUT_OF_SYNC|AGG|SHORT|ACT|
Feb  9 16:39:39.095 2024  MX1 lacpd[31632]: LACP_INTF_MUX_STATE_CHANGED:
ae49: et-0/1/5: Lacp state changed from ATTACHED to
COLLECTING_DISTRIBUTING, actor port state :
|-|-|DIS|COL|IN_SYNC|AGG|SHORT|ACT|, partner port state :
|-|-|-|-|IN_SYNC|AGG|SHORT|ACT|
Feb  9 16:39:39.101 2024  MX1 kernel: lag_bundlestate_ifd_change: bundle
ae49 is now Up. uplinks 1 >= min_links 1
Feb  9 16:39:39.109 2024  MX1 mib2d[31909]: SNMP_TRAP_LINK_UP: ifIndex 684,
ifAdminStatus up(1), ifOperStatus up(1), ifName ae49
Feb  9 16:39:41.190 2024  MX1 rpd[31866]: bgp_recv: read from peer
172.16.6.18 (External AS xxx) failed: Unknown error: 48110976


CONFIG:

CISCO

NEXUS1# sh run int por

Re: [j-nsp] [c-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread james list via juniper-nsp
The DC technicians state that the cables are the same in both DCs and are direct
runs, with no patch panel.

Cheers

On Sun, 11 Feb 2024 at 11:20, nivalMcNd d wrote:

> Could it be that DC1 connects the links over an intermediate patch panel and you
> are facing fibre disturbance? That can be ruled out if your interfaces on the DC1
> links do not go down
>
> On Sun, Feb 11, 2024, 21:16 Igor Sukhomlinov via cisco-nsp <
> cisco-...@puck.nether.net> wrote:
>
>> Hi James,
>>
>> Do you happen to run the same software on all nexuses and all MXes?
>> Do the DC1 and DC2 BGP sessions exchange the same amount of routing updates
>> across the links?
>>
>>
>> On Sun, Feb 11, 2024, 21:09 james list via cisco-nsp <
>> cisco-...@puck.nether.net> wrote:
>>
>> > Dear experts
>> > we have a couple of BGP peers over a 100 Gbs interconnection between
>> > Juniper (MX10003) and Cisco (Nexus N9K-C9364C) in two different
>> datacenters
>> > like this:
>> >
>> > DC1
>> > MX1 -- bgp -- NEXUS1
>> > MX2 -- bgp -- NEXUS2
>> >
>> > DC2
>> > MX3 -- bgp -- NEXUS3
>> > MX4 -- bgp -- NEXUS4
>> >
>> > The issue we see is that sporadically (i.e. every 1 to 3 days) we notice BGP
>> > flaps, only in DC1 and on both interconnections (not at the same time). There is
>> > still no traffic, since once we noticed the flaps we blocked the deployment to
>> > production.
>> >
>> > We've already changed the SFPs (we moved the ones from DC2 to DC1 and vice
>> > versa) and the cables on both interconnections at DC1, without any solution.
>> >
>> > SFP we use in both DCs:
>> >
>> > Juniper - QSFP-100G-SR4-T2
>> > Cisco - QSFP-100G-SR4
>> >
>> > over MPO cable OM4.
>> >
>> > Distance is 70 m in DC1 and 80 m in DC2, hence it is shorter where we see the
>> > issue.
>> >
>> > Any idea or suggestion on what to check or do?
>> >
>> > Thanks in advance
>> > Cheers
>> > James
>> > ___
>> > cisco-nsp mailing list  cisco-...@puck.nether.net
>> > https://puck.nether.net/mailman/listinfo/cisco-nsp
>> > archive at http://puck.nether.net/pipermail/cisco-nsp/
>> >
>> ___
>> cisco-nsp mailing list  cisco-...@puck.nether.net
>> https://puck.nether.net/mailman/listinfo/cisco-nsp
>> archive at http://puck.nether.net/pipermail/cisco-nsp/
>>
>


Re: [j-nsp] [c-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread james list via juniper-nsp
Yes, same version.
Currently no traffic exchange is in place, just the BGP peering setup;
no traffic.

On Sun, 11 Feb 2024 at 11:16, Igor Sukhomlinov <dvalinsw...@gmail.com> wrote:

> Hi James,
>
> Do you happen to run the same software on all nexuses and all MXes?
> Do the DC1 and DC2 BGP sessions exchange the same amount of routing updates
> across the links?
>
>
> On Sun, Feb 11, 2024, 21:09 james list via cisco-nsp <
> cisco-...@puck.nether.net> wrote:
>
>> Dear experts
>> we have a couple of BGP peers over a 100 Gbs interconnection between
>> Juniper (MX10003) and Cisco (Nexus N9K-C9364C) in two different
>> datacenters
>> like this:
>>
>> DC1
>> MX1 -- bgp -- NEXUS1
>> MX2 -- bgp -- NEXUS2
>>
>> DC2
>> MX3 -- bgp -- NEXUS3
>> MX4 -- bgp -- NEXUS4
>>
>> The issue we see is that sporadically (i.e. every 1 to 3 days) we notice BGP
>> flaps, only in DC1 and on both interconnections (not at the same time). There is
>> still no traffic, since once we noticed the flaps we blocked the deployment to
>> production.
>>
>> We've already changed the SFPs (we moved the ones from DC2 to DC1 and vice
>> versa) and the cables on both interconnections at DC1, without any solution.
>>
>> SFP we use in both DCs:
>>
>> Juniper - QSFP-100G-SR4-T2
>> Cisco - QSFP-100G-SR4
>>
>> over MPO cable OM4.
>>
>> Distance is 70 m in DC1 and 80 m in DC2, hence it is shorter where we see the issue.
>>
>> Any idea or suggestion on what to check or do?
>>
>> Thanks in advance
>> Cheers
>> James
>> ___
>> cisco-nsp mailing list  cisco-...@puck.nether.net
>> https://puck.nether.net/mailman/listinfo/cisco-nsp
>> archive at http://puck.nether.net/pipermail/cisco-nsp/
>>
>


Re: [j-nsp] [c-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread james list via juniper-nsp
Hi,
One thing I omitted to say is that BGP runs over a LACP bundle with currently just
one 100 Gbs interface.

I see that the issue is triggered on the Cisco side when the Ethernet interface
seems to go into Initializing state:


2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN:
Interface port-channel101 is down (No operational members)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PARENT_DOWN: Interface
port-channel101.2303 is down (Parent interface is down)
2024 Feb  9 16:39:36 NEXUS1 %BGP-5-ADJCHANGE:  bgp- [xxx] (xxx) neighbor
172.16.6.17 Down - sent:  other configuration change
2024 Feb  9 16:39:36 NEXUS1 %ETH_PORT_CHANNEL-5-FOP_CHANGED:
port-channel101: first operational port changed from Ethernet1/44 to none
2024 Feb  9 16:39:36 NEXUS1 %ETH_PORT_CHANNEL-5-PORT_DOWN: port-channel101:
Ethernet1/44 is down
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel101,bandwidth changed to 10 Kbit
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN:
Interface port-channel101 is down (No operational members)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-SPEED: Interface port-channel101,
operational speed changed to 100 Gbps
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DUPLEX: Interface
port-channel101, operational duplex mode changed to Full
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_RX_FLOW_CONTROL: Interface
port-channel101, operational Receive Flow Control state changed to off
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_TX_FLOW_CONTROL: Interface
port-channel101, operational Transmit Flow Control state changed to off
2024 Feb  9 16:39:39 NEXUS1 %ETH_PORT_CHANNEL-5-PORT_UP: port-channel101:
Ethernet1/44 is up
2024 Feb  9 16:39:39 NEXUS1 %ETH_PORT_CHANNEL-5-FOP_CHANGED:
port-channel101: first operational port changed from none to Ethernet1/44
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel101,bandwidth changed to 1 Kbit
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface Ethernet1/44 is up
in Layer3
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface port-channel101 is
up in Layer3
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface
port-channel101.2303 is up in Layer3
2024 Feb  9 16:39:43 NEXUS1 %BGP-5-ADJCHANGE:  bgp- [xxx] (xxx) neighbor
172.16.6.17 Up

Cheers
James

On Sun, 11 Feb 2024 at 11:12, Gert Doering wrote:

> Hi,
>
> On Sun, Feb 11, 2024 at 11:08:29AM +0100, james list via cisco-nsp wrote:
> > we notice BGP flaps
>
> Any particular error message?  BGP flaps can happen due to many different
> reasons, and usually $C is fairly good at logging the reason.
>
> Any interface errors, packet errors, ping packets lost?
>
> "BGP flaps" *can* be related to lower layer issues (so: interface counters,
> error counters, extended pings) or to something unrelated, like "MaxPfx
> exceeded"...
>
> gert
> --
> "If was one thing all people took for granted, was conviction that if you
>  feed honest figures into a computer, honest figures come out. Never
> doubted
>  it myself till I met a computer with a sense of humor."
>  Robert A. Heinlein, The Moon is a Harsh
> Mistress
>
> Gert Doering - Munich, Germany
> g...@greenie.muc.de
>


[j-nsp] Fwd: Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread james list via juniper-nsp
Dear experts
we have a couple of BGP peers over a 100 Gbs interconnection between
Juniper (MX10003) and Cisco (Nexus N9K-C9364C) in two different datacenters
like this:

DC1
MX1 -- bgp -- NEXUS1
MX2 -- bgp -- NEXUS2

DC2
MX3 -- bgp -- NEXUS3
MX4 -- bgp -- NEXUS4

The issue we see is that sporadically (i.e. every 1 to 3 days) we notice BGP
flaps, only in DC1 and on both interconnections (not at the same time). There is
still no traffic, since once we noticed the flaps we blocked the deployment to
production.

We've already changed the SFPs (we moved the ones from DC2 to DC1 and vice versa)
and the cables on both interconnections at DC1, without any solution.

SFP we use in both DCs:

Juniper - QSFP-100G-SR4-T2
Cisco - QSFP-100G-SR4

over MPO cable OM4.

Distance is 70 m in DC1 and 80 m in DC2, hence it is shorter where we see the issue.

Any idea or suggestion on what to check or do?

Thanks in advance
Cheers
James


[j-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread james list via juniper-nsp
Dear experts
we have a couple of BGP peers over a 100 Gbs interconnection between
Juniper (MX10003) and Cisco (Nexus N9K-C9364C) in two different datacenters
like this:

DC1
MX1 -- bgp -- NEXUS1
MX2 -- bgp -- NEXUS2

DC2
MX3 -- bgp -- NEXUS3
MX4 -- bgp -- NEXUS4

The issue we see is that sporadically (i.e. every 1 to 3 days) we notice BGP
flaps, only in DC1 and on both interconnections (not at the same time). There is
still no traffic, since once we noticed the flaps we blocked the deployment to
production.

We've already changed the SFPs (we moved the ones from DC2 to DC1 and vice versa)
and the cables on both interconnections at DC1, without any solution.

SFP we use in both DCs:

Juniper - QSFP-100G-SR4-T2
Cisco - QSFP-100G-SR4

over MPO cable OM4.

Distance is 70 m in DC1 and 80 m in DC2, hence it is shorter where we see the issue.

Any idea or suggestion on what to check or do?

Thanks in advance
Cheers
James


[j-nsp] input errors on QFX5110

2023-08-08 Thread james list via juniper-nsp
Dear experts,
a customer of mine has a Fibre Channel over IP switch connected to a QFX5110 port
and sees a lot of "Input errors" and "oversized frames":

Input errors: Errors: 118467
Oversized frames: 118467

Below is the extensive output.


Do those counters indicate dropped frames, or are they just counters?

The same machine connected to an older QFX5100 (different Junos release) does not
show the same counter.

Since the FCoIP switch manager sees packet loss and retransmissions, and states
that no jumbo frames should be needed, what is your view of the issue?

I also see this:

 Autonegotiation information:
Negotiation status: Incomplete

Thanks in advance
James


QFX5110A> show interfaces ge-0/0/0 extensive
Physical interface: ge-0/0/0, Enabled, Physical link is Up
  Interface index: 652, SNMP ifIndex: 707, Generation: 145
  Description: FCOIP
  Link-level type: Ethernet, MTU: 1514, LAN-PHY mode, Speed: 1000mbps, BPDU
Error: None, Loop Detect PDU Error: None, Ethernet-Switching Error: None,
  MAC-REWRITE Error: None, Loopback: Disabled, Source filtering: Disabled,
Flow control: Disabled, Auto-negotiation: Enabled, Remote fault: Online,
Media type: Fiber,
  IEEE 802.3az Energy Efficient Ethernet: Disabled, Auto-MDIX: Enabled
  Device flags   : Present Running
  Interface flags: SNMP-Traps Internal: 0x4000
  Link flags : None
  CoS queues : 12 supported, 12 maximum usable queues
  Hold-times : Up 0 ms, Down 0 ms
  Current address: bc:7c:6c:23:11:03, Hardware address: bc:7c:6c:23:11:03
  Last flapped   : 2023-03-23 10:54:13 CET (19w5d 00:48 ago)
  Statistics last cleared: 2023-08-08 10:56:52 CEST (01:45:26 ago)
  Traffic statistics:
   Input  bytes  :  40792860463  9378952 bps
   Output bytes  :   4232537714  4886680 bps
   Input  packets: 52096906 4710 pps
   Output packets: 29620532 3841 pps
   IPv6 transit statistics:
   Input  bytes  :0
   Output bytes  :0
   Input  packets:0
   Output packets:0
  Input errors:
Errors: 118467, Drops: 0, Framing errors: 0, Runts: 0, Policed
discards: 0, L3 incompletes: 0, L2 channel errors: 0, L2 mismatch timeouts:
0, FIFO errors: 0,
Resource errors: 0
  Output errors:
Carrier transitions: 0, Errors: 0, Drops: 0, Collisions: 0, Aged
packets: 0, FIFO errors: 0, HS link CRC errors: 0, MTU errors: 0, Resource
errors: 0
  Egress queues: 12 supported, 5 in use
  Queue counters:   Queued packets  Transmitted packets  Dropped
packets
0 29604377 29604377
   0
300
   0
400
   0
7 6835 6835
   0
8 5865 5865
   0
  Queue number: Mapped forwarding classes
0   best-effort
3   fcoe
4   no-loss
7   network-control
8   mcast
  Active alarms  : None
  Active defects : None
  PCS statistics  Seconds
Bit errors 0
Errored blocks 0
  Ethernet FEC statistics  Errors
FEC Corrected Errors0
FEC Uncorrected Errors  0
FEC Corrected Errors Rate   0
FEC Uncorrected Errors Rate 0
  MAC statistics:  Receive Transmit
Total octets   40792860463   4232537714
Total packets 52096906 29620532
Unicast packets   52096695 29607881
Broadcast packets0 1452
Multicast packets  21111199
CRC/Align errors 00
FIFO errors  00
MAC control frames   00
MAC pause frames 00
Oversized frames118467
Jabber frames0
Fragment frames  0
VLAN tagged frames   0
Code violations  0
  MAC Priority Flow Control Statistics:
Priority :  0 00
Priority :  1 00
Priority :  2 00
Priority :  3 00
Priority :  4 00
Priority :  5 00
Priority :  6 00
Priority :  7   

[j-nsp] Fwd: Port-channel not working Juniper vs Cisco

2023-06-11 Thread james list via juniper-nsp
Dear experts
we've an issue in setting up a port-channel between a Juniper EX4400 and a
Cisco Nexus N9K-C93180YC-EX over an SX 1 Gbs link.

We've implemented the following configuration, but on the Juniper side the
interface is flapping while on the Cisco side it remains down.
Light levels seem OK.

Has anyone ever experienced the same ? Any suggestions ?

Thanks in advance for any hint
Kind regards
James

JUNIPER *

> show configuration interfaces ae10 | display set
set interfaces ae10 description "to Cisco leaf"
set interfaces ae10 aggregated-ether-options lacp active
set interfaces ae10 aggregated-ether-options lacp periodic fast
set interfaces ae10 unit 0 family ethernet-switching interface-mode trunk
set interfaces ae10 unit 0 family ethernet-switching vlan members 301

> show configuration interfaces ge-0/2/3 | display set
set interfaces ge-0/2/3 description "to Cisco leaf"
set interfaces ge-0/2/3 ether-options 802.3ad ae10

> show vlans VLAN_301

Routing instance        VLAN name        Tag     Interfaces
default-switch          VLAN_301         301     ae10.0




CISCO  ***

interface Ethernet1/41
  description <[To EX4400]>
  switchport
  switchport mode trunk
  switchport trunk allowed vlan 301
  channel-group 41 mode active
  no shutdown

interface port-channel41
  description <[To EX4400]>
  switchport
  switchport mode trunk
  switchport trunk allowed vlan 301


# sh vlan id 301

VLAN Name      Status    Ports
---- --------- --------- ----------------------------------
301  P2P_xxx   active    Po1, Po41, Eth1/1, Eth1/41

VLAN Type  Vlan-mode
---- ----- ---------
301  enet  CE

Remote SPAN VLAN
----------------
Disabled

Primary  Secondary  Type             Ports
-------  ---------  ---------------  ----------------------
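
For reference, a few commands that show whether LACP PDUs are making it across
and why the bundle stays down (a sketch, using the interface names above):

Junos (EX4400):
  show lacp interfaces ae10
  show lacp statistics interfaces ae10

NX-OS (Nexus):
  show port-channel summary
  show lacp interface ethernet 1/41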


Re: [j-nsp] Cut through and buffer questions

2021-11-19 Thread james list via juniper-nsp
Can you please share the output of:

show class-of-service shared-buffer

on your QFX5100 ?

Cheers
James

On Fri, 19 Nov 2021 at 11:58, Thomas Bellman wrote:

> On 2021-11-19 09:49, james list via juniper-nsp wrote:
>
> > I try to rephrase the question you do not understand: if I enable cut
> > through or change buffer is it traffic affecting ?
>
> On the QFX 5xxx series and (at least) EX 46xx series, the forwarding
> ASIC needs to reset in order to change between store-and-forward and
> cut-through, and traffic will be lost until the reprogramming has been
> completed.  Likewise, changing buffer config will need to reset the
> ASIC.  When I have tested it, this has taken at most one second, though,
> so for many people it will be a non-event.
>
> One thing to remember when using cut-through forwarding, is that packets
> that have suffered bit errors or truncation, so the CRC checksum is
> incorrect, will still be forwarded, and not be discarded by the switch.
> This is usually not a problem in itself, but if you are not aware of it,
> it is easy to get confused when troubleshooting bit errors (you see
> ingress errors on one switch, and think it is the link to the switch
> that has problems, but in reality it might just be that the switch on
> the other end that is forwarding broken packets *it* received).
>
>
> > Regarding the drops here the outputs (15h after clear statistics):
> [...abbreviated...]
> > Queue: 0, Forwarding classes: best-effort
> >   Transmitted:
> > Packets  :6929684309190446 pps
> > Bytes: 4259968408584 761960360 bps
> > Total-dropped packets:  1592 0 pps
> > Total-dropped bytes  :   2244862 0 bps
> [...]> Queue: 7, Forwarding classes: network-control
> >   Transmitted:
> > Packets  : 59234 0 pps
> > Bytes:   4532824   504 bps
> > Total-dropped packets: 0 0 pps
> > Total-dropped bytes  : 0 0 bps
> > Queue: 8, Forwarding classes: mcast
> >   Transmitted:
> > Packets  :   655370488 pps
> > Bytes:5102847425663112 bps
> > Total-dropped packets:   279 0 pps
> > Total-dropped bytes  :423522 0 bps
>
> These drop figures don't immediately strike me as excessive.  We
> certainly have much higher drop percentages, and don't see much
> practical performance problems.  But it will very much depend on
> your application.  The one thing I note is that you have much
> more multicast than we do, and you see drops in that forwarding
> class.
>
> I didn't quite understand if you see actual application or
> performance problems.
>
>
> > show class-of-service shared-buffer
> > Ingress:
> >   Total Buffer :  12480.00 KB
> >   Dedicated Buffer :  2912.81 KB
> >   Shared Buffer:  9567.19 KB
> > Lossless  :  861.05 KB
> > Lossless Headroom :  4305.23 KB
> > Lossy :  4400.91 KB
>
> This looks like a QFX5100 or EX4600, with the 12 Mbyte buffer in the
> Broadcom Trident 2 chip.  You probably want to read this page, to
> understand how to configure buffer allocation for your needs:
>
>
> https://www.juniper.net/documentation/us/en/software/junos/traffic-mgmt-qfx/topics/concept/cos-qfx-series-buffer-configuration-understanding.html
>
> In my network, we only have best-effort traffic, and very little
> multi- or broadcast traffic (basically just ARP/Neighbour discovery,
> DHCP, and OSPF), so we use these settings on our QFX5100 and EX4600
> switches:
>
> forwarding-options {
> cut-through;
> }
> class-of-service {
> /* Max buffers to best-effort traffic, minimum for lossless
> ethernet */
> shared-buffer {
> ingress {
> percent 100;
> buffer-partition lossless { percent 5; }
> buffer-partition lossless-headroom { percent 0; }
> buffer-partition lossy { percent 95; }
> }
> egress {
> percent 100;
> buffer-partition lossless { percent 5; }
> buffer-partition lossy { percent 75; }
> buffer-partition multicast { percent 20; }
> }
> }
> }
>
> (On our QFX5120 switches, I have moved even more buffer space to
> the "lossy" classes.)  But you need to tune to *your* needs; the
> above is for our needs.
>
>
> /Bellman
>
>


Re: [j-nsp] Cut through and buffer questions

2021-11-19 Thread james list via juniper-nsp
Hi
I mentioned both MX and QFX (the output is from a QFX5100) in the first email
because the traffic pattern spans both.
I never mentioned the Internet.

I also understood that cut-through cannot help, but obviously I cannot replace the
QFX switches just because we lose a few UDP packets for a single application. The
idea could be to take shared buffer away from unused queues and add it to the used
ones, correct?
Based on the output provided, what do you suggest changing?
I also understand this kind of change is traffic affecting.

I also need to understand how the shared buffer partitions on the QFX are attached
to the CoS queues.

Thanks, cheers
James
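
For reference, a sketch of how to see the forwarding-class-to-queue mapping asked
about above (the interface name is a placeholder):

  show class-of-service forwarding-class          (forwarding class -> queue number)
  show class-of-service shared-buffer             (ingress/egress buffer partitions, as above)
  show interfaces queue <interface>               (per-queue transmitted/dropped counters)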



On Fri, 19 Nov 2021 at 10:07, Saku Ytti wrote:

> On Fri, 19 Nov 2021 at 10:49, james list  wrote:
>
> Hey,
>
> > I try to rephrase the question you do not understand: if I enable cut
> through or change buffer is it traffic affecting ?
>
> There is no cut-through and I was hoping after reading the previous
> email, you'd understand why it won't help you at all nor is it
> desirable. Changing QoS config may be traffic affecting, but you
> likely do not have the monitoring capability to observe it.
>
> > Regarding the drops here the outputs (15h after clear statistics):
>
> You talked about MX, so I answered from MX perspective. But your
> output is not from MX.
>
> The device you actually show has exceedingly tiny buffers and is not
> meant for Internet WAN use, that is, it does not expect significantly
> higher sender rate to receiver rate with high RTT. It is meant for
> datacenter use, where RTT is low and speed delta is small.
>
> In real life Internet you need larger buffers because of this
> senderPC => internets => receiverPC
>
> Let's imagine an RTT of 200ms and receiver 10GE and sender 100GE.
> - 10Gbps * 200ms = 250MB TCP window needed to fill it
> - as TCP windows grow exponentially in absence of loss, you could have
> 128MB => 250MB growth
> - this means, senderPC might serialise 128MB of data at 100Gbps
> - this 128MB you can only send at 10 Gbps rate, rest you have to take
> into the buffers
> - intentionally pathological example
> - 'easy' fix is, that sender doesn't burst the data at its own rate,
> but does rate estimation and sends window growth at estimated receiver
> rate, this practically removes buffering needs entirely
> - 'easy' fix is not standard behaviour, but some cloudyshops configure
> their linux like this thankfully (Linux already does bandwidth
> estimation, and you can ask 'tc' to shape the session to the estimated
> bandwidth)
>
> What you need to do is change the device to one that is intended for
> the application you have.
> If you can do anything at all, what you can do, is ensure that you
> have minimum amount of QoS classes and those QoS classes have maximum
> amount of buffer. So that unused queues aren't holding empty memory
> while used queue is starving. But even this will have only marginal
> benefit.
>
> Cut-through does nothing, because your egress is congested, you can
> only use cut-through if egress is not congested.
>
>
>
> --
>   ++ytti
>
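
For reference, a minimal sketch of the sender-side pacing described above, assuming
a Linux sender and a placeholder interface name of eth0:

  # fq paces each flow at the rate the TCP stack estimates for it
  tc qdisc replace dev eth0 root fq
  # optionally cap any single flow, e.g. to a 10G receiver
  tc qdisc change dev eth0 root fq maxrate 10gbit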


Re: [j-nsp] Cut through and buffer questions

2021-11-19 Thread james list via juniper-nsp
: 0 0 bps
  Transmitted:
Packets  : 59234 0 pps
Bytes:   4532824   504 bps
Tail-dropped packets : Not Available
RL-dropped packets   : 0 0 pps
RL-dropped bytes : 0 0 bps
Total-dropped packets: 0 0 pps
Total-dropped bytes  : 0 0 bps
Queue: 8, Forwarding classes: mcast
  Queued:
Packets  : 0 0 pps
Bytes: 0 0 bps
  Transmitted:
Packets  :   655370488 pps
Bytes:5102847425663112 bps
Tail-dropped packets : Not Available
RL-dropped packets   : 0 0 pps
RL-dropped bytes : 0 0 bps
Total-dropped packets:   279 0 pps
Total-dropped bytes  :423522 0 bps

{master:0}



show class-of-service shared-buffer
Ingress:
  Total Buffer :  12480.00 KB
  Dedicated Buffer :  2912.81 KB
  Shared Buffer:  9567.19 KB
Lossless  :  861.05 KB
Lossless Headroom :  4305.23 KB
Lossy :  4400.91 KB

  Lossless Headroom Utilization:
  Node Device Total  Used  Free
  0   4305.23 KB 0.00 KB   4305.23 KB

  1   4305.23 KB 0.00 KB   4305.23 KB

  2   4305.23 KB 0.00 KB   4305.23 KB

  3   4305.23 KB 0.00 KB   4305.23 KB

  4   4305.23 KB 0.00 KB   4305.23 KB

Egress:
  Total Buffer :  12480.00 KB
  Dedicated Buffer :  3744.00 KB
  Shared Buffer:  8736.00 KB
Lossless  :  4368.00 KB
Multicast :  1659.84 KB
Lossy :  2708.16 KB

Cheers
James


On Fri, 19 Nov 2021 at 08:36, Saku Ytti wrote:

> On Thu, 18 Nov 2021 at 23:20, james list via juniper-nsp
>  wrote:
>
> > 1) is MX family switching by default in cut through or store and forward
> > mode? I was not able to find a clear information
>
> Store and forward.
>
> > 2) is in general (on MX or QFX) jeopardizing the traffic the action to
> > enable cut through or change buffer allocation?
>
> I don't understand the question.
>
> > I have some output discard on an interface (class best effort) and some
> UDP
> > packets are lost hence I am tuning to find a solution.
>
> I don't think how this relates to cut-through at all.
>
> Cut-through works when ingress can start writing frame to egress while
> still reading it, this is ~never the case in multistage ingress+egress
> buffered devices. And even in devices where it is the case, it only
> works if egress interface happens to be not serialising the packet at
> that time, so the percentage of frames actually getting cut-through
> behaviour in cut-through devices is low in typical applications,
> applications where it is high likely could have been replaced by a
> direct connection.
> Modern multistage devices have low single digit microseconds internal
> latency and nanoseconds jitter.  One microsecond is about 200m in
> fiber, so that gives you the scale of how much distance you can reduce
> by reducing the delay incurred by multistage device.
>
> Now having said that, what actually is the problem. What are 'output
> discards', which counter are you looking at? Have you modified QoS
> configuration, can you share it? By default JNPR is 95% BE, 5% NC
> (unlike Cisco, which is 100% BE, which I think is better default), and
> buffer allocation is same, so if you are actually QoS tail-dropping in
> default JNPR configuration, you're creating massive delays, because
> the buffer allocation is huge and your problem is rather simply that
> you're offering too much to the egress, and best you can do is reduce
> buffer allocation to have lower collateral damage.
>
> --
>   ++ytti
>


[j-nsp] Cut through and buffer questions

2021-11-18 Thread james list via juniper-nsp
Hi all,
Questions:
1) does the MX family switch by default in cut-through or store-and-forward
mode? I was not able to find clear information.

2) in general (on MX or QFX), does enabling cut-through or changing the buffer
allocation jeopardize traffic?

I have some output discards on an interface (class best-effort) and some UDP
packets are lost, hence I am tuning to find a solution.

Thanks in advance for any hint

Cheers
James


Re: [j-nsp] [c-nsp] strange issue

2021-07-29 Thread james list via juniper-nsp
Hi,
I have to ask for the VM routing table and then I will share it.

The VM gateway is the load balancer.

Cheers
James

On Thu, 29 Jul 2021 at 18:17, Ryan Rawdon wrote:

>
> > On Jul 29, 2021, at 11:55 AM, james list  wrote:
> >
> >
> > Internet - Firewall – Lan - Load balancer – Lan – hypervisor- VM
> >
> >
> >
> > It sometimes happens that the VM no longer responds to the load balancer for
> > external IP addresses, until source NAT (SNAT) of the Internet traffic is
> > enabled on the load balancer and then removed again.
> >
>
> Can  you share the routing table of the VM in question?  Specifically/most
> importantly - Is the load balancer being used as the VM’s  default gateway,
> or does the VM use the firewall as its default gateway?  In the latter
> case, I would expect the load balancer to SNAT traffic or act as a full
> layer 7 proxy where a new TCP connection is established from the load
> balancer to the upstream servers.
>
> With a misconfiguration or misaligned design intention here, I could see
> the intended behavior depending on ARP or firewall/connection state
> tracking behavior in the devices.
>
>
> > Something like an action that solicit the VM to refresh the arp.
> >
> >
> >
> > While health check from Loadbalancer to VM in the same LAN subnet never
> > stops to work.
> >
> >
> >
> > Does anybody ever encountered the same problem on VM environments ?
>
> In the absence of evidence otherwise, I suspect your issue is not
> VM-specific.  Do you have examples of physical hosts in the same LAN that
> do not exhibit this problem?  If so, has the routing table (default gateway
> and possibly other persistent static routes) been compared?
>
> >
> > Any idea ?
> >
> >
> >
> > Thanks in advance
> >
> > James
> > ___
> > cisco-nsp mailing list  cisco-...@puck.nether.net
> > https://puck.nether.net/mailman/listinfo/cisco-nsp
> > archive at http://puck.nether.net/pipermail/cisco-nsp/
>
>


[j-nsp] strange issue

2021-07-29 Thread james list via juniper-nsp
Dear experts

My customer has the following very simple infrastructure:



Internet - Firewall - LAN - Load balancer - LAN - Hypervisor - VM



It sometimes happens that the VM no longer responds to the load balancer for
external IP addresses, until source NAT (SNAT) of the Internet traffic is
enabled on the load balancer and then removed again.

It is as if some action is needed to get the VM to refresh its ARP.
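
For reference, a quick way to test that theory from inside the VM (a sketch,
assuming a Linux guest; the address and interface name are placeholders):

  ip neigh show                       # current neighbour/ARP table
  ip route get 203.0.113.10           # which next hop the VM picks for an external address
  ip neigh flush dev eth0             # force the VM to re-learn its neighbours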



Meanwhile, the health check from the load balancer to the VM in the same LAN subnet
never stops working.



Has anybody ever encountered the same problem in VM environments?

Any ideas?



Thanks in advance

James