Re: [j-nsp] interpreting 10Gb interface "PCS statistics" values

2016-10-21 Thread Chuck Anderson
When I was getting these and the Cisco far end was getting tons of
errors, the light levels were good all around.  It ended up being a
fiber problem near the transmitter.  Try shooting the fiber link with
an OTDR to see if you are getting lots of reflections.

On Fri, Oct 21, 2016 at 12:23:18PM -0700, Michael Loftis wrote:
> Was hoping someone who knew more could chime in...but it's measured in
> seconds basically because the PCS (physical coding sublayer) does NOT
> keep detailed statistics...so the "Seconds" value means there were X
> distinct seconds in which an error was flagged in that category...the
> previous response detailing bit vs errored blocks I think is wrong.
> The PCS layer can repair single bit errors, thus a second with one or
> more single bit (but correctable!) errors is a "bit errored second" -
> if it is unable to correct and recover a valid PCS block then you get
> the "errored block" seconds...
> 
> It's not a raw count of the number of those errors, just that it
> occurred in a ~1s window X times.  You can totally get PCS errors
> unplugging an optic or otherwise shutting down the remote end.  You
> can totally get spurious PCS errors from a marginal-ish link that
> shows PLENTY of light (SNR is low or a marginal cable).  In MX
> specifically it *can* in very rare circumstances indicate a problem
> even between the optic and the MIC... most of the time my suggestion
> for PCS errors is clear counters and check in 1h and 24h.  If you get
> a significant number of errored seconds in a 24h period then
> check/clean ends and patches, maybe replace optics.
> 
> Also beware, lots of DOM bugs in various JunOS releases cause the DOM
> values to get stuck, and it can be hard or impossible to check in a
> non-outage-causing way (sometimes you can safely bend the patch cable
> and observe the increase in loss to verify your DOM values aren't
> stuck) - I've had this most commonly in the past on DPC cards but have
> also observed it in MPC cards.  The DOM data is also highly dependent
> upon the optic itself and there's a LOT of buggy stuff out there so
> it's not all Juniper's fault there.
> 
> 
> On Fri, Oct 21, 2016 at 11:07 AM, David B Funk
>  wrote:
> > Thanks guys but this isn't what I was asking.
> >
> > The optical power is similar (within a few tenths of a dBm) at my end, down
> > by 3 dBm at the far end of the link that is having issues (-6.23 dBm as
> > opposed to -3.73 dBm) but not enough to explain what I'm seeing.
> >
> > The big question I have is: What does "30 Seconds" mean for an attribute
> > that by description of the docs is supposed to be number of PCS blocks with
> > invalid Sync headers?
> > Particularly when the guy on the Cisco at the other end says his error
> > counters are going up like crazy (and packets are being dropped) while the
> > stats my end stays constant at "30 Seconds".
> > What does that mean?
> >
> > The particularly frustrating thing is that data streams are dropping packets
> > (EG iperf3 showing retries and seriously degraded performance) but none of
> > the interface stats are showing any values that indicate an issue other than
> > that "30 Seconds".
> >
> > Can anybody tell me what "30 Seconds" means (in the context of an error
> > counter)?
> >
> >
> >
> >
> > On Fri, 21 Oct 2016, Christopher Costa wrote:
> >
> >> Here's my notes from a jtac review about these a couple years ago:
> >>
> >>
> >>
> >> [pcs] encoding is continually transmitting to keep the line in sync. The
> >> PCS layer is directly below the MAC layer so for MX,
> >> it’s on the MIC. PCS errors can be caused by anything MIC or lower, i.e.
> >> transceiver, fiber, line equipment, etc.
> >>
> >>
> >>
> >>  PCS functionality:
> >>  ===
> >>  IEEE 802.3ae 10GbE interfaces use a 64B/66B encoder/decoder in the
> >> PHY-PCS (Physical Coding Sub layer) to allow reasonable
> >> clock recovery and facilitate alignment of the data stream at the
> >> receiver.
> >>  As the scheme name suggests, 64 bits of data on the MAC layer are
> >> transmitted as a 66-bit code block on the PHY layer, which
> >> realizes easier clock/timing synchronization. A 66-bit code block contains
> >> a 2-bit Sync. Header + 8 octets data/control field.
> >>   If the Sync. header is '01', the 8 octets are entirely data.
> >>  If the Sync. header is '10', an 8-bit Type field follows, plus 56 bits of
> >> data/control field.
> >>   The 8 octets data/control field is scrambled by using a self-synchronous
> >> scrambler to achieve complete DC-balance on the
> >> serial line.
> >>  PCS statistics displays PCS fault conditions by checking valid Sync.
> >> headers received with every 66 bits interval, so that we
> >> can monitor 10Gbps high speed transmission line quality.
> >>   If the 64B/66B receiver does not detect the 2-bit Sync.
> >>  Header with regular 66-bit interval and it estimates the high BER (Bit
> >> Error Rate of >10^-4), PCS statistics will report a
> >> 

Re: [j-nsp] interpreting 10Gb interface "PCS statistics" values

2016-10-21 Thread Michael Loftis
Was hoping someone who knew more could chime in...but it's measured in
seconds basically because the PCS (physical coding sublayer) does NOT
keep detailed statistics...so the "Seconds" value means there were X
distinct seconds in which an error was flagged in that category...the
previous response detailing bit vs errored blocks I think is wrong.
The PCS layer can repair single bit errors, thus a second with one or
more single bit (but correctable!) errors is a "bit errored second" -
if it is unable to correct and recover a valid PCS block then you get
the "errored block" seconds...

It's not a raw count of the number of those errors, just that it
occurred in a ~1s window X times.  You can totally get PCS errors
unplugging an optic or otherwise shutting down the remote end.  You
can totally get spurious PCS errors from a marginal-ish link that
shows PLENTY of light (SNR is low or a marginal cable).  In MX
specifically it *can* in very rare circumstances indicate a problem
even between the optic and the MIC... most of the time my suggestion
for PCS errors is clear counters and check in 1h and 24h.  If you get
a significant number of errored seconds in a 24h period then
check/clean ends and patches, maybe replace optics.
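
To make the "errored seconds" idea concrete, here is a minimal sketch of
that kind of counter. It is only a model of the behaviour described above,
not how Junos implements it; the function name and the timestamps are made
up for illustration.

```python
import math
from typing import Iterable


def errored_seconds(error_times: Iterable[float]) -> int:
    """Count the distinct one-second windows that saw at least one error.

    error_times holds hypothetical error timestamps in seconds. The result
    says nothing about how many errors fell inside each window.
    """
    return len({math.floor(t) for t in error_times})


# Three isolated errors, each in a different second.
light = [0.1, 5.5, 30.0]

# The same three seconds, but with about a thousand errors packed into them.
heavy = [0.1, 0.2, 0.9, 5.5, 5.6, 30.0] + [30.0 + i / 1000 for i in range(1, 1000)]

print(len(light), errored_seconds(light))
print(len(heavy), errored_seconds(heavy))
```

Both lists give the same "Seconds" value even though the raw error counts
are very different, which is why a seconds-based counter can look quiet
next to a per-error counter on the far end.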

Also beware, lots of DOM bugs in various JunOS releases cause the DOM
values to get stuck, and it can be hard or impossible to check in a
non-outage-causing way (sometimes you can safely bend the patch cable
and observe the increase in loss to verify your DOM values aren't
stuck) - I've had this most commonly in the past on DPC cards but have
also observed it in MPC cards.  The DOM data is also highly dependent
upon the optic itself and there's a LOT of buggy stuff out there so
it's not all Juniper's fault there.


On Fri, Oct 21, 2016 at 11:07 AM, David B Funk
 wrote:
> Thanks guys but this isn't what I was asking.
>
> The optical power is similar (within a few tenths of a dBm) at my end, down
> by 3 dBm at the far end of the link that is having issues (-6.23 dBm as
> opposed to -3.73 dBm) but not enough to explain what I'm seeing.
>
> The big question I have is: What does "30 Seconds" mean for an attribute
> that by description of the docs is supposed to be number of PCS blocks with
> invalid Sync headers?
> Particularly when the guy on the Cisco at the other end says his error
> counters are going up like crazy (and packets are being dropped) while the
> stats my end stays constant at "30 Seconds".
> What does that mean?
>
> The particularly frustrating thing is that data streams are dropping packets
> (EG iperf3 showing retries and seriously degraded performance) but none of
> the interface stats are showing any values that indicate an issue other than
> that "30 Seconds".
>
> Can anybody tell me what "30 Seconds" means (in the context of an error
> counter)?
>
>
>
>
> On Fri, 21 Oct 2016, Christopher Costa wrote:
>
>> Here's my notes from a jtac review about these a couple years ago:
>>
>>
>>
>> [pcs] encoding is continually transmitting to keep the line in sync. The
>> PCS layer is directly below the MAC layer so for MX,
>> it’s on the MIC. PCS errors can be caused by anything MIC or lower, i.e.
>> transceiver, fiber, line equipment, etc.
>>
>>
>>
>>  PCS functionality:
>>  ===
>>  IEEE 802.3ae 10GbE interfaces use a 64B/66B encoder/decoder in the
>> PHY-PCS (Physical Coding Sub layer) to allow reasonable
>> clock recovery and facilitate alignment of the data stream at the
>> receiver.
>>  As the scheme name suggests, 64 bits of data on the MAC layer are
>> transmitted as a 66-bit code block on the PHY layer, which
>> realizes easier clock/timing synchronization. A 66-bit code block contains
>> a 2-bit Sync. Header + 8 octets data/control field.
>>   If the Sync. header is '01', the 8 octets are entirely data.
>>  If the Sync. header is '10', an 8-bit Type field follows, plus 56 bits of
>> data/control field.
>>   The 8 octets data/control field is scrambled by using a self-synchronous
>> scrambler to achieve complete DC-balance on the
>> serial line.
>>  PCS statistics displays PCS fault conditions by checking valid Sync.
>> headers received with every 66 bits interval, so that we
>> can monitor 10Gbps high speed transmission line quality.
>>   If the 64B/66B receiver does not detect the 2-bit Sync.
>>  Header with regular 66-bit interval and it estimates the high BER (Bit
>> Error Rate of >10^-4), PCS statistics will report a
>> problem.
>>   PCS statistics :
>>  
>>  - "Bit errors" indicates the number of PCS blocks with invalid Sync
>> headers.
>>  - "Errored blocks" indicates the number of PCS blocks with a valid Sync.
>> header but invalid block format.
>>
>>
>> On Fri, Oct 21, 2016 at 9:37 AM, Michael Carey  wrote:
>>   David,
>>
>>   When I've seen PCS statistical errors before, it pointed to either a
>>   failing optic that needed replaced in our MX or a drastic change in
>> optical

Re: [j-nsp] interpreting 10Gb interface "PCS statistics" values

2016-10-21 Thread Wojciech Janiszewski
Hi David,

I'd say it's a number of seconds with high error rate or errored blocks. In
other words it doesn't show the number of errors but the number of seconds
during which errors were detected.

Regards,
Wojciech

21.10.2016 20:08 "David B Funk"  napisał(a):

> Thanks guys but this isn't what I was asking.
>
> The optical power is similar (within a few tenths of a dBm) at my end,
> down by 3 dBm at the far end of the link that is having issues (-6.23 dBm
> as opposed to -3.73 dBm) but not enough to explain what I'm seeing.
>
> The big question I have is: What does "30 Seconds" mean for an attribute
> that by description of the docs is supposed to be number of PCS blocks with
> invalid Sync headers?
> Particularly when the guy on the Cisco at the other end says his error
> counters are going up like crazy (and packets are being dropped) while the
> stats my end stays constant at "30 Seconds".
> What does that mean?
>
> The particularly frustrating thing is that data streams are dropping
> packets (EG iperf3 showing retries and seriously degraded performance) but
> none of the interface stats are showing any values that indicate an issue
> other than that "30 Seconds".
>
> Can anybody tell me what "30 Seconds" means (in the context of an error
> counter)?
>
>
>
> On Fri, 21 Oct 2016, Christopher Costa wrote:
>
> Here's my notes from a jtac review about these a couple years ago:
>>
>>
>>
>> [pcs] encoding is continually transmitting to keep the line in sync. The
>> PCS layer is directly below the MAC layer so for MX,
>> it’s on the MIC. PCS errors can be caused by anything MIC or lower, i.e.
>> transceiver, fiber, line equipment, etc.
>>
>>
>>
>>  PCS functionality:
>>  ===
>>  IEEE 802.3ae 10GbE interfaces use a 64B/66B encoder/decoder in the
>> PHY-PCS (Physical Coding Sub layer) to allow reasonable
>> clock recovery and facilitate alignment of the data stream at the
>> receiver.
>>  As the scheme name suggests, 64 bits of data on the MAC layer are
>> transmitted as a 66-bit code block on the PHY layer, which
>> realizes easier clock/timing synchronization. A 66-bit code block
>> contains a 2-bit Sync. Header + 8 octets data/control field.
>>   If the Sync. header is '01', the 8 octets are entirely data.
>>  If the Sync. header is '10', an 8-bit Type field follows, plus 56 bits
>> of data/control field.
>>   The 8 octets data/control field is scrambled by using a
>> self-synchronous scrambler to achieve complete DC-balance on the
>> serial line.
>>  PCS statistics displays PCS fault conditions by checking valid Sync.
>> headers received with every 66 bits interval, so that we
>> can monitor 10Gbps high speed transmission line quality.
>>   If the 64B/66B receiver does not detect the 2-bit Sync.
>>  Header with regular 66-bit interval and it estimates the high BER (Bit
>> Error Rate of >10^-4), PCS statistics will report a
>> problem.
>>   PCS statistics :
>>  
>>  - "Bit errors" indicates the number of PCS blocks with invalid Sync
>> headers.
>>  - "Errored blocks" indicates the number of PCS blocks with a valid Sync.
>> header but invalid block format.
>>
>>
>> On Fri, Oct 21, 2016 at 9:37 AM, Michael Carey  wrote:
>>   David,
>>
>>   When I've seen PCS statistical errors before, it pointed to either a
>>   failing optic that needed replaced in our MX or a drastic change in
>> optical
>>   light levels caused by an OSP fiber issue.  How do your "show
>> interface
>>   diagnostic optic" levels look?
>>
>>   On Wed, Oct 19, 2016 at 7:40 PM, David B Funk <
>> dbf...@engineering.uiowa.edu>
>>   wrote:
>>
>>   > I've got a couple of 10Gig-eth interfaces (xe- on MX480) of which
>> I'm
>>   > trying to interpret the "PCS statistics" values.
>>   >
>>   > One of them is pretty steady at:
>>   >
>>   >   PCS statistics                      Seconds
>>   >     Bit errors                             4
>>   >     Errored blocks                         4
>>   >
>>   > The other one seems to vary with the values ranging from 10 to 70.
>>   > EG:
>>   >
>>   >   PCS statistics                      Seconds
>>   >     Bit errors                            61
>>   >     Errored blocks                        69
>>   >
>>   > The second interface will trigger a number of error conditions at the
>>   > other end which terminates in a Cisco router without showing any error
>>   > conditions at my end (EG BPDU Error: None, MAC-REWRITE Error: None,
>>   > CRC/Align errors 0, FIFO errors 0, etc..) During some of these times I'll
>>   > see significant packet loss and others see minimal problems.
>>   >
>>   > According to Juniper docs the PCS statistics should mean:
>>   >
>>   >  PCS statistics
>>   >   (10-Gigabit Ethernet interfaces) Displays Physical Coding
>> Sublayer (PCS)
>>   > fault
>>

Re: [j-nsp] Load balancing errors on 15.1R4

2016-10-21 Thread Dragan Jovicic
That does look bad.
Please keep us updated on your case progress if possible.

Thank you.

Best,
Dragan

On Fri, Oct 21, 2016 at 7:32 PM, Luis Balbinot 
wrote:

> Case is open, but nothing yet. Could be related to PR1164101, but that's
> not Juniper's official position.
>
> Next week we are upgrading a lab MX480 to 16.2 and see if it persists.
>
> Luis
>
> On Oct 21, 2016 14:54, "Dragan Jovicic"  wrote:
>
>> Hi,
>>
>> Do you have any update on this? Have you opened a case for this maybe?
>>
>> Best
>> Dragan
>>
>> On Tue, Oct 18, 2016 at 4:14 PM, Luis Balbinot 
>> wrote:
>>
>>> Hey.
>>>
>>> Is anyone else having issues with load-balancing on 15.1R4? I'm
>>> getting these FPC errors in multiple boxes:
>>>
>>> fpc0 LUCHIP(3) RMC 2  Uninitialized EDMEM[0x3ce333] Read
>>> (0x6db6db6d6db6db6d)
>>> fpc0 LUCHIP(3) PPE_2 Errors sync xtxn error
>>> fpc0 LUCHIP(3) PPE_15 Errors sync xtxn error
>>> fpc0 PPE Sync XTXN Err Trap:  Count 23064982, PC 376, 0x0376:
>>> call_table_launch_nh
>>>
>>> They are all MX960s with MPC 16x10G cards. This impacts on traffic to
>>> random destinations (route is OK in FIB and RIB but traffic is
>>> blackholed). If I disable per-packet lb everything works perfectly.
>>>
>>> This issue appears when we have flaps that force route
>>> recalculation/installation.
>>>
>>> I have a similar box running 12.3R3.4 that is just fine.
>>>
>>> Luis
>>> ___
>>> juniper-nsp mailing list juniper-nsp@puck.nether.net
>>> https://puck.nether.net/mailman/listinfo/juniper-nsp
>>>
>>
>>


Re: [j-nsp] interpreting 10Gb interface "PCS statistics" values

2016-10-21 Thread David B Funk

Thanks guys but this isn't what I was asking.

The optical power is similar (within a few tenths of a dBm) at my end, down by 3 
dBm at the far end of the link that is having issues (-6.23 dBm as opposed 
to -3.73 dBm) but not enough to explain what I'm seeing.
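
For reference, the arithmetic behind those two readings can be sketched as
below. The variable names are illustrative only. A difference between two
dBm values is a ratio, so it is expressed in dB rather than dBm.

```python
def dbm_to_mw(dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (dbm / 10)


stronger_dbm = -3.73  # the higher of the two readings quoted above
weaker_dbm = -6.23    # the lower of the two readings quoted above

# Subtracting dBm values gives the gap in dB.
delta_db = stronger_dbm - weaker_dbm

# The same gap expressed as a linear power ratio.
ratio = dbm_to_mw(stronger_dbm) / dbm_to_mw(weaker_dbm)

print(f"difference: {delta_db:.2f} dB, power ratio: {ratio:.2f}x")
```

The gap works out to 2.5 dB, a bit under a factor of two in optical power.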


The big question I have is: What does "30 Seconds" mean for an attribute that by 
description of the docs is supposed to be number of PCS blocks with invalid Sync 
headers?
Particularly when the guy on the Cisco at the other end says his error counters 
are going up like crazy (and packets are being dropped) while the stats at my end 
stay constant at "30 Seconds".

What does that mean?

The particularly frustrating thing is that data streams are dropping packets (EG 
iperf3 showing retries and seriously degraded performance) but none of the 
interface stats are showing any values that indicate an issue other than that 
"30 Seconds".


Can anybody tell me what "30 Seconds" means (in the context of an error 
counter)?




On Fri, 21 Oct 2016, Christopher Costa wrote:


Here's my notes from a jtac review about these a couple years ago:



[pcs] encoding is continually transmitting to keep the line in sync. The PCS 
layer is directly below the MAC layer so for MX,
it’s on the MIC. PCS errors can be caused by anything MIC or lower, i.e. 
transceiver, fiber, line equipment, etc.



 PCS functionality:
 ===
 IEEE 802.3ae 10GbE interfaces use a 64B/66B encoder/decoder in the PHY-PCS 
(Physical Coding Sub layer) to allow reasonable
clock recovery and facilitate alignment of the data stream at the receiver. 
 As the scheme name suggests, 64 bits of data on the MAC layer are transmitted 
as a 66-bit code block on the PHY layer, which
realizes easier clock/timing synchronization. A 66-bit code block contains a 
2-bit Sync. Header + 8 octets data/control field.
  If the Sync. header is '01', the 8 octets are entirely data.
 If the Sync. header is '10', an 8-bit Type field follows, plus 56 bits of 
data/control field.
  The 8 octets data/control field is scrambled by using a self-synchronous 
scrambler to achieve complete DC-balance on the
serial line. 
 PCS statistics displays PCS fault conditions by checking valid Sync. headers 
received with every 66 bits interval, so that we
can monitor 10Gbps high speed transmission line quality. 
  If the 64B/66B receiver does not detect the 2-bit Sync. 
 Header with regular 66-bit interval and it estimates the high BER (Bit Error Rate 
of >10^-4), PCS statistics will report a
problem. 
  PCS statistics :
 
 - "Bit errors" indicates the number of PCS blocks with invalid Sync headers.
 - "Errored blocks" indicates the number of PCS blocks with a valid Sync. 
header but invalid block format. 


On Fri, Oct 21, 2016 at 9:37 AM, Michael Carey  wrote:
  David,

  When I've seen PCS statistical errors before, it pointed to either a
  failing optic that needed replaced in our MX or a drastic change in 
optical
  light levels caused by an OSP fiber issue.  How do your "show interface
  diagnostic optic" levels look?

  On Wed, Oct 19, 2016 at 7:40 PM, David B Funk 

  wrote:

  > I've got a couple of 10Gig-eth interfaces (xe- on MX480) of which I'm
  > trying to interpret the "PCS statistics" values.
  >
  > One of them is pretty steady at:
  >
  >   PCS statistics                      Seconds
  >     Bit errors                             4
  >     Errored blocks                         4
  >
  > The other one seems to vary with the values ranging from 10 to 70.
  > EG:
  >
  >   PCS statistics                      Seconds
  >     Bit errors                            61
  >     Errored blocks                        69
  >
  > The second interface will trigger a number of error conditions at the
  > other end which terminates in a Cisco router without showing any error
  > conditions at my end (EG BPDU Error: None, MAC-REWRITE Error: None,
  > CRC/Align errors 0, FIFO errors 0, etc..) During some of these times I'll
  > see significant packet loss and others see minimal problems.
  >
  > According to Juniper docs the PCS statistics should mean:
  >
  >  PCS statistics
  >   (10-Gigabit Ethernet interfaces) Displays Physical Coding Sublayer 
(PCS)
  > fault
  >   conditions from the WAN PHY or the LAN PHY device.
  >
  >     Bit errors—High bit error rate. Indicates the number of bit errors
  > when the
  >       PCS receiver is operating in normal mode.
  >     Errored blocks—Loss of block lock. The number of errored blocks when
  > PCS
  >       receiver is operating in normal mode.
  >
  > But I don't know how to interpret a value of "16 seconds" with that
  > definition.
  > Can anybody shed some light on what those numbers mean.
  >
  > Thanks.

Re: [j-nsp] Load balancing errors on 15.1R4

2016-10-21 Thread Luis Balbinot
Case is open, but nothing yet. Could be related to PR1164101, but that's
not Juniper's official position.

Next week we are upgrading a lab MX480 to 16.2 and will see if it persists.

Luis

On Oct 21, 2016 14:54, "Dragan Jovicic"  wrote:

> Hi,
>
> Do you have any update on this? Have you opened a case for this maybe?
>
> Best
> Dragan
>
> On Tue, Oct 18, 2016 at 4:14 PM, Luis Balbinot 
> wrote:
>
>> Hey.
>>
>> Is anyone else having issues with load-balancing on 15.1R4? I'm
>> getting these FPC errors in multiple boxes:
>>
>> fpc0 LUCHIP(3) RMC 2  Uninitialized EDMEM[0x3ce333] Read
>> (0x6db6db6d6db6db6d)
>> fpc0 LUCHIP(3) PPE_2 Errors sync xtxn error
>> fpc0 LUCHIP(3) PPE_15 Errors sync xtxn error
>> fpc0 PPE Sync XTXN Err Trap:  Count 23064982, PC 376, 0x0376:
>> call_table_launch_nh
>>
>> They are all MX960s with MPC 16x10G cards. This impacts on traffic to
>> random destinations (route is OK in FIB and RIB but traffic is
>> blackholed). If I disable per-packet lb everything works perfectly.
>>
>> This issue appears when we have flaps that force route
>> recalculation/installation.
>>
>> I have a similar box running 12.3R3.4 that is just fine.
>>
>> Luis


Re: [j-nsp] interpreting 10Gb interface "PCS statistics" values

2016-10-21 Thread Christopher Costa
Here's my notes from a jtac review about these a couple years ago:




[pcs] encoding is continually transmitting to keep the line in sync. The
PCS layer is directly below the MAC layer so for MX, it’s on the MIC. PCS
errors can be caused by anything MIC or lower, i.e. transceiver, fiber,
line equipment, etc.




PCS functionality:

===

IEEE 802.3ae 10GbE interfaces use a 64B/66B encoder/decoder in the PHY-PCS
(Physical Coding Sub layer) to allow reasonable clock recovery and
facilitate alignment of the data stream at the receiver.

As the scheme name suggests, 64 bits of data on the MAC layer are
transmitted as a 66-bit code block on the PHY layer, which realizes easier
clock/timing synchronization. A 66-bit code block contains a 2-bit Sync.
Header + 8 octets data/control field.

If the Sync. header is '01', the 8 octets are entirely data.

If the Sync. header is '10', an 8-bit Type field follows, plus 56 bits of
data/control field.

The 8 octets data/control field is scrambled by using a self-synchronous
scrambler to achieve complete DC-balance on the serial line.

PCS statistics displays PCS fault conditions by checking valid Sync.
headers received with every 66 bits interval, so that we can monitor 10Gbps
high speed transmission line quality.

If the 64B/66B receiver does not detect the 2-bit Sync. Header at regular
66-bit intervals and it estimates a high BER (Bit Error Rate of >10^-4), PCS
statistics will report a problem.

PCS statistics :



- "Bit errors" indicates the number of PCS blocks with invalid Sync headers.

- "Errored blocks" indicates the number of PCS blocks with a valid Sync.
header but invalid block format.
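
As a rough illustration of the sync-header check described in these notes,
here is a small sketch. It assumes a 66-bit block is held in a Python
integer with the 2-bit sync header in the two most significant bits; that
layout, and all the names, are choices made for this example and not the
wire bit ordering or any vendor implementation.

```python
from collections import Counter

SYNC_DATA = 0b01      # the 8 octets are entirely data
SYNC_CONTROL = 0b10   # an 8-bit Type field follows, then 56 bits of data/control
PAYLOAD_MASK = (1 << 64) - 1


def make_block(sync: int, payload: int) -> int:
    """Build a 66-bit block: 2-bit sync header followed by a 64-bit payload."""
    return ((sync & 0b11) << 64) | (payload & PAYLOAD_MASK)


def classify_block(block: int) -> str:
    """Classify a block purely by its sync header."""
    sync = (block >> 64) & 0b11
    if sync == SYNC_DATA:
        return "data"
    if sync == SYNC_CONTROL:
        return "control"
    return "invalid"  # '00' and '11' are not valid sync headers


def type_field(block: int) -> int:
    """Return the 8-bit Type field that follows the header in a control block."""
    return (block >> 56) & 0xFF


blocks = [
    make_block(0b01, 0x1122334455667788),  # data block
    make_block(0b10, 0x1E << 56),          # control block, Type field 0x1E
    make_block(0b00, 0),                   # invalid sync header
    make_block(0b11, 0),                   # invalid sync header
]

counts = Counter(classify_block(b) for b in blocks)
print(counts)
```

Because only '01' and '10' are valid, every block boundary carries a bit
transition, which is what lets the receiver find and hold block alignment
and notice when headers stop arriving where they should.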


On Fri, Oct 21, 2016 at 9:37 AM, Michael Carey  wrote:

> David,
>
> When I've seen PCS statistical errors before, it pointed to either a
> failing optic that needed replaced in our MX or a drastic change in optical
> light levels caused by an OSP fiber issue.  How do your "show interface
> diagnostic optic" levels look?
>
> On Wed, Oct 19, 2016 at 7:40 PM, David B Funk <
> dbf...@engineering.uiowa.edu>
> wrote:
>
> > I've got a couple of 10Gig-eth interfaces (xe- on MX480) of which I'm
> > trying to interpret the "PCS statistics" values.
> >
> > One of them is pretty steady at:
> >
> >   PCS statistics                      Seconds
> >     Bit errors                             4
> >     Errored blocks                         4
> >
> > The other one seems to vary with the values ranging from 10 to 70.
> > EG:
> >
> >   PCS statistics                      Seconds
> >     Bit errors                            61
> >     Errored blocks                        69
> >
> > The second interface will trigger a number of error conditions at the
> > other end which terminates in a Cisco router without showing any error
> > conditions at my end (EG BPDU Error: None, MAC-REWRITE Error: None,
> > CRC/Align errors 0, FIFO errors 0, etc..) During some of these times I'll
> > see significant packet loss and others see minimal problems.
> >
> > According to Juniper docs the PCS statistics should mean:
> >
> >  PCS statistics
> >   (10-Gigabit Ethernet interfaces) Displays Physical Coding Sublayer
> (PCS)
> > fault
> >   conditions from the WAN PHY or the LAN PHY device.
> >
> > Bit errors—High bit error rate. Indicates the number of bit errors
> > when the
> >   PCS receiver is operating in normal mode.
> > Errored blocks—Loss of block lock. The number of errored blocks when
> > PCS
> >   receiver is operating in normal mode.
> >
> > But I don't know how to interpret a value of "16 seconds" with that
> > definition.
> > Can anybody shed some light on what those numbers mean.
> >
> > Thanks.
> >
> >
> > --
> > Dave Funk  University of Iowa
> > College of Engineering
> > 319/335-5751   FAX: 319/384-0549   1256 Seamans Center
> > Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
> > #include 
> > Better is not better, 'standard' is better. B{
>
>
>
>
> --
> Michael Carey
> Director of Operations, KINBER
> 717-963-7490 | 814-777-5027 | mca...@kinber.org
> 5775 Allentown Blvd., Suite 101, Harrisburg, PA 17112

Re: [j-nsp] Load balancing errors on 15.1R4

2016-10-21 Thread Dragan Jovicic
Hi,

Do you have any update on this? Have you opened a case for this maybe?

Best
Dragan

On Tue, Oct 18, 2016 at 4:14 PM, Luis Balbinot 
wrote:

> Hey.
>
> Is anyone else having issues with load-balancing on 15.1R4? I'm
> getting these FPC errors in multiple boxes:
>
> fpc0 LUCHIP(3) RMC 2  Uninitialized EDMEM[0x3ce333] Read
> (0x6db6db6d6db6db6d)
> fpc0 LUCHIP(3) PPE_2 Errors sync xtxn error
> fpc0 LUCHIP(3) PPE_15 Errors sync xtxn error
> fpc0 PPE Sync XTXN Err Trap:  Count 23064982, PC 376, 0x0376:
> call_table_launch_nh
>
> They are all MX960s with MPC 16x10G cards. This impacts on traffic to
> random destinations (route is OK in FIB and RIB but traffic is
> blackholed). If I disable per-packet lb everything works perfectly.
>
> This issue appears when we have flaps that force route
> recalculation/installation.
>
> I have a similar box running 12.3R3.4 that is just fine.
>
> Luis


Re: [j-nsp] interpreting 10Gb interface "PCS statistics" values

2016-10-21 Thread Michael Carey
David,

When I've seen PCS statistical errors before, it pointed to either a
failing optic that needed replaced in our MX or a drastic change in optical
light levels caused by an OSP fiber issue.  How do your "show interface
diagnostic optic" levels look?

On Wed, Oct 19, 2016 at 7:40 PM, David B Funk 
wrote:

> I've got a couple of 10Gig-eth interfaces (xe- on MX480) of which I'm
> trying to interpret the "PCS statistics" values.
>
> One of them is pretty steady at:
>
>   PCS statistics                      Seconds
>     Bit errors                             4
>     Errored blocks                         4
>
> The other one seems to vary with the values ranging from 10 to 70.
> EG:
>
>   PCS statistics                      Seconds
>     Bit errors                            61
>     Errored blocks                        69
>
> The second interface will trigger a number of error conditions at the
> other end which terminates in a Cisco router without showing any error
> conditions at my end (EG BPDU Error: None, MAC-REWRITE Error: None,
> CRC/Align errors 0, FIFO errors 0, etc..) During some of these times I'll
> see significant packet loss and others see minimal problems.
>
> According to Juniper docs the PCS statistics should mean:
>
>  PCS statistics
>   (10-Gigabit Ethernet interfaces) Displays Physical Coding Sublayer (PCS)
> fault
>   conditions from the WAN PHY or the LAN PHY device.
>
> Bit errors—High bit error rate. Indicates the number of bit errors
> when the
>   PCS receiver is operating in normal mode.
> Errored blocks—Loss of block lock. The number of errored blocks when
> PCS
>   receiver is operating in normal mode.
>
> But I don't know how to interpret a value of "16 seconds" with that
> definition.
> Can anybody shed some light on what those numbers mean.
>
> Thanks.
>
>
> --
> Dave Funk  University of Iowa
> College of Engineering
> 319/335-5751   FAX: 319/384-0549   1256 Seamans Center
> Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
> #include 
> Better is not better, 'standard' is better. B{




-- 
Michael Carey
Director of Operations, KINBER
717-963-7490 | 814-777-5027 | mca...@kinber.org
5775 Allentown Blvd., Suite 101, Harrisburg, PA 17112



