Re: [j-nsp] MX80 Sampling - High CPU

2015-01-15 Thread Rob Foehl

On Thu, 15 Jan 2015, Mark Tees wrote:


> For me on an MX80 running 11.4R13 with sampling that 10 minutes equates to:
>
> - around 3mins of rpd + sampling taking turns to smash the routing
> engine CPU whilst seemingly allowing other things to still be scheduled
> in (phew).
> - another 7mins of sampling chewing the CPU


I get similar behavior across several MX80s on various 11.4 builds with 
sampling enabled.  Pretty much anything that causes rpd to walk the entire 
RIB leads to this, including policy updates that produce no changes toward 
the PFEs.  I watched sampled take well over 20 minutes to settle down 
after one of those today, and the RE was basically useless for the 
duration...


-Rob
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX80 Sampling - High CPU

2015-01-14 Thread Mark Tees
For me on an MX80 running 11.4R13 with sampling that 10 minutes equates to:

- around 3mins of rpd + sampling taking turns to smash the routing
engine CPU whilst seemingly allowing other things to still be scheduled
in (phew).
- another 7mins of sampling chewing the CPU

I have further tests coming up with an MX80 with full tables where I
will be testing for the 10 mins:

* Reachability to prefixes high in the routing table from neighbouring boxes.
* KRT queue throughout the 10 mins
* PFE route specifics for my test prefixes
* route summaries
* While continuously trying to pass traffic through the box.

If there is anything else anyone recommends testing here in regards to
this please let me know.
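
One way to turn the KRT queue item in the list above into a single number
is to poll the queue depth at a fixed interval (for example by running
"show krt queue" and noting the queued count), record (time, queued)
pairs, and compute when the queue finally drained. A minimal sketch,
assuming the samples have already been collected by some means not shown
here; the function name drain_time and the sample values are made up for
illustration and are not measurements from this thread.

```python
from typing import Optional, Sequence, Tuple

# (seconds since the triggering event, routes still waiting in the queue)
Sample = Tuple[float, int]


def drain_time(samples: Sequence[Sample]) -> Optional[float]:
    """Return the time of the first empty sample after which the queue
    stays empty for the rest of the run, or None if it never settles."""
    settled = None
    for t, queued in samples:
        if queued == 0:
            if settled is None:
                settled = t  # candidate settle point
        else:
            settled = None  # queue refilled, discard the candidate
    return settled


# Hypothetical readings taken every 60 s after a policy change.
samples = [(0, 0), (60, 180000), (120, 95000), (180, 20000), (240, 0), (300, 0)]
print(drain_time(samples))
```

The same helper could be fed timestamps from any of the other checks
(reachability, PFE route lookups) by recording 0 for "converged" and 1
for "not yet".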

Mark

On Thu, Jan 15, 2015 at 1:50 PM, Jordan Whited  wrote:
> -idle before, takes 10-15 minutes to settle after full table is ingested
> -1m
> -no 64-bit release for mx80



-- 
Regards,

Mark L. Tees


Re: [j-nsp] MX80 Sampling - High CPU

2015-01-14 Thread Jordan Whited
-idle before, takes 10-15 minutes to settle after full table is ingested
-1m
-no 64-bit release for mx80

On Tue, Jan 6, 2015 at 5:13 PM, Masood Ahmad Shah 
wrote:

> Jordan,
>
> How does CPU utilization look during these 3 minutes (even a minute
> before and after)?
> How many routes (prefixes) do you have in the RIB (not just active, the total
> number of prefixes that are being scanned to find the best routes,
> "adj-in-rib")?
> With 14.1R3.5, did you use rpd-64bit or 32bit?
>
> Cheers,
> Masood


Re: [j-nsp] MX80 Sampling - High CPU

2015-01-06 Thread Masood Ahmad Shah
Jordan,

How does CPU utilization look during these 3 minutes (even a minute before
and after)?
How many routes (prefixes) do you have in the RIB (not just active, the total
number of prefixes that are being scanned to find the best routes,
"adj-in-rib")?
With 14.1R3.5, did you use rpd-64bit or 32bit?

Cheers,
Masood

On Sun, Jan 4, 2015 at 9:30 AM, Jordan Whited  wrote:

> I don't have any issues when sampling is disabled.
>
> No improvement from what I can tell between 12.3R8.7 and 14.1R3.5. Still
> seeing active-paths in the RIB advertised to other neighbors for upwards of
> 3 minutes before they are installed in the FIB.


Re: [j-nsp] MX80 Sampling - High CPU

2015-01-06 Thread Jordan Whited
I don't have any issues when sampling is disabled.

No improvement from what I can tell between 12.3R8.7 and 14.1R3.5. Still
seeing active-paths in the RIB advertised to other neighbors for upwards of
3 minutes before they are installed in the FIB.

On Sat, Dec 13, 2014 at 3:34 AM, MSusiva  wrote:

> I assume, the 3mins result is with sampling?
> What is the result without sampling?
> Did you test in 14.1 with sampling?
>
> Thank You


Re: [j-nsp] MX80 Sampling - High CPU

2014-12-13 Thread MSusiva
I assume the 3-minute result is with sampling?
What is the result without sampling?
Did you test in 14.1 with sampling?

Thank You


Re: [j-nsp] MX80 Sampling - High CPU

2014-12-12 Thread Eduardo Schoedler
@mx5> start shell
% less /var/run/dmesg.boot
JUNOS 12.3R8.7 #0: 2014-09-19 15:52:00 UTC

buil...@tiabeth.juniper.net:/volume/build/junos/12.3/release/12.3R8.7/obj-powerpc/junos/bsd/kernels/JUNIPER-PPC/kernel
WARNING: debug.mpsafenet forced to 0 as ipsec requires Giant
cpu0: Freescale e500v2 core revision 3.0
cpu0: HID0 80004000


[admin@mk] > /system resource print
              cpu: e500v2
        cpu-count: 2
    cpu-frequency: 1066MHz
         cpu-load: 0%
architecture-name: powerpc
       board-name: RB1100AHx2
         platform: MikroTik




2014-12-12 13:07 GMT-02:00 Scott Granados :
> MikroTik, ouch; the only thing I found they were good for is target
> practice. :)



-- 
Eduardo Schoedler


Re: [j-nsp] MX80 Sampling - High CPU

2014-12-12 Thread Scott Granados
MikroTik, ouch; the only thing I found they were good for is target practice. :)


On Dec 11, 2014, at 12:19 AM, Eduardo Schoedler  wrote:

> On Wednesday, December 10, 2014, Jordan Whited wrote:
> I found the issue still present in 12.3R8.7 running on an MX80. In 11.4R7.5
> with sampling enabled it was taking upwards of 12 minutes for routes to
> propagate to the FIB when taking in a full ipv4 with ~250k active-paths, in
> 12.3R8.7 I measured it closer to 3 minutes. Seems to be improved, but still
> unacceptable.
> 
> What do you expect from a PowerPC processor that's used for MikroTik's
> RouterBOARDs?
>
> Take a look at dmesg.
> 
> --
> Eduardo Schoedler


Re: [j-nsp] MX80 Sampling - High CPU

2014-12-11 Thread Morgan McLean
Taps and Gigamon devices are king.

On Wednesday, December 10, 2014, Eduardo Schoedler 
wrote:

> On Wednesday, December 10, 2014, Jordan Whited <jwhited0...@gmail.com> wrote:
>
> > I found the issue still present in 12.3R8.7 running on an MX80. In 11.4R7.5
> > with sampling enabled it was taking upwards of 12 minutes for routes to
> > propagate to the FIB when taking in a full ipv4 with ~250k active-paths, in
> > 12.3R8.7 I measured it closer to 3 minutes. Seems to be improved, but still
> > unacceptable.
>
>
> What do you expect from a PowerPC processor that's used for MikroTik's
> RouterBOARDs?
>
> Take a look at dmesg.
>
> --
> Eduardo Schoedler


-- 
Thanks,
Morgan


Re: [j-nsp] MX80 Sampling - High CPU

2014-12-10 Thread Eduardo Schoedler
On Wednesday, December 10, 2014, Jordan Whited <jwhited0...@gmail.com> wrote:

> I found the issue still present in 12.3R8.7 running on an MX80. In 11.4R7.5
> with sampling enabled it was taking upwards of 12 minutes for routes to
> propagate to the FIB when taking in a full ipv4 with ~250k active-paths, in
> 12.3R8.7 I measured it closer to 3 minutes. Seems to be improved, but still
> unacceptable.


What do you expect from a PowerPC processor that's used for MikroTik's
RouterBOARDs?

Take a look at dmesg.

--
Eduardo Schoedler



Re: [j-nsp] MX80 Sampling - High CPU

2014-12-10 Thread Jordan Whited
I found the issue still present in 12.3R8.7 running on an MX80. In 11.4R7.5
with sampling enabled it was taking upwards of 12 minutes for routes to
propagate to the FIB when taking in a full IPv4 table with ~250k active paths;
in 12.3R8.7 I measured it closer to 3 minutes. Seems to be improved, but still
unacceptable.
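
To put those two measurements on a common scale, they can be converted
into an average FIB install rate. This is only back-of-the-envelope
arithmetic: it assumes routes are installed at a uniform rate over the
whole window, which is a simplification, and it uses the rounded figures
from the message above (~250k active paths, 12 minutes, 3 minutes). The
function name install_rate is made up for illustration.

```python
def install_rate(routes: int, minutes: float) -> float:
    """Average routes installed per second over the whole window."""
    return routes / (minutes * 60)


ACTIVE_PATHS = 250_000  # "~250k active paths" from the message above

rate_11_4 = install_rate(ACTIVE_PATHS, 12)  # 11.4R7.5: upwards of 12 minutes
rate_12_3 = install_rate(ACTIVE_PATHS, 3)   # 12.3R8.7: closer to 3 minutes

print(f"11.4R7.5: ~{rate_11_4:.0f} routes/s")
print(f"12.3R8.7: ~{rate_12_3:.0f} routes/s")
print(f"speed-up: {rate_12_3 / rate_11_4:.1f}x")
```

Even the improved figure works out to well under two thousand routes per
second on average, which is the basis for calling it "still unacceptable".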

On Tue, Dec 2, 2014 at 10:12 AM, Scott Granados 
wrote:

> I have 12.3R8.7 running on 2 MX-80s and 2 MX-480s with mixed results.  The
> good news is the routers will reconverge with sampling enabled now and the
> PFE programming won’t block hard.  The process is still slow however and
> while we did some testing it still seems that the processes hang during
> large updates although they do eventually un-wedge and complete.  The CPU
> spikes though seem pretty few and far between so that is an improvement.
> I’m hoping the rewrite of the sampled and PFE programming in the 13.3 code
> is improved.  With sampling enabled these boxes reconverge too slowly,
> especially for modern hardware.

Re: [j-nsp] MX80 Sampling - High CPU

2014-12-02 Thread Mark Tinka
On Tuesday, December 02, 2014 05:55:49 PM Eduardo Schoedler 
wrote:

> I have found a bug with validation (RPKI), crashed rpd
> and killed my box. Juniper told me to upgrade to
> 13.3R4.6.

FWIW, running 14.1R1.10 with IPFIX and RPKI on MX80 and 64-bit MX480.
No issues.

Mark.



Re: [j-nsp] MX80 Sampling - High CPU

2014-12-02 Thread Eduardo Schoedler
I found a bug with validation (RPKI) that crashed rpd and killed my box.
Juniper told me to upgrade to 13.3R4.6.

2014-12-02 13:12 GMT-02:00 Scott Granados :
> I have 12.3R8.7 running on 2 MX-80s and 2 MX-480s with mixed results.  The 
> good news is the routers will reconverge with sampling enabled now and the 
> PFE programming won’t block hard.  The process is still slow however and 
> while we did some testing it still seems that the processes hang during large 
> updates although they do eventually un-wedge and complete.  The CPU spikes 
> though seem pretty few and far between so that is an improvement.  I’m hoping 
> the rewrite of the sampled and PFE programming in the 13.3 code is improved.  
> With sampling enabled these boxes reconverge too slowly, especially for modern 
> hardware.



-- 
Eduardo Schoedler


Re: [j-nsp] MX80 Sampling - High CPU

2014-12-02 Thread Scott Granados
I have 12.3R8.7 running on 2 MX-80s and 2 MX-480s with mixed results.  The good
news is the routers will reconverge with sampling enabled now and the PFE
programming won’t block hard.  The process is still slow, however, and in our
testing the processes still seem to hang during large updates, although they
do eventually un-wedge and complete.  The CPU spikes seem pretty few and far
between, though, so that is an improvement.  I’m hoping the rewrite of sampled
and the PFE programming in the 13.3 code is better.  With sampling enabled
these boxes reconverge too slowly, especially for modern hardware.


On Dec 1, 2014, at 6:09 PM, Jordan Whited  wrote:

> Has anyone else made the jump to 12.3R8 yet?


Re: [j-nsp] MX80 Sampling - High CPU

2014-12-01 Thread Justin M. Streiner

On Mon, 1 Dec 2014, Jordan Whited wrote:


> Has anyone else made the jump to 12.3R8 yet?


I did on my two MX480s over the past few weeks.  IPFIX jflow sampling has 
been re-enabled.  So far, no unusually high CPU spikes.  BGP convergence 
times are acceptable (<~50 seconds to pull in a full IPv4 BGP feed).


This is on 12.3R8.7.

I also have it running on a pair of MX5s in my lab, but I don't have
sampling enabled on those boxes at the moment.


jms




Re: [j-nsp] MX80 Sampling - High CPU

2014-12-01 Thread Jordan Whited
Has anyone else made the jump to 12.3R8 yet?

On Wed, Oct 1, 2014 at 8:35 AM, Justin M. Streiner 
wrote:

> On Wed, 1 Oct 2014, Sebastian Wiesinger wrote:
>
>  * Graham Brown  [2014-09-23 22:33]:
>>
>>> 12.3R8 and 13.3R4 are due out anytime now with the fixes in place. I
>>> think
>>> there are many people waiting for these two releases...
>>>
>>
>> So, 12.3R8 is out. Any practical experiences if inline jflow /
>> sampling is faster now?
>>
>
> Not sure yet.  I need to load it on my lab routers, but I won't know how
> it behaves at full scale until I load it in production.
>
> jms
>


Re: [j-nsp] MX80 Sampling - High CPU

2014-10-01 Thread Justin M. Streiner

On Wed, 1 Oct 2014, Sebastian Wiesinger wrote:


> * Graham Brown  [2014-09-23 22:33]:
>> 12.3R8 and 13.3R4 are due out anytime now with the fixes in place. I think
>> there are many people waiting for these two releases...
>
> So, 12.3R8 is out. Any practical experiences if inline jflow /
> sampling is faster now?


Not sure yet.  I need to load it on my lab routers, but I won't know how 
it behaves at full scale until I load it in production.


jms


Re: [j-nsp] MX80 Sampling - High CPU

2014-10-01 Thread Sebastian Wiesinger
* Brendan Mannella  [2014-10-01 13:12]:
> We have an MX240 with inline flow enabled. We were getting frequent CPU
> spikes; we installed 12.3R8 yesterday and the spikes are resolved.

Interesting, do you also monitor route propagation from RIB to FIB
(via 'show krt state' or something)? On MX80 we had up to 12min delay
for propagation to happen while sampled was working.

Regards

Sebastian

-- 
GPG Key: 0x93A0B9CE (F4F6 B1A3 866B 26E9 450A  9D82 58A2 D94A 93A0 B9CE)
'Are you Death?' ... IT'S THE SCYTHE, ISN'T IT? PEOPLE ALWAYS NOTICE THE SCYTHE.
-- Terry Pratchett, The Fifth Elephant


Re: [j-nsp] MX80 Sampling - High CPU

2014-10-01 Thread Gavin Henry
On 1 October 2014 12:09, Brendan Mannella  wrote:
> We have an MX240 with inline flow enabled. We were getting frequent CPU
> spikes; we installed 12.3R8 yesterday and the spikes are resolved.

Hi Brendan,

What is your monitoring frequency to pick up the spikes? We're running
the recommended Junos on MX80s and only saw spikes/high usage for the
first 10 mins or so, then it settled down. We're using the standard
recommended config with a sample rate of 1, but every 10 secs rather than
10 secs or 1000 packets. Most of ours is RTP traffic so we stuck to 10 secs.
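
The monitoring-frequency question matters because a coarse poller can
miss a short spike entirely. A minimal sketch of that effect, assuming
you already have (seconds, CPU percent) readings for the routing engine
from whatever poller you use; the readings, the 90% threshold, and the
function name busy_periods are all hypothetical, chosen only to
illustrate the point.

```python
from typing import List, Sequence, Tuple


def busy_periods(samples: Sequence[Tuple[float, float]],
                 threshold: float = 90.0) -> List[Tuple[float, float]]:
    """Return (start, end) times of runs of consecutive samples whose
    CPU reading is at or above the threshold."""
    periods = []
    start = None
    last = None
    for t, cpu in samples:
        if cpu >= threshold:
            if start is None:
                start = t  # a new busy run begins
            last = t
        elif start is not None:
            periods.append((start, last))  # the run just ended
            start = None
    if start is not None:
        periods.append((start, last))  # run still open at end of data
    return periods


# Hypothetical RE CPU readings (seconds, percent), polled every 10 s.
fine = [(0, 12), (10, 97), (20, 99), (30, 15), (40, 11), (50, 10), (60, 13)]
# The same period as seen by a 60 s poller: the spike falls between polls.
coarse = [s for s in fine if s[0] % 60 == 0]

print(busy_periods(fine))
print(busy_periods(coarse))
```

With the 10 s data the spike shows up as one busy period; with the 60 s
data nothing is reported, so "no spikes" from a slow poller does not by
itself show the box was idle.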

Thanks.

-- 
Kind Regards,
Gavin Henry.


Re: [j-nsp] MX80 Sampling - High CPU

2014-10-01 Thread Brendan Mannella
We have an MX240 with inline flow enabled. We were getting frequent CPU
spikes; we installed 12.3R8 yesterday and the spikes are resolved.

On Wednesday, October 1, 2014, Sebastian Wiesinger <
juniper-...@ml.karotte.org> wrote:

> * Graham Brown  [2014-09-23 22:33]:
> > 12.3R8 and 13.3R4 are due out anytime now with the fixes in place. I think
> > there are many people waiting for these two releases...
>
> So, 12.3R8 is out. Any practical experiences if inline jflow /
> sampling is faster now?
>
>


-- 
Brendan Mannella
bmanne...@teraswitch.com

TeraSwitch Inc.
Direct - 1.412.297.0225
Mobile - 1.412.592.7848
Fax - 412.202.7094
Cloud . Colocation . Connectivity


Re: [j-nsp] MX80 Sampling - High CPU

2014-10-01 Thread Sebastian Wiesinger
* Graham Brown  [2014-09-23 22:33]:
> 12.3R8 and 13.3R4 are due out anytime now with the fixes in place. I think
> there are many people waiting for these two releases...

So, 12.3R8 is out. Any practical experiences if inline jflow /
sampling is faster now?


Regards

Sebastian

-- 
GPG Key: 0x93A0B9CE (F4F6 B1A3 866B 26E9 450A  9D82 58A2 D94A 93A0 B9CE)
'Are you Death?' ... IT'S THE SCYTHE, ISN'T IT? PEOPLE ALWAYS NOTICE THE SCYTHE.
-- Terry Pratchett, The Fifth Elephant


Re: [j-nsp] MX80 Sampling - High CPU

2014-09-24 Thread Scott Granados
+1 here, definitely awaiting these releases.

On Sep 23, 2014, at 4:28 PM, Graham Brown  wrote:

> 12.3R8 and 13.3R4 are due out anytime now with the fixes in place. I think
> there are many people waiting for these two releases...

Re: [j-nsp] MX80 Sampling - High CPU

2014-09-23 Thread Graham Brown
12.3R8 and 13.3R4 are due out anytime now with the fixes in place. I think
there are many people waiting for these two releases...

Cheers,

Graham Brown
Twitter - @mountainrescuer 
LinkedIn 

On 24 September 2014 03:18, Justin M. Streiner 
wrote:

> Sounds like you are running into bugs PR963060 or PR671136.
>
> This is supposed to be fixed in 12.3R8 which is supposed to be released
> very soon.
>
> We ran into this behavior on a pair of MX480s and had to disable sampling
> for the time being.
>
> jms
>
>
> On Tue, 23 Sep 2014, Ritz Rojas wrote:
>
>  We have a few MX80s (MX80-48T) that we're looking to deploy in certain
>> applications where they'll be taking full Internet tables (v4 and v6).  We
>> also have a need to gather flow data on our routers, and have noticed an
>> interesting trend in the lab.
>>
>> We are not using an MS-MIC currently.
>>
>> This test box is running 12.3R7.7 at the moment, but we've seen this same
>> thing in 11.4 too.
>>
>> When set up with full Internet routes and sampling is enabled, each time a
>> commit is made for any change at all, RPD and sampled take turns grinding
>> the CPU up to 100%, for up to 5-10 minutes or more post-commit, and we see
>> changes to BGP policy sometimes stall and take a decent amount of time (on
>> the order of several minutes or more) to actually take effect.
>>
>> First RPD will climb up to almost 100% CPU utilization, chew it for a few
>> minutes, then it'll go down and sampled will climb up to almost 100% for
>> its couple of minutes' turn and chew a bit.  Then sampled goes back down and
>> RPD takes back over to 100% for a few more minutes.  Eventually it all
>> finally calms back down and normalizes back to expected levels.
>>
>> Turn off sampling, and any CPU spikes post-commit are only on the order of
>> seconds, not minutes, and any policy changes take effect pretty much
>> immediately.
>>
>> We've seen this regardless of how flow is configured; we've configured flow
>> with a "simple" config, as well as inline jflow, pretty much with the same
>> results.  We're curious if anyone's had any of these same problems with
>> jflow killing the CPU on MX80s (yeah, I know these PPC boxes are pretty
>> weak sisters), and if there's any fix beyond the usual "Doctor, it hurts
>> when I do this, what should I do?".  "Don't do that!".
>>
>> It's a nice feature, shame that using it seems to come with this heavy a
>> price.
>>
>> As an aside, we also see a bit of a slowdown in the RIB/FIB
>> learning/purging on BGP session turnup/reset, which we're well aware is a
>> known issue with sampling enabled, so I won't be shocked if this is just
>> "how it is".  I'd love to be wrong.
>>
>> Here's our sampling config, quick and dirty, regular and inline jflow, in
>> case we're missing something.
>>
>> "Normal" Sampling:
>>
>> router> show configuration forwarding-options
>> sampling {
>>input {
>>rate 8192;
>>run-length 0;
>>max-packets-per-second 2;
>>}
>>family inet {
>>output {
>>flow-server x.x.x.x {
>>port x;
>>version 5;
>>}
>>}
>>}
>> }
>>
>> router> show configuration interfaces xe-0/0/0
>> unit xxx {
>>vlan-id xxx;
>>family inet {
>>sampling {
>>input;
>>output;
>>}
>> }
>>
>>
>> Inline Jflow Sampling:
>>
>> router> show configuration forwarding-options
>> sampling {
>>     instance {
>>         BLAH-INSTANCE {
>>             input {
>>                 rate 5000;
>>             }
>>             family inet {
>>                 output {
>>                     flow-server x.x.x.x {
>>                         port ;
>>                         autonomous-system-type origin;
>>                         no-local-dump;
>>                         version-ipfix {
>>                             template {
>>                                 BLAH-TEMPLATE;
>>                             }
>>                         }
>>                     }
>>                     inline-jflow {
>>                         source-address x.x.x.x;
>>                     }
>>                 }
>>             }
>>         }
>>     }
>> }
>>
>> router> show configuration chassis
>> tfeb {
>>     slot 0 {
>>         sampling-instance BLAH-INSTANCE;
>>     }
>> }
>>
>>
>> router> show configuration services
>> flow-monitoring {
>>     version-ipfix {
>>         template BLAH-TEMPLATE {
>>             flow-active-timeout 10;
>>             flow-inactive-timeout 10;
>>             template-refresh-rate {
>>                 packets 1;
>>                 seconds 10;
>>             }
>>             option-refresh-rate {
>>                 packets 1;
>>                 seconds 10;
>>             }
>>             ipv4-template;
>>         }
>>     }
>> }
>>
>>
>> router> show configuration interfaces xe-0/0/0
>> unit xxx {
>>     vlan-id xxx;
>>     family inet {
>>         sampling

Re: [j-nsp] MX80 Sampling - High CPU

2014-09-23 Thread Justin M. Streiner

Sounds like you are running into bugs PR963060 or PR671136.

This is supposed to be fixed in 12.3R8, which is due to be released
very soon.


We ran into this behavior on a pair of MX480s and had to disable sampling 
for the time being.


jms

On Tue, 23 Sep 2014, Ritz Rojas wrote:


We have a few MX80s (MX80-48T) that we're looking to deploy in certain
applications where they'll be taking full Internet tables (v4 and v6).  We
also have a need to gather flow data on our routers, and have noticed an
interesting trend in the lab.

We are not using an MS-MIC currently.

This test box is running 12.3R7.7 at the moment, but we've seen this same
thing in 11.4 too.

When set up with full Internet routes and sampling is enabled, each time a
commit is made for any change at all, RPD and sampled take turns grinding
the CPU up to 100%, for up to 5-10 minutes or more post-commit, and we see
changes to BGP policy sometimes stall and take a decent amount of time (on
the order of several minutes or more) to actually take effect.

First RPD will climb up to almost 100% CPU utilization, chew it for a few
minutes, then it'll go down and sampled will climb up to almost 100% for
its couple-minute turn and chew a bit.  Then sampled goes back down and
RPD takes back over to 100% for a few more minutes.  Eventually it all
finally calms back down and normalizes back to expected levels.

Turn off sampling, and any CPU spikes post-commit are only on the order of
seconds, not minutes, and any policy changes take effect pretty much
immediately.

We've seen this regardless of how flow is configured; we've configured flow
with a "simple" config, as well as inline jflow, pretty much with the same
results.  We're curious if anyone's had any of these same problems with
jflow killing the CPU on MX80s (yeah, I know these PPC boxes are pretty
weak sisters), and if there's any fix beyond the usual "Doctor, it hurts
when I do this, what should I do?".  "Don't do that!".

It's a nice feature, shame that using it seems to come with this heavy a
price.

As an aside, we also see a bit of a slowdown in the RIB/FIB
learning/purging on BGP session turnup/reset, which we're well aware is a
known issue with sampling enabled, so I won't be shocked if this is just
"how it is".  I'd love to be wrong.

Here's our sampling config, quick and dirty, regular and inline jflow, in
case we're missing something.

"Normal" Sampling:

router> show configuration forwarding-options
sampling {
    input {
        rate 8192;
        run-length 0;
        max-packets-per-second 2;
    }
    family inet {
        output {
            flow-server x.x.x.x {
                port x;
                version 5;
            }
        }
    }
}

router> show configuration interfaces xe-0/0/0
unit xxx {
    vlan-id xxx;
    family inet {
        sampling {
            input;
            output;
        }
    }
}


Inline Jflow Sampling:

router> show configuration forwarding-options
sampling {
    instance {
        BLAH-INSTANCE {
            input {
                rate 5000;
            }
            family inet {
                output {
                    flow-server x.x.x.x {
                        port ;
                        autonomous-system-type origin;
                        no-local-dump;
                        version-ipfix {
                            template {
                                BLAH-TEMPLATE;
                            }
                        }
                    }
                    inline-jflow {
                        source-address x.x.x.x;
                    }
                }
            }
        }
    }
}

router> show configuration chassis
tfeb {
    slot 0 {
        sampling-instance BLAH-INSTANCE;
    }
}


router> show configuration services
flow-monitoring {
    version-ipfix {
        template BLAH-TEMPLATE {
            flow-active-timeout 10;
            flow-inactive-timeout 10;
            template-refresh-rate {
                packets 1;
                seconds 10;
            }
            option-refresh-rate {
                packets 1;
                seconds 10;
            }
            ipv4-template;
        }
    }
}


router> show configuration interfaces xe-0/0/0
unit xxx {
    vlan-id xxx;
    family inet {
        sampling {
            input;
            output;
        }
    }
}
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


[j-nsp] MX80 Sampling - High CPU

2014-09-23 Thread Ritz Rojas
We have a few MX80s (MX80-48T) that we're looking to deploy in certain
applications where they'll be taking full Internet tables (v4 and v6).  We
also have a need to gather flow data on our routers, and have noticed an
interesting trend in the lab.

We are not using an MS-MIC currently.

This test box is running 12.3R7.7 at the moment, but we've seen this same
thing in 11.4 too.

When set up with full Internet routes and sampling is enabled, each time a
commit is made for any change at all, RPD and sampled take turns grinding
the CPU up to 100%, for up to 5-10 minutes or more post-commit, and we see
changes to BGP policy sometimes stall and take a decent amount of time (on
the order of several minutes or more) to actually take effect.

First RPD will climb up to almost 100% CPU utilization, chew it for a few
minutes, then it'll go down and sampled will climb up to almost 100% for
its couple-minute turn and chew a bit.  Then sampled goes back down and
RPD takes back over to 100% for a few more minutes.  Eventually it all
finally calms back down and normalizes back to expected levels.
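
For anyone wanting to put numbers on the back-and-forth described above, here
is a rough Python sketch.  It assumes you have already collected
(timestamp, process, CPU%) samples by some means, for example by polling the
process list on the RE at a fixed interval; the sample format, the function
name busy_seconds and the 90% threshold are all assumptions made for
illustration, and the sample data is made up.

```python
from collections import defaultdict


def busy_seconds(samples, threshold=90.0):
    """Sum, per process, the time spent at or above `threshold` percent CPU.

    `samples` is an iterable of (timestamp_seconds, process_name, cpu_percent)
    tuples sorted by timestamp.  Each reading is assumed to hold until the
    next reading for the same process (a hypothetical simplification).
    """
    last = {}                    # process -> (timestamp, cpu) of previous reading
    busy = defaultdict(float)    # process -> accumulated busy seconds
    for ts, proc, cpu in samples:
        if proc in last:
            prev_ts, prev_cpu = last[proc]
            if prev_cpu >= threshold:
                busy[proc] += ts - prev_ts
        last[proc] = (ts, cpu)
    return dict(busy)


# Made-up readings at 60-second intervals, for illustration only.
samples = [
    (0, "rpd", 98.0), (0, "sampled", 1.0),
    (60, "rpd", 97.0), (60, "sampled", 2.0),
    (120, "rpd", 3.0), (120, "sampled", 96.0),
    (180, "rpd", 2.0), (180, "sampled", 95.0),
    (240, "rpd", 1.0), (240, "sampled", 1.0),
]

print(busy_seconds(samples))
```

Comparing the totals with sampling enabled and disabled would give a simple
before/after figure for the post-commit settle time.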

Turn off sampling, and any CPU spikes post-commit are only on the order of
seconds, not minutes, and any policy changes take effect pretty much
immediately.

We've seen this regardless of how flow is configured; we've configured flow
with a "simple" config, as well as inline jflow, pretty much with the same
results.  We're curious if anyone's had any of these same problems with
jflow killing the CPU on MX80s (yeah, I know these PPC boxes are pretty
weak sisters), and if there's any fix beyond the usual "Doctor, it hurts
when I do this, what should I do?".  "Don't do that!".

It's a nice feature, shame that using it seems to come with this heavy a
price.

As an aside, we also see a bit of a slowdown in the RIB/FIB
learning/purging on BGP session turnup/reset, which we're well aware is a
known issue with sampling enabled, so I won't be shocked if this is just
"how it is".  I'd love to be wrong.

Here's our sampling config, quick and dirty, regular and inline jflow, in
case we're missing something.

"Normal" Sampling:

router> show configuration forwarding-options
sampling {
    input {
        rate 8192;
        run-length 0;
        max-packets-per-second 2;
    }
    family inet {
        output {
            flow-server x.x.x.x {
                port x;
                version 5;
            }
        }
    }
}

router> show configuration interfaces xe-0/0/0
unit xxx {
    vlan-id xxx;
    family inet {
        sampling {
            input;
            output;
        }
    }
}


Inline Jflow Sampling:

router> show configuration forwarding-options
sampling {
    instance {
        BLAH-INSTANCE {
            input {
                rate 5000;
            }
            family inet {
                output {
                    flow-server x.x.x.x {
                        port ;
                        autonomous-system-type origin;
                        no-local-dump;
                        version-ipfix {
                            template {
                                BLAH-TEMPLATE;
                            }
                        }
                    }
                    inline-jflow {
                        source-address x.x.x.x;
                    }
                }
            }
        }
    }
}

router> show configuration chassis
tfeb {
    slot 0 {
        sampling-instance BLAH-INSTANCE;
    }
}


router> show configuration services
flow-monitoring {
    version-ipfix {
        template BLAH-TEMPLATE {
            flow-active-timeout 10;
            flow-inactive-timeout 10;
            template-refresh-rate {
                packets 1;
                seconds 10;
            }
            option-refresh-rate {
                packets 1;
                seconds 10;
            }
            ipv4-template;
        }
    }
}


router> show configuration interfaces xe-0/0/0
unit xxx {
    vlan-id xxx;
    family inet {
        sampling {
            input;
            output;
        }
    }
}
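
As a sanity check on the sampling rates in the configs above, here is a small
Python sketch of the arithmetic.  It assumes the usual reading of the knobs:
"rate N" samples roughly 1 in N packets, "run-length" adds that many
following packets per sampled packet, and "max-packets-per-second" caps the
sampled packet rate (the cap is assumed to apply to the "normal" config
only).  The function name sampled_pps and the 1 Mpps traffic figure are
hypothetical, not measurements from these boxes.

```python
def sampled_pps(interface_pps, rate, run_length=0, max_pps=None):
    """Estimate sampled packets per second for a given traffic level.

    interface_pps: packets per second crossing the sampled interfaces
    rate:          sample roughly 1 packet in `rate`
    run_length:    extra packets taken after each sampled packet
    max_pps:       optional cap on sampled packets per second
    """
    pps = interface_pps / rate * (1 + run_length)
    if max_pps is not None:
        pps = min(pps, max_pps)
    return pps


traffic = 1_000_000  # hypothetical 1 Mpps of traffic

# "Normal" sampling: rate 8192, run-length 0, max-packets-per-second 2
print(sampled_pps(traffic, 8192))
print(sampled_pps(traffic, 8192, run_length=0, max_pps=2))

# Inline jflow instance: rate 5000
print(sampled_pps(traffic, 5000))
```

If that reading of the knobs is right, the sampled packet volume in the
"normal" config is tiny, which would be consistent with seeing the same CPU
behavior regardless of how flow is configured.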
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp