
Re: PSINet/Cogent Latency

2002-07-24 Thread Joe Loiacono




Actually RRDTool interpolates any late replies to the nearest specified
collection timepoint (e.g., every 5th minute). It doesn't really resample.

Joe



   

Matt Zimmerman mdz@csh.rit.edu
Sent by: owner-nanog
07/23/2002 09:46 AM

To: [EMAIL PROTECTED]
cc:
Subject: Re: PSINet/Cogent Latency

On Mon, Jul 22, 2002 at 10:50:03PM -0700, Doug Clements wrote:

 I think the problem with using rrdtool for billing purposes as described
 is that data can (and does) get lost. If your poller is a few cycles late,
 the burstable bandwidth measured goes up when the poller catches up to the
 interface counters. More bursting is bad for %ile (or good if you're
 selling it), and the customer won't like the fact that they're getting
 charged for artificially high measurements.

RRDtool takes into account the time at which the sample was collected, and
if it does not exactly match the expected sampling period, it is resampled
on the fly.  See:

http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/tutorial/rrdtutorial.html


under Data Resampling for more information.

RRDtool has some quirks when used for billing purposes, but it is not guilty
of the error that you describe.

--
 - mdz

Re: PSINet/Cogent Latency

2002-07-24 Thread Matt Zimmerman



On Wed, Jul 24, 2002 at 10:55:43AM -0400, Joe Loiacono wrote:

 Actually RRDTool interpolates any late replies to the nearest specified
 collection timepoint (e.g., every 5th minute). It doesn't really resample.

That particular document seems to refer to it as resampling, but yes,
interpolation would be more correct.
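
To make the resampling/interpolation distinction concrete, here is a minimal Python sketch. It is not RRDtool's code: the function names and the sample data are invented for illustration, and it assumes only what is said above, namely that the real collection time of each sample is taken into account. Counter deltas are turned into rates using the actual timestamps, and each fixed 300-second step then gets a time-weighted average of those rates.

```python
def normalize_to_steps(samples, step=300):
    """Time-weighted average rate for each fixed step.

    samples: list of (timestamp_seconds, counter_value), sorted by time.
    Returns {step_end_time: rate} for every full step the samples cover.
    """
    # Rate between each pair of consecutive samples, using real elapsed time.
    segments = [
        (t0, t1, (c1 - c0) / (t1 - t0))
        for (t0, c0), (t1, c1) in zip(samples, samples[1:])
    ]
    first, last = samples[0][0], samples[-1][0]
    start = -(-first // step) * step  # first step boundary at or after `first`
    result = {}
    t = start
    while t + step <= last:
        lo, hi = t, t + step
        weighted = 0.0
        for s0, s1, rate in segments:
            overlap = min(hi, s1) - max(lo, s0)
            if overlap > 0:
                weighted += rate * overlap
        result[hi] = weighted / step
        t += step
    return result


def naive_rates(samples, step=300):
    """The failure mode: pretend every sample arrived exactly on time."""
    return [(c1 - c0) / step for (_, c0), (_, c1) in zip(samples, samples[1:])]


# Hypothetical data: a steady 1000 bytes/s, with the second poll 30 s late.
samples = [(0, 0), (300, 300_000), (630, 630_000), (900, 900_000)]
print(normalize_to_steps(samples))
print(naive_rates(samples))
```

With the late poll at t=630, the naive version reports one inflated and one deflated interval, while the time-weighted version reports the same steady rate for every step.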

-- 
 - mdz

RE: PSINet/Cogent Latency

2002-07-23 Thread Phil Rosenthal



I have a small RRD project box that polls 200 interfaces, and it
takes 1 minute, 5 seconds to run with 60% CPU usage (so obviously it
can be streamlined if I wanted to work on it). I guess the limit in this
implementation is 1000 interfaces per box in this setup -- but I see
most of the CPU usage is in the forking of snmpget over and over.  I'm
sure I could write a small program in C that could do this at least 10X
more efficiently.  That's 10,000 interfaces with RRD on one intel -- if
you are determined to do it.
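
The round numbers above can be checked with a few lines of Python. This is only back-of-the-envelope arithmetic under the assumptions that polling time scales linearly with interface count and that a run must finish within the 5-minute window; the function name is made up for the example.

```python
def max_interfaces(polled, seconds_taken, window=300, speedup=1):
    """Interfaces that fit in one polling window, assuming linear scaling."""
    return (window * polled * speedup) // seconds_taken


print(max_interfaces(200, 65))              # current script
print(max_interfaces(200, 65, speedup=10))  # hypothetical 10x faster poller
```

The results come out slightly under the quoted 1000 and 10,000, so those figures are reasonable round-ups rather than exact limits.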

I think if you are billing 10k interfaces, you can afford a 2nd intel
box to check the 2nd 10,000, no?

My point is that if you have sufficient clue, time, and motivation --
today's generic PCs are capable of doing many large tasks...

--Phil


-Original Message-
From: Richard A Steenbergen [mailto:[EMAIL PROTECTED]] 
Sent: Tuesday, July 23, 2002 2:10 AM
To: Phil Rosenthal
Cc: 'Doug Clements'; [EMAIL PROTECTED]
Subject: Re: PSINet/Cogent Latency


On Tue, Jul 23, 2002 at 01:56:45AM -0400, Phil Rosenthal wrote:
 
 I don't think RRD is that bad if you are gonna check only every 5 
 minutes...

RRD doesn't measure anything, it stores and graphs data. The perl
pollers everyone is using can barely keep up with 5 minute samples on a
couple dozen routers and a few hundred interfaces, requiring poller
farms to be distributed across a network, lest a box or part of the
network break and you lose data.

 Again, perhaps I'm just missing something, but so let's say you measure
 30 seconds late, and it thinks it's on time -- so that one sample will
 be higher, then the next one will be on time, so 30 seconds early for
 that sample -- it will be lower.  On the whole -- it will be accurate
 enough -- no?

enough is a relative term, but sure. :)

 I'm not saying a hardware solution can't be better -- but it is likely
 overkill compared to a few cheap intels running RRD -- assuming your
 snmpd can deal with the load...

What hardware... storing a few byte counters is trivial, but polling
them through snmp is what is hard (never trust a protocol named simple
or trivial). Creating a buffer of samples which can be periodically
sampled should be easy and painless. I don't know if I call periodic ftp
painless but it's certainly a start.

-- 
Richard A Steenbergen [EMAIL PROTECTED]
http://www.e-gerbil.net/ras
PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)

RE: PSINet/Cogent Latency

2002-07-23 Thread Phil Rosenthal



I see your point, but I still think RRD is good enough.

If cisco/foundry/juniper added this to their respective OS's -- I'd be a
happy camper... If they don't -- I won't lose sleep over it.

--Phil

-Original Message-
From: Doug Clements [mailto:[EMAIL PROTECTED]] 
Sent: Tuesday, July 23, 2002 2:12 AM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: PSINet/Cogent Latency


- Original Message -
From: Phil Rosenthal [EMAIL PROTECTED]
Subject: RE: PSINet/Cogent Latency


 I don't think RRD is that bad if you are gonna check only every 5 
 minutes...

 Again, perhaps I'm just missing something, but so let's say you measure
 30 seconds late, and it thinks it's on time -- so that one sample will
 be higher, then the next one will be on time, so 30 seconds early for
 that sample -- it will be lower.  On the whole -- it will be accurate
 enough -- no?

If you're polling every 5 minutes, with 2 retries per poll, and you miss
2 retries, then your next poll will be 5 minutes late. It's not
disastrous, but it's also not perfect. Again, peaks and valleys on your
graph cost more than smooth lines, even with the same total bandwidth.
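
A small Python sketch of why a late poll matters for percentile billing. Everything here is illustrative: the traffic is a made-up steady 10 Mb/s, the two missed polls are hypothetical, and the 95th percentile uses the nearest-rank convention (billing systems differ on the exact method). The `nominal_step` argument models a poller that divides by the expected interval rather than the real elapsed time.

```python
def rates_mbps(readings, nominal_step=None):
    """Per-interval rate in Mb/s from (timestamp, byte_counter) readings.

    If nominal_step is given, every interval is assumed to be that long,
    which is the error a late poll exposes.
    """
    out = []
    for (t0, c0), (t1, c1) in zip(readings, readings[1:]):
        elapsed = nominal_step if nominal_step else (t1 - t0)
        out.append((c1 - c0) * 8 / elapsed / 1e6)
    return out


def percentile_95(rates):
    """Nearest-rank 95th percentile (one common convention, not the only one)."""
    ordered = sorted(rates)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil of 0.95 * n
    return ordered[rank - 1]


BYTES_PER_SEC = 1_250_000  # steady 10 Mb/s
missed = {1500, 3600}      # two polls that never got an answer
readings = [(t, BYTES_PER_SEC * t) for t in range(0, 6001, 300) if t not in missed]

print(percentile_95(rates_mbps(readings)))                    # real elapsed time
print(percentile_95(rates_mbps(readings, nominal_step=300)))  # assumed 300 s
```

With real timestamps the customer is billed at the steady rate; with the assumed interval the two catch-up readings look like bursts at double the rate, and in this small sample they are enough to move the 95th percentile.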

Do you want to be the one to tell your customers your billing setup is
accurate enough, and especially that it's going to have a tendency to
be accurate enough in your favor?

 Besides I think RRD has a bunch of things built in to deal with 
 precisely this problem.

Wouldn't that be just spiffy!

 I'm not saying a hardware solution can't be better -- but it is likely
 overkill compared to a few cheap intels running RRD -- assuming your
 snmpd can deal with the load...

No extra hardware needed. I think the desired solution was integration
into the router. The data is already there, you just need software to
compile it and ship it out via a reliable reporting mechanism. For being
relatively simple, it's a nice idea that it could replace the almost
in an almost accurate billing process.

--Doug

Re: PSINet/Cogent Latency

2002-07-23 Thread Alexander Koch



On Tue, 23 July 2002 02:25:36 -0400, Phil Rosenthal wrote:
 I have a small RRD project box that polls 200 interfaces and has it
 takes 1 minute, 5 seconds to run with 60%  cpu usage (so obviously it
 can be streamlined if I wanted to work on it). I guess the limit in this
 implementation is 1000 interfaces per box in this setup -- but I see
 most of the CPU usage is in the forking of snmpget over and over.  Im
 sure I could write a small program in C that could do this at least 10X
 more efficiently.  That's 10,000 interfaces with RRD on one intel -- if
 you are determined to do it.
 
 I think if you are billing 10k interfaces, you can afford a 2nd intel
 box to check the 2nd 10,000, no?

Phil,

imagine some four routers dying or not answering queries,
you will see the poll script give you timeout after timeout
after timeout and with some 50 to 100 routers and the
respective interfaces you see mrtg choke badly, losing data.

You see, the poll script is doing one after the other,
mainly, so you wait too long and then the next run starts
and then something.

mrtg/rrd is not the tool of choice for accounting / billing
but nice enough for showing you 'backup' graphs for visitors
probably.

Alexander

Re: PSINet/Cogent Latency

2002-07-23 Thread Richard A Steenbergen



On Tue, Jul 23, 2002 at 02:25:36AM -0400, Phil Rosenthal wrote:
 I have a small RRD project box that polls 200 interfaces and has it
 takes 1 minute, 5 seconds to run with 60%  cpu usage (so obviously it
 can be streamlined if I wanted to work on it). I guess the limit in this
 implementation is 1000 interfaces per box in this setup -- but I see
 most of the CPU usage is in the forking of snmpget over and over.  Im
 sure I could write a small program in C that could do this at least 10X
 more efficiently.  That's 10,000 interfaces with RRD on one intel -- if
 you are determined to do it.

10x? Wanna try a higher order of magnitude?

While you're at it, eliminate the forking to the rrdtool bin when you're
adding data. A little thought and profiling goes a long way, this is
simple number crunching we're talking about, not supercomputer work. The
problem comes from the perl mentality (why is there no C lib for
efficiently adding to an rrd db? because they're expecting everyone to
call it from perl :P), it's good enough for my couple boxes and you can
throw more machines at it.

But again, I have no doubt that if you designed it properly you could 
throw lots of snmp queries and scale decently to a nice sized core 
network, I've seen it done. The problem is potential communication loss 
between the poller and the device, and the amount of work that the device 
(which usually isn't running gods gift to any code let alone snmp code) 
has to do for higher sampling rates with many interfaces.

-- 
Richard A Steenbergen [EMAIL PROTECTED]   http://www.e-gerbil.net/ras
PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)

Re: PSINet/Cogent Latency

2002-07-23 Thread Gary E. Miller



Yo Alexander!

On Tue, 23 Jul 2002, Alexander Koch wrote:

 imagine some four routers dying or not answering queries,
 you will see the poll script give you timeout after timeout
 after timeout and with some 50 to 100 routers and the
 respective interfaces you see mrtg choke badly, losing data.

Yep.  Anything gets behind and it all gets behind.

That is why we run multiple copies of MRTG.  That way polling for one set
of hosts does not have to wait for another set.  If one set is timing
out the other just keeps on as usual.
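
A rough model of that effect in Python. The numbers are assumptions chosen for illustration (50 routers, 2 seconds for a healthy one, 30 seconds wasted on a dead one), and the code only adds up per-host times; it does not run MRTG or talk SNMP.

```python
def group_runtimes(groups, seconds_per_host):
    """Runtime of each independent poller, polling its hosts one by one."""
    return [sum(seconds_per_host[h] for h in group) for group in groups]


hosts = [f"r{i}" for i in range(50)]
seconds_per_host = {h: 2 for h in hosts}
for dead in hosts[:4]:
    seconds_per_host[dead] = 30  # hypothetical: 3 tries x 10 s timeout

one_process = group_runtimes([hosts], seconds_per_host)
five_processes = group_runtimes(
    [hosts[i:i + 10] for i in range(0, 50, 10)], seconds_per_host
)
print(one_process)
print(five_processes)
```

In the single-process case every host waits behind the timeouts. With five independent copies, only the group containing the dead routers is slow and the other four finish on schedule.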

RGDS
GARY
---
Gary E. Miller Rellim 20340 Empire Blvd, Suite E-3, Bend, OR 97701
[EMAIL PROTECTED]  Tel:+1(541)382-8588 Fax: +1(541)382-8676

Re: PSINet/Cogent Latency

2002-07-23 Thread Matt Zimmerman



On Tue, Jul 23, 2002 at 02:40:10AM -0400, Richard A Steenbergen wrote:

 While you're at it, eliminate the forking to the rrdtool bin when you're
 adding data. A little thought and profiling goes a long way, this is
 simple number crunching we're talking about, not supercomputer work. The
 problem comes from the perl mentality (why is there no C lib for
 efficiently adding to an rrd db? because they're expecting everyone to
 call it from perl :P), it's good enough for my couple boxes and you can
 throw more machines at it.

There is a C library, librrd.  That is how the other language APIs are
built.  As to efficiency, there is a lot of stringification, which is
inconvenient and unnatural in C, but this should not be the bottleneck in
the collection operation.

 But again, I have no doubt that if you designed it properly you could
 throw lots of snmp queries and scale decently to a nice sized core
 network, I've seen it done. The problem is potential communication loss
 between the poller and the device, and the amount of work that the device
 (which usually isn't running gods gift to any code let alone snmp code)
 has to do for higher sampling rates with many interfaces.

That said, bulk statistical exports from the device itself can easily be
implemented more efficiently than SNMP.  But unless the export process is
universally standardized, SNMP (for all its warts, and it has many) will
still have an edge in that it works nearly everywhere (for varying values of
works).

-- 
 - mdz

RE: PSINet/Cogent Latency

2002-07-23 Thread Alex Rubenstein




On Tue, 23 Jul 2002, Phil Rosenthal wrote:


 I have a small RRD project box that polls 200 interfaces and has it
 takes 1 minute, 5 seconds to run with 60%  cpu usage (so obviously it
 can be streamlined if I wanted to work on it). I guess the limit in this
 implementation is 1000 interfaces per box in this setup -- but I see
 most of the CPU usage is in the forking of snmpget over and over.  Im
 sure I could write a small program in C that could do this at least 10X
 more efficiently.  That's 10,000 interfaces with RRD on one intel -- if
 you are determined to do it.

Interesting. We have a dual p3-700, doing LOTS of other things, which does
1600 interfaces under MRTG using small amounts of CPU.

You are using 'Forks', if you're using MRTG, no?

This whole process takes less than 2 minutes.



 I think if you are billing 10k interfaces, you can afford a 2nd intel
 box to check the 2nd 10,000, no?

First and foremost, you said RRD, not billing.

Who uses RRD for billing purposes?



 My point is that if you have sufficient clue, time, and motivation --
 Today's generic PCs are capable to do many large tasks...

Quite. In regards to billing, we have some home grown software that (don't
laugh too hard) runs as an NT service; it collects 1,700 ports of
information every five minutes (Bytes[In|Out], BitsSec[In|Out],
AdminStatus, OperStatus, Time) in only 60 seconds. We've found the best
way to do this is to blast out SNMP requests and handle the replies as
events when they arrive; wait 10 seconds, retry all the ones we didn't
get, then try again. We've found that this works the best, having tried
about 4 different ways of doing it over the last 5 years. It's all then
nicely stored in a SQL DB.
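
The blast-then-retry pattern can be sketched with Python's asyncio. This is not the NT service described above: `fake_snmp_get` is a hypothetical stand-in that randomly drops replies instead of sending real SNMP, and the parameter names, the three-round limit, and the loss rate are assumptions for the example.

```python
import asyncio
import random


async def fake_snmp_get(port, loss_rate, rng):
    """Hypothetical stand-in for one SNMP request; returns None if 'lost'."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    if rng.random() < loss_rate:
        return None
    return {"port": port, "bytes_in": port * 1000}


async def poll_all(ports, max_rounds=3, retry_wait=10.0, loss_rate=0.3, seed=1):
    """Send every request at once, then retry only the ports with no reply."""
    rng = random.Random(seed)
    results = {}
    pending = list(ports)
    for _ in range(max_rounds):
        if not pending:
            break
        replies = await asyncio.gather(
            *(fake_snmp_get(p, loss_rate, rng) for p in pending)
        )
        for port, reply in zip(pending, replies):
            if reply is not None:
                results[port] = reply
        pending = [p for p in pending if p not in results]
        if pending:
            await asyncio.sleep(retry_wait)  # wait before the next round
    return results, pending
```

Calling `asyncio.run(poll_all(range(1, 1701)))` returns the replies collected so far plus the list of ports that never answered, so one unresponsive device costs a retry round rather than stalling every other port.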




-- Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben --
--Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --

Re: PSINet/Cogent Latency

2002-07-23 Thread Streiner, Justin



On Mon, 22 Jul 2002, Alex Rubenstein wrote:

 Yes, it's horrid. I've been peering with PSI for going on three years, and
 it's never been as bad as it is now.

I took advantage of their free peering offer back in the day, and ended
up peering with them for about 18 months (06/1999 - 01/2001).  It took
about 9 months for them to get the circuit installed.

For the first few months, everything was great, but then we started
getting massive spikes in latency (300-700ms) just getting across the pipe
between my router and PSI's router.  I liken it to owning an old Audi -
they were great when they ran, but spent more time in the shop than on the
road.

The process of opening tickets and getting clued people in their NOC to
talk to me was an adventure.  PSI, much like some other providers, went to
great pains to try keeping $CUSTOMER from having a direct path to
$CLUEDPEOPLE.

They could never adequately explain the latency, other than it would
mysteriously go away and re-appear, more or less independent of the amount
of traffic on the circuit.  Eventually an upper-level engineer told me
that the saturation was due to congestion on their end of the pipe, and
getting some fatter pipe in there would take 60 days.

Fine.

90 days later, the bigger pipe is installed on their end and the latency
goes away for a few weeks, then comes back.

Wash.  Rinse.  Repeat.

A few more months of that, and I cancelled the peering.

 oddly enough, we see 30+ msec across a DS3 to them, which isn't that
 loaded (35 to 40 mb/s).

 Then, behind whatever we peer with, we see over 400 msec, with 50% loss,
 during business hours.

Re: PSINet/Cogent Latency

2002-07-23 Thread Scott Granados



It has a lot of similarities to old Audis.  Remember they used to work
fine and then for no reason used to fall into drive, rev high, and run
over Grandma and the kids!  Sounds a bit like their peering. :)

On Tue, 23 Jul 2002, Streiner, Justin wrote:

 
 On Mon, 22 Jul 2002, Alex Rubenstein wrote:
 
  Yes, it's horrid. I've been peering with PSI for going on three years, and
  it's never been as bad as it is now.
 
 I took advantage of their free peering offer back in the day, and ended
 up peering with them for about 18 months (06/1999 - 01/2001).  It took
 about 9 months for them to get the circuit installed.
 
 For the first few months, everything was great, but then we started
 getting massive spikes in latency (300-700ms) just getting across the pipe
 between my router and PSI's router.  I liken it to owning an old Audi -
 they were great when they ran, but spent more time in the shop than on the
 road.
 
 The process of opening tickets and getting clued people in their NOC to
 talk to me was an adventure.  PSI, much like some other providers, went to
 great pains to try keeping $CUSTOMER from having a direct path to
 $CLUEDPEOPLE.
 
 They could never adequately explain the latency, other than it would
 mysteriously go away and re-appear, more or less independent of the amount
 of traffic on the circuit.  Eventually an upper-level engineer told me
 that the saturation was due to congestion on their end of the pipe, and
 getting some fatter pipe in there would take 60 days.
 
 Fine.
 
 90 days later, the bigger pipe is installed on their end and the latency
 goes away for a few weeks, then comes back.
 
 Wash.  Rinse.  Repeat.
 
 A few more months of that, and I cancelled the peering.
 
  oddly enough, we see 30+ msec across a DS3 to them, which isn't that
  loaded (35 to 40 mb/s).
 
  Then, behind whatever we peer with, we see over 400 msec, with 50% loss,
  during business hours.

Re: PSINet/Cogent Latency

2002-07-23 Thread Vadim Antonov





Some long long long time ago I wrote a small tool called snmpstatd.  Back 
then Sprint management was gracious to allow me to release it as a 
public-domain code.

It basically collects usage statistics (in 30-sec peaks and 5-min
averages), memory and CPU utilization from routers, by performing
_asynchronous_ SNMP polling.  I believe it can scale to about 5000-1
routers.  It also performs accurate time base interpolation for 30-sec
sampling (i.e. it always requests router's local time and uses it for
computing accurate 30-sec peak usage).
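
A minimal Python sketch of the time-base idea, not taken from snmpstatd itself. It assumes the device's clock is read as SNMP sysUpTime (TimeTicks, hundredths of a second) in the same response as a 32-bit octet counter; the function name and sample values are invented.

```python
COUNTER32_MOD = 2 ** 32  # a 32-bit SNMP counter wraps at this value


def rate_bps(prev, curr):
    """Bits per second between two readings, timed by the device's own clock.

    prev, curr: (sys_uptime_ticks, if_octets), both values taken from the
    same SNMP response so that poller-side delay does not skew the interval.
    """
    ticks = curr[0] - prev[0]
    if ticks <= 0:
        raise ValueError("uptime went backwards: device rebooted or ticks wrapped")
    octets = (curr[1] - prev[1]) % COUNTER32_MOD  # tolerates one counter wrap
    return octets * 8 * 100 / ticks


# Hypothetical readings 30 s apart (3000 ticks), with the counter wrapping.
print(rate_bps((1000, 4_294_967_000), (4000, 3_749_704)))
```

Because the interval comes from the router, a reply that reaches the poller a few seconds late still produces the correct rate for the period the router actually measured.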

The data is stored in text files which are extremely easy to parse.

The configuration is text-based; it also includes compact status alarm 
output (i.e. which routers/links are down),  PostScript chart generator,
and troff/nroff based text report generator, with summary downtime and
usage figures + significant events.  The tool was used routinely to 
produce reporting on ICM-NET performance for NSF.

This thing may need some hacking to accommodate later-day IOS bogosities,
though.

If anyone wants it, I have it at www.kotovnik.com/~avg/snmpstatd.tar.gz

--vadim

On Mon, 22 Jul 2002, Gary E. Miller wrote:

 
 Yo Alexander!
 
 On Tue, 23 Jul 2002, Alexander Koch wrote:
 
  imagine some four routers dying or not answering queries,
  you will see the poll script give you timeout after timeout
  after timeout and with some 50 to 100 routers and the
  respective interfaces you see mrtg choke badly, losing data.
 
 Yep.  Anything gets behind and it all gets behind.
 
 That is why we run multiple copies of MRTG.  That way polling for one set
 of hosts does not have to wait for another set.  If one set is timing
 out the other just keeps on as usual.
 
 RGDS
 GARY
 ---
 Gary E. Miller Rellim 20340 Empire Blvd, Suite E-3, Bend, OR 97701
   [EMAIL PROTECTED]  Tel:+1(541)382-8588 Fax: +1(541)382-8676

PSINet/Cogent Latency

2002-07-22 Thread Derek Samford



There was some mail being tossed around earlier about Cogent
having latency. I'm actually seeing this on PSINet (Now owned by
Cogent.) Is anyone else still seeing the latency they were experiencing
earlier?

Derek

Re: PSINet/Cogent Latency

2002-07-22 Thread Alex Rubenstein




Yes, it's horrid. I've been peering with PSI for going on three years, and
it's never been as bad as it is now.

oddly enough, we see 30+ msec across a DS3 to them, which isn't that
loaded (35 to 40 mb/s).

Then, behind whatever we peer with, we see over 400 msec, with 50% loss,
during business hours.



On Mon, 22 Jul 2002, Derek Samford wrote:


   There was some mail being tossed around earlier about Cogent
 having latency. I'm actually seeing this on PSINet (Now owned by
 Cogent.) Is anyone else still seeing the latency they were experiencing
 earlier?

 Derek


-- Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben --
--Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --

RE: PSINet/Cogent Latency

2002-07-22 Thread Phil Rosenthal



40mb/s isn't loaded for a DS3?

--Phil

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
Alex Rubenstein
Sent: Monday, July 22, 2002 8:27 PM
To: Derek Samford
Cc: [EMAIL PROTECTED]
Subject: Re: PSINet/Cogent Latency




Yes, it's horrid. I've been peering with PSI for going on three years,
and it's never been as bad as it is now.

oddly enough, we see 30+ msec across a DS3 to them, which isn't that
loaded (35 to 40 mb/s).

Then, behind whatever we peer with, we see over 400 msec, with 50% loss,
during business hours.



On Mon, 22 Jul 2002, Derek Samford wrote:


   There was some mail being tossed around earlier about Cogent having
 latency. I'm actually seeing this on PSINet (Now owned by
 Cogent.) Is anyone else still seeing the latency they were
 experiencing earlier?

 Derek


-- Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben --
--Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --

Re: PSINet/Cogent Latency

2002-07-22 Thread G. Scott Granados

Nah, that's not loaded.  It's not loaded until you make it go into alarm by
passing traffic :) :).

- Original Message -
From: Phil Rosenthal [EMAIL PROTECTED]
To: 'Alex Rubenstein' [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, July 22, 2002 6:05 PM
Subject: RE: PSINet/Cogent Latency

 40mb/s isn't loaded for a DS3?

 --Phil

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
 Alex Rubenstein
 Sent: Monday, July 22, 2002 8:27 PM
 To: Derek Samford
 Cc: [EMAIL PROTECTED]
 Subject: Re: PSINet/Cogent Latency

 Yes, it's horrid. I've been peering with PSI for going on three years,
 and it's never been as bad as it is now.

 oddly enough, we see 30+ msec across a DS3 to them, which isn't that
 loaded (35 to 40 mb/s).

 Then, behind whatever we peer with, we see over 400 msec, with 50% loss,
 during business hours.

 On Mon, 22 Jul 2002, Derek Samford wrote:

  There was some mail being tossed around earlier about Cogent having
  latency. I'm actually seeing this on PSINet (Now owned by
  Cogent.) Is anyone else still seeing the latency they were
  experiencing earlier?

  Derek

 -- Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben --
 --Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --

Re: PSINet/Cogent Latency

2002-07-22 Thread Brian

bwahaha, 2 funnee.  I gotta think most people would be thinking of adding
another ds3 at that point.

Bri

- Original Message -
From: Phil Rosenthal [EMAIL PROTECTED]
To: 'Alex Rubenstein' [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, July 22, 2002 6:05 PM
Subject: RE: PSINet/Cogent Latency

 40mb/s isn't loaded for a DS3?

 --Phil

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
 Alex Rubenstein
 Sent: Monday, July 22, 2002 8:27 PM
 To: Derek Samford
 Cc: [EMAIL PROTECTED]
 Subject: Re: PSINet/Cogent Latency

 Yes, it's horrid. I've been peering with PSI for going on three years,
 and it's never been as bad as it is now.

 oddly enough, we see 30+ msec across a DS3 to them, which isn't that
 loaded (35 to 40 mb/s).

 Then, behind whatever we peer with, we see over 400 msec, with 50% loss,
 during business hours.

 On Mon, 22 Jul 2002, Derek Samford wrote:

  There was some mail being tossed around earlier about Cogent having
  latency. I'm actually seeing this on PSINet (Now owned by
  Cogent.) Is anyone else still seeing the latency they were
  experiencing earlier?

  Derek

 -- Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben --
 --Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --

Re: PSINet/Cogent Latency

2002-07-22 Thread Alex Rubenstein




You certainly would, except for the fact that the provider is in
bankruptcy and won't/can't answer the phone.

We wanted to do an oc3 or oc12 or gig-e, but that was replied to with,
wha?



On Mon, 22 Jul 2002, Brian wrote:

 bwahaha, 2 funnee.  I gotta think most people would be thinking of adding
 another ds3 at that point.

 Bri

 - Original Message -
 From: Phil Rosenthal [EMAIL PROTECTED]
 To: 'Alex Rubenstein' [EMAIL PROTECTED]
 Cc: [EMAIL PROTECTED]
 Sent: Monday, July 22, 2002 6:05 PM
 Subject: RE: PSINet/Cogent Latency


 
  40mb/s isn't loaded for a DS3?
 
  --Phil
 
  -Original Message-
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
  Alex Rubenstein
  Sent: Monday, July 22, 2002 8:27 PM
  To: Derek Samford
  Cc: [EMAIL PROTECTED]
  Subject: Re: PSINet/Cogent Latency
 
 
 
 
  Yes, it's horrid. I've been peering with PSI for going on three years,
  and it's never been as bad as it is now.
 
  oddly enough, we see 30+ msec across a DS3 to them, which isn't that
  loaded (35 to 40 mb/s).
 
  Then, behind whatever we peer with, we see over 400 msec, with 50% loss,
  during business hours.
 
 
 
  On Mon, 22 Jul 2002, Derek Samford wrote:
 
  
   There was some mail being tossed around earlier about Cogent having
   latency. I'm actually seeing this on PSINet (Now owned by
   Cogent.) Is anyone else still seeing the latency they were
   experiencing earlier?
  
   Derek
  
 
  -- Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben --
  --Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --
 
 
 


-- Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben --
--Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --

RE: PSINet/Cogent Latency

2002-07-22 Thread Phil Rosenthal



I call any upstream link 'over capacity' if either:
1) There is less than 50mb/s unused
2) The circuit is more than 50% in use

I guess by my definition a DS3 is always 'over capacity'
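
The two-part rule is easy to state as code. In this Python sketch the function name is made up and the link speeds are the usual approximate figures (DS3 about 45 Mb/s, OC-3 about 155, OC-12 about 622), which are assumptions of the example rather than part of the rule.

```python
def over_capacity(capacity_mbps, used_mbps):
    """True if less than 50 Mb/s is unused or the link is more than 50% full."""
    unused = capacity_mbps - used_mbps
    return unused < 50 or used_mbps > 0.5 * capacity_mbps


for name, capacity, used in [
    ("DS3", 45, 0),
    ("DS3", 45, 40),
    ("OC-3", 155, 40),
    ("OC-12", 622, 90),
    ("gig-e", 1000, 90),
]:
    print(name, used, over_capacity(capacity, used))
```

A DS3 fails the first test even when idle, since its total capacity is below the 50 Mb/s of headroom the rule demands.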

--Phil

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
Brian
Sent: Monday, July 22, 2002 9:36 PM
To: [EMAIL PROTECTED]; 'Alex Rubenstein'
Cc: [EMAIL PROTECTED]
Subject: Re: PSINet/Cogent Latency



bwahaha, 2 funnee.  I gotta think most people would be thinking of
adding another ds3 at that point.

Bri

- Original Message -
From: Phil Rosenthal [EMAIL PROTECTED]
To: 'Alex Rubenstein' [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, July 22, 2002 6:05 PM
Subject: RE: PSINet/Cogent Latency



 40mb/s isn't loaded for a DS3?

 --Phil

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf 
 Of Alex Rubenstein
 Sent: Monday, July 22, 2002 8:27 PM
 To: Derek Samford
 Cc: [EMAIL PROTECTED]
 Subject: Re: PSINet/Cogent Latency




 Yes, it's horrid. I've been peering with PSI for going on three years,
 and it's never been as bad as it is now.

 oddly enough, we see 30+ msec across a DS3 to them, which isn't that 
 loaded (35 to 40 mb/s).

 Then, behind whatever we peer with, we see over 400 msec, with 50% 
 loss, during business hours.



 On Mon, 22 Jul 2002, Derek Samford wrote:

 
  There was some mail being tossed around earlier about Cogent having
  latency. I'm actually seeing this on PSINet (Now owned by
  Cogent.) Is anyone else still seeing the latency they were
  experiencing earlier?
 
  Derek
 

 -- Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben --
 --Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --

RE: PSINet/Cogent Latency

2002-07-22 Thread Alex Rubenstein





On Mon, 22 Jul 2002, Phil Rosenthal wrote:


 I call any upstream link 'over capacity' if either:
 1) There is less than 50mb/s unused

That must work well for T1's and DS3's.


 2) The circuit is more than 50% in use

I call it 'over capacity' too, but that doesn't mean all the ducks are in
a row to get both sides to realise an upgrade is needed, and even if they
do realise it, to actually get it done. I am sure 2238092 people on this
list can complain of the same problem.

So, what do you do? You monitor its usage, making adjustments to make
sure it doesn't get clobbered. You can easily run DS-3s at 35 to 40
mbit/sec, with little to no increase in latency from the norm. Many
people do this as well, even up to OC12 or higher levels all the time.




 I guess by my definition a DS3 is always 'over capacity'

Which must work very well for those DS3's doing 10 to 20 mb/s. Do you
upgrade those to OC3 or beyond?


-- Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben --
--Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --

RE: PSINet/Cogent Latency

2002-07-22 Thread Phil Rosenthal



Actually, I wouldn't think about getting T1, DS3 or OC3 in the first
place ;)
OC-12 is the minimum link I would even look at -- and my preference is
gig-e... Even if there is only 90 megs on the interface...

--Phil

-Original Message-
From: Alex Rubenstein [mailto:[EMAIL PROTECTED]] 
Sent: Monday, July 22, 2002 10:02 PM
To: Phil Rosenthal
Cc: [EMAIL PROTECTED]
Subject: RE: PSINet/Cogent Latency




On Mon, 22 Jul 2002, Phil Rosenthal wrote:


 I call any upstream link 'over capacity' if either:
 1) There is less than 50mb/s unused

That must work well for T1's and DS3's.


 2) The circuit is more than 50% in use

I call it 'over capacity' too, but that doesn't mean all the ducks are
in a row to get both sides to realise an upgrade is needed, and even if
they do realise it, to actually get it done. I am sure 2238092 people on
this list can complain of the same problem.

So, what do you do? You monitor it's usage, making adjustments to make
sure it doesn't get clobbered. You can easily run DS-3s at 35 to 40
mbit/sec, with little to none increase in latency from the norm. Many
people do this as well, even up to OC12 or higher levels all the time.




 I guess by my definition a DS3 is always 'over capacity'

Which must work very well for those DS3's doing 10 to 20 mb/s. Do you
upgrade those to OC3 or beyond?


-- Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben --
--Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --

Re: PSINet/Cogent Latency

2002-07-22 Thread Richard A Steenbergen



On Mon, Jul 22, 2002 at 10:01:36PM -0400, Alex Rubenstein wrote:
 
 So, what do you do? You monitor it's usage, making adjustments to make
 sure it doesn't get clobbered. You can easily run DS-3s at 35 to 40
 mbit/sec, with little to none increase in latency from the norm. Many
 people do this as well, even up to OC12 or higher levels all the time.

Just remember that while a 5 minute average may not be at 100%, the 
microbursts are probably quite a bit over that.
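
A toy Python illustration of how a 5-minute average hides short bursts. The per-second traffic profile is invented: 270 seconds at 36 Mb/s and 30 seconds pinned at a 45 Mb/s line rate, and the function name is hypothetical.

```python
def summarize(per_second_mbps, line_rate_mbps):
    """Average, peak, and number of seconds spent at or above line rate."""
    average = sum(per_second_mbps) / len(per_second_mbps)
    peak = max(per_second_mbps)
    saturated = sum(1 for r in per_second_mbps if r >= line_rate_mbps)
    return average, peak, saturated


traffic = [36] * 270 + [45] * 30  # one made-up 5-minute window on a DS3
average, peak, saturated = summarize(traffic, line_rate_mbps=45)
print(average, peak, saturated)
```

The window averages under 37 Mb/s, which looks comfortable on a 5-minute graph, yet the link spent 30 of those seconds full, and that is when queues build and latency and loss show up.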

For an ISP who actually cares about making money it's not *easy* to say
"I'm terminating my peer to PSI because of their degraded performance and
unwillingness to upgrade", but a de-localpref'ing is probably a good idea. :)

-- 
Richard A Steenbergen [EMAIL PROTECTED]   http://www.e-gerbil.net/ras
PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)

RE: PSINet/Cogent Latency

2002-07-22 Thread Brian Wallingford



Good for you, Phil.  Chime in again when you've got something useful to
offer.

In the meantime, you may want to review Economics 101 along with certain
queueing schemes, especially RED (no, I'm not endorsing the idea of 
oversubscribing to the extreme, but then again, neither was Alex).

Also, re-read the previous post.  There's a big difference between choice
and facility.

Did you grow up spending Summers in the Hamptons with no conception of the
value of a dollar, or are you simply trolling?

-brian


On Mon, 22 Jul 2002, Phil Rosenthal wrote:

:
:Actually, I wouldn't think about getting T1, DS3 or OC3 in the first
:place ;)
:Oc-12 is the minimum link I would even look at -- and my preference is
:gig-e... Even if there is only 90 megs on the interface...
:
:--Phil
:
:-Original Message-
:From: Alex Rubenstein [mailto:[EMAIL PROTECTED]] 
:Sent: Monday, July 22, 2002 10:02 PM
:To: Phil Rosenthal
:Cc: [EMAIL PROTECTED]
:Subject: RE: PSINet/Cogent Latency
:
:
:
:
:On Mon, 22 Jul 2002, Phil Rosenthal wrote:
:
:
: I call any upstream link 'over capacity' if either:
: 1) There is less than 50mb/s unused
:
:That must work well for T1's and DS3's.
:
:
: 2) The circuit is more than 50% in use
:
:I call it 'over capacity' too, but that doesn't mean all the ducks are
:in a row to get both sides to realise an upgrade is needed, and even if
:they do realise it, to actually get it done. I am sure 2238092 people on
:this list can complain of the same problem.
:
:So, what do you do? You monitor its usage, making adjustments to make
:sure it doesn't get clobbered. You can easily run DS-3s at 35 to 40
:mbit/sec, with little to no increase in latency from the norm. Many
:people do this as well, even up to OC12 or higher levels all the time.
:
:
:
:
: I guess by my definition a DS3 is always 'over capacity'
:
:Which must work very well for those DS3's doing 10 to 20 mb/s. Do you
:upgrade those to OC3 or beyond?
:
:
:-- Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben --
:--Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --
:
:
:
:

RE: PSINet/Cogent Latency

2002-07-22 Thread Phil Rosenthal



With the price of transit where it is today:
#1 Transit is often cheaper than peering (if you factor in port costs on
public exchanges, or link costs for private exchanges)
#2 The difference in price is likely not large enough for me to risk:
saturation, latency, etc...

My customers pay me to provide them a premium service, and I see value
in providing that service.

Some people have no problem selling cogent -- what can I say... You get
what you pay for...

And no, I'm not trolling.  Is having a different opinion not allowed
now?

And 40mbit over a 45mbit circuit, if it is to an uplink/peer -- well, if
he has customers who are connected at 100mbit switched uncapped (likely)
-- then many customers (possibly even some DSL customers...) can flood
off his peer links with only a 5mbit stream.

--Phil

-Original Message-
From: Brian Wallingford [mailto:[EMAIL PROTECTED]] 
Sent: Monday, July 22, 2002 11:13 PM
To: Phil Rosenthal
Cc: 'Alex Rubenstein'; [EMAIL PROTECTED]
Subject: RE: PSINet/Cogent Latency


Good for you, Phil.  Chime in again when you've got something useful to
offer.

In the meantime, you may want to review Economics 101 along with certain
queueing schemes, especially RED (no, I'm not endorsing the idea of 
oversubscribing to the extreme, but then again, neither was Alex).

Also, re-read the previous post.  There's a big difference between
choice and facility.

Did you grow up spending Summers in the Hamptons with no conception of
the value of a dollar, or are you simply trolling?

-brian


On Mon, 22 Jul 2002, Phil Rosenthal wrote:

:
:Actually, I wouldn't think about getting T1, DS3 or OC3 in the first
:place ;)
:Oc-12 is the minimum link I would even look at -- and my preference is
:gig-e... Even if there is only 90 megs on the interface...
:
:--Phil
:
:-Original Message-
:From: Alex Rubenstein [mailto:[EMAIL PROTECTED]] 
:Sent: Monday, July 22, 2002 10:02 PM
:To: Phil Rosenthal
:Cc: [EMAIL PROTECTED]
:Subject: RE: PSINet/Cogent Latency
:
:
:
:
:On Mon, 22 Jul 2002, Phil Rosenthal wrote:
:
:
: I call any upstream link 'over capacity' if either:
: 1) There is less than 50mb/s unused
:
:That must work well for T1's and DS3's.
:
:
: 2) The circuit is more than 50% in use
:
:I call it 'over capacity' too, but that doesn't mean all the ducks are
:in a row to get both sides to realise an upgrade is needed, and even if
:they do realise it, to actually get it done. I am sure 2238092 people on
:this list can complain of the same problem.
:
:So, what do you do? You monitor its usage, making adjustments to make
:sure it doesn't get clobbered. You can easily run DS-3s at 35 to 40
:mbit/sec, with little to no increase in latency from the norm. Many
:people do this as well, even up to OC12 or higher levels all the time.
:
:
:
:
: I guess by my definition a DS3 is always 'over capacity'
:
:Which must work very well for those DS3's doing 10 to 20 mb/s. Do you
:upgrade those to OC3 or beyond?
:
:
:-- Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben --
:--Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --
:
:
:
:

RE: PSINet/Cogent Latency

2002-07-22 Thread Randy Bush



 40mb/s isn't loaded for a DS3?

if you are measuring 40mb at five min intervals, micro peaks are pegged out
causing serious packet loss.

randy

RE: PSINet/Cogent Latency

2002-07-22 Thread Phil Rosenthal



My point exactly -- I guess some people disagree...
Probably with any sort of queuing there will only be minimal packet loss
at 40mbit, but at any point one more stream can push it up to 43mbit,
and then queuing might no longer be enough... (and even if it is, can we
say lag?)
--Phil

-Original Message-
From: Randy Bush [mailto:[EMAIL PROTECTED]] 
Sent: Monday, July 22, 2002 11:31 PM
To: Phil Rosenthal
Cc: [EMAIL PROTECTED]
Subject: RE: PSINet/Cogent Latency


 40mb/s isn't loaded for a DS3?

if you are measuring 40mb at five min intervals, micro peaks are pegged
out causing serious packet loss.

randy

Re: PSINet/Cogent Latency

2002-07-22 Thread Richard A Steenbergen



On Mon, Jul 22, 2002 at 11:34:44PM -0400, Phil Rosenthal wrote:
 
 My point exactly -- I guess some people disagree...
 Probably with any sort of queuing there will only be minimal packet loss
 at 40mbit, but at any point one more stream can push it up to 43mbit,
 and then queuing might no longer be enough... (and even if it is, can we
 say lag?)

Efficient packet loss is still packet loss. Just because you manage to 
make the link look good by slowing down TCP before your queueing latency 
starts going up doesn't make your network any less ghetto.

IMHO the biggest problem in peering is getting the other side to actively 
upgrade links to prevent congestion. If you're not in a position where you 
can dictate terms to your peer, move traffic off it and let economics take 
care of the rest. Leaving a congested peer up for your own benefit at the 
expense of your customers is one of the surest ways to lose customers to 
someone who doesn't.

I'd rather have a noncongested gige public peer than a ds3 private peer
any day.

-- 
Richard A Steenbergen [EMAIL PROTECTED]   http://www.e-gerbil.net/ras
PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)

RE: PSINet/Cogent Latency

2002-07-22 Thread william



Is there a patch or special config example available that would allow me to 
use mrtg (or rather rrdtool) to measure more often and then graph it in a 
way that would show the standard 5-min graph but also a separate line showing 
those micro bursts and actual peak usage?

On Mon, 22 Jul 2002, Randy Bush wrote:

 
  40mb/s isn't loaded for a DS3?
 
 if you are measuring 40mb at five min intervals, micro peaks are pegged out
 causing serious packet loss.
 
 randy

RE: PSINet/Cogent Latency

2002-07-22 Thread Brian Wallingford



On Mon, 22 Jul 2002, Phil Rosenthal wrote:

:
:With the price of transit where it is today:
:#1 Transit is often cheaper than peering (if you factor in port costs on
:public exchanges, or link costs for private exchanges)
:#2 The difference in price is likely not large enough for me to risk:
:saturation, latency, etc...
:
:My customers pay me to provide them a premium service, and I see value
:in providing that service.
:
:Some people have no problem selling cogent -- what can I say... You get
:what you pay for...
:
:And no, I'm not trolling.  Is having a different opinion not allowed
:now?
:And 40mbit over a 45mbit circuit, if it is to an uplink/peer -- well, if
:he has customers who are connected at 100mbit switched uncapped (likely)
:-- then many customers (possibly even some DSL customers...) can flood
:off his peer links with only a 5mbit stream.

Much better.  Your prior posts lacked context and continuity.

I've always advocated overprovisioning myself, vs. creative buffering,
queuing, and/or distracting the end user.  The statement "I wouldn't
think of getting T1, DS3 or OC3 in the first place", without context,
easily lends itself to misinterpretation.

cheers,
brian

:
:--Phil
:
:-Original Message-
:From: Brian Wallingford [mailto:[EMAIL PROTECTED]] 
:Sent: Monday, July 22, 2002 11:13 PM
:To: Phil Rosenthal
:Cc: 'Alex Rubenstein'; [EMAIL PROTECTED]
:Subject: RE: PSINet/Cogent Latency
:
:
:Good for you, Phil.  Chime in again when you've got something useful to
:offer.
:
:In the meantime, you may want to review Economics 101 along with certain
:queueing schemes, especially RED (no, I'm not endorsing the idea of 
:oversubscribing to the extreme, but then again, neither was Alex).
:
:Also, re-read the previous post.  There's a big difference between
:choice and facility.
:
:Did you grow up spending Summers in the Hamptons with no conception of
:the value of a dollar, or are you simply trolling?
:
:-brian
:
:
:On Mon, 22 Jul 2002, Phil Rosenthal wrote:
:
::
::Actually, I wouldn't think about getting T1, DS3 or OC3 in the first
::place ;)
::Oc-12 is the minimum link I would even look at -- and my preference is
::gig-e... Even if there is only 90 megs on the interface...
::
::--Phil
::
::-Original Message-
::From: Alex Rubenstein [mailto:[EMAIL PROTECTED]] 
::Sent: Monday, July 22, 2002 10:02 PM
::To: Phil Rosenthal
::Cc: [EMAIL PROTECTED]
::Subject: RE: PSINet/Cogent Latency
::
::
::
::
::On Mon, 22 Jul 2002, Phil Rosenthal wrote:
::
::
:: I call any upstream link 'over capacity' if either:
:: 1) There is less than 50mb/s unused
::
::That must work well for T1's and DS3's.
::
::
:: 2) The circuit is more than 50% in use
::
::I call it 'over capacity' too, but that doesn't mean all the ducks are
::in a row to get both sides to realise an upgrade is needed, and even if
::they do realise it, to actually get it done. I am sure 2238092 people on
::this list can complain of the same problem.
::
::So, what do you do? You monitor its usage, making adjustments to make
::sure it doesn't get clobbered. You can easily run DS-3s at 35 to 40
::mbit/sec, with little to no increase in latency from the norm. Many
::people do this as well, even up to OC12 or higher levels all the time.
::
::
::
::
:: I guess by my definition a DS3 is always 'over capacity'
::
::Which must work very well for those DS3's doing 10 to 20 mb/s. Do you
::upgrade those to OC3 or beyond?
::
::
::-- Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben --
::--Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --
::
::
::
::
:
:
:

RE: PSINet/Cogent Latency

2002-07-22 Thread Alex Rubenstein




Packet loss is not guaranteed, especially considering the queuing
mechanism used is not disclosed.

I.e., a simple hold queue north of 2048 packets will cause no loss, only
the occasional jitter/latency, most likely not even measurable by common
endpoints on the net.
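
For a sense of scale, the delay a full FIFO hold queue adds can be estimated
with simple arithmetic. The sketch below is only a back-of-the-envelope
calculation: the 45 Mbit/s usable capacity and the 1000-byte average packet
size are assumptions chosen for illustration, and 40 and 2048 are the queue
depths mentioned in this thread.

```python
# Rough worst-case delay added by a full FIFO hold queue.
# ASSUMPTIONS (not from the thread): 45 Mbit/s of usable DS3 capacity and
# an average packet size of 1000 bytes. Both are illustrative values.
LINK_BITS_PER_SECOND = 45_000_000
AVG_PACKET_BYTES = 1000


def full_queue_delay_ms(queue_packets: int) -> float:
    """Time in milliseconds to drain a full queue at line rate."""
    queued_bits = queue_packets * AVG_PACKET_BYTES * 8
    return queued_bits / LINK_BITS_PER_SECOND * 1000


for depth in (40, 2048):
    print(f"{depth:>5} packets -> {full_queue_delay_ms(depth):.1f} ms when full")
```

The delay only approaches these figures while the queue is actually full; a
queue that is mostly empty adds very little.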

I'm not endorsing, just correcting.



On Mon, 22 Jul 2002, Randy Bush wrote:


  40mb/s isn't loaded for a DS3?

 if you are measuring 40mb at five min intervals, micro peaks are pegged out
 causing serious packet loss.

 randy


-- Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben --
--Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --

Re: PSINet/Cogent Latency

2002-07-22 Thread Matt Zimmerman



On Mon, Jul 22, 2002 at 08:38:58PM -0700, [EMAIL PROTECTED] wrote:  

 Is there patch or special config example available that would allow me to
 use mrtg (or rather rrdtool) to measure more often and then graph it in a
 way that would show standard 5-min graph but also separate line showing
 those micro burst and actual peak usage?

Cricket (cricket.sourceforge.net).

-- 
 - mdz

RE: PSINet/Cogent Latency

2002-07-22 Thread Alex Rubenstein




An effective way would be to graph queue drops:

Serial4/1/1 is up, line protocol is up
  Description: to PSI via 3x-xxx-xxx-
  Internet address is 154.13.64.22/30
  Last clearing of show interface counters 5w4d
  Queueing strategy: fifo
  Output queue 0/40, 2275 drops; input queue 0/75, 0 drops

  30 second input rate 5000 bits/sec, 6 packets/sec
  30 second output rate 39911000 bits/sec, 4697 packets/sec
 144472370 packets input, 2769590243 bytes, 0 no buffer
 Received 0 broadcasts, 0 runts, 1 giants, 0 throttles
  0 parity
 5 input errors, 5 CRC, 0 frame, 0 overrun, 1 ignored, 0 abort
 1969955129 packets output, 430008350 bytes, 0 underruns


FYI, for those of you commenting on my full PSI pipe, with a very small
queue depth of only 40 packets, we've seen a 0.00011548% drop rate -- 1
in every 865914 packets sent. Agreed, not 0%, but still, arguably that
would never, ever be noticed by anyone.
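
The figures above can be checked directly from the counters in the quoted
interface output. This is just the arithmetic, written as a short Python
sketch; it compares output drops against packets output, as the message does.

```python
# Counters copied from the "show interface" output quoted in this message.
output_drops = 2275
packets_output = 1_969_955_129

drop_fraction = output_drops / packets_output
print(f"drop rate: {drop_fraction * 100:.8f}%")
print(f"one drop per {packets_output / output_drops:,.0f} packets output")
```

Strictly speaking the denominator could include the dropped packets as well,
but at this ratio the difference is negligible.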

Once again, I don't condone; however, about 1/10,000th of a percent of
packet loss is easily worth the decreased cost in traffic sent to this
endpoint.

Anyone disagree?

(An important, separate note is that CAR/CEF drops due to ICMP reaching
over 10 mb/s would trigger the same counter.)





On Mon, 22 Jul 2002, [EMAIL PROTECTED] wrote:


 Is there patch or special config example available that would allow me to
 use mrtg (or rather rrdtool) to measure more often and then graph it in a
 way that would show standard 5-min graph but also separate line showing
 those micro burst and actual peak usage?

 On Mon, 22 Jul 2002, Randy Bush wrote:

 
   40mb/s isn't loaded for a DS3?
 
  if you are measuring 40mb at five min intervals, micro peaks are pegged out
  causing serious packet loss.
 
  randy
 


-- Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben --
--Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --

RE: PSINet/Cogent Latency

2002-07-22 Thread Phil Rosenthal




From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
Richard A Steenbergen
On Mon, Jul 22, 2002 at 11:34:44PM -0400, Phil Rosenthal wrote:

 I'd rather have a noncongested gige public peer than a ds3 private
peer any day.

Except apparently that's called trolling ;)


--Phil

RE: PSINet/Cogent Latency

2002-07-22 Thread Phil Rosenthal



As you probably guessed, I do...

TCP is designed to not saturate links, so... If you take what should be
60 megs of traffic and limit it to 45, queueing for a while or dropping
when the queue is full, the sessions will slow-start back up to a speed
slow enough that packets won't drop.  No (or very little) packet loss, but
lower quality of service anyway.

--Phil

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
Alex Rubenstein
Sent: Tuesday, July 23, 2002 12:05 AM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: RE: PSINet/Cogent Latency




An effective way would be to graph queue drops:

Serial4/1/1 is up, line protocol is up
  Description: to PSI via 3x-xxx-xxx-
  Internet address is 154.13.64.22/30
  Last clearing of show interface counters 5w4d
  Queueing strategy: fifo
  Output queue 0/40, 2275 drops; input queue 0/75, 0 drops

  30 second input rate 5000 bits/sec, 6 packets/sec
  30 second output rate 39911000 bits/sec, 4697 packets/sec
 144472370 packets input, 2769590243 bytes, 0 no buffer
 Received 0 broadcasts, 0 runts, 1 giants, 0 throttles
  0 parity
 5 input errors, 5 CRC, 0 frame, 0 overrun, 1 ignored, 0 abort
 1969955129 packets output, 430008350 bytes, 0 underruns


FYI, for those of you commenting on my full PSI pipe, with a very small
queue depth of only 40 packets, we've seen a 0.00011548% drop rate -- 1
in every 865914 packets sent. Agreed, not 0%, but still, arguably that
would never, ever be noticed by anyone.

Once again, I don't condone; however, about 1/10,000th of a percent of
packet loss is easily worth the decreased cost in traffic sent to this
endpoint.

Anyone disagree?

(An important, separate note is that CAR/CEF drops due to ICMP reaching
over 10 mb/s would trigger the same counter.)





On Mon, 22 Jul 2002, [EMAIL PROTECTED] wrote:


 Is there patch or special config example available that would allow me

 to use mrtg (or rather rrdtool) to measure more often and then graph 
 it in a way that would show standard 5-min graph but also separate 
 line showing those micro burst and actual peak usage?

 On Mon, 22 Jul 2002, Randy Bush wrote:

 
   40mb/s isn't loaded for a DS3?
 
  if you are measuring 40mb at five min intervals, micro peaks are 
  pegged out causing serious packet loss.
 
  randy
 


-- Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben --
--Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --

Re: PSINet/Cogent Latency

2002-07-22 Thread Richard A Steenbergen



On Mon, Jul 22, 2002 at 08:38:58PM -0700, [EMAIL PROTECTED] wrote:
 
 Is there patch or special config example available that would allow me
 to use mrtg (or rather rrdtool) to measure more often and then graph it
 in a way that would show standard 5-min graph but also separate line
 showing those micro burst and actual peak usage?

It's usually not practical to sample data that often, at least over snmp. 
30 seconds is reasonable if your poller doesn't suck (aka not mrtg), but 
that's still a fair amount of averaging.

As an example, looking at an interface doing 135Mbps average on a pretty
steady curve through Juniper's monitor interface which gives 2 second 
samples, I see between 120Mbps and 150Mbps fluctuations almost constantly.

Personally I would like to see the data collection done on the router 
itself where it is simple to collect data very frequently, then pushed 
out. This is particularly important when you are doing things like billing 
95th percentile, where a loss of connectivity between the polling machine 
and the device is a loss of billing information.

Why Juniper won't spend 5 minutes to make a simple lib so a program could
sample interface counters, so someone could write this kind of system to
run on the RE, is beyond me. I blame generations of dumbed down network
engineers wielding perl as their only tool. :)

-- 
Richard A Steenbergen [EMAIL PROTECTED]   http://www.e-gerbil.net/ras
PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)

Re: PSINet/Cogent Latency

2002-07-22 Thread Richard A Steenbergen



On Tue, Jul 23, 2002 at 12:04:34AM -0400, Alex Rubenstein wrote:
 
 An effective way would to graph queue drops:
 
 Serial4/1/1 is up, line protocol is up

ifInDiscards  = 1.3.6.1.2.1.2.2.1.13
ifOutDiscards = 1.3.6.1.2.1.2.2.1.19

A far more interesting thing to graph than temperature IMHO. :)
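
Both of those objects are 32-bit counters, so graphing them means turning two
successive readings into a rate and allowing for the counter wrapping. A
minimal Python sketch of that step follows. The readings are made up, the
function names are hypothetical, and fetching the values over SNMP is left
out entirely.

```python
# Turn two readings of an SNMP Counter32 (such as ifOutDiscards) into a
# per-second rate suitable for graphing. The readings below are made up;
# fetching them from a device is outside the scope of this sketch.
COUNTER32_MODULUS = 2**32


def counter_delta(previous: int, current: int) -> int:
    """Difference between two Counter32 readings, allowing one wrap."""
    return (current - previous) % COUNTER32_MODULUS


def discards_per_second(previous: int, current: int, elapsed_seconds: float) -> float:
    """Average discard rate over the time actually elapsed between polls."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return counter_delta(previous, current) / elapsed_seconds


# Normal case: the counter moved from 2275 to 2335 over a 30-second poll.
print(discards_per_second(2275, 2335, 30.0))

# Wrap case: the counter rolled over past 2**32 - 1 between polls.
print(discards_per_second(4_294_967_290, 24, 30.0))
```

The modulo handles a single wrap between polls; a counter that wraps more
than once in one interval, or that is reset by a reboot, cannot be recovered
this way.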

-- 
Richard A Steenbergen [EMAIL PROTECTED]   http://www.e-gerbil.net/ras
PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)

Re: PSINet/Cogent Latency

2002-07-22 Thread Doug Clements

- Original Message -
From: Richard A Steenbergen [EMAIL PROTECTED]
Subject: Re: PSINet/Cogent Latency
 Personally I would like to see the data collection done on the router
 itself where it is simple to collect data very frequently, then pushed
 out. This is particularly important when you are doing things like billing
 95th percentile, where a loss of connectivity between the polling machine
 and the device is a loss of billing information.

Redbacks can actually do this with what they call Bulkstats. It collects data
on specified interfaces and uploads the data file via ftp at a specified
interval. Pretty slick.

Course, this isn't very helpful with Redback's extensive core router lineup,
but still.

--Doug

RE: PSINet/Cogent Latency

2002-07-22 Thread Phil Rosenthal



Call me crazy -- but what's wrong with setting up RRDtool with a
heartbeat time of 30 seconds, and putting in cron:
* * * * * rrdscript.sh ; sleep 30s ; rrdscript.sh

Wouldn't that work just as well?

I haven't tried it -- so perhaps this is too taxing (probably you would
only run this on a few interfaces anyway)...

The last time I tested such a thing was on an uplink doing ~200 megs, and
the deviation was about +/- 5 mbit/s.

--Phil

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
Doug Clements
Sent: Tuesday, July 23, 2002 12:59 AM
To: Richard A Steenbergen
Cc: [EMAIL PROTECTED]
Subject: Re: PSINet/Cogent Latency



- Original Message -
From: Richard A Steenbergen [EMAIL PROTECTED]
Subject: Re: PSINet/Cogent Latency
 Personally I would like to see the data collection done on the router 
 itself where it is simple to collect data very frequently, then pushed

 out. This is particularly important when you are doing things like 
 billing 95th percentile, where a loss of connectivity between the 
 polling machine and the device is a loss of billing information.

Redbacks can actually do this with what they call Bulkstats. It collects
data on specified interfaces and uploads the data file via ftp at a
specified interval. Pretty slick.

Course, this isn't very helpful with Redback's extensive core router
lineup, but still.

--Doug

Re: PSINet/Cogent Latency

2002-07-22 Thread Doug Clements

- Original Message -
From: Phil Rosenthal [EMAIL PROTECTED]
Subject: RE: PSINet/Cogent Latency

 Call me crazy -- but what's wrong with setting up RRDtool with a
 heartbeat time of 30 seconds, and putting in cron:
 * * * * * rrdscript.sh ; sleep 30s ; rrdscript.sh

 Wouldn't that work just as well?

 I haven't tried it -- so perhaps this is too taxing (probably you would
 only run this on a few interfaces anyway)...

Redback's implementation overcame the limitation of monitoring say, 20,000
user circuits. You don't want to poll 20,000 interfaces for maybe 4 counters
each, every 5 minutes.

I think the problem with using rrdtool for billing purposes as described is
that data can (and does) get lost. If your poller is a few cycles late, the
burstable bandwidth measured goes up when the poller catches up to the
interface counters. More bursting is bad for %ile (or good if you're selling
it), and the customer won't like the fact that they're getting charged for
artificially high measurements.
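
The effect described here can be shown with a small Python sketch. It
compares dividing each counter delta by the nominal polling interval with
dividing by the time that actually elapsed. The poll times and the steady
40 Mbit/s load are hypothetical, and the sketch does not describe how
RRDtool or any particular poller behaves.

```python
# A link carrying a steady 40 Mbit/s, polled nominally every 300 seconds.
# The poll times are hypothetical: the second poll arrives 30 seconds late.
STEADY_BITS_PER_SECOND = 40_000_000
NOMINAL_INTERVAL = 300

poll_times = [0, 330, 600, 900]  # seconds; the 300 s poll ran at 330 s
counters = [t * STEADY_BITS_PER_SECOND for t in poll_times]  # cumulative bits


def rates_mbps(use_actual_elapsed: bool) -> list:
    """Per-interval rates, dividing by nominal or by actual elapsed time."""
    rates = []
    for i in range(1, len(poll_times)):
        delta_bits = counters[i] - counters[i - 1]
        elapsed = poll_times[i] - poll_times[i - 1]
        divisor = elapsed if use_actual_elapsed else NOMINAL_INTERVAL
        rates.append(delta_bits / divisor / 1_000_000)
    return rates


print("assuming every poll is on time:", rates_mbps(use_actual_elapsed=False))
print("using the real timestamps:     ", rates_mbps(use_actual_elapsed=True))
```

When the late poll is treated as on time, one interval reads high and the
next reads low even though the traffic never changed; using the recorded
timestamps removes the artificial peak.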

Bulkstats lets the measurement happen independent of the reporting.

--Doug

RE: PSINet/Cogent Latency

2002-07-22 Thread Phil Rosenthal



I don't think RRD is that bad if you are gonna check only every 5
minutes...

Again, perhaps I'm just missing something, but let's say you measure
30 seconds late, and it thinks it's on time -- So that one sample will
be higher , then the next one will be on time, so 30 seconds early for
that sample -- it will be lower.  On the whole -- it will be accurate
enough -- no?

Besides I think RRD has a bunch of things built in to deal with
precisely this problem.

I'm not saying a hardware solution can't be better -- but it is likely
overkill compared to a few cheap intels running RRD -- assuming your
snmpd can deal with the load...

--Phil

-Original Message-
From: Doug Clements [mailto:[EMAIL PROTECTED]] 
Sent: Tuesday, July 23, 2002 1:50 AM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: PSINet/Cogent Latency


- Original Message -
From: Phil Rosenthal [EMAIL PROTECTED]
Subject: RE: PSINet/Cogent Latency

 Call me crazy -- but what's wrong with setting up RRDtool with a 
 heartbeat time of 30 seconds, and putting in cron:
 * * * * * rrdscript.sh ; sleep 30s ; rrdscript.sh

 Wouldn't that work just as well?

 I haven't tried it -- so perhaps this is too taxing (probably you 
 would only run this on a few interfaces anyway)...

Redback's implementation overcame the limitation of monitoring say,
20,000 user circuits. You don't want to poll 20,000 interfaces for maybe
4 counters each, every 5 minutes.

I think the problem with using rrdtool for billing purposes as described
is that data can (and does) get lost. If your poller is a few cycles
late, the burstable bandwidth measured goes up when the poller catches
up to the interface counters. More bursting is bad for %ile (or good if
you're selling it), and the customer won't like the fact that they're
getting charged for artificially high measurements.

Bulkstats lets the measurement happen independent of the reporting.

--Doug

Re: PSINet/Cogent Latency

2002-07-22 Thread Richard A Steenbergen



On Tue, Jul 23, 2002 at 01:56:45AM -0400, Phil Rosenthal wrote:
 
 I don't think RRD is that bad if you are gonna check only every 5
 minutes...

RRD doesn't measure anything, it stores and graphs data. The perl pollers
everyone is using can barely keep up with 5 minute samples on a couple
dozen routers and a few hundred interfaces, requiring poller farms to
be distributed across a network, lest a box or part of the network break
and you lose data.

 Again, perhaps I'm just missing something, but let's say you measure
 30 seconds late, and it thinks it's on time -- So that one sample will
 be higher , then the next one will be on time, so 30 seconds early for
 that sample -- it will be lower.  On the whole -- it will be accurate
 enough -- no?

"Enough" is a relative term, but sure. :)

 I'm not saying a hardware solution can't be better -- but it is likely
 overkill compared to a few cheap intels running RRD -- assuming your
 snmpd can deal with the load...

What hardware... storing a few byte counters is trivial, but polling them
through snmp is what is hard (never trust a protocol named "simple" or
"trivial"). Creating a buffer of samples which can be periodically
sampled should be easy and painless. I don't know if I call periodic ftp 
painless but it's certainly a start.

-- 
Richard A Steenbergen [EMAIL PROTECTED]   http://www.e-gerbil.net/ras
PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)

Re: PSINet/Cogent Latency

2002-07-22 Thread Doug Clements

- Original Message -
From: Phil Rosenthal [EMAIL PROTECTED]
Subject: RE: PSINet/Cogent Latency

 I don't think RRD is that bad if you are gonna check only every 5
 minutes...

 Again, perhaps I'm just missing something, but let's say you measure
 30 seconds late, and it thinks it's on time -- So that one sample will
 be higher , then the next one will be on time, so 30 seconds early for
 that sample -- it will be lower.  On the whole -- it will be accurate
 enough -- no?

If you're polling every 5 minutes, with 2 retries per poll, and you miss 2
retries, then your next poll will be 5 minutes late. It's not disastrous, but
it's also not perfect. Again, peaks and valleys on your graph cost more than
smooth lines, even with the same total bandwidth.
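
A short Python sketch of why the shape of the graph matters under percentile
billing. The two sample sets are hypothetical, and the nearest-rank
definition of the 95th percentile is an assumption; billing systems differ
in exactly how they compute it.

```python
def percentile_95(samples):
    """Nearest-rank 95th percentile: skip the top 5% and take the next value."""
    ordered = sorted(samples)
    # Integer form of ceil(0.95 * n), which avoids floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100  # 1-based rank
    return ordered[rank - 1]


# Two hypothetical customers, 20 five-minute samples each (Mbit/s).
smooth = [10.0] * 20
bursty = [5.0] * 16 + [30.0] * 4  # same average, concentrated in 4 intervals

assert sum(smooth) == sum(bursty)  # identical total traffic

print("smooth 95th percentile:", percentile_95(smooth))
print("bursty 95th percentile:", percentile_95(bursty))
```

Both customers move the same amount of data, but the bursty one is billed at
a much higher rate, so any measurement error that adds artificial peaks tends
to land in the provider's favor.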

Do you want to be the one to tell your customers your billing setup is
"accurate enough", and especially that it's going to have a tendency to be
"accurate enough" in your favor?

 Besides I think RRD has a bunch of things built in to deal with
 precisely this problem.

Wouldn't that be just spiffy!

 I'm not saying a hardware solution can't be better -- but it is likely
 overkill compared to a few cheap intels running RRD -- assuming your
 snmpd can deal with the load...

No extra hardware needed. I think the desired solution was integration into
the router. The data is already there, you just need software to compile it
and ship it out via a reliable reporting mechanism. For being relatively
simple, it's a nice idea that it could replace the "almost" in an "almost
accurate" billing process.

--Doug
