Hi,

Does anyone have recommendations on how to pinpoint and react to packet 
loss across the internet, preferably in an automated fashion?  For detection 
I'm currently looking at running smoketrace from inside my network, but I'd 
love to be able to run traceroutes from my edge routers, triggered during 
periods of loss.  I have Juniper MX80s on one end, where I'm hopeful I can 
cobble together some combination of RPM and event scripting to kick off a 
traceroute.  We have Cisco 4900Ms on the other end; the same thing may be 
possible there, but I'm not so sure.

I'd love to hear other suggestions and experience for detection and also for 
options on what I might be able to do when loss is detected on a path.

In my specific situation I control the equipment on both ends of the path 
that I care about.  Details are below.

We are a hosted service company and we currently have two data centers, DC A 
and DC B.  DC A uses Juniper MX routers, advertises our own IP space and takes 
full BGP feeds from two providers, ISPs A1 and A2.  At DC B we have a smaller 
installation and instead take redundant drops (and IP space) from a single 
provider, ISP B1, which in turn peers upstream with two providers, B2 and B3.

We have a fairly consistent bi-directional stream of traffic between DC A and 
DC B.  Both ISP A1 and ISP A2 have good peering with ISP B2, so under normal 
network conditions traffic flows across ISP B1 to B2 and then to either ISP A1 
or A2.

Oversimplified ASCII pic showing only the normal best paths:

        -- ISP A1 ------------ ISP B2 --
DC A --|                                |--- ISP B1 ----- DC B
        -- ISP A2 ------------ ISP B2 --


With increasing frequency we've been experiencing packet loss along the path 
from DC A to DC B.  Usually the periods of loss are brief, 30 seconds to a 
minute, but they are total blackouts.

I'd like to be able to collect enough relevant data to pinpoint the trouble 
spot as closely as possible so I can take it to the ISPs and request a 
solution.  The blackouts are so brief that it's impossible to log in and get a 
trace in time, hence the desire to automate it.
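
To make the idea concrete, here is a rough sketch of the kind of automation I 
have in mind, run from a host inside the network rather than on the routers.  
It assumes a Linux box with the stock ping and traceroute binaries (the -W 
timeout flag means seconds on Linux iputils but not everywhere).  The function 
names, the 3-probe threshold and the 192.0.2.1 target are placeholders of my 
own, not anything I have in production.

```python
import datetime
import subprocess
import time


def probe_ok(target, timeout_s=1):
    """Send one ICMP echo with the system ping; True if a reply came back."""
    # Assumes Linux iputils flags: -c probe count, -W reply timeout in seconds.
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), target],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def run_traceroute(target):
    """Capture a numeric traceroute; -n skips DNS so it finishes sooner."""
    result = subprocess.run(
        ["traceroute", "-n", "-w", "1", "-q", "1", target],
        capture_output=True,
        text=True,
    )
    return result.stdout


def watch(target, probe=probe_ok, trace=run_traceroute, threshold=3,
          interval_s=1.0, max_probes=None, log=print, sleep=time.sleep):
    """Probe the target; log one traceroute per outage.

    An outage is declared after `threshold` consecutive lost probes.
    Returns the number of traceroutes that were logged.
    """
    misses = 0
    traces = 0
    sent = 0
    while max_probes is None or sent < max_probes:
        sent += 1
        if probe(target):
            misses = 0  # path is back, re-arm the trigger
        else:
            misses += 1
            if misses == threshold:
                stamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
                log(f"{stamp} loss to {target}, traceroute follows:\n"
                    f"{trace(target)}")
                traces += 1
        sleep(interval_s)
    return traces


if __name__ == "__main__":
    watch("192.0.2.1")  # placeholder address from the documentation range
```

Firing only once per outage keeps the log short, but a 30-second blackout 
would leave room for several traces; changing the trigger is a one-line edit.  
A traceroute captured while the path is healthy would also be needed as a 
baseline to compare against.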

I can provide more details off list if helpful.  I'm trying not to vilify 
anyone, especially without copious amounts of data points.

As a side question, what should my expectation be regarding packet loss when 
sending packets from point A to point B across multiple providers across the 
internet?  Is 30 seconds to a minute of blackout between two endpoints every 
couple of weeks par for the course?  My directly connected ISPs offer me an 
SLA, but what should I reasonably expect from them when one of their upstream 
peers (or a peer of their peers) has issues?  If this turns out to be BGP 
reconvergence or similar, do I have any options?
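
For comparison against an SLA figure, the blackouts above can be turned into 
an availability percentage.  This is only back-of-the-envelope arithmetic: it 
assumes exactly one outage per 14 days (my reading of "every couple of weeks") 
and the helper name is made up for illustration.

```python
def availability_pct(outage_s, period_days):
    """Percent of the period during which the path was up."""
    period_s = period_days * 24 * 60 * 60
    return 100.0 * (1.0 - outage_s / period_s)


for outage_s in (30, 60):
    # 14 days stands in for "every couple of weeks" (an assumption).
    pct = availability_pct(outage_s, 14)
    print(f"{outage_s}s outage per 14 days: {pct:.4f}% available")
```

Under those assumptions a one-minute blackout every two weeks works out to 
roughly 99.995% availability for the path.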

many thanks,
-andy
