> My opinion on QoS for networks with low bandwidth is to always implement
> it. It's really not that difficult and you never know when microbursts
> could be affecting things. Believe me, even if your upstream link is a
> 1Gb/s circuit, and your outbound traffic is less than 10Mb/s, you can
> still have drops due to microbursts.
> 
> Voice and video, depending on your use of them, are normally very
> important and very sensitive to drops/delays. It will never cost you
> anything (besides time) to learn to implement some simple QoS policies;
> however, if you have customers who complain about bad voice/video quality,
> it will cost you your reputation and possibly revenue when said customers
> cancel their contracts because of a service you cannot reliably provide.
> 
> -evt

As a convert from the "bandwidth is always the answer" camp, I'd like to echo 
these sentiments. And apologize for the incoming wall of text.

Our network is not 'congested' - at least not in the sense of anything you'd 
pick up on a Cacti bandwidth graph. A better way to think of it is that we have 
many points of contention.

For us, it comes down to a matter of buffer allocation and what I guess you 
could call the "funnelling" of packets. If you have a device with only two 
interfaces at the same speed, there's nothing to think about. The challenge 
comes when there are more interfaces. Consider a simple three-interface 
situation, where traffic from two interfaces (A and B) is destined out a third 
(Z). All are 1 Gbps. Say A and B each average 100 Mbps into the router. It's 
easy to think that 200 Mbps should fit comfortably into 1 Gbps, but that isn't 
completely accurate. The packets are not neatly interleaved so that they "fit 
together" - in reality, some of them arrive at the same instant. The overall 
bitrate measured over an entire second is comparatively low, but the packets 
themselves are arriving at 1 Gbps. If you want a car analogy, think of a 
multi-lane freeway: two cars travelling in different lanes, in the same 
direction, at the same speed, the only two cars for a quarter mile. Just 
because the highway is rated for 80 cars in that space at that speed doesn't 
mean there won't be a crash if one suddenly changes lanes.

Through buffering, the router is able to take this data and cram it into 
interface Z, but it needs a large enough packet queue to deal with the packets 
that arrive at the same time.
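
To put some rough numbers on that, here's a quick back-of-the-envelope in 
Python (every figure below is made up for illustration, not measured on our 
network). Two equal-speed ports bursting into a third barely registers on a 
utilization graph, yet the egress queue still has to absorb roughly a full 
burst's worth of packets:

    # Toy microburst arithmetic - all numbers are illustrative.
    # Two 1 Gb/s ingress ports each deliver a back-to-back burst toward one
    # 1 Gb/s egress port at the same moment.

    LINE_RATE  = 1e9        # bits per second, all three ports
    PKT_BYTES  = 1500
    BURST_PKTS = 100        # packets per ingress port, sent back to back

    burst_bits = BURST_PKTS * PKT_BYTES * 8
    burst_dur  = burst_bits / LINE_RATE             # seconds each burst lasts

    # While the bursts overlap, bits arrive at 2 Gb/s but drain at only 1 Gb/s,
    # so the egress queue grows at 1 Gb/s for the duration of the burst.
    peak_queue_bytes = (2 * LINE_RATE - LINE_RATE) * burst_dur / 8

    # Averaged over a full second, the link looks nearly idle.
    avg_util_pct = 2 * burst_bits / LINE_RATE * 100

    print(f"each burst lasts {burst_dur * 1e6:.0f} us")
    print(f"peak egress queue ~{peak_queue_bytes / 1e3:.0f} kB "
          f"({peak_queue_bytes / PKT_BYTES:.0f} packets of buffer)")
    print(f"average utilization over one second: {avg_util_pct:.2f} %")

With those numbers you get about 1.2 ms of burst, a peak queue of roughly 
150 kB (100 packets), and an average utilization of 0.24% - exactly the kind 
of thing a five-minute Cacti average will never show you.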

This is called a "microburst", and it WILL cause packet delay - and drops - if 
the buffer isn't large enough. Anyone operating an IP SAN should be familiar 
with the concept. It is a big issue with switches used for iSCSI, such as the 
Cisco 3750s we started out with (despite common notions, QoS actually has to be 
enabled on those, as the default/'disabled' buffer allocation is insufficient 
to deal with microbursts).

If you want a really blatant example of this in action, you need look no 
further than Cisco's 2960, which ships with default buffer allocations so small 
that it has trouble sending data out a 100 Mbps port when that data arrives on 
a 1 Gbps port. (CSCte64832 
http://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst2960/software/release/12-2_53_se/release/notes/OL22233.html)
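
The same back-of-the-envelope arithmetic (again with invented numbers - I'm not 
claiming to know the 2960's actual per-queue allocation) shows how quickly a 
speed mismatch eats whatever buffer you do have:

    # Traffic arrives on a 1 Gb/s port and has to leave via a 100 Mb/s port,
    # so the queue fills at the difference between the two rates.
    IN_RATE   = 1e9
    OUT_RATE  = 100e6
    BUF_BYTES = 256e3       # assumed usable buffer for this queue - made up

    fill_rate = IN_RATE - OUT_RATE            # 900 Mb/s of queue growth
    survive_s = BUF_BYTES * 8 / fill_rate     # line-rate burst length before drops

    print(f"a line-rate burst longer than {survive_s * 1e3:.1f} ms overflows the queue")

That comes out to a couple of milliseconds of line-rate input, and that's with 
a fairly generous assumption about how much buffer the queue actually gets.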


Back in our early days of MPLS, our network was almost entirely Cisco 6500s. We 
had a number of these boxes acting as P/PE routers that just used the SFP ports 
built into the supervisors for optics. Most of our small sites were fine, but 
as Internet traffic levels grew, we noticed problems with our TV service 
(around 400 Mbps of TV traffic at the time). The issues
appeared similar to packet loss and were loosely coupled to peak Internet usage 
- it was very easy to see this on the graphs in the evening, but issues were 
observed during the day, as well. We did not understand why a gigabit link that 
only had 600-800 Mbps running over it would be having these issues. We 
eventually figured out the problem and moved everything onto interfaces with 
real buffers, and the problems disappeared. (We had not yet figured out DSCP to 
MPLS EXP mapping at that point.)

This highlighted for us that we cannot know or control what will happen with 
data to or from our customers on the Internet side. Using the SUP720s' onboard 
ports was a bad decision for a variety of reasons, but what would happen if a 
real DoS hit our equipment? Since our network handles voice and video as well, 
it is extremely important for us to protect our sensitive services. The only 
way to guarantee that is to allocate sufficient buffer space to the traffic you 
deem important. I think a couple of our MXes are still running default queues, 
but all marking is enforced at ingress.

Of course, since we implement QoS throughout the network, everything scales 
well all the way down to the access equipment. It is a lot of work, but it 
would be nearly impossible to operate a converged network reliably without the 
ability to tell traffic apart and prioritize the important stuff.
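
If it helps to see why classification matters and not just raw buffer, here is 
a toy slot-based queueing model (every number in it is invented) comparing a 
single shared FIFO against a strict-priority voice queue during a bulk burst:

    from collections import deque

    SLOTS       = 2000    # one slot = time to serialize one packet on egress
    BURST_START = 100
    BURST_LEN   = 300     # bulk arrives at 2 packets/slot for this many slots
    VOICE_EVERY = 20      # one voice packet every 20 slots
    BUF_PKTS    = 200     # shared FIFO depth (made-up figure)
    VOICE_BUF   = 10      # reserved voice queue depth in the priority case

    def run(priority):
        voice_q, bulk_q, fifo = deque(), deque(), deque()
        voice_delays, voice_drops = [], 0
        for t in range(SLOTS):
            # arrivals for this slot
            arrivals = []
            if t % VOICE_EVERY == 0:
                arrivals.append(("voice", t))
            if BURST_START <= t < BURST_START + BURST_LEN:
                arrivals += [("bulk", t), ("bulk", t)]
            for kind, ts in arrivals:
                if priority:
                    q, limit = (voice_q, VOICE_BUF) if kind == "voice" else (bulk_q, BUF_PKTS)
                else:
                    q, limit = fifo, BUF_PKTS
                if len(q) < limit:
                    q.append((kind, ts))
                elif kind == "voice":
                    voice_drops += 1
            # the egress serializes exactly one packet per slot, voice first
            # when priority scheduling is on
            q = (voice_q or bulk_q) if priority else fifo
            if q:
                kind, ts = q.popleft()
                if kind == "voice":
                    voice_delays.append(t - ts)
        return max(voice_delays), voice_drops

    for prio in (False, True):
        worst, drops = run(prio)
        print(f"priority={prio}: worst voice delay {worst} slots, voice drops {drops}")

In the FIFO case the voice packets end up sitting behind a couple of hundred 
packets of bulk backlog and some of them are dropped outright; with a separate 
strict-priority queue they go straight out of the box. That, in about forty 
lines, is the whole argument for marking at ingress and queueing per class.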

My two cents.

Cheers
Ross
