Re: [c-nsp] WRR Confusion on 6748 blades
> Unfortunately, there are no 'absolute' per queue counters, only per queue
> drop counters. So no easy way to determine if other queues are being
> utilized unless you just 'know' (based on your classification policies and
> known application mix) or those queues overflow & drop.
>
> <...snip...>
>
> In my opinion, it indicates that:
> 1. there is traffic in the other queues contending for the link bandwidth
> 2. there is instantaneous oversubscription that causes the problem queue
>    to fill as it's not being serviced frequently enough and/or is
>    inadequately sized
> 3. the other queues are sized/weighted appropriately to handle the amount
>    of traffic that maps to them (ie, even under congestion scenarios,
>    there is adequate buffer to hold enough packets to avoid drops)
>
> <...snip...>
>
> 2 cents,
> Tim

I just ran across an older thread where someone was having the same problem.
In his case, he had a 1-gig source and a 1-gig receiver on the same switch
with no output drops. He moved the receiver to another switch that was
connected to the first switch via a 10-gig link. That resulted in output
drops toward the receiver, apparently because of the difference in
serialization delay on the second switch, i.e. it didn't take as long to
bring a packet in on the 10-gig link as it did to send it out on the 1-gig,
so the buffers were filling with bursty traffic at low apparent traffic
rates.

This is very interesting stuff. Just a little complicated. :)
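As a rough back-of-the-envelope sketch of why the 10-gig-in / 1-gig-out case
drops at low average rates: borrowing the ~583 KB queue-1 buffer and 70%
tail-drop threshold that Peter quotes further down the thread (so the numbers
are purely illustrative), the egress queue fills at the difference of the two
rates:

\[
t_{\text{fill}} \approx \frac{0.7 \times 583\,\text{KB} \times 8\ \text{bits/byte}}{10\,\text{Gb/s} - 1\,\text{Gb/s}}
\approx \frac{3.3\,\text{Mb}}{9\,\text{Gb/s}} \approx 0.36\,\text{ms}
\]

In other words, a burst lasting well under half a millisecond at the 10-gig
ingress rate is enough to overrun the 1-gig egress queue, and a 30-second
average will never show it.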
Re: [c-nsp] WRR Confusion on 6748 blades
At 10:53 AM 6/27/2012, Peter Rathlev pronounced:

> On Wed, 2012-06-27 at 10:46 -0700, Tim Stevenson wrote:
> > Unfortunately, there are no 'absolute' per queue counters, only per
> > queue drop counters.
>
> Any chance of that ever showing up on the Cat6500 platform? :-D

Not on 67xx cards; not sure if 69xx cards have capable hardware, have been
off the c6k platform for quite a while.

Tim

> Or as my lolcat would say: "i can haz absolute counters plz kthxby"
>
> --
> Peter
Re: [c-nsp] WRR Confusion on 6748 blades
On Wed, 2012-06-27 at 10:46 -0700, Tim Stevenson wrote:
> Unfortunately, there are no 'absolute' per queue counters, only per
> queue drop counters.

Any chance of that ever showing up on the Cat6500 platform? :-D

Or as my lolcat would say: "i can haz absolute counters plz kthxby"

--
Peter
Re: [c-nsp] WRR Confusion on 6748 blades
Hi John, please see inline below:

At 10:05 AM 6/27/2012, John Neiberger pronounced:

<...snip...>

>> What this should be doing is just causing us to service the queue more
>> frequently. That could certainly reduce/eliminate drops in the event of
>> congestion, but only if there is traffic in the other queues that is also
>> contending for the bandwidth.
>>
>> In other words, if there is only one active queue (ie only one queue has
>> traffic in it), then it can & should get full unrestricted access to the
>> entire link bandwidth. Can you confirm whether there's traffic in the
>> other queues?
>
> I'm not certain whether or not we have traffic in the other queues. In
> nearly all cases, the output drops are all in one queue with zero in the
> other queues. That seems to indicate that either all of our traffic is in
> one queue or there just isn't a lot of traffic in the other queues.

Unfortunately, there are no 'absolute' per queue counters, only per queue
drop counters. So no easy way to determine if other queues are being
utilized unless you just 'know' (based on your classification policies and
known application mix) or those queues overflow & drop.

<...snip...>

>> This suggests to me that there is traffic in other queues contending for
>> the available bandwidth, and that there's periodically instantaneous
>> congestion. Alternatively you could try sizing this queue bigger and
>> using the original bandwidth ratio. Or a combination of those two
>> (tweaking both bandwidth & queue-limit).
>>
>> Is there some issue with changing the bandwidth ratio on this queue (ie,
>> are you seeing collateral damage)? Else, seems like you've solved the
>> problem already ;)
>
> Nope, we don't have a problem with it. That's what we've been doing. We
> haven't really been adjusting the queue limit ratios, though. In most
> cases, we were just changing the bandwidth ratio weights. I'm looking at
> an interface right now where the 30-second weighted traffic rate has
> never gone above around 150 Mbps but I'm still seeing OQDs in one of the
> queues only. How do you think we should be interpreting that?

In my opinion, it indicates that:

1. there is traffic in the other queues contending for the link bandwidth
2. there is instantaneous oversubscription that causes the problem queue to
   fill because it's not being serviced frequently enough and/or is
   inadequately sized
3. the other queues are sized/weighted appropriately to handle the amount
   of traffic that maps to them (ie, even under congestion scenarios, there
   is adequate buffer to hold enough packets to avoid drops)

If #1 were not true, then I don't see how changing the bandwidth ratio would
make any difference at all - if there is no traffic in the other queues,
then the single remaining active queue would get full unrestricted access to
the full bandwidth of the link and no queuing would be necessary in the
first place.

Supposing there is no traffic in the other queues - in that case, you could
certainly still have oversubscription of the single queue and drops, but
changing the weight should have no effect on that scenario at all (while
changing the q-limit certainly could).

2 cents,
Tim

>> Hope that helps,
>> Tim
>
> It helps a lot! thanks!
>
> John
Re: [c-nsp] WRR Confusion on 6748 blades
> queue-limit and bandwidth values (ratios/weights) are *different* things.
>
> The queue-limit physically sizes the queue. It says how much of the total
> physical buffer on the port is set aside exclusively for each class (where
> class is based on DSCP or COS). Traffic from other classes can NEVER get
> access to the buffer set aside for another class, ie, there could be
> plenty of available buffer in other queues even as you're dropping traffic
> in one of the queues.
>
> The bandwidth ratios, on the other hand, determine how frequently each of
> those queues is serviced, ie, how often the scheduler will
> dequeue/transmit a frame from the queue. If there is nothing sitting in
> one queue, other queues can get access to that bandwidth, ie, "bandwidth"
> is not a hard limit, you can think of it as a minimum guarantee when there
> is congestion/contention.

That part I think I understand. Mostly. :) When I say bandwidth in this
context, I'm referring to the bandwidth ratio weight.

>> are fairly hard limits. That is in line with what we were experiencing
>> because we were seeing output queue drops when the interface was not
>> fully utilized. Increasing the queue bandwidth got rid of the output
>> queue drops.
>
> What this should be doing is just causing us to service the queue more
> frequently. That could certainly reduce/eliminate drops in the event of
> congestion, but only if there is traffic in the other queues that is also
> contending for the bandwidth.
>
> In other words, if there is only one active queue (ie only one queue has
> traffic in it), then it can & should get full unrestricted access to the
> entire link bandwidth. Can you confirm whether there's traffic in the
> other queues?

I'm not certain whether or not we have traffic in the other queues. In
nearly all cases, the output drops are all in one queue with zero in the
other queues. That seems to indicate that either all of our traffic is in
one queue or there just isn't a lot of traffic in the other queues.

>> For one particular application traversing this link, that resulted in a
>> file transfer rate increase from 2.5 MB/s to 25 MB/s. That's a really
>> huge difference and all we did was increase the allocated queue
>> bandwidth. At no point was that link overutilized.
>
> We frequently see 'microburst' situations where the avg rate measured
> over 30sec etc is well under rate, but at some instantaneous moment there
> is a burst that exceeds line rate and can cause drops if the queue is not
> deep enough. Having a low bandwidth ratio, with traffic present in other
> queues, is another form of the queue not being deep enough, ie, the queue
> may have a lot of space but if packets are not dequeued frequently enough
> that queue can still fill & drop.
>
>> In fact, during our testing of that particular application, the link
>> output never went above 350 Mbps. We used very large files so that the
>> transfer would take a while and we'd get a good feel for what was
>> happening. Doing nothing but increasing the queue bandwidth fixed the
>> problem there and has fixed the same sort of issue elsewhere.
>
> This suggests to me that there is traffic in other queues contending for
> the available bandwidth, and that there's periodically instantaneous
> congestion. Alternatively you could try sizing this queue bigger and
> using the original bandwidth ratio. Or a combination of those two
> (tweaking both bandwidth & queue-limit).
>
> Is there some issue with changing the bandwidth ratio on this queue (ie,
> are you seeing collateral damage)? Else, seems like you've solved the
> problem already ;)

Nope, we don't have a problem with it. That's what we've been doing. We
haven't really been adjusting the queue limit ratios, though. In most
cases, we were just changing the bandwidth ratio weights. I'm looking at an
interface right now where the 30-second weighted traffic rate has never
gone above around 150 Mbps but I'm still seeing OQDs in one of the queues
only. How do you think we should be interpreting that?

> Hope that helps,
> Tim

It helps a lot! thanks!

John
Re: [c-nsp] WRR Confusion on 6748 blades
At 09:20 AM 6/27/2012, Phil Mayers pronounced:

> note that queues don't have bandwidth, they have size and weight.

Yes, I've always disliked this term, "bandwidth" - I think "weight" would
have been better, but that's water under the bridge.

Tim
Re: [c-nsp] WRR Confusion on 6748 blades
Hi John, please see inline below:

At 08:58 AM 6/27/2012, John Neiberger pronounced:

> On Wed, Jun 27, 2012 at 8:24 AM, Janez Novak wrote:
>> 6748 can't do shaping. Would love to have them do that. So you must be
>> experiencing drops somewhere else and not from WRR BW settings or WRED
>> settings. They both kick in when congestion is happening (queues are
>> filling up). For example, the linecard is oversubscribed, etc.
>>
>> Look at the second bullet
>> (http://www.cisco.com/en/US/docs/routers/7600/ios/12.2SR/configuration/guide/qos.html#wp1728810).
>>
>> Kind regards,
>> Bostjan
>
> This is very confusing and I'm getting a lot of conflicting information.
> I've been told by three Cisco engineers that these queue bandwidth limits

queue-limit and bandwidth values (ratios/weights) are *different* things.

The queue-limit physically sizes the queue. It says how much of the total
physical buffer on the port is set aside exclusively for each class (where
class is based on DSCP or COS). Traffic from other classes can NEVER get
access to the buffer set aside for another class, ie, there could be plenty
of available buffer in other queues even as you're dropping traffic in one
of the queues.

The bandwidth ratios, on the other hand, determine how frequently each of
those queues is serviced, ie, how often the scheduler will dequeue/transmit
a frame from the queue. If there is nothing sitting in one queue, other
queues can get access to that bandwidth, ie, "bandwidth" is not a hard
limit, you can think of it as a minimum guarantee when there is
congestion/contention.

> are fairly hard limits. That is in line with what we were experiencing
> because we were seeing output queue drops when the interface was not
> fully utilized. Increasing the queue bandwidth got rid of the output
> queue drops.

What this should be doing is just causing us to service the queue more
frequently. That could certainly reduce/eliminate drops in the event of
congestion, but only if there is traffic in the other queues that is also
contending for the bandwidth.

In other words, if there is only one active queue (ie only one queue has
traffic in it), then it can & should get full unrestricted access to the
entire link bandwidth. Can you confirm whether there's traffic in the other
queues?

> For one particular application traversing this link, that resulted in a
> file transfer rate increase from 2.5 MB/s to 25 MB/s. That's a really
> huge difference and all we did was increase the allocated queue
> bandwidth. At no point was that link overutilized.

We frequently see 'microburst' situations where the avg rate measured over
30sec etc is well under rate, but at some instantaneous moment there is a
burst that exceeds line rate and can cause drops if the queue is not deep
enough. Having a low bandwidth ratio, with traffic present in other queues,
is another form of the queue not being deep enough, ie, the queue may have a
lot of space but if packets are not dequeued frequently enough that queue
can still fill & drop.

> In fact, during our testing of that particular application, the link
> output never went above 350 Mbps. We used very large files so that the
> transfer would take a while and we'd get a good feel for what was
> happening. Doing nothing but increasing the queue bandwidth fixed the
> problem there and has fixed the same sort of issue elsewhere.

This suggests to me that there is traffic in other queues contending for the
available bandwidth, and that there's periodically instantaneous congestion.
Alternatively you could try sizing this queue bigger and using the original
bandwidth ratio. Or a combination of those two (tweaking both bandwidth &
queue-limit).

Is there some issue with changing the bandwidth ratio on this queue (ie, are
you seeing collateral damage)? Else, seems like you've solved the problem
already ;)

Hope that helps,
Tim

> I'm still researching this and trying to get to the bottom of it. I think
> we're missing something important that would make this all make more
> sense. I appreciate everyone's help!
>
> John
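For reference, a minimal sketch of what "tweaking both bandwidth &
queue-limit" looks like on a 1p3q8t egress port, as Tim suggests above. The
interface name and ratios below are placeholders, not recommendations; check
the card's defaults and allowed ranges with "show queueing interface" first:

    interface GigabitEthernet1/1
     ! DWRR weights for queues 1-3: a minimum service share under
     ! contention, not a cap on what the queue can send
     wrr-queue bandwidth 20 30 50
     ! Exclusive share of the port's tx buffer for queues 1-3 (percent);
     ! one queue can never borrow another queue's buffer
     wrr-queue queue-limit 40 30 15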
Re: [c-nsp] WRR Confusion on 6748 blades
On Wed, Jun 27, 2012 at 10:20 AM, Saku Ytti wrote:
> On (2012-06-27 12:11 -0400), Chris Evans wrote:
>
>> If you don't need QoS features, disable it and you will have the full
>> interface buffer for any traffic. If you do need QoS perhaps remap your
>
> Agreed, no reason to run what you don't need. I view CoPP as a mandatory
> feature on any node with an IP address reachable from the Internet, and
> CoPP depends on 'mls qos'.
>
> If you do enable MLS QoS, you might want to map all traffic to fewer
> classes, maybe just 2 or even 1; this way you can allocate more buffers
> instead of dividing them evenly among the maximum number of classes the
> card supports.
>
> --
> ++ytti

We definitely need CoPP, so I think on the devices that don't need it, we
should definitely re-map the classes to one queue and then tune it
accordingly.

This is all fantastic information. I've never had to deal with queueing at
this level before, so much of this is new to me. I appreciate everyone's
help!

John
Re: [c-nsp] WRR Confusion on 6748 blades
On 27/06/12 16:58, John Neiberger wrote:
> I'm still researching this and trying to get to the bottom of it. I think
> we're missing something important that would make this all make more
> sense. I appreciate everyone's help!

Queueing on this platform is complex. Google "qos srnd" and read the
sections on the 6500 carefully, if you haven't already.

In particular, note that queues don't have bandwidth, they have size and
weight. The actual rate at which packets leave a queue is a weighted
function of the arrival rate at ALL queues. A queue can absorb a burst in
excess of the empty rate up to the queue size, with a drop threshold (if RED
is enabled) controlled by queue size & CoS.

If you are seeing a queue dropping packets, and the offered load into that
queue is less than the egress link speed, then some OTHER queue must have a
weight AND OFFERED LOAD that is causing the dropped queue to be
under-serviced. The 6748 does have DWRR, so you shouldn't be suffering from
starvation.

At this point, a "sh queueing int ..." on the egress port would help. Are
you running a "default" QoS config? Are you trusting CoS/DSCP or not?
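Spelling out the checks Phil is pointing at (the interface here is just an
example):

    ! Is QoS enabled globally, and what do the global maps look like?
    show mls qos
    ! Per-queue weights, queue-limits, CoS/DSCP maps, trust state, and
    ! per-queue drop counters on the egress port
    show queueing interface GigabitEthernet1/1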
Re: [c-nsp] WRR Confusion on 6748 blades
On (2012-06-27 12:11 -0400), Chris Evans wrote:

> If you don't need QoS features, disable it and you will have the full
> interface buffer for any traffic. If you do need QoS perhaps remap your

Agreed, no reason to run what you don't need. I view CoPP as a mandatory
feature on any node with an IP address reachable from the Internet, and CoPP
depends on 'mls qos'.

If you do enable MLS QoS, you might want to map all traffic to fewer
classes, maybe just 2 or even 1; this way you can allocate more buffers
instead of dividing them evenly among the maximum number of classes the card
supports.

--
++ytti
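A sketch of what that collapse-to-fewer-classes idea could look like on a
1p3q8t port, keeping CoS 5 in the strict-priority queue. Everything here -
the interface, the CoS list, the percentages - is hypothetical and needs to
be checked against the card's actual queue structure and defaults, since the
WRR queue-limits plus the priority queue's share have to fit within the
port's tx buffer:

    interface GigabitEthernet1/1
     ! Map every CoS except 5 to queue 1, threshold 1
     wrr-queue cos-map 1 1 0 1 2 3 4 6 7
     ! Leave CoS 5 in the strict-priority queue
     priority-queue cos-map 1 5
     ! Give queue 1 the bulk of the tx buffer, token shares for queues 2/3
     wrr-queue queue-limit 75 5 5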
Re: [c-nsp] WRR Confusion on 6748 blades
On 27/06/12 17:11, Chris Evans wrote:
> If you don't need QoS features, disable it and you will have the full
> interface buffer for any traffic. If you do need QoS, perhaps remap your
> queues to reduce the number of queues that will be in contention for
> bandwidth. In my experience QoS on the 6500 has always caused more issues
> than it's solved due to its limited interface queuing capabilities.

Note that CoPP on this platform requires QoS. I agree that remapping into
fewer (one?) queues may be the solution here.
Re: [c-nsp] WRR Confusion on 6748 blades
This is where I ask the question whether you need QoS and its queues or not?
At my old employer we never enabled QoS on our 6500s in the data centers
because of this buffer carving issue. When you disable QoS on the 6500
platform it lets the dscp/802.1p bits pass, which we were fine with. We
never wanted to do tagging for applications; it was always do it yourself or
it's not getting done. Anytime we enabled QoS we ran into issues such as you
are having.

If you don't need QoS features, disable it and you will have the full
interface buffer for any traffic. If you do need QoS, perhaps remap your
queues to reduce the number of queues that will be in contention for
bandwidth. In my experience QoS on the 6500 has always caused more issues
than it's solved due to its limited interface queuing capabilities.

On Wed, Jun 27, 2012 at 12:01 PM, John Neiberger wrote:
>
> <...snip...>
>
> Also, these 6748 linecards are 1p3q8t. According to that doc these use
> DWRR. Does the second bullet apply to DWRR, as well? I'm not quite sure
> of the differences.
>
> Thanks again,
> John
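If you do go the disable route Chris describes, the global knob is the same
one CoPP depends on, as noted elsewhere in the thread. A sketch only - check
the current state before touching it:

    ! See whether QoS is currently enabled globally
    show mls qos
    configure terminal
     ! Disabling QoS globally: ports fall back to a single queue with the
     ! full buffer, and DSCP/802.1p markings pass through untouched
     no mls qos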
Re: [c-nsp] WRR Confusion on 6748 blades
On Wed, Jun 27, 2012 at 9:58 AM, John Neiberger wrote:
> On Wed, Jun 27, 2012 at 8:24 AM, Janez Novak wrote:
>> 6748 can't do shaping. Would love to have them do that. So you must be
>> experiencing drops somewhere else and not from WRR BW settings or WRED
>> settings. They both kick in when congestion is happening (queues are
>> filling up). For example, the linecard is oversubscribed, etc.
>>
>> Look at the second bullet
>> (http://www.cisco.com/en/US/docs/routers/7600/ios/12.2SR/configuration/guide/qos.html#wp1728810).
>>
>> Kind regards,
>> Bostjan
>
> <...snip...>
>
> I'm still researching this and trying to get to the bottom of it. I think
> we're missing something important that would make this all make more
> sense. I appreciate everyone's help!
>
> John

Also, these 6748 linecards are 1p3q8t. According to that doc these use DWRR.
Does the second bullet apply to DWRR, as well? I'm not quite sure of the
differences.

Thanks again,
John
Re: [c-nsp] WRR Confusion on 6748 blades
On Wed, Jun 27, 2012 at 8:24 AM, Janez Novak wrote:
> 6748 can't do shaping. Would love to have them do that. So you must be
> experiencing drops somewhere else and not from WRR BW settings or WRED
> settings. They both kick in when congestion is happening (queues are
> filling up). For example, the linecard is oversubscribed, etc.
>
> Look at the second bullet
> (http://www.cisco.com/en/US/docs/routers/7600/ios/12.2SR/configuration/guide/qos.html#wp1728810).
>
> Kind regards,
> Bostjan

This is very confusing and I'm getting a lot of conflicting information.
I've been told by three Cisco engineers that these queue bandwidth limits
are fairly hard limits. That is in line with what we were experiencing
because we were seeing output queue drops when the interface was not fully
utilized. Increasing the queue bandwidth got rid of the output queue drops.
For one particular application traversing this link, that resulted in a file
transfer rate increase from 2.5 MB/s to 25 MB/s. That's a really huge
difference and all we did was increase the allocated queue bandwidth. At no
point was that link overutilized. In fact, during our testing of that
particular application, the link output never went above 350 Mbps. We used
very large files so that the transfer would take a while and we'd get a good
feel for what was happening. Doing nothing but increasing the queue
bandwidth fixed the problem there and has fixed the same sort of issue
elsewhere.

I'm still researching this and trying to get to the bottom of it. I think
we're missing something important that would make this all make more sense.
I appreciate everyone's help!

John
Re: [c-nsp] WRR Confusion on 6748 blades
6748 can't do shaping. Would love to have them do that. So you must be
experiencing drops somewhere else and not from WRR BW settings or WRED
settings. They both kick in when congestion is happening (queues are filling
up). For example, the linecard is oversubscribed, etc.

Look at the second bullet
(http://www.cisco.com/en/US/docs/routers/7600/ios/12.2SR/configuration/guide/qos.html#wp1728810).

Kind regards,
Bostjan

On Tue, Jun 26, 2012 at 10:28 PM, Chris Evans wrote:
> Tac is right. This is a downfall of ethernet switching qos. The buffers
> are carved up for the queues. My advice is to disable qos altogether or
> remap all traffic and buffers back to one queue.
>
> On Jun 26, 2012 4:22 PM, "John Neiberger" wrote:
>
> <...snip...>
Re: [c-nsp] WRR Confusion on 6748 blades
On Tue, 2012-06-26 at 14:16 -0600, John Neiberger wrote:
> I'm getting conflicting information about how WRR scheduling and
> queueing works on 6748 blades. These blades have three regular queues
> and one priority queue. We've been told by two Cisco TAC engineers
> that if one queue is full, packets will start being dropped even if
> you have plenty of link bandwidth available.

That is correct: if the queue is full, packets are dropped. The question is
then: why does the queue end up full if there's plenty of bandwidth
available?

> Our experience over the past few days dealing with related issues
> seems to bear this out. If a queue doesn't have enough bandwidth
> allotted to it, bad things happen even when the link has plenty of
> room left over.

Can you share the configuration from the interface in question together with
the output from "show interface GiX/Y" and "show queueing interface GiX/Y"?
And maybe "show flowcontrol interface GiX/Y" if you're using flowcontrol.

> However, someone else is telling me that traffic should be able to
> burst up to the link speed as long as the other queues are not full.

Correct. Keep in mind that queueing and bandwidth are two different things
working together. Packets are put in queues and queues are served in a
weighted round-robin fashion. If the amount of packets enqueued is larger
than what can be transmitted for this queue it starts to drop. As long as
there's available bandwidth all the WRR queues should be able to send what
they have.

> Our experience seems to support what we were told by Cisco, but we may
> just be looking at this the wrong way. It's possible that the queue
> only seems to be policed, but maybe most of the drops are from RED.
> I'm just not sure now.

RED (which is enabled by default) would introduce drops faster than without.
This might not be the best idea for non-core interfaces. If your traffic is
mostly BE (and thus hitting queue 1 threshold 1) you start RED-dropping at
40% and tail-dropping at 70% of the queue buffer space. And queue 1 has 50%
of the interface buffers, which should be 583 KB [0]. If my
back-of-the-envelope calculation is right that's ~3.3 ms of queueing for BE
traffic (q1t1).

[0]: http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/prod_white_paper09186a0080131086.html

--
Peter
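Spelling out Peter's arithmetic with his own numbers (583 KB for queue 1, RED
onset at 40%, tail drop at 70%, drained at 1 Gb/s):

\[
0.4 \times 583\,\text{KB} \times 8 \approx 1.9\,\text{Mb}
\;\Rightarrow\; 1.9\,\text{Mb} / 1\,\text{Gb/s} \approx 1.9\,\text{ms (RED onset)}
\]
\[
0.7 \times 583\,\text{KB} \times 8 \approx 3.3\,\text{Mb}
\;\Rightarrow\; 3.3\,\text{Mb} / 1\,\text{Gb/s} \approx 3.3\,\text{ms (tail drop)}
\]

which matches his ~3.3 ms figure for q1t1.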
Re: [c-nsp] WRR Confusion on 6748 blades
Tac is right. This is a downfall of ethernet switching qos. The buffers are
carved up for the queues. My advice is to disable qos altogether or remap
all traffic and buffers back to one queue.

On Jun 26, 2012 4:22 PM, "John Neiberger" wrote:
> I'm getting conflicting information about how WRR scheduling and
> queueing works on 6748 blades. These blades have three regular queues
> and one priority queue. We've been told by two Cisco TAC engineers
> that if one queue is full, packets will start being dropped even if
> you have plenty of link bandwidth available. Our experience over the
> past few days dealing with related issues seems to bear this out. If a
> queue doesn't have enough bandwidth allotted to it, bad things happen
> even when the link has plenty of room left over.
>
> However, someone else is telling me that traffic should be able to
> burst up to the link speed as long as the other queues are not full.
> Our experience seems to support what we were told by Cisco, but we may
> just be looking at this the wrong way. It's possible that the queue
> only seems to be policed, but maybe most of the drops are from RED.
> I'm just not sure now.
>
> Can anyone help clear this up?
>
> Thanks!
> John

_______________________________________________
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/