Re: Drops in qdisc on ifb interface
On Thu, 2015-05-28 at 12:33 -0400, jsulli...@opensourcedevel.com wrote:

> Our initial testing has been single flow, but the ultimate purpose is
> processing real-time video in a complex application which ingests
> associated metadata, posts to a consumer-facing cloud, and does
> reporting back - so lots of different traffic with very different
> demands - a perfect tc environment.

Wait, do you really plan on using TCP for real-time video?

--
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Drops in qdisc on ifb interface
On May 28, 2015 at 1:17 PM Eric Dumazet eric.duma...@gmail.com wrote:

> [snip]
> Wait, do you really plan on using TCP for real-time video?

The overall product does, but the video source feeds come over a different network via UDP. There are, however, RTMP quality control feeds coming across this connection. There may also occasionally be test UDP source feeds on this connection, but those are not production.

Thanks - John
Re: Drops in qdisc on ifb interface
On Thu, 2015-05-28 at 13:31 -0400, jsulli...@opensourcedevel.com wrote:

> The overall product does, but the video source feeds come over a
> different network via UDP. There are, however, RTMP quality control
> feeds coming across this connection. There may also occasionally be
> test UDP source feeds on this connection, but those are not production.

This is important to know, because UDP won't benefit from GRO.

I was assuming your receiver had to handle ~88000 packets per second, so I was doubting it could saturate one core, but maybe your target is very different.
Re: Drops in qdisc on ifb interface
On May 28, 2015 at 1:49 PM Eric Dumazet eric.duma...@gmail.com wrote:

> This is important to know, because UDP won't benefit from GRO.
> I was assuming your receiver had to handle ~88000 packets per second,
> so I was doubting it could saturate one core, but maybe your target is
> very different.

That PPS estimate seems accurate - the port speed and CIR on the shaped connection is 1 Gbps. I'm still mystified by why the GbE bottlenecks on IFB but the 10GbE does not.

Thanks - John
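As a sanity check on that estimate, the per-packet rate of a saturated 1 Gbps link can be computed directly. The frame size below is an assumption, not a measurement from this setup; smaller average packets push the rate up toward Eric's ~88000 figure:

```shell
# Approximate pps for a saturated 1 Gbps link with full-size frames.
# 1514 bytes = 1500-byte MTU + 14-byte Ethernet header (illustrative;
# preamble and inter-frame gap are ignored).
link_bps=1000000000
frame_bytes=1514
echo $(( link_bps / (frame_bytes * 8) ))   # ~82500 pps at full MTU
```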
Re: Drops in qdisc on ifb interface
On May 25, 2015 at 6:31 PM Eric Dumazet eric.duma...@gmail.com wrote:

> On Mon, 2015-05-25 at 16:05 -0400, John A. Sullivan III wrote:
> [snip]
> IFB is single threaded and a serious bottleneck. Don't use this on
> egress; this destroys multiqueue capability. And SFQ is pretty limited
> (127 packets).
>
> You might try to change your NIC to have a single queue for RX, so
> that you have a single cpu feeding your IFB queue.
> (ethtool -L eth0 rx 1)

This has been an interesting exercise - thank you for your help along the way, Eric. IFB did not seem to bottleneck in our initial testing, but there was really only one flow of traffic during the test, at around 1 Gbps. However, on a non-test system with many different flows, IFB does seem to be a serious bottleneck - I assume this is the consequence of being single-threaded. Single queue did not seem to help.

Am I correct to assume that IFB would be as much of a bottleneck on the ingress side as it would be on the egress side? If so, is there any way to do high-performance ingress traffic shaping on Linux - a multi-threaded version of IFB or a different approach?

Thanks - John
Re: Drops in qdisc on ifb interface
On Thu, 2015-05-28 at 10:38 -0400, jsulli...@opensourcedevel.com wrote:

> [snip]
> Am I correct to assume that IFB would be as much of a bottleneck on
> the ingress side as it would be on the egress side? If so, is there
> any way to do high-performance ingress traffic shaping on Linux - a
> multi-threaded version of IFB or a different approach? Thanks - John

IFB still has a long way to go before being efficient.

In the meantime, you could play with the following patch, and set /sys/class/net/eth0/gro_timeout to 2

This way, the GRO aggregation will work even at 1Gbps, and your IFB will get big GRO packets instead of single-MSS segments. Both IFB and the IP/TCP stack will have less work to do, and the receiver will send fewer ACK packets as well.
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index f287186192bb655ba2dc1a205fb251351d593e98..c37f6657c047d3eb9bd72b647572edd53b1881ac 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -151,7 +151,7 @@ static void igb_setup_dca(struct igb_adapter *);
 #endif /* CONFIG_IGB_DCA */
 static int igb_poll(struct napi_struct *, int);
 static bool igb_clean_tx_irq(struct igb_q_vector *);
-static bool igb_clean_rx_irq(struct igb_q_vector *, int);
+static unsigned int igb_clean_rx_irq(struct igb_q_vector *, int);
 static int igb_ioctl(struct net_device *, struct ifreq *, int cmd);
 static void igb_tx_timeout(struct net_device *);
 static void igb_reset_task(struct work_struct *);
@@ -6342,6 +6342,7 @@ static int igb_poll(struct napi_struct *napi, int budget)
 						     struct igb_q_vector,
 						     napi);
 	bool clean_complete = true;
+	unsigned int packets = 0;

 #ifdef CONFIG_IGB_DCA
 	if (q_vector->adapter->flags & IGB_FLAG_DCA_ENABLED)
@@ -6350,15 +6351,17 @@ static int igb_poll(struct napi_struct *napi, int budget)
 	if (q_vector->tx.ring)
 		clean_complete = igb_clean_tx_irq(q_vector);

-	if (q_vector->rx.ring)
-		clean_complete &= igb_clean_rx_irq(q_vector, budget);
+	if (q_vector->rx.ring) {
+		packets = igb_clean_rx_irq(q_vector, budget);
+		clean_complete &= packets < budget;
+	}

 	/* If all work not completed, return budget and keep polling */
 	if (!clean_complete)
 		return budget;

 	/* If not enough Rx work done, exit the polling mode */
-	napi_complete(napi);
+	napi_complete_done(napi, packets);
 	igb_ring_irq_enable(q_vector);

 	return 0;
@@ -6926,7 +6929,7 @@ static void igb_process_skb_fields(struct igb_ring *rx_ring,
 	skb->protocol = eth_type_trans(skb, rx_ring->netdev);
 }

-static bool igb_clean_rx_irq(struct igb_q_vector *q_vector, const int budget)
+static unsigned int igb_clean_rx_irq(struct igb_q_vector *q_vector, const int budget)
 {
 	struct igb_ring *rx_ring = q_vector->rx.ring;
 	struct sk_buff *skb = rx_ring->skb;
@@ -7000,7 +7003,7 @@ static bool igb_clean_rx_irq(struct igb_q_vector *q_vector, const int budget)
 	if (cleaned_count)
 		igb_alloc_rx_buffers(rx_ring, cleaned_count);

-	return total_packets < budget;
+	return total_packets;
 }

 static bool igb_alloc_mapped_page(struct igb_ring *rx_ring,
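For reference, applying the sysfs setting Eric describes would look like the sketch below. The interface name is a placeholder, the knob only exists with the patch above applied, and "2" is simply the value as quoted in the thread:

```shell
# Set the per-device GRO timeout added by the patch above.
# "2" is the value quoted in the thread; eth0 is a placeholder.
echo 2 > /sys/class/net/eth0/gro_timeout
cat /sys/class/net/eth0/gro_timeout   # verify the setting took effect
```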
Re: Drops in qdisc on ifb interface
On Thu, 2015-05-28 at 08:45 -0700, John Fastabend wrote:

> If you're experimenting, one thing you could do is create many ifb
> devices and load balance across them from tc. I'm not sure if this
> would be practical in your setup or not, but it might be worth trying.
>
> One thing I've been debating adding is the ability to match on the
> current cpu_id in tc, which would allow you to load balance by cpu. I
> could send you a patch if you wanted to test it. I would expect this
> to help somewhat with the 'single queue' issue, but sorry, I haven't
> had time yet to test it out myself.

It seems John uses a single 1Gbps flow, so only one cpu would receive NIC interrupts.

The only way he could get better results would be to schedule IFB work on another core. (Assuming one cpu is 100% busy servicing NIC + IFB, but I really doubt it...)
Re: Drops in qdisc on ifb interface
On May 28, 2015 at 11:45 AM John Fastabend john.fastab...@gmail.com wrote:

> [snip]
> If you're experimenting, one thing you could do is create many ifb
> devices and load balance across them from tc. I'm not sure if this
> would be practical in your setup or not, but it might be worth trying.
>
> One thing I've been debating adding is the ability to match on the
> current cpu_id in tc, which would allow you to load balance by cpu. I
> could send you a patch if you wanted to test it. I would expect this
> to help somewhat with the 'single queue' issue, but sorry, I haven't
> had time yet to test it out myself.

In the meantime, I've noticed something strange. When testing traffic between the two primary gateways, and thus identical traffic flows, I have the bottleneck on the one which uses two bonded GbE igb interfaces but not on the one which uses two bonded 10 GbE ixgbe interfaces. The ethtool -k settings are identical, e.g., gso, gro, lro. The ring buffer is larger on the ixgbe cards, but I would not think that would affect this. Identical kernels. The gateway hardware is identical and not working hard at all - no CPU or RAM pressure. Any idea why one bottlenecks and the other does not?

Returning to your idea, John, how would I load balance? I assume I would need to attach several filters to the physical interfaces, each redirecting traffic to a different IFB device. However, couldn't this work against the traffic shaping? Let's take an extreme example: all the time-sensitive ingress packets find their way onto ifb0 and all the bulk ingress packets find their way onto ifb1. As these packets are merged back to the physical interface, won't they simply be treated in pfifo_fast (or other physical interface qdisc) order?

Thanks - John
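For what it's worth, the filter fan-out John Fastabend is describing might be sketched as below. The device names, the two-way split, and the choice to hash on the low bit of the source address are all illustrative assumptions, and the sketch does not address the reordering concern raised above:

```shell
# Sketch: redirect ingress to two IFB devices based on source address.
modprobe ifb numifbs=2
ip link set ifb0 up
ip link set ifb1 up
tc qdisc add dev eth0 handle ffff: ingress
# Odd source IPs -> ifb1 (offset 12 = source address in the IP header).
tc filter add dev eth0 parent ffff: protocol ip prio 1 u32 \
    match u32 0x00000001 0x00000001 at 12 \
    action mirred egress redirect dev ifb1
# Everything else -> ifb0.
tc filter add dev eth0 parent ffff: protocol ip prio 2 u32 \
    match u32 0 0 at 0 \
    action mirred egress redirect dev ifb0
```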
Re: Drops in qdisc on ifb interface
On May 28, 2015 at 12:26 PM Eric Dumazet eric.duma...@gmail.com wrote:

> [snip]
> It seems John uses a single 1Gbps flow, so only one cpu would receive
> NIC interrupts. The only way he could get better results would be to
> schedule IFB work on another core. (Assuming one cpu is 100% busy
> servicing NIC + IFB, but I really doubt it...)

Our initial testing has been single flow, but the ultimate purpose is processing real-time video in a complex application which ingests associated metadata, posts to a consumer-facing cloud, and does reporting back - so lots of different traffic with very different demands - a perfect tc environment. CPU utilization is remarkably light. Every once in a while, we see a single CPU about 50% utilized with si.

Thanks, all - John
Re: Drops in qdisc on ifb interface
On 05/28/2015 08:30 AM, jsulli...@opensourcedevel.com wrote:

> [snip]
> Interesting, but this is destined to become a critical production
> system for a high-profile, internationally recognized product, so I am
> hesitant to patch. I doubt I can convince my company to do it, but is
> improving IFB the sort of development effort that could be sponsored
> and then executed in a moderately short period of time? Thanks - John

If you're experimenting, one thing you could do is create many ifb devices and load balance across them from tc. I'm not sure if this would be practical in your setup or not, but it might be worth trying.

One thing I've been debating adding is the ability to match on the current cpu_id in tc, which would allow you to load balance by cpu. I could send you a patch if you wanted to test it. I would expect this to help somewhat with the 'single queue' issue, but sorry, I haven't had time yet to test it out myself.

.John

--
John Fastabend
Intel Corporation
Re: Drops in qdisc on ifb interface
On Thu, 2015-05-28 at 11:30 -0400, jsulli...@opensourcedevel.com wrote:

> Interesting, but this is destined to become a critical production
> system for a high-profile, internationally recognized product, so I am
> hesitant to patch. I doubt I can convince my company to do it, but is
> improving IFB the sort of development effort that could be sponsored
> and then executed in a moderately short period of time? Thanks - John

I intend to submit this patch very officially. Note that some Google servers use the same feature with good success on other NICs. This allowed us to remove interrupt coalescing, lowering RPC latencies, but keeping good throughput and cpu efficiency for bulk flows.

http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=1a2881728211f0915c0fa1364770b9c73a67a073

IFB itself might need quite a lot of effort - I don't know. You can certainly ask John Fastabend whether he has plans for it in the short term.
Re: Drops in qdisc on ifb interface
On May 28, 2015 at 11:14 AM Eric Dumazet eric.duma...@gmail.com wrote:

> [snip]
> IFB still has a long way to go before being efficient. In the
> meantime, you could play with the following patch, and set
> /sys/class/net/eth0/gro_timeout to 2. This way, the GRO aggregation
> will work even at 1Gbps, and your IFB will get big GRO packets instead
> of single-MSS segments. Both IFB and the IP/TCP stack will have less
> work to do, and the receiver will send fewer ACK packets as well.
> [patch snipped]

Interesting, but this is destined to become a critical production system for a high-profile, internationally recognized product, so I am hesitant to patch. I doubt I can convince my company to do it, but is improving IFB the sort of development effort that could be sponsored and then executed in a moderately short period of time?

Thanks - John
Drops in qdisc on ifb interface
Hello, all. On one of our connections we are doing intensive traffic shaping with tc. We are using ifb interfaces for shaping ingress traffic, and we also use ifb interfaces for egress so that we can apply the same set of rules to multiple interfaces (e.g., tun and eth interfaces operating on the same physical interface).

These are running on very powerful gateways; I have watched them handling 16 Gbps with CPU utilization at a handful of percent. Yet I am seeing drops on the ifb interfaces when I do a tc -s qdisc show. Why would this be? I would expect that if there was some kind of problem, it would manifest as drops on the physical interfaces and not the IFB interface. We have played with queue lengths in both directions. We are using HFSC with SFQ leaves, so I would imagine this overrides the very short qlen on the IFB interfaces (32). These are drops and not overlimits.

Ingress:

root@gwhq-2:~# tc -s qdisc show dev ifb0
qdisc hfsc 11: root refcnt 2 default 50
 Sent 198152831324 bytes 333838154 pkt (dropped 101509, overlimits 9850280 requeues 43871)
 backlog 0b 0p requeues 43871
qdisc sfq 1102: parent 11:10 limit 127p quantum 1514b divisor 4096
 Sent 208463490 bytes 1367761 pkt (dropped 234, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc sfq 1202: parent 11:20 limit 127p quantum 1514b divisor 4096
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc sfq 1302: parent 11:30 limit 127p quantum 1514b divisor 4096
 Sent 13498600307 bytes 203705301 pkt (dropped 23358, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc sfq 1402: parent 11:40 limit 127p quantum 1514b divisor 4096
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc sfq 1502: parent 11:50 limit 127p quantum 1514b divisor 4096
 Sent 184445767527 bytes 128765092 pkt (dropped 77990, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

root@gwhq-2:~# tc -s class show dev ifb0
class hfsc 11: root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 0 level 2
class hfsc 11:1 parent 11: ls m1 0bit d 0us m2 1000Mbit ul m1 0bit d 0us m2 1000Mbit
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 210766381 work 198152837828 bytes level 1
class hfsc 11:10 parent 11:1 leaf 1102: rt m1 0bit d 0us m2 1000Mbit
 Sent 208463490 bytes 1367761 pkt (dropped 234, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 0 work 208463490 bytes rtwork 208463490 bytes level 0
class hfsc 11:20 parent 11:1 leaf 1202: rt m1 186182Kbit d 2.2ms m2 10Kbit ls m1 0bit d 0us m2 10Kbit
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 0 level 0
class hfsc 11:30 parent 11:1 leaf 1302: rt m1 0bit d 0us m2 10Kbit ls m1 0bit d 0us m2 30Kbit
 Sent 13498600307 bytes 203705301 pkt (dropped 23358, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 200073586 work 13498600307 bytes rtwork 10035553945 bytes level 0
class hfsc 11:40 parent 11:1 leaf 1402: rt m1 0bit d 0us m2 20Kbit ls m1 0bit d 0us m2 50Kbit
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 0 level 0
class hfsc 11:50 parent 11:1 leaf 1502: rt m1 0bit d 0us m2 20Kbit ls m1 0bit d 0us m2 10Kbit
 Sent 184446394921 bytes 128765668 pkt (dropped 77917, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 11254219 work 184445774031 bytes rtwork 39040535823 bytes level 0

Egress:

root@gwhq-2:~# tc -s qdisc show dev ifb1
qdisc hfsc 1: root refcnt 2 default 40
 Sent 783335740812 bytes 551888729 pkt (dropped 9622, overlimits 8546933 requeues 7180)
 backlog 0b 0p requeues 7180
qdisc sfq 1101: parent 1:10 limit 127p quantum 1514b divisor 4096
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc sfq 1201: parent 1:20 limit 127p quantum 1514b divisor 4096
 Sent 345678 bytes 2800 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc sfq 1301: parent 1:30 limit 127p quantum 1514b divisor 4096
 Sent 573479513 bytes 8689797 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc sfq 1401: parent 1:40 limit 127p quantum 1514b divisor 4096
 Sent 782761915621 bytes 543196132 pkt (dropped 9692, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

root@gwhq-2:~# tc -s class show dev ifb1
class hfsc 1: root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 0 level 2
class hfsc 1:10 parent 1:1 leaf 1101: rt m1 186182Kbit d 2.2ms m2 10Kbit ls m1 0bit d 0us m2 10Kbit
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 0 level 0
class hfsc 1:1 parent 1: ls m1 0bit d 0us m2 1000Mbit ul m1 0bit d 0us m2 1000Mbit
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 27259167 work 783335741126 bytes level 1
class hfsc 1:20 parent 1:1 leaf 1201:
Re: Drops in qdisc on ifb interface
On Mon, 2015-05-25 at 16:05 -0400, John A. Sullivan III wrote:

> Hello, all. On one of our connections we are doing intensive traffic
> shaping with tc. We are using ifb interfaces for shaping ingress
> traffic, and we also use ifb interfaces for egress so that we can
> apply the same set of rules to multiple interfaces (e.g., tun and eth
> interfaces operating on the same physical interface). These are
> running on very powerful gateways; I have watched them handling 16
> Gbps with CPU utilization at a handful of percent. Yet I am seeing
> drops on the ifb interfaces when I do a tc -s qdisc show. Why would
> this be? I would expect that if there was some kind of problem, it
> would manifest as drops on the physical interfaces and not the IFB
> interface. We have played with queue lengths in both directions. We
> are using HFSC with SFQ leaves, so I would imagine this overrides the
> very short qlen on the IFB interfaces (32). These are drops and not
> overlimits.

IFB is single threaded and a serious bottleneck. Don't use this on egress; this destroys multiqueue capability. And SFQ is pretty limited (127 packets).

You might try to change your NIC to have a single queue for RX, so that you have a single cpu feeding your IFB queue. (ethtool -L eth0 rx 1)
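For anyone reproducing Eric's suggestion, the combined sequence might look like the sketch below. The interface names are placeholders, ethtool -L only works on NICs whose driver supports channel reconfiguration, and the redirect filter is just the usual ingress-via-IFB pattern for context:

```shell
# Collapse the NIC to a single RX queue so one cpu feeds the IFB.
ethtool -L eth0 rx 1
ethtool -l eth0   # confirm the active channel count

# Typical ingress redirect to an IFB device, for context.
ip link set ifb0 up
tc qdisc add dev eth0 handle ffff: ingress
tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0 \
    action mirred egress redirect dev ifb0
```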
Re: Drops in qdisc on ifb interface
On Mon, 2015-05-25 at 15:31 -0700, Eric Dumazet wrote:

> [snip]
> IFB is single threaded and a serious bottleneck. Don't use this on
> egress; this destroys multiqueue capability. And SFQ is pretty limited
> (127 packets). You might try to change your NIC to have a single queue
> for RX, so that you have a single cpu feeding your IFB queue.
> (ethtool -L eth0 rx 1)

Hmm . . . I've been thinking about that SFQ leaf qdisc. I see that newer kernels allow a much higher limit than 127, but it still seems that the queue depth limit for any one flow is still 127. When we do something like GRE/IPSec, I think the decrypted GRE traffic will distribute across the queues, but the IPSec traffic will initially collapse all the packets into one queue. At 80ms RTT and 1 Gbps wire speed, I would need a queue of around 7500. Thus, can one say that SFQ is almost useless for high-BDP connections? Is there a similar round-robin type qdisc that does not have this limitation?

If I recall correctly, if one does not attach a qdisc explicitly to a class, it defaults to pfifo_fast. Is that correct?

Thanks - John
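John's ~7500 figure is the bandwidth-delay product of the path expressed in packets. The arithmetic, with an assumed full-size frame, comes out slightly lower; a smaller average packet size raises it:

```shell
# Bandwidth-delay product for 1 Gbps at 80 ms RTT, in packets.
link_bps=1000000000
rtt_ms=80
pkt_bytes=1514   # full-size Ethernet frame; an assumption
bdp_bytes=$(( link_bps / 8 * rtt_ms / 1000 ))
echo "$bdp_bytes bytes"                       # 10000000 bytes
echo "$(( bdp_bytes / pkt_bytes )) packets"   # ~6600 full-size packets
```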
Re: Drops in qdisc on ifb interface
On Mon, 2015-05-25 at 22:52 -0400, John A. Sullivan III wrote:

> Hmm . . . I've been thinking about that SFQ leaf qdisc. I see that
> newer kernels allow a much higher limit than 127, but it still seems
> that the queue depth limit for any one flow is still 127. [snip] At
> 80ms RTT and 1 Gbps wire speed, I would need a queue of around 7500.
> Thus, can one say that SFQ is almost useless for high-BDP connections?

I am a bit surprised, as your 'nstat' output showed no packet retransmits. So no packets were lost in your sfq.

> Is there a similar round-robin type qdisc that does not have this
> limitation?

fq_codel limit 1

> If I recall correctly, if one does not attach a qdisc explicitly to a
> class, it defaults to pfifo_fast. Is that correct? Thanks - John

That would be pfifo. pfifo_fast is the default root qdisc (/proc/sys/net/core/default_qdisc)
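Swapping one of the SFQ leaves from the earlier ingress setup for fq_codel might look like this. The handle numbers mirror the configuration quoted earlier in the thread, and the limit value shown is an illustrative assumption, sized for a high-BDP path, not a value taken from the thread:

```shell
# Swap the 11:50 SFQ leaf on ifb0 for fq_codel with a deeper limit.
# The limit of 10000 packets is illustrative.
tc qdisc del dev ifb0 parent 11:50
tc qdisc add dev ifb0 parent 11:50 handle 1502: fq_codel limit 10000
tc -s qdisc show dev ifb0
```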