Balancing the interrupts didn't make the situation better: http://i.imgur.com/IH0uSwr.png :/
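For context, "balancing" here means spreading the eth2-TxRx-* queue IRQs
across cores via /proc/irq/*/smp_affinity, as suggested further down in the
thread. Below is a minimal sketch of that procedure (an editor's
illustration, not the exact commands used here); it assumes the queue names
from the screenshots and fewer than 32 CPUs:

    # Run as root. Pin each eth2-TxRx-* queue IRQ to its own core,
    # wrapping around when we run out of cores.
    cpu=0
    ncpus=$(nproc)
    for irq in $(awk -F: '/eth2-TxRx/ {print $1}' /proc/interrupts); do
      printf '%x\n' $((1 << cpu)) > /proc/irq/$irq/smp_affinity
      cpu=$(( (cpu + 1) % ncpus ))
    done
    # Verify: the eth2-TxRx-* counters should now grow on different CPUs.
    grep eth2 /proc/interrupts
    # Note: a running irqbalance daemon may later overwrite these values.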
2013/3/22 Yongming Zhao <[email protected]>

> Yeah, you should balance all the eth2-TxRx-* :D
>
> On 2013-3-22, at 6:17 PM, Philip <[email protected]> wrote:
>
> I have a hard time understanding the output of /proc/interrupts, since
> there already seem to be multiple interrupts: "eth2-TxRx-0",
> "eth2-TxRx-1", ... but they seem to be balanced pretty poorly. Should I
> change smp_affinity for all of these interrupts, or only for the one that
> has the name "eth2"?
>
> You can see the output of /proc/interrupts here ->
> http://i.imgur.com/ZLulmkQ.png
>
> Best Regards
> Philip
>
> 2013/3/22 Yongming Zhao <[email protected]>
>
>> Well, it is easy to identify the IRQ issue here:
>>
>> 1. In "top", press "1" to display all CPU details, and press "H" to
>> display the Traffic Server threads; by default the list is sorted by
>> CPU usage, descending. You may see one CPU at full load without any
>> single TS thread accounting for it.
>>
>> 2. "cat /proc/interrupts" and grep out your 10GE NIC, then check the
>> IRQs. You want the IRQs on different CPUs for better performance. You
>> may find that all the IRQs for the NIC are on one CPU, which is the CPU
>> at full load, typically CPU0.
>>
>> Just set the smp_affinity for each IRQ. Here is an unproven script
>> (replace eth1 with your NIC name):
>>
>> j=0
>> for i in $(grep eth1 /proc/interrupts | awk -F: '{print $1}'); do
>>   # wrap back to CPU 0 once past the highest CPU number
>>   test $j -gt $(grep processor /proc/cpuinfo | tail -n 1 | awk '{print $NF}') && let j=0
>>   # build the hex affinity mask for CPU $j (comma-separated 32-bit groups)
>>   mask=$(python -c 'a=1<<'$(echo $j%32 | bc)'; print "%X"%a')
>>   k=$(echo $j/32 | bc)
>>   while [ $k -gt 0 ]; do mask="$mask,00000000"; let k=k-1; done
>>   echo $mask > /proc/irq/$i/smp_affinity
>>   let j=j+1
>> done
>>
>> FYI
>>
>> On 2013-3-22, at 6:23 AM, Igor Galić <[email protected]> wrote:
>>
>> This may be useful:
>>
>> http://kerneltrap.org/mailarchive/linux-netdev/2010/4/15/6274814/thread
>>
>> ------------------------------
>>
>> Hi Yongming,
>>
>> I haven't changed the networking configuration, but I've also noticed
>> that once the first core is at 100% utilization the server no longer
>> answers all ping requests and shows packet loss. This might be a sign
>> that all network traffic is handled by the first core, isn't it?
>>
>> You can find a screenshot of the threading output of top here:
>> http://i.imgur.com/X3te2Ru.png
>>
>> Best Regards
>> Philip
>>
>> 2013/3/21 Yongming Zhao <[email protected]>
>>
>>> Well, given the high network traffic, have you balanced the 10GE NIC
>>> IRQs across multiple CPUs?
>>>
>>> And can you show us the per-thread CPU usage in top?
>>>
>>> Thanks
>>>
>>> On 2013-3-21, at 7:42 PM, Philip <[email protected]> wrote:
>>>
>>> I've just upgraded to ATS 3.3.1-dev. The problem is still the same:
>>> http://i.imgur.com/1pHWQy7.png
>>>
>>> The load goes onto one core. (The server is only running ATS.)
>>>
>>> 2013/3/21 Philip <[email protected]>
>>>
>>>> Hi Igor,
>>>>
>>>> I am using ATS 3.2.4, Debian 6 (Squeeze) and a 3.2.13 kernel.
>>>>
>>>> I was using the "traffic_line -r" command to see the number of origin
>>>> connections growing, and htop/atop to see that only one core is 100%
>>>> utilized. I've already tested the following changes to the
>>>> configuration:
>>>>
>>>> proxy.config.accept_threads -> 0
>>>> proxy.config.exec_thread.autoconfig -> 0
>>>> proxy.config.exec_thread.limit -> 120
>>>>
>>>> They had no effect; there is still one core that reaches 100%
>>>> utilization and turns out to be the bottleneck.
>>>>
>>>> Best Regards
>>>> Philip
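For reference, the three settings Philip lists above correspond to these
records.config lines (a sketch assuming standard ATS 3.x records.config
syntax; the values are the ones quoted above, and thread settings generally
only take effect after a Traffic Server restart):

    CONFIG proxy.config.accept_threads INT 0
    CONFIG proxy.config.exec_thread.autoconfig INT 0
    CONFIG proxy.config.exec_thread.limit INT 120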
>>>> 2013/3/21 Igor Galić <[email protected]>
>>>>
>>>>> Hi Philip,
>>>>>
>>>>> Let's start with some simple data mining:
>>>>>
>>>>> Which version of ATS are you running?
>>>>> What OS/distro/version are you running it on?
>>>>>
>>>>> Are you looking at stats_over_http's output to determine what's
>>>>> going on in ATS?
>>>>>
>>>>> -- i
>>>>>
>>>>> ------------------------------
>>>>>
>>>>> I have noticed the following strange behavior: once the number of
>>>>> origin connections starts to increase and the proxying speed
>>>>> collapses, the first core is at 100% utilization while the others
>>>>> are not even close to that. It seems like the origin requests are
>>>>> handled by the first core only. Is this expected behavior that can
>>>>> be changed by editing the configuration, or is this a bug?
>>>>>
>>>>> 2013/3/20 Philip <[email protected]>
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am running ATS on a pretty large server with two physical 6-core
>>>>>> Xeon CPUs and 22 raw-device disks. I want to use that server as a
>>>>>> frontend for several file servers. It is currently configured to
>>>>>> sit in front of two file servers. The load on the ATS server is
>>>>>> pretty low: about 1-4% disk utilization and 500 Mbps of outgoing
>>>>>> traffic.
>>>>>>
>>>>>> Once I direct the traffic of the third file server towards ATS,
>>>>>> something strange happens:
>>>>>>
>>>>>> - The number of origin connections increases continually.
>>>>>> - Requests that hit ATS and are not cached are served really
>>>>>> slowly to the client (about 35 kB/s), while requests that are
>>>>>> served from the cache are blazingly fast.
>>>>>>
>>>>>> The ATS server has a dedicated 10Gbps port that is not maxed out,
>>>>>> no CPU core is maxed out, there is no swapping, there are no error
>>>>>> logs, and the origin servers are not heavily utilized. It feels
>>>>>> like there are not enough workers to process the origin requests.
>>>>>>
>>>>>> Is there anything I can do to check whether my theory is right,
>>>>>> and a way to increase the number of origin workers?
>>>>>>
>>>>>> Best Regards
>>>>>> Philip
>>>>>
>>>>> --
>>>>> Igor Galić
>>>>>
>>>>> Tel: +43 (0) 664 886 22 883
>>>>> Mail: [email protected]
>>>>> URL: http://brainsware.org/
>>>>> GPG: 6880 4155 74BD FD7C B515 2EA5 4B1D 9E08 A097 C9AE
>>
>> --
>> Igor Galić
>>
>> Tel: +43 (0) 664 886 22 883
>> Mail: [email protected]
>> URL: http://brainsware.org/
>> GPG: 6880 4155 74BD FD7C B515 2EA5 4B1D 9E08 A097 C9AE
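Philip mentions using "traffic_line -r" to watch the origin connection
count grow. A minimal sketch of that kind of check while reproducing the
problem is below; the metric names are the editor's assumption and may
differ between ATS versions:

    # Poll origin- and client-side connection counts every 2 seconds.
    watch -n 2 '
      printf "origin connections: ";
      traffic_line -r proxy.process.http.current_server_connections;
      printf "client connections: ";
      traffic_line -r proxy.process.http.current_client_connections
    '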
