Hi Yongming,
Thanks for the list. We are not using the cache part; it is disabled on our end, and our goal is to test req/s rather than the bandwidth it will use. Most of the items on your list we have already tested. I will check the IRQ balance and NIC driver settings, as the rest has already been covered.

Thanks,
Di Li
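
(For the IRQ balance and NIC driver checks, a minimal sketch of the usual places to look, assuming the interface is named eth0 on a systemd-based distro; adjust the names to match the box:)

  # which CPUs are servicing the NIC interrupts
  grep eth0 /proc/interrupts
  # affinity of one of those IRQs; replace 45 with an IRQ number from the line above
  cat /proc/irq/45/smp_affinity_list
  # is irqbalance spreading them, or are they pinned by hand?
  systemctl status irqbalance
  # how many RX/TX queues the driver exposes vs. how many are in use
  ethtool -l eth0
  # per-queue driver statistics, useful for spotting drops under load
  ethtool -S eth0 | grep -i drop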
> On Dec 14, 2016, at 10:06 AM, Yongming Zhao <[email protected]> wrote:
>
> The cache performance concerns may be:
> 1. hit or miss
> 2. hit on disk or in RAM
> 3. how many concurrent connections
> 4. how many unique objects
> 5. how many requests per connection (keep-alive)
> 6. is your binary a debug build?
> 7. are your debug tags set?
> 8. is there a disk I/O issue?
> ...
>
> But you are at 100% CPU, which means you have reached the ATS limit. You may:
> 1. lower the query/response size
> 2. use more keep-alive
> 3. try to tweak the connections
> 4. take a look at the cache & hits
> 5. try to make queries collide less in memory or on disk (in case of a complex hit/miss/update situation)
> 6. try to evaluate how many NET threads work best
> 7. try to figure out which thread makes the trouble
> 8. try to figure out how to lower the SI (softirq) CPU usage
> ...
>
> Take a look at jtest & http_load in the tree; a good test tool is always better than anything else.
>
> 50000 qps is definitely not a good mark :D
>
> - Yongming Zhao 赵永明
>
>> On Dec 14, 2016, at 6:14 AM, Di Li <[email protected]> wrote:
>>
>> Hi Steve,
>>
>> Thanks for that info. I'm trying to use small packets: each request is just 200 bytes, and the response is 900 bytes.
>>
>> I don't worry too much about the bandwidth; requests per second is my primary testing target. I use only one server as the client, one as the proxy, and one as the destination, all dedicated boxes.
>>
>> So far I can't pass 50000 req/s, and I can see our proxy box burning 100% CPU across 24 cores. I'm still investigating whether anything is wrong in the config, the sysctl settings, or the IRQ balance.
>>
>> Thanks,
>> Di Li
>>
>>> On Dec 13, 2016, at 2:01 PM, Lerner, Steve <[email protected]> wrote:
>>>
>>> Are you using multiple VMs to post via a single VM with ATS to a tuned HTTPD?
>>>
>>> I got upwards of 50K connections and 9 Gbps. I posted giant binary files as well... my concern isn't connections per second, it's bandwidth throttling, which doesn't seem to happen with these massive posts.
>>>
>>> In today's age of VMs, if we need more hits/sec we'd just spin up more VMs with ATS.
>>>
>>> Steve Lerner | Director / Architect - Performance Engineering | m 212.495.9212 | [email protected]
>>>
>>> On Tuesday, December 13, 2016, at 4:44 PM, Di Li <[email protected]> wrote:
>>>
>>> Hi Steve,
>>>
>>> We had those settings in place before I made the post, and so far I am stuck at the same numbers.
>>>
>>> Btw, there is a good article about tw_recycle and tw_reuse that you may want to check out; sometimes tw_recycle is evil:
>>>
>>> https://vincent.bernat.im/en/blog/2014-tcp-time-wait-state-linux.html
>>>
>>> Thanks,
>>> Di Li
>>>
>>> On Dec 13, 2016, at 11:08 AM, Lerner, Steve <[email protected]> wrote:
>>>
>>> We use Ubuntu Server, and in the end the only tuning was in /etc/sysctl.conf:
>>>
>>>   net.ipv4.tcp_tw_recycle = 1
>>>   net.core.somaxconn = 65535
>>>   net.ipv4.tcp_fin_timeout = 15
>>>   net.ipv4.tcp_keepalive_time = 300
>>>   net.ipv4.tcp_keepalive_probes = 5
>>>   net.ipv4.tcp_keepalive_intvl = 15
>>>
>>> But of that batch I think only somaxconn made the difference. Try with just that tuning and then add the rest.
>>>
>>> The test was simply:
>>>
>>> ab -p post.txt -l -r -n 1000000 -c 20000 -k -H "Host: [apache httpd server IP]" http://[apache traffic server forward proxy IP]:8080/index.html
>>>
>>> where post.txt is the file to post.
>>>
>>> You can study the Apache Bench man page to understand the fields used and vary them to see the results. I'd use multiple client VMs running posts via Apache Bench against the single proxy server and easily hit 9 Gbps and above.
>>>
>>> To watch the performance we used the commands ss -s and top. Run these on all the machines involved to keep an eye on everything.
>>>
>>> This was all run manually and quickly.
>>>
>>> -Steve
>>>
>>> On Tuesday, December 13, 2016, at 1:20 PM, Di Li <[email protected]> wrote:
>>>
>>> Hey Steve,
>>>
>>> Can you share some details on config, performance tuning, or results?
>>>
>>> Thanks,
>>> Di Li
>>>
>>> On Dec 13, 2016, at 9:46 AM, Lerner, Steve <[email protected]> wrote:
>>>
>>> I've benchmarked ATS forward proxy POSTs with cache disabled to near 10 Gbps on an OpenStack VM with a 10 Gbps NIC.
>>> I used Apache Bench for this.
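
(As a side note on the sysctl list Steve gives above: those settings can be loaded at runtime without a reboot. A minimal sketch, assuming they live in a dedicated drop-in file; the filename is illustrative, and tcp_tw_recycle is left out given the caveats in the article linked earlier:)

  # /etc/sysctl.d/99-ats-benchmark.conf  (illustrative name)
  net.core.somaxconn = 65535
  net.ipv4.tcp_fin_timeout = 15
  net.ipv4.tcp_keepalive_time = 300
  net.ipv4.tcp_keepalive_probes = 5
  net.ipv4.tcp_keepalive_intvl = 15

  # load just that file, or re-read everything under /etc/sysctl.d/
  sysctl -p /etc/sysctl.d/99-ats-benchmark.conf
  sysctl --system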
>>> On Tuesday, December 13, 2016, at 12:32 PM, Di Li <[email protected]> wrote:
>>>
>>> Using 6.2.0, and it is repeatable.
>>>
>>> Thanks,
>>> Di Li
>>>
>>> On Dec 13, 2016, at 1:28 AM, Reindl Harald <[email protected]> wrote:
>>>
>>> On 13.12.2016 at 09:45, Di Li wrote:
>>>
>>> When I do some benchmarking of the outbound proxy with http cache enabled, first of all the performance is pretty low, so I guess I didn't set it up right with the cache enabled. Second, when I use wrk with 512 connections and 40 threads going through the proxy over HTTP, it causes a core dump; here is the trace.
>>>
>>> When I disable the http cache, the performance goes up a lot and there are no more core dumps at all.
>>>
>>> FATAL: CacheRead.cc:249: failed assert `w->alternate.valid()`
>>> traffic_server: using root directory '/ngs/app/oproxy/trafficserver'
>>> traffic_server: Aborted (Signal sent by tkill() 20136 1001)
>>> traffic_server - STACK TRACE
>>>
>>> Is this repeatable?
>>> Which version of ATS?
>>>
>>> At the very least, mentioning the software version should be common sense.
>>>
>>> I had one such crash after upgrading to 7.0.0 and was not able to reproduce it, not even with an "ab -k -n 10000000 -c 500" benchmark.
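
(For anyone reproducing the cache-disabled, req/s-focused setup discussed above, a minimal records.config sketch; the values are illustrative, the knob names should be checked against the docs for your ATS version, and traffic_server needs a restart after changing the thread settings:)

  # bypass the HTTP cache entirely
  CONFIG proxy.config.http.cache.http INT 0
  # pin the NET thread count instead of auto-sizing it
  CONFIG proxy.config.exec_thread.autoconfig INT 0
  # e.g. one NET thread per core on the 24-core proxy box
  CONFIG proxy.config.exec_thread.limit INT 24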
