I'm interested to see what the specs of your machine are, the numbers you are
getting, and how you set up your tests…

Benchmarking is something few people really do. I spent a bunch of time on it and
was pretty happy with the results, but would love to see a peer's work…




Steve Lerner | Director / Architect - Performance Engineering | m 212.495.9212 | [email protected]

From: <[email protected]> on behalf of Di Li <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, December 14, 2016 at 4:39 PM
To: "[email protected]" <[email protected]>
Subject: Re: benchmark ATS

Hi Yongming,


Thanks for the list. We are not using the cache part; it is disabled on our end,
and our goal is to test req/s rather than the bandwidth it will use. Most of
these things we have already tested.

I will check the IRQ balance and the NIC driver settings, as the rest has
already been tested.
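
A quick way to check that (a sketch; eth0 and IRQ 52 are placeholders for the
actual NIC and its interrupt number):

    # See how the NIC's receive queues are spread across CPUs
    grep eth0 /proc/interrupts
    # Watch per-CPU softirq load; %soft piling up on one core means
    # the IRQs are not balanced (mpstat is in the sysstat package)
    mpstat -P ALL 1
    # Pin one IRQ to CPU0 via a hex bitmask, e.g. for IRQ 52 (as root)
    echo 1 > /proc/irq/52/smp_affinity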



Thanks,
Di Li



On Dec 14, 2016, at 10:06 AM, Yongming Zhao <[email protected]> wrote:

The cache performance concerns may include:
1. hit or miss
2. hit on disk or in RAM
3. how many concurrent connections
4. how many unique objects
5. how many requests per connection (keep-alive)
6. is your binary a debug build? (see the quick check after this list)
7. are your debug tags set?
8. is there a disk I/O issue?
…
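
For items 6 and 7, a quick check (a sketch; these are the standard
records.config diagnostics knobs, shown with benchmarking-friendly values):

    # A debug build is one configured with --enable-debug; rebuild
    # without it before benchmarking. Debug diagnostics should be off:
    CONFIG proxy.config.diags.debug.enabled INT 0
    CONFIG proxy.config.diags.debug.tags STRING http.*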

But you get 100% CPU; that means you have reached the ATS limit. You may:
1. lower the query/response size
2. use more keep-alive
3. try to tweak the connections
4. take a look at the cache & hits
5. try to make queries collide less in memory or on disk (in case of a complex hit/miss/update situation)
6. try to evaluate how many NET threads work best (see the sketch after this list)
7. try to figure out which thread is making trouble
8. try to figure out how to lower the SI (softirq) CPU usage
...
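
For items 6 and 7, a sketch (the thread count of 24 is just an example for a
24-core box; the records.config settings are the standard autoconfig knobs):

    # Per-thread CPU for traffic_server; ATS names its event threads
    # [ET_NET n], so a hot thread stands out immediately
    top -H -p $(pidof traffic_server)
    # In records.config, fix the number of NET threads instead of
    # letting ATS autoconfigure them:
    CONFIG proxy.config.exec_thread.autoconfig INT 0
    CONFIG proxy.config.exec_thread.limit INT 24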

Take a look at jtest & httpload in the tree; a good test tool is always better
than anything else.

50000 qps is definitely not a good mark :D

- Yongming Zhao 赵永明

On December 14, 2016, at 6:14 AM, Di Li <[email protected]> wrote:

Hi Steve,

Thanks for that info. I'm trying to use small packets: each request is just 200
bytes, with a 900-byte response.

I don't worry too much about the bandwidth; requests per second are my primary
testing objective. I only use 1 server as client, 1 server as proxy, and 1
server as destination, all dedicated boxes.

So far I can't pass 50000 req/s, and I can see our proxy box has burned up to
100% CPU usage across 24 cores. I'm still investigating whether anything is
wrong in the config, the sysctl settings, or the IRQ balance.

Thanks,
Di Li




On Dec 13, 2016, at 2:01 PM, Lerner, Steve <[email protected]> wrote:

Are you using multiple VMs to post via a single VM with ATS to a tuned HTTPD?

I got upwards of 50K connections and 9gbps. I posted giant binary files as
well… my concern isn't connections per second, it's bandwidth throttling, which
doesn't seem to happen with these massive posts.

In today's age of VMs, if we need more hits/sec we'd just spin up more VMs with
ATS.



Steve Lerner | Director / Architect - Performance Engineering | m 212.495.9212 | [email protected]

From: <[email protected]> on behalf of Di Li <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, December 13, 2016 at 4:44 PM
To: "[email protected]" <[email protected]>
Subject: Re: benchmark ATS

Hi Steve,

We had those settings in place before I made my post, and so far I'm stuck at
the same number.

Btw, there is a good article about tw_recycle and tw_reuse that you may want to
check out; sometimes tw_recycle is evil:

https://vincent.bernat.im/en/blog/2014-tcp-time-wait-state-linux.html
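
For reference, the knobs in question (a sketch of the article's takeaway:
tcp_tw_recycle misbehaves with clients behind NAT):

    net.ipv4.tcp_tw_reuse = 1       # generally safe: only affects outgoing connections
    # net.ipv4.tcp_tw_recycle = 1   # avoid: silently drops SYNs from NATed clients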

Thanks,
Di Li



On Dec 13, 2016, at 11:08 AM, Lerner, Steve <[email protected]> wrote:

We use Ubuntu Server, and in the end the only tuning was in /etc/sysctl.conf:

    net.ipv4.tcp_tw_recycle = 1
    net.core.somaxconn = 65535
    net.ipv4.tcp_fin_timeout = 15
    net.ipv4.tcp_keepalive_time = 300
    net.ipv4.tcp_keepalive_probes = 5
    net.ipv4.tcp_keepalive_intvl = 15

But of that batch I think only SOMAXCONN made the difference. Try with just
that setting and then add the rest.
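
To apply these without a reboot (standard sysctl usage):

    # Reload everything from /etc/sysctl.conf
    sudo sysctl -p
    # Or try just the one that mattered first:
    sudo sysctl -w net.core.somaxconn=65535
    # Verify:
    sysctl net.core.somaxconn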

The test was simply:

ab -p post.txt -l -r -n 1000000 -c 20000 -k -H "Host: [apache httpd server IP]" 
http://[apache traffic server forward proxy IP]:8080/index.html

where post.txt is the file to post.

You can study the Apache Bench manpage to understand the flags used and vary
them to see the results. I'd use multiple client VMs running posts via Apache
Bench targeting the single proxy server and could easily hit 9gbps and above.
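
To fan the same test out from several clients, something like this works (a
sketch; the client host names and the two IPs are placeholders):

    # Launch ab on each client VM in parallel from one control box
    for host in client1 client2 client3; do
      ssh "$host" 'ab -p post.txt -l -r -n 1000000 -c 20000 -k \
        -H "Host: 10.0.0.20" http://10.0.0.10:8080/index.html' &
    done
    wait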

To see the performance, we used the commands ss -s and top.
Run these on all the machines involved to keep an eye on everything.
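
For example (stock iproute2/procps tools):

    watch -n 1 ss -s    # socket summary, refreshed every second
    top                 # press 1 after it starts for the per-CPU view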

This was all run manually and quickly.

-Steve



Steve Lerner | Director / Architect - Performance Engineering | m 212.495.9212 | [email protected]

From: <[email protected]> on behalf of Di Li <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, December 13, 2016 at 1:20 PM
To: "[email protected]" <[email protected]>
Subject: Re: benchmark ATS

Hey Steve,

Can you share some details on the config, the performance tuning, or the results?

Thanks,
Di Li



On Dec 13, 2016, at 9:46 AM, Lerner, Steve <[email protected]> wrote:

I've benchmarked ATS forward proxy POSTs with cache disabled to near 10gbps on
an OpenStack VM with a 10Gbps NIC.
I used Apache Bench for this.

Steve Lerner | Director / Architect - Performance Engineering | m 212.495.9212 | [email protected]

From: <[email protected]> on behalf of Di Li <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, December 13, 2016 at 12:32 PM
To: "[email protected]" <[email protected]>
Subject: Re: benchmark ATS

Using 6.2.0; it's repeatable.

Thanks,
Di Li



On Dec 13, 2016, at 1:28 AM, Reindl Harald <[email protected]> wrote:


On 13.12.2016 at 09:45, Di Li wrote:




When I was doing some benchmarks for the outbound proxy with http_cache
enabled, first of all, the performance was pretty low; I guess I didn't set it
up right with the cache enabled. Second, when I used wrk with 512 connections
and 40 threads going through the proxy over HTTP, it caused a core dump;
here's the trace.

And when I disable http.cache, the performance goes up a lot, and there are no
more core dumps at all.
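
The wrk invocation for that load would look roughly like this (a sketch; the
proxy address and the Host header are placeholders, since wrk has no native
forward-proxy flag and is simply pointed at the proxy):

    # 40 threads, 512 open connections, 60-second run through the proxy
    wrk -t 40 -c 512 -d 60s -H "Host: origin.example.com" \
      http://proxy-host:8080/index.html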


FATAL: CacheRead.cc:249: failed assert `w->alternate.valid()`
traffic_server: using root directory '/ngs/app/oproxy/trafficserver'
traffic_server: Aborted (Signal sent by tkill() 20136 1001)
traffic_server - STACK TRACE

is this repeatable?
which version of ATS?

At least mentioning the software version should be common sense.

I had one such crash after upgrading to 7.0.0 and was not able to reproduce it,
not even with an "ab -k -n 10000000 -c 500" benchmark.





