Hi Chris,

On Tue, Dec 03, 2013 at 02:49:57PM -0500, Chris Burroughs wrote:
> On 11/26/2013 07:25 AM, Chris Burroughs wrote:
> >
> >As far as I can tell from AMD docs and Vincent's handy /sys trick, each
> >of the 6 cores has a fully independent L2 cache, and the chip has a
> >single shared L3 cache.
> >
> >I'm not sure I'm following the part about the "same part of the L3
> >cache". Are you saying that some cores are "closer" to each other on
> >the L3 cache, like NUMA?
> >
> >>These CPUs seem to be designed for VM hosting, or running highly
> >>threaded Java apps which don't need much FPU. I'm not certain they
> >>were optimized for network processing unfortunately, which is sad
> >>considering that their older brothers were extremely fast at that.
> >>
> >
> >"Highly threaded Java apps" happens to be what most of our servers are
> >used for and what we benchmarked for purchasing decisions.
>
> We happen to have another CPU we purchased to be good with highly
> threaded Java apps: Intel Xeon CPU E5-2670 0 @ 2.60GHz
>
> It also has a L2 cache per core. This CPU has performed significantly
> better in both "many" and "a few" threaded workloads. Somewhat
> surprisingly with a single haproxy I'm only able to get to around 23 k
> req/s (vs 20k with the much older Opteron).

Could you please run "top" during the test, and press "1" to have the
per-cpu measures? Simply copy-paste it and send it here so that we can
check what to improve.

Willy