Hi Brendon,

On Wed, Sep 26, 2018 at 02:45:29PM -0500, Brendon Colby wrote:
> Tc mean: 0.59 ms
> Tc std dev: 17.49 ms
> Tc max: 1033.00 ms
> Tc median: 0.00 ms

I don't know if all your servers are local, but if that's the case, the Tc should always be very small, and a stddev of 17 ms and a max of 1 s are huge. This could indicate random pauses in the hypervisor, which would support the possibility of traffic bursts I was talking about.

> So it looks like Tc isn't the issue here. Everything else looks good
> to my eyes. I still think something else changed, because on Jessie
> this never happened like I said.

As I said, it's very possible that with a change of limit you were slightly above the minimum required settings and now you're slightly below after the kernel change.

> > But I also confess I've not run a VM test myself for a
> > while because each time I feel like I'm going to throw up in the middle
> > of the test :-/
>
> I know that's always been your position on VMs (ha) but one day I
> decided to try it for myself and haven't had a single issue until now.
> Our old hardware sat nearly 100% idle most of the time, so it was hard
> to justify the expense.

Oh don't get me wrong, I know how this happens and am not complaining about it. I'm just saying that using VMs is mostly a cost-saving solution, and that if you cut costs you often have to expect a sacrifice on something else. For those who can stand a slight degradation of performance or latency, or who can spend some time chasing issues which do not exist from time to time, that's fine. Others can't afford this at all and will prefer bare metal. It's just a matter of trade-off.

At least if you found a way to tune your system to work around this issue, you should simply document it somewhere for you or your coworkers and consider yourself lucky to have saved $93/mo without degrading the performance :-)

Willy
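
PS: for anyone wanting to reproduce figures like the Tc statistics quoted at the top, here is a minimal sketch of one way to do it. It is not the script Brendon used. It assumes the timers field of an HTTP log line holds five slash-separated values with Tc in third position, and that a negative Tc means the connection was never established. The sample timers and the helper names extract_tc and tc_summary are made up for illustration.

```python
import statistics


def extract_tc(timers):
    """Return Tc (ms) from a timers field such as '10/0/30/69/109'.

    Assumption: five slash-separated timers with Tc in third position.
    """
    return int(timers.split("/")[2])


def tc_summary(tc_values):
    """Summarize connect times in ms, ignoring negative (unset) values."""
    # Assumption: a negative Tc means the connection was never established.
    valid = [tc for tc in tc_values if tc >= 0]
    return {
        "mean": statistics.mean(valid),
        "stdev": statistics.pstdev(valid),  # population standard deviation
        "max": max(valid),
        "median": statistics.median(valid),
    }


# Hypothetical timers fields, not taken from any real log.
sample_timers = ["10/0/0/69/79", "5/0/1/12/18", "7/0/0/3/10",
                 "9/0/1033/40/1082", "3/0/-1/-1/3"]
summary = tc_summary([extract_tc(t) for t in sample_timers])
print(summary)
```

With local servers, a single sample around one second is enough to push the standard deviation far above the mean while leaving the median at zero, which is the pattern in the figures quoted above.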