Try watching:
$ iostat -x 1
Pay attention to the r_await/w_await columns, which are the average
read and write wait times respectively, in milliseconds.
If either of those numbers goes above 15 you will get the stuttering you
describe.
Basically it indicates that the underlying disk(s) cannot serve your
read/write requests within 15 milliseconds. As mentioned previously,
1000 ms / 66 ticks = a time budget of roughly 15 ms per tick.
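If you want to flag the bad seconds without staring at the output, a
one-liner along these lines works. This is only a rough sketch: "sda" is
just an example device name, and the column numbers are a guess for a
recent sysstat build, so run iostat -x once and count the columns on your
own version before trusting the filter.

$ iostat -x 1 | awk '/^sda/ && ($11 > 15 || $12 > 15)'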
If that is the case, talk to the host about possibly migrating to another
physical host, or change providers. There's nothing you can do inside
the VM.
With VPS hosting there are a lot of anecdotes about various providers.
At the end of the day your *consistent* performance level depends on
what your provider does in terms of locking down resource usage between
VMs, how many VMs they try to cram onto each box, and what your
neighbours are doing.
In an ideal world providers would publish specifications down to things
like disk latency, IOPS, CPU scheduling, etc., but in reality most
customers won't understand them anyway, or prefer to pay a low price for
"burst" ability even if that means massive oversubscription.
Unfortunately that means VPS is basically a crapshoot, especially in
the budget market. You try VPSes until you land with a provider & host
that works for your budget. On the bright side, at least with server
registration you can now migrate between VPS providers without losing
all of your players :]
On 10/04/2014 3:48 AM, pilger wrote:
Hey guys,
Switched to Ubuntu 13 x86 and things got a little better but the hiccups
are still there.
I've noticed the "wa" value of top varies a bit during the hiccups. Here's
what I have observed:
[inline image: top output showing the "wa" value during a hiccup]
Could that be related to the problem? I've googled it briefly and found it
has something to do with disk I/O. Is that correct!? Is the problem caused
by a slow disk!?
Thanks in advance!
_pilger
On 8 April 2014 11:14, pilger <[email protected]> wrote:
The VPS is running OpenVZ virtualization, so I guess it should be fine.
I also have a friend at the same hosting company with the same plan and
Debian 7 x64 (which was the OS at the time I first posted about the
problem) whose server works way better than my VPS. Not sure why.
_pilger
On 8 April 2014 11:06, pilger <[email protected]> wrote:
Weasels, whew. Long text. But it was a good share of experience there. My
VPS host offers Ubuntu 13 as the latest OS there, so I switched to it. In
preliminary tests it did well; let's see how it goes when the server gets
full and we reach the peak time of day when the lousy neighbours are all
screaming around.
I've had previous experience with Linux, but for other purposes, and
since I do some programming the learning curve isn't too steep for
me. Networking is still a mystery to me, though, so I came here for your
help.
Many thanks for all the info!
Thanks for all the input, Yun. It seems interesting but is a rather long read,
so I'll take my time to digest it all. I'll also have to set up another
server to host the monitor, which I don't have available right now, so I
guess collectd will be postponed for a while, despite it being a good way
to see what the hell is going on.
_pilger
On 7 April 2014 21:55, Yun Huang Yong <[email protected]> wrote:
collectd is fairly lightweight, which is why it is popular as a collection
agent.
I'm going to assume you are comfortable installing packages and
configuring software. I don't have the time to write copy-paste
instructions. I *strongly* recommend you read all of this & the links
before you begin, and make sure you understand what is required of you.
The key components are:
- https://collectd.org/
- http://graphite.wikidot.com/
You need to get collectd running on each of your TF2 hosts. Basically
apt-get, but see the note below regarding collectd versions.
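For example, on Debian/Ubuntu (the exact version you get depends on your
release -- see the version note further down):

$ apt-cache policy collectd    # check which version your distro ships
$ sudo apt-get install collectd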
You'll then want to set up Graphite on another machine. You *could* run
it on your TF2 host, but Carbon can get I/O hungry (it is tunable) and that
will create more problems for you, so I strongly recommend running Graphite
on another machine.
Also, having Graphite on another machine (with the collectd collector,
below) makes it easy for you to have multiple TF2 hosts, or migrate TF2
hosts.
In my setup I have my Graphite host running on an Ubuntu VM at home with
6 external servers reporting to it.
Here's a picture of the overall setup:
https://dl.dropboxusercontent.com/u/8110989/2014/collectd-graphite.png
Back to setup... with collectd running on your TF2 host, and Graphite on
another host, how do you connect them?
https://collectd.org/wiki/index.php/Networking_introduction
Your Graphite host *also* needs to run collectd in order to act as a
collection server that your TF2 host's collectd sends data to. Edit the config
on both the TF2 host & the Graphite host -- both sides need to run the collectd
network plugin, the Graphite host as server, the TF2 host as client.
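As a rough sketch (the hostname is a placeholder for your Graphite box and
25826 is collectd's default network port -- adjust to your setup), the TF2
host's collectd.conf gets something like:

LoadPlugin network
<Plugin network>
  Server "graphite.example.com" "25826"
</Plugin>

and the Graphite host's collectd.conf gets:

LoadPlugin network
<Plugin network>
  Listen "0.0.0.0" "25826"
</Plugin>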
Your Graphite host's collectd also needs to run the write_graphite
plugin to write the network collected data to Graphite.
https://collectd.org/wiki/index.php/Plugin:Write_Graphite
<Plugin write_graphite>
  <Carbon>
    Host "localhost"
    Port "2003"
    EscapeCharacter "_"
  </Carbon>
</Plugin>
Note: if you Google collectd + graphite you may be confused because many blog
posts refer to custom-written plugins which were necessary before collectd
had its write_graphite plugin.
Note 2: since you're on Debian, note that the write_graphite plugin was
added in collectd 5.1. You may need to get it from backports or similar.
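If you are still on Debian 7 (wheezy) that would look something like the
following -- assuming wheezy-backports actually carries a 5.1+ build, which
you should verify first with apt-cache policy:

$ echo "deb http://http.debian.net/debian wheezy-backports main" | sudo tee -a /etc/apt/sources.list
$ sudo apt-get update
$ sudo apt-get -t wheezy-backports install collectd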
For Graphite...
This is a reasonable overview but may be out of date:
http://graphite.wikidot.com/installation
Read ^ as an overview but maybe follow the current instructions here:
https://graphite.readthedocs.org/en/latest/install.html
You need to pay attention to storage-schemas.conf, but you can more
or less ignore the other instructions about feeding data into Graphite. With
the collectd write_graphite plugin your data will automagically be fed from
collectd -> localhost:2003, which is Carbon (Graphite's collector).
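As a sketch, a catch-all schema entry matched to collectd's default 10s
interval might look like this (the retention lengths are only example
values -- tune them to your own disk budget, and tighten the pattern if
you only want it to apply to the collectd metrics):

[collectd_10s]
pattern = .*
retentions = 10s:1d,60s:30d,300s:90d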
Good luck :]
PS: I am happy to answer specific questions about the collectd/graphite
setup but if you ask general sysadmin stuff I probably won't respond.
On 8/04/2014 12:50 AM, pilger wrote:
I've noticed the yellow bars mainly on the Mem field. Don't know if that
might be related. Could it?
About collectd, it seems very nice and a lot easier to visualize, but it was
all Greek to me up there. Would you point me to some tutorial or show
me the ropes on how to get it running so I can find the bottlenecks?
Does it use a lot of resources!?
_pilger
On 7 April 2014 11:35, Yun Huang Yong <[email protected]> wrote:
Your concern about noisy VPS neighbours will show up as CPU steal -
htop shows this as yellow bars by default.
Disk latency could also be an issue.
66 tick means each tick has a time budget of around 15ms (1000/66).
If disk latency exceeds 15ms you will get stuttering - I had this
happen on servers in the past.
e.g.
https://dl.dropboxusercontent.com/u/8110989/2013/np1-disk-latency.png
Stuttery server leading up to 08/03 (US-style month/day, August last
year). The host migrated my server to another, less loaded machine; great
for a few weeks, then as that machine also became more heavily
utilised (by other customers) it started to stutter again.
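If you just want to eyeball steal and I/O wait from inside the VM
without graphing anything, something like this is enough (mpstat comes
from the sysstat package; note that some container-style virtualisation
may not report steal at all):

$ vmstat 1 10    # the last column ("st") is CPU time stolen for other guests
$ mpstat 1 10    # shows %iowait and %steal, one line per second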
FWIW I use collectd to gather these metrics on each host, feeding
into a single collectd collector which then uses collectd's
write_graphite plugin to write all the data into Graphite for
storage & graphing. collectd's default 10s polling is great for
picking up transient issues, and graphite-web makes the
visualisation easy.
On 7/04/2014 10:26 PM, pilger wrote:
Hey guys, thanks for the replies.
* The RAM seems all right when I look at it with htop;
* We tried CentOS but the network was behaving poorly with it, so we
  switched to Debian x64 and it became a lot better;
* net_splitpacket_maxrate was set to 50000 while the rates were from
  30000 to 60000. I've now set the splitpacket to 100000 and the rates
  to 50000 to 100000 as you guys suggested. Gotta wait a bit for the
  server to get full so I can check if it worked;
Wouldn't htop or any other monitoring tool show something wrong even
though it's a VPS!?
But, anyway, as I mentioned before, the problem occurs with the server
practically empty. So I don't think it is related to the CPU being
overloaded... could I be wrong on this? Could my VPS neighbours be
leeching on my CPU even though it's supposedly reserved for my service?
Thanks!
_pilger
On 7 April 2014 02:10, John <lists.valve@nuclearfallout.net> wrote:
It's not the RAM. It's packet loss from the server side - you won't
see it on net_graph as it's only client side.

Packet loss should show in net_graph output either way. But, to be
safe, certainly run MTR tests.
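For example, something along these lines produces a shareable report
(substitute your server's address for the placeholder):

$ mtr --report --report-cycles 100 your.server.address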
I've had this happen to me lots of times. Been running servers
since the 1.5 days. Ditch your host and also ditch Debian BS.

Recent versions of Debian work well for game servers, so ditching it
would not be necessary.

You should confer with your host on the status of your hardware and
whether a performance limitation is involved, such as I/O delays.
You should also double-check server-side rates, including by making
sure that net_splitpacket_maxrate is set sufficiently high (such as
100000). These symptoms seem along the lines of what I would expect
from net_splitpacket_maxrate being low.
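For example, in server.cfg that would look something like this (the
values are just one reasonable starting point, not gospel):

net_splitpacket_maxrate 100000
sv_minrate 50000
sv_maxrate 100000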
Ask any corporation or enterprise, they all use CentOS.

CentOS is marketed to enterprise and works well for such
applications because of its older, stable, well-tested software
packages and extended RHEL support for those older packages. For
game servers, it is not ideal, since those older packages often lack
useful features and performance tweaks. Debian is usually a better
choice for game servers.
If you're interested in hosting DDoS protected servers, email me
- I can help you.

Be very careful with hosts that claim to offer DDoS protection.
There is an extremely limited number who do it right, and a very
large number who do not.
-John
_______________________________________________
To unsubscribe, edit your list preferences, or view the list archives, please
visit:
https://list.valvesoftware.com/cgi-bin/mailman/listinfo/hlds_linux