Try watching:
$ iostat -x 1
Pay attention to the r_await/w_await columns, which are the average
read and write wait times respectively, in milliseconds.
If either of those numbers goes above 15 you will get the stuttering you
describe.
Basically it indicates that the underlying disk(s) cannot serve your
read/write requests within 15 milliseconds. As mentioned previously,
1000 ms / 66 ticks = a time budget of roughly 15 ms per tick.
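If you want to flag the bad seconds without staring at the output, a
one-liner along these lines works. This is only a rough sketch: "sda" is
just an example device name, and the column numbers are a guess for a
recent sysstat build, so run iostat -x once and count the columns on your
own version before trusting the filter.

$ iostat -x 1 | awk '/^sda/ && ($11 > 15 || $12 > 15)'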
If that is the case, talk to the host about possibly migrating to another
physical host, or change providers. There's nothing you can do inside
the VM.
With VPS hosting there are a lot of anecdotes about various providers.
At the end of the day your *consistent* performance level depends on
what your provider does in terms of locking down resource usage between
VMs, how many VMs they try to cram onto each box, and what your
neighbours are doing.
In an ideal world providers would publish specifications down to things
like disk latency, IOPS, CPU scheduling, etc., but in reality most
customers won't understand them anyway, or prefer to pay a low price for
"burst" ability even if that means massive oversubscription.
Unfortunately that means VPS is basically a crapshoot, especially in
the budget market. You try VPSes until you land with a provider & host
that works for your budget. On the bright side, at least with server
registration you can now migrate between VPS providers without losing
all of your players :]
On 10/04/2014 3:48 AM, pilger wrote:
Hey guys,
Switched to Ubuntu 13 x86 and things got a little better but the hiccups
are still there.
I've noticed the "wa" value of top varies a bit during the hiccups. Here's
what I have observed:
[inline image: top output showing the "wa" value during a hiccup]
Could that be related to the problem? I've googled it briefly and found it
has something to do with disk I/O. Is that correct!? Is the problem caused
by a slow disk!?
Thanks in advance!
_pilger
On 8 April 2014 11:14, pilger <[email protected]> wrote:
The VPS is running OpenVZ virtualization, so I guess it should be fine.
I also have a friend at the same hosting company with the same plan and
Debian 7 x64 (which was the OS at the time I first posted about the
problem) whose server works way better than my VPS. Not sure why.
_pilger
On 8 April 2014 11:06, pilger <[email protected]> wrote:
Weasels, whew. Long text. But it was a good share of experience there. My
VPS host offers Ubuntu 13 as the latest OS there, so I switched to it. In
preliminary tests it did well; let's see how it goes when the server gets
full and we reach the peak time of day when the lousy neighbours are all
screaming around.
I've had previous experience with Linux, but for other purposes, and
since I do some programming the learning curve isn't too steep for
me. Networking is still a mystery to me, though, so I came here for your
help.
Many thanks for all the info!
Thanks for all the input, Yun. It seems interesting but is a rather long read,
so I'll take my time to digest it all. I'll also have to set up another
server to host the monitor, which I don't have available right now, so I
guess collectd will be postponed for a while, despite it being a good way
to see what the hell is going on.
_pilger
On 7 April 2014 21:55, Yun Huang Yong <[email protected]> wrote:
collectd is fairly lightweight, which is why it is popular as a collection
agent.
I'm going to assume you are comfortable installing packages and
configuring software. I don't have the time to write copy-paste
instructions. I *strongly* recommend you read all of this & the links
before you begin, and make sure you understand what is required of you.
The key components are:
- https://collectd.org/
- http://graphite.wikidot.com/
You need to get collectd running on each of your TF2 hosts. Basically
apt-get, but see the note below regarding collectd versions.
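For example, on Debian/Ubuntu (the exact version you get depends on your
release -- see the version note further down):

$ apt-cache policy collectd    # check which version your distro ships
$ sudo apt-get install collectd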
You'll then want to set up Graphite on another machine. You *could* run
it on your TF2 host, but Carbon can get I/O hungry (it is tunable) and that
will create more problems for you, so I strongly recommend running Graphite
on another machine.
Also, having Graphite on another machine (with the collectd collector,
below) makes it easy for you to have multiple TF2 hosts, or migrate TF2
hosts.
In my setup I have my Graphite host running on an Ubuntu VM at home with
6 external servers reporting to it.
Here's a picture of the overall setup:
https://dl.dropboxusercontent.com/u/8110989/2014/collectd-graphite.png
Back to setup... with collectd running on your TF2 host, and Graphite on
another host, how do you connect them?
https://collectd.org/wiki/index.php/Networking_introduction
Your Graphite host *also* needs to run collectd in order to act as a
collection server that your TF2 host's collectd sends data to. Edit the config
on both the TF2 host & the Graphite host -- both sides need to run the collectd
network plugin, the Graphite host as server, the TF2 host as client.
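As a rough sketch (the hostname is a placeholder for your Graphite box and
25826 is collectd's default network port -- adjust to your setup), the TF2
host's collectd.conf gets something like:

LoadPlugin network
<Plugin network>
  Server "graphite.example.com" "25826"
</Plugin>

and the Graphite host's collectd.conf gets:

LoadPlugin network
<Plugin network>
  Listen "0.0.0.0" "25826"
</Plugin>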
Your Graphite host's collectd also needs to run the write_graphite
plugin to write the network collected data to Graphite.
https://collectd.org/wiki/index.php/Plugin:Write_Graphite
<Plugin write_graphite>
  <Carbon>
    Host "localhost"
    Port "2003"
    EscapeCharacter "_"
  </Carbon>
</Plugin>
Note: if you Google collectd + graphite you may be confused because many blog
posts refer to custom-written plugins which were necessary before collectd
had its write_graphite plugin.
Note 2: since you're on Debian, note that the write_graphite plugin was
added in collectd 5.1. You may need to get it from backports or similar.
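If you are still on Debian 7 (wheezy) that would look something like the
following -- assuming wheezy-backports actually carries a 5.1+ build, which
you should verify first with apt-cache policy:

$ echo "deb http://http.debian.net/debian wheezy-backports main" | sudo tee -a /etc/apt/sources.list
$ sudo apt-get update
$ sudo apt-get -t wheezy-backports install collectd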
For Graphite...
This is a reasonable overview but may be out of date:
http://graphite.wikidot.com/installation
Read ^ as an overview but maybe follow the current instructions here:
https://graphite.readthedocs.org/en/latest/install.html
You need to pay attention to storage-schemas.conf, but you can more
or less ignore the other instructions about feeding data into Graphite. With
the collectd write_graphite plugin your data will automagically be fed from
collectd -> localhost:2003, which is Carbon (Graphite's collector).
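As a sketch, a catch-all schema entry matched to collectd's default 10s
interval might look like this (the retention lengths are only example
values -- tune them to your own disk budget, and tighten the pattern if
you only want it to apply to the collectd metrics):

[collectd_10s]
pattern = .*
retentions = 10s:1d,60s:30d,300s:90d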
Good luck :]
PS: I am happy to answer specific questions about the collectd/graphite
setup but if you ask general sysadmin stuff I probably won't respond.
On 8/04/2014 12:50 AM, pilger wrote:
I've noticed the yellow bars mainly on the Mem field. Don't know if that
might be related. Could it?
About collectd, it seems very nice and a lot easier to visualize, but it was
all Greek to me up there. Would you point me to some tutorial or show
me the ropes on how to get it running so I can find the bottlenecks?
Does it use a lot of resources!?
_pilger
On 7 April 2014 11:35, Yun Huang Yong <[email protected]> wrote:
Your concern about noisy VPS neighbours will show up as CPU steal -
htop shows this as yellow bars by default.
Disk latency could also be an issue.
66 tick means each tick has a time budget of around 15ms (1000/66).
If disk latency exceeds 15ms you will get stuttering - I had this
happen on servers in the past.
e.g.
https://dl.dropboxusercontent.com/u/8110989/2013/np1-disk-latency.png
Stuttery server leading up to 08/03 (US-style month/day, August last
year). The host migrated my server to another, less loaded machine; great
for a few weeks, then as that machine also became more heavily
utilised (by other customers) it started to stutter again.
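If you just want to eyeball steal and I/O wait from inside the VM
without graphing anything, something like this is enough (mpstat comes
from the sysstat package; note that some container-style virtualisation
may not report steal at all):

$ vmstat 1 10    # the last column ("st") is CPU time stolen for other guests
$ mpstat 1 10    # shows %iowait and %steal, one line per second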
FWIW I use collectd to gather these metrics on each host, feeding
into a single collectd collector which then uses collectd's
write_graphite plugin to write all the data into Graphite for
storage & graphing. collectd's default 10s polling is great for
picking up transient issues, and graphite-web makes the
visualisation easy.
On 7/04/2014 10:26 PM, pilger wrote:
Hey guys, thanks for the replies.
* The RAM seems all right when I look at it with htop;
* We tried CentOS but the network was behaving poorly with it, so we
  switched to Debian x64 and it became a lot better;
* net_splitpacket_maxrate was set to 50000 while the rates were from
  30000 to 60000. I've now set the splitpacket to 100000 and the rates
  to 50000 to 100000 as you guys suggested. Gotta wait a bit for the
  server to get full so I can check if it worked;
Wouldn't htop or any other monitoring tool show something wrong even
though it's a VPS!?
But, anyway, as I mentioned before, the problem occurs with the server
practically empty. So I don't think it is related to the CPU being
overloaded... could I be wrong on this? Could my VPS neighbours be
leeching on my CPU even though it's supposedly reserved for my service?
Thanks!
_pilger
On 7 April 2014 02:10, John <lists.valve@nuclearfallout.net> wrote:
It's not the RAM. It's packet loss from the server side - you won't
see it on net_graph as it's only client side.

Packet loss should show in net_graph output either way. But, to be
safe, certainly run MTR tests.
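For example, something along these lines produces a shareable report
(substitute your server's address for the placeholder):

$ mtr --report --report-cycles 100 your.server.address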
I've had this happen to me lots of times. Been running servers
since the 1.5 days. Ditch your host and also ditch Debian BS.

Recent versions of Debian work well for game servers, so ditching it
would not be necessary.

You should confer with your host on the status of your hardware and
whether a performance limitation is involved, such as I/O delays.
You should also double-check server-side rates, including by making
sure that net_splitpacket_maxrate is set sufficiently high (such as
100000). These symptoms seem along the lines of what I would expect
from net_splitpacket_maxrate being low.
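For example, in server.cfg that would look something like this (the
values are just one reasonable starting point, not gospel):

net_splitpacket_maxrate 100000
sv_minrate 50000
sv_maxrate 100000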
Ask any corporation or enterprise, they all use CentOS.

CentOS is marketed to enterprise and works well for such
applications because of its older, stable, well-tested software
packages and extended RHEL support for those older packages. For
game servers, it is not ideal, since those older packages often lack
useful features and performance tweaks. Debian is usually a better
choice for game servers.
If you're interested in hosting DDoS protected servers, email me
- I can help you.

Be very careful with hosts that claim to offer DDoS protection.
There is an extremely limited number who do it right, and a very
large number who do not.
-John
_______________________________________________
To unsubscribe, edit your list preferences, or view the list archives, please
visit:
https://list.valvesoftware.com/cgi-bin/mailman/listinfo/hlds_linux