Serge E. Hallyn wrote:
On Tue, Mar 14, 2017 at 02:29:01AM +0100, Benoit GEORGELIN - Association Web4all wrote:
----- Original Message -----
From: "Simos Xenitellis" <simos.li...@googlemail.com>
To: "lxc-users" <lxc-users@lists.linuxcontainers.org>
Sent: Monday, March 13, 2017 20:22:03
Subject: Re: [lxc-users] Experience with large number of LXC/LXD containers
On Sun, Mar 12, 2017 at 11:28 PM, Benoit GEORGELIN - Association Web4all <benoit.george...@web4all.fr> wrote:
Hi lxc-users,
I would like to know if you have any experience with a large number of
LXC/LXD containers, in terms of performance, stability, and limitations.
I'm wondering, for example, whether having 100 containers behaves the same
as having 1,000 or 10,000 with the same configuration, leaving actual
container workload aside.
I have been looking around for a couple of days to find user/admin feedback
on this, but I'm not able to find accounts of large deployments.
Are there any resource limits, or a maximum number of containers that can be
deployed on the same node?
Besides the physical capacity of the node, is there any specific behavior
that a large number of LXC/LXD containers can run into? I'm not aware of any
limits beyond the number of processes, but I'm sure there must be some
technical constraints on the LXC/LXD side, maybe in namespace availability
or some other technical layer used by LXC/LXD.
I would be interested to hear about your experience, or any
links/books/stories about such large deployments.

It would be interesting to hear from anyone who can talk publicly about
their large deployment.
In any case, it should be possible to create, for example, 1,000 web servers
and then try to access each one, checking for any issues with the
response time.
Another test would be to perform 1,000 WordPress installations and
again check response time and resource usage.
Scripts to create such a massive number of containers would also be
helpful to replicate any issues in order to solve them.
Simos

Been reading this + here's a bit of info.

I've been running LXC since early deployment + now LXD.

There are a few big performance killers related to WordPress. If you keep
these issues in mind, you'll be good.

1) I run 100s of sites across many containers on many machines.

   My business is private, high speed hosting, so I eat from my efforts.
   No theory here.

   I target WordPress site speed at 3000+ reqs/second, measured locally
   using ab (ApacheBench). This is a crude tool + sufficient: I issue up to
   10,000,000 keep-alive requests at a concurrency of 5 against a server,
   capped at 30 seconds.

      # -k keep-alive, -t 30s time limit, -n request cap, -c concurrency
      ab -k -t 30 -n 10000000 -c 5 $URL

   This will crash most machines, unless they're tuned well.
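
   The tuning itself is a topic of its own, but as a starting point (a
   minimal sketch for a Linux host; the exact values are assumptions, size
   them to your workload), the usual first casualties are socket backlogs
   and file descriptor limits:

      # widen accept/SYN backlogs so connection bursts aren't dropped
      sysctl -w net.core.somaxconn=4096
      sysctl -w net.ipv4.tcp_max_syn_backlog=8192
      # raise the open-file limit for the server processes + ab itself
      ulimit -n 65535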

2) Memory + CPU. The big performance killer anywhere is swap thrash. If top
   shows swapping for more than a few seconds, your system is likely heading
   toward a crash.

   Fix: I tend to deploy OVH machines with 128G of memory, as this is enough
   headroom to absorb huge surges of memory usage across many sites during
   traffic spikes... then recover...

   For example, running 100s of sites across many LXD containers, I've had
   machines sustain 250,000+ reqs/hour every day for months.

   At these traffic levels, <1 core used sustained + 50%ish memory use.

   Sites still show 3000+ reqs/sec using ab test above.
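
   A quick way to catch swap thrash (standard procps tooling, nothing
   specific to my setup): watch the si/so columns in vmstat; anything
   persistently non-zero under load means you're actively swapping.

      # one sample per second; si/so = memory swapped in/out per second
      vmstat 1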

3) Database: I run MariaDB rather than MySQL as it's smokin' fast.

   I also relocate /tmp to tmpfs, so temp file i/o runs at memory speed,
   rather than disk speed.

   This ensures all MariaDB temporary files (for complex selects) are
   created + accessed at memory speed.

   Also PHP session /tmp files run at memory speed.

   This is important to me, as many of my clients run large membership
   sites, many with >40K members. These sites' performance would circle
   the drain if /tmp was on disk.
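
   A minimal sketch of the relocation via /etc/fstab (the 2G size is an
   assumption, size it to your memory budget, and remember tmpfs contents
   vanish on reboot):

      # /etc/fstab
      tmpfs  /tmp  tmpfs  defaults,noatime,nosuid,nodev,size=2G  0  0

      # apply without a reboot (existing /tmp contents get shadowed)
      mount /tmp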

4) Disk Thrash: Becomes the killer as traffic increases.

5) Apache Logging: For several clients I'm currently retuning my Apache
   logging to skip logging successful serves of images, css, js, and fonts.
   I'll still log non-200s, as those need to be debugged. (A config sketch
   follows at the end of this point.)

   This can make a huge difference if memory pressure forces disk writes to
   actually go to disk, rather than staying in kernel filesystem i/o buffers.

   Once memory pressure forces physical disk writes, disk i/o starves Apache
   from quickly serving uncached content. Very ugly.

   Right now I'm doing extensive filesystem testing, to reduce disk thrash
   during traffic spikes + related memory pressure.
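
   For the logging change above, Apache 2.4's conditional logging can skip
   successful static-asset serves while keeping everything else. A sketch,
   assuming the combined format + the Debian-style APACHE_LOG_DIR variable;
   the extension list is an assumption to adjust per site:

      # log the request unless it's a 200 for a static asset
      CustomLog ${APACHE_LOG_DIR}/access.log combined \
          "expr=!(%{REQUEST_STATUS} -eq 200 && %{REQUEST_URI} =~ m#\.(gif|jpe?g|png|css|js|woff2?|svg|ico)$#)"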

6) Net Connection: If you're running 1000s of containers, best also check
   adapter saturation. I use 10Gig adapters + even at extreme traffic
   levels, they barely reach 10% saturation.

   This means 10Gig adapters are a must for me, as 10% is 1Gig, so using
   1Gig adapters, site speed would begin to throttle based on adapter
   saturation, which would be a bear to debug.
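
   To check saturation, per-interface throughput counters are enough (sar is
   from the sysstat package; eth0 is an assumption, substitute your adapter):

      # per-interface rx/tx kB/s, one sample per second
      sar -n DEV 1

      # or the raw counters
      ip -s link show eth0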

7) Apache: I've taken to setting up Apache to kill off processes after
   anywhere from 10K to 100K requests served. This ensures the kernel can
   garbage collect (resource reclamation), which also helps escape swapping.

   If you have 100,000s+ Apache processes running with no kill off, then
   eventually they can eat up a massive amount of memory, which takes a long
   time to reclaim, depending on other MPM config settings.
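
   On Apache 2.4 the kill-off knob is MaxConnectionsPerChild (it was called
   MaxRequestsPerChild in 2.2). The value below is an assumption picked from
   the 10K-100K range above; tune it per workload:

      # in the active MPM config (mpm_prefork or mpm_event)
      MaxConnectionsPerChild 10000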

So... General rule of thumb: tune your entire WAMPL stack to run out of
memory, i.e. to serve everything from RAM:

   WAMPL - WordPress running on Apache + PHP + MariaDB + Linux

If your sites run at memory speed, it makes no real difference how many
containers you run. Possibly context switching might come into play if many
of the sites were high-traffic.

If problems occur, just look at your Apache logs across all containers. Move
the site with the highest traffic to another physical machine.

Or, if top shows swapping, add more memory.