Re: [CentOS] clustering and load balancing Apache, using nginx

2009-02-12 Thread Les Mikesell
Sergej Kandyla wrote:

>>>
>>> In the preforking mode apache creates a child on each incoming request, 
>>> so it's too expensive in resource usage.
>>> 
>> Have you actually measured this?  Preforking apache doesn't fork per 
>> request, it forks enough instances to accept the concurrent connection 
>> count plus a few spares.  Each child would typically handle thousands of 
>> requests before exiting and requiring a new fork - the number is 
>> configurable.
>>
>>   
> Sorry for the bad explanation.
> I meant that apache creates a child (above MinSpareServers) to serve 
> each new unique client.

That's actually for each concurrent connection, not each unique client. 
Browsers may fire off many simultaneous connections but http 
connections typically have a very short life, so unless users are 
downloading big files, streaming data, or have low-bandwidth connections 
(or your back end service is slow), you shouldn't have that much 
concurrency.

> I measured nginx in real life :)
> On one server (~15k unique hosts per day, ~100k pageviews, and 1-3k 
> concurrent TCP "established" connections) with a frontend (nginx) - 
> backend (apache + PHP FastCGI) architecture, I turned off nginx proxying 
> and the server went away for a minute... apache forked to MaxClients (500) 
> and took all the memory.

There are many factors that can affect it, but that seems like too many 
concurrent connections for that amount of traffic.  The obvious thing to 
check is whether you have keepalives on and if so, what timeout you use. 
On a busy internet site you want it off or very short.  Also, I'm not 
sure the fastcgi interface gives the same buffer/decoupling effect that 
you get with a proxy.  With a proxy, the heavyweight backend is finished 
and can accept the next request as soon as it has sent its output to the 
proxy, which may take much longer to deliver it to slow clients. The 
fastcgi interface might keep the backend tied up until the output is 
delivered.  If that is the case, you would get much of the same effect 
with apache as a front end proxy.  Running apache as a proxy might work 
with less memory in threaded mode too.
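
For what it's worth, the decoupling described above is roughly what nginx's 
proxy buffering provides when it sits in front of the backend. A minimal 
sketch, assuming a backend listening on 127.0.0.1:8080; the address and all 
values are placeholders, not settings taken from either site in this thread:

```nginx
# Sketch only: values are illustrative, not tuned or measured.
# These directives belong inside the http block.
keepalive_timeout 5;        # keep idle client connections open only briefly

server {
    listen 80;

    location / {
        proxy_pass http://127.0.0.1:8080;   # assumed backend address

        proxy_buffering on;                 # default: read the backend response into buffers
        proxy_buffers 8 16k;                # per-connection buffers for the response body
    }
}
```

With buffering on, the backend can hand over its response and move to the 
next request while the front end feeds the slow client.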

> Also, nginx helped me protect against low-to-medium DDoS. When apache forked 
> to MaxClients, nginx could serve many thousands of concurrent connections. So 
> I wrote shell scripts to parse the nginx logs and put the bots' IPs into a 
> firewall table.

Basically if your backend can't deliver the data at the rate the 
requests come in you are fried anyway.

> Therefore I find nginx (lighttpd is also a good choice) efficient enough 
> (at least for me). Of course you should understand what you expect 
> from nginx, what it can do and what it can't.
> 
> If you want real-world measurements or examples of using nginx on heavily 
> loaded sites, please google for them. You could also ask on the nginx at 
> sysoev.ru mailing list (EN).

Thanks, I hadn't found much about it in English.

>>> Also apache spends about 
>>> 15-30KB of memory serving each TCP connection, while nginx needs only 
>>> 1-1.5KB. If you have, for example, about 100 concurrent connections 
>>> from different IPs, there are nearly 100 apache forks... it's too expensive.
>>> 
>> A freshly forked child should have nearly 100% memory shared with its 
>> parent and other child instances. 
> Please tell me how many resources you would need for reverse proxying 
> with apache for, say, nearly 1k-2k unique clients?
> What CPU load and memory usage would you have?

I'm not sure there are good ways to measure the shared copy-on-write RAM 
of forked processes.  But 15k/connection doesn't sound unreasonable, 
keeping in mind that you have to buffer all unacknowledged data somewhere.

> I think that apache is great software. It's very flexible and 
> feature-rich, but it is especially good as a backend for dynamic 
> applications (mod_php, mod_perl, etc.)
> If you need to serve many thousands of concurrent connections you should 
> look at nginx, lighttpd, squid, etc.
> IMHO.

I've been using F5 load balancers for the hard part of this for a while, 
but I'd still wonder why you have that much concurrency instead of 
delivering the page and dropping the connection.

-- 
   Les Mikesell
lesmikes...@gmail.com
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] clustering and load balancing Apache, using nginx

2009-02-12 Thread Sergej Kandyla
Les Mikesell wrote:
> Sergej Kandyla wrote:
>   
>> The nginx http_proxy module is a universal, comprehensive solution. Also, 
>> apache generally works in prefork mode; I don't know whether 
>> mod_jk\mod_proxy_ajp work in the worker-MPM mode...
>>
>> In the preforking mode apache creates a child on each incoming request, 
>> so it's too expensive in resource usage.
>> 
>
> Have you actually measured this?  Preforking apache doesn't fork per 
> request, it forks enough instances to accept the concurrent connection 
> count plus a few spares.  Each child would typically handle thousands of 
> requests before exiting and requiring a new fork - the number is 
> configurable.
>
>   
Sorry for the bad explanation.
I meant that apache creates a child (above MinSpareServers) to serve 
each new unique client.

I measured nginx in real life :)
On one server (~15k unique hosts per day, ~100k pageviews, and 1-3k 
concurrent TCP "established" connections) with a frontend (nginx) - 
backend (apache + PHP FastCGI) architecture, I turned off nginx proxying 
and the server went away for a minute... apache forked to MaxClients (500) 
and took all the memory.

Also, nginx helped me protect against low-to-medium DDoS. When apache forked 
to MaxClients, nginx could serve many thousands of concurrent connections. So 
I wrote shell scripts to parse the nginx logs and put the bots' IPs into a 
firewall table.
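
The scripts themselves are not shown in this thread. As a different, 
nginx-side technique (not the log-parsing approach described above), 
per-address limits can be declared in the configuration. A minimal sketch, 
assuming a current nginx that provides limit_conn_zone and limit_req_zone; 
the zone names, sizes, rates and backend address are made-up placeholders:

```nginx
# Sketch only: zone names, sizes and rates are made-up placeholders.
# The *_zone directives go in the http block.
limit_conn_zone $binary_remote_addr zone=per_ip_conn:10m;
limit_req_zone  $binary_remote_addr zone=per_ip_req:10m rate=10r/s;

server {
    listen 80;

    location / {
        limit_conn per_ip_conn 20;             # at most 20 concurrent connections per address
        limit_req  zone=per_ip_req burst=20;   # allow short bursts above 10 requests/second
        proxy_pass http://127.0.0.1:8080;      # assumed backend address
    }
}
```

This only throttles at the web server; blocking at the firewall, as above, 
still keeps the traffic away from nginx entirely.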

Therefore I find nginx (lighttpd is also a good choice) efficient enough 
(at least for me). Of course you should understand what you expect 
from nginx, what it can do and what it can't.

If you want real-world measurements or examples of using nginx on heavily 
loaded sites, please google for them. You could also ask on the nginx at 
sysoev.ru mailing list (EN).


>> Also apache spends about 
>> 15-30KB of memory serving each TCP connection, while nginx needs only 
>> 1-1.5KB. If you have, for example, about 100 concurrent connections 
>> from different IPs, there are nearly 100 apache forks... it's too expensive.
>> 
>
> A freshly forked child should have nearly 100% memory shared with its 
> parent and other child instances. 
Please tell me how many resources you would need for reverse proxying 
with apache for, say, nearly 1k-2k unique clients?
What CPU load and memory usage would you have?

I think that apache is great software. It's very flexible and 
feature-rich, but it is especially good as a backend for dynamic 
applications (mod_php, mod_perl, etc.)
If you need to serve many thousands of concurrent connections you should 
look at nginx, lighttpd, squid, etc.
IMHO.

http://www.kegel.com/c10k.html

>  As things change, this will decrease, 
> but you are going to have to store the unique socket/buffer info 
> somewhere whether it is a copy-on-write fork or allocated in an 
> event-loop program.  If you run something like mod_perl, the shared 
> memory effect degrades pretty quickly because of the way perl stores 
> reference counts along with its variables, but I'd expect the base 
> apache and most module code to be pretty good about retaining their 
> inherited shared memory.
>
>   



Re: [CentOS] clustering and load balancing Apache, using nginx

2009-02-11 Thread nate
Les Mikesell wrote:

> It may be, but I'd like to see some real-world  measurements.  Most of
> the discussions about more efficient approaches seem to use straw-man
> arguments that aren't realistic about the way apache works or timings of
> a few static pages under ideal conditions that don't match an internet
> web server.

In my experience apache has not been any kind of noticeable bottleneck.
At my last company we deployed a pair of apache reverse proxy nodes
that did:

- reverse proxy (188 rewrite rules)
- HTTP compression (compression level set to 9)
- mod_expires for some static content that we hosted on the front end
  proxy nodes
- SSL termination for the portion of the sites that needed SSL
- Header manipulation (had to remove some headers to work around
  IE browser issues with SSL)
- Serve up a "maintenance" page when we took the site down for
  software updates (this was on another dedicated apache instance)

traffic flow was:

internet->BigIP->proxy->BigIP->front end web servers->BigIP->back end apps
(utilizing BigIP's ability to transparently/effortlessly NAT
traffic internal to the network, and using HTTP headers to
communicate the originating IP addresses from the outside
world).

Each proxy node had 8 copies of apache going, 4 for HTTP and 4
for HTTPS. At the moment they seem to average about 125 workers
per proxy node, and an average of 80 idle workers per node.
CPU averages 3%, memory averages about 650MB (boxes have 3GB).
When I first started at the company they were trying to do this
via a low end F5 BigIP load balancer but it was not able to
provide the same level of service at low latency (and that was
when we had a dozen proxy rules). I love BigIPs but for proxies
I prefer apache. It wasn't until recently that F5 made their
code pseudo multithreaded; until then, even if you had a 4 CPU
load balancer, the proxy stuff could only use one of those
CPUs. F5 told me that, because of this limitation, one large
local customer had to implement 5 layers of load balancers,
since their app design depended on the full proxy support in
the BigIPs to route traffic.

Systems were dual proc single core hyperthreaded. They proxied
requests for four dual proc quad core systems which seem to
average around 25-35% CPU usage and about 5GB of memory usage (8GB
total) apiece.

At the company before that we had our stuff split out per
customer, and had 3 proxy nodes in front and about 100 web servers
and application servers behind them for the biggest customers,
having 3 was just for N+1 redundancy, 1 was able to handle the
job. And those proxies were single processor.

At my current job 99% of the load is served directly by tomcat,
the application on the front end at least is simple by comparison
so there's no need for rewrite-type rules. Load balancing is
handled by F5 BigIPs, as is SSL termination. We don't do any
HTTP compression as far as I know.

I personally would not want to load balance using apache. I load
balance with BigIPs, and I do layer 7 proxying (URL inspection)
with apache. If I need to do deeper layer 7 inspection then I
may resort to F5 iRules, but the number of times I've had to
do that over the past several years I think is maybe two.
And even today with the latest version of code, our dual
processor BigIPs cannot run in multithreaded mode; it's not
supported on the platform, only on the latest & greatest (ours
is one generation back from the latest).

I use apache because I've been using it for so long and know it
so well. It's rock solid stable, at least for me, and using fewer
different platforms reduces complexity and improves
manageability for me.

If I was in a situation where apache couldn't scale to meet the
needs and something else was there that could handle say 5x the
load, then I might take a look. So far haven't come across that
yet.

nate



Re: [CentOS] clustering and load balancing Apache, using nginx

2009-02-11 Thread Les Mikesell
Sergej Kandyla wrote:
> 
> The nginx http_proxy module is a universal, comprehensive solution. Also, 
> apache generally works in prefork mode; I don't know whether 
> mod_jk\mod_proxy_ajp work in the worker-MPM mode...
> 
> In the preforking mode apache creates a child on each incoming request, 
> so it's too expensive in resource usage.

Have you actually measured this?  Preforking apache doesn't fork per 
request, it forks enough instances to accept the concurrent connection 
count plus a few spares.  Each child would typically handle thousands of 
requests before exiting and requiring a new fork - the number is 
configurable.

> Also apache spends about 
> 15-30KB of memory serving each TCP connection, while nginx needs only 
> 1-1.5KB. If you have, for example, about 100 concurrent connections 
> from different IPs, there are nearly 100 apache forks... it's too expensive.

A freshly forked child should have nearly 100% memory shared with its 
parent and other child instances.  As things change, this will decrease, 
but you are going to have to store the unique socket/buffer info 
somewhere whether it is a copy-on-write fork or allocated in an 
event-loop program.  If you run something like mod_perl, the shared 
memory effect degrades pretty quickly because of the way perl stores 
reference counts along with its variables, but I'd expect the base 
apache and most module code to be pretty good about retaining their 
inherited shared memory.

> If you don't need the full power of apache's flexibility as a server for 
> dynamic applications, why use it for a simple job such as proxying?
> So, I think nginx is great as a light frontend server.

It may be, but I'd like to see some real-world  measurements.  Most of 
the discussions about more efficient approaches seem to use straw-man 
arguments that aren't realistic about the way apache works or timings of 
a few static pages under ideal conditions that don't match an internet 
web server.

-- 
   Les Mikesell
lesmikes...@gmail.com




Re: [CentOS] clustering and load balancing Apache, using nginx

2009-02-11 Thread Sergej Kandyla
Les Mikesell wrote:
> Sergej Kandyla wrote:
>   
>>
>> No, nginx can serve any kind of content via the ngx_http_proxy_module: 
>> http://wiki.codemongers.com/NginxHttpProxyModule
>> For example, I'm using nginx as a reverse proxy for tomcat 
>> servers\applications.
>> 
>
> Is there some advantage to this over apache with mod_jk?
>
>   
AFAIK mod_jk is only available for RHEL4\CentOS4, i.e. apache 2.0 (of 
course you could compile it manually for the apache 2.2 that comes with 
CentOS 5). So the recommended way for CentOS 5 (apache 2.2) is to use 
mod_proxy (mod_proxy_ajp).

The nginx http_proxy module is a universal, comprehensive solution. Also, 
apache generally works in prefork mode; I don't know whether 
mod_jk\mod_proxy_ajp work in the worker-MPM mode...

In the preforking mode apache creates a child on each incoming request, 
so it's too expensive in resource usage. Also apache spends about 
15-30KB of memory serving each TCP connection, while nginx needs only 
1-1.5KB. If you have, for example, about 100 concurrent connections 
from different IPs, there are nearly 100 apache forks... it's too expensive.

If you don't need the full power of apache's flexibility as a server for 
dynamic applications, why use it for a simple job such as proxying?
So, I think nginx is great as a light frontend server.

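Example config for proxying to a tomcat backend:

location / {
    # Send requests for the bare root to the application path.
    rewrite ^/$ /tomcatapp/ redirect;
}

location /tomcatapp {
    # Forward to the local tomcat HTTP connector.
    proxy_pass http://localhost:8080/tomcatapp;

    # Pass the original host and client address to the backend.
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

    # Timeouts, in seconds.
    proxy_connect_timeout 120;
    proxy_send_timeout 120;
    proxy_read_timeout 180;
}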



Re: [CentOS] clustering and load balancing Apache, using nginx

2009-02-11 Thread Les Mikesell
Sergej Kandyla wrote:
>
> No, nginx can serve any kind of content via the ngx_http_proxy_module: 
> http://wiki.codemongers.com/NginxHttpProxyModule
> For example, I'm using nginx as a reverse proxy for tomcat 
> servers\applications.

Is there some advantage to this over apache with mod_jk?

-- 
   Les Mikesell
lesmikes...@gmail.com



Re: [CentOS] clustering and load balancing Apache, using nginx

2009-02-11 Thread Sergej Kandyla
Rainer Duffner wrote:
> Sergej Kandyla wrote:
>   
>> Hi,
>> apache is good as a backend server for dynamic applications.
>> You could use something like nginx or haproxy as a frontend for balancing 
>> multiple backend servers.
>> I'm using nginx. This light web server can serve many thousands of 
>> concurrent connections! It works great!
>>
>> look at
>> http://wiki.codemongers.com/NginxLoadBalanceExample
>> http://blog.kovyrin.net/2006/08/25/haproxy-load-balancer/lang/en/
>> http://blog.kovyrin.net/2006/05/18/nginx-as-reverse-proxy/lang/en/
>> and http://highscalability.com/
>>
>>   
>> 
>
> Yup. NGINX is probably the fastest way to serve content nowadays.
> But content has to be static and be available as a file (AFAIK) directly
> to NGINX.
>   
No, nginx can serve any kind of content via the ngx_http_proxy_module: 
http://wiki.codemongers.com/NginxHttpProxyModule
For example, I'm using nginx as a reverse proxy for tomcat 
servers\applications.
I've also written an article about using nginx in the shared hosting sphere. 
Look at http://directadmin.com/forum/showthread.php?t=27344

When the content is located on the same server (or reachable via NAS\SAN), 
nginx can serve it directly using efficient mechanisms like sendfile:
http://wiki.codemongers.com/NginxHttpCoreModule#sendfile
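
A minimal sketch of serving local files this way. The server name and paths 
are placeholders I made up for illustration, not taken from a real setup:

```nginx
# Sketch only: the server name and paths below are placeholders.
server {
    listen 80;
    server_name static.example.com;

    sendfile   on;   # let the kernel copy file data to the socket
    tcp_nopush on;   # with sendfile, send headers and file start together

    location /files/ {
        root /srv/www;   # a request for /files/a.iso maps to /srv/www/files/a.iso
    }
}
```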

For serving static content nginx is even many times more efficient than ftp!!
On some servers with low-power hardware like celeron\sempron processors 
and 512MB of RAM I get an upload rate of nearly 100 Mbit/s. That is not 
nginx's limit; it's the limit of the SATA disks and the channel to those 
servers :)

As for load-balancing:
http://wiki.codemongers.com/NginxHttpUpstreamModule
http://barry.wordpress.com/2008/04/28/load-balancer-update/
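
A minimal sketch of what balancing across several backends with the upstream 
module can look like. The pool name, backend addresses and weights are 
placeholders, and the upstream block is assumed to sit in the http block:

```nginx
# Sketch only: backend addresses and weights are placeholders.
upstream backend_pool {
    server 10.0.0.11:80 weight=2;   # receives roughly twice the share
    server 10.0.0.12:80;
    server 10.0.0.13:80 backup;     # used only when the others are unavailable
}

server {
    listen 80;

    location / {
        proxy_pass http://backend_pool;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```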

> There's also "varnish", if you can't meet the above provision easily.
>
>
>   
