mod_perl in larger scale environments

2010-04-14 Thread Brad Van Sickle

Hello

I have a lot of experience in large scale web applications using Java 
and Websphere, but I now find myself needing to scale a web application 
built on mod_perl, and I have some questions about best practices for 
doing that since I don't have any sort of deployment manager or an 
intelligent HTTP plugin.


I currently have an application set up in the standard 3 tiered model:
Apache web layer - Apache/mod_perl app layer - MySQL DB layer.

Right now I have one app layer node, but traffic is dictating that I 
need to expand capacity there soon and I plan on adding more hosts to 
that layer.


My first question relates to quality of service and load balancing:
I'm currently using mod_proxy on the web layer, and I know I can set 
that up to load balance requests to multiple app layer nodes, but to the 
best of my knowledge mod_proxy is not able to provide any quality of 
service.  So if a node in the app layer had a problem (or was shut down 
for maintenance) mod_proxy would be unaware of that and would still send 
requests to that node.   How are situations like this normally handled?  
Is there something I can use other than mod_proxy that is intelligent 
enough to mark a host as down?  I'd rather not use a hardware load 
balancer here if I can avoid it.


My second question deals with management of multiple mod_perl nodes:
At some point, if you have enough app layer nodes, managing the code 
deployments, apache configs and server restarts becomes very cumbersome 
if you're doing it all manually.  Are there any tools that can make 
these tasks easier or give me one management view?


Again, I'm used to working in web applications with a full blown app 
server. I love working with mod_perl, but I do find myself missing the 
advantages an app server gives me sometimes.  Hopefully someone can 
offer me some suggestions here.


Thanks




Re: mod_perl in larger scale environments

2010-04-14 Thread Fred Moyer
On Wed, Apr 14, 2010 at 1:57 PM, Brad Van Sickle bvs7...@gmail.com wrote:
 My first question relates to quality of service and load balancing:
 I'm currently using mod_proxy on the web layer, and I know I can set that up
 to load balance requests to multiple app layer nodes, but to the best of my
 knowledge mod_proxy is not able to provide any quality of service.  So if a
 node in the app layer had a problem (or was shut down for maintenance)
 mod_proxy would be unaware of that and would still send requests to that
 node.   How are situations like this normally handled?  Is there something I
 can use other than mod_proxy that is intelligent enough to mark a host as
 down?  I'd rather not use a hardware load balancer here if I can avoid it.

Check out perlbal - http://search.cpan.org/dist/Perlbal/

The load balancing is really nice, and you can handle 10k's of clients
with junkyard hardware.

 My second question deals with management of multiple mod_perl nodes:
 At some point, if you have enough app layer nodes, managing the code
 deployments, apache configs and server restarts becomes very cumbersome if
 you're doing it all manually.  Are there any tools that can make these tasks
 easier or give me one management view?

Puppet and cfengine are general purpose systems administration tools
that serve well here.  In addition, you can manage the rest of the
system with these tools.


Re: accessing environment variables set by other modules

2010-04-14 Thread Chris Datfung
On Tue, Apr 13, 2010 at 9:57 PM, Chris Datfung chris.datf...@gmail.com wrote:

 On Tue, Apr 13, 2010 at 6:34 PM, Fred Moyer f...@redhotpenguin.com wrote:

 Correct me if I'm wrong, but don't you need to do this:

 PerlPassEnv TE


Hi Fred,

After a bit more research, it seems that PerlPassEnv is only for passing
shell environment variables; Apache maintains a separate environment, as
documented here:

http://httpd.apache.org/docs/2.1/env.html

What I'm looking for is a way to access the environment variables stored in
the internal Apache structure. Any leads on how to access these environment
variables would be much appreciated.

Thanks,
Chris


Re: accessing environment variables set by other modules

2010-04-14 Thread Perrin Harkins
On Wed, Apr 14, 2010 at 5:17 PM, Chris Datfung chris.datf...@gmail.com wrote:
 What I'm looking for is a way to access the environment variables stored in
 the internal Apache structure. Any leads on how to access these environment
 variables would be much appreciated.

The subprocess_env info that Adam sent you should have worked.  Can
you show us what you tried?  Can you try it in a response handler to
make sure it's not an odd bug with filters?

- Perrin


Re: mod_perl in larger scale environments

2010-04-14 Thread Perrin Harkins
On Wed, Apr 14, 2010 at 5:33 PM, Brad Van Sickle bvs7...@gmail.com wrote:
 I didn't find much info on perlbal after a quick glance. I'll certainly give
 it a closer look, but my initial reaction is that I'm leery of replacing
 Apache on my web layer. I'm doing a few things with a few other modules
 (mod_rewrite, for example) in addition to mod_proxy, and from what I was able
 to find in my initial look, I didn't see any support for some of those types
 of things.

There are many full-featured proxy servers these days.  There's even
mod_proxy_balancer for apache, but that doesn't do high-availability,
which you're looking for.

Check out some of these for reverse-proxying if you don't like perlbal:
- nginx
- lighttpd
- varnish
- pound

All of those can serve as mod_perl frontends.
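
The behaviour being asked for in this thread, a balancer that stops
routing to a pool member once it is known to be down and tries it again
later, can be sketched in a few lines. This is only an illustration in
Python, not how perlbal, nginx, or any of the proxies listed above is
implemented; the class name BackendPool, the retry_after parameter, and
the app1:8080 style addresses are all made up.

```python
import time


class BackendPool:
    """Round-robin pool that skips backends recently marked as down."""

    def __init__(self, backends, retry_after=30.0, clock=time.monotonic):
        self.backends = list(backends)
        self.retry_after = retry_after  # seconds before a down node is retried
        self.clock = clock              # injectable so the logic can be tested
        self.down_since = {}            # backend -> time it was marked down
        self._next = 0                  # index of the next backend to consider

    def mark_down(self, backend):
        """Record that a request to this backend just failed."""
        self.down_since[backend] = self.clock()

    def mark_up(self, backend):
        """Record that this backend answered successfully again."""
        self.down_since.pop(backend, None)

    def is_available(self, backend):
        """A backend is usable if it is not marked down or its retry time has come."""
        marked = self.down_since.get(backend)
        return marked is None or self.clock() - marked >= self.retry_after

    def pick(self):
        """Return the next available backend, or None if all are down."""
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if self.is_available(backend):
                return backend
        return None
```

A proxy loop would call pick(), forward the request, and call mark_down()
on a connection error or mark_up() on success.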

- Perrin


Re: mod_perl in larger scale environments

2010-04-14 Thread Brad Van Sickle
So it sounds like Apache is simply not going to meet my needs. In the 
event that I do need to replace Apache, hopefully you can save me some 
research time and recommend one of the listed options that fulfills 
my needs (or confirm that perlbal does).


I need the following features:
1) provides support for named virtual hosts
2) supports SSL to the client
3) supports URL rewriting (similar to mod_rewrite)
4) knows the availability of pool members and provides high availability.
5) the ability to serve static content itself

I guess 5 isn't strictly necessary, but it would be nice to serve 
static content (css/js/images/etc...) from the same piece of technology 
without proxying those requests to another Apache instance running on 
the same host (or something).


Thanks for all the help!



Re: mod_perl in larger scale environments

2010-04-14 Thread Fred Moyer
On Wed, Apr 14, 2010 at 3:15 PM, Brad Van Sickle bvs7...@gmail.com wrote:
 So it sounds like Apache is simply not going to meet my needs. In the event
 that I do need to replace Apache, hopefully you can save me some research
 time and recommend me one of the listed options that fulfills my needs (or
 confirm that perlbal does)

You may want to try Apache with the event MPM using mod_proxy.  I
haven't used that myself; I ended up using Perlbal, mostly so I could
customize the code without having to deal with intermittent segfaults
like I did when I patched Apache and got something wrong.

 I need the following features:
 1) provides support for named virtual hosts
 2) supports SSL to the client
 3) supports URL rewriting (similar to mod_rewrite)
 4) knows the availability of pool members and provides high availability.
 5) the ability to serve static content itself

I think perlbal does all of this.  Getting it configured to do all of
this may not be straightforward, but the perlbal list is very helpful
for that.

That being said, I've heard really great things about Varnish, so I'd
try that if I didn't have perlbal.






Re: mod_perl in larger scale environments

2010-04-14 Thread Cosimo Streppone
On 14 April 2010 at 22:57:06, Brad Van Sickle bvs7...@gmail.com wrote:

 My first question relates to quality of service and load balancing:


Hi Brad,

we're using LVS (http://en.wikipedia.org/wiki/Linux_Virtual_Server),
and I find it very useful and reliable.

Our infrastructure for the modperl application, simplifying,
consists of:

a) 1 main front lvs load balancer
b) 2 web frontends
c) 1 back lvs load balancer
d) 12 apache/modperl backends
e) 5 db servers
f) ...other stuff... :)

a) load balances incoming traffic between the 2 web frontends.
b) rewrites backend requests to a single address, wlb (web load balancer)
   which is handled by c)
c) takes incoming requests for several different virtual hostnames, as in:

   wlb.domain.com   = {weighted round robin to} = (back1, back2, ..., back12)
   mlb.domain.com   = {wrr} = (db1, db2, ..., db5) (mysql load balancer)
   s-mlb.domain.com = {wrr} = (search-db1, search-db2, ...) (search mysql lb)

d) app servers are stateless, so we don't need sticky sessions
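
To make "weighted round robin" concrete, here is a minimal Python sketch
of one way to pick backends in proportion to their weights. It uses the
"smooth" variant of the algorithm and is not a copy of the scheduler LVS
itself uses; the backend names and weights in the usage note are
illustrative assumptions only.

```python
def weighted_round_robin(weights):
    """Yield backend names forever, in proportion to their integer weights.

    weights: dict mapping backend name -> positive integer weight.
    """
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    while True:
        # Every backend earns credit equal to its weight on each round.
        for name, weight in weights.items():
            current[name] += weight
        # The backend with the most credit is chosen and pays back the total,
        # which spreads a heavy backend's turns out instead of bunching them.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        yield chosen
```

For example, with weights {"back1": 2, "back2": 1, "back3": 1}, every
four consecutive picks contain back1 twice and the other two once each.
Because the app servers are stateless, nothing more than this choice is
needed; no session has to be pinned to a particular backend.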

This architecture can be simplified, and we're trying to do that.
So I'm not saying this is best practice, or even sane. :)

LVS performs health checking via HTTP requests,
with or without md5 checksum of the responses,
or direct TCP connections to the port you specify (f.ex. for db servers).
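
The two kinds of check described above (an HTTP request whose response
body can be compared against a known MD5 digest, and a plain TCP connect)
can be sketched in Python as follows. This is an illustration of the
idea only, not the code or configuration of any LVS health-check daemon;
the function names and the 2 second default timeout are assumptions.

```python
import hashlib
import http.client
import socket
import urllib.request


def tcp_check(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def body_matches(body, expected_md5):
    """Return True if the MD5 hex digest of the response body matches."""
    return hashlib.md5(body).hexdigest() == expected_md5


def http_check(url, expected_md5=None, timeout=2.0):
    """Return True if url answers 200 and, if given, the body digest matches."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            if response.status != 200:
                return False
            if expected_md5 is None:
                return True
            return body_matches(response.read(), expected_md5)
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or a broken reply.
        return False
```

A monitor would run one of these against each backend every few seconds
and take a backend out of the pool when the check fails.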

 I'm currently using mod_proxy on the web layer, and I know I can set
 that up to load balance requests to multiple app layer nodes, but to the
 best of my knowledge mod_proxy is not able to provide any quality of
 service.  So if a node in the app layer had a problem (or was shut down
 for maintenance) mod_proxy would be unaware of that and would still send
 requests to that node.


That's where LVS is useful.

LVS can do direct routing or TCP handoff IIRC, and we're using it.
With direct routing, the servers reply straight to the client,
so the LVS machine itself doesn't need many resources.


 I'd rather not use a hardware load balancer here if I can avoid it.


LVS usually runs on our older, less powerful machines.


 My second question deals with management of multiple mod_perl nodes:
 At some point, if you have enough app layer nodes, managing the code
 deployments, apache configs and server restarts becomes very cumbersome
 if you're doing it all manually.


We're using a simple but limited in-house tool that basically uses
rsync and ssh, and keeps a list of hosts with roles.
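
As a rough idea of what such a tool boils down to, here is a minimal
Python sketch that keeps a host list with roles and builds the rsync and
ssh commands for one role. It is not the in-house tool described above;
the host names, roles, and the choice to restart with "apachectl
graceful" are hypothetical, and the commands are only built, not run.

```python
# Hypothetical inventory: host names and roles are made up for illustration.
HOSTS = {
    "back1.example.com": {"app"},
    "back2.example.com": {"app"},
    "front1.example.com": {"web"},
}


def hosts_with_role(role, hosts=HOSTS):
    """Return the sorted host names that carry the given role."""
    return sorted(name for name, roles in hosts.items() if role in roles)


def deploy_commands(role, source_dir, target_dir, hosts=HOSTS):
    """Build (but do not run) the rsync and ssh commands for one role."""
    commands = []
    for host in hosts_with_role(role, hosts):
        # Copy the code tree, deleting remote files that no longer exist locally.
        commands.append(
            ["rsync", "-az", "--delete", source_dir, f"{host}:{target_dir}"]
        )
        # Ask Apache to restart gracefully so in-flight requests can finish.
        commands.append(["ssh", host, "apachectl", "graceful"])
    return commands
```

Each command list can be handed to subprocess.run(cmd, check=True), so a
failed copy or restart on one host stops the rollout instead of passing
silently.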

For a current pilot project I used puppet for config management
and fabric as the last-mile deployment tool. So far I'm happy
with the result.

 Are there any tools that can make these tasks easier or give me one
 management view?


I don't know. Everything we've done is command line based,
so it's not very friendly. Actually I'm currently looking into
higher level tools to integrate what we've done.

I also looked at ControlTier, but it feels too heavy for me.
It needs its own (powerful) machine. I'd be glad to hear about experiences with it.

 Again, I'm used to working in web applications with a full blown app
 server. I love working with mod_perl, but I do find myself missing the
 advantages an app server gives me sometimes.


I'm ignorant there. What advantages exactly?

--
Cosimo


Re: mod_perl in larger scale environments

2010-04-14 Thread Dzuy Nguyen
I concur with LVS.  I have LVS running on a $10 piece of hardware (300 MHz CPU,
128MB memory) that acts as a load balancer for 15+ web servers.  I use keepalived
to monitor the systems.

Dzuy




Re: [RELEASE CANDIDATE] Apache-Test-1.32 RC1

2010-04-14 Thread Randy Kobes
On 2010-04-13, at 3:41 PM, Fred Moyer wrote:

 Please take a couple minutes to test this release candidate [1] for
 Apache::Test 1.32 and report back success or failure.  Thanks!


+1

Tests pass on OS X 10.6.3, Apache/2.2.11 (prefork MPM), and v5.10.0 for 
darwin-thread-multi-2level.

-- 
best regards,
Randy




Re: mod_perl in larger scale environments

2010-04-14 Thread Brad Van Sickle



LVS does sound interesting, but in your infrastructure layout aren't your 
single LVS load balancers single points of failure, especially if they 
are running on older hardware? Maybe that isn't important in your 
environment?  However, it seems like that negates much of the 
high-availability goal of load balancing.


It still may be a possibility for me, possibly running on the same 
host as my existing web layer apache instance and using a localhost 
connection... I will definitely look into it.


 Again, I'm used to working in web applications with a full blown app
 server. I love working with mod_perl, but I do find myself missing the
 advantages an app server gives me sometimes.

 I'm ignorant there. What advantages exactly?

I'll be brief here because this is a mod_perl list :)

The specific product I'm used to working with is IBM Websphere, which 
allows you to cluster your individual app servers and then manage them 
all from one administration tool.  So setting or config changes, code 
deployments, etc... are synced across all nodes.  It makes managing app 
clusters extremely easy.  It also provides a plugin for IBM's HTTP 
server that handles proxying back to the application servers and 
provides load balancing and high availability.


Those are the two advantages that address my original questions 
directly.  App servers provide a lot of other benefits such as allowing 
you to leverage things like shared memory and shared DB and messaging 
connections/buses... many of these can be simulated in mod_perl.  
(Apache::DBI, etc...)
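
The connection reuse mentioned here works, in outline, by keeping one
database handle per server process and handing it back on every connect
call after a quick liveness check. The Python sketch below illustrates
only that idea; it is not Apache::DBI's code, sqlite3 stands in for the
MySQL driver so the example is self-contained, and the name
cached_connect is hypothetical.

```python
import sqlite3

# One cache per process, keyed by the connect argument (assumption: a real
# cache would key on the full DSN, user, and connection options).
_connections = {}


def cached_connect(database):
    """Return a live connection for `database`, reusing a cached one if possible."""
    conn = _connections.get(database)
    if conn is not None:
        try:
            conn.execute("SELECT 1")  # cheap liveness check, like a ping
            return conn
        except sqlite3.Error:
            pass  # stale or closed handle: fall through and reconnect
    conn = sqlite3.connect(database)
    _connections[database] = conn
    return conn
```

Each process pays the connection cost once instead of on every request,
at the price of one open database connection per server process.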




