Re: Advanced Mass Hosting Module

2003-03-17 Thread Thomas Eibner

On Sat, Mar 15, 2003 at 07:45:00PM -0800, Ian Holsman wrote:
 I was looking at the code, and I think we could achieve a lot if we replaced
 the calls to the vhost-parsing code at
 http://lxr.webperf.org/source.cgi/server/protocol.c#L980
 http://lxr.webperf.org/source.cgi/server/connection.c#L203
 
 with 2 hooks. This would give the added benefit of moving the name-based
 virtual hosting into modules/http where it belongs.

Sounds like a very good idea Ian! Does anyone else have comments about
moving virtualhosts out from the core and into a module?
 
 We would still have a problem creating the server configs on the fly, but
 that is a smaller problem which the module writer could attempt.
 


Re: Advanced Mass Hosting Module

2003-03-16 Thread Graham Leggett
Nathan Ollerenshaw wrote:

What I have in mind is a module that fits in with our current LDAP
based infrastructure. Currently, LDAP services our mail users, and I
would like to see the Apache mass hosting configuration held in LDAP as
well. In this way, we can just scale by adding more apache servers,
mounting the shared docroot and pointing them to the LDAP server.

I had this on the cards quite a while ago, but have not got around to 
actually finishing it off.

The idea was a separate tool which would generate flat apache config 
files based on LDAP queries. The reason for the flat files was so the 
server could still restart and work even if the LDAP server was down. 
Kicking a server could be as simple as accessing a special URL, which 
recreates the flat config files and gracefully restarts the server.
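
As a rough illustration of the flat-file generator idea, here is a minimal Python sketch that renders VirtualHost blocks from directory-style records. The record keys reuse the attribute names from the LDAP entry proposed in this thread; the template, the *:80 address, and the function names are assumptions made for illustration. The LDAP query itself and the graceful restart are left out.

```python
from string import Template

# Hypothetical template; a real deployment would carry many more directives.
VHOST_TEMPLATE = Template(
    "<VirtualHost *:80>\n"
    "    ServerName $server_name\n"
    "$alias_lines"
    "    DocumentRoot $doc_root\n"
    "</VirtualHost>\n"
)


def render_vhost(entry):
    """Render one <VirtualHost> block from a dict shaped like a directory entry."""
    alias_lines = "".join(
        f"    ServerAlias {alias}\n" for alias in entry.get("serverAlias", [])
    )
    return VHOST_TEMPLATE.substitute(
        server_name=entry["serverName"],
        alias_lines=alias_lines,
        doc_root=entry["docRoot"],
    )


def render_config(entries):
    """Render all enabled entries, sorted by name so regenerated files diff cleanly."""
    enabled = [e for e in entries if e.get("vhostStatus") == "enabled"]
    enabled.sort(key=lambda e: e["serverName"])
    return "\n".join(render_vhost(e) for e in enabled)
```

Writing the output to a temporary file and renaming it into place only after generation succeeds would be one way to keep the last good config in use while the LDAP server is down.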

Regards,
Graham
--
[EMAIL PROTECTED]   There's a moon over Bourbon Street tonight...


Re: Re: Advanced Mass Hosting Module

2003-03-15 Thread Nomentsoa-Tahiry Ramanampanoharana
Instead of writing a new module from scratch, it might be worth looking at the 
mod_rewrite RewriteMap internal functions first. The mod_rewrite documentation 
(http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html#rewritemap) says:

Here the source is an internal Apache function. Currently you cannot create your own, 
but the following functions already exists:

Actually, this is not true anymore in Apache 2 since you can write your own internal 
functions now.

So what I suggest is that your module register a function as a mod_rewrite 
RewriteMap internal function. That way, your advanced mass hosting needs are met 
and you still have the power of mod_rewrite.

Tahiry

-Original Message-
From: Nathan Ollerenshaw [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Date: Sat, 15 Mar 2003 02:58:41 +0900
Subject: Re: Advanced Mass Hosting Module

On Saturday, March 15, 2003, at 01:13 AM, Thomas Eibner wrote:
 On Sat, Mar 15, 2003 at 01:00:18AM +0900, Nathan Ollerenshaw wrote:
 I wasn't thinking of anything radical. Just have a hook to set the
 handler for a particular document (if it matches .php or .php4) to the
 PHP module if it's allowed to, and serve it as a normal document if
 not. Etc.

 I've not had a great delve in the hooks but nothing has suggested in
 what I've looked at that it's not possible.

 I'm not sure if it's as simple as you describe. What is to stop a user
 from placing a .htaccess file in a directory giving himself ability to
 give the right content type to execute a php script for instance?
 If you want suexec to work too, there might be further complications.
 (Just thinking out loud here) :)

You bring up a valid point, but I was thinking more of sbox. That's what 
we use currently (because suexec didn't fit our model) and it works 
great. Though, there seems to be a bug where it's poisoning the 
environment ...

At any rate, if I'm interfering around the URI-to-filename translation 
phase first, I should be able to minimise any problems with .htaccess 
files. But, I don't know, I don't fully understand all the phases that 
I can interfere with just yet :)

There are other phases I've not really looked at as well which I could 
hook into to do extra sanity checks, I guess. But, I think, get the 
thing basically working, then narrow down all the annoying security 
holes it will make, eh?

 I really need to get a proof-of-concept working; maybe this weekend if
 my other half gives me an 'allowed to use computer' note for the 
 teacher.

 What would you consider a proof-of-concept? I have my code lurking on 
 some
 machine in cvs if you want to take a look at it.

If my feeble coding skills are up to it :) I've requested a new sf.net 
project, so in a couple of days I should be able to put up my hacky 
bits of code.

Really, I only started programming C with a vengeance about a week ago. 
I'm an old perl hacker, and never felt a need to use C. So fear my
code. Expect apache to segfault. ;)

Nathan.

--
Nathan Ollerenshaw - Systems Engineer - Shared Hosting
ValueCommerce Japan - http://www.valuecommerce.ne.jp

I'm your blubber boy you should rub me
The sun beat me down too viciously
I fell into the ground to what I used to be
I've melted away I'm nothing again






Re: Re: Advanced Mass Hosting Module

2003-03-15 Thread Tony Finch
On Sat, Mar 15, 2003 at 11:14:44AM -0500, Nomentsoa-Tahiry Ramanampanoharana wrote:

 Instead of writing a new module from scratch, it might be
 worth looking at the mod_rewrite RewriteMap internal
 functions first.

I recommend against using mod_rewrite wherever possible. Yes it is
essential in some situations, but frequently you will be better off with
mod_alias and therefore much less hair. Everything that mod_vhost_alias
does can be done with mod_rewrite, but enabling mod_rewrite and therefore
letting customers use it is a scary prospect. With great power comes
great responsibility, and users are far too frequently irresponsible.

Tony.
-- 
f.a.n.finch  [EMAIL PROTECTED]  http://dotat.at/
LOUGH FOYLE TO CARLINGFORD LOUGH: SOUTHEAST 5 OR 6, DECREASING 4 OR 5. FAIR.
GOOD. MODERATE.


Re: Advanced Mass Hosting Module

2003-03-15 Thread Ian Holsman
Tony Finch wrote:
On Sat, Mar 15, 2003 at 11:14:44AM -0500, Nomentsoa-Tahiry Ramanampanoharana wrote:

Instead of writing a new module from scratch, it might be
worth looking at the mod_rewrite RewriteMap internal
functions first.


I recommend against using mod_rewrite wherever possible. Yes it is
essential in some situations, but frequently you will be better off with
mod_alias and therefore much less hair. Everything that mod_vhost_alias
does can be done with mod_rewrite, but enabling mod_rewrite and therefore
letting customers use it is a scary prospect. With great power comes
great responsibility, and users are far too frequently irresponsible.
Tony.

I was looking at the code, and I think we could achieve a lot if we replaced the calls
to the vhost-parsing code at
http://lxr.webperf.org/source.cgi/server/protocol.c#L980 
http://lxr.webperf.org/source.cgi/server/connection.c#L203
with 2 hooks. This would give the added benefit of moving the name-based virtual hosting
into modules/http where it belongs.

We would still have a problem creating the server configs on the fly, but that is a 
smaller problem which the module writer could attempt.


Re: Advanced Mass Hosting Module

2003-03-15 Thread Nathan Ollerenshaw
On Sunday, March 16, 2003, at 12:45 PM, Ian Holsman wrote:

I was looking at the code, and I think we could achieve a lot if we replaced
the calls to the vhost-parsing code at
http://lxr.webperf.org/source.cgi/server/protocol.c#L980 
http://lxr.webperf.org/source.cgi/server/connection.c#L203

with 2 hooks. This would give the added benefit of moving the name-based
virtual hosting into modules/http where it belongs.

We would still have a problem creating the server configs on the fly, but
that is a smaller problem which the module writer could attempt.

Ok. This would probably be a much better solution than what I was 
originally proposing. (I'm not sure I made a very clear proposal, btw).

If we can futz around with the vhosts that apache thinks it has in 
memory, *and* somehow age them and remove them if they don't get 
accessed in a specified amount of time, we could run an awful lot of 
*individually configured* vhosts off a single apache instance and never 
have to gracefully restart it.
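
The ageing idea can be sketched outside Apache as an idle-timeout table: each entry remembers when it was last touched, and a periodic sweep drops the ones that have sat unused for too long. This is a Python illustration under assumed names (IdleExpiringVhosts, max_idle, sweep), not httpd code, and it ignores sharing between child processes.

```python
class IdleExpiringVhosts:
    """Holds vhost configs and forgets those not accessed within max_idle seconds."""

    def __init__(self, max_idle, clock):
        self._max_idle = max_idle
        self._clock = clock      # callable returning seconds; injectable for testing
        self._items = {}         # hostname -> (config, last_access_time)

    def put(self, hostname, config):
        self._items[hostname] = (config, self._clock())

    def get(self, hostname):
        item = self._items.get(hostname)
        if item is None:
            return None
        config, _ = item
        self._items[hostname] = (config, self._clock())  # refresh on access
        return config

    def sweep(self):
        """Drop entries idle for longer than max_idle; return the removed hostnames."""
        now = self._clock()
        stale = [h for h, (_, last) in self._items.items()
                 if now - last > self._max_idle]
        for hostname in stale:
            del self._items[hostname]
        return stale


# Usage with a fake clock so the behaviour is deterministic.
now = [0.0]
vhosts = IdleExpiringVhosts(max_idle=300, clock=lambda: now[0])
vhosts.put("busy.example.com", {"docRoot": "/srv/busy"})
vhosts.put("quiet.example.com", {"docRoot": "/srv/quiet"})
now[0] = 200.0
vhosts.get("busy.example.com")   # touched at t=200
now[0] = 400.0
removed = vhosts.sweep()         # quiet.example.com has been idle for 400s
```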

Which is my ultimate aim, at any rate.

Nathan.

--
Nathan Ollerenshaw - Systems Engineer - Shared Hosting
ValueCommerce Japan - http://www.valuecommerce.ne.jp
In the days, When we were swinging from the trees
I was a monkey, Stealing honey from a swarm of bees
I could taste, I could taste you even then
And I would chase you down the wind


Re: Advanced Mass Hosting Module

2003-03-14 Thread Mads Toftum
On Thu, Mar 13, 2003 at 04:55:19PM -0800, David Burry wrote:
 These are neat ideas.  At a few companies I've worked for we already do
 similar things but we have scripts that generate the httpd.conf files
 and distribute them out to the web servers and gracefully restart.
 Adding a new web server machine to the mix is as simple as adding the
 host name to the distribution script.
 
This only works when you have a limited number of vhosts - if you were
to run thousands of vhosts on each machine, then mod_vhost_alias
(or mod_rewrite) is currently the only way to go. A module like this
could provide a nice compromise between the flexibility of using 
httpd.conf to specify each vhost and the speed of vhost_alias.

vh

Mads Toftum
-- 
`Darn it, who spiked my coffee with water?!' - lwall



Re: Advanced Mass Hosting Module

2003-03-14 Thread David Burry
You and someone else said the same thing.  I currently have a setup where we
run several hundred vhosts (all individually specified) without issue; I'll
have to remember this if it ever grows to thousands.  Thanks.  Lacking
a more powerful vhost-alias type of thing, I'll probably have to vhost-alias
all the standard bare-bones configs and list out the anomalies
separately.
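
The mixed approach (derive the common case from the hostname, list the exceptions explicitly) might look roughly like the following Python sketch. The override table, the /data/web base path, and the directory layout are hypothetical; mod_vhost_alias has its own interpolation syntax, which this does not reproduce.

```python
import re

# Sites that need individual settings (hypothetical data).
OVERRIDES = {
    "shop.example.com": {"docRoot": "/data/special/shop", "cgiStatus": "enabled"},
}

# Dot-separated labels of letters, digits and inner hyphens only, so a
# client-supplied Host value can never smuggle in '/' or a bare '..'.
_HOST_RE = re.compile(
    r"^[a-z0-9]([a-z0-9-]*[a-z0-9])?(\.[a-z0-9]([a-z0-9-]*[a-z0-9])?)*$"
)


def default_docroot(hostname, base="/data/web"):
    """Derive a docroot purely from the hostname."""
    return f"{base}/{hostname}/www"


def resolve(hostname):
    """Return the explicit config if one is listed, else the derived default."""
    host = hostname.lower().rstrip(".")
    if not _HOST_RE.match(host):
        raise ValueError(f"refusing suspicious hostname: {hostname!r}")
    if host in OVERRIDES:
        return OVERRIDES[host]
    return {"docRoot": default_docroot(host)}
```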

Dave





Re: Advanced Mass Hosting Module

2003-03-14 Thread Thomas Eibner

On Thu, Mar 13, 2003 at 08:27:30PM +0900, Nathan Ollerenshaw wrote:
 Resending this to this list as I got no response on users list.
 
 Currently, we are using flat config files generated by our website
 provisioning software to support our mass hosted customers. The reason
 for doing it this way, and not using the mod_vhost_alias module is
 because we need to be able to turn on/off CGI, PHP, Java, shtml etc on
 a per vhost basis. We need the power that having a distinct
 VirtualHost directive for each site gives you.
 
 Is there a better way?

I once started a project to do this from a database, but I eventually stopped
as I couldn't figure out a nice way to enable/disable PHP, CGI, or whatever on
demand. Serving virtualhosts from documentroots you pull out of a database
is no big deal.



Re: Advanced Mass Hosting Module

2003-03-14 Thread Nathan Ollerenshaw
On Saturday, March 15, 2003, at 12:02 AM, Thomas Eibner wrote:

On Thu, Mar 13, 2003 at 08:27:30PM +0900, Nathan Ollerenshaw wrote:
Resending this to this list as I got no response on users list.

Currently, we are using flat config files generated by our website
provisioning software to support our mass hosted customers. The reason
for doing it this way, and not using the mod_vhost_alias module is
because we need to be able to turn on/off CGI, PHP, Java, shtml etc on
a per vhost basis. We need the power that having a distinct
VirtualHost directive for each site gives you.
Is there a better way?
I once started a project to do this from a database, but I eventually stopped
as I couldn't figure out a nice way to enable/disable PHP, CGI, or whatever on
demand. Serving virtualhosts from documentroots you pull out of a database
is no big deal.

I wasn't thinking of anything radical. Just have a hook to set the 
handler for a particular document (if it matches .php or .php4) to the 
PHP module if it's allowed to, and serve it as a normal document if 
not. Etc.
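
The decision logic being described is small enough to sketch. The Python below is an illustration only: the extension table is an assumption, the flag names come from the LDAP entry proposed in this thread, and the handler names are placeholder labels rather than a statement about what any particular PHP or CGI module registers.

```python
import os

# Hypothetical mapping from file extension to (per-vhost flag, handler label).
EXTENSION_RULES = {
    ".php": ("phpStatus", "php-script"),
    ".php4": ("phpStatus", "php-script"),
    ".shtml": ("shtmlStatus", "server-parsed"),
    ".cgi": ("cgiStatus", "cgi-script"),
}


def choose_handler(filename, vhost):
    """Return a handler label, or None to serve the file as a plain document."""
    ext = os.path.splitext(filename)[1].lower()
    rule = EXTENSION_RULES.get(ext)
    if rule is None:
        return None
    flag, handler = rule
    # Only hand the file to the handler if this vhost has the feature enabled.
    return handler if vhost.get(flag) == "enabled" else None
```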

I've not had a great delve in the hooks but nothing has suggested in 
what I've looked at that it's not possible.

I really need to get a proof-of-concept working; maybe this weekend if 
my other half gives me an 'allowed to use computer' note for the teacher.

Nathan.

--
Nathan Ollerenshaw - Systems Engineer - Shared Hosting
ValueCommerce Japan - http://www.valuecommerce.ne.jp
You can't be a Real Country unless you have a BEER and an airline -
it helps if you have some kind of a football team or some nuclear
weapons, but at the very least you need a BEER. - Frank Zappa



Re: Advanced Mass Hosting Module

2003-03-14 Thread Ian Holsman
Thomas Eibner wrote:
On Sat, Mar 15, 2003 at 01:00:18AM +0900, Nathan Ollerenshaw wrote:

On Saturday, March 15, 2003, at 12:02 AM, Thomas Eibner wrote:


On Thu, Mar 13, 2003 at 08:27:30PM +0900, Nathan Ollerenshaw wrote:

Resending this to this list as I got no response on users list.

Currently, we are using flat config files generated by our website
provisioning software to support our mass hosted customers. The reason
for doing it this way, and not using the mod_vhost_alias module is
because we need to be able to turn on/off CGI, PHP, Java, shtml etc on
a per vhost basis. We need the power that having a distinct
VirtualHost directive for each site gives you.
Is there a better way?

I don't know of a specific virtual host hook, but if there isn't one, there might be a need for it.

I guess you need some place that calls your module's hook *before* the server definition 
gets set, lets you run a pre-config/post-config follow-up merge for all the currently 
loaded modules the first time the server name is loaded into memory, and then passes the 
resulting server config down to the rest of the hooks.

This should make it possible to do anything in your module that a plain-text vhost 
could do.

--Ian




Re: Advanced Mass Hosting Module

2003-03-14 Thread Nathan Ollerenshaw
On Saturday, March 15, 2003, at 01:13 AM, Thomas Eibner wrote:
On Sat, Mar 15, 2003 at 01:00:18AM +0900, Nathan Ollerenshaw wrote:
I wasn't thinking of anything radical. Just have a hook to set the
handler for a particular document (if it matches .php or .php4) to the
PHP module if it's allowed to, and serve it as a normal document if
not. Etc.
I've not had a great delve in the hooks but nothing has suggested in
what I've looked at that it's not possible.
I'm not sure if it's as simple as you describe. What is to stop a user
from placing a .htaccess file in a directory giving himself ability to
give the right content type to execute a php script for instance?
If you want suexec to work too, there might be further complications.
(Just thinking out loud here) :)
You bring up a valid point, but I was thinking more of sbox. That's what 
we use currently (because suexec didn't fit our model) and it works 
great. Though, there seems to be a bug where it's poisoning the 
environment ...

At any rate, if I'm interfering around the URI-to-filename translation 
phase first, I should be able to minimise any problems with .htaccess 
files. But, I don't know, I don't fully understand all the phases that 
I can interfere with just yet :)

There are other phases I've not really looked at as well which I could 
hook into to do extra sanity checks, I guess. But, I think, get the 
thing basically working, then narrow down all the annoying security 
holes it will make, eh?

I really need to get a proof-of-concept working; maybe this weekend if
my other half gives me an 'allowed to use computer' note for the 
teacher.
What would you consider a proof-of-concept? I have my code lurking on 
some
machine in cvs if you want to take a look at it.
If my feeble coding skills are up to it :) I've requested a new sf.net 
project, so in a couple of days I should be able to put up my hacky 
bits of code.

Really, I only started programming C with a vengeance about a week ago. 
I'm an old perl hacker, and never felt a need to use C. So fear my 
code. Expect apache to segfault. ;)

Nathan.

--
Nathan Ollerenshaw - Systems Engineer - Shared Hosting
ValueCommerce Japan - http://www.valuecommerce.ne.jp
I'm your blubber boy you should rub me
The sun beat me down too viciously
I fell into the ground to what I used to be
I've melted away I'm nothing again


Re: Advanced Mass Hosting Module

2003-03-13 Thread Tim Nagel
I would also love to see such a module available, and I'm very willing to
contribute in any way I can; however, I'm skill-less in the C arena :(

Good luck.

Tim
- Original Message -
From: Nathan Ollerenshaw [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, March 13, 2003 10:27 PM
Subject: Advanced Mass Hosting Module


 Resending this to this list as I got no response on users list.

 Currently, we are using flat config files generated by our website
 provisioning software to support our mass hosted customers. The reason
 for doing it this way, and not using the mod_vhost_alias module is
 because we need to be able to turn on/off CGI, PHP, Java, shtml etc on
 a per vhost basis. We need the power that having a distinct
 VirtualHost directive for each site gives you.

 Is there a better way?

 What I have in mind is a module that fits in with our current LDAP
 based infrastructure. Currently, LDAP services our mail users, and I
 would like to see the Apache mass hosting configuration held in LDAP as
 well. In this way, we can just scale by adding more apache servers,
 mounting the shared docroot and pointing them to the LDAP server.

 The LDAP entry would look something like this:

 # www.example.com, base
 dn: uid=www.example.com, o=base
 siteGidNumber: 10045
 siteUidNumber: 10045
 objectClass: top
 objectClass: apacheVhost
 serverName: www.example.com
 serverAlias: example.com
 serverAlias: another.example.com
 docRoot: /data/web/04/09/example.com/www
 vhostStatus: enabled
 phpStatus: enabled
 shtmlStatus: enabled
 cgiStatus: enabled
 dataOutSoftLimit: 100 (in bytes per month)
 dataOutHardLimit: 1000
 dataInSoftLimit: 100
 dataInHardLimit: 1000
 dataThrottleRate: 100 (in bits/sec)
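
To show how an entry of this shape could be consumed, here is a small Python sketch that reads "attribute: value" lines into a dict of lists, since attributes such as serverAlias repeat. It is not a full LDIF parser (no continuation lines, no base64 values), and parse_entry is a name invented for the example.

```python
from collections import defaultdict


def parse_entry(text):
    """Parse 'attr: value' lines into a dict of lists; '#' lines are comments."""
    entry = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values may contain colons.
        attr, _, value = line.partition(":")
        entry[attr.strip()].append(value.strip())
    return dict(entry)


SAMPLE = """\
# www.example.com, base
dn: uid=www.example.com, o=base
serverName: www.example.com
serverAlias: example.com
serverAlias: another.example.com
docRoot: /data/web/04/09/example.com/www
phpStatus: enabled
cgiStatus: enabled
"""

entry = parse_entry(SAMPLE)
# Every hostname that should map to this site.
hostnames = entry["serverName"] + entry["serverAlias"]
```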

 Then, as a request came in, the imaginary mod_advanced_masshosting
 module would first check to see if it had the information about the
 domain already cached in memory (to avoid hitting LDAP for every HTTP
 request, which would be a Bad Idea) and then if not, it would grab the
 entry from LDAP, cache it, and service the incoming requests.
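
The check-cache-then-fetch flow in that paragraph is a read-through cache. A minimal per-process Python sketch follows, with a plain dict standing in for the LDAP query; the class and attribute names are assumptions, and nothing here addresses sharing the cache between child processes.

```python
class VhostCache:
    """Per-process read-through cache in front of a slow directory lookup."""

    def __init__(self, fetch):
        self._fetch = fetch       # callable: hostname -> config dict or None
        self._entries = {}
        self.backend_hits = 0     # how often the backend was actually asked

    def get(self, hostname):
        key = hostname.lower()    # hostnames compare case-insensitively
        if key not in self._entries:
            self.backend_hits += 1
            # Misses (None) are stored too, so unknown hosts do not hit
            # the backend on every request.
            self._entries[key] = self._fetch(key)
        return self._entries[key]


# Stand-in for the LDAP query (hypothetical data).
DIRECTORY = {"www.example.com": {"docRoot": "/data/web/04/09/example.com/www"}}

cache = VhostCache(DIRECTORY.get)
first = cache.get("WWW.example.com")    # goes to the backend
second = cache.get("www.example.com")   # served from memory
```

Caching misses as well as hits is a design choice in this sketch; without some expiry, a newly provisioned site would stay invisible to a process that had already cached its absence.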

 The cache itself would need to be shared among the actual child apache
 processes somehow.

 In addition to these features, the module would keep track of the
 amount of data transferred in & out for each vhost and apply a
 soft/hard limit when the limits defined in the LDAP entry were reached.
 The amount of actual data transferred would periodically be written to
 either a GDBM file or even to an LDAP entry (not sure what is best -
 probably LDAP for consistency) and the data would also need to be
 shared among any servers in a cluster somehow.

 This would enable ISPs to bill on a per vhost basis fairly accurately,
 and limit abusive sites.
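
One possible reading of the soft/hard limit idea, sketched in Python: the soft limit only flags the site, the hard limit blocks it. The message does not say what each limit should trigger, so the three states below, like the class and method names, are assumptions.

```python
from dataclasses import dataclass


@dataclass
class TransferQuota:
    soft_limit: int   # bytes; crossing it only flags the site (assumed meaning)
    hard_limit: int   # bytes; crossing it blocks further service (assumed meaning)
    used: int = 0

    def record(self, nbytes):
        """Add transferred bytes and return the resulting state."""
        self.used += nbytes
        return self.state()

    def state(self):
        if self.used >= self.hard_limit:
            return "blocked"
        if self.used >= self.soft_limit:
            return "warned"
        return "ok"


quota = TransferQuota(soft_limit=100, hard_limit=1000)
# Running totals after each transfer: 60, 110, 1010 bytes.
states = [quota.record(n) for n in (60, 50, 900)]
```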

 Now, I've looked around for something like this, and as far as I can
 see, there isn't anything that does vhosting quite like this, except
 for the commercial systems out there such as Zeus.

 Do people think this is a good approach?

 Will another method give me what I want? (LDAP is not a dependency,
 just a nice-to-have)

 Finally, I am thinking about starting an Open Source project to write
 this module. My C is pretty primitive right now, though I have got
 simple LDAP lookup code working already (just not in Apache, yet).

 Would anyone else see this as a worthwhile project for Apache?

 It certainly would solve our problems, but it sometimes feels like I'm
 trying to fix a simple problem with something very heavy - though
 implemented correctly, I don't think performance will be a problem.

 Comments gratefully received :)

 Regards,

 Nathan.

 --
 Nathan Ollerenshaw - Systems Engineer - Shared Hosting
 ValueCommerce Japan - http://www.valuecommerce.ne.jp

 If you think nobody cares if you're alive, try missing a
 couple of car payments.





RE: Advanced Mass Hosting Module

2003-03-13 Thread David Burry
These are neat ideas.  At a few companies I've worked for we already do
similar things but we have scripts that generate the httpd.conf files
and distribute them out to the web servers and gracefully restart.
Adding a new web server machine to the mix is as simple as adding the
host name to the distribution script.

What you're talking about doing sounds like a lot more complexity to
achieve a similar thing, and more complexity means there's a lot more
that can go wrong.  For instance, what are you going to do if the LDAP
server is down, are many not-yet-cached virtual hosts just going to
fail?  In our scenario it's solved simply and easily by the generation
script simply failing and nothing being copied (but at least the web
servers keep working fine with the last config revision, so not many/any
end user web surfers will notice the outage).

Dave




Re: Advanced Mass Hosting Module

2003-03-13 Thread Zac Stevens
On Thu, Mar 13, 2003 at 04:55:19PM -0800, David Burry wrote:
 These are neat ideas.  At a few companies I've worked for we already do
 similar things but we have scripts that generate the httpd.conf files
 and distribute them out to the web servers and gracefully restart.
 Adding a new web server machine to the mix is as simple as adding the
 host name to the distribution script.

I've done the same in the past.  It works fine, but becomes unwieldy when
you're talking about thousands of sites per server.  Graceful restarts also
take a nontrivial amount of time in this environment.

 What you're talking about doing sounds like a lot more complexity to
 achieve a similar thing, and more complexity means there's a lot more
 that can go wrong.  For instance, what are you going to do if the LDAP
 server is down, are many not-yet-cached virtual hosts just going to
 fail?  

Redundant LDAP servers?  Or even pluggable backends - keep a DBM-format
copy on the local filesystem as a backup.  I imagine many people would be
happy with a default vhost specified in the config, which could display an
"Ooops! Something's broken!" page.
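
The fallback chain suggested here (primary directory, then a local copy, then a default vhost) can be sketched in a few lines of Python. The backends are plain callables and every name below, including BackendDown and the default docroot, is hypothetical.

```python
# Served when no backend can supply a config (hypothetical path).
DEFAULT_VHOST = {"docRoot": "/var/www/unavailable", "fallback": True}


class BackendDown(Exception):
    """Raised by a backend that cannot answer at all (as opposed to 'not found')."""


def lookup_with_fallback(hostname, backends, default=DEFAULT_VHOST):
    """Try each backend in order; skip ones that are down; fall back to default."""
    for backend in backends:
        try:
            config = backend(hostname)
        except BackendDown:
            continue              # this source is unreachable, try the next one
        if config is not None:
            return config
    return default


def ldap_backend(hostname):
    # Simulates an unreachable directory server.
    raise BackendDown("directory unreachable")


# Stand-in for a DBM-format copy on the local filesystem.
LOCAL_COPY = {"www.example.com": {"docRoot": "/data/web/04/09/example.com/www"}}

known = lookup_with_fallback("www.example.com", [ldap_backend, LOCAL_COPY.get])
unknown = lookup_with_fallback("new.example.com", [ldap_backend, LOCAL_COPY.get])
```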

In my experience, the 80:20 rule definitely applies here - and I would be
inclined to suggest the ratio is even more severe.  That is, more than 80% 
of the vhosts contribute less than 20% of the load.  While the dynamic 
reconfiguration afforded by this proposal is a big win, I'm more impressed 
with the opportunity to minimise the amount of wasted resources in large
environments.

I'm interested to hear whether this is feasible for development against
2.0, as I don't believe the current architecture allows for plugging in
this sort of functionality as a 3rd-party module.



Zac


Re: Advanced Mass Hosting Module

2003-03-13 Thread Tony Finch
  Resending this to this list as I got no response on users list.

Sorry, I missed the original version of this post.

  Currently, we are using flat config files generated by our website
  provisioning software to support our mass hosted customers. The reason
  for doing it this way, and not using the mod_vhost_alias module is
  because we need to be able to turn on/off CGI, PHP, Java, shtml etc on
  a per vhost basis. We need the power that having a distinct
  VirtualHost directive for each site gives you.
 
  Is there a better way?

The mod_vhost_alias way came from a heritage of very basic web site
provisioning, with little change in architecture since 1996. The
model was abusing the filesystem as a database -- we were using
permissions on users' home directories to record if they had been
barred or had exceeded their quota. We also abused the DNS as a
database, which is where UseCanonicalName DNS came from.

From a more recent perspective this is foolish (or at least naive).

  In addition to these features, the module would keep track of the
  amount of data transferred in & out for each vhost and apply a
  soft/hard limit when the limits defined in the LDAP entry were reached.
  The amount of actual data transferred would periodically be written to
  either a GDBM file or even to an LDAP entry (not sure what is best -
  probably LDAP for consistency) and the data would also need to be
  shared among any servers in a cluster somehow.
  This would enable ISPs to bill on a per vhost basis fairly accurately,
  and limit abusive sites.

This part of it should be separate from the vhosting side of things.
How you provision a web site is independent of how you accumulate stats
on it. It's a logging module, which is naturally separate from a
URI-filename mapping module -- though a proper vhosting module needs
to hook into the DirectoryWalk side of things to do permissions.
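
Treating the accounting as a logging problem could be as simple as summing bytes per vhost from a log. The Python sketch below assumes one "<vhost> <bytes>" pair per line, as a log format along the lines of "%v %b" would produce, with "-" standing for a response without a body. The function name is invented, and this only covers outbound body bytes.

```python
from collections import Counter


def bytes_per_vhost(log_lines):
    """Sum response bytes per vhost from lines of the form '<vhost> <bytes>'."""
    totals = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) != 2:
            continue              # skip malformed lines
        vhost, size = parts
        if size == "-":           # '-' marks a response with no body
            size = "0"
        if not size.isdigit():
            continue              # skip lines whose size is not a number
        totals[vhost.lower()] += int(size)
    return totals
```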

  Will another method give me what I want? (LDAP is not a dependency,
  just a nice-to-have)

Clever application of .htaccess files, directory sections containing
AllowOverride directives, etc. *may* be good enough, but it's a very
blunt tool.

Sounds like you're aiming for something good. Lots of people have asked
me for database-driven mod_vhost_alias (which misses the point, but)
so there is a clear need. Don't worry too much about the project
management side of things -- just write the code and the docs and publish
it, then keep polishing and answering emails.

Tony.
-- 
f.a.n.finch  [EMAIL PROTECTED]  http://dotat.at/
BERWICK ON TWEED TO WHITBY: SOUTHEAST 2 OR 3, INCREASING 4 PERHAPS 5. FAIR.
MODERATE OR GOOD. SLIGHT, INCREASING MODERATE LATER.


Re: Advanced Mass Hosting Module

2003-03-13 Thread Nathan Ollerenshaw
On Friday, March 14, 2003, at 10:15 AM, Zac Stevens wrote:

On Thu, Mar 13, 2003 at 04:55:19PM -0800, David Burry wrote:
These are neat ideas.  At a few companies I've worked for we already do
similar things but we have scripts that generate the httpd.conf files
and distribute them out to the web servers and gracefully restart.
Adding a new web server machine to the mix is as simple as adding the
host name to the distribution script.
I've done the same in the past.  It works fine, but becomes unwieldy when
you're talking about thousands of sites per server.  Graceful restarts also
take a nontrivial amount of time in this environment.

Even a few hundred sites are now taking an inordinate time to do a 
graceful - our config is on NFS, with a separate file for each site - a 
design decision that I am beginning to regret... I did some testing, 
but I didn't account for the fact that I'd be loading the configs over 
NFS. Not great.

What you're talking about doing sounds like a lot more complexity to
achieve a similar thing, and more complexity means there's a lot more
that can go wrong.  For instance, what are you going to do if the LDAP
server is down, are many not-yet-cached virtual hosts just going to
fail?
Redundant LDAP servers?  Or even pluggable backends - keep a DBM-format
copy on the local filesystem as a backup.  I imagine many people would be
happy with a default vhost specified in the config, which could display an
"Ooops! Something's broken!" page.

We use redundancy everywhere; the backend LDAP is no exception to this 
rule.

The main reason for LDAP is because we have a front-end provisioning 
system that creates accounts for FTP and Email in LDAP, it would be 
nice to keep the website configurations in there too, without the 
provisioning system having to write apache config files.

You're right, of course. Some form of graceful failure would be needed, 
but it would probably be a 'Temporarily Unavailable' error with a 
custom error page in Japanese and English (most of our customers are 
Japanese).
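
A minimal sketch of that degradation path, in Python for brevity rather than as module code. It tries the primary backend, falls back to a local copy, and finally to a default site. The names `resolve_docroot`, `primary`, `local_copy` and the `/var/www/unavailable` path are made up for illustration; the local copy is a plain mapping standing in for the DBM file mentioned above.

```python
from typing import Callable, Mapping, Optional

# Hypothetical docroot of the "Temporarily Unavailable" default site.
DEFAULT_DOCROOT = "/var/www/unavailable"


def resolve_docroot(
    host: str,
    primary: Callable[[str], Optional[str]],
    local_copy: Mapping[str, str],
    default: str = DEFAULT_DOCROOT,
) -> str:
    """Return the docroot for host, degrading gracefully if the primary is down."""
    try:
        found = primary(host)
        if found is not None:
            return found
    except ConnectionError:
        # Primary backend (LDAP in this thread) is unreachable:
        # fall through to the local copy.
        pass
    # Also reached when the primary answers but does not know the host.
    return local_copy.get(host, default)
```

With this shape, a host that is in neither the directory nor the local copy ends up on the default site instead of producing a hard failure.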

In my experience, the 80:20 rule definitely applies here - and I would be
inclined to suggest the ratio is even more severe.  That is, more than 80%
of the vhosts contribute less than 20% of the load.  While the dynamic
reconfiguration afforded by this proposal is a big win, I'm more impressed
with the opportunity to minimise the amount of wasted resources in large
environments.
This matches my experience too. It irks me that apache spends a 
large amount of time and memory holding the configuration for a bunch 
of sites that only get hit maybe once a day (when the owner loads the 
page to see if the hit counter has increased - HAH!)

I'm interested to hear whether this is feasible for development against
2.0, as I don't believe the current architecture allows for plugging in
this sort of functionality as a 3rd-party module.
I was looking at implementing it in the URI-to-filename translation 
phase. Any memory malloc'd for an in-memory cache would only be 
accessible by that particular child, but that would not be so bad for a 
v1.0 implementation of the module.

In the future, we might look at shmem or something like that. Even a DB 
file held on a ramdisk might be acceptable (if a little perverse).
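
To make the per-child cache idea concrete, here is a rough sketch in Python (not Apache module code). Each process keeps its own host-to-docroot table with a time-to-live and uses it to map a request URI to a filename. `VhostCache`, the `lookup` callback and the 300 second TTL are all assumptions for illustration, not anything that exists in httpd.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class VhostCache:
    """Per-process cache of host -> docroot; each child keeps its own copy."""

    def __init__(
        self,
        lookup: Callable[[str], Optional[str]],
        ttl: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup  # hypothetical backend query (LDAP in this thread)
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Optional[str]]] = {}

    def docroot(self, host: str) -> Optional[str]:
        now = self._clock()
        hit = self._entries.get(host)
        if hit is not None and hit[0] > now:
            return hit[1]
        # Miss or expired entry: ask the backend and remember the answer,
        # including "unknown host" (None), until the TTL runs out.
        value = self._lookup(host)
        self._entries[host] = (now + self._ttl, value)
        return value

    def translate(self, host: str, uri: str) -> Optional[str]:
        """Map a request to a filesystem path, or None for an unknown host."""
        root = self.docroot(host)
        if root is None:
            return None
        # No path normalisation here; a real module would have to reject "..".
        return root.rstrip("/") + "/" + uri.lstrip("/")
```

Because nothing is shared, every child pays for its own first lookup of a host, which is the trade-off accepted above for a first version.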

Nathan.

--
Nathan Ollerenshaw - Systems Engineer - Shared Hosting
ValueCommerce Japan - http://www.valuecommerce.ne.jp
In the days, When we were swinging from the trees
I was a monkey, Stealing honey from a swarm of bees
I could taste, I could taste you even then
And I would chase you down the wind


Re: Advanced Mass Hosting Module

2003-03-13 Thread Nathan Ollerenshaw
On Friday, March 14, 2003, at 09:00 AM, Tim Nagel wrote:

I would also love to see such a module available, and I'm very willing to
contribute in any way I can; however, I'm skill-less in the C arena :(
Learn C, and you're on the team!

Good luck.

Tim
Nathan.

--
Nathan Ollerenshaw - Systems Engineer - Shared Hosting
ValueCommerce Japan - http://www.valuecommerce.ne.jp
I'm your blubber boy you should rub me
The sun beat me down too viciously
I fell into the ground to what I used to be
I've melted away I'm nothing again


Re: Advanced Mass Hosting Module

2003-03-13 Thread Nathan Ollerenshaw
On Friday, March 14, 2003, at 09:55 AM, David Burry wrote:

These are neat ideas.  At a few companies I've worked for we already do
similar things but we have scripts that generate the httpd.conf files
and distribute them out to the web servers and gracefully restart.
Adding a new web server machine to the mix is as simple as adding the
host name to the distribution script.
Yup. Not too dissimilar to what we use right now. We have a shared NFS 
filesystem mounted on all the apache servers with a single level tree 
of config files, one per domain. Apache just includes the base 
directory.

This sucks, performance-wise. Convenience-wise, it's great.

The NFS server is a High Availability setup, so that's cool. And even if 
I was worried about the NFS going away and the server not being able to 
read its configs, the point is moot - the NFS server also holds the 
docs.

What you're talking about doing sounds like a lot more complexity to
achieve a similar thing, and more complexity means there's a lot more
that can go wrong.  For instance, what are you going to do if the LDAP
Normally, I'd agree. But as was mentioned before, you have to 
load thousands, or if you're really lucky, tens of thousands of virtual 
hosts into your apache daemon. Eventually what happens is the apache 
daemon starts using an inordinate amount of RAM just to load all those 
configurations into memory, and reloading takes an age.

At least with 1.3, I saw massive memory usage when loading 5,000 
virtualhosts in a test. I am not sure about 2.0.

Besides. I don't want to have to keep restarting my apache daemon 
*every time* someone wants to enable/disable php on their site. It 
ruins the uptime! ;)

server is down, are many not-yet-cached virtual hosts just going to
fail?  In our scenario it's solved simply and easily by the generation
script simply failing and nothing being copied (but at least the web
servers keep working fine with the last config revision, so not many/any
end user web surfers will notice the outage).
Have more than one LDAP server :) This is easy to do, LDAP allows for 
it, and as long as the client software is smart (stops trying to use a 
borked LDAP server) you won't even notice the failure of a back-end 
LDAP slave.
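
Roughly what "smart" client behaviour could look like, sketched in Python with the directory query abstracted away. `FailoverClient`, the `query(server, key)` callback and the 30 second cool-off are hypothetical names and values for illustration; no real LDAP library call is shown.

```python
import time
from typing import Callable, Dict, Optional, Sequence


class FailoverClient:
    """Query a list of servers in order, skipping ones that failed recently."""

    def __init__(
        self,
        servers: Sequence[str],
        query: Callable[[str, str], Optional[str]],
        retry_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._servers = list(servers)
        self._query = query  # hypothetical: query(server, key) -> value or None
        self._retry_after = retry_after
        self._clock = clock
        self._down_until: Dict[str, float] = {}

    def lookup(self, key: str) -> Optional[str]:
        now = self._clock()
        for server in self._servers:
            until = self._down_until.get(server)
            if until is not None and until > now:
                continue  # still inside its cool-off window, do not retry yet
            try:
                return self._query(server, key)
            except ConnectionError:
                # Remember the failure so later lookups skip this server.
                self._down_until[server] = now + self._retry_after
        raise ConnectionError("no directory server available")
```

Only when every server is marked down does the lookup fail outright, which is where a default "Temporarily Unavailable" page would take over.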

Besides, LDAP is much-maligned. I've been running LDAP in production 
systems for a long time now, and I've never had one just up and die on 
me.

The ability to store all your configuration data in one place overrides 
the inconvenience of having to manage another set of servers.

Nathan.

--
Nathan Ollerenshaw - Systems Engineer - Shared Hosting
ValueCommerce Japan - http://www.valuecommerce.ne.jp
In the days, When we were swinging from the trees
I was a monkey, Stealing honey from a swarm of bees
I could taste, I could taste you even then
And I would chase you down the wind