On Thu, Jul 12, 2001 at 10:00:57AM -0500, Haim Dimermanas wrote:
> > any script i need to write can just open the virtual-hosts.conf file
> > and parse it (it's a single line, colon-delimited format) to find
> > out everything it needs to know about every virtual host.
>  I used to do it that way and then I discovered something called a
>  database.

i've considered using postgres for this but am resisting it until the
advantages greatly outweigh the disadvantages.

why complicate a simple job with a database? plain text configuration is
perfect for a task of this size.
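
for illustration, a minimal sketch of the "open the file and parse it" approach. the real field layout of virtual-hosts.conf isn't given here, so the three column names below (domain, owner, docroot) and the sample lines are assumptions made up for the example:

```python
# sketch only: FIELDS is a hypothetical layout, not the real file format
FIELDS = ("domain", "owner", "docroot")

def parse_vhosts(lines):
    """Return one dict per virtual host, skipping blank and comment lines."""
    hosts = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # a commented-out host is simply ignored
        hosts.append(dict(zip(FIELDS, line.split(":"))))
    return hosts

def load_vhosts(path="virtual-hosts.conf"):
    """Parse the config file at `path` (one colon-delimited host per line)."""
    with open(path) as fh:
        return parse_vhosts(fh)

# made-up sample data, one host per line
sample = [
    "# hosts for the example box",
    "example.com:alice:/var/www/example.com",
    "#example.org:bob:/var/www/example.org",
    "example.net:carol:/var/www/example.net",
]
hosts = parse_vhosts(sample)
print([h["domain"] for h in hosts])
```

any script that needs the vhost list can call load_vhosts() and loop over the result.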

it takes a lot longer to edit a database entry than it does to edit a
text file with vi.

i'd lose the ability to check-in all changes to RCS if i used a database
instead of a text file.

to get these features, i'd have to write a wrapper script to dump the
config database to a text file, run vi, and then import the database
from the edited file. that still wouldn't get around the fact that you
can put comments in text files - you can't in databases.
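
a rough sketch of what such a wrapper might look like. to keep it self-contained it uses sqlite3 from the python standard library as a stand-in for postgres, and the table name (vhosts) and its three columns are hypothetical. note that any comment lines typed in the editor are thrown away on import, which is exactly the limitation described above:

```python
import os
import sqlite3
import subprocess
import tempfile

def dump(conn):
    """Render the (hypothetical) vhosts table as colon-delimited text."""
    rows = conn.execute(
        "SELECT domain, owner, docroot FROM vhosts ORDER BY domain")
    return "".join(":".join(row) + "\n" for row in rows)

def load(conn, text):
    """Replace the table contents with the edited text, dropping comments."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            rows.append(line.split(":"))
    with conn:  # single transaction: rolled back if any insert fails
        conn.execute("DELETE FROM vhosts")
        conn.executemany("INSERT INTO vhosts VALUES (?, ?, ?)", rows)

def edit(conn, editor=("vi",)):
    """Dump to a temp file, run the editor on it, then import the result."""
    with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as tmp:
        tmp.write(dump(conn))
    try:
        subprocess.run([*editor, tmp.name], check=True)
        with open(tmp.name) as fh:
            load(conn, fh.read())
    finally:
        os.unlink(tmp.name)

# round trip without the interactive step, using made-up data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vhosts (domain TEXT, owner TEXT, docroot TEXT)")
load(conn, "# this comment is lost\nexample.net:carol:/srv/net\n"
           "example.com:alice:/srv/com\n")
print(dump(conn), end="")
```

that is a fair amount of code just to get back what "vi virtual-hosts.conf" already provides.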

in short: databases are appropriate for some tasks, but not all.

> It makes it a lot easier to delete an entry and prevent duplicates.

huh? it takes no time at all to run "vi virtual-hosts.conf" and comment
out or delete a line.

> > i need to split up the log files so that each virtual domain can
> > download their raw access logs at any time. having separate error
> > log files is necessary for debugging scripts too (and preserving
> > privacy - don't want user A having access to user B's error logs).
>  I strongly suggest you invest some time looking into a
> way to put the access log into a database. Something like
> http://freshmeat.net/projects/apachedb/.

i wrote my own code a year ago to store logs in postgres (mysql is a
toy). it had its uses but i decided it was a waste of disk space and
it made archiving old logs a pain. it greatly complicated the task of
allowing users to download their log files.

i went back to log files.

i'm a strong believer in the KISS principle, and see no need to add
unnecessary complication, especially for such little benefit.

> My research showed that web hosting customers don't look at their
> stats every day. Even if they did, your stats are generated
> daily. Having the logs in a database allows you to generate the stats
> on the fly. Now with a simple caching system that keeps the stats
> until midnight, you can save yourself a lot of machine power.

not relevant.

1. my customers want raw log files. the fact that i run webalizer
for them is a nice bonus, but what they insist on having is the raw
logs downloadable by ftp whenever they want (within a time limit -
we don't keep old logs forever). that's fine by me - stats are their

2. cpu usage is basically irrelevant on a machine which is I/O bound.

3. caching the stats pages defeats the purpose of generating them on the
fly.

4. generating stats on the fly is more expensive CPU and I/O wise than
running webalizer once/night and generating static html stats pages.

5. adding more boxes to the web farm is pretty easy with a properly
designed load-balancer system.

> > the only trouble is that means at least 2 log files open per vhost
> > per apache process...on one of my machines, that means 344 log files
> > open per process, * 50 processes (average) = 17,200 log files open.
>  Read http://httpd.apache.org/docs/vhosts/fd-limits.html

i read it years ago. i'm fully aware of the issues regarding
file-descriptor limits.

> > that obviously is not very scalable.
>  That's a nice way to put it. Another way to put it would be "it's not
> gonna work".

no. it does work. it's working right now, with that many log files open.

it's not scalable. looking at current growth patterns, i reckon i've got
a few months to come up with a long-term solution before it becomes a
serious problem.
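
the arithmetic behind those numbers, as a small sketch. 172 vhosts is inferred from 344 log files at exactly 2 per vhost, and the 1024 per-process descriptor limit is only an assumed figure for illustration (the real limit depends on the OS and how it is configured):

```python
def open_log_files(vhosts, logs_per_vhost=2, processes=50):
    """Total log file descriptors held open across all apache processes."""
    return vhosts * logs_per_vhost * processes

def max_vhosts(fd_limit, logs_per_vhost=2, reserved=0):
    """Vhosts that fit under a per-process fd limit.

    `reserved` is the number of descriptors set aside for everything that
    isn't a log file (sockets, libraries and so on); the default of 0 is a
    simplification.
    """
    return (fd_limit - reserved) // logs_per_vhost

vhosts = 344 // 2                 # 172 vhosts, assuming exactly 2 logs each
per_process = vhosts * 2          # 344 log files open in each process
total = open_log_files(vhosts)    # 344 * 50 = 17200 across the machine
headroom = max_vhosts(1024) - vhosts  # assumed limit of 1024 fds per process

print(per_process, total, headroom)
```

the per-process count grows linearly with the number of vhosts, so the room left under any fixed limit shrinks with every new customer.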


craig sanders <[EMAIL PROTECTED]>

Fabricati Diem, PVNC.
 -- motto of the Ankh-Morpork City Watch
