To add a comment to this (quite old) message:

On Wed, 11 May 2005, Andrew M. Bishop wrote:

> "Paul A. Rombouts" <[EMAIL PROTECTED]> writes:
> 
> > In the past few weeks I have been experimenting with a design where the 
> > URLs of
> > the cached webpages are stored in a single compact database file called
> > "urlhashtable". This file is mmapped to an area of address space that is 
> > shared
> > between all WWWOFFLE processes.
> 
> I try and keep WWWOFFLE simple, which is a good and often recommended
> way of writing software.  It is a method that tends to produce robust
> software.
> 
> I like the ability to be able to keep all of the files relating to one
> host together in one directory.  I can delete any or all of the files
> (it is best if I delete the matching U* file for each D* I delete) for
> a host and the program keeps on working.  I can copy the host
> directory (or files from a host directory) between machines without
> needing to worry about WWWOFFLE failing to work.  I don't even need to
> tell WWWOFFLE that I have done this, it will work it out for itself.

I make heavy use of this to scoop pages for schools which are never
online. They pass their wget requests to another wwwoffle installation
which is online; it creates a dynamic wwwoffle instance, pulls the
requests through it, and then packs up the site directories.

These packed up directories are passed by USB stick to the offline
school. I very much appreciate the ability to just dump fresh site
directories, or updated old directories, and have it 'just work'.
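The pack-up step can be sketched roughly as below. This is only an illustration, not the actual script used: the spool path, host name, and file names are assumptions based on wwwoffle's per-host directory layout (a directory per host under the spool, holding paired D* data and U* URL files).

```shell
#!/bin/sh
# Hedged sketch: pack one host's cached directory so it can be
# carried by USB stick to an offline wwwoffle installation.
# SPOOL and HOST are illustrative defaults, not real paths.
SPOOL="${SPOOL:-/tmp/wwwoffle-demo}"
HOST="${HOST:-www.example.org}"

# Demo setup: fake a cached host directory with a D*/U* file pair
# (in real use these already exist under the online wwwoffle spool).
mkdir -p "$SPOOL/http/$HOST"
echo "page data"          > "$SPOOL/http/$HOST/D_abc"
echo "http://$HOST/"      > "$SPOOL/http/$HOST/U_abc"

# Pack the whole host directory; unpacking the tarball under the
# offline machine's spool is enough -- wwwoffle works out the rest
# by itself, with no index to rebuild or notify.
tar -C "$SPOOL/http" -czf "/tmp/$HOST.tar.gz" "$HOST"

# Offline side would then run, e.g.:
#   tar -C "$SPOOL/http" -xzf "/tmp/$HOST.tar.gz"
```

The key point is that the per-host directory is the whole unit of transfer; nothing outside it needs to be kept consistent.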

The urlhashtable would break that ability.

My 2c ..

Cheers,    Andy!
