Miernik <[EMAIL PROTECTED]> writes:
> Andrew M. Bishop <[EMAIL PROTECTED]> wrote:
> > I know that quite a few people want WWWOFFLE to be an archiving
> > program, but I am against that. An archiving program does not need
> > the proxying functions but probably works best if it works like
> > www.archive.org.
>
> I disagree here. What I need is a proxy cache with archiving functions.
> That's how I use WWWOFFLE, and I can't imagine how an archiving program
> that is not a web proxy, always present in my web browser's proxy
> settings, would be useful. What is useful is an archiving program that
> automatically archives all pages a user looks at in his normal browsing.
> How do you expect something that "works like www.archive.org" to do
> that?

You can continue to use WWWOFFLE as a proxy to keep a copy of
everything. Each time you go offline, you feed the WWWOFFLE cache to
this new archiving program so that it can archive the pages that
changed that day. That way there can be two separate programs, one
for proxying and one for archiving. The obvious drawback is that you
don't get a copy of every version of a page that changed while you
were online; you only get the last one.

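
A minimal sketch of that handoff, not an existing tool. It assumes the
cache can be treated as a plain directory tree of files, which is a
simplification of the real WWWOFFLE spool layout, and every name in it
(archive_changed, cache_dir, archive_dir) is made up for illustration.
A file is copied into a dated snapshot directory only when its content
differs from the newest snapshot already archived.

```python
import hashlib
import shutil
from datetime import date
from pathlib import Path
from typing import List, Optional


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's content."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def archive_changed(cache_dir: Path, archive_dir: Path,
                    day: Optional[str] = None) -> List[Path]:
    """Copy cache files that differ from their newest archived snapshot.

    Snapshots are stored as archive_dir/<day>/<path relative to cache_dir>.
    Returns the list of snapshot files written by this run.
    """
    day = day or date.today().isoformat()
    copied: List[Path] = []
    for src in sorted(p for p in cache_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(cache_dir)
        # ISO dates sort chronologically, so the last entry is the newest.
        snapshots = []
        if archive_dir.is_dir():
            snapshots = sorted(d / rel for d in archive_dir.iterdir()
                               if d.is_dir() and (d / rel).is_file())
        if snapshots and file_digest(snapshots[-1]) == file_digest(src):
            continue  # unchanged since the last archived copy
        dest = archive_dir / day / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        copied.append(dest)
    return copied
```

Run once per offline transition, this keeps at most one snapshot per
file per day, which matches the drawback above: intermediate versions
fetched during a single online session are not captured.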
> Currently one of the greatest options of WWWOFFLE is
> keep-cache-if-not-found = yes
> I wonder why it's no by default.

Sometimes the new page is more useful than the old one. It might tell
you where the pages have gone, or why they were moved.
The default is not to be an archiving program, but to keep a copy of
what you have visited. In this case you have visited a 404 error page.

> However it has one big deficiency: if the page we had in the cache is
> replaced by some other page with status 200 (for example with some text
> removed, or some newer news, or an explanation page that the author had
> to remove the content because of some legal problems, etc), then we lose
> the previous version forever. It would be useful if there was an option
> to instruct WWWOFFLE to always keep a backup copy (for some URLs),
> regardless of whether the new version is status 200 or not.
>
> Ideally it would keep all old versions (or a configurable number for
> each URL-SPEC) if they differ from each other, and allow diffs between
> them.

This is exactly the function that I don't want to add to WWWOFFLE.
You need an archiving program for that; it doesn't have to be part of
WWWOFFLE.
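
As a rough illustration of what such a separate program might keep,
here is a sketch of the requested behaviour: a bounded number of
versions per URL, stored only when a version differs from the one
before it, with diffs between them. Nothing here is WWWOFFLE code; the
VersionStore class and its methods are hypothetical, and it holds the
page bodies in memory purely to keep the example short.

```python
import difflib
from collections import defaultdict, deque
from typing import Deque, Dict, List


class VersionStore:
    """Keep up to max_versions distinct consecutive bodies per URL."""

    def __init__(self, max_versions: int = 5) -> None:
        self.max_versions = max_versions
        # deque(maxlen=...) silently drops the oldest version when full.
        self._versions: Dict[str, Deque[str]] = defaultdict(
            lambda: deque(maxlen=max_versions))

    def add(self, url: str, body: str) -> bool:
        """Store body unless it equals the latest version; report if stored."""
        history = self._versions[url]
        if history and history[-1] == body:
            return False
        history.append(body)
        return True

    def versions(self, url: str) -> List[str]:
        """Return the stored versions for url, oldest first."""
        return list(self._versions.get(url, ()))

    def diff(self, url: str) -> str:
        """Unified diff between the two newest versions ('' if fewer than two)."""
        history = self.versions(url)
        if len(history) < 2:
            return ""
        lines = difflib.unified_diff(
            history[-2].splitlines(), history[-1].splitlines(),
            fromfile="previous", tofile="latest", lineterm="")
        return "\n".join(lines)
```

Only the latest stored version is compared against, so a page that
flips back to an older text is stored again as a new version.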
--
Andrew.
----------------------------------------------------------------------
Andrew M. Bishop [EMAIL PROTECTED]
http://www.gedanken.demon.co.uk/
WWWOFFLE users page:
http://www.gedanken.demon.co.uk/wwwoffle/version-2.8/user.html