No, why? IMHO we were talking about extending wwwoffle further in the direction of a true web archive. That isn't desired. However, this does not mean that there aren't any archiving functions... in fact, there are, by the very basic design. What I understood was that Andrew just doesn't want to emphasize archiving features in the first place.
Miernik <[EMAIL PROTECTED]>:
| Andrew M. Bishop <[EMAIL PROTECTED]> wrote:
| > I know that quite a few people want WWWOFFLE to be an archiving
| > program, but I am against that. An archiving program does not need
| > the proxying functions but probably works best if it works like
| > www.archive.org.
|
| I disagree here; what I need is a proxy cache with archiving functions.
| That's how I use WWWOFFLE, and I can't imagine how an archiving program
| which is not a web proxy, permanently set in my web browser's proxy
| settings, would be useful. What is useful is an archiving program that
| automatically archives all pages a user looks at in his normal browsing.
| How do you expect something that "works like www.archive.org" to do
| that?
|
| Currently one of the greatest options of WWWOFFLE is
| keep-cache-if-not-found = yes
| I wonder why it isn't enabled by default.
|
| However, it has one big deficiency: if the page we had in the cache is
| replaced by some other page with status 200 (for example with some text
| removed, or some newer news, or an explanation page that the author had
| to remove the content because of some legal problems, etc.), then we lose
| the previous version forever. It would be useful if there was an option
| to instruct WWWOFFLE to always keep a backup copy (for some URLs),
| regardless of whether the new version has status 200 or not.

AFAIR we had very similar proposals some years ago. But I think this is exactly what Andrew refuses to do :) Another issue which makes web archives less attractive is that they have no legal validity. If someone could provide legal validity for the content, maybe even by purely technical means (checksum deposition or something), that would be much more interesting.

Personally, I seem to have preferences similar to yours, and I solved the problem by using two different browsers. If I discover, with Firefox, that page content I would like to preserve is gone, I can look up in the wwwoffle cache whether it is still there.
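For reference, here is a minimal wwwoffle.conf sketch enabling the option quoted above. The exact section name (OnlineOptions) and the brace layout are recalled from memory, not taken from this thread, so check the wwwoffle.conf documentation for your version before copying:

```
# Keep the cached copy when a fetch returns "not found",
# instead of replacing it with the error page.
OnlineOptions
{
 keep-cache-if-not-found = yes
}
```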
That also means I have to decide whether a site is interesting enough to switch browsers: if I discover that some pages are important, I have to open them in the other browser. Since this occurs rarely, it is no big trouble. If it's about a whole site, or an online manual, I usually issue a recursive fetch anyway. A little trade-off is that I don't accumulate waste in the cache anymore. (It would basically work with one browser alone too, if it had a 'quick switch proxy' button. Galeon refused to implement that, years ago; now let's see what Firefox does...)

This makes two more things necessary. First, bookmark files in sync. This was possible with Galeon and Firefox; now I shall look for something else. Maybe start two different instances of Firefox (vmware? xnest?). Second, a simple and fast way to say 'don't purge these pages / this site', because obviously archived versions shouldn't expire. Marking a site 'archived', or releasing it for purging again, should be as easy as setting a bookmark. I solved that with a script (wwwofflebook, available in contrib) but wish it were integrated into the browser. For example, I could include a specific comment tag like <wwwoffle_archived_page> in some bookmark XML field, which says that this page - or the page's domain - shouldn't be purged; wwwoffle would then (optionally) read that information and honour it during the purge action. It would be possible for most browsers if one could customize the file path and provide a correctly working match function. However, I found it easiest to say that any bookmarked page shouldn't expire, since I can set up overriding exceptions in the wwwoffle config anyway, or just do a reload when online.

Well, this is all about extensions, or 'plugins'; perhaps it would be worth a discussion to have a general 'API' for script plugins somehow. This is how Firefox got mighty. And they just include the best contributions into the main code.

° /\/
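The bookmark-tag idea above could be sketched as a small external script, in the spirit of wwwofflebook (whose internals I am not reproducing here). Everything in this sketch is an assumption for illustration: the <wwwoffle_archived_page> marker, the XBEL-style bookmark layout, and the "age = -1" (never purge) url-spec lines for the Purge section of wwwoffle.conf:

```python
# Hypothetical sketch: scan an XBEL-style bookmark file for entries
# whose description carries the <wwwoffle_archived_page> marker, and
# emit purge exceptions for their hosts. The marker name, file layout
# and config syntax are all assumptions, not real wwwoffle features.
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

MARKER = "wwwoffle_archived_page"

def purge_exceptions(xbel_path):
    """Return the sorted hosts of bookmarks tagged with MARKER."""
    hosts = set()
    tree = ET.parse(xbel_path)
    for bm in tree.iter("bookmark"):
        desc = bm.findtext("desc", default="")
        if MARKER in desc:
            host = urlparse(bm.get("href", "")).netloc
            if host:
                hosts.add(host)
    return sorted(hosts)

def purge_config_lines(hosts):
    """Format one (assumed) never-purge url-spec line per host."""
    return [f"<*://{h}/*> age = -1" for h in hosts]
```

One would run this over the browser's bookmark file and paste (or script) the resulting lines into the Purge section of wwwoffle.conf, so marked sites survive every purge.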
