No, why? IMHO we were talking about extending wwwoffle further in the direction of a true web archive. That isn't desired. However, this does not mean that there aren't any archiving functions... in fact, there are, by the very basic design. What I understood was that Andrew just doesn't want to emphasize archiving features in the first place.
Miernik <[EMAIL PROTECTED]>:
| Andrew M. Bishop <[EMAIL PROTECTED]> wrote:
| > I know that quite a few people want WWWOFFLE to be an archiving
| > program, but I am against that. An archiving program does not need
| > the proxying functions but probably works best if it works like
| > www.archive.org.
|
| I disagree here; what I need is a proxy cache with archiving functions.
| That's how I use WWWOFFLE, and I can't imagine how an archiving program
| which is not a web proxy, permanently set in my web browser's proxy
| settings, would be useful. What is useful is an archiving program that
| automatically archives all pages a user looks at in his normal browsing.
| How do you expect something that "works like www.archive.org" to do
| that?
|
| Currently one of the greatest options of WWWOFFLE is
| keep-cache-if-not-found = yes
| I wonder why it isn't enabled by default.
|
| However, it has one big deficiency: if the page we had in the cache is
| replaced by some other page with status 200 (for example with some text
| removed, or some newer news, or an explanation page that the author had
| to remove the content because of some legal problems, etc.), then we lose
| the previous version forever. It would be useful if there was an option
| to instruct WWWOFFLE to always keep a backup copy (for some URLs),
| regardless of whether the new version has status 200 or not.

AFAIR we had very similar proposals some years ago. But I think this is exactly what Andrew refuses to do :) Another issue which makes web archives less attractive is that they have no legal validity. If someone could provide legal validity for the content, maybe even by purely technical means (checksum deposition or something), that would be much more interesting.

Personally, I seem to have preferences similar to yours, and I solved the problem by using two different browsers. If I discover, with Firefox, that page content I would like to preserve is gone, I can look up in the wwwoffle cache whether it is still there.
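For reference, here is a minimal wwwoffle.conf sketch enabling the option quoted above. The exact section name (OnlineOptions) and the brace layout are recalled from memory, not taken from this thread, so check the wwwoffle.conf documentation for your version before copying:

```
# Keep the cached copy when a fetch returns "not found",
# instead of replacing it with the error page.
OnlineOptions
{
 keep-cache-if-not-found = yes
}
```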
That also means I have to decide whether a site is interesting enough to switch browsers: if I discover that some pages are important, I have to open them in the other browser. Since this occurs rarely, it is no big trouble. If it's about a whole site, or an online manual, I usually issue a recursive fetch anyway. A little trade-off is that I don't accumulate waste in the cache anymore. (It would basically work with one browser alone too, if it had a 'quick switch proxy' button. Galeon refused to implement that, years ago; now let's see what Firefox does...)

This makes two more things necessary. First, bookmark files in sync. This was possible with Galeon and Firefox; now I shall look for something else. Maybe start two different instances of Firefox (vmware? xnest?). Second, a simple and fast way to say 'don't purge these pages / this site', because obviously archived versions shouldn't expire. Marking a site 'archived', or releasing it for purging again, should be as easy as setting a bookmark. I solved that with a script (wwwofflebook, available in contrib) but wish it were integrated into the browser. For example, I could include a specific comment tag like <wwwoffle_archived_page> in some bookmark XML field, which says that this page - or the page's domain - shouldn't be purged; wwwoffle would then (optionally) read that information and honour it during the purge action. It would be possible for most browsers if one could customize the file path and provide a correctly working match function. However, I found it easiest to say that any bookmarked page shouldn't expire, since I can set up overriding exceptions in the wwwoffle config anyway, or just do a reload when online.

Well, this is all about extensions, or 'plugins'; perhaps it would be worth a discussion to have a general 'API' for script plugins somehow. This is how Firefox got mighty. And they just include the best contributions into the main code.

° /\/
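The bookmark-tag idea above could be sketched as a small external script, in the spirit of wwwofflebook (whose internals I am not reproducing here). Everything in this sketch is an assumption for illustration: the <wwwoffle_archived_page> marker, the XBEL-style bookmark layout, and the "age = -1" (never purge) url-spec lines for the Purge section of wwwoffle.conf:

```python
# Hypothetical sketch: scan an XBEL-style bookmark file for entries
# whose description carries the <wwwoffle_archived_page> marker, and
# emit purge exceptions for their hosts. The marker name, file layout
# and config syntax are all assumptions, not real wwwoffle features.
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

MARKER = "wwwoffle_archived_page"

def purge_exceptions(xbel_path):
    """Return the sorted hosts of bookmarks tagged with MARKER."""
    hosts = set()
    tree = ET.parse(xbel_path)
    for bm in tree.iter("bookmark"):
        desc = bm.findtext("desc", default="")
        if MARKER in desc:
            host = urlparse(bm.get("href", "")).netloc
            if host:
                hosts.add(host)
    return sorted(hosts)

def purge_config_lines(hosts):
    """Format one (assumed) never-purge url-spec line per host."""
    return [f"<*://{h}/*> age = -1" for h in hosts]
```

One would run this over the browser's bookmark file and paste (or script) the resulting lines into the Purge section of wwwoffle.conf, so marked sites survive every purge.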
