As part of my housekeeping routine, I dutifully run "wwwoffle -purge" 
once every week with the untouched default purge configuration in my 
"wwwoffle.conf" file, namely:

,----[ purge.txt ]-
|  use-mtime     = no
| 
|  max-size      = -1
|  min-free      = -1
| 
|  use-url       = no
| 
|  del-dontget   = yes
|  del-dontcache = yes
| 
|  age           = 4w
| 
|  compress-age  = -1
`----

I thought that it worked.  I get an impressive list of URLs flashing 
past with information on how many Kb have been deleted and how much 
room remains on the drive.

However, I have just looked at the contents of /var/spool/wwwoffle/http/
and found that it has grown (over the past year) to 264 MB.

It seems that the default setting, "use-mtime = no", may be to blame.
However, if access time is the controlling factor in selecting
candidates for purging, I fail to understand how aged entries, like
those listed below, whose existence I long ago ceased to be aware of,
can be accessed routinely without any directive from me:


,----[ old_cache.txt ]-
| drwxr-xr-x    2 daemon   daemon       4096 Jul 21  2003 www.upi.com
| drwxr-xr-x    2 daemon   daemon       4096 Jul 21  2003 sunsite.dk
| drwxr-xr-x    2 daemon   daemon       4096 Jul 21  2003 z.about.com
| drwxr-xr-x    2 daemon   daemon       4096 Jul 21  2003 gardening.about.com
| drwxr-xr-x    2 daemon   daemon       4096 Jul 20  2003 canberraauctions.com.au
| drwxr-xr-x    2 daemon   daemon       4096 Jul 20  2003 207.36.6.4
| drwxr-xr-x    2 daemon   daemon       4096 Jul 19  2003 www.mikerubel.org
| drwxr-xr-x    2 daemon   daemon       4096 Jul 18  2003 learn.to
| drwxr-xr-x    2 daemon   daemon       4096 Jul 18  2003 home.worldonline.dk
`----
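
One caveat about the listing above: if it came from plain "ls -l", the
dates shown are modification times, not access times, so it does not
say when these entries were last accessed. A minimal sketch for viewing
both, assuming a POSIX "ls" (the function name "show_times" is made up
for illustration):

```shell
# show_times [DIR]: list DIR twice, first with modification times
# (what plain "ls -l" prints), then with access times ("ls -lu").
# The default DIR is the cache path mentioned in this post.
show_times() {
    dir=${1:-/var/spool/wwwoffle/http}
    echo "== modification time (ls -lt) =="
    ls -lt "$dir"
    echo "== access time (ls -ltu) =="
    ls -ltu "$dir"
}
```

Running "show_times" with no argument lists the wwwoffle http cache;
entries whose access time is recent in the second listing would not be
candidates for purging while "use-mtime = no" is in force.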

So, finally, my questions:

- What was the rationale for basing the default purge configuration on 
  access time rather than modification time? and 
  
- What am I likely to destroy by setting "use-mtime = yes"?


And two suggestions to help with custom tweaks to the purge defaults:

- Redirect the output of the purge command to a file; this can then be 
  inspected at leisure to decide which URLs should be purged and which 
  should be kept; and

- Have a separate "Purge.txt" file, like the current "DontGet.txt" file,
  in which the updated tweaks are entered.
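
The first suggestion can already be approximated with ordinary shell
redirection. A minimal sketch, assuming the purge report is written to
the terminal (stdout or stderr) of the "wwwoffle -purge" command; the
function name "purge_to_log" and the log path are made up for
illustration:

```shell
# purge_to_log LOGFILE [COMMAND...]: run the purge and save its report.
# With no COMMAND given, "wwwoffle -purge" is run. Both stdout and
# stderr are captured, since which stream carries the report is an
# assumption here.
purge_to_log() {
    log=${1:?usage: purge_to_log LOGFILE [COMMAND...]}
    shift
    [ $# -gt 0 ] || set -- wwwoffle -purge
    "$@" > "$log" 2>&1
}
```

For example, "purge_to_log "$HOME/purge.log"" followed by
"less "$HOME/purge.log"" allows the list of purged URLs to be read at
leisure instead of flashing past.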


Felix Karpfen 

-- 
Felix Karpfen
Public Key 72FDF9DF (DH/DSA)


