Felix Karpfen <[EMAIL PROTECTED]> writes:
> Dan Jacobson wrote:
> > To see what files have been fetched repeatedly recently, try
> > cd /var/cache/wwwoffle && find *time* -name U\*|xargs sed :|sort|uniq -c|sort -nr
>
> OK - did that; a relevant extract is attached.
>
> Some of the attached entries are monitored URLs, where the content of
> the page is known to change each day. Others (mainly .gif entries) are
> unsolicited additions to entries that had been flagged for monitoring
> and that <may|may not> be changed; the latter are only a small fraction
> of the URLs that WWWOFFLE lists daily during the download with the
> comment "Unchanged; not fetched" (or words to that effect).
>
> So is there a remedy or is it just an opportunity for teeth-gnashing?
Dan's script gives people a way to fine-tune their WWWOFFLE
configuration and reduce bandwidth usage.
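For anyone wondering what the pipeline actually does, here is a minimal
sketch run against synthetic data (the temporary directory and URLs are
made up; the real cache lives under /var/cache/wwwoffle, where each U*
file in the *time* directories holds one URL):

```shell
# Build a fake cache directory: three U* files, two holding the same URL.
tmp=$(mktemp -d)
printf 'http://example.com/a.gif\n'  > "$tmp/U1"
printf 'http://example.com/a.gif\n'  > "$tmp/U2"
printf 'http://example.com/b.html\n' > "$tmp/U3"
# "sed ''" (an empty script) prints each file unchanged, the same effect
# as the "sed :" in Dan's command; sort|uniq -c|sort -nr then counts the
# URLs and ranks them by how often they were fetched.
(cd "$tmp" && find . -name 'U*' | xargs sed '' | sort | uniq -c | sort -nr)
```

The most frequently fetched URLs come out on top, so anything being
re-fetched every day stands out immediately.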
When I tried the script there were no GIFs being fetched daily; the
only URLs that appeared 8 times were the ones that were monitored.
The reason is that my configuration file contains options that stop
certain URLs from being re-checked too often.
As an example, part of my wwwoffle.conf is below.
OnlineOptions
{
 <*://*zdnet.*/*/graphics/*> request-changed = 4w
 <*://*zdnet.*/anchordesk/images/*> request-changed = 4w
 <*://*zdnet.*/clear/*> request-changed = 4w
 <*://*zdnet.*/graphics/*> request-changed = 4w
 <*://images*.slashdot.org/*> request-changed = 4w
}
This stops the specified URLs from being re-fetched within 4 weeks of
the previous fetch.  If I monitor any pages from these sites, only the
monitored page itself is updated; the matching images are not.
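In Felix's case, where the repeat offenders are mostly GIFs dragged in
by monitored pages, a similar rule keyed to the hosts that show up in
the script's output should help.  A hypothetical example (the host name
is made up; substitute the sites from your own list):

```
OnlineOptions
{
 <*://images.example.com/*.gif> request-changed = 4w
}
```

Monitored pages are still re-fetched on their normal schedule; the
request-changed option only delays re-checking of the URLs that match
the pattern.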
--
Andrew.
----------------------------------------------------------------------
Andrew M. Bishop [EMAIL PROTECTED]
http://www.gedanken.demon.co.uk/
WWWOFFLE users page:
http://www.gedanken.demon.co.uk/wwwoffle/version-2.7/user.html