Hello,

since updating to version 2.7h I have noticed that some web pages are
sometimes not refreshed when I surf online with wwwoffle, i.e. wwwoffle
serves me old versions of the page.

I am having trouble tracking down what actually goes wrong, though I have
some ideas. I've mailed Andrew about this and sent him a lot of logs, but
unfortunately that was not enough to shed light on it. So I want to ask
whether other people have seen something like this, and if so, under which
circumstances.

So this is what I have:

I use wwwoffle on a gateway, not on the machine where the browser runs. I
have a DSL connection, which I often switch on and off manually. Auto-dial-in
is *not* configured. The gateway is configured without forwarding; the
browser must use the proxy and cannot connect directly.

I use Opera 6, and I also see this with Netscape 4.7; both are configured
with a RAM cache but no disk cache.

I get old pages from sites that I use frequently both online and offline,
at URLs without a file name, e.g. http://www.heise.de/ct/ or
http://www.freeciv.org/

The old pages are (of course) served while I'm online; I don't use the
fetch mechanism. Restarting the browser or the connection does not help.

My online options look like this:

OnlineOptions
{
 request-changed       = 60m
 request-changed-once  = no
 request-expired       = no
 request-no-cache      = no
 try-without-password  = yes
 intr-download-keep    = no
 intr-download-size    = 1
 intr-download-percent = 80
 timeout-download-keep = no
 request-compressed-data = no
}

My guess so far is that wwwoffle sends a wrong If-Modified-Since header
to the server, with a date newer than the web page, and the reason might be
that it mixes up the two URLs, e.g. http://www.freeciv.org/ and
http://www.freeciv.org/index.phtml, which are different but represent the
same file.
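
To make this guess concrete, here is a minimal Python sketch of how a
server typically answers a conditional GET. The function name
answer_conditional_get and all dates are hypothetical, and this is not
wwwoffle's or any real server's code. It only shows why an
If-Modified-Since date that is newer than the page's last change would
produce "304 Not Modified" and leave the old copy in the cache.

```python
from datetime import datetime


def answer_conditional_get(last_modified, if_modified_since):
    """Return the HTTP status a server would typically send (toy model)."""
    # The page is resent only if it changed after the date in the request.
    return 200 if last_modified > if_modified_since else 304


# Hypothetical timeline: the cached copy predates the last change on the server.
cached_copy_fetched = datetime(2003, 2, 1, 10, 11, 12)
page_changed = datetime(2003, 2, 5, 8, 0, 0)

# Asking with the time the cached copy was fetched: the change is detected.
status_correct = answer_conditional_get(page_changed, cached_copy_fetched)

# Asking with a date newer than the change (the suspected error):
# the server reports "not modified" and the stale copy stays in the cache.
too_new_date = datetime(2003, 2, 10, 19, 0, 0)
status_wrong = answer_conditional_get(page_changed, too_new_date)

print(status_correct, status_wrong)
```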

Andrew explained this to me as follows:

------------------------------------------------------------------
The date that WWWOFFLE will use is the last date that it checked if
the file needed updating.  If the server used a "Last-Modified" header
then WWWOFFLE would use that all times.

For example:

WWWOFFLE got a URL at 10:11:12 on 1/2/2003

On 2/2/2003 at 12:00:00 you ask for the same URL again then WWWOFFLE
will use the header "If-Modified-Since 10:11:12 1/2/2003".  This is
the time that WWWOFFLE got the file, so it makes sense to ask if it
has changed since then.  If the file has not changed then WWWOFFLE
updates the date of it in the cache.

On 10/2/2003 at 19:00:00 you ask for the same URL again WWWOFFLE will   
use the header "If-Modified-Since 12:00:00 2/2/2003".  The same reason
applies as before.  If the file is not modified then WWWOFFLE will
update the cached file time anyway.

This way WWWOFFLE keeps increasing the date since it doesn't know what
the original date of the file was because there was no "Last-Modified"
header.
---------------------------------------------------------------------
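
The behaviour Andrew describes can be sketched as a small Python model.
This is only my reading of his explanation, not wwwoffle's actual code;
the class CachedPage and its method are hypothetical, and I assume his
dates are written day/month/year.

```python
from datetime import datetime


class CachedPage:
    """Toy model of a cached page whose server sends no Last-Modified."""

    def __init__(self, fetched_at):
        # Date that will be sent as If-Modified-Since on the next request.
        self.checked_at = fetched_at

    def revalidate_unmodified(self, now):
        """Model a request answered with "not modified".

        Returns the date sent in If-Modified-Since, then moves the cached
        date forward to the time of this check.
        """
        sent = self.checked_at
        self.checked_at = now
        return sent


page = CachedPage(datetime(2003, 2, 1, 10, 11, 12))              # first fetch
first = page.revalidate_unmodified(datetime(2003, 2, 2, 12, 0, 0))
second = page.revalidate_unmodified(datetime(2003, 2, 10, 19, 0, 0))

print(first)   # date sent on 2/2/2003
print(second)  # date sent on 10/2/2003
```

In this model the date only ever moves forward, so it stays correct only
as long as every "not modified" answer really refers to the copy held in
the cache.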

So much for the theory. I know for sure that the pages are old because I can
log on to the server, start lynx, and bypass wwwoffle that way.

But each occurrence is a "one shot", and I cannot reproduce it for others :-(

If somebody knows more about this I'll be happy; otherwise I will have to
throw wwwoffle into the bit bucket, because I don't want to live in the
past :-(

Please don't forget to start wwwoffled with full debug logging: wwwoffled -d 6

Regards,

Christian

-- 
Christian Knoke     * * *      http://www.enter.de/~c.knoke/
* * * * * * * * *  Ceterum censeo Microsoft esse dividendum.
