Paul Slootman <[EMAIL PROTECTED]> writes:

> please take a look at the following Debian bug report.
> I've written a few comments at the end.
> (Please preserve [EMAIL PROTECTED] in the CC: list when
> responding, so that your responses can be tracked by the Debian BTS.)
> 
> On Sat 25 Nov 2006, Tim Connors wrote:
> > Subject: Bug#400329: wwwoffle: lock-files in concurrent downloading broken 
> > either way
> > From: Tim Connors <[EMAIL PROTECTED]>
> > To: Debian Bug Tracking System <[EMAIL PROTECTED]>
> 
> > Package: wwwoffle
> > Version: 2.9-2
> > Severity: grave
> > Justification: causes non-serious data loss
> > 
> > wwwoffle has the setting:
> > # lock-files = yes | no
> > #         Enable the use of lock files to stop more than one WWWOFFLE
> > #         process from downloading the same URL at the same time (default=no).
> > 
> > Either way this is set, it is broken.  
> > 
> > If set to yes, it seems that a lockfile only manages to tell the
> > second process to give up loading the page at all, giving back a HTTP
> > 500 WWWOFFLE Server Error:
> > 
> > for i in `seq 1 10 ` ;do lynx -dump http://www.google.com.au & done

> >      _________________________________________________________________
> > 
> >                            WWWOFFLE Server Error
> > 
> >                The WWWOFFLE server encountered a fatal error:
> > 
> >                  Cannot open the spooled web page to read.
> >             The program cannot continue to service this request.

> >      _________________________________________________________________

You should not be getting this error message.  You should get this
error instead:

      _________________________________________________________________

                            WWWOFFLE File Locked

                            Your request for URL

                                  <<<URL>>>
           is already being modified by another WWWOFFLE server.

Help

   The page that you have requested is being modified by another server
   and you cannot currently access it. Reloading this page will wait for
   the other server to finish making modifications.

   To ensure that only one WWWOFFLE server modifies each cached file at a
   time a lock file is used. While one server is modifying the cached
   file the lock file exists so that other servers know about this. Until
   the first server has finished the cached file is not valid and cannot
   be accessed by another server.

   If you see this error message all of the time even when offline then
   it is possible that the lock file exists but there is no server
   modifying the page. This can only happen when the WWWOFFLE server that
   was modifying the page does not exit properly. It is important that
   you make sure that you allow the WWWOFFLE servers to finish and that
   you do not kill them or shut down the machine while they are still
   modifying a page. To remove the lock file it is necessary to purge the
   WWWOFFLE cache since this will clean up any bad lock files.

      _________________________________________________________________

The purpose of this page is to explain exactly what has happened.  The
system is not broken, just busy.  If it happens all the time then
something might indeed be broken, and the page even explains how to
solve that.

This error page should not appear until a timeout of one sixth of the
socket timeout option in the configuration file has elapsed (so 10
seconds for a 60-second socket timeout).

The purpose of the lockfile is to stop the same file from being
downloaded many times.  One of the key features of WWWOFFLE is its
ability to reduce the number of bytes downloaded, and to do this you
need a method to ensure that multiple simultaneous downloads of the
same file do not occur.
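The idea behind such lock files is the atomic exclusive-create idiom.  A
minimal sketch (illustrative names only, not WWWOFFLE's actual C code):

```python
import os

def try_lock(lockfile):
    """Try to create the lock file atomically.

    O_CREAT | O_EXCL guarantees that exactly one process succeeds,
    even when several request the same URL at the same moment.
    Returns True if we now hold the lock, False if another
    process already holds it.
    """
    try:
        fd = os.open(lockfile, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        return True
    except FileExistsError:
        return False

def unlock(lockfile):
    """Remove the lock file once the download is complete."""
    os.remove(lockfile)
```

Whichever server wins the O_EXCL race downloads the page; the losers can
serve the cached copy once the lock disappears.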


> > If set to no, then the first process to hit the cache gets their
> > download, but the second process only retrieves as much as had reached
> > the first process at that time.  So the second download ends up
> > incomplete.  No error is returned, so it may not even be apparent
> > until a later date -- hence dataloss.

> I've responded that it's not a grave loss of data, as that's what the
> option is for; say "yes" if you want to prevent that.

I agree; the lock files gave some people problems, so I made them
optional, at the risk of broken files when they are disabled.  You have
a choice.


> That said, the error you get when a page is indeed locked, is a bit
> unexpected: "This is an internal WWWOFFLE server error that should not
> occur."  It's after all a documented option...
> IMHO the second process should wait for the completion of the first
> download process before proceeding; giving an "500 WWWOFFLE Server
> Error" is not the right thing to do here...

There is a risk that a real bug in WWWOFFLE causes the lockfile not to
be deleted when it should be.  In that case, if the second process
waited indefinitely for the lockfile it would wait forever; eventually
all servers would be waiting for the same lockfile and nobody would
ever delete it.  This is why there is a timeout before showing the
special lockfile error message.
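That bounded wait, using the one-sixth-of-socket-timeout figure from
above, can be sketched like this (a simplified illustration, not
WWWOFFLE's actual implementation):

```python
import os
import time

def wait_for_lock(lockfile, socket_timeout=60):
    """Wait for another process's lock file to disappear, but only
    for a bounded time (one sixth of the socket timeout), so a
    stale lock file can never hang us forever.

    Returns True if the lock cleared and the cached file can be
    read, False if we timed out and should show the 'File Locked'
    error page instead.
    """
    deadline = time.monotonic() + socket_timeout / 6
    while os.path.exists(lockfile):
        if time.monotonic() >= deadline:
            return False      # stale or slow lock: give up and report it
        time.sleep(0.1)       # simple polling; real code could be smarter
    return True
```

A stale lock (left behind by a killed server) makes this return False
after the timeout, which is exactly when the error page is shown and the
user is told how to clean up.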

-- 
Andrew.
----------------------------------------------------------------------
Andrew M. Bishop                             [EMAIL PROTECTED]
                                      http://www.gedanken.demon.co.uk/

WWWOFFLE users page:
        http://www.gedanken.demon.co.uk/wwwoffle/version-2.9/user.html

