Albert Reiner <[EMAIL PROTECTED]> writes:

> I just wanted to make sure my understanding is correct: 
> 
> Suppose I am doing a recursive fetch with a high depth, and many of
> those pages are interlinked or contain, e.g., references to the same
> graphics files (logos etc.).  Will wwwoffle be so smart to skip the
> pages it already downloaded in this batch, or will it simply continue
> following links until the depth is exceeded?
> 
> My assumption is that wwwoffle is smart enough not to fetch the same
> file more than once during a single run.

WWWOFFLE applies its normal rules to each URL when deciding whether it
should be downloaded again in the same session.  These are the rules
in the OnlineOptions section of the configuration file.  With the
default settings a file will only be fetched once per session.
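As a sketch of what this looks like in wwwoffle.conf (the option names
below are from memory of WWWOFFLE 2.x and should be checked against
the wwwoffle.conf manual page for your version):

```
OnlineOptions
{
 # Only request a given URL once per session, even if it is
 # requested again while online (the default behaviour).
 request-changed-once = yes

 # Do not re-request a URL that was fetched less than this many
 # seconds ago.
 request-changed = 600
}
```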

While performing a recursive fetch WWWOFFLE places the newly found
URLs as files in the outgoing directory.  The decision whether each
one will be fetched is made at the time that the file is read back in
from the outgoing directory.


> On a related note[*]: Is there any defined relation between the order
> of the output of `wwwoffle-ls outgoing' and the order in which pages
> are fetched?

Yes, wwwoffle-ls lists the files in on-disk directory order.  For the
outgoing directory this is also the order used to decide which URL to
fetch next, so the order shown by wwwoffle-ls is the order in which
they will be fetched.
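To illustrate "on-disk directory order": the point is that the files
are taken in whatever order the filesystem returns them, with no
sorting applied.  A minimal Python sketch of that idea (WWWOFFLE
itself is written in C; the directory and file names here are
hypothetical stand-ins for the outgoing spool):

```python
import os
import tempfile

def outgoing_order(outgoing_dir):
    """Return filenames in raw on-disk directory order (unsorted),
    mirroring how the outgoing directory is walked."""
    return [entry.name for entry in os.scandir(outgoing_dir)]

# Demo against a temporary stand-in for the outgoing directory.
with tempfile.TemporaryDirectory() as d:
    for name in ("O1", "O2", "O3"):  # hypothetical spool file names
        open(os.path.join(d, name), "w").close()
    order = outgoing_order(d)
    print(order)
```

The list printed is the same order a directory listing tool that does
not sort its output would show for that directory.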

-- 
Andrew.
----------------------------------------------------------------------
Andrew M. Bishop                             [EMAIL PROTECTED]
                                      http://www.gedanken.demon.co.uk/

WWWOFFLE users page:
        http://www.gedanken.demon.co.uk/wwwoffle/version-2.9/user.html
