On Fri, Jun 21, 2013 at 7:07 PM, Joe Zhang <smartag...@gmail.com> wrote:

> Sorry, Nutch is certainly aware of page modification, and it does capture
> lastModified.

Nutch does capture the "last modified" field, but I am not sure whether its
value is used later on. I remember that it was not used for any logic in
older versions, but I need to confirm whether the code has since been
changed to take it into account.

> The real question is, can nutch get lastModified of a page
> before fetching, and use it to make fetching decisions (e.g., whether or
> not to override the default interval)?
>

No. Nutch won't look up the lastModified of a page before fetching its
content.
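
To illustrate what that means for your original question, here is a rough
Python sketch of a purely time-based refetch check. This is not Nutch code:
is_due and DEFAULT_INTERVAL are hypothetical names made up for the example,
and the 30 days is just the default interval mentioned in this thread. The
point is only that the decision depends on the last fetch time and the
interval, and the page's lastModified is never consulted beforehand.

```python
from datetime import datetime, timedelta

# Assumption for illustration: 30 days, the default interval discussed
# in this thread (db.fetch.interval.default).
DEFAULT_INTERVAL = timedelta(days=30)


def is_due(last_fetch, now, interval=DEFAULT_INTERVAL):
    """Hypothetical helper: a URL is due once the interval has elapsed.

    Note that no "last modified" value is an input here; a page that
    changed on the server the day after it was fetched is still not due
    until the interval has passed.
    """
    return now >= last_fetch + interval


last_fetch = datetime(2013, 5, 25)

# 27 days after the last fetch: not due, even if the page has changed.
print(is_due(last_fetch, datetime(2013, 6, 21)))

# 30 days after the last fetch: due.
print(is_due(last_fetch, datetime(2013, 6, 24)))
```

So with the default left as is, a modified page would simply wait until its
due time under this kind of check.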

>
>
> On Fri, Jun 21, 2013 at 6:27 PM, Joe Zhang <smartag...@gmail.com> wrote:
>
> > If I don't change the default value of db.fetch.interval.default, which
> > is 30 days, does it mean that the URL in the db won't be refetched
> > before the due time even if it has been modified? In other words, is
> > Nutch aware of page modification?
> >
>
