Flag for generate to fetch only new pages to complement the -refetchonly flag
-----------------------------------------------------------------------------

         Key: NUTCH-49
         URL: http://issues.apache.org/jira/browse/NUTCH-49
     Project: Nutch
        Type: New Feature
  Components: fetcher  
    Reporter: Luke Baker
    Priority: Minor
 Attachments: fetchnewonly.patch

It would be useful, especially for research/testing purposes, to have a flag 
for the FetchListTool that ensures the fetchlist includes only URLs that have 
not already been fetched (according to the webdb from which the fetchlist is 
being generated).
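
As a rough illustration of the requested behaviour (not the contents of the attached patch), the selection could look something like the sketch below. The PageEntry class, its alreadyFetched field, and the selectNewOnly method are hypothetical stand-ins invented for this example; they are not the actual webdb or FetchListTool API.

```java
import java.util.ArrayList;
import java.util.List;

public class FetchNewOnlySketch {

    // Hypothetical stand-in for a webdb page record; not the real Nutch class.
    public static final class PageEntry {
        public final String url;
        public final boolean alreadyFetched; // assumed to be derivable from the webdb

        public PageEntry(String url, boolean alreadyFetched) {
            this.url = url;
            this.alreadyFetched = alreadyFetched;
        }
    }

    // Build a fetchlist that contains only URLs never fetched before.
    public static List<String> selectNewOnly(List<PageEntry> webdbEntries) {
        List<String> fetchlist = new ArrayList<>();
        for (PageEntry entry : webdbEntries) {
            if (!entry.alreadyFetched) {
                fetchlist.add(entry.url);
            }
        }
        return fetchlist;
    }

    public static void main(String[] args) {
        List<PageEntry> entries = new ArrayList<>();
        entries.add(new PageEntry("http://example.com/old", true));
        entries.add(new PageEntry("http://example.com/new", false));
        // Only the URL that has not been fetched yet is printed.
        for (String url : selectNewOnly(entries)) {
            System.out.println(url);
        }
    }
}
```

This is the mirror image of what -refetchonly is described as doing: pages already fetched are skipped rather than selected.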


Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers