Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The following page has been changed by MatthiasGuenter:
http://wiki.apache.org/nutch/FAQ

------------------------------------------------------------------------------
  ==== What happens if I inject urls several times? ====
  Urls, which are already in the database, won't be injected.
+ 
+ ==== Java.io.IOException: No input directories specified in: NutchConf: nutch-default.xml , mapred-default.xml ====
+ 
+ This really is a crawl tool issue, but is covered here as well: The crawl tool expects as its first parameter the folder name where the seeding urls file is located so for example if your urls.txt is located in /nutch/seeds the crawl command would look like: crawl seeds -dir /user/nutchuser...
  
  === Fetching ===
  
@@ -361, +365 @@

  ==== Java.io.IOException: No input directories specified in: NutchConf: nutch-default.xml , mapred-default.xml ====
  
- The crawl tool expects as its first parameter the folder name where the seeding urls file is located so for example if your urls.txt is located in /nutch/seeds the crawl command would look like: crawl seed -dir /user/nutchuser...
+ The crawl tool expects as its first parameter the folder name where the seeding urls file is located so for example if your urls.txt is located in /nutch/seeds the crawl command would look like: crawl seeds -dir /user/nutchuser...
  
  === Discussion ===
  
_______________________________________________
Nutch-cvs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-cvs
