I hope someone can help me with this problem.
This works fine:
# bin/nutch crawl urls.txt
and it creates a directory named something like crawl-20060418105008,
with a working index.
However, if I add any parameter beyond the root_url_file parameter, I
get the output below, and I'm really stumped. The following command
does not create a directory named FOO. Instead it creates a directory
named something like crawl-20060418105500, so -dir FOO is apparently
not being read as an option. The log shows the whole string
"urls.txt -dir FOO" being used as rootUrlFile.
Help, anyone? This happens under Solaris. The same command works fine
on my PC under Cygwin, but I need to run it on Solaris. TIA!
# bin/nutch crawl urls.txt -dir FOO
060418 105308 parsing file:/export/home/www/virtual/wiki/doc_root/nutch-0.7.2/conf/nutch-default.xml
060418 105308 parsing file:/export/home/www/virtual/wiki/doc_root/nutch-0.7.2/conf/crawl-tool.xml
060418 105308 parsing file:/export/home/www/virtual/wiki/doc_root/nutch-0.7.2/conf/nutch-site.xml
060418 105308 No FS indicated, using default:local
060418 105308 crawl started in: crawl-20060418105308
060418 105308 rootUrlFile = urls.txt -dir FOO
060418 105308 threads = 10
060418 105308 depth = 5
060418 105310 Created webdb at LocalFS,/export/home/www/virtual/wiki/doc_root/nutch-0.7.2/crawl-20060418105308/db
Exception in thread "main" java.io.FileNotFoundException: urls.txt -dir FOO (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:106)
    at java.io.FileReader.<init>(FileReader.java:55)
    at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:372)
    at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
    at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)
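
For what it's worth, the rootUrlFile line and the FileNotFoundException
are what I would expect to see if all three arguments reached Java as
one single string instead of three separate ones. Below is a minimal
sketch of how that can happen in a shell wrapper. It is only an
illustration: the function names are made up, none of this is taken
from bin/nutch, and I have not confirmed that this is what actually
happens on my Solaris box.

```shell
#!/bin/sh
# Hypothetical helper: print how many separate arguments it received.
count_args() {
    echo "$#"
}

# Forwarding with "$@" keeps every argument as its own word.
forward_separately() {
    count_args "$@"
}

# Forwarding with "$*" joins all arguments into one word, so a
# receiver would see the single string "urls.txt -dir FOO".
forward_joined() {
    count_args "$*"
}

forward_separately urls.txt -dir FOO   # prints 3
forward_joined urls.txt -dir FOO       # prints 1
```

If something along these lines is going on, it would explain why the
same command line behaves differently under a different shell.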
_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general