Logging is also different in 0.8: by default it logs to the file $NUTCH_HOME/logs/hadoop.log, so you no longer need to capture stdout/stderr to a log file. -- Sami Siren
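Putting the thread's two fixes together, a minimal sketch of a working 0.8 setup (the seed URL, directory names, and depth are placeholder values, not from the thread):

```shell
# Nutch 0.8 expects a *directory* of seed files, not a single file.
mkdir -p urls
echo "http://lucene.apache.org/nutch/" > urls/urls.txt

# Then point the crawl at the directory, not at urls/urls.txt:
#   bin/nutch crawl urls -dir newcrawled -depth 2
# Note: in 0.8 progress goes to $NUTCH_HOME/logs/hadoop.log by default,
# so redirecting stdout/stderr (>& crawl.log) captures almost nothing.
```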
BDalton wrote:
> Thank you, that seemed to fix the problem. Unfortunately, another problem
> followed.
>
> With the command: bin/nutch crawl urls1 -dir newcrawled -depth 2 >& crawl.log
>
> I now get a directory called "newcrawled", but crawl.log is created empty,
> without any information, and the index created contains no data. There are
> no error messages. I'm using the July 18 nightly, and have no problems
> with 0.7.2.
>
> Sami Siren-2 wrote:
>> In 0.8 you submit a _directory_ containing urls.txt, not the file itself.
>>
>> So remove the /urls.txt part from your command line and it should go fine.
>>
>> --
>>  Sami Siren
>>
>> BDalton wrote:
>>> I get this error:
>>>
>>> bin/nutch crawl url.txt -dir newcrawled -depth 2 >& crawl.log
>>>
>>> Exception in thread "main" java.io.IOException: Input directory
>>> d:/nutch3/urls/urls.txt in local is invalid.
>>>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:274)
>>>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:327)
>>>         at org.apache.nutch.crawl.Injector.inject(Injector.java:138)
>>>         at org.apache.nutch.crawl.Crawl.main(Crawl.java:105)

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
