Logging is also different in 0.8. By default it logs to the file
$NUTCH_HOME/logs/hadoop.log, so you no longer need to redirect stdout and
stderr to a log file yourself.
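
A minimal sketch of the 0.8-style workflow (the seed URL and directory
names here are just examples, not anything from your setup): the crawl
command now takes a *directory* of seed files, and progress goes to the
Hadoop log instead of stdout/stderr.

```shell
# Create a seed directory containing a urls.txt file (0.8 expects the
# directory, not the file itself):
mkdir -p urls
echo "http://www.example.com/" > urls/urls.txt

# Then, from $NUTCH_HOME (not run here), something like:
#   bin/nutch crawl urls -dir newcrawled -depth 2
# and watch progress with:
#   tail -f logs/hadoop.log
```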
--
 Sami Siren

BDalton wrote:

>Thank you, that seemed to fix the problem. Unfortunately, another problem
>followed.
>
>With command: bin/nutch crawl urls1 -dir newcrawled -depth 2 >& crawl.log
>
>I now get a directory called “newcrawled”, however, the crawl.log is created
>empty without any information. Also the index created contains no data. No
>error messages. I’m using nightly July 18, and have no problems with 0.7.2.
>
>
>Sami Siren-2 wrote:
>  
>
>>in 0.8 you submit a _directory_ containing urls.txt not the file itself.
>>
>>so remove /urls.txt part from your commandline and it should go fine.
>>
>>--
>> Sami Siren
>>
>>BDalton wrote:
>>
>>    
>>
>>>I get this error,
>>>
>>>bin/nutch crawl url.txt -dir newcrawled -depth 2 >& crawl.log
>>>
>>>Exception in thread "main" java.io.IOException: Input directory
>>>d:/nutch3/urls/urls.txt in local is invalid.
>>>     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:274)
>>>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:327)
>>>     at org.apache.nutch.crawl.Injector.inject(Injector.java:138)
>>>     at org.apache.nutch.crawl.Crawl.main(Crawl.java:105)
>>>
>>> 
>>>
>>>      
>>>
>>
>>    
>>
>
>  
>


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
