Hadoop doesn't find the input file

2014-01-04 Thread Manikandan Saravanan
Hi, I’m trying to run Nutch 2.2.1 on a 2-node Hadoop cluster. My Hadoop cluster is running fine and I’ve successfully added the input and output directories to HDFS. But when I run $HADOOP_HOME/bin/hadoop jar /nutch/apache-nutch-2.2.1.job org.apache.nutch.crawl.Crawler urls -dir crawl -depth

Re: Hadoop doesn't find the input file

2014-01-04 Thread Ted Yu
Can you pastebin the stack trace involving the NPE? Thanks. On Jan 4, 2014, at 9:25 AM, Manikandan Saravanan manikan...@thesocialpeople.net wrote: Hi, I’m trying to run Nutch 2.2.1 on a 2-node Hadoop cluster. My Hadoop cluster is running fine and I’ve successfully added the input and

Re: Hadoop doesn't find the input file

2014-01-04 Thread Manikandan Saravanan
Hmm... I just removed the “crawl” directory (output directory) from the command and it works! I’m storing the output in a Cassandra cluster using Gora anyway. So I don’t think I want to store that on HDFS :) -- Manikandan Saravanan, Architect - Technology, TheSocialPeople On 4 January 2014 at