Re: Hadoop doesn't find the input file

2014-01-04 Thread Manikandan Saravanan

Hmm.. I just removed the “crawl” directory (output directory) from the command and it works! I’m storing the output in a Cassandra cluster using Gora anyway. So I don’t think I want to store that on HDFS :)

--
Manikandan Saravanan
Architect - Technology
TheSocialPeople

On 4 January 2014 at 11:0

Re: Hadoop doesn't find the input file

2014-01-04 Thread Ted Yu

Can you pastebin the stack trace involving the NPE?

Thanks

On Jan 4, 2014, at 9:25 AM, Manikandan Saravanan wrote:

> Hi,
>
> I’m trying to run Nutch 2.2.1 on a Hadoop 2-node cluster. My Hadoop cluster
> is running fine and I’ve successfully added the input and output directory
> onto HDFS

Hadoop doesn't find the input file

2014-01-04 Thread Manikandan Saravanan

Hi,

I’m trying to run Nutch 2.2.1 on a Hadoop 2-node cluster. My Hadoop cluster is running fine and I’ve successfully added the input and output directory onto HDFS. But when I run

$HADOOP_HOME/bin/hadoop jar /nutch/apache-nutch-2.2.1.job org.apache.nutch.crawl.Crawler urls -dir crawl -depth