nodes, namely nutch1, nutch2 and nutch3. The first one is in the masters file
and all three are listed in the slaves file. The /etc/hosts file lists all
machines along with their IP addresses. Can someone help me?
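
For reference, the masters/slaves layout described above could be sketched
roughly as follows. This is only an illustration: the hostnames nutch1, nutch2
and nutch3 are read off the description, and CONF_DIR is a hypothetical scratch
directory standing in for the real Hadoop 1.x conf directory, so nothing on a
live cluster is overwritten.

```shell
# Sketch only. CONF_DIR is a hypothetical scratch directory, not the
# cluster's actual conf directory; hostnames are assumed from the post.
CONF_DIR="${CONF_DIR:-./hadoop-conf-sketch}"
mkdir -p "$CONF_DIR"

# masters: only the first node is listed here.
printf 'nutch1\n' > "$CONF_DIR/masters"

# slaves: all three nodes are listed, one hostname per line.
printf 'nutch1\nnutch2\nnutch3\n' > "$CONF_DIR/slaves"

# Show what was written.
cat "$CONF_DIR/masters" "$CONF_DIR/slaves"
```

Each hostname would also need a matching entry in /etc/hosts on every machine;
the IP addresses are not given in the post, so they are left out here.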
--
Manikandan Saravanan
Architect - Technology
TheSocialPeople
Floor | Pittsburgh, PA 15212
Google Voice: 412-256-8556 | www.rdx.com
On Mon, Jan 6, 2014 at 8:08 AM, Manikandan Saravanan
manikan...@thesocialpeople.net wrote:
I’m trying to run Nutch 2.2.1 on a Hadoop 1.2.1 cluster. The fetch phase runs
fine. But in the next job, this error comes up
3 -topN 5
I’m getting something like:
INFO input.FileInputFormat: Total input paths to process : 0
Which, as I understand it, means that Hadoop cannot locate the input files.
The job then ends, for obvious reasons, with a null pointer exception. Can
someone help me out?
--
Manikandan Saravanan
Hmm.. I just removed the “crawl” directory (output directory) from the command
and it works! I’m storing the output in a Cassandra cluster using Gora anyway.
So I don’t think I want to store that on HDFS :)
--
Manikandan Saravanan
Architect - Technology
TheSocialPeople
On 4 January 2014 at 11