Hi - Check the logs first.
-Original message-
From: Bin Wang
Sent: Saturday 4th January 2014 21:47
To: dev@nutch.apache.org
Subject: Re: Independent Map Reduce to parse Nutch content (Cont.)
Hi Tejas,
I started an AWS instance and ran Hadoop in single-node mode.
When I do..
hadoop
*>> It will finish all the mappers without problem but still.. errored out
after all the mappers*
*>> Exception in thread "main" java.io.IOException: Job failed!*
As I mentioned in the earlier mail, did you check the logs to find the
root cause of the exception?
*>> I can see Nutch constan
Hi Tejas,
I started an AWS instance and ran Hadoop in single-node mode.
When I do..
hadoop -jar example.jar hdfsinput/ hdfsoutput/
Everything works perfectly, as I expected: a bunch of stuff got printed to the
screen and both the mappers and the reducers finished without a problem. In the
end, the expec
Hi Bin Wang,
I would suggest that you NOT use Eclipse and instead run your code from the
command line. Add logger statements and check the logs for the full stack
traces of the failure. In my personal experience, logs are a better way to
debug Hadoop code than the Eclipse debugger.
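
To illustrate the logging advice, here is a minimal sketch of a per-record step that logs the full stack trace instead of swallowing the exception. It is only an illustration: the class name LoggedStep, the contentLength method, and the use of java.util.logging are assumptions made to keep the example self-contained. A real Hadoop job would normally log through whatever logging framework the cluster is configured with.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;
import java.util.logging.StreamHandler;

public class LoggedStep {
    private static final Logger LOG = Logger.getLogger(LoggedStep.class.getName());

    // Hypothetical per-record step: returns the content length in bytes,
    // or -1 if the record could not be processed.
    static int contentLength(String key, byte[] content) {
        try {
            return content.length; // throws NullPointerException when content is null
        } catch (RuntimeException e) {
            // Passing the exception as the last argument makes the logger
            // write the full stack trace, not just the message.
            LOG.log(Level.SEVERE, "Failed to process record " + key, e);
            return -1;
        }
    }

    // Helper for the demo: processes a null record and returns the text
    // that was written to the log while doing so.
    static String captureLogFor(String key) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        StreamHandler handler = new StreamHandler(buffer, new SimpleFormatter());
        LOG.addHandler(handler);
        try {
            contentLength(key, null);
        } finally {
            handler.flush();
            LOG.removeHandler(handler);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] page = "<html></html>".getBytes(StandardCharsets.UTF_8);
        System.out.println(contentLength("http://example.com/", page));
        System.out.print(captureLogFor("http://example.com/broken"));
    }
}
```

With the exception passed to the logger, the log entry names the failing record and the line that threw, which is usually enough to find the root cause behind a generic "Job failed!" message.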
Thanks,
Tejas
On Fri, Jan 3, 2
Hi,
I tried to modify the code here to parse the Nutch content data...
http://svn.apache.org/viewvc/nutch/trunk/src/java/org/apache/nutch/parse/ParseSegment.java?view=markup
At the end of this email is a prototype that I have written to run a
MapReduce job to calculate the HTML content length of ea