I am using a Python script as a mapper for a Hadoop Streaming (hadoop
0.20.0) job, with reducer NONE. My jobs keep getting killed with "task
failed to respond after 600 seconds." I tried sending a heartbeat
every minute to stderr using sys.stderr.write in my mapper, but
nothing is being output to stderr.
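For reference, a minimal sketch of the kind of heartbeat mapper being described — the thread, interval, and identity map step are assumptions, not the original script. Hadoop Streaming treats lines of the form "reporter:status:<msg>" on stderr as task status updates, which also reset the timeout clock:

```python
import sys
import threading

def emit_status(message, out=None):
    # Hadoop Streaming recognizes "reporter:status:<msg>" on stderr
    # as a status update; it also counts as progress for the timeout.
    out = out or sys.stderr
    out.write("reporter:status:%s\n" % message)
    out.flush()  # flush explicitly: a pipe is block-buffered, not line-buffered

def heartbeat(interval, stop):
    # Background heartbeat so a long-running record doesn't trip the
    # 600-second "failed to respond" timeout.
    while not stop.wait(interval):
        emit_status("alive")

def main():
    stop = threading.Event()
    t = threading.Thread(target=heartbeat, args=(60, stop))
    t.daemon = True
    t.start()
    for line in sys.stdin:
        sys.stdout.write(line)  # identity mapper: real per-record work goes here
    sys.stdout.flush()
    stop.set()

if __name__ == "__main__":
    main()
```

Note the explicit flush() after every stderr write — without it, the heartbeat can sit in the stream buffer and never reach the task logs, which matches the symptom above.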
This doesn't solve your stderr/stdout problem, but you can always set the
timeout to be a bigger value if necessary.
-Dmapred.task.timeout=__ (in milliseconds)
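For example, a streaming invocation raising the timeout to 30 minutes — the jar path, input/output paths, and script name are placeholders, and note that in 0.20 the -D generic option must come before the streaming-specific options:

```shell
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-*-streaming.jar \
    -Dmapred.task.timeout=1800000 \
    -input /user/me/input \
    -output /user/me/output \
    -mapper mapper.py \
    -reducer NONE \
    -file mapper.py
```

Setting the value to 0 disables the timeout entirely, though that can leave genuinely hung tasks running forever.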
Koji
On 10/25/09 12:00 PM, "Ryan Rosario" wrote:
> I am using a Python script as a mapper for a Hadoop Streaming (hadoop
> 0.20.0
Thanks. I think that I may have tripped on some sort of bug.
Unfortunately, I do not know how to reproduce it and am a bit scared
to try to reproduce it.
I got this to work. I changed the following things, and now my job
completes successfully with stderr written to the logs as output
occurs.
Most likely one stream gets block-buffered when its file descriptor is a
pipe, while the other is at most line-buffered, as is the case when the
code is run by the streaming mapper task.
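That buffering difference can be worked around by flushing after each write (or running the mapper with python -u). A sketch, where map_record and the counter-reporting cadence are made-up illustrations:

```python
import sys

def map_record(line):
    # Placeholder transform; a real mapper would emit key\tvalue pairs.
    return line.upper()

def run(stdin=sys.stdin, stdout=sys.stdout, stderr=sys.stderr):
    # When stdout is a pipe (as under Hadoop Streaming), it is block-buffered,
    # so writes can sit in the buffer for a long time unless flushed.
    for n, line in enumerate(stdin, 1):
        stdout.write(map_record(line))
        stdout.flush()  # defeat block buffering on the pipe
        if n % 10000 == 0:
            # "reporter:counter:<group>,<counter>,<amount>" is the streaming
            # convention for incrementing a counter from stderr.
            stderr.write("reporter:counter:mapper,records,10000\n")
            stderr.flush()
```

The per-record flush costs some throughput; flushing every N records is a common compromise once the logging behaves.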
On Mon, Oct 26, 2009 at 11:06 AM, Ryan Rosario wrote:
> Thanks. I think that I may have tripped on some sort of bug.
> Unfortunately,