Two things generally cause this:
1) The Java task does not have enough Java heap. The JVM pauses often
(typically for garbage collection), and while it is paused it does not
report status.
2) A process such as streaming, or potentially a UDTF, is creating many
tuples or taking a long time to produce a single tuple/row.
Hive has some settings that control this. You can simply ignore the
timeouts and let the process continue indefinitely, but there is no direct
way for a user to signal progress as there is in MapReduce. (If you are
streaming, you can send progress.)
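For the streaming case, here is a rough sketch of what "sending progress" can look like. It assumes the Hadoop Streaming convention of writing lines prefixed with reporter:status: or reporter:counter: to stderr. Whether a Hive TRANSFORM script gets the same treatment depends on your Hive version and settings, so treat that part as an assumption to verify. The names process, MyScript and rows are made up for illustration, and the upper-casing is only a stand-in for real work.

```python
import sys


def counter_line(group, counter, amount=1):
    # Hadoop Streaming counter update: reporter:counter:<group>,<counter>,<amount>
    return f"reporter:counter:{group},{counter},{amount}\n"


def status_line(message):
    # Hadoop Streaming status update: reporter:status:<message>
    return f"reporter:status:{message}\n"


def process(rows, out=sys.stdout, err=sys.stderr, every=1000):
    """Transform rows, reporting progress on `err` every `every` rows."""
    count = 0
    for count, row in enumerate(rows, start=1):
        # Placeholder transform; a real script does its slow work here.
        out.write(row.rstrip("\n").upper() + "\n")
        if count % every == 0:
            # Hypothetical group/counter names; pick whatever suits the job.
            err.write(counter_line("MyScript", "rows", every))
            err.write(status_line(f"processed {count} rows"))
            err.flush()  # make sure the framework sees the update promptly
    return count


# In the actual streaming script, finish with: process(sys.stdin)
```

If a single row can take longer than the task timeout, the reporting would need to happen inside the per-row work rather than between rows.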
On Fri, Nov 29, 2013 at 7:11 AM, Krishna Rao krishnanj...@gmail.com wrote:
Hi all,
We've been running into this problem a lot recently on a particular reduce
task. I'm aware that I can work around it by upping the
mapred.task.timeout.
However, I would like to know what the underlying problem is. How can I
find this out?
Alternatively, can I force a generated hive task to report a status, maybe
just increment a counter?
Cheers,
Krishna