Mefa,

On 10-Jan-2012, at 6:38 PM, Mefa Grut wrote:

> Two cleanup related questions:
> Can I execute context.write from the reduce/map cleanup phase?

If by cleanup you mean the mapper/reducer cleanup methods, then the answer is yes. This has been asked before, if you'd like some extra background: http://search-hadoop.com/m/jzO0k18XoNW1 (You may not even need the cleanup method; see my last paragraph.)
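To illustrate, a minimal sketch (the key/value types and the "sum everything, emit one record at the end" logic are just assumptions for the example, not your job):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Accumulates a running total across all keys and emits a single record
// from cleanup(). context.write() behaves the same in cleanup() as it does
// in reduce(): the record goes through the normal RecordWriter path.
public class TotalsReducer extends Reducer<Text, LongWritable, Text, LongWritable> {

  private long total = 0;

  @Override
  protected void reduce(Text key, Iterable<LongWritable> values, Context context)
      throws IOException, InterruptedException {
    for (LongWritable v : values) {
      total += v.get();
    }
  }

  @Override
  protected void cleanup(Context context) throws IOException, InterruptedException {
    context.write(new Text("grand-total"), new LongWritable(total));
  }
}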
> Should I expect cleanup to be killed when a task fails or is killed (speculative
> execution)?

I'm not sure I understand this question. If your task fails, it fails right there: your cleanup() method won't even be called, since the task exits with whatever error it ran into. And kills (user-initiated or speculative) are plain kills, so your task may die immediately when such a signal is issued.

> The idea is to update HBase counters from within a mapreduce job (kind of an
> alternative to the built-in mapreduce counters, that can scale to millions of
> counters).
> Since a task can fail and run again, or be duplicated and killed, events can be
> incremented too many times. How does Hadoop work around this problem with the
> generic counters?

In Hadoop, counters are added only from successful tasks (i.e. tasks that have been 'committed' by the framework, via the OutputCommitter). I think, for your case, it'd be better if you did the final commit with a custom implementation of OutputCommitter (a rough sketch is at the end of this mail). Unfortunately the output stream is not available inside the FileOutputCommitter, so you'd probably have to hack around a bit to get your outputs into HBase at the end. But there may well be other, possibly better solutions :)

A good idea would also be to ask this specific question on the HBase user list, so you reach the right audience.
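Roughly what I mean by committing from a custom OutputCommitter; this is only a sketch, and the staged-increments part (readStagedIncrements(), the HBase write) is an assumption you'd have to fill in yourself, not existing API:

import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter;

// commitTask() is invoked only for task attempts the framework actually
// commits, so increments from failed or speculatively-killed attempts
// would never reach HBase.
public class HBaseCountersCommitter extends FileOutputCommitter {

  public HBaseCountersCommitter(Path outputPath, TaskAttemptContext context)
      throws IOException {
    super(outputPath, context);
  }

  @Override
  public void commitTask(TaskAttemptContext context) throws IOException {
    super.commitTask(context);  // do the normal file commit first
    // Hypothetical: replay this attempt's staged counter increments into
    // HBase here, e.g. via HTable.incrementColumnValue(row, family,
    // qualifier, delta). Where/how the increments are staged is up to you.
    // for (StagedIncrement inc : readStagedIncrements(context)) { ... }
  }
}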