Hello,

Sorry for the puzzling subject. I have a single long-running /statement/ in my reduce method, so the framework might assume my reduce is not responding and kill it. I solved the problem in the map method by subclassing MapRunner and running a thread that calls reporter.progress() every minute or so. However, the same thread does not run during the reduce. I checked this by setting a status string in the thread: it appeared on the JobTracker web page during the map stage but not during the reduce stage.
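For concreteness, the kind of keep-alive thread described above can be sketched roughly as follows. This is only an illustration, not the actual code from the job: the class name Heartbeat and the ProgressSink interface are hypothetical, with ProgressSink standing in for Hadoop's Reporter so that the sketch compiles without Hadoop on the classpath.

```java
/** Minimal sketch of a keep-alive thread that pings a progress callback. */
public class Heartbeat implements AutoCloseable {

    /** Hypothetical stand-in for Hadoop's Reporter; only progress() is needed. */
    public interface ProgressSink {
        void progress();
    }

    private final Thread worker;

    private Heartbeat(Thread worker) {
        this.worker = worker;
    }

    /** Starts a daemon thread that calls sink.progress() every intervalMillis. */
    public static Heartbeat start(final ProgressSink sink, final long intervalMillis) {
        Thread t = new Thread(new Runnable() {
            public void run() {
                try {
                    while (!Thread.currentThread().isInterrupted()) {
                        sink.progress();              // tell the framework we are alive
                        Thread.sleep(intervalMillis); // wait until the next ping
                    }
                } catch (InterruptedException e) {
                    // close() interrupted the sleep: fall through and exit
                }
            }
        }, "heartbeat");
        t.setDaemon(true); // never keep the JVM alive just for the heartbeat
        t.start();
        return new Heartbeat(t);
    }

    /** Stops the heartbeat and waits for the thread to finish. */
    @Override
    public void close() throws InterruptedException {
        worker.interrupt();
        worker.join();
    }
}
```

With the real Reporter, the callback would presumably just forward to reporter.progress(), and close() would go in a finally block around the long-running statement.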
Hadoop v0.20 appears to solve this by having separate run methods for both Map and Reduce; however, I'm using v0.19. I scanned the Streaming source and it only subclasses MapRunner, so I assume it too has the same limitation (I am probably wrong; if so, can someone point me to the location?). Is there a way around this /without/ starting a thread in the reduce function?

Many thanks,
Saptarshi

--
Saptarshi Guha - saptarshi.g...@gmail.com