Re: Which data sets were processed by each tasktracker?

2013-05-03 Thread Harsh J
You probably need a release that includes https://issues.apache.org/jira/browse/MAPREDUCE-3678. It prints the input split to the task logs, so you can always tell what each task processed (so long as the input split type, such as file splits, has intelligible output f
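
Once the splits are in the task logs, something like the following could group them per TaskTracker. This is only a rough sketch: it assumes the log line contains the text "Processing split: " followed by the split's string form, and that you have already collected the task log lines per host yourself. The host names, the myfs:// paths, and the splits_by_tracker helper are all made up for illustration.

```python
import re
from collections import defaultdict

# Assumed shape of the log line: "... Processing split: <string form of the split>"
SPLIT_RE = re.compile(r"Processing split:\s*(?P<split>.+)$")


def splits_by_tracker(logs):
    """Group logged input splits by host.

    logs: dict mapping a TaskTracker host name to an iterable of task log lines.
    Returns a dict mapping each host to the list of splits found in its logs.
    """
    result = defaultdict(list)
    for host, lines in logs.items():
        for line in lines:
            match = SPLIT_RE.search(line)
            if match:
                result[host].append(match.group("split").strip())
    return dict(result)


# Hypothetical sample data, not taken from a real cluster.
sample_logs = {
    "tasktracker-1": [
        "INFO MapTask: Processing split: myfs://data/input/part-0:0+67108864",
        "INFO MapTask: some unrelated message",
    ],
    "tasktracker-2": [
        "INFO MapTask: Processing split: myfs://data/input/part-1:0+1024",
    ],
}

for host, splits in sorted(splits_by_tracker(sample_logs).items()):
    print(host, splits)
```

If your custom FileSystem uses its own split type, the text after "Processing split: " is whatever that type's string form produces, so the regex may need adjusting.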

Which data sets were processed by each tasktracker?

2013-05-03 Thread Agarwal, Nikhil
Hi, I have a 3-node cluster, with the JobTracker running on one machine and TaskTrackers on the other two. Instead of using HDFS, I have written my own FileSystem implementation. I am able to run a MapReduce job on this cluster, but I cannot make out from the logs or the TaskTracker UI which data sets