How can I tell how the map and reduce tasks were spread across the cluster? I looked at the JobTracker web page but can't find that info.
Also, can I specify how many map or reduce tasks I want to be launched? From what I understand, it's based on the number of input files passed to Hadoop. So if I have 4 files, 4 map tasks will be launched, and the number of reduce tasks depends on the HashPartitioner.
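For context, this is roughly how I'm trying to set the task counts right now. A sketch, assuming the job's driver uses ToolRunner/GenericOptionsParser so that `-D` properties are picked up; `myjob.jar`, `MyJob`, and the input/output paths are placeholders:

```shell
# Hypothetical invocation. As I understand it, mapred.map.tasks is only
# a hint to the framework (the split count decides the real number),
# while mapred.reduce.tasks is honored.
hadoop jar myjob.jar MyJob \
  -D mapred.map.tasks=4 \
  -D mapred.reduce.tasks=2 \
  input/ output/
```

Is that the right way to do it, or is the map count always derived from the input?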