Pedro,

> 1 - Hadoop uses several ports to run. It exists ports for HDFS, for the
> MapReduce JvmTasks, etc. I don't know how I can identify all the ports that
> a MapReduce and HDFS uses. I'm running the wordcount example, and I would
> like to see what ports are open and what are their purpose. Where can I get
> this information?
>
>
> 2 - Running the example of the WordCount in MapReduce, I noticed that it's
> created several new process. I know that it's created new process for a
> TaskTracker, for a JobTracker and for a TaskInProgress. Is this all the
> processes that are created on MapReduce during the execution of the
> wordcount example? How many processes are created in total?

Two daemons are launched when the MapReduce framework starts: the JobTracker and the TaskTracker. In addition, every map or reduce task runs in its own separate process called Child (you can grep the process list for Child to identify these). Possibly this is what you meant by 'TaskInProgress'. TaskInProgress is an internal class of the framework and does not correspond to any launched process. The number of Child processes launched on a slave node equals the number of map and reduce tasks launched on that node.

Thanks
Hemanth
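
P.S. If it helps, here is a rough Python sketch of the "grep for Child" idea, which counts the MapReduce processes visible on a node. It assumes the daemons and task JVMs show up in the process list with the main classes org.apache.hadoop.mapred.JobTracker, org.apache.hadoop.mapred.TaskTracker and org.apache.hadoop.mapred.Child; please check those names against your own `ps` output. The function names are only illustrative and are not part of Hadoop.

```python
import subprocess

# Assumption: MapReduce JVMs carry these main classes on their command line.
# Verify against the actual `ps` output on your cluster.
PACKAGE = "org.apache.hadoop.mapred."
PROCESS_NAMES = ("JobTracker", "TaskTracker", "Child")


def count_mapreduce_processes(ps_output):
    """Count JobTracker, TaskTracker and Child JVMs in `ps` command-line text.

    ps_output holds one process command line per line of text.
    """
    counts = {name: 0 for name in PROCESS_NAMES}
    for line in ps_output.splitlines():
        tokens = line.split()
        for name in PROCESS_NAMES:
            # Match the fully qualified class name as a whole token, so that
            # an unrelated command such as "grep Child" is not counted.
            if PACKAGE + name in tokens:
                counts[name] += 1
                break
    return counts


def list_process_commands():
    """Return the command lines of all processes on this node, via `ps`."""
    result = subprocess.run(
        ["ps", "-e", "-o", "args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling count_mapreduce_processes(list_process_commands()) on a slave node while the wordcount job is running should give a Child count that matches the number of map and reduce tasks currently running there. Child processes exit when their tasks finish, so the count changes over the life of the job.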
