Dear All,

I am trying to set up Hadoop for multiple users in a class on our cluster, but for some reason I can't seem to get it right. With only one user running, it works great. I would like all of the users to submit Hadoop jobs to the existing DataNodes on the cluster, but I am not sure whether that is the right approach. Do I need to start a DataNode for every user? If so, I was not able to, because I ran into issues with the ports already being in use. Please advise. Below are a few of the config files.
Also, I have tried searching other documents, which say to create a user "hadoop" and a group "hadoop" and then start the daemons as the hadoop user. That didn't work for me either. I am sure I am doing something wrong. Could anyone please throw in some more ideas?

List of env variables changed in hadoop-env.sh:

export HADOOP_LOG_DIR=/scratch/$USER/hadoop-logs
export HADOOP_PID_DIR=/scratch/$USER/.var/hadoop/pids

# cat core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://frontend:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/scratch/${user.name}/hadoop-FS</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>

# cat hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/scratch/${user.name}/.hadoop/.transaction/.edits</value>
  </property>
</configuration>

# cat mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>frontend:9001</value>
  </property>
  <property>
    <name>mapreduce.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapreduce.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
</configuration>

Thank you,
Amit
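P.S. For reference, here is my current understanding of the shared-cluster model I am trying to reach (one set of daemons for everyone, started by a dedicated admin account). This is only a sketch of what I believe is intended, not something that works for me yet; the usernames and jar name below are placeholders:

```shell
# Run ONCE, as the "hadoop" admin user, on the cluster:
# format the NameNode (first time only), then start the shared daemons.
hadoop namenode -format
start-dfs.sh       # starts one NameNode plus the DataNodes listed in conf/slaves
start-mapred.sh    # starts one JobTracker plus TaskTrackers on the same nodes

# Still as the admin user: create an HDFS home directory per student,
# so individual users never need to start daemons of their own.
for u in student1 student2; do   # placeholder usernames
  hadoop fs -mkdir /user/$u
  hadoop fs -chown $u:$u /user/$u
done

# Each student then submits jobs against the SAME running daemons:
hadoop jar hadoop-examples.jar wordcount input output
```

If that model is right, only one DataNode/TaskTracker process should ever run per machine, which would also explain the "port already in use" errors I hit when each user tried to start their own daemons. Please correct me if I have this wrong.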