Hello, I've got a question about Hadoop On Demand and HDFS/MapReduce.
I understand that MapReduce splits the input data-set into independent chunks and that HDFS stores the data across the cluster. On the cluster we have, we want to spread the data over the nodes. We also run a batch system with Torque and Maui, and we would like that, when a user submits a job that works on a specific data-set, the job is processed by the node (or nodes) that hold that data-set.

Do I need only HDFS and MapReduce, or do I need Hadoop On Demand as well? What exactly is Hadoop On Demand? What does "managing independent Hadoop MapReduce and HDFS instances on a shared cluster of nodes" mean?

I have tried to configure HOD, but when I run:

  ./hod allocate -d /opt/exp_soft/hdfs/clusterdir/ -n 3 --resource_manager.batch-home=/var/spool/pbs

I get the following errors:

  CRITICAL - qsub error: exit code: 175 | signal: False | core False
  CRITICAL - qsub error: exit code: 175 | signal: False | core False
  CRITICAL - Job submission failed with exit code 175
  CRITICAL - Cannot allocate cluster /opt/exp_soft/hdfs/clusterdir

Thanks a lot.
Andrea Valentini
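P.S. In case it helps, here is roughly the [resource_manager] section of my hodrc. This is only a sketch adapted from the sample configuration in the HOD documentation; the queue name is a guess on my part, and I am not sure whether batch-home should point at /var/spool/pbs or at the Torque installation directory (the one containing bin/qsub):

  [resource_manager]
  id         = torque
  # not sure this is right: maybe batch-home must be the Torque
  # install dir (containing bin/qsub) rather than the spool dir
  batch-home = /var/spool/pbs
  # "batch" is a guess; our Torque/Maui queue name may differ
  queue      = batch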