Hello,

I've got a question about Hadoop On Demand and HDFS/MapReduce.

I understand that MapReduce splits the input data-set into independent
chunks and that HDFS stores the data across the cluster.
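
For example, if I understand correctly, something like this (the file
name is just an example) should show how HDFS spreads a file's blocks
over the datanodes:

  # copy a file into HDFS, then ask fsck where its blocks ended up
  bin/hadoop fs -put bigfile.txt /user/andrea/bigfile.txt
  bin/hadoop fsck /user/andrea/bigfile.txt -files -blocks -locations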

In our cluster, we want to distribute the data across the nodes. We
also have a batch system with Torque and Maui, and we want that, when
a user submits a job that works on a specific data-set, the job is
processed by the node or nodes that contain that data-set.
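
For example, is it enough to submit a normal MapReduce job like the
one below (the paths are just examples), so that the JobTracker
schedules the map tasks on the nodes that hold the blocks?

  # run the stock wordcount example over a data-set already in HDFS
  bin/hadoop jar hadoop-*-examples.jar wordcount /user/andrea/input /user/andrea/output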

Do I only need HDFS and MapReduce, or do I need Hadoop On Demand too?

What exactly is Hadoop On Demand? What does "managing independent
Hadoop MapReduce and HDFS instances on a shared cluster of nodes" mean?

Well, I've tried to configure HOD, but when I run:
./hod allocate -d /opt/exp_soft/hdfs/clusterdir/ -n 3
--resource_manager.batch-home=/var/spool/pbs

I get the following error:
CRITICAL - qsub error: exit code: 175 | signal: False | core False
CRITICAL - qsub error: exit code: 175 | signal: False | core False
CRITICAL - Job submission failed with exit code 175
CRITICAL - Cannot allocate cluster /opt/exp_soft/hdfs/clusterdir
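
To check whether the problem is in HOD or in the batch system itself,
is it correct to test qsub directly, with something like this (a
minimal sketch, assuming standard Torque syntax)?

  # submit a trivial 3-node job to see whether qsub itself succeeds
  echo "sleep 30" | qsub -l nodes=3

I also wonder whether resource_manager.batch-home should point at the
directory whose bin/ contains qsub, rather than at the spool directory
/var/spool/pbs.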

Thanks a lot.

Andrea Valentini
