Joel Welling wrote:
Hi folks;
  I'm new to Hadoop, and I'm trying to set it up on a cluster for which
almost all the disk is mounted via the Lustre filesystem.  That
filesystem is visible to all the nodes, so I don't actually need HDFS to
implement a shared filesystem.  (I know the philosophical reasons why
people say local disks are better for Hadoop, but that's not the
situation I've got).  My system is failing, and I think it's because the
different nodes are tripping over each other when they try to run HDFS
out of the same directory tree.
  Is there a way to turn off HDFS and just let Lustre do the distributed
filesystem?  I've seen discussion threads about Hadoop with NFS which
said something like 'just specify a local filesystem and everything will
be fine', but I don't know how to do that.  I'm using Hadoop 0.17.2.


I don't know enough about Lustre to be very useful, but a couple of points:

* You shouldn't have nodes trying to use the same directories. At the very least, point each datanode at a different bit of the filesystem.

* If there is a specific API call to find out which rack has the data, that could be used to place work near the data. Someone (==you) would have to write a new filesystem back-end for Hadoop for this.
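As for the "just specify a local filesystem" approach: something along these lines in conf/hadoop-site.xml should point Hadoop at the Lustre mount instead of HDFS. The property names are from the 0.17 line; the paths are placeholders, and I haven't tried this on Lustre, so treat it as a sketch:

```xml
<configuration>
  <!-- Make the local (here, Lustre-mounted) filesystem the default FS.
       With file:/// as the default there is no NameNode or DataNode to
       run; start only the MapReduce daemons (bin/start-mapred.sh). -->
  <property>
    <name>fs.default.name</name>
    <value>file:///</value>
  </property>

  <!-- Keep scratch space distinct per node so the TaskTrackers don't
       trip over each other on the shared mount; include something
       node-unique in the path (the path below is a placeholder). -->
  <property>
    <name>mapred.local.dir</name>
    <value>/lustre/hadoop-tmp/per-node-dir</value>
  </property>
</configuration>
```

With that in place, job input and output paths are just ordinary paths on the shared mount (e.g. file:///lustre/data/input).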

-steve
