Hi all,


I’d like to create a new URI scheme for a distributed, POSIX-compliant filesystem
shared among all nodes. A number of such filesystems already exist
(think HDFS without the POSIX non-compliance). We can, of course, run HDFS on top
of such a filesystem, but that adds an extra unnecessary and inefficient
layer. Why have a master retrieve a set of data from an FS cluster, only to
distribute it back out to the same cluster on a different distributed FS
(HDFS)?



Under the new URI scheme I seek to create, each MapReduce slave would read its
input data from a seemingly local file:/// path, and write its output there as
well. Assume that the distributed FS handles concurrent reads and writes. Given
the POSIX compliance, LocalFileSystem seems to be the best foundation.
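In case it helps frame the question: as I understand it, one possible route (a sketch, not a tested design) would be to subclass LocalFileSystem (or RawLocalFileSystem), override getUri() to answer to the new scheme, and register the class in core-site.xml. Everything below is hypothetical -- the scheme name "pfs" and the class org.example.fs.PosixDfsFileSystem are placeholders for illustration:

```xml
<!-- core-site.xml (sketch): register a hypothetical "pfs://" scheme.
     fs.pfs.impl maps the scheme to a FileSystem implementation class;
     both the scheme and the class name here are made up. -->
<property>
  <name>fs.pfs.impl</name>
  <value>org.example.fs.PosixDfsFileSystem</value>
</property>

<!-- Optionally make it the default filesystem so unqualified paths
     resolve against it. -->
<property>
  <name>fs.default.name</name>
  <value>pfs:///</value>
</property>
```

Please correct me if registration actually works differently, or if there are per-node assumptions baked into LocalFileSystem that would break here.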



Please let me know of any pitfalls or errors you see in this approach. Any
advice is much appreciated as well, as the Hadoop source tree is new to me and
intimidating.



Best,

--Chris
