Hi, I am Roshan and I have set up Hadoop in fully distributed mode (a Hadoop cluster). The default filesystem is HDFS, and I have a two-node cluster. The MapReduce job works fine when I give the input file from an HDFS location, and the output is also generated in HDFS when running the WordCount example. My hadoop-site.xml looks like this:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://server1.example.com</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>server1.example.com:9001</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/admin/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/admin/hadoop/dfs/name</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/hadoop/mapred/system</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>

The problem is that I don't want to give an HDFS location for the input file; I want to read the input from the local filesystem and generate the output on the local filesystem as well, while still keeping the cluster running in fully distributed mode. Setting the default filesystem to something other than HDFS stops the cluster from running, so is there any way to achieve this? I also have an NFS server running, and all the nodes in the Hadoop cluster are NFS clients, so they can share the exported /home/admin directory. Does NFS help in this regard? Do I have to change my hadoop-site.xml, and if so, what would it look like? Any help will be heartily appreciated. I am looking forward to your kind response.
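To make the goal concrete, this is roughly the kind of invocation I would like to work, without changing fs.default.name (the examples jar name and the /home/admin/input and /home/admin/output paths below are just placeholders for my NFS-shared directories mounted on every node; my understanding is that a file:// URI tells Hadoop to use the local filesystem for those paths, but I am not sure this is the right approach):

  bin/hadoop jar hadoop-examples.jar wordcount file:///home/admin/input file:///home/admin/output

This is the same WordCount job that already works for me when I pass plain HDFS paths instead of file:// URIs.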