Re: Fair Scheduler of Hadoop

2013-01-26 Thread Lin Ma
Thanks Joep, smart answer! All of my confusion is gone. Have a good weekend. Regards, Lin. On Tue, Jan 22, 2013 at 2:00 AM, Joep Rottinghuis jrottingh...@gmail.com wrote: You could configure it like that if you wanted. Keep in mind that would waste some resources. Imagine a 10-minute task

Difference between HDFS and local filesystem

2013-01-26 Thread Sundeep Kambhampati
Hi Users, I am fairly new to MapReduce programming, and I am trying to understand the integration between MapReduce and HDFS. I understand that MapReduce can use HDFS for data access, but is it possible to run MapReduce programs without using HDFS at all? HDFS does file replication and partitioning.

Re: Difference between HDFS and local filesystem

2013-01-26 Thread Harsh J
The local filesystem has no sense of being 'distributed'. If you run Hadoop in distributed mode over file:// (Local FS), then unless the file:// location being used is itself distributed (such as an NFS mount), your jobs will fail their tasks on all the nodes where the referenced files cannot be found

Re: Executing a Python program inside Map Function

2013-01-26 Thread Harsh J
Java provides the Process class to help you launch processes and read from and write to them: http://docs.oracle.com/javase/6/docs/api/java/lang/Process.html. You can use this to spawn your program from your code, to write input into the process's stdin, and to read its output via its stdout, etc. The
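
A minimal sketch of the approach described above, not code from the original message: it starts a child process, writes a string to its stdin, and collects whatever it prints on stdout. The class name ChildProcessDemo and the helper pipeThrough are hypothetical names chosen for illustration, and the Unix cat command stands in for the real program. To run a Python script you would pass something like the interpreter name and the script path as the command instead (both are assumptions about your environment). It uses try-with-resources and StandardCharsets, so it assumes Java 7 or later.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class ChildProcessDemo {

    /**
     * Hypothetical helper: runs the given command, sends input to its stdin,
     * and returns everything it wrote to stdout (stderr is merged in).
     *
     * Caveat: all input is written before any output is read. That is fine
     * for small payloads, but a child that emits a lot of output before it
     * has consumed its input can fill the pipe buffer and stall. For large
     * data, read the output on a separate thread.
     */
    public static String pipeThrough(List<String> command, String input)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();

        // The child's stdin is exposed as an OutputStream on our side.
        // Closing it signals end-of-input to the child.
        try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
                process.getOutputStream(), StandardCharsets.UTF_8))) {
            writer.write(input);
        }

        // The child's stdout is exposed as an InputStream on our side.
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Child exited with code " + exitCode);
        }
        return output.toString();
    }

    public static void main(String[] args) throws Exception {
        // "cat" echoes stdin back to stdout; swap in your own command here.
        String result = pipeThrough(Arrays.asList("cat"), "hello\nworld\n");
        System.out.print(result);
    }
}
```

Inside a map function, the same helper could be called with the record's value as the input, though starting one process per record is expensive. Keeping a single long-lived child process per task is usually the cheaper design.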