Re: Data Locality Importance

2014-03-22 Thread Vinod Kumar Vavilapalli
Like you said, it depends both on the kind of network you have and the type of your workload. Given your point about S3, I'd guess your input files/blocks are not large enough that moving code to data trumps moving data itself to the code. When that balance tilts a lot, especially when moving

Re: Data Locality Importance

2014-03-22 Thread Chen He
Hi Mike, Data locality rests on an assumption: it assumes storage access (disk, SSD, etc.) is faster than transferring the data over the network. Vinod has already explained the benefits. But locality in the map stage may not always bring good things. If a fat node saves a large file, it is possible that current MR
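Chen He's assumption can be made concrete with a back-of-envelope comparison of reading a block from a local disk versus pulling it from a remote node (read it there, then ship it over the wire). The throughput figures below are illustrative assumptions, not measurements from this thread:

```python
# Back-of-envelope: local block read vs. remote read + network transfer.
# All throughput numbers are assumed ballpark figures for illustration.

BLOCK_MB = 64          # typical HDFS block size in this era
DISK_MB_S = 100.0      # assumed sequential disk read throughput
NET_MB_S = 117.0       # assumed usable throughput of a 1 GbE link

local_read_s = BLOCK_MB / DISK_MB_S
# Remote read still hits a disk somewhere, then crosses the network.
remote_read_s = BLOCK_MB / DISK_MB_S + BLOCK_MB / NET_MB_S

print(f"local: {local_read_s:.2f}s, remote: {remote_read_s:.2f}s")
```

On a much faster network (or against slow object storage like S3, as Vinod notes above), the two terms converge and locality matters far less.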

Re: Benchmark Failure

2014-03-22 Thread Lixiang Ao
Checked the logs, and it turned out to be a configuration problem. Just set dfs.namenode.fs-limits.min-block-size to 1 and it's fixed. Thanks. On Wed, Mar 19, 2014 at 2:51 PM, Brahma Reddy Battula brahmareddy.batt...@huawei.com wrote: Seems to be this issue, which is logged. Please check
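The fix described above would look like the following in hdfs-site.xml (property name taken from the message itself; as Harsh J warns later in this thread, it should be reverted once the benchmark is done):

```xml
<!-- hdfs-site.xml: allow tiny block sizes for the benchmark only.
     Revert after testing; see the warning later in this thread. -->
<property>
  <name>dfs.namenode.fs-limits.min-block-size</name>
  <value>1</value>
</property>
```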

Yarn MapReduce Job Issue - AM Container launch error in Hadoop 2.3.0

2014-03-22 Thread Tony Mullins
Hi, I have set up a 2-node cluster of Hadoop 2.3.0. It's working fine and I can successfully run the distributedshell-2.2.0.jar example. But when I try to run any MapReduce job I get an error. I have set up MapRed.xml and other configs for running MapReduce jobs according to (

Re: The reduce copier failed

2014-03-22 Thread Mahmood Naderan
Really stuck at this step. I have tested with a smaller data set and it works. Now I am using Wikipedia articles (46GB) in 600 chunks (each 64MB). I have set the number of mappers and reducers to 1 to ensure consistency, and I am running on a local node. Why doesn't the reducer report anything within 600
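The 600 above is most likely the task timeout in seconds: a task that does not read input, write output, or report status within the timeout window is killed. A hedged sketch of raising it in mapred-site.xml (the default is 600000 ms, i.e. 10 minutes; the property is mapred.task.timeout in MR1-era configs, mapreduce.task.timeout in Hadoop 2 — verify against your version):

```xml
<!-- mapred-site.xml: raise the task timeout above the default 600000 ms
     (10 minutes) so a long copy/merge phase is not killed prematurely. -->
<property>
  <name>mapred.task.timeout</name>
  <value>1800000</value> <!-- 30 minutes -->
</property>
```

Raising the timeout only hides the symptom, of course; a reducer that is silent for 10 minutes on a single local node usually points at a deeper problem.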

Re: Yarn MapReduce Job Issue - AM Container launch error in Hadoop 2.3.0

2014-03-22 Thread Vinod Kumar Vavilapalli
What is 614 here? The other relevant thing to check is the MapReduce specific config mapreduce.application.classpath. +Vinod On Mar 22, 2014, at 9:03 AM, Tony Mullins tonymullins...@gmail.com wrote: Hi, I have setup a 2 node cluster of Hadoop 2.3.0. Its working fine and I can

Re: Yarn MapReduce Job Issue - AM Container launch error in Hadoop 2.3.0

2014-03-22 Thread Tony Mullins
I also don't know what that 614 is... It's the exact and single line in the stderr of the job's logs. And regarding the MapReduce classpath, the defaults are good, as there are only two vars: $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*, $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*. Is there any other place to look

Re: Yarn MapReduce Job Issue - AM Container launch error in Hadoop 2.3.0

2014-03-22 Thread Vinod Kumar Vavilapalli
Given your earlier mail about the paths in /opt, shouldn't mapreduce classpath also point to /opt/yarn/hadoop-2.3.0 etc? +Vinod On Mar 22, 2014, at 11:33 AM, Tony Mullins tonymullins...@gmail.com wrote: That I also dont know what 614... Its the exact and single line in stderr of Jobs logs.
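A hedged illustration of what Vinod is suggesting: if the install actually lives under /opt/yarn/hadoop-2.3.0, the default $HADOOP_MAPRED_HOME-relative entries quoted earlier would be replaced with absolute paths in mapred-site.xml. The prefix below is taken from the thread; the exact layout is an assumption to verify against your own install:

```xml
<!-- mapred-site.xml: point the MR application classpath at the real
     install prefix instead of relying on $HADOOP_MAPRED_HOME expansion
     inside the AM container's environment. -->
<property>
  <name>mapreduce.application.classpath</name>
  <value>/opt/yarn/hadoop-2.3.0/share/hadoop/mapreduce/*,/opt/yarn/hadoop-2.3.0/share/hadoop/mapreduce/lib/*</value>
</property>
```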

Re: Benchmark Failure

2014-03-22 Thread Harsh J
Do not leave that configuration in after your tests are done. It would be very harmful to allow such tiny block sizes from clients, enabling them to flood your NameNode's metadata with a lot of blocks for a small file. If it's possible, instead tune NNBench's block size to be larger. On
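Harsh's warning comes down to simple arithmetic: the NameNode tracks every block in memory, so the block count, not the file size, drives metadata load. A rough sketch (the per-block memory cost is an assumed ballpark, not a measured figure):

```python
# Number of NameNode block objects a 1 GiB file produces at different
# block sizes. The ~150 bytes/object cost is a commonly cited ballpark,
# used here only for illustration.

FILE_BYTES = 1 << 30          # 1 GiB file
BYTES_PER_BLOCK_OBJECT = 150  # assumed NameNode memory per block object

def blocks_for(block_size):
    # Ceiling division: blocks needed to hold the whole file.
    return -(-FILE_BYTES // block_size)

for bs in (128 * 1024 * 1024, 1024, 1):
    n = blocks_for(bs)
    meta = n * BYTES_PER_BLOCK_OBJECT
    print(f"block size {bs:>12} B -> {n:>12} blocks, ~{meta} B of metadata")
```

At the default 128 MiB block size the file costs 8 block objects; at a 1-byte minimum it costs over a billion, which is exactly the metadata flood the message warns about.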