Like you said, it depends on both the kind of network you have and the type of
your workload.
Given your point about S3, I'd guess your input files/blocks are not large
enough that moving code to data trumps moving the data itself to the code. When
that balance tilts a lot, especially when moving
Hi Mike
Data locality rests on an assumption: that storage access (disk, SSD, etc.)
is faster than transferring the data over the network. Vinod has already explained the
benefits. But locality in the map stage may not always bring good things. If a
fat node holds a large file, it is possible that the current MR
Checked the logs, and it turned out to be a configuration problem. Just
set dfs.namenode.fs-limits.min-block-size to 1 and it's fixed.
Thanks.
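For reference, that setting goes in hdfs-site.xml on the NameNode. A minimal sketch (the property name and value are taken from the message above; the surrounding XML is just the standard Hadoop config format):

```xml
<!-- hdfs-site.xml: allow clients to request very small blocks.
     For testing only; see the warning later in this thread about
     flooding the NameNode's metadata with tiny blocks. -->
<property>
  <name>dfs.namenode.fs-limits.min-block-size</name>
  <value>1</value>
</property>
```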
On Wed, Mar 19, 2014 at 2:51 PM, Brahma Reddy Battula
brahmareddy.batt...@huawei.com wrote:
Seems this is the issue, which is already logged. Please check:
Hi,
I have set up a 2-node cluster of Hadoop 2.3.0. It's working fine and I can
successfully run the distributedshell-2.2.0.jar example. But when I try to run
any MapReduce job I get an error. I have set up MapRed.xml and other configs
for running MapReduce jobs according to (
Really stuck at this step. I have tested with a smaller data set and it works. Now
I am using Wikipedia articles (46GB) in 600 chunks (each 64MB).
I have set the number of mappers and reducers to 1 to ensure consistency, and I am
running on a local node. Why doesn't the reducer report anything within 600
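As an aside, forcing a single reducer is typically done with a setting like the following — a sketch using the standard Hadoop 2.x property name; note that the map count is ultimately driven by the number of input splits, so a "number of mappers" setting is only a hint:

```xml
<!-- mapred-site.xml (or a per-job -D flag): run a single reduce task.
     The number of map tasks follows the input splits, so with
     600 x 64MB chunks the job will still launch many mappers. -->
<property>
  <name>mapreduce.job.reduces</name>
  <value>1</value>
</property>
```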
What is 614 here?
The other relevant thing to check is the MapReduce-specific config
mapreduce.application.classpath.
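That config lives in mapred-site.xml. A minimal sketch of an explicit value — the /opt/yarn/hadoop-2.3.0 prefix below is an assumption taken from the earlier mails in this thread, not a default:

```xml
<!-- mapred-site.xml: explicit MapReduce classpath. The stock default
     uses $HADOOP_MAPRED_HOME; the /opt/yarn/hadoop-2.3.0 prefix here
     is an assumption based on this thread's earlier mails. -->
<property>
  <name>mapreduce.application.classpath</name>
  <value>/opt/yarn/hadoop-2.3.0/share/hadoop/mapreduce/*,/opt/yarn/hadoop-2.3.0/share/hadoop/mapreduce/lib/*</value>
</property>
```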
+Vinod
On Mar 22, 2014, at 9:03 AM, Tony Mullins tonymullins...@gmail.com wrote:
I also don't know what 614 is... It's the exact and single line in the
job's stderr logs.
And regarding the MapReduce classpath, the defaults are good, as there are only two
entries: $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/* and
$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*.
Is there any other place to look?
Given your earlier mail about the paths in /opt, shouldn't the mapreduce classpath
also point to /opt/yarn/hadoop-2.3.0 etc.?
+Vinod
On Mar 22, 2014, at 11:33 AM, Tony Mullins tonymullins...@gmail.com wrote:
Do not leave that configuration in after your tests are done. It would be very
harmful to allow such tiny block sizes from clients, as it enables them to
flood your NameNode's metadata with a lot of blocks for a small file.
If it's possible, instead tune NNBench's block size to be larger.