Re: Which Subphases Do Times on JobHistory Web UI Cover

2013-09-24 Thread Sandy Ryza
Average map time includes everything the map task is doing, i.e. all the things you mentioned. Reduce time does not cover shuffle time. Reduce time is the time spent calling the reducer function and writing its output to HDFS. Merge time is related to reduce, not map. -Sandy On Tue, Sep 24, 2

Re: Which Subphases Do Times on JobHistory Web UI Cover

2013-09-24 Thread Efe Gencer
*By the way this question is about Apache Hadoop Release 2.1.0-beta. Thanks, 2013/9/24 Efe Gencer > Hi All, > > In JobHistory Web UI under Job > "Map Tasks" I see something as follows: > ... > Started: > Finished: > Elapsed: 12 mins, 5sec > Diagnostics: > *Average Map Time*: 1 mins, 40 sec >

Re: Lz4Codec

2013-09-24 Thread Harsh J
The LZ4 codec was introduced in 2.x releases and is not present in 1.x. On Tue, Sep 24, 2013 at 8:20 PM, Tomás Fernández Pena wrote: > Hi > I'm doing a comparative between different compression codecs but I can > not find the LZ4 one. I'm using Hadoop 1.2.1. > Where is org.apache.hadoop.io.compre

Re: Yarn and Hdfs federation

2013-09-24 Thread Harsh J
Hi Manickam, Can you explain what you're really trying to achieve/construct here? YARN has little to do with Federation. One typically runs federation to divide the namespace across multiple NameNodes (lowering load compared to a single NN). YARN and MR though would run across the whole cluster i

Which Subphases Do Times on JobHistory Web UI Cover

2013-09-24 Thread Efe Gencer
Hi All, In JobHistory Web UI under Job > "Map Tasks" I see something as follows: ... Started: Finished: Elapsed: 12 mins, 5sec Diagnostics: *Average Map Time*: 1 mins, 40 sec Average Reduce Time: 12 sec Average Shuffle Time: 10 mins, 8 sec Average Merge Time: 1 sec ... 1) I wonder which sub-map

Yarn and Hdfs federation

2013-09-24 Thread Manickam P
Guys, I have installed a federated cluster with two NameNodes and three DataNodes. Now I want to have a separate RM for YARN. Can I install it like that? Will my NameNode then work like an NM? How will it work? Can I install Hive on any one of the NameNodes in the federated cluster? Pls help me to understan

Re: HDFs file-create performance

2013-09-24 Thread M. C. Srivas
Small file creation is a well-documented major problem (and bottleneck) in HDFS. You can either roll your own protocol, or use MapR, which is about 100x faster and 1000x more scalable than HDFS for this particular problem.

Lz4Codec

2013-09-24 Thread Tomás Fernández Pena
Hi I'm doing a comparison of different compression codecs, but I cannot find the LZ4 one. I'm using Hadoop 1.2.1. Where is org.apache.hadoop.io.compress.Lz4Codec? How can I use it? Regards Tomas

Memory Implications on NameNode of creating SymLinks using FileContext

2013-09-24 Thread Geovanie Marquez
Hi, Symbolic links are supported in Hadoop 2.0 through the FileContext object's createSymlink() method. I am looking at using symlinks heavily in a program that places all files for the previous month in Hadoop Archives

RE: Distributed cache in command line

2013-09-24 Thread Chandra Mohan, Ananda Vel Murugan
Hi, Thanks for the response. I can create symlinks for the files, but I don't know how to add a jar to the distributed cache. One way I found is to use the -libjars argument when running a Hadoop job. Is it possible to add a jar file directly to the distributed cache? Is there any specific folder in HDFS whi