Re: mapred queue -list

2013-06-14 Thread Arun C Murthy
Capacity is the 'guaranteed' capacity, while max-capacity can be configured to be 
greater than capacity.
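
As a minimal sketch (assuming a YARN CapacityScheduler with a single 'default' 
queue; the numbers are illustrative), the two knobs live in capacity-scheduler.xml, 
shown here as key = value:

  yarn.scheduler.capacity.root.default.capacity = 40
  yarn.scheduler.capacity.root.default.maximum-capacity = 80

With these values the queue is guaranteed 40% of the cluster, but can elastically 
use up to 80% when other queues are idle.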

Arun

On Jun 13, 2013, at 5:28 AM, Pedro Sá da Costa wrote:

> 
> When I launch the command "mapred queue -list" I have this output:
> 
> 
> Scheduling Info : Capacity: 100.0, MaximumCapacity: 1.0, CurrentCapacity: 0.0 
> 
> What is the difference between Capacity and  MaximumCapacity fields?
> 
> 
> 
> -- 
> Best regards,

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Save configuration data in job configuration file.

2013-01-19 Thread Arun C Murthy
jobConf.set(String, String)?
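
A minimal sketch of the round trip (the key name "my.app.marker" is made up for 
illustration):

  import org.apache.hadoop.mapred.JobConf;

  JobConf jobConf = new JobConf();
  jobConf.set("my.app.marker", "some-value");   // driver side, before submitting the job

  // Task side: the same JobConf is handed to each Mapper/Reducer via configure(JobConf),
  // so the value can be read back with:
  //   String marker = conf.get("my.app.marker");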

On Jan 19, 2013, at 7:31 AM, Pedro Sá da Costa wrote:

> Hi
> 
> I want to save some configuration data in the configuration files that
> belongs to the job. How can I do it?
> 
> -- 
> Best regards,

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Java Heap memory error : Limit to 2 Gb of ShuffleRamManager ?

2012-12-06 Thread Arun C Murthy
Oliver,

 Sorry, missed this.

 The historical reason, if I remember right, is that we used to have a single 
byte buffer and hence the limit.

 We should definitely remove it now since we don't use a single buffer. Mind 
opening a jira? 

 http://wiki.apache.org/hadoop/HowToContribute

thanks!
Arun

On Dec 6, 2012, at 8:01 AM, Olivier Varene - echo wrote:

> anyone ?
> 
> Début du message réexpédié :
> 
>> De : Olivier Varene - echo 
>> Objet : ReduceTask > ShuffleRamManager : Java Heap memory error
>> Date : 4 décembre 2012 09:34:06 HNEC
>> À : mapreduce-user@hadoop.apache.org
>> Répondre à : mapreduce-user@hadoop.apache.org
>> 
>> 
>> Hi to all,
>> first many thanks for the quality of the work you are doing : thanks a lot
>> 
>> I am facing a bug with the memory management at shuffle time, I regularly get
>> 
>> Map output copy failure : java.lang.OutOfMemoryError: Java heap space
>>  at 
>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1612)
>> 
>> 
>> reading the code in org.apache.hadoop.mapred.ReduceTask.java file
>> 
>> the "ShuffleRamManager" is limiting the maximum of RAM allocation to 
>> Integer.MAX_VALUE * maxInMemCopyUse ?
>> 
>> maxSize = (int)(conf.getInt("mapred.job.reduce.total.mem.bytes",
>>     (int)Math.min(Runtime.getRuntime().maxMemory(), Integer.MAX_VALUE))
>>   * maxInMemCopyUse);
>> 
>> Why is it so?
>> And why is it cast down to an int when its raw type is long?
>> 
>> Does it mean that you can not have a Reduce Task taking advantage of more 
>> than 2Gb of memory ?
>> 
>> To explain a little bit my use case, 
>> I am processing some 2700 maps (each working on 128 MB block of data), and 
>> when the reduce phase starts, it sometimes stumbles with java heap memory 
>> issues.
>> 
>> configuration is : java 1.6.0-27
>> hadoop 0.20.2
>> -Xmx1400m
>> io.sort.mb 400
>> io.sort.factor 25
>> io.sort.spill.percent 0.80
>> mapred.job.shuffle.input.buffer.percent 0.70
>> ShuffleRamManager: MemoryLimit=913466944, MaxSingleShuffleLimit=228366736
>> 
>> I will decrease 
>> mapred.job.shuffle.input.buffer.percent to limit the errors, but I am not 
>> fully confident for the scalability of the process.
>> 
>> Any help would be welcomed
>> 
>> once again, many thanks
>> Olivier
>> 
>> 
>> P.S: sorry if I misunderstood the code, any explanation would be really 
>> welcomed
>> 
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: [ANNOUNCE] - New user@ mailing list for hadoop users in-lieu of (common,hdfs,mapreduce)-user@

2012-08-07 Thread Arun C Murthy
Apologies (again) for the cross-post, I've filed 
https://issues.apache.org/jira/browse/INFRA-5123 to close down (common, hdfs, 
mapreduce)-user@ since user@ is functional now.

thanks,
Arun

On Aug 4, 2012, at 9:59 PM, Arun C Murthy wrote:

> All,
> 
>  Given our recent discussion (http://s.apache.org/hv), the new 
> u...@hadoop.apache.org mailing list has been created and all existing users 
> in (common,hdfs,mapreduce)-user@ have been migrated over.
> 
>  I'm in the process of changing the website to reflect this (HADOOP-8652). 
> 
>  Henceforth, please use the new mailing list for all user-related discussions.
> 
> thanks,
> Arun
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




[ANNOUNCE] - New user@ mailing list for hadoop users in-lieu of (common,hdfs,mapreduce)-user@

2012-08-04 Thread Arun C Murthy
All,

 Given our recent discussion (http://s.apache.org/hv), the new 
u...@hadoop.apache.org mailing list has been created and all existing users in 
(common,hdfs,mapreduce)-user@ have been migrated over.

 I'm in the process of changing the website to reflect this (HADOOP-8652). 

 Henceforth, please use the new mailing list for all user-related discussions.

thanks,
Arun



Re: task jvm bootstrapping via distributed cache

2012-08-03 Thread Arun C Murthy
Just do -javaagent:./profiler.jar?
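
A sketch of the wiring (paths and property values are illustrative; it assumes 
profiler.jar has already been copied into HDFS):

  import java.net.URI;
  import org.apache.hadoop.filecache.DistributedCache;
  import org.apache.hadoop.mapred.JobConf;

  JobConf conf = new JobConf();
  // Ship the jar via the DistributedCache and ask the TT to symlink it as
  // ./profiler.jar in each task's working directory (the '#' names the symlink).
  DistributedCache.addCacheFile(new URI("/apps/libs/profiler.jar#profiler.jar"), conf);
  DistributedCache.createSymlink(conf);
  // The relative path then resolves because tasks run with their cwd in the local dirs.
  conf.set("mapred.child.java.opts", "-Xmx512m -javaagent:./profiler.jar");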

On Aug 3, 2012, at 9:32 AM, Stan Rosenberg wrote:

> Arun,
> 
> I don't believe the symlink is of help.  The symlink is created in the
> task's current working directory (cwd), but I don't know what cwd is
> when I launch with 'hadoop jar ...'.
> 
> Thanks,
> 
> stan
> 
> On Fri, Aug 3, 2012 at 2:39 AM, Arun C Murthy  wrote:
>> Stan,
>> 
>> You can ask TT to create a symlink to your jar shipped via DistCache:
>> 
>> http://hadoop.apache.org/common/docs/r1.0.3/mapred_tutorial.html#DistributedCache
>> 
>> That should give you what you want.
>> 
>> hth,
>> Arun
>> 
>> On Jul 30, 2012, at 3:23 PM, Stan Rosenberg wrote:
>> 
>> Hi,
>> 
>> I am seeking a way to leverage hadoop's distributed cache in order to
>> ship jars that are required to bootstrap a task's jvm, i.e., before a
>> map/reduce task is launched.
>> As a concrete example, let's say that I need to launch with
>> '-javaagent:/path/profiler.jar'.  In theory, the task tracker is
>> responsible for downloading cached files onto its local filesystem.
>> However, the absolute path to a given cached file is not known a
>> priori; however, we need the path in order to configure '-javaagent'.
>> 
>> Is this currently possible with the distributed cache? If not, is the
>> use case appealing enough to open a jira ticket?
>> 
>> Thanks,
>> 
>> stan
>> 
>> 
>> --
>> Arun C. Murthy
>> Hortonworks Inc.
>> http://hortonworks.com/
>> 
>> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: task jvm bootstrapping via distributed cache

2012-08-02 Thread Arun C Murthy
Stan,

 You can ask TT to create a symlink to your jar shipped via DistCache:
 
http://hadoop.apache.org/common/docs/r1.0.3/mapred_tutorial.html#DistributedCache

 That should give you what you want.

hth,
Arun

On Jul 30, 2012, at 3:23 PM, Stan Rosenberg wrote:

> Hi,
> 
> I am seeking a way to leverage hadoop's distributed cache in order to
> ship jars that are required to bootstrap a task's jvm, i.e., before a
> map/reduce task is launched.
> As a concrete example, let's say that I need to launch with
> '-javaagent:/path/profiler.jar'.  In theory, the task tracker is
> responsible for downloading cached files onto its local filesystem.
> However, the absolute path to a given cached file is not known a
> priori; however, we need the path in order to configure '-javaagent'.
> 
> Is this currently possible with the distributed cache? If not, is the
> use case appealing enough to open a jira ticket?
> 
> Thanks,
> 
> stan

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: What are the basic Hadoop Java Classes?

2012-07-30 Thread Arun C Murthy
http://hadoop.apache.org/common/docs/r1.0.3/mapred_tutorial.html

Arun

On Jul 30, 2012, at 12:25 PM,   
wrote:

> Could someone give me a list of the basic Java classes that are needed to run 
> a program in Hadoop?  By basic classes I mean classes like Mapper and Reducer 
> that are essential in running programs in Hadoop. 
>  
> Thanks,
>  
> Andrew Botelho
> EMC Corporation
> 55 Constitution Blvd., Franklin, MA
> andrew.bote...@emc.com
> Mobile: 508-813-2026
>  

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Reducer - getMapOutput

2012-07-30 Thread Arun C Murthy
Hey Robert,

 Can you pls share what you are trying to do?

Arun

On Jul 30, 2012, at 9:10 AM, Grandl Robert wrote:

> Hi,
> 
> I am trying to modify the code for data transfer of intermediate output. 
> 
> In this respect, on the reduce side in getMapOuput I want to have the 
> connection with the TaskTracker(setupSecureConnection), but then on doGet the 
> TaskTracker to be able to delay the response back to the client. I tried with 
> Thread.sleep(time), before setting up the response.setHeader(FROM_MAP_TASK, 
> mapId) and the subsequent headers/content. But it seems the reducer will 
> continue to read the headers immediately and get NULL.
> 
> Do you know how I can make the reducer wait for the TaskTracker to update this 
> information before reading it? I tried a while loop reading 
> connection.getHeaderField(...) until it is not null, but it seems to stay in the 
> loop forever :(.
> 
> Probably is more a HttpServlet related problem but I am not very familiar 
> with that. 
> 
> Do you have any idea how can I do it ?
> 
> Thanks,
> Robert

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Hadoop compile - delete conf files

2012-07-17 Thread Arun C Murthy
This seems like a bug, can you pls open a jira? Thanks!

On Jul 17, 2012, at 6:43 PM, Grandl Robert wrote:

> Mayank,
> 
> Thanks a lot for your answer. Do you know how I can avoid overwritten of 
> these four files though?
> 
> Otherwise I just have to backup and copy again the old content. 
> 
> Thanks,
> Robert
> 
> From: Mayank Bansal 
> To: mapreduce-user@hadoop.apache.org; Grandl Robert  
> Sent: Tuesday, July 17, 2012 7:38 PM
> Subject: Re: Hadoop compile - delete conf files
> 
> This is how it is supposed to be. It copies the files from the src and creates
> the package.
> 
> Thanks,
> Mayank
> 
> On Tue, Jul 17, 2012 at 1:16 PM, Grandl Robert  wrote:
> Hi,
> 
> I am trying to compile hadoop from command line doing something like: 
> 
> ant compile jar run
> 
> However, it always delete the conf files content (hadoop-env.sh, 
> core-site.xml, mapred-site.xml, hdfs-site.xml)
> So I have to recover from backup these files all the time.
> 
> Does anybody face similar issues ? 
> 
> Thanks,
> Robert
> 
> 
> 
> 
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: MRv2 jobs fail when run with more than one slave

2012-07-17 Thread Arun C Murthy
> Though it could be a red herring, this warning seems to be related to the job 
> failing, despite the fact that the job makes progress on the slave running 
> the AM. The Node Manager logs on both AM and non-AM slaves appear fairly 
> similar, and I don't see any errors in the non-AM logs.
> 
> Another strange data point: These failures occur running the slaves on ARM 
> systems. Running the slaves on x86 with the same configuration works. I'm 
> using the same tarball on both, which means that the native-hadoop library 
> isn't loaded on ARM. The master/client is the same x86 system in both 
> scenarios. All nodes are running Ubuntu 12.04.
> 
> Thanks for any guidance,
> Trevor
> 
> 
> 
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: MRv2 jobs fail when run with more than one slave

2012-07-17 Thread Arun C Murthy
Trevor,

 It's hard for folks here to help you with CDH patchsets (it's their call on 
what they include), can you pls try with vanilla Apache hadoop-2.0.0-alpha and 
I'll try helping out? 

thanks,
Arun

On Jul 17, 2012, at 2:24 PM, Trevor wrote:

> Hi all,
> 
> I recently upgraded from CDH4b2 (0.23.1) to CDH4 (2.0.0). Now for some 
> strange reason, my MRv2 jobs (TeraGen, specifically) fail if I run with more 
> than one slave. For every slave except the one running the Application 
> Master, I get the following failed tasks and warnings repeatedly:
> 
> 12/07/13 14:21:55 INFO mapreduce.Job: Running job: job_1342207265272_0001
> 12/07/13 14:22:17 INFO mapreduce.Job: Job job_1342207265272_0001 running in 
> uber mode : false
> 12/07/13 14:22:17 INFO mapreduce.Job:  map 0% reduce 0%
> 12/07/13 14:22:46 INFO mapreduce.Job:  map 1% reduce 0%
> 12/07/13 14:22:52 INFO mapreduce.Job:  map 2% reduce 0%
> 12/07/13 14:22:55 INFO mapreduce.Job:  map 3% reduce 0%
> 12/07/13 14:22:58 INFO mapreduce.Job:  map 4% reduce 0%
> 12/07/13 14:23:04 INFO mapreduce.Job:  map 5% reduce 0%
> 12/07/13 14:23:07 INFO mapreduce.Job:  map 6% reduce 0%
> 12/07/13 14:23:07 INFO mapreduce.Job: Task Id : 
> attempt_1342207265272_0001_m_04_0, Status : FAILED
> 12/07/13 14:23:08 WARN mapreduce.Job: Error reading task output Server 
> returned HTTP response code: 400 for URL: http://
> perfgb0n0:8080/tasklog?plaintext=true&attemptid=attempt_1342207265272_0001_m_04_0&filter=stdout
> 12/07/13 14:23:08 WARN mapreduce.Job: Error reading task output Server 
> returned HTTP response code: 400 for URL: http://
> perfgb0n0:8080/tasklog?plaintext=true&attemptid=attempt_1342207265272_0001_m_04_0&filter=stderr
> 12/07/13 14:23:08 INFO mapreduce.Job: Task Id : 
> attempt_1342207265272_0001_m_03_0, Status : FAILED
> 12/07/13 14:23:08 WARN mapreduce.Job: Error reading task output Server 
> returned HTTP response code: 400 for URL: http://
> perfgb0n0:8080/tasklog?plaintext=true&attemptid=attempt_1342207265272_0001_m_03_0&filter=stdout
> ...
> 12/07/13 14:25:12 INFO mapreduce.Job:  map 25% reduce 0%
> 12/07/13 14:25:12 INFO mapreduce.Job: Job job_1342207265272_0001 failed with 
> state FAILED due to:
> ...
> Failed map tasks=19
> Launched map tasks=31
> 
> The HTTP 400 error appears to be generated by the ShuffleHandler, which is 
> configured to run on port 8080 of the slaves, and doesn't understand that 
> URL. What I've been able to piece together so far is that /tasklog is handled 
> by the TaskLogServlet, which is part of the TaskTracker. However, isn't this 
> an MRv1 class that shouldn't even be running in my configuration? Also, the 
> TaskTracker appears to run on port 50060, so I don't know where port 8080 is 
> coming from.
> 
> Though it could be a red herring, this warning seems to be related to the job 
> failing, despite the fact that the job makes progress on the slave running 
> the AM. The Node Manager logs on both AM and non-AM slaves appear fairly 
> similar, and I don't see any errors in the non-AM logs.
> 
> Another strange data point: These failures occur running the slaves on ARM 
> systems. Running the slaves on x86 with the same configuration works. I'm 
> using the same tarball on both, which means that the native-hadoop library 
> isn't loaded on ARM. The master/client is the same x86 system in both 
> scenarios. All nodes are running Ubuntu 12.04.
> 
> Thanks for any guidance,
> Trevor
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: What exactly does the "CPU time spent (ms)" measure?

2012-07-17 Thread Arun C Murthy
Zhu,

 Do you mean the 'slot millis' counter?

Arun

On Jul 17, 2012, at 6:13 AM, GUOJUN Zhu wrote:

> 
> Sometimes there are big discrepency between this time and the real running 
> time at the task level.  It can be significantly less than the real running 
> time; for some case, I observe that it is longer than the real running time.  
> What exactly does this counter measure? Thanks. 
> 
> Zhu, Guojun
> Modeling Sr Graduate
> 571-3824370
> guojun_...@freddiemac.com
> Financial Engineering
> Freddie Mac

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Mapper basic question

2012-07-11 Thread Arun C Murthy
Take a look at CombineFileInputFormat - this will create 'meta splits' which 
include multiple small splits, thus reducing the number of maps that are run.
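
A sketch of the driver-side wiring, assuming a release that ships 
CombineTextInputFormat (newer Hadoop lines; on older releases you would subclass 
CombineFileInputFormat yourself). The 256 MB cap is illustrative:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat;

  Configuration conf = new Configuration();
  Job job = Job.getInstance(conf, "small-files-job");
  job.setInputFormatClass(CombineTextInputFormat.class);
  // Pack many small files into splits of up to ~256 MB each, so far fewer maps run.
  CombineTextInputFormat.setMaxInputSplitSize(job, 256L * 1024 * 1024);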

Arun

On Jul 11, 2012, at 5:29 AM, Manoj Babu wrote:

> Hi,
> 
> The no of mappers is depends on the no of blocks. Is it possible to limit the 
> no of mappers size without increasing the HDFS block size?
> 
> Thanks in advance.
> 
> Cheers!
> Manoj.
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Basic question on how reducer works

2012-07-09 Thread Arun C Murthy

On Jul 9, 2012, at 12:55 PM, Grandl Robert wrote:

> Thanks a lot guys for answers. 
> 
> Still I am not able to find exactly the code for the following things:
> 
> 1. reducer to read from a Map output only its partition. I looked into 
> ReduceTask#getMapOutput which do the actual read in 
> ReduceTask#shuffleInMemory, but I don't see where it specify which partition 
> to read(reduceID).
> 

Look at TaskTracker.MapOutputServlet.

> 2. still don't understand very well in which part of the code(MapTask.java) 
> the intermediate data is written do which partition. So MapOutputBuffer is 
> the one who actually writes the data to buffer and spill after buffer is 
> full. Could you please elaborate a bit on how the data is written to which 
> partition ?
> 

Essentially you can think of the partition-id as the 'primary key' and the 
actual 'key' in the map-output as the 'secondary key'.
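
As a sketch of where that partition-id comes from, this is essentially what the 
default HashPartitioner does for each map output record; reducer i later fetches 
only partition i from every map's output:

  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Partitioner;

  public class SketchPartitioner extends Partitioner<Text, Text> {
    @Override
    public int getPartition(Text key, Text value, int numReduceTasks) {
      // The returned id tags the record on the map side (the 'primary key' above).
      return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
  }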

hth,
Arun

> Thanks,
> Robert
> 
> From: Arun C Murthy 
> To: mapreduce-user@hadoop.apache.org 
> Sent: Monday, July 9, 2012 9:24 AM
> Subject: Re: Basic question on how reducer works
> 
> Robert,
> 
> On Jul 7, 2012, at 6:37 PM, Grandl Robert wrote:
> 
>> Hi,
>> 
>> I have some questions related to basic functionality in Hadoop. 
>> 
>> 1. When a Mapper process the intermediate output data, how it knows how many 
>> partitions to do(how many reducers will be) and how much data to go in each  
>> partition for each reducer ?
>> 
>> 2. A JobTracker when assigns a task to a reducer, it will also specify the 
>> locations of intermediate output data where it should retrieve it right ? 
>> But how a reducer will know from each remote location with intermediate 
>> output what portion it has to retrieve only ?
> 
> To add to Harsh's comment. Essentially the TT *knows* where the output of a 
> given map-id/reduce-id pair is present via an output-file/index-file 
> combination.
> 
> Arun
> 
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
> 
> 
> 
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Basic question on how reducer works

2012-07-09 Thread Arun C Murthy
Robert,

On Jul 7, 2012, at 6:37 PM, Grandl Robert wrote:

> Hi,
> 
> I have some questions related to basic functionality in Hadoop. 
> 
> 1. When a Mapper process the intermediate output data, how it knows how many 
> partitions to do(how many reducers will be) and how much data to go in each  
> partition for each reducer ?
> 
> 2. A JobTracker when assigns a task to a reducer, it will also specify the 
> locations of intermediate output data where it should retrieve it right ? But 
> how a reducer will know from each remote location with intermediate output 
> what portion it has to retrieve only ?

To add to Harsh's comment. Essentially the TT *knows* where the output of a 
given map-id/reduce-id pair is present via an output-file/index-file 
combination.

Arun

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: MR job runs on CDH3 but not on CDH4

2012-07-04 Thread Arun C Murthy
It's hard for folks here to help you on CDH - please ask their own user lists.

Arun

On Jul 4, 2012, at 8:49 AM, Alan Miller wrote:

> Hi,
>  
> I’m trying to move from CDH3U3 to CDH4.
> My existing MR program works fine on CDH3U3  but I cant get it to run on CDH4.
>  
> Basically my Driver class
> 1.   queries a PG DB and writes some HashMaps to files in the Distributed 
> Cache,
> 2.   then writes some Avro files (avro 1.7.0) to HDFS,
> 3.   and then triggers a MRv1 job to process the Avro files.
>  
> The DC & Avro files get written so HDFS is working, but my job is not getting 
> started.
> I get an error:
> Exception in thread "main" java.lang.NoSuchMethodError: 
> org.apache.hadoop.ipc.RPC.getProxy()   
> …
> at 
> org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:478)
>  
> Seems the job doesn’t even get accepted .
> At  MyDriver:397 (see below) I call job.submit, but that’s it.
>  
> …..
> 12/07/04 08:16:54 INFO MyDriver: Processing data [my-data]
> 12/07/04 08:16:54 INFO MyDriver: Write Avro: [Tue Jul 03 00:00:00 PDT 2012 > 
> etime <= Tue Jul 03 23:59:59 PDT 2012]
> 12/07/04 08:19:47 INFO MyDriver: Initialized file 
> /data/in/my-data_2012-07-03.avro
> 12/07/04 08:19:49 INFO MyDriver: Read 158285 lines, Wrote 158131 records to 1 
> file(s)
> 12/07/04 08:19:49 INFO MyDriver: Wed Jul 04 08:19:49 PDT 2012 Finished avro 
> data /data/in/my-data_2012-07-03.avro
> 12/07/04 08:19:49 INFO MyDriver: Added 
> /data/cache/fd/83206b5c-8a1c-46f3-bfb2-d8c3e949a530#q_map to distributed 
> cache.
> 12/07/04 08:19:49 INFO MyDriver: Added 
> /data/cache/fd/b2ebfeb9-bdb0-489e-8186-8e18f4416224#u_map to distributed 
> cache.
> 12/07/04 08:19:49 INFO MyDriver: Added 
> /data/cache/fd/437cfd91-aa07-4c3a-b4c9-cd4ae076f7ad#r_map to distributed 
> cache.
> 12/07/04 08:19:49 INFO MyDriver: Added 
> /data/cache/fd/9554fe48-2171-423c-ba54-6249ffc882d4#m_map to distributed 
> cache.
> 12/07/04 08:19:49 INFO MyDriver: Added /data/in/y-data_2012-07-03.avro to 
> input files list.
> Exception in thread "main" java.lang.NoSuchMethodError: 
> org.apache.hadoop.ipc.RPC.getProxy(Ljava/lang/Class;JLjava/net/InetSocketAddress;Lorg/apache/hadoop/security/UserGroupInformation;Lorg/apache/hadoop/conf/Configuration;Ljavax/net/SocketFactory;)Lorg/apache/hadoop/ipc/VersionedProtocol;
> at 
> org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:478)
> at org.apache.hadoop.mapred.JobClient.init(JobClient.java:472)
> at 
> org.apache.hadoop.mapred.JobClient.(JobClient.java:455)
> at org.apache.hadoop.mapreduce.Job$1.run(Job.java:478)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
> at org.apache.hadoop.mapreduce.Job.connect(Job.java:476)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:464)
> at com.mycompany.MyDriver.run(MyDriver.java:397)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> at com.mycompany.MyDriver.runHadoopJob(MyDriver.java:308)
> at com.mycompany.MyDriver.main(MyDriver.java:1532)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
>  
>  
> Alan

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Out of memory (heap space) errors on job tracker

2012-06-10 Thread Arun C Murthy
Harsh - I'd be inclined to think it's worse than just setting 
mapreduce.jobtracker.completeuserjobs.maximum - the only case this would solve 
is if a single user submitted 25 *large* jobs (in terms of tasks) over a single 
24-hr window.

David - I'm guessing you aren't using the CapacityScheduler - that would help 
you with more controls, limits on jobs etc.

More details here: 
http://hadoop.apache.org/common/docs/r1.0.3/capacity_scheduler.html

In particular, look at the example config there and let us know if you need 
help understanding any of it.
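
For reference, a minimal mapred-site.xml sketch of both suggestions for a 0.20/1.x 
JobTracker (shown as key = value; the retention value follows Harsh's note quoted 
below and is illustrative, and the capacity-scheduler jar must also be on the JT 
classpath):

  mapred.jobtracker.taskScheduler = org.apache.hadoop.mapred.CapacityTaskScheduler
  mapred.jobtracker.completeuserjobs.maximum = 5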

Arun

On Jun 9, 2012, at 10:40 PM, Harsh J wrote:

> Hey David,
> 
> Primarily you'd need to lower down
> "mapred.jobtracker.completeuserjobs.maximum" in your mapred-site.xml
> to a value of < 25. I recommend using 5, if you don't need much
> retention of job info per user. This will help keep the JT's live
> memory usage in check and stop your crashes instead of you having to
> raise your heap all the time. There's no "leak", but this config's
> default of 100 causes much issues to JT that runs a lot of jobs per
> day (from several users).
> 
> Try it out and let us know!
> 
> On Sat, Jun 9, 2012 at 12:37 AM, David Rosenstrauch  wrote:
>> We're running 0.20.2 (Cloudera cdh3u4).
>> 
>> What configs are you referring to?
>> 
>> Thanks,
>> 
>> DR
>> 
>> 
>> On 06/08/2012 02:59 PM, Arun C Murthy wrote:
>>> 
>>> This shouldn't be happening at all...
>>> 
>>> What version of hadoop are you running? Potentially you need configs to
>>> protect the JT that you are missing, those should ensure your hadoop-1.x JT
>>> is very reliable.
>>> 
>>> Arun
>>> 
>>> On Jun 8, 2012, at 8:26 AM, David Rosenstrauch wrote:
>>> 
>>>> Our job tracker has been seizing up with Out of Memory (heap space)
>>>> errors for the past 2 nights.  After the first night's crash, I doubled the
>>>> heap space (from the default of 1GB) to 2GB before restarting the job.
>>>>  After last night's crash I doubled it again to 4GB.
>>>> 
>>>> This all seems a bit puzzling to me.  I wouldn't have thought that the
>>>> job tracker should require so much memory.  (The NameNode, yes, but not the
>>>> job tracker.)
>>>> 
>>>> Just wondering if this behavior sounds reasonable, or if perhaps there
>>>> might be a bigger problem at play here.  Anyone have any thoughts on the
>>>> matter?
>>>> 
>>>> Thanks,
>>>> 
>>>> DR
>>> 
>>> 
>>> --
>>> Arun C. Murthy
>>> Hortonworks Inc.
>>> http://hortonworks.com/
>>> 
>>> 
>>> 
>> 
>> 
> 
> 
> 
> -- 
> Harsh J

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Out of memory (heap space) errors on job tracker

2012-06-08 Thread Arun C Murthy
This shouldn't be happening at all...

What version of hadoop are you running? Potentially you need configs to protect 
the JT that you are missing, those should ensure your hadoop-1.x JT is very 
reliable.

Arun

On Jun 8, 2012, at 8:26 AM, David Rosenstrauch wrote:

> Our job tracker has been seizing up with Out of Memory (heap space) errors 
> for the past 2 nights.  After the first night's crash, I doubled the heap 
> space (from the default of 1GB) to 2GB before restarting the job.  After last 
> night's crash I doubled it again to 4GB.
> 
> This all seems a bit puzzling to me.  I wouldn't have thought that the job 
> tracker should require so much memory.  (The NameNode, yes, but not the job 
> tracker.)
> 
> Just wondering if this behavior sounds reasonable, or if perhaps there might 
> be a bigger problem at play here.  Anyone have any thoughts on the matter?
> 
> Thanks,
> 
> DR

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Idle nodes with terasort and MRv2/YARN (0.23.1)

2012-06-05 Thread Arun C Murthy
Trevor,

On May 30, 2012, at 1:38 PM, Trevor Robinson wrote:

> Jeff:
> 
> Thanks for the corroboration and advice. I can't retreat to 0.20, and
> must forge ahead with 2.0, so I'll share any progress.
> 
> Arun:
> 
> I haven't set the minimum container size. Do you know the default? Is
> there a way to easily find defaults (more complete/reliable than
> docs)?

The default is way too low (128M) which is non-optimal.

I've opened https://issues.apache.org/jira/browse/MAPREDUCE-4316 to fix it.

> 
> Thanks, I'll give that a try. How does minimum container size relate
> to settings mapreduce.map.memory.mb? Would it essentially raise my
> 768M map memory allocation to 1G? Does CapacityScheduler need any
> additional configuration to function optimally? BTW, what is the
> default scheduler?

The min. container size should be >= mapreduce.map.memory.mb.

If you have a heap size of 768M for your maps, it should be perfectly ok to 
have min-container size as 1024M.

Hmm... I'd also look into bumping mapreduce.reduce.memory.mb to be 
2*min-container-size i.e. use 2 containers.
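
Put together as a sketch (values are illustrative and simply follow the advice 
above):

yarn-site.xml:
yarn.scheduler.minimum-allocation-mb=1024

mapred-site.xml:
mapreduce.map.memory.mb=1024
mapreduce.reduce.memory.mb=2048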

The default scheduler is FifoScheduler which isn't as mature as CS (which is 
what we use for tuning etc., see 
http://hortonworks.com/blog/delivering-on-hadoop-next-benchmarking-performance/ 
for more details).

> 
> I should have MAPREDUCE-3641. I'm using 0.23.1 with CDH4b2 patches
> (and a few Java 7/Ubuntu 12.04 build patches). How does 2.0.0-alpha
> compare to 0.23.1?
> 

I'm not familiar with patchsets on CDH, but hadoop-2.0.0-alpha is a significant 
improvement on hadoop-0.23.1...

> If there's anything I can do to assist with the issue of spreading out
> map tasks, please let me know. Is there a JIRA issue for it (or if
> not, should there be)?
> 

For smaller clusters https://issues.apache.org/jira/browse/MAPREDUCE-3210 could 
help.

> Incidentally, my current benchmarking work on x86 is only a training
> ground and baseline before moving onto ARM-based systems, which have
> 4GB RAM and generally fewer, smaller (2.5" form factor) disks per
> node. It sounds like the smaller RAM will force better distribution,
> but the disk capacity/utilization situation will be more severe.
> 

Right, smaller RAM should force better distribution.

Love to get more f/b from your ARM work, thanks!

Arun

> Thanks,
> Trevor
> 
> On Tue, May 29, 2012 at 6:21 PM, Arun C Murthy  wrote:
>> What is the minimum container size? i.e.
>> yarn.scheduler.minimum-allocation-mb.
>> 
>> I'd bump it up to at least 1G and use the CapacityScheduler for performance
>> tests:
>> http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html
>> 
>> In case of teragen, the job has no locality at all (since it's just
>> generating data from 'random' input-splits) and hence you are getting them
>> stuck on fewer nodes since you have so many containers on each node.
>> 
>> The reduces should be better spread if you are using CapacityScheduler and
>> have https://issues.apache.org/jira/browse/MAPREDUCE-3641 in your build i.e.
>> hadoop-0.23.1 or hadoop-2.0.0-alpha (I'd use the latter).
>> 
>> Also, FYI, currently the CS makes the tradeoff that node-locality is almost
>> same as rack-locality and hence you might see maps not spread out for
>> terasort. I'll fix that one soon.
>> 
>> hth,
>> Arun
>> 
>> On May 29, 2012, at 2:33 PM, Trevor Robinson wrote:
>> 
>> Hello,
>> 
>> I'm trying to tune terasort on a small cluster (4 identical slave
>> nodes w/ 4 disks and 16GB RAM each), but I'm having problems with very
>> uneven load.
>> 
>> For teragen, I specify 24 mappers, but for some reason, only 2 nodes
>> out of 4 run them all, even though the web UI (for both YARN and HDFS)
>> shows all 4 nodes available. Similarly, I specify 16 reducers for
>> terasort, but the reducers seem to run on 3 nodes out of 4. Do I have
>> something configured wrong, or does the scheduler not attempt to
>> spread out the load? In addition to performing sub-optimally, this
>> also causes me to run out of disk space for large jobs, since the data
>> is not being spread out evenly.
>> 
>> Currently, I'm using these settings (not shown as XML for brevity):
>> 
>> yarn-site.xml:
>> yarn.nodemanager.resource.memory-mb=13824
>> 
>> mapred-site.xml:
>> mapreduce.map.memory.mb=768
>> mapreduce.map.java.opts=-Xmx512M
>> mapreduce.reduce.memory.mb=2304
>> mapreduce.reduce.java.opts=-Xmx2048M
>> mapreduce.task.io.sort.mb=512
>> 
>> In case it's significant, I've scripted the cluster setup and terasort
>> jobs, so everything runs back-to-back instantly, except that I poll to
>> ensure that HDFS is up and has active data nodes before running
>> teragen. I've also tried adding delays, but they didn't seem to have
>> any effect, so I don't *think* it's a start-up race issue.
>> 
>> Thanks for any advice,
>> Trevor
>> 
>> 
>> --
>> Arun C. Murthy
>> Hortonworks Inc.
>> http://hortonworks.com/
>> 
>> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Idle nodes with terasort and MRv2/YARN (0.23.1)

2012-05-29 Thread Arun C Murthy
What is the minimum container size? i.e. yarn.scheduler.minimum-allocation-mb.

I'd bump it up to at least 1G and use the CapacityScheduler for performance 
tests:
http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html

In case of teragen, the job has no locality at all (since it's just generating 
data from 'random' input-splits) and hence you are getting them stuck on fewer 
nodes since you have so many containers on each node.

The reduces should be better spread if you are using CapacityScheduler and have 
https://issues.apache.org/jira/browse/MAPREDUCE-3641 in your build i.e. 
hadoop-0.23.1 or hadoop-2.0.0-alpha (I'd use the latter).

Also, FYI, currently the CS makes the tradeoff that node-locality is almost 
same as rack-locality and hence you might see maps not spread out for terasort. 
I'll fix that one soon.

hth,
Arun

On May 29, 2012, at 2:33 PM, Trevor Robinson wrote:

> Hello,
> 
> I'm trying to tune terasort on a small cluster (4 identical slave
> nodes w/ 4 disks and 16GB RAM each), but I'm having problems with very
> uneven load.
> 
> For teragen, I specify 24 mappers, but for some reason, only 2 nodes
> out of 4 run them all, even though the web UI (for both YARN and HDFS)
> shows all 4 nodes available. Similarly, I specify 16 reducers for
> terasort, but the reducers seem to run on 3 nodes out of 4. Do I have
> something configured wrong, or does the scheduler not attempt to
> spread out the load? In addition to performing sub-optimally, this
> also causes me to run out of disk space for large jobs, since the data
> is not being spread out evenly.
> 
> Currently, I'm using these settings (not shown as XML for brevity):
> 
> yarn-site.xml:
> yarn.nodemanager.resource.memory-mb=13824
> 
> mapred-site.xml:
> mapreduce.map.memory.mb=768
> mapreduce.map.java.opts=-Xmx512M
> mapreduce.reduce.memory.mb=2304
> mapreduce.reduce.java.opts=-Xmx2048M
> mapreduce.task.io.sort.mb=512
> 
> In case it's significant, I've scripted the cluster setup and terasort
> jobs, so everything runs back-to-back instantly, except that I poll to
> ensure that HDFS is up and has active data nodes before running
> teragen. I've also tried adding delays, but they didn't seem to have
> any effect, so I don't *think* it's a start-up race issue.
> 
> Thanks for any advice,
> Trevor

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Good learning resources for 0.23?

2012-05-23 Thread Arun C Murthy
Keith,

 Happy to help.

 When you say the .23 API, do you mean writing your own applications on top 
of YARN?

 If so, you can start with hadoop-2 release docs:
 
http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html

 There is also an example application (DistributedShell) you can look at for a 
simpler usage of YARN apis:
http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/

 As you are probably aware, MapReduce applications themselves don't need to change 
when you move to using hadoop-2.

thanks,
Arun

On May 23, 2012, at 3:18 PM, Keith Wiley wrote:

> I have already preordered the third edition of Tom's book (obviously, I don't 
> have it yet since it won't be published until the end of the month), but 
> aside from that, I'm looking for good resources for learning how to program 
> to the .23 API.  I have found several websites and articles that discuss the 
> philosophical differences between .20 and .23 but I'm looking for 
> teaching/learning resources for getting into the guts and actually 
> programming the thing.  I'm pretty competent at .20 so I'm not looking for 
> starter-level hadoop stuff.  Rather, I'm looking for transitional resources 
> to learn the specifics of the new design.
> 
> Any ideas?  How are people vetted on the older versions of hadoop learning 
> the way of the YARN?
> 
> Thanks.
> 
> 
> Keith Wiley kwi...@keithwiley.com keithwiley.com
> music.keithwiley.com
> 
> "And what if we picked the wrong religion?  Every week, we're just making God
> madder and madder!"
>       --  Homer Simpson
> 
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: API to get info for deprecated key

2012-05-22 Thread Arun C Murthy
Good point. Please file a jira to add the new key to the deprecation warning. 
Thanks.

On May 22, 2012, at 12:52 AM, Subroto wrote:

> Hi,
> 
> Though this question may relate to Hadoop-Common project but, I faced the 
> concern while working with MR.
> The current version of Hadoop deprecates many keys but, takes care of adding 
> the new keys to the configuration accordingly.
> For the end user only logs are there which indicates the key is deprecated 
> and the message also suggests the new key to be used.
> 
> I was thinking; there should be an API/Utility which could probably provide 
> the new key information when called with old key.
> Using this API the user can judge in runtime to use the old key or new key.
> Currently the Configuration provides an API only to check whether a key is 
> deprecated or not but, doesn't provides a way to get the corresponding new 
> key.
> 
> Cheers,
> Subroto Sanyal

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Hadoop YARN/MapReduce Meetup during Hadoop Summit 2012

2012-05-14 Thread Arun C Murthy
Folks,

 I thought I'd drop a note and let folks know that I've scheduled a Hadoop 
YARN/MapReduce meetup during Hadoop Summit, June 2012.

 The agenda is: 
 # YARN - State of the art
 # YARN futures
  - Preemption
  - Resource Isolation
  - Multi-resource scheduling
 # Implementing new YARN frameworks 
 # MapReduce futures
  - What next for Hadoop MR framework?

If you are interested, please sign up at: 
http://www.meetup.com/Hadoop-Contributors/events/64747342/

I look forward to a fun (technical) conversation and to put faces to names!

thanks,
Arun

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Terasort

2012-05-10 Thread Arun C Murthy
Changing subject...

On May 10, 2012, at 3:40 PM, Jeffrey Buell wrote:

> I have the right #slots to fill up memory across the cluster, and all those 
> slots are filled with tasks. The problem I ran into was that the maps grabbed 
> all the slots initially and the reduces had a hard time getting started.  As 
> maps finished, more maps were started and only rarely was a reduce started.  
> I assume this behavior occurred because I had ~4000 map tasks in the queue, 
> but only ~100 reduce tasks.  If the scheduler lumps maps and reduces 
> together, then whenever a slot opens up it will almost surely be taken by a 
> map task.  To get good performance I need all reduce tasks started early on, 
> and have only map tasks compete for open slots.  Other apps may need 
> different priorities between maps and reduces.  In any case, I don’t 
> understand how treating maps and reduces the same is workable.
>  

Are you playing with YARN or MR1?

IAC, you are getting hit by 'slowstart' for reduces where-in reduces aren't 
scheduled till sufficient % of maps are completed.

Set mapred.reduce.slowstart.completed.maps to 0. (That should work for either 
MR1 or MR2).
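
For example (the jar name and paths are illustrative; the -D works because the 
terasort driver goes through ToolRunner):

  hadoop jar hadoop-examples.jar terasort \
    -Dmapred.reduce.slowstart.completed.maps=0.00 \
    /terasort/input /terasort/output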

Arun

> Jeff
>  
> From: Arun C Murthy [mailto:a...@hortonworks.com] 
> Sent: Thursday, May 10, 2012 1:27 PM
> To: mapreduce-user@hadoop.apache.org
> Subject: Re: max 1 mapper per node
>  
> For terasort you want to fill up your entire cluster with maps/reduces as 
> fast as you can to get the best performance.
>  
> Just play with #slots.
>  
> Arun
>  
> On May 9, 2012, at 12:36 PM, Jeffrey Buell wrote:
> 
> 
> Not to speak for Radim, but what I’m trying to achieve is performance at 
> least as good as 0.20 for all cases.  That is, no regressions.  For something 
> as simple as terasort, I don’t think that is possible without being able to 
> specify the max number of mappers/reducers per node.  As it is, I see 
> slowdowns as much as 2X.  Hopefully I’m wrong and somebody will straighten me 
> out.  But if I’m not, adding such a feature won’t lead to bad behavior of any 
> kind since the default could be set to unlimited and thus have no effect 
> whatsoever.
>  
> I should emphasize that I support the goal of greater automation since Hadoop 
> has way too many parameters and is so hard to tune.  Just not at the expense 
> of performance regressions. 
>  
> Jeff
>  
>  
> We've been against these 'features' since it leads to very bad behaviour 
> across the cluster with multiple apps/users etc.
>  
> What is your use-case i.e. what are you trying to achieve with this?
>  
> thanks,
> Arun
>  
> On May 3, 2012, at 5:59 AM, Radim Kolar wrote:
> 
> 
> 
> if plugin system for AM is overkill, something simpler can be made like:
> 
> maximum number of mappers per node
> maximum number of reducers per node
> 
> maximum percentage of non data local tasks
> maximum percentage of rack local tasks
> 
> and set this in job properties.
>  
>  
>  
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
> 
>  

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: max 1 mapper per node

2012-05-10 Thread Arun C Murthy
For terasort you want to fill up your entire cluster with maps/reduces as fast 
as you can to get the best performance.

Just play with #slots.

Arun

On May 9, 2012, at 12:36 PM, Jeffrey Buell wrote:

> Not to speak for Radim, but what I’m trying to achieve is performance at 
> least as good as 0.20 for all cases.  That is, no regressions.  For something 
> as simple as terasort, I don’t think that is possible without being able to 
> specify the max number of mappers/reducers per node.  As it is, I see 
> slowdowns as much as 2X.  Hopefully I’m wrong and somebody will straighten me 
> out.  But if I’m not, adding such a feature won’t lead to bad behavior of any 
> kind since the default could be set to unlimited and thus have no effect 
> whatsoever.
>  
> I should emphasize that I support the goal of greater automation since Hadoop 
> has way too many parameters and is so hard to tune.  Just not at the expense 
> of performance regressions. 
>  
> Jeff
>  
>  
> We've been against these 'features' since it leads to very bad behaviour 
> across the cluster with multiple apps/users etc.
>  
> What is your use-case i.e. what are you trying to achieve with this?
>  
> thanks,
> Arun
>  
> On May 3, 2012, at 5:59 AM, Radim Kolar wrote:
> 
> 
> if plugin system for AM is overkill, something simpler can be made like:
> 
> maximum number of mappers per node
> maximum number of reducers per node
> 
> maximum percentage of non data local tasks
> maximum percentage of rack local tasks
> 
> and set this in job properties.
>  
>  

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: max 1 mapper per node

2012-05-09 Thread Arun C Murthy
We've been against these 'features' since it leads to very bad behaviour across 
the cluster with multiple apps/users etc.

What is your use-case i.e. what are you trying to achieve with this?

thanks,
Arun

On May 3, 2012, at 5:59 AM, Radim Kolar wrote:

> if plugin system for AM is overkill, something simpler can be made like:
> 
> maximum number of mappers per node
> maximum number of reducers per node
> 
> maximum percentage of non data local tasks
> maximum percentage of rack local tasks
> 
> and set this in job properties.

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Cleanup after a Job

2012-04-29 Thread Arun C Murthy
Use OutputCommitter.(abortJob, commitJob):
http://hadoop.apache.org/common/docs/r1.0.2/api/org/apache/hadoop/mapred/OutputCommitter.html
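
A minimal sketch using the new (mapreduce) API, assuming a release where 
OutputCommitter exposes commitJob as in the r1.0.2 javadoc above; the job's 
OutputFormat has to return this committer from getOutputCommitter(), and the 
cleanup body is a placeholder for whatever per-job work is needed (abortJob is 
the failure-path counterpart):

  import java.io.IOException;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.mapreduce.JobContext;
  import org.apache.hadoop.mapreduce.TaskAttemptContext;
  import org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter;

  public class CleanupCommitter extends FileOutputCommitter {
    public CleanupCommitter(Path outputPath, TaskAttemptContext context) throws IOException {
      super(outputPath, context);
    }
    @Override
    public void commitJob(JobContext context) throws IOException {
      super.commitJob(context);   // promote the task outputs as usual
      // then e.g. rename/move files so the dependent job in the JobControl can find them
    }
  }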

Arun

On Apr 26, 2012, at 4:44 PM, kasi subrahmanyam wrote:

> Hi 
> 
> I have few jobs added to a Job controller .
> I need a afterJob() to be executed after the completion of s Job.
> For example
> 
> Here i am actually overriding the Job of JobControl.
> I have Job2 depending on the output of Job1.This input for Job2is obtained 
> after doing some File System operations on the output of the Job1.This 
> operation should happen in a afterJob( ) method while is available for each 
> Job.How do i make sure that afterJob () method is called for each Job added 
> to the controller before running the jobs that are depending on it.
> 
> 
> Thanks

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: JVM reuse option in MRV2

2012-04-11 Thread Arun C Murthy
Not yet, coming soon via https://issues.apache.org/jira/browse/MAPREDUCE-3902.

Arun

On Apr 9, 2012, at 5:03 PM, ramgopal wrote:

> Hi,
> Is there a way to specify JVM reuse for YARN applications as in MRv1?
>  
>  
> Regards,
> Ramgopal
>  
>  
> ***
> This e-mail and attachments contain confidential information from HUAWEI, 
> which is intended only for the person or entity whose address is listed 
> above. Any use of the information contained herein in any way (including, but 
> not limited to, total or partial disclosure, reproduction) by persons other than 
> the intended recipient(s) is prohibited. If you receive this e-mail in error, 
> please notify the sender by phone or email immediately and delete it!
>  
>  

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: CompressionCodec in MapReduce

2012-04-11 Thread Arun C Murthy
You can write your own InputFormat (IF) which extends FileInputFormat.

In your IF you get the InputSplit which has the filename during the call to 
getRecordReader. That is the hook you are looking for.

More details here:
http://hadoop.apache.org/common/docs/r1.0.2/mapred_tutorial.html#Job+Input
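
A minimal sketch with the old 'mapred' API (the per-file codec/key wiring itself is 
application-specific and omitted):

  import java.io.IOException;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.FileInputFormat;
  import org.apache.hadoop.mapred.FileSplit;
  import org.apache.hadoop.mapred.InputSplit;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.LineRecordReader;
  import org.apache.hadoop.mapred.RecordReader;
  import org.apache.hadoop.mapred.Reporter;

  public class PerFileInputFormat extends FileInputFormat<LongWritable, Text> {
    @Override
    public RecordReader<LongWritable, Text> getRecordReader(
        InputSplit split, JobConf job, Reporter reporter) throws IOException {
      // The file being read is known here, before any records are produced.
      String fileName = ((FileSplit) split).getPath().getName();
      // ... look up the key for fileName and configure the codec accordingly ...
      return new LineRecordReader(job, (FileSplit) split);
    }
  }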

hth,
Arun

On Apr 11, 2012, at 2:53 PM, Grzegorz Gunia wrote:

> I think we misunderstood here.
> 
> I'll base my question upon an example:
> Lets say I want each of the files stored on my hdfs to be encrypted prior to 
> being physically stored on the cluster.
> For that I'll write a custom CompressionCodec, that performs the encryption, 
> and use it during any edits/creations of files in the HDFS.
> Then to make it more secure I'll make it so it uses different keys for 
> different files, and supply the keys to the codec during its instantiation.
> 
> Now I'd like to do a MapReduce job on those files. That would require 
> instantiating the codec, and supplying it with the filename, to determine the 
> key used. Is it possible to do so with the current implementation of Hadoop?
> 
> --
> Greg
> 
> W dniu 2012-04-11 10:44, Zizon Qiu pisze:
>> 
>> If your are:
>> 1. using TextInputFormat.
>> 2.all input files are ends with certain suffix like ".gz"
>> 3.the custom CompressionCodec already register  in configuration and 
>> getDefaultExtension return the same suffix like as describe in 2.
>> 
>> then there is nothing else you need to do.
>> hadoop will deal with it automatically.
>> 
>> that means the input key & value in the map method are already decompressed.
>> 
>> But if the original files do not end with a certain suffix, you need to write 
>> your own InputFormat or subclass TextInputFormat and override the 
>> createRecordReader method to return your own RecordReader.
>> The InputSplit passed to the InputFormat is actually a FileSplit, from which you 
>> can retrieve the input file path.
>> 
>> You may also take a look at the isSplitable method declared in 
>> InputFormat, if your files are not splittable.
>> 
>> For more detail, refer to the TextInputFormat class implementation.
>> 
>> On Wed, Apr 11, 2012 at 4:16 PM, Grzegorz Gunia  
>> wrote:
>> Thanks for your reply! That clears some things up.
>> There is but one problem... My CompressionCodec has to be instantiated on a 
>> per-file basis, meaning it needs to know the name of the file it is to 
>> compress/decompress. I'm guessing that would not be possible with the 
>> current implementation?
>> 
>> Or if so, how would I proceed with injecting it with the file name?
>> --
>> Greg
>> 
>> W dniu 2012-04-11 10:12, Zizon Qiu pisze:
>>> Append your custom codec's full class name to "io.compression.codecs", either 
>>> in mapred-site.xml or in the configuration object passed to the Job constructor.
>>> 
>>> The map reduce framework will try to guess the compression algorithm using the 
>>> input file's suffix.
>>> 
>>> If any CompressionCodec.getDefaultExtension() registered in the configuration 
>>> matches the suffix, hadoop will try to instantiate the codec and decompress 
>>> for you automatically.
>>> 
>>> The default value for "io.compression.codecs" is 
>>> "org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec"
>>> 
>>> On Wed, Apr 11, 2012 at 3:55 PM, Grzegorz Gunia 
>>>  wrote:
>>> Hello,
>>> I am trying to apply a custom CompressionCodec to work with MapReduce jobs, 
>>> but I haven't found a way to inject it during the reading of input data, or 
>>> during the write of the job results.
>>> Am I missing something, or is there no support for compressed files in the 
>>> filesystem?
>>> 
>>> I am well aware of how to set it up to work during the intermitent phases 
>>> of the MapReduce operation, but I just can't find a way to apply it BEFORE 
>>> the job takes place...
>>> Is there any other way except simply uncompressing the files I need prior 
>>> to scheduling a job?
>>> 
>>> Huge thanks for any help you can give me!
>>> --
>>> Greg
>>> 
>> 
>> 
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Question on adding new MR application into hadoop.examples.jar

2012-03-19 Thread Arun C Murthy

On Mar 19, 2012, at 12:05 PM, Qu Chen wrote:

> I want to add new MR applications into hadoop-0.20.2-examples.jar. How to do 
> that? 
> 

As a normal enhancement to Hadoop - more details here: 
http://wiki.apache.org/hadoop/HowToContribute

thanks!
Arun

> I have set up Hadoop 0.20.2 development in eclipse. 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: MR output to a file instead of directory?

2012-03-04 Thread Arun C Murthy
I'm not sure about the usecase, but if you really care you can use an existing 
directory (e.g. /) by writing a bit of code to bypass the check for output-dir 
existence...

By default FIleOutputFormat assumes the output-dir shouldn't exist and will 
error out during init if it does. You could customize it to not bother to check.
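
A sketch with the new API: it keeps TextOutputFormat's behaviour but drops the 
existence check (a production version should also keep the delegation-token setup 
that FileOutputFormat.checkOutputSpecs does on secure clusters):

  import java.io.IOException;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.mapreduce.JobContext;
  import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

  public class ExistingDirTextOutputFormat<K, V> extends TextOutputFormat<K, V> {
    @Override
    public void checkOutputSpecs(JobContext job) throws IOException {
      Path outDir = getOutputPath(job);
      if (outDir == null) {
        throw new IOException("Output directory not set.");
      }
      // deliberately no "output directory already exists" error here
    }
  }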

Arun

On Mar 2, 2012, at 4:38 PM, Jianhui Zhang wrote:

> Hi all,
> 
> The FileOutputFormat/FileOutputCommitter always treats an output path
> as a directory and write files under it, even if there is only one
> Reducer. Is there any way to configure an OutputFormat to write all
> data into a file?
> 
> Thanks,
> James

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: reducers output

2012-02-04 Thread Arun C Murthy
Alieh - you might want to look at MR concepts (job, input, output etc.).

The short answer is *hadoop* doesn't decide anything; it's up to the MR job to 
decide input/output etc.

You might want to read through this:
http://hadoop.apache.org/common/docs/stable/mapred_tutorial.html

Arun

On Feb 3, 2012, at 11:46 PM, Alieh Saeedi wrote:

> Hi
> 1- How does Hadoop decide where to save file blocks (I mean all files, including 
> files written by reducers)? Could you please give me a reference link?
> 
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Best practices for hadoop shuffling/tunning ?

2012-01-31 Thread Arun C Murthy
Moving to mapreduce-user@, bcc common-user@. Please use project specific lists.

Your io.sort.mb is too high. You only have 1G of heap for the map. Reduce 
parallel copies is too high too.
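
As an illustrative starting point only (not a measured recommendation) for a task 
heap of about 1 GB:

io.sort.mb=256
mapred.reduce.parallel.copies=20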

On Jan 30, 2012, at 4:50 AM, praveenesh kumar wrote:

> Hey guys,
> 
> Just wanted to ask, are there any sort of best practices to be followed for
> hadoop shuffling improvements ?
> 
> I am running Hadoop 0.20.205 on 8 nodes cluster.Each node is 24 cores/CPUs
> with 48 GB RAM.
> 
> I have set the following parameters :
> 
> fs.inmemory.size.mb=2000
> io.sort.mb=2000
> io.sort.factor=200
> io.file.buffer.size=262544
> 
> mapred.map.tasks=200
> mapred.reduce.tasks=40
> mapred.reduce.parallel.copies=80
> mapred.map.child.java.opts = 1024 Mb
> mapred.map.reduce.java.opts=1024 Mb
> 
> mapred.job.tracker.handler.count=60
> tasktracker.http.threads=50
> mapred.job.reuse.jvm.num.tasks = -1
> mapred.compress.map.output = true
> mapred.reduce.slowstart.completed.maps = 0.5
> 
> mapred.tasktracker.map.tasks.maximum=24
> mapred.tasktracker.reduce.tasks.maximum=12
> 
> 
> Can anyone please validate the above tuning parameters, and suggest any
> further improvements ?
> My mappers are running fine. Shuffling and reducing part is comparatively
> slower, than expected for normal jobs. Wanted to know what I am doing
> wrong/missing.
> 
> Thanks,
> Praveenesh

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: hadoop-1.0.0 and errors with log.index

2012-01-31 Thread Arun C Murthy
Anything in TaskTracker logs ?

On Jan 31, 2012, at 10:18 AM, Markus Jelsma wrote:

> In our case, which seems to be the same problem, the web UI does not show 
> anything useful except the first line of the stack trace:
> 
> 2012-01-03 21:16:27,256 WARN org.apache.hadoop.mapred.TaskLog: Failed to
> retrieve stdout log for task: attempt_201201031651_0008_m_000233_0
> 
> Only the task tracker log shows a full stack trace. This happened on 1.0.0 
> and 
> 0.20.205.0 but not 0.20.203.0.
> 
> 2012-01-03 21:16:27,256 WARN org.apache.hadoop.mapred.TaskLog: Failed to
> retrieve stdout log for task: attempt_201201031651_0008_m_000233_0
> java.io.FileNotFoundException:
> /opt/hadoop/hadoop-0.20.205.0/libexec/../logs/userlogs/job_201201031651_0008/attempt_201201031651_0008_m_000233_0/log.index
> (No such file or directory)
> at java.io.FileInputStream.open(Native Method)
> at java.io.FileInputStream.(SecureIOUtils.java:102)
> at
> org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:187)
> at org.apache.hadoop.mapred.TaskLog$Reader.(TaskLogServlet.java:81)
> at
> org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> at
> org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
> at
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
> at
> org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
> at
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> at
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> at
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> at
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
> at
> org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> at
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> at
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> at org.mortbay.jetty.Server.handle(Server.java:326)
> at
> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> at
> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> 
> 
>> Actually, all that is telling you is that the task failed and the
>> job-client couldn't display the logs.
>> 
>> Can you check the JT web-ui and see why the task failed ?
>> 
>> If you don't see anything there, you can try see the TaskTracker logs on
>> the node on which the task ran.
>> 
>> Arun
>> 
>> On Jan 31, 2012, at 3:21 AM, Marcin Cylke wrote:
>>> Hi
>>> 
>>> I've upgraded my hadoop cluster to version 1.0.0. The upgrade process
>>> went relatively smoothly but it rendered the cluster inoperable due to
>>> errors in jobtrackers operation:
>>> 
>>> # in job output
>>> Error reading task
>>> outputhttp://hadoop4:50060/tasklog?plaintext=true&attemptid=attempt_20120
>>> 1311241_0003_m_04_2&filter=stdout
>>> 
>>> # in each of the jobtrackers' logs
>>> WARN org.apache.hadoop.mapred.TaskLog: Failed to retrieve stderr log for
>>> task: attempt_201201311241_0003_r_00_1
>>> java.io.FileNotFoundException:
>>> /usr/lib/hadoop-1.0.0/libexec/../logs/userlogs/job_201201311241_0003/atte
>>> mpt_201201311241_0003_r_00_1/log.index (No such file or directory)
>>> 
>>>at java.io.FileInputStream.open(Native Method)
>>> 
>>> These errors seem related to this two problems:
>>> 
>>> http://grokbase.com/t/hadoop.apache.org/mapreduce-user/2012/01/error-read
>>> ing-task-output-and-log-filenotfoundexceptions/03mjwctewcnxlgp2jkcrhvsgep
>>> 4e
>>> 
>>> https://issues.apache.org/jira/browse/MAPREDUCE-2846
>>> 
>>> But I've looked into the source code and the fix from MAPREDUCE-2846 is
>>> there. Perhaps there is some other reason?
>>> 
>>> Regards
>>> Marcin
>> 
>> --
>> Arun C. Murthy
>> Hortonworks Inc.
>> http://hortonworks.com/

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: hadoop-1.0.0 and errors with log.index

2012-01-31 Thread Arun C Murthy
Actually, all that is telling you is that the task failed and the job-client 
couldn't display the logs.

Can you check the JT web-ui and see why the task failed ?

If you don't see anything there, you can try looking at the TaskTracker logs on the 
node on which the task ran.

Arun

On Jan 31, 2012, at 3:21 AM, Marcin Cylke wrote:

> Hi
> 
> I've upgraded my hadoop cluster to version 1.0.0. The upgrade process
> went relatively smoothly but it rendered the cluster inoperable due to
> errors in jobtrackers operation:
> 
> # in job output
> Error reading task
> outputhttp://hadoop4:50060/tasklog?plaintext=true&attemptid=attempt_201201311241_0003_m_04_2&filter=stdout
> 
> # in each of the jobtrackers' logs
> WARN org.apache.hadoop.mapred.TaskLog: Failed to retrieve stderr log for
> task: attempt_201201311241_0003_r_00_1
> java.io.FileNotFoundException:
> /usr/lib/hadoop-1.0.0/libexec/../logs/userlogs/job_201201311241_0003/attempt_201201311241_0003_r_00_1/log.index
>  
> (No such file or directory)
> at java.io.FileInputStream.open(Native Method)
> 
> 
> These errors seem related to this two problems:
> 
> http://grokbase.com/t/hadoop.apache.org/mapreduce-user/2012/01/error-reading-task-output-and-log-filenotfoundexceptions/03mjwctewcnxlgp2jkcrhvsgep4e
> 
> https://issues.apache.org/jira/browse/MAPREDUCE-2846
> 
> But I've looked into the source code and the fix from MAPREDUCE-2846 is
> there. Perhaps there is some other reason?
> 
> Regards
> Marcin

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: reducers outputs

2012-01-29 Thread Arun C Murthy
The NameNode doesn't care how you wrote the file, i.e. whether via 'bin/hadoop 
dfs -put <>' or via a MR job.

Arun
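
For illustration, if the goal is to write reduce output under a chosen name rather 
than the default part-NNNNN files, the old-API MultipleOutputs is one option. A 
minimal sketch, where the named output "mydata" and the Text key/value types are 
purely illustrative:

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.lib.MultipleOutputs;

public class NamedOutputReducer extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {

  private MultipleOutputs mos;

  @Override
  public void configure(JobConf conf) {
    mos = new MultipleOutputs(conf);
  }

  @Override
  public void reduce(Text key, Iterator<Text> values,
      OutputCollector<Text, Text> out, Reporter reporter) throws IOException {
    while (values.hasNext()) {
      // Written into the job output directory under a name derived from
      // "mydata"; to the NameNode it is a file like any other.
      mos.getCollector("mydata", reporter).collect(key, values.next());
    }
  }

  @Override
  public void close() throws IOException {
    mos.close();
  }

  // Driver side (illustrative): declare the named output before submitting:
  //   MultipleOutputs.addNamedOutput(conf, "mydata",
  //       org.apache.hadoop.mapred.TextOutputFormat.class, Text.class, Text.class);
}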

On Jan 29, 2012, at 10:09 PM, aliyeh saeedi wrote:

> I studied it, but I could not get the point. I mean if I save reducer's 
> output with my own selected names, does NameNode behave with them like other 
> files?
> regards.
> 
> From: Ashwanth Kumar 
> To: mapreduce-user@hadoop.apache.org; aliyeh saeedi  
> Sent: Monday, 30 January 2012, 9:25
> Subject: Re: Fw: reducers outputs
> 
> You should have a look at this -  
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/FileOutputFormat.html
>  
> 
>  - Ashwanth Kumar
> 
> On Mon, Jan 30, 2012 at 11:17 AM, aliyeh saeedi  wrote:
> 
> 
> 
> I want to save them with my own names, How NameNode will keep their names?
> 
> From: Joey Echeverria 
> To: mapreduce-user@hadoop.apache.org; aliyeh saeedi  
> Sent: Sunday, 29 January 2012, 17:10
> Subject: Re: reducers outputs
> 
> Reduce output is normally stored in HDFS, just like your other files.
> Are you seeing different behavior?
> 
> -Joey
> 
> On Sun, Jan 29, 2012 at 1:05 AM, aliyeh saeedi  wrote:
> > Hi
> > I want to save reducers outputs like other files in Hadoop. Does NameNode
> > keep any information about them? How can I do this?
> > Or can I add a new component to Hadoop like NameNode and make JobTracker to
> > consult with it too (I mean I want to make JobTracker to consult with
> > NameNode AND myNewComponent both)?
> 
> 
> 
> -- 
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Reduce node in 0.23

2012-01-19 Thread Arun C Murthy
Currently it's a random node.

You might be interested in https://issues.apache.org/jira/browse/MAPREDUCE-199.

Arun

On Jan 19, 2012, at 10:10 AM, Ann Pal wrote:

> Hi,
> How is the reduce node choosen in 0.23? What parameters determine choosing 
> the reduce node. Does it depend on map node placement?
> 
> Thanks!



Re: Container size

2012-01-17 Thread Arun C Murthy
Sorry, wasn't clear - default is 1024 for both schedulers. Use those configs to 
tune them.

Arun

On Jan 17, 2012, at 11:56 PM, Arun C Murthy wrote:

> The default is 1000MB.
> 
> Try bumping down yarn.scheduler.fifo.minimum-allocation-mb to 100 (default is 
> 1024) for FifoScheduler or  yarn.scheduler.capacity.minimum-allocation-mb for 
> CapacityScheduler.
> 
> Arun
> 
> On Jan 17, 2012, at 11:37 PM, raghavendhra rahul wrote:
> 
>> What could be the minimum size.Because my cluster size is 2.5GB
>> When i set 100 mb for each container only 2 instances are launched in each 
>> node.
>> When i set 1000 mb for each container even then 2 instances are launched in 
>> each node.
>> What could be the problem.
>> 
>> Sorry for the cross posting...
>> 
>> On Wed, Jan 18, 2012 at 1:01 PM, Arun C Murthy  wrote:
>> Removing common-user@, please do not cross-post.
>> 
>> On Jan 17, 2012, at 11:24 PM, raghavendhra rahul wrote:
>> 
>> > Hi,
>> >
>> > What is the minimum size of the container in hadoop yarn.
>> >capability.setmemory(xx);
>> 
>> The AM gets this information from RM via the return value for 
>> AMRMProtocol.registerApplicationMaster.
>> 
>> Arun
>> 
>> 
> 



Re: Container size

2012-01-17 Thread Arun C Murthy
The default is 1000MB.

Try bumping down yarn.scheduler.fifo.minimum-allocation-mb to 100 (default is 
1024) for FifoScheduler or  yarn.scheduler.capacity.minimum-allocation-mb for 
CapacityScheduler.

Arun
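
For reference, a small sketch of overriding those minimums. The values are 
illustrative, and on a real cluster these properties normally live in 
yarn-site.xml / capacity-scheduler.xml on the ResourceManager rather than in 
client code:

import org.apache.hadoop.conf.Configuration;

public class MinAllocation {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Illustrative values only; normally set in the RM's configuration files.
    conf.setInt("yarn.scheduler.fifo.minimum-allocation-mb", 100);
    conf.setInt("yarn.scheduler.capacity.minimum-allocation-mb", 100);
    System.out.println("fifo minimum-allocation-mb = "
        + conf.getInt("yarn.scheduler.fifo.minimum-allocation-mb", 1024));
  }
}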

On Jan 17, 2012, at 11:37 PM, raghavendhra rahul wrote:

> What could be the minimum size.Because my cluster size is 2.5GB
> When i set 100 mb for each container only 2 instances are launched in each 
> node.
> When i set 1000 mb for each container even then 2 instances are launched in 
> each node.
> What could be the problem.
> 
> Sorry for the cross posting...
> 
> On Wed, Jan 18, 2012 at 1:01 PM, Arun C Murthy  wrote:
> Removing common-user@, please do not cross-post.
> 
> On Jan 17, 2012, at 11:24 PM, raghavendhra rahul wrote:
> 
> > Hi,
> >
> > What is the minimum size of the container in hadoop yarn.
> >capability.setmemory(xx);
> 
> The AM gets this information from RM via the return value for 
> AMRMProtocol.registerApplicationMaster.
> 
> Arun
> 
> 



Re: Container size

2012-01-17 Thread Arun C Murthy
Removing common-user@, please do not cross-post.

On Jan 17, 2012, at 11:24 PM, raghavendhra rahul wrote:

> Hi,
> 
> What is the minimum size of the container in hadoop yarn.
>capability.setmemory(xx);

The AM gets this information from RM via the return value for 
AMRMProtocol.registerApplicationMaster.

Arun



Re: Yarn Container Memory

2012-01-12 Thread Arun C Murthy
What scheduler are you using?

On Jan 11, 2012, at 11:48 PM, raghavendhra rahul wrote:

> min container size 100mb
> AM size is 1000mb
> 
> On Thu, Jan 12, 2012 at 1:06 PM, Arun C Murthy  wrote:
> What is your min container size?
> 
> How much did you allocate to AM itself?
> 
> On Jan 11, 2012, at 9:51 PM, raghavendhra rahul wrote:
> 
>> Any suggestions..
>> 
>> On Wed, Jan 11, 2012 at 2:09 PM, raghavendhra rahul 
>>  wrote:
>> Hi,
>>  I formed a hadoop cluster with 3 nodes of 3500mb alloted for 
>> containers in each node.
>> 
>> In the appmaster i set resourcecapability as 1000 and total containers as 20
>> Now in each node only 2 container is running.
>> Second time i reduced the resourcecapability as 100 and total containers as 
>> 20
>> Even now only 2 containers are running in each node.
>> 
>> Any reason why the remaining containers are not alloted even though 
>> resourcememory is available...
>> 
>> Thanks..
>> 
> 
> 



Re: Yarn Container Memory

2012-01-11 Thread Arun C Murthy
What is your min container size?

How much did you allocate to AM itself?

On Jan 11, 2012, at 9:51 PM, raghavendhra rahul wrote:

> Any suggestions..
> 
> On Wed, Jan 11, 2012 at 2:09 PM, raghavendhra rahul 
>  wrote:
> Hi,
>  I formed a hadoop cluster with 3 nodes of 3500mb alloted for 
> containers in each node.
> 
> In the appmaster i set resourcecapability as 1000 and total containers as 20
> Now in each node only 2 container is running.
> Second time i reduced the resourcecapability as 100 and total containers as 20
> Even now only 2 containers are running in each node.
> 
> Any reason why the remaining containers are not alloted even though 
> resourcememory is available...
> 
> Thanks..
> 



Re: Application start stop

2012-01-11 Thread Arun C Murthy
You'll have to implement it yourself for your AM.

The necessary apis are present in the protocol to do so.

On Jan 11, 2012, at 3:33 AM, raghavendhra rahul wrote:

> Hi
>  Is there a specific way to stop the application master other than timeout 
> option in the client.
> Is there a command like
> ./hadoop -job kill jobid for application masters



Re: Queries on next gen MR architecture

2012-01-07 Thread Arun C Murthy

On Jan 7, 2012, at 6:47 PM, Praveen Sripati wrote:

> Thanks for the response.
> 
> I was just thinking why some of the design decisions were made with MRv2.
> 
> > No, the OR condition is implied by the hierarchy of requests (node, rack, 
> > *).
> 
> If InputSplit1 is on Node11 and Node12 and InputSplit2 on Node21 and Node22. 
> Then the AM can ask for 1 containers on each of the nodes and * as 2 for map 
> tasks. Then the RM can return  2 nodes on Node11 and make * as 0. The data 
> locality is lost for InputSplit2 or else the AM has to make another call to 
> RM releasing one of the container and asking for another container.

Remember, you also have racks information to guide the RM...

> A bit more complex request specifying the dependencies might be more 
> effective.

At a very high cost - it's very expensive for the RM to track splits for each 
task across nodes & racks. To the extent possible, our goal has been to push 
work to the AM and keep the RM (and NM) really simple to scale & perform well.

> 
> > NM doesn't make any 'out' calls to anyone by RM, else it would be severe 
> > scalability bottleneck.
> 
> There is already a one-way communication between the AM and NM for launching 
> the containers. The response can from the NM can hold the list of completed 
> containers from the previous call.
> 

Again, we want to keep the framework (RM/NM) really simple. So, the task can 
communicate its status to the AM itself. 

> > All interactions (RPCs) are authenticated. Also, there is a container token 
> > provided by the RM (during allocation) which is verified by the NM during 
> > container launch.
> 
> So, a shared key has to be deployed manually on all the nodes for the NM?

No, it's automatically shared on startup between the daemons.

Arun

Re: Queries on next gen MR architecture

2012-01-07 Thread Arun C Murthy

On Jan 5, 2012, at 8:29 AM, Praveen Sripati wrote:

> Hi,
> 
> I had been going through the MRv2 documentation and have the following queries
> 
> 1) Let's say that an InputSplit is on Node1 and Node2.
> 
> Can the ApplicationMaster ask the ResourceManager for a container either on 
> Node1 or Node2 with an OR condition?
> 

No, the OR condition is implied by the hierarchy of requests (node, rack, *).

In this case, assuming topology is node1/rack1 node2/rack1, requests would be:
node1 -> 1
node2 -> 1
rack1 -> 1
* -> 1

OTOH, if the topology is node1/rack1, node2/rack2, requests would be:
node1 -> 1
node2 -> 1
rack1 -> 1
rack2 -> 1
* -> 1

In both cases, * would limit the #allocated-containers to '1'.

In the first case rack1 itself (independent of *) would limit 
#allocated-containers to 1.

More details here: 
http://developer.yahoo.com/blogs/hadoop/posts/2011/03/mapreduce-nextgen-scheduler/.
 

I'll work on incorporating this into our docs on hadoop.apache.org.
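
To make the bookkeeping above concrete, here is a toy sketch (plain Java, not the 
actual MR ApplicationMaster code) that collapses per-split host hints into the 
(node, rack, *) ask table; the host-to-rack mapping is assumed to come from 
whatever topology resolver the AM uses:

import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class AskTable {

  // One entry per resource name (hostname, rack name, or "*") -> #containers.
  public static Map<String, Integer> build(List<String[]> splitHosts,
                                           Map<String, String> hostToRack) {
    Map<String, Integer> asks = new HashMap<String, Integer>();
    for (String[] hosts : splitHosts) {
      Set<String> racks = new HashSet<String>();
      for (String host : hosts) {
        bump(asks, host);                 // node-level ask per replica location
        racks.add(hostToRack.get(host));
      }
      for (String rack : racks) {
        bump(asks, rack);                 // one ask per distinct rack
      }
      bump(asks, "*");                    // the off-switch ask caps the total
    }
    return asks;
  }

  private static void bump(Map<String, Integer> m, String key) {
    Integer old = m.get(key);
    m.put(key, old == null ? 1 : old + 1);
  }

  public static void main(String[] args) {
    Map<String, String> topology = new HashMap<String, String>();
    topology.put("node1", "rack1");
    topology.put("node2", "rack1");
    // One split replicated on node1 and node2, both on rack1, as in the text:
    // prints {node1=1, node2=1, rack1=1, *=1} (order may vary).
    System.out.println(build(
        Arrays.<String[]>asList(new String[] {"node1", "node2"}), topology));
  }
}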

> 2) > The Scheduler receives periodic information about the resource usages on 
> allocated resources from the NodeManagers. The Scheduler also makes available 
> status of completed Containers to the appropriate ApplicationMaster.
> 
> What's the use of NM sending the resource usages to the scheduler?
> 
> Why can't the NM directly talk to the AM about the completed containers? Does 
> any information pass from NM to AM?
> 

The NM sends resource usages to the scheduler to allow it to track resource 
utilization on each node and, in future, make smarter decisions about 
allocating extra containers on under-utilized nodes etc.
 
The NM doesn't make any 'out' calls to anyone but the RM, else it would be a severe 
scalability bottleneck.

> 3) >The Map-Reduce ApplicationMaster has the following components:
> > TaskUmbilical – The component responsible for receiving heartbeats and 
> > status updates form the map and reduce tasks.
> 
> Does the communication happen directly between the container and the AM? If 
> yes, the task completion status could also be sent from the container to the 
> AM.
> 

Yes, it is already sent to the AM.

> 4) > The Hadoop Map-Reduce JobClient polls the ASM to obtain information 
> about the MR AM and then directly talks to the AM for status, counters etc.
> 
> Once the Job is completed the AM goes down, what happens to the Counters? 
> What is the flow of the Counter (Container -> NM -> AM)?
> 

Once a job completes, the Counters etc. are stored in a JobHistory file (one per 
job), which is served up, if necessary, by the JobHistoryServer.

> 5) If a new YARN application is created. How can the NM trust the request 
> from AM?
> 

All interactions (RPCs) are authenticated. Also, there is a container token 
provided by the RM (during allocation) which is verified by the NM during 
container launch.

> 6) > MapReduce NextGen uses wire-compatible protocols to allow different 
> versions of servers and clients to communicate with each other.
> 
> What is meant by the `wire-compatible protocols` and how is it implemented?
> 

We use Protocol Buffers (PB) everywhere.

> 7) > The computation framework (ResourceManager and NodeManager) is 
> completely generic and is free of MapReduce specificities.
> 
> Is this the reason for adding auxiliary services for shuffling to the NM?
> 

Yes.

hth,
Arun

Re: Yarn related questions:

2012-01-06 Thread Arun C Murthy
Responses inline:

On Jan 6, 2012, at 9:34 AM, Ann Pal wrote:

> Thanks for your reply. Some additional questions:
> [1] How does the application master determine the size (memory requirement) 
> of the container  ? Can the container viewed as a JVM with CPU, memory?

Pretty much. It's related to the size of the JVM or any Unix process you want 
to run.

> [2] The document, mentions a concept of fungibility of resources across 
> servers. An allocated container of 2 GB of RAM for a reducer could be across 
> two servers of 1GB each.  If so a task is split across 2 servers? Not sure 
> how that works.

It means 'fungibility' across map and reduce tasks, i.e. there are no more fixed 
map/reduce slots. A container can't be split across servers.

> [3] The application master corresponds to Job Tracker for a given job, and 
> Node Manager corresponds to task tracker  in  pre 0.23 hadoop. Is this 
> assumption correct?

Pretty much, except that the AM doesn't do the resource management the JT used to 
do; that's handled by the ResourceManager.

> [4] For data to be transferred from map->reduce node, is it the reduce node 
> "node manager" who periodically polls the application master, and 
> subsequently pulls map data from the completed map nodes?

No, the reduce task itself fetches map outputs.

The reduce task polls the AM to get information about where map outputs are 
available.

hth,
Arun



Re: Yarn related questions:

2012-01-06 Thread Arun C Murthy
Please don't hijack threads, start a new one. Thanks.

On Jan 6, 2012, at 10:41 AM, Arun C Murthy wrote:

> You're probably hitting MAPREDUCE-3537. Try using the hadoop-0.23.1-SNAPSHOT 
> or build it yourself from branch-0.23 on ASF svn.
> 
> Arun
> 
> On Jan 5, 2012, at 10:45 PM, raghavendhra rahul wrote:
> 
>> Yes i am writing my own application master.Is there a way to specify
>> node1: 10 conatiners
>> node2: 10 containers 
>> Can we specify this kind of list using the application master
>> 
>> Also i set request.setHostName("client"); where client is the hostname of a 
>> node
>> I checked the log to find the following error
>> java.io.FileNotFoundException: File /local1/yarn/.yarn/local/
>> usercache/rahul_2011/appcache/application_1325760852770_0001 does not exist
>> at 
>> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:431)
>> at 
>> org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:815)
>> at 
>> org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
>> at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
>> at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:700)
>> at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:697)
>> at 
>> org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
>> at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:697)
>> at 
>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:122)
>> at 
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:237)
>> at 
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:67)
>> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>> at 
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>> at 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>> at java.lang.Thread.run(Thread.java:636)
>> 
>> Any ideas.
>> 
>> 
>> On Fri, Jan 6, 2012 at 12:41 AM, Arun C Murthy  wrote:
>> Are you writing your own application i.e. custom ApplicationMaster?
>> 
>> You need to pass ResourceRequest (RR) with a valid hostname alongwith 
>> (optionally) RR with rack and also a mandatory RR with * as the 
>> resource-name.
>> 
>> Arun
>> 
>> On Jan 4, 2012, at 8:04 PM, raghavendhra rahul wrote:
>> 
>>> Hi,
>>> 
>>> I tried to set the client node for launching the container within the 
>>> application master.
>>> I have set the parameter as
>>> request.setHostName("client");
>>> but the containers are not launched in the destined host.Instead the loop 
>>> goes on continuously.
>>> 2012-01-04 15:11:48,535 INFO appmaster.ApplicationMaster 
>>> (ApplicationMaster.java:run(204)) - Current application state: loop=95, 
>>> appDone=false, total=2, requested=2, completed=0, failed=0, 
>>> currentAllocated=0
>>> 
>>> On Wed, Jan 4, 2012 at 11:24 PM, Robert Evans  wrote:
>>> Ann,
>>> 
>>> A container more or less corresponds to a task in MRV1.  There is one 
>>> exception to this, as the ApplicationMaster also runs in a container.  The 
>>> ApplicationMaster will request new containers for each mapper or reducer 
>>> task that it wants to launch.  There is separate code from the container 
>>> that will serve up the intermediate mapper output and is run as part of the 
>>> NodeManager (Similar to the TaskTracker from before).  When the 
>>> ApplicationMaster requests a container it also includes with it a hint as 
>>> to where it would like the container placed.  In fact it actually makes 
>>> three request one for the exact node, one for the rack the node is on, and 
>>> one that is generic and could be anywhere.  The scheduler will try to honor 
>>> those requests in the same order so data locality is still considered and 
>>> generally honored.  Yes there is the possibility of back and forth to get a 
>>> container, but the ApplicationMaster generally will try to use all of the 
>>> containers that it is given, even if they are not optimal.
>>> 
>>> --Bobby Evans

Re: Yarn related questions:

2012-01-06 Thread Arun C Murthy
You're probably hitting MAPREDUCE-3537. Try using the hadoop-0.23.1-SNAPSHOT or 
build it yourself from branch-0.23 on ASF svn.

Arun

On Jan 5, 2012, at 10:45 PM, raghavendhra rahul wrote:

> Yes, I am writing my own application master. Is there a way to specify
> node1: 10 containers
> node2: 10 containers
> Can we specify this kind of list using the application master
> 
> Also i set request.setHostName("client"); where client is the hostname of a 
> node
> I checked the log to find the following error
> java.io.FileNotFoundException: File /local1/yarn/.yarn/local/
> usercache/rahul_2011/appcache/application_1325760852770_0001 does not exist
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:431)
> at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:815)
> at 
> org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
> at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
> at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:700)
> at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:697)
> at 
> org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
> at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:697)
> at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:122)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:237)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:67)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>     at java.lang.Thread.run(Thread.java:636)
> 
> Any ideas.
> 
> 
> On Fri, Jan 6, 2012 at 12:41 AM, Arun C Murthy  wrote:
> Are you writing your own application i.e. custom ApplicationMaster?
> 
> You need to pass ResourceRequest (RR) with a valid hostname alongwith 
> (optionally) RR with rack and also a mandatory RR with * as the resource-name.
> 
> Arun
> 
> On Jan 4, 2012, at 8:04 PM, raghavendhra rahul wrote:
> 
>> Hi,
>> 
>> I tried to set the client node for launching the container within the 
>> application master.
>> I have set the parameter as
>> request.setHostName("client");
>> but the containers are not launched in the destined host.Instead the loop 
>> goes on continuously.
>> 2012-01-04 15:11:48,535 INFO appmaster.ApplicationMaster 
>> (ApplicationMaster.java:run(204)) - Current application state: loop=95, 
>> appDone=false, total=2, requested=2, completed=0, failed=0, 
>> currentAllocated=0
>> 
>> On Wed, Jan 4, 2012 at 11:24 PM, Robert Evans  wrote:
>> Ann,
>> 
>> A container more or less corresponds to a task in MRV1.  There is one 
>> exception to this, as the ApplicationMaster also runs in a container.  The 
>> ApplicationMaster will request new containers for each mapper or reducer 
>> task that it wants to launch.  There is separate code from the container 
>> that will serve up the intermediate mapper output and is run as part of the 
>> NodeManager (Similar to the TaskTracker from before).  When the 
>> ApplicationMaster requests a container it also includes with it a hint as to 
>> where it would like the container placed.  In fact it actually makes three 
>> request one for the exact node, one for the rack the node is on, and one 
>> that is generic and could be anywhere.  The scheduler will try to honor 
>> those requests in the same order so data locality is still considered and 
>> generally honored.  Yes there is the possibility of back and forth to get a 
>> container, but the ApplicationMaster generally will try to use all of the 
>> containers that it is given, even if they are not optimal.
>> 
>> --Bobby Evans
>> 
>> 
>> On 1/4/12 10:23 AM, "Ann Pal"  wrote:
>> 
>> Hi,
>> I am trying to understand more about Hadoop Next Gen Map Reduce and had the 
>> following questions based on the following post:
>> 
>> http://developer.yahoo.com/blogs/hadoop/posts/2011/03/mapreduce-nextgen-scheduler/
>> 
>> [1] How does application decide how many containers it needs? The containers 
>> are used to store the intermediate 

Re: Yarn related questions:

2012-01-05 Thread Arun C Murthy
Are you writing your own application i.e. custom ApplicationMaster?

You need to pass a ResourceRequest (RR) with a valid hostname, along with 
(optionally) an RR with the rack, and also a mandatory RR with * as the resource-name.

Arun
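
As a rough sketch of what that looks like with the 0.23 records API (class and 
setter names as I recall them - Records.newRecord, setHostName, setCapability, 
setNumContainers, setPriority - so please verify against your build), where 
"client" and "/default-rack" are just the hostname/rack from the question:

import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceRequest;
import org.apache.hadoop.yarn.util.Records;

public class LocalityRequests {

  // Builds one ResourceRequest; resourceName is a host, a rack, or "*".
  static ResourceRequest ask(String resourceName, int containers, int memoryMb) {
    Priority priority = Records.newRecord(Priority.class);
    priority.setPriority(0);

    Resource capability = Records.newRecord(Resource.class);
    capability.setMemory(memoryMb);

    ResourceRequest request = Records.newRecord(ResourceRequest.class);
    request.setHostName(resourceName);
    request.setCapability(capability);
    request.setNumContainers(containers);
    request.setPriority(priority);
    return request;
  }

  public static void main(String[] args) {
    // A host-level ask should be accompanied by its rack and a mandatory "*"
    // ask, all of which go into the AM's allocate request together.
    ResourceRequest onHost = ask("client", 1, 1024);
    ResourceRequest onRack = ask("/default-rack", 1, 1024);
    ResourceRequest anywhere = ask("*", 1, 1024);
    System.out.println(onHost + "\n" + onRack + "\n" + anywhere);
  }
}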

On Jan 4, 2012, at 8:04 PM, raghavendhra rahul wrote:

> Hi,
> 
> I tried to set the client node for launching the container within the 
> application master.
> I have set the parameter as
> request.setHostName("client");
> but the containers are not launched in the destined host.Instead the loop 
> goes on continuously.
> 2012-01-04 15:11:48,535 INFO appmaster.ApplicationMaster 
> (ApplicationMaster.java:run(204)) - Current application state: loop=95, 
> appDone=false, total=2, requested=2, completed=0, failed=0, currentAllocated=0
> 
> On Wed, Jan 4, 2012 at 11:24 PM, Robert Evans  wrote:
> Ann,
> 
> A container more or less corresponds to a task in MRV1.  There is one 
> exception to this, as the ApplicationMaster also runs in a container.  The 
> ApplicationMaster will request new containers for each mapper or reducer task 
> that it wants to launch.  There is separate code from the container that will 
> serve up the intermediate mapper output and is run as part of the NodeManager 
> (Similar to the TaskTracker from before).  When the ApplicationMaster 
> requests a container it also includes with it a hint as to where it would 
> like the container placed.  In fact it actually makes three request one for 
> the exact node, one for the rack the node is on, and one that is generic and 
> could be anywhere.  The scheduler will try to honor those requests in the 
> same order so data locality is still considered and generally honored.  Yes 
> there is the possibility of back and forth to get a container, but the 
> ApplicationMaster generally will try to use all of the containers that it is 
> given, even if they are not optimal.
> 
> --Bobby Evans
> 
> 
> On 1/4/12 10:23 AM, "Ann Pal"  wrote:
> 
> Hi,
> I am trying to understand more about Hadoop Next Gen Map Reduce and had the 
> following questions based on the following post:
> 
> http://developer.yahoo.com/blogs/hadoop/posts/2011/03/mapreduce-nextgen-scheduler/
> 
> [1] How does application decide how many containers it needs? The containers 
> are used to store the intermediate result at the map nodes?
> 
> [2] During resource allocation, if the resource manager has no mapping 
> between map tasks to resources allocated, how can it properly allocate the 
> right resources. It might end up allocating resources on a node, which does 
> not have data for the map task, and hence is not optimal. In this case the 
> Application Master will have to reject it and request again . There could be 
> considerable back- and- forth between application master and resource manager 
> before it could converge. Is this right?
> 
> Thanks!
> 
> 



Re: Exception from Yarn Launch Container

2012-01-03 Thread Arun C Murthy
Bing,

 Are you using the released version of hadoop-0.23? If so, you might want to 
upgrade to the latest build off branch-0.23 (i.e. hadoop-0.23.1-SNAPSHOT), which has 
the fix for MAPREDUCE-3537.

Arun

On Dec 29, 2011, at 12:27 AM, Bing Jiang wrote:

> Hi, I use Yarn as resource management to deploy my run-time computing system. 
> I follow  
>> http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html
>> http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
> as guide, and I find these issues below. 
> 
> yarn-nodemanager-**.log:
> 
> 2011-12-29 15:49:16,250 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
>  Adding container_1325062142731_0006_01_01 to application 
> application_1325062142731_0006
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.ApplicationLocalizationEvent.EventType:
>  INIT_APPLICATION_RESOURCES
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationInitedEvent.EventType:
>  APPLICATION_INITED
> 2011-12-29 15:49:16,250 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
>  Processing application_1325062142731_0006 of type APPLICATION_INITED
> 2011-12-29 15:49:16,250 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
>  Application application_1325062142731_0006 transitioned from INITING to 
> RUNNING
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.event.LogHandlerAppStartedEvent.EventType:
>  APPLICATION_STARTED
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerInitEvent.EventType:
>  INIT_CONTAINER
> 2011-12-29 15:49:16,250 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>  Processing container_1325062142731_0006_01_01 of type INIT_CONTAINER
> 2011-12-29 15:49:16,250 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>  Container container_1325062142731_0006_01_01 transitioned from NEW to 
> LOCALIZED
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEvent.EventType:
>  LAUNCH_CONTAINER
> 2011-12-29 15:49:16,287 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEvent.EventType:
>  CONTAINER_LAUNCHED
> 2011-12-29 15:49:16,287 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>  Processing container_1325062142731_0006_01_01 of type CONTAINER_LAUNCHED
> 2011-12-29 15:49:16,287 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>  Container container_1325062142731_0006_01_01 transitioned from LOCALIZED 
> to RUNNING
> 2011-12-29 15:49:16,288 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerStartMonitoringEvent.EventType:
>  START_MONITORING_CONTAINER
> 2011-12-29 15:49:16,289 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
>  Failed to launch container
> java.io.FileNotFoundException: File 
> /tmp/nm-local-dir/usercache/jiangbing/appcache/application_1325062142731_0006 
> does not exist
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:431)
> at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:815)
> at 
> org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
> at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
> at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:700)
> at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:697)
>at 
> org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
> at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:697)
> at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:123)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:237)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:67)
> at java.util.concurrent.Fu

Re: Information regarding completed MapTask

2011-12-27 Thread Arun C Murthy
The reduces get it from the JobTracker. Take a look at 
TaskCompletionEvent.java.

Arun
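
On the client side the same information is visible through the old mapred API. A 
hedged sketch that, for a finished job, prints which TaskTracker served each 
successful map's output (method names per the 0.20 API as I recall them):

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.RunningJob;
import org.apache.hadoop.mapred.TaskCompletionEvent;

public class CompletionEvents {

  public static void dump(RunningJob job) throws Exception {
    int from = 0;
    TaskCompletionEvent[] events;
    // For a completed job this drains all events; a live reduce polls the same
    // way, incrementally, to learn where finished map outputs live.
    while ((events = job.getTaskCompletionEvents(from)).length > 0) {
      for (TaskCompletionEvent event : events) {
        if (event.isMapTask()
            && event.getTaskStatus() == TaskCompletionEvent.Status.SUCCEEDED) {
          System.out.println(event.getTaskAttemptId()
              + " output served by " + event.getTaskTrackerHttp());
        }
      }
      from += events.length;
    }
  }

  public static void main(String[] args) throws Exception {
    // Usage sketch: pass a job id such as job_201112261420_0003.
    JobClient client = new JobClient(new JobConf());
    dump(client.getJob(JobID.forName(args[0])));
  }
}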

On Dec 27, 2011, at 1:26 AM, hadoop anis wrote:

> 
> 
> Friends, 
> 
> I want to know where information regarding completed MapTasks gets stored,
> i.e. how does a reduce task know which tasktracker holds the completed map
> output data?
> What data structure is maintained to track the map output data?
> 
> If anyone know this please let me know. 
> 
> -- 
> Regards,
> 
> Anis
> Student MTech (Comp.Sci. & Engg) 
> 
> 
> 
> 
> 



Re: how does Hadoop Yarn support different programming models?

2011-12-27 Thread Arun C Murthy
Yep!

Take a look at the link Mahadev sent on how to get your application to work 
inside YARN.


> http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html
> http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html

Arun

On Dec 26, 2011, at 7:45 PM, Bing Jiang wrote:

> Thanks, Praveen and mahadev. I get it .
>  
> I have  developed a distributed real-time computing system,  but  the system 
> does not  do much work on cluster resource management. 
> So  I want to get help from Yarn framework to do these things. Do you think 
> it is natural for Yarn to do another run-time system's resource management?
> Any suggestion or your view will be ok. 
> Thanks!
> 
> Bing
> 
> 2011/12/27 Praveen Sripati 
> Bing,
> 
> FYI ... here are some applications ported to YARN.
> 
> http://wiki.apache.org/hadoop/PoweredByYarn
> 
> Praveen
> 
> On Tue, Dec 27, 2011 at 5:27 AM, Mahadev Konar  
> wrote:
> Hi Bing,
>  These links should give you more info:
> 
> http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html
> http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
> 
> Hope that helps.
> 
> thanks
> mahadev
> 
> On Mon, Dec 26, 2011 at 1:57 AM, Bing Jiang  wrote:
> 
> I know Hadoop Yarn can support MapReduce jobs well, but I have not found a
> DAG-model task. Can you give me a demonstration of what I missed, and point out
> how to build my own programming models on Hadoop Yarn.
> -- 
> Bing Jiang
> 
> 
> 
> 
> 
> 
> -- 
> Bing Jiang
> National Research Center for Intelligent Computing Systems
> Institute of Computing technology
> Graduate University of Chinese Academy of Science
> 



Re: A new map reduce framework for iterative/pipelined jobs.

2011-12-27 Thread Arun C Murthy

On Dec 26, 2011, at 10:30 PM, Kevin Burton wrote:

> One key point I wanted to mention for Hadoop developers (but then check out 
> the announcement).
> 
> I implemented a version of sysstat (iostat, vmstat, etc) in Peregrine and 
> would be more than happy to move it out and put it in another dedicated 
> project.
> 
> http://peregrine_mapreduce.bitbucket.org/xref/peregrine/sysstat/package-summary.html
> 
> I run this before and after major MR phases which makes it very easy to 
> understand the system throughput/performance for that iteration.
> 

Thanks for sharing. I'd love to play with it; do you have a README/user-guide 
for sysstat?

> ...
> 
> I'm pleased to announce Peregrine 0.5.0 - a new map reduce framework optimized
> for iterative and pipelined map reduce jobs.
> 
> http://peregrine_mapreduce.bitbucket.org/
> 

Sounds interesting. I briefly skimmed through the site.

Couple of questions: 
# How does peregrine deal with the case that you might not have available 
resources to start reduces while the maps are running? Is the map-output 
buffered to disk before the reduces start?
# How does peregrine deal with failure of in-flight reduces (potentially after 
they have received X% of the maps' outputs)?
# How much does peregrine depend on PFS? One idea worth exploring might be to 
run peregrine within YARN (MR2) as an application. Would you be interested in 
trying that?

Thanks again for sharing.

Arun



Re: AlreadyExistsException for log file on 0.20.205.0

2011-12-26 Thread Arun C Murthy
Is this with jvm reuse turned on?

On Dec 26, 2011, at 9:38 AM, Markus Jelsma wrote:

> Hi,
> 
> We're sometimes seeing this exception if a map task already failed before due 
> to, for example, an OOM error. Any ideas on how to address this issue?
> 
> org.apache.hadoop.io.SecureIOUtils$AlreadyExistsException: File 
> /opt/hadoop/hadoop-0.20.205.0/libexec/../logs/userlogs/job_201112261420_0003/attempt_201112261420_0003_m_29_1/log.tmp
>  
> already exists
>   at 
> org.apache.hadoop.io.SecureIOUtils.insecureCreateForWrite(SecureIOUtils.java:130)
>   at 
> org.apache.hadoop.io.SecureIOUtils.createForWrite(SecureIOUtils.java:157)
>   at org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:296)
>   at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:369)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:257)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
> 
> 
> Thanks



Re: Reducers fail without messages on 20.205.0

2011-12-26 Thread Arun C Murthy
I wouldn't use jvm reuse at this point. It's had a number of issues over time 
and I've consistently switched it off for a long while now.

Arun
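
For reference, the JVM-reuse knob above and the slowstart setting mentioned in the 
quoted thread, as per-job settings (values illustrative; 0.85 matches the 80-85% 
suggestion quoted below):

import org.apache.hadoop.mapred.JobConf;

public class ConservativeJobSettings {
  public static void apply(JobConf conf) {
    // One task per JVM, i.e. JVM reuse switched off (-1 means unlimited reuse).
    conf.setNumTasksToExecutePerJvm(1);
    // Don't start reduces until ~85% of the maps have completed.
    conf.setFloat("mapred.reduce.slowstart.completed.maps", 0.85f);
  }
}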

On Dec 26, 2011, at 2:50 PM, Markus Jelsma wrote:

> Hi,
> 
>> Markus,
>> 
>> Good to know you fixed it now :)
>> 
>> Also consider raising reduce slowstart to a more suitable value, like
>> 80-85% -- this is pretty beneficial in both heap consumption and more
>> performant Reduce slots. The property name is
>> "mapred.reduce.slowstart.completed.maps" and carries a float value
>> (percentage).
> 
> Is that going to help if one has a non-zero value for reusing JVM instances? 
> We have little concurrent jobs running because the process flow is linear and 
> we prefer the benefit of copying map output while it's running.
> For this particular job, however, it is not the case since all mappers finish 
> almost simultaneously so changing that value would not make a difference 
> right? 80% finishes within a matter of seconds.
> 
> For the record, it's an Apache Nutch fetcher job configured to terminate 
> after 
> 1 hour. This downloads URL's, hence the disabling of speculative execution.



Re: Performance of direct vs indirect shuffling

2011-12-20 Thread Arun C Murthy

On Dec 20, 2011, at 3:55 PM, Kevin Burton wrote:

> The current hadoop implementation shuffles directly to disk and then those 
> disk files are eventually requested by the target nodes which are responsible 
> for doing the reduce() on the intermediate data.
> 
> However, this requires more 2x IO than strictly necessary.
> 
> If the data were instead shuffled DIRECTLY to the target host, this IO 
> overhead would be removed.
> 

We've discussed 'push' v/s 'pull' shuffle multiple times and each time turned 
away due to complexities in MR1. With MRv2 (YARN) this would be much more 
doable.

IAC...

A single reducer, in typical (well-designed?) applications, processes multiple 
gigabytes of data across thousands of maps.

So, to really not do any disk i/o during the shuffle you'd need very large 
amounts of RAM...

Also, currently, the shuffle is performed by the reduce task. This has two major 
benefits:
# The 'pull' can only be initiated after the reduce is scheduled. The 'push' 
model would be hampered if the reduce hasn't been started.
# The 'pull' is more resilient to failure of a single reduce. In the push 
model, it's harder to deal with a reduce failing after a push from the map.

Again, with MR2 we could experiment with push v/s pull where it makes sense 
(small jobs etc.). I'd love to mentor/help someone interested in putting cycles 
into it.

Arun



Re: Application Master:

2011-12-20 Thread Arun C Murthy

On Dec 20, 2011, at 4:05 PM, Ann Pal wrote:

> Hi,
> I had the following questions related to Yarn:
> [1] How does the Application Master know where the data is, to give a list to 
> Resource Manager? Is it talking to the Name Node?

Yes, it's the responsibility of the AM to talk to the NN to figure out which 
hosts/racks it needs.

> [2] How does Resource Manager interface with HDFS federation? How does it 
> know which specific name node  talk to?

Either full uri (hdfs://nn1:8080) or client-side mount tables.

Arun
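
A small sketch of the first point - asking the NameNode which hosts hold an input 
file's blocks (the kind of information an AM turns into locality hints). The path 
and the hdfs://nn1:8080 URI from above are placeholders:

import java.net.URI;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocality {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(URI.create("hdfs://nn1:8080/"), conf);
    Path input = new Path("/data/input/part-00000");   // illustrative path
    FileStatus status = fs.getFileStatus(input);
    for (BlockLocation block : fs.getFileBlockLocations(status, 0, status.getLen())) {
      // Hosts (and, via the topology, racks) that hold this block's replicas.
      System.out.println("offset " + block.getOffset() + " -> "
          + Arrays.toString(block.getHosts()));
    }
  }
}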



Re: One task per Tasktracker

2011-12-20 Thread Arun C Murthy
Just use multiple slots for each map.

See: 
http://hadoop.apache.org/common/docs/stable/capacity_scheduler.html#Resource+based+scheduling

Arun
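
A hedged sketch of what that looks like for 'at most one task per TaskTracker': 
assuming the cluster slot size is mapred.cluster.map.memory.mb = 2048 and each 
TaskTracker has 8 map slots, asking for all 8 slots' worth of memory per map makes 
one map fill a whole tracker. Property names are the 0.20.20x memory-based 
scheduling ones; the numbers are illustrative:

import org.apache.hadoop.mapred.JobConf;

public class OneMapPerTracker {
  public static void configure(JobConf conf) {
    // Assumes 2048 MB per slot and 8 map slots per TaskTracker, and that the
    // cluster's per-job maximum (mapred.cluster.max.map.memory.mb) allows it.
    conf.setInt("mapred.job.map.memory.mb", 8 * 2048);
  }
}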

On Dec 20, 2011, at 3:46 AM, Nitin Khandelwal wrote:

> Hey,
> 
> We use capacity scheduler and divide our map slots among queues. For a 
> particular kind of job, we want to schedule at most one task per task 
> tracker. How does one do this?  We are using Hadoop 0.20.205.0.
> 
> Thanks,
> 
> -- 
> Nitin Khandelwal
> 
> 



Re: Map and Reduce process hang out at 0%

2011-12-20 Thread Arun C Murthy
Can you look at the /nodes web-page to see how many nodes you have?

Also, do you see any exceptions in the ResourceManager logs on dn5?

Arun

On Dec 20, 2011, at 5:14 AM, Jingui Lee wrote:

> Hi,all
> 
> I am running hadoop 0.23 on 5 nodes.
> 
> I could run any YARN application or Mapreduce Job on this cluster before.
> 
> But, after I changed Resourcemanager Node from node4 to node5, when I run 
> applications (I have modified property referenced in configure file), map and 
> reduce process will hang up at 0% until I killed the application.
> 
> I don't know why.
> 
> terminal output:
> 
> bin/hadoop jar hadoop-mapreduce-examples-0.23.0.jar wordcount 
> /share/stdinput/1k /testread/hao
> 11/12/20 20:20:29 INFO mapreduce.Cluster: Cannot pick 
> org.apache.hadoop.mapred.LocalClientProtocolProvider as the 
> ClientProtocolProvider - returned null protocol
> 11/12/20 20:20:29 INFO ipc.YarnRPC: Creating YarnRPC for 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
> 11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Connecting to 
> ResourceManager at dn5/192.168.3.204:50010
> 11/12/20 20:20:29 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy 
> for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
> 11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Connected to 
> ResourceManager at dn5/192.168.3.204:50010
> 11/12/20 20:20:29 WARN conf.Configuration: fs.default.name is deprecated. 
> Instead, use fs.defaultFS
> 11/12/20 20:20:29 WARN conf.Configuration: mapred.used.genericoptionsparser 
> is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
> 11/12/20 20:20:29 INFO input.FileInputFormat: Total input paths to process : 1
> 11/12/20 20:20:29 INFO util.NativeCodeLoader: Loaded the native-hadoop library
> 11/12/20 20:20:29 WARN snappy.LoadSnappy: Snappy native library not loaded
> 11/12/20 20:20:29 INFO mapreduce.JobSubmitter: number of splits:1
> 11/12/20 20:20:29 INFO mapred.YARNRunner: AppMaster capability = memory: 2048
> 11/12/20 20:20:29 INFO mapred.YARNRunner: Command to launch container for 
> ApplicationMaster is : $JAVA_HOME/bin/java 
> -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.mapreduce.container.log.dir= 
> -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
> -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/stdout 
> 2>/stderr 
> 11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Submitted application 
> application_1324372145692_0004 to ResourceManager
> 11/12/20 20:20:29 INFO mapred.ClientCache: Connecting to HistoryServer at: 
> dn5:10020
> 11/12/20 20:20:29 INFO ipc.YarnRPC: Creating YarnRPC for 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
> 11/12/20 20:20:29 INFO mapred.ClientCache: Connected to HistoryServer at: 
> dn5:10020
> 11/12/20 20:20:29 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy 
> for protocol interface org.apache.hadoop.mapreduce.v2.api.MRClientProtocol
> 11/12/20 20:20:30 INFO mapreduce.Job: Running job: job_1324372145692_0004
> 11/12/20 20:20:31 INFO mapreduce.Job:  map 0% reduce 0%
> 
> 



Re: Variable mapreduce.tasktracker.*.tasks.maximum per job

2011-12-19 Thread Arun C Murthy
Markus,

The CapacityScheduler in 0.20.205 (in fact since 0.20.203) supports the notion 
of 'high memory jobs' with which you can specify, for each job, the number of 
'slots' for each map/reduce. For example, you can say for job1 that each map needs 
2 slots, and so on.

Unfortunately, I don't know how well this works in 0.22 - I might be wrong, but 
I heavily doubt it's been tested in 0.22. YMMV.

Hope that helps.

Arun
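
For the per-job side of this, a hedged sketch of the overrides a memory-hungry job 
might carry (property names as used by the 0.20.20x CapacityScheduler; values 
illustrative):

import org.apache.hadoop.mapred.JobConf;

public class HighMemoryJob {
  public static void configure(JobConf conf) {
    conf.set("mapred.child.java.opts", "-Xmx2048m");   // bigger per-task heap
    conf.setInt("mapred.job.map.memory.mb", 3072);     // multiple slots per map,
                                                       // depending on slot size
    conf.setInt("mapred.job.reduce.memory.mb", 3072);  // and per reduce
  }
}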

On Dec 19, 2011, at 3:02 PM, Markus Jelsma wrote:

> Hi,
> We have many different jobs running on a 0.22.0 cluster, each with its own 
> memory consumption. Some jobs can easily be run with a large amount of 
> *.tasks per job and others require much more memory and can only be run with 
> a minimum number of tasks per node.
> Is there any way to reconfigure a running cluster on a per job basis so we 
> can set the heap size and number of mapper and reduce tasks per node? If not, 
> we have to force all settings to a level that is right for the toughest jobs 
> which will have a negative impact on simpler jobs.
> Thoughts?
> Thanks



Re: Tasktracker Task Attempts Stuck (mapreduce.task.timeout not working)

2011-12-15 Thread Arun C Murthy
Hi John,

 It's hard for folks on this list to diagnose CDH (you might have to ask on their 
lists). However, I haven't seen similar issues with hadoop-0.20.2xx in a while.

 One thing to check would be to grab a stack trace (jstack) on the tasks to see 
what they are up to. Next, try getting a tcpdump to see if the tasks are indeed 
sending heartbeats to the TT, which might be the reason the TTs aren't timing 
them out.

hth,
Arun

On Dec 15, 2011, at 7:58 AM, John Miller wrote:

> I’ve recently come across some interesting things happening within a 50-node 
> cluster regarding the tasktrackers and task attempts.  Essentially tasks are 
> being created but they are sticking at 0.0% and it seems the 
> ‘mapreduce.task.timeout’ isn’t taking effect and they just sit there (for 
> days if we let them) and the jobs have to get killed.  Its interesting to 
> note that the HDFS datanode service and HBASE regionserver running on these 
> nodes work fine and we’ve been simply shutting down the tasktracker service 
> on them in order to get around jobs stalling forever.
>  
> Some historical information… We’re running Cloudera’s cdh3u0 release, and 
> this has so far only happened on a handful of random tasktracker nodes and it 
> seems to only effected those that have been taken down for maintenance and 
> then brought back into the cluster, or alternatively one node was brought 
> into the cluster after it had been running for a while and we ran into the 
> same issue.  After re-adding the nodes back into the cluster the tasktracker 
> service starts getting these stalls.  Also know that this has not happened to 
> every node that has been taken out of service for a time and then re-added… I 
> would say about 1/3’rd of them or so has ran into this issue after 
> maintenance.  The particular maintenance issues on the effected nodes were 
> NOT the same, i.e. one was bad ram another was a bad sector on a disk etc… 
> never the same initial problem only the same outcome after rejoining the 
> cluster.
>  
> It’s also never the same mapred job that sticks, nor is there any time 
> related evidence relating the stalls to a specific time of day.  Rather the 
> node will run fine for many jobs and then just all of a sudden some tasks 
> will stall and stick at 0.0%.  There are no visible errors in the log 
> outputs, although nothing will move forward nor will it release the mappers 
> for any other jobs to use until the stalled job is killed.  It seems that the 
> default ‘mapreduce.task.timeout’ just isn’t working for some reason.
>  
> Has anyone come across anything similar to this?  I can provide more 
> details/data as needed.
>  
> John Miller  |  Sr. Linux Systems Administrator
> 
> 530 E. Liberty St.
> Ann Arbor, MI 48104
> Direct: 734.922.7007
> http://mybuys.com/
>  



Re: Where JobTracker stores Task'sinformation

2011-12-14 Thread Arun C Murthy
Take a look at JobInProgress.java. There is one object per job.

Arun

On Dec 14, 2011, at 1:14 AM, hadoop anis wrote:

> 
> 
> 
>   Hi Friends,
>   I want to know, where JobTracker stores Task's Information,
>   I want to know where the JobTracker stores task information,
>   i.e. which task is being executed on which tasktracker, and how the
> JobTracker stores this information.
>   If anyone knows this, please let me know.
> 
> 
> Regards,
> Anis
> 
> M.Tech. Student
> 



Re: How Jobtracler stores tasktracker's information

2011-12-13 Thread Arun C Murthy
Moving to mapreduce-user@, bcc common-user@. Please use project specific lists. 

Take a look at JobTracker.heartbeat -> *Scheduler.assignTasks.

After the scheduler 'assigns' tasks, the JT sends the corresponding 
'LaunchTaskAction' to the TaskTracker.

hth,
Arun

On Dec 13, 2011, at 12:59 AM, hadoop anis wrote:

>  Anyone please tell this,
>  I want to know from where the Jobtracker sends a task (taskid) to a
> tasktracker for scheduling,
>  i.e. where it creates the taskid & tasktracker pairs.
> 
> 
> 
> Thanks & Regards,
> 
> Mohmmadanis Moulavi
> 
> Student,
> MTech (Computer Sci. & Engg.)
> Walchand college of Engg. Sangli (M.S.) India



Re: Pause and Resume Hadoop map reduce job

2011-12-12 Thread Arun C Murthy
The CapacityScheduler (hadoop-0.20.203 onwards) allows you to stop a queue and 
start it again.

That will give you the behavior you described.

Arun

On Dec 12, 2011, at 5:50 AM, Dino Kečo wrote:

> Hi Hadoop users,
> 
> In my company we have been using Hadoop for 2 years and we have need to pause 
> and resume map reduce jobs. I was searching on Hadoop JIRA and there are 
> couple of tickets which are not resolved. So we have implemented our 
> solution. I would like to share this approach with you and to hear your 
> opinion about it.
> 
> We have created one special pool in fair scheduler called PAUSE (maxMapTasks 
> = 0, maxReduceTasks = 0). Our logic for pausing job is to move it into this 
> pool and kill all running tasks. When we want to resume job we move this job 
> into some other pool. Currently we can do maintenance of cloud except Job 
> Tracker while jobs are paused. Also we have some external services which we 
> use and we are doing their maintenance while jobs are paused. 
> 
> We know that records which are processed by running tasks will be 
> reprocessed. In some cases we use same HBase table as input and output and we 
> save job id on record. When record is re-processes we check this job id and 
> skip record if it is processed by same job. 
> 
> Our custom implementation of fair scheduler have this logic implemented and 
> it is deployed to our cluster. 
> 
> Please share your comments and concerns about this approach 
> 
> Regards,
> dino



Re: Running YARN on top of legacy HDFS (i.e. 0.20)

2011-12-09 Thread Arun C Murthy
I assume you have security switched off.

What issues are you running into?

On Dec 8, 2011, at 1:30 PM, Avery Ching wrote:

> I was able to convert FileContext to FileSystem and related methods fairly 
> straightforwardly, but am running into issues of dealing with security 
> incompatibilities (i.e. UserGroupInformation, etc.).  Yuck.
> 
> Avery
> 
> On 12/6/11 3:50 PM, Arun C Murthy wrote:
>> Avery,
>> 
>> If you could take a look at what it would take, I'd be grateful. I'm hoping 
>> it isn't very much effort.
>> 
>> thanks,
>> Arun
>> 
>> On Dec 6, 2011, at 10:05 AM, Avery Ching wrote:
>> 
>>> I think it would be nice if YARN could work on existing older HDFS 
>>> instances, a lot of folks will be slow to upgrade HDFS with all their 
>>> important data on it.  I could also go that route I guess.
>>> 
>>> Avery
>>> 
>>> On 12/6/11 8:51 AM, Arun C Murthy wrote:
>>>> Avery,
>>>> 
>>>>  They aren't 'api changes'. HDFS just has a new set of apis in hadoop-0.23 
>>>> (aka FileContext apis). Both the old (FileSystem apis) and new are 
>>>> supported in hadoop-0.23.
>>>> 
>>>>  We have used the new HDFS apis in YARN in some places.
>>>> 
>>>> hth,
>>>> Arun
>>>> 
>>>> On Dec 5, 2011, at 10:59 PM, Avery Ching wrote:
>>>> 
>>>>> Thank you for the response, that's what I thought as well =).  I spent 
>>>>> the day trying to port the required 0.23 APIs to 0.20 HDFS.  There have 
>>>>> been a lot of API changes!
>>>>> 
>>>>> Avery
>>>>> 
>>>>> On 12/5/11 9:14 PM, Mahadev Konar wrote:
>>>>>> Avery,
>>>>>>  Currently we have only tested 0.23 MRv2 with 0.23 hdfs. I might be
>>>>>> wrong but looking at the HDFS apis' it doesnt look like that it would
>>>>>> be a lot of work to getting it to work with 0.20 apis. We had been
>>>>>> using filecontext api's initially but have transitioned back to the
>>>>>> old API's.
>>>>>> 
>>>>>> Hope that helps.
>>>>>> 
>>>>>> mahadev
>>>>>> 
>>>>>> On Mon, Dec 5, 2011 at 4:01 PM, Avery Chingwrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> I've been playing with 0.23.0, really nice stuff!  I was able to setup a
>>>>>>> small test cluster (40 nodes) and launch the example jobs.  I was also 
>>>>>>> able
>>>>>>> to recompile old Hadoop programs with the new jars and start up those
>>>>>>> programs as well.  My question is the following:
>>>>>>> 
>>>>>>> We have an HDFS instance based on 0.20 that I would like to hook up to 
>>>>>>> YARN.
>>>>>>>  This appears to be a bit of work.  Launching the jobs gives me the
>>>>>>> following error:
>>>>>>> 
>>>>>>> 2011-12-05 15:48:05,023 INFO  ipc.YarnRPC (YarnRPC.java:create(47)) -
>>>>>>> Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
>>>>>>> 2011-12-05 15:48:05,040 INFO  mapred.ResourceMgrDelegate
>>>>>>> (ResourceMgrDelegate.java:(95)) - Connecting to ResourceManager at
>>>>>>> {removed}.{xxx}/{removed}:50177
>>>>>>> 2011-12-05 15:48:05,041 INFO  ipc.HadoopYarnRPC
>>>>>>> (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc 
>>>>>>> proxy
>>>>>>> for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
>>>>>>> 2011-12-05 15:48:05,121 INFO  mapred.ResourceMgrDelegate
>>>>>>> (ResourceMgrDelegate.java:(99)) - Connected to ResourceManager at
>>>>>>> {removed}.{xxx}/{removed}:50177
>>>>>>> 2011-12-05 15:48:05,133 INFO  mapreduce.Cluster
>>>>>>> (Cluster.java:initialize(116)) - Failed to use
>>>>>>> org.apache.hadoop.mapred.YarnClientProtocolProvider due to error:
>>>>>>> java.lang.ClassNotFoundException: org.apache.hadoop.fs.Hdfs
>>>>>>> Exception in thread "main" java.io.IOException: Cannot initialize 
>>>>>>> Cluster.
>>>>>>> Please check your configuration for mapreduce.f

Re: Are the values available in job.xml the actual values used for job

2011-12-09 Thread Arun C Murthy
'final' is meant for admins to ensure certain values aren't overridable.

However, in the example you gave, you'll see 15 (since it's 'final').

Arun

On Dec 8, 2011, at 10:44 PM, Bejoy Ks wrote:

> Hi experts
>  I have a query with the job.xml file in map reduce.I set some 
>  I have a query about the job.xml file in map reduce. I set some
> value in mapred-site.xml and marked it as final, say mapred.num.reduce.tasks=15.
> When I submit my job I explicitly specified the number of reducers as -D
> mapred.num.reduce.tasks=4. Now, as expected, my job should run with 15
> reducers as I marked this value as final in my config file. Now what would be
> the value that would be available for mapred.num.reduce.tasks in job.xml, 15
> or 4?
> Thank you
> 
> Regards
> Bejoy.K.S



Re: OOM Error Map output copy.

2011-12-09 Thread Arun C Murthy
Moving to mapreduce-user@, bcc common-user@. Please use project specific lists.

Niranjan,

If you average 0.5G of output per map, that's 5000 maps * 0.5G -> 2.5TB over 12 
reduces, i.e. nearly 250G per reduce - compressed!

If you think you have 4:1 compression, you are doing nearly a terabyte per 
reducer... which is way too high!

I'd recommend you bump up to somewhere around 1000 reduces to get to ~2.5G 
(compressed) per reducer for your job. If your compression ratio is 2:1, try 
500 reduces, and so on.

If you are worried about other users, use the CapacityScheduler and submit your 
job to a queue with a small capacity and max-capacity to restrict your job to 
10 or 20 concurrent reduces at a given point.

Arun
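
Concretely, the two suggestions above as job settings; the queue name "small" is 
hypothetical and would have to exist in your CapacityScheduler configuration:

import org.apache.hadoop.mapred.JobConf;

public class ReduceTuning {
  public static void configure(JobConf conf) {
    conf.setNumReduceTasks(1000);   // ~2.5G compressed per reduce for this job
    conf.setQueueName("small");     // a queue with small capacity/max-capacity
  }
}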

On Dec 7, 2011, at 10:51 AM, Niranjan Balasubramanian wrote:

> All 
> 
> I am encountering the following out-of-memory error during the reduce phase 
> of a large job.
> 
> Map output copy failure : java.lang.OutOfMemoryError: Java heap space
>   at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1669)
>   at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1529)
>   at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1378)
>   at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1310)
> I tried increasing the memory available using mapped.child.java.opts but that 
> only helps a little. The reduce task eventually fails again. Here are some 
> relevant job configuration details:
> 
> 1. The input to the mappers is about 2.5 TB (LZO compressed). The mappers 
> filter out a small percentage of the input ( less than 1%).
> 
> 2. I am currently using 12 reducers and I can't increase this count by much 
> to ensure availability of reduce slots for other users. 
> 
> 3. mapred.child.java.opts --> -Xms512M -Xmx1536M -XX:+UseSerialGC
> 
> 4. mapred.job.shuffle.input.buffer.percent--> 0.70
> 
> 5. mapred.job.shuffle.merge.percent   --> 0.66
> 
> 6. mapred.inmem.merge.threshold   --> 1000
> 
> 7. I have nearly 5000 mappers which are supposed to produce LZO compressed 
> outputs. The logs seem to indicate that the map outputs range between 0.3G to 
> 0.8GB. 
> 
> Does anything here seem amiss? I'd appreciate any input of what settings to 
> try. I can try different reduced values for the input buffer percent and the 
> merge percent.  Given that the job runs for about 7-8 hours before crashing, 
> I would like to make some informed choices if possible.
> 
> Thanks. 
> ~ Niranjan.
> 
> 
> 



Re: Not able to post a job in Hadoop 0.23.0

2011-12-08 Thread Arun C Murthy
Moving to mapreduce-user@, bcc common-user@.

Can you see any errors in the logs? Typically this happens when you have no 
NodeManagers.

Check the 'nodes' link and then RM logs.

Arun

On Nov 29, 2011, at 8:36 PM, Nitin Khandelwal wrote:

> HI ,
> 
> I have successfully set up Hadoop 0.23.0 on a single m/c. When I post a job,
> it gets posted successfully (I can see the job in the UI), but the job is never
> "ASSIGNED" and waits forever.
> Here are details of what i see for that Job in UI
> 
> 
> Name: random-writer  State: ACCEPTED  FinalStatus: UNDEFINED
> Started: 30-Nov-2011
> 10:08:55  Elapsed: 49sec  Tracking URL:
> UNASSIGNED
> Diagnostics:
> AM container logs: AM not yet registered with RM  Cluster ID: 1322627869620
> ResourceManager state: STARTED  ResourceManager started on: 30-Nov-2011
> 10:07:49  ResourceManager version: 0.23.0 from
> 722cd694fc4ab6d040c0a34f9fb5b476e330ee60 by hortonmu source checksum
> 4975bf112aa7faa5673f604045ced798 on Thu Nov 3 09:07:31 UTC 2011  Hadoop
> version: 0.23.0 from d4fee83ec1462ab9824add6449320617caa7c605 by hortonmu
> source checksum 4e42b2d96c899a98a8ab8c7cc23f27ae on Thu Nov 3 08:59:12 UTC
> 2011
> Can someone tell me where I am going wrong?
> 
> Thanks,
> -- 
> Nitin Khandelwal



Re: Running YARN on top of legacy HDFS (i.e. 0.20)

2011-12-06 Thread Arun C Murthy
Avery,

If you could take a look at what it would take, I'd be grateful. I'm hoping it 
isn't very much effort.

thanks,
Arun

On Dec 6, 2011, at 10:05 AM, Avery Ching wrote:

> I think it would be nice if YARN could work on existing older HDFS instances, 
> a lot of folks will be slow to upgrade HDFS with all their important data on 
> it.  I could also go that route I guess.
> 
> Avery
> 
> On 12/6/11 8:51 AM, Arun C Murthy wrote:
>> Avery,
>> 
>>  They aren't 'api changes'. HDFS just has a new set of apis in hadoop-0.23 
>> (aka FileContext apis). Both the old (FileSystem apis) and new are supported 
>> in hadoop-0.23.
>> 
>>  We have used the new HDFS apis in YARN in some places.
>> 
>> hth,
>> Arun
>> 
>> On Dec 5, 2011, at 10:59 PM, Avery Ching wrote:
>> 
>>> Thank you for the response, that's what I thought as well =).  I spent the 
>>> day trying to port the required 0.23 APIs to 0.20 HDFS.  There have been a 
>>> lot of API changes!
>>> 
>>> Avery
>>> 
>>> On 12/5/11 9:14 PM, Mahadev Konar wrote:
>>>> Avery,
>>>>  Currently we have only tested 0.23 MRv2 with 0.23 hdfs. I might be
>>>> wrong but looking at the HDFS APIs it doesn't look like it would
>>>> be a lot of work to get it to work with the 0.20 APIs. We had been
>>>> using FileContext APIs initially but have transitioned back to the
>>>> old APIs.
>>>> 
>>>> Hope that helps.
>>>> 
>>>> mahadev
>>>> 
>>>> On Mon, Dec 5, 2011 at 4:01 PM, Avery Ching   wrote:
>>>>> Hi,
>>>>> 
>>>>> I've been playing with 0.23.0, really nice stuff!  I was able to setup a
>>>>> small test cluster (40 nodes) and launch the example jobs.  I was also 
>>>>> able
>>>>> to recompile old Hadoop programs with the new jars and start up those
>>>>> programs as well.  My question is the following:
>>>>> 
>>>>> We have an HDFS instance based on 0.20 that I would like to hook up to 
>>>>> YARN.
>>>>>  This appears to be a bit of work.  Launching the jobs gives me the
>>>>> following error:
>>>>> 
>>>>> 2011-12-05 15:48:05,023 INFO  ipc.YarnRPC (YarnRPC.java:create(47)) -
>>>>> Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
>>>>> 2011-12-05 15:48:05,040 INFO  mapred.ResourceMgrDelegate
>>>>> (ResourceMgrDelegate.java:(95)) - Connecting to ResourceManager at
>>>>> {removed}.{xxx}/{removed}:50177
>>>>> 2011-12-05 15:48:05,041 INFO  ipc.HadoopYarnRPC
>>>>> (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc 
>>>>> proxy
>>>>> for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
>>>>> 2011-12-05 15:48:05,121 INFO  mapred.ResourceMgrDelegate
>>>>> (ResourceMgrDelegate.java:(99)) - Connected to ResourceManager at
>>>>> {removed}.{xxx}/{removed}:50177
>>>>> 2011-12-05 15:48:05,133 INFO  mapreduce.Cluster
>>>>> (Cluster.java:initialize(116)) - Failed to use
>>>>> org.apache.hadoop.mapred.YarnClientProtocolProvider due to error:
>>>>> java.lang.ClassNotFoundException: org.apache.hadoop.fs.Hdfs
>>>>> Exception in thread "main" java.io.IOException: Cannot initialize Cluster.
>>>>> Please check your configuration for mapreduce.framework.name and the
>>>>> correspond server addresses.
>>>>>at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:123)
>>>>>at org.apache.hadoop.mapreduce.Cluster.(Cluster.java:85)
>>>>>at org.apache.hadoop.mapreduce.Cluster.(Cluster.java:78)
>>>>>at org.apache.hadoop.mapreduce.Job$1.run(Job.java:1129)
>>>>>at org.apache.hadoop.mapreduce.Job$1.run(Job.java:1125)
>>>>>at java.security.AccessController.doPrivileged(Native Method)
>>>>>at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>>at
>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
>>>>>at org.apache.hadoop.mapreduce.Job.connect(Job.java:1124)
>>>>>at org.apache.hadoop.mapreduce.Job.submit(Job.java:1153)
>>>>>at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1176)
>>>>>at org.apache.giraph.graph.GiraphJ

Re: Running YARN on top of legacy HDFS (i.e. 0.20)

2011-12-06 Thread Arun C Murthy
Avery, 

 They aren't 'api changes'. HDFS just has a new set of apis in hadoop-0.23 (aka 
FileContext apis). Both the old (FileSystem apis) and new are supported in 
hadoop-0.23.

 We have used the new HDFS apis in YARN in some places.
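
Roughly, the difference between the two API families looks like this (a minimal 
sketch; the path is hypothetical). Resolving the FileContext side against an 
hdfs:// default filesystem is what needs org.apache.hadoop.fs.Hdfs, the class 
that is missing when only 0.20 HDFS jars are on the classpath:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.*;

    public class FsApiSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Path p = new Path("/user/avery/input/part-00000");   // hypothetical path

            // Old-style FileSystem API - present in both 0.20 and 0.23:
            FileSystem fs = FileSystem.get(conf);
            FSDataInputStream in1 = fs.open(p);
            in1.close();

            // New-style FileContext API - hadoop-0.23 only:
            FileContext fc = FileContext.getFileContext(conf);
            FSDataInputStream in2 = fc.open(p);
            in2.close();
        }
    }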

hth,
Arun

On Dec 5, 2011, at 10:59 PM, Avery Ching wrote:

> Thank you for the response, that's what I thought as well =).  I spent the 
> day trying to port the required 0.23 APIs to 0.20 HDFS.  There have been a 
> lot of API changes!
> 
> Avery
> 
> On 12/5/11 9:14 PM, Mahadev Konar wrote:
>> Avery,
>>  Currently we have only tested 0.23 MRv2 with 0.23 hdfs. I might be
>> wrong but looking at the HDFS APIs it doesn't look like it would
>> be a lot of work to get it to work with the 0.20 APIs. We had been
>> using FileContext APIs initially but have transitioned back to the
>> old APIs.
>> 
>> Hope that helps.
>> 
>> mahadev
>> 
>> On Mon, Dec 5, 2011 at 4:01 PM, Avery Ching  wrote:
>>> Hi,
>>> 
>>> I've been playing with 0.23.0, really nice stuff!  I was able to setup a
>>> small test cluster (40 nodes) and launch the example jobs.  I was also able
>>> to recompile old Hadoop programs with the new jars and start up those
>>> programs as well.  My question is the following:
>>> 
>>> We have an HDFS instance based on 0.20 that I would like to hook up to YARN.
>>>  This appears to be a bit of work.  Launching the jobs gives me the
>>> following error:
>>> 
>>> 2011-12-05 15:48:05,023 INFO  ipc.YarnRPC (YarnRPC.java:create(47)) -
>>> Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
>>> 2011-12-05 15:48:05,040 INFO  mapred.ResourceMgrDelegate
>>> (ResourceMgrDelegate.java:(95)) - Connecting to ResourceManager at
>>> {removed}.{xxx}/{removed}:50177
>>> 2011-12-05 15:48:05,041 INFO  ipc.HadoopYarnRPC
>>> (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc proxy
>>> for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
>>> 2011-12-05 15:48:05,121 INFO  mapred.ResourceMgrDelegate
>>> (ResourceMgrDelegate.java:(99)) - Connected to ResourceManager at
>>> {removed}.{xxx}/{removed}:50177
>>> 2011-12-05 15:48:05,133 INFO  mapreduce.Cluster
>>> (Cluster.java:initialize(116)) - Failed to use
>>> org.apache.hadoop.mapred.YarnClientProtocolProvider due to error:
>>> java.lang.ClassNotFoundException: org.apache.hadoop.fs.Hdfs
>>> Exception in thread "main" java.io.IOException: Cannot initialize Cluster.
>>> Please check your configuration for mapreduce.framework.name and the
>>> correspond server addresses.
>>>at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:123)
>>>at org.apache.hadoop.mapreduce.Cluster.(Cluster.java:85)
>>>at org.apache.hadoop.mapreduce.Cluster.(Cluster.java:78)
>>>at org.apache.hadoop.mapreduce.Job$1.run(Job.java:1129)
>>>at org.apache.hadoop.mapreduce.Job$1.run(Job.java:1125)
>>>at java.security.AccessController.doPrivileged(Native Method)
>>>at javax.security.auth.Subject.doAs(Subject.java:396)
>>>at
>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
>>>at org.apache.hadoop.mapreduce.Job.connect(Job.java:1124)
>>>at org.apache.hadoop.mapreduce.Job.submit(Job.java:1153)
>>>at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1176)
>>>at org.apache.giraph.graph.GiraphJob.run(GiraphJob.java:560)
>>>at
>>> org.apache.giraph.benchmark.PageRankBenchmark.run(PageRankBenchmark.java:193)
>>>at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
>>>at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:83)
>>>at
>>> org.apache.giraph.benchmark.PageRankBenchmark.main(PageRankBenchmark.java:201)
>>>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>at java.lang.reflect.Method.invoke(Method.java:597)
>>>at org.apache.hadoop.util.RunJar.main(RunJar.java:189)
>>> 
>>> After doing a little digging it appears that YarnClientProtocolProvider
>>> creates a YARNRunner that uses org.apache.hadoop.fs.Hdfs, a class that is
>>> not available available in older versions of HDFS.
>>> 
>>> What versions of HDFS are currently supported and what HDFS versions are
>>> planned for support?  It would be great to be able to run YARN on legacy
>>> HDFS installations.
>>> 
>>> Thanks,
>>> 
>>> Avery
> 



Re: is there a way to just abandon a map task?

2011-11-20 Thread Arun C Murthy

On Nov 20, 2011, at 5:18 PM, Mat Kelcey wrote:

> Thanks for the suggestion Arun, I hadn't seen these params before.
> 
> No way to do it for a job in flight though I guess?
> 

Unfortunately, no. You'll need to re-run the job.

Also, you want to use 'bin/mapred job -fail-task ' 4 times to 
abandon the task. If you use '-kill-task' it will continue to be re-run.

Arun

> Cheers,
> Mat
> 
> On 20 November 2011 16:43, Arun C Murthy  wrote:
>> Mat,
>> 
>>  Take a look at mapred.max.(map|reduce).failures.percent.
>> 
>>  See:
>>  
>> http://hadoop.apache.org/common/docs/r0.20.205.0/api/org/apache/hadoop/mapred/JobConf.html#setMaxMapTaskFailuresPercent(int)
>>  
>> http://hadoop.apache.org/common/docs/r0.20.205.0/api/org/apache/hadoop/mapred/JobConf.html#setMaxReduceTaskFailuresPercent(int)
>> 
>> hth,
>> Arun
>> 
>> On Nov 20, 2011, at 1:31 PM, Mat Kelcey wrote:
>> 
>>> Hi,
>>> 
>>> I have a largish job running that, due to the quirks of the third
>>> party input format I'm using, has 280,000 map tasks. (I know this is
>>> far from ideal but it'll do for me.)
>>> 
>>> I'm passing this data (the common crawl web crawl dataset) through a
>>> visible-text-from-html extraction library (boilerpipe) which is
>>> struggling with _1_ particular task. It hits a sequence of records
>>> that are _insanely_ slow to parse for some reason. Rather than a few
>>> minutes per split it took 7+ hrs before I started explicitly trying
>>> to fail the task (hadoop job -fail-task). Since I'm running with bad
>>> record skipping I was hoping I could issue -fail-task a few times and
>>> ride over the bad records but it looks like there's quite a few there.
>>> Since it's only 1 of the 280,000 I'm actually happy to just give up on
>>> the entire split.
>>> 
>>> Now if I was running a map only job I'd just kill the job since I'd
>>> have the output of the other 279,999. This job has a no-op reduce step
>>> though since I wanted to take the chance to compact the output into a
>>> much smaller number of sequence files ( I regret that decision now) As
>>> such I can't just kill the job since I'd lose the rest of the
>>> processed data (if I understand correctly?)
>>> 
>>> So does anyone know a way to just abandon the entire split?
>>> 
>>> Cheers,
>>> Mat
>> 
>> 



Re: is there a way to just abandon a map task?

2011-11-20 Thread Arun C Murthy
Mat,

 Take a look at mapred.max.(map|reduce).failures.percent.

 See: 
 
http://hadoop.apache.org/common/docs/r0.20.205.0/api/org/apache/hadoop/mapred/JobConf.html#setMaxMapTaskFailuresPercent(int)
 
 
http://hadoop.apache.org/common/docs/r0.20.205.0/api/org/apache/hadoop/mapred/JobConf.html#setMaxReduceTaskFailuresPercent(int)
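
A minimal sketch of setting this from the job driver (old mapred API; the driver 
class name is a placeholder and 1% is just an example value):

    JobConf conf = new JobConf(MyJobDriver.class);   // MyJobDriver is a placeholder
    // Tolerate up to 1% of map tasks failing (after their retries) without
    // failing the whole job, i.e. up to ~2,800 of the 280,000 splits here.
    // Equivalent config knob: mapred.max.map.failures.percent
    conf.setMaxMapTaskFailuresPercent(1);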

hth,
Arun

On Nov 20, 2011, at 1:31 PM, Mat Kelcey wrote:

> Hi,
> 
> I have a largish job running that, due to the quirks of the third
> party input format I'm using, has 280,000 map tasks. (I know this is
> far from ideal but it'll do for me.)
> 
> I'm passing this data (the common crawl web crawl dataset) through a
> visible-text-from-html extraction library (boilerpipe) which is
> struggling with _1_ particular task. It hits a sequence of records
> that are _insanely_ slow to parse for some reason. Rather than a few
> minutes per split it took 7+ hrs before I started explicitly trying
> to fail the task (hadoop job -fail-task). Since I'm running with bad
> record skipping I was hoping I could issue -fail-task a few times and
> ride over the bad records but it looks like there's quite a few there.
> Since it's only 1 of the 280,000 I'm actually happy to just give up on
> the entire split.
> 
> Now if I was running a map only job I'd just kill the job since I'd
> have the output of the other 279,999. This job has a no-op reduce step
> though since I wanted to take the chance to compact the output into a
> much smaller number of sequence files ( I regret that decision now) As
> such I can't just kill the job since I'd lose the rest of the
> processed data (if I understand correctly?)
> 
> So does anyone know a way to just abandon the entire split?
> 
> Cheers,
> Mat



Re: Business logic in cleanup?

2011-11-18 Thread Arun C Murthy

On Nov 18, 2011, at 10:44 AM, Harsh J wrote:
> 
> If you could follow up on that patch, and see it through, its wish granted 
> for a lot of us as well, as we move ahead with the newer APIs in the future 
> Hadoop releases ;-)
> 

The plan is to support both the mapred and mapreduce MR APIs for the foreseeable 
future.

Arun
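
For illustration, a minimal sketch of the cache-then-flush-in-cleanup() pattern 
discussed in the quoted thread below (new mapreduce API; the class name and the 
counting logic are purely illustrative):

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class CachingMapper
            extends Mapper<LongWritable, Text, Text, LongWritable> {

        private final Map<String, Long> cache = new HashMap<String, Long>();

        @Override
        protected void map(LongWritable key, Text value, Context context) {
            String k = value.toString();
            Long old = cache.get(k);
            cache.put(k, old == null ? 1L : old + 1L);   // buffer locally, emit nothing yet
        }

        @Override
        protected void cleanup(Context context)
                throws IOException, InterruptedException {
            // Runs once after the last map() call of this attempt. If the attempt
            // fails, the re-run starts from scratch, so nothing is double-counted.
            for (Map.Entry<String, Long> e : cache.entrySet()) {
                context.write(new Text(e.getKey()), new LongWritable(e.getValue()));
            }
        }
    }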

> On 18-Nov-2011, at 10:32 PM, Something Something wrote:
> 
>> Thanks again.  Will look at Mapper.run to understand better.  Actually, just 
>> a few minutes ago I got the AVROMapper to work (which will read from AVRO 
>> files). This will hopefully improve performance even more.
>> 
>> Interesting, AVROMapper doesn't extend from Mapper, so it doesn't have the 
>> 'cleanup' method.  Instead it provides a 'close' method, which seems to 
>> behave the same way.  Honestly, I like the method name 'close' better than 
>> 'cleanup'.
>> 
>> Doug - Is there a reason you chose to not extend from 
>> org/apache/hadoop/mapreduce/Mapper?
>> 
>> Thank you all for your help.
>> 
>> 
>> On Fri, Nov 18, 2011 at 7:44 AM, Harsh J  wrote:
>> Given that you are sure about it, and you also know why thats the
>> case, I'd definitely write inside the cleanup(…) hook. No harm at all
>> in doing that.
>> 
>> Take a look at mapreduce.Mapper#run(…) method in source and you'll
>> understand what I mean by it not being a stage or even an event, but
>> just a tail call after all map()s are called.
>> 
>> On Fri, Nov 18, 2011 at 8:58 PM, Something Something
>>  wrote:
>> > Thanks again for the clarification.  Not sure what you mean by it's not a
>> > 'stage'!  Okay.. may be not a stage but I think of it as an 'Event', such 
>> > as
>> > 'Mouseover', 'Mouseout'.  The 'cleanup' is really a 'MapperCompleted' 
>> > event,
>> > right?
>> >
>> > Confusion comes with the name of this method.  The name 'cleanup' makes me
>> > think it should not be really used as 'mapperCompleted', but it appears
>> > there's no harm in using it that way.
>> >
>> > Here's our dilemma - when we use (local) caching in the Mapper & write in
>> > the 'cleanup', our job completes in 18 minutes.  When we don't write in
>> > 'cleanup' it takes 3 hours!!!  Knowing this if you were to decide, would 
>> > you
>> > use 'cleanup' for this purpose?
>> >
>> > Thanks once again for your advice.
>> >
>> >
>> > On Thu, Nov 17, 2011 at 9:35 PM, Harsh J  wrote:
>> >>
>> >> Hello,
>> >>
>> >> On Fri, Nov 18, 2011 at 10:44 AM, Something Something
>> >>  wrote:
>> >> > Thanks for the reply.  Here's another concern we have.  Let's say Mapper
>> >> > has
>> >> > finished processing 1000 lines from the input file & then the machine
>> >> > goes
>> >> > down.  I believe Hadoop is smart enough to re-distribute the input split
>> >> > that was assigned to this Mapper, correct?  After re-assigning will it
>> >> > reprocess the 1000 lines that were processed successfully before & start
>> >> > from line 1001  OR  would it reprocess ALL lines?
>> >>
>> >> Attempts of any task start afresh. That's the default nature of Hadoop.
>> >>
>> >> So, it would begin from start again and hence reprocess ALL lines.
>> >> Understand that cleanup is just a fancy API call here, thats called
>> >> after the input reader completes - not a "stage".
>> >>
>> >> --
>> >> Harsh J
>> >
>> >
>> 
>> 
>> 
>> --
>> Harsh J
>> 
> 



Re: Dropping 0.20.203 capacity scheduler into 0.20.2

2011-10-26 Thread Arun C Murthy
Sorry. This mostly won't work... we have significant changes in the interface 
between the JobTracker and schedulers (FS/CS) b/w 20.2 and 20.203 (performance, 
better limits etc.).

Your best bet might be to provision Hadoop yourself on EC2 with 0.20.203+.

Good luck!

Arun

On Oct 26, 2011, at 2:55 PM, Kai Ju Liu wrote:

> Hi. I'm currently running a Hadoop cluster on Amazon's EMR service, which 
> appears to be the 0.20.2 codebase plus several patches from the (deprecated?) 
> 0.20.3 branch. I'm interested in switching from using the fair scheduler to 
> the capacity scheduler, but I'm also interested in the user-limit-factor 
> configuration parameter introduced in 0.20.203. This parameter is not 
> available in the EMR-supplied capacity scheduler jar, so I was wondering if 
> it's possible and safe to drop the 0.20.203 capacity scheduler jar into my 
> Hadoop library path.
> 
> Any information would be very helpful. Thanks!
> 
> Kai Ju



Re: Streaming jar creates only 1 reducer

2011-10-21 Thread Arun C Murthy
You can also use -numReduceTasks <#reduces> option to streaming.
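
For example (a sketch only - paths, script names and the streaming jar location 
are illustrative and vary by install):

    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
        -numReduceTasks 20 \
        -input /user/jj/input -output /user/jj/output \
        -mapper mapper.py -reducer reducer.py \
        -file mapper.py -file reducer.py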

On Oct 21, 2011, at 10:22 PM, Mapred Learn wrote:

> Thanks Harsh !
> This is exactly what I thought !
> 
> And I don't know what you mean by cross-post? I just posted to the mapred and HDFS 
> mailing lists. What's your point about cross-posting?
> 
> Sent from my iPhone
> 
> On Oct 21, 2011, at 8:57 PM, Harsh J  wrote:
> 
>> Mapred,
>> 
>> You need to pass -Dmapred.reduce.tasks=N along. Reducers are a per-job 
>> configurable number, unlike mappers whose numbers can be determined based on 
>> inputs.
>> 
>> P.s. Please do not cross post questions to multiple lists.
>> 
>> On 22-Oct-2011, at 4:05 AM, Mapred Learn wrote:
>> 
>>> Do you know what parameters from conf files ?
>>> 
>>> Thanks,
>>> 
>>> Sent from my iPhone
>>> 
>>> On Oct 21, 2011, at 3:32 PM, Nick Jones  wrote:
>>> 
 FWIW, I usually specify the number of reducers in both streaming and
 against the Java API. The "default" is what's read from your config
 files on the submitting node.
 
 Nick Jones
 
 On Oct 21, 2011, at 5:00 PM, Mapred Learn  wrote:
 
> Hi,
> Does streaming jar create 1 reducer by default ? We have reduce tasks per 
> task tracker configured to be more than 1 but my job has about 150 
> mappers and only 1 reducer:
> 
> reducer.py basically just reads the line and prints it.
> 
> Why doesn't streaming.jar invoke multiple reducers in this case?
> 
> Thanks,
> -JJ
> 
> 
>> 



Re: output from one map reduce job as the input to another map reduce job?

2011-09-27 Thread Arun C Murthy

On Sep 27, 2011, at 12:09 PM, Kevin Burton wrote:

> Is it possible to connect the output of one map reduce job so that it is the 
> input to another map reduce job.
> 
> Basically… then reduce() outputs a key, that will be passed to another map() 
> function without having to store intermediate data to the filesystem.
> 

Currently there is no way to pipeline in such a manner - with hadoop-0.23 it's 
doable, but will take more effort.

Arun



Re: quotas for size of intermediate map/reduce output?

2011-09-21 Thread Arun C Murthy
We do track the intermediate output used, and if a job is using so much that it can't 
be scheduled anywhere on the cluster, the CS/JT will fail it. You'll need 
hadoop-0.20.204 for this, though.

Also, with MRv2 we are in the process of adding limits on disk usage for 
intermediate outputs, logs etc.

hth,
Arun

On Sep 21, 2011, at 3:45 PM, Matt Steele wrote:

> Hi All,
> 
> Is it possible to enforce a maximum to the disk space consumed by a 
> map/reduce job's intermediate output?  It looks like you can impose limits on 
> hdfs consumption, or, via the capacity scheduler, limits on the RAM that a 
> map/reduce slot uses, or the number of slots used.
> 
> But if I'm worried that a job might exhaust the cluster's disk capacity 
> during the shuffle, my sense is that I'd have to quarantine the job on a 
> separate cluster.  Am I wrong?  Do you have any suggestions for me?
> 
> Thanks,
> Matt



Re: Capacity scheduler uses all slots from the same tasktracker

2011-08-23 Thread Arun C Murthy
Were they all 'data-local' or 'rack-local' tasks? If so, it's expected.

Arun

On Aug 23, 2011, at 3:51 PM, Sulabh Choudhury wrote:

> Hi,
> 
> So I just started using capacity scheduler for M/R jobs. I have 4 task 
> trackers each with 4 map/reduce slots.
> Configured a queue so that it uses 25% (4 slots) of the available slots. I 
> was expecting that it would distribute the job and use slots from each of the 
> 4 tasktrackers but it actually uses all 4 slots from a single TT.
> Is there a configuration I am missing, or is this the expected behavior?
> 
> 



Re: Task re-scheduling in hadoop

2011-08-23 Thread Arun C Murthy
Moving to mapreduce-user@, bcc common-user@

On Aug 23, 2011, at 2:31 AM, Vaibhav Pol wrote:

> Hi  All,
>   I have a query regarding task re-scheduling. Is it possible
> to make the JobTracker wait for some time before re-scheduling a failed
> tracker's tasks?
> 

Why would you want to do that?

Typically, you want the JT to retry failed tasks as quickly as possible, so that a job 
which is going to fail does so fast, rather than running all the remaining tasks and only 
failing at the very end.

Arun

> 
> Thanks and regards,
> Vaibhav Pol
> National PARAM Supercomputing Facility
> Centre for Development of Advanced Computing
> Ganeshkhind Road
> Pune University Campus
> PUNE-Maharastra
> Phone +91-20-25704176 ext: 176
> Cell Phone :  +919850466409
> 
> 
> 



Re: Question regarding Capacity Scheduler

2011-08-17 Thread Arun C Murthy
Moving to mapreduce-user@, bcc common-user@


On Aug 17, 2011, at 10:53 AM, Matt Davies wrote:

> Hello,
> 
> I'm playing around with the Capacity Scheduler (coming from the Fair
> Scheduler), and it appears that a queue with jobs submitted by the same user
> are treated as FIFO.  So, for example, if I submit job1 and job2 to the
> "low" queue as "matt" job1 will run to completion before job2 starts.
> 
> Is there a way to share the resources in the queue between the 2 jobs when
> they both come from the same user?
> 

Not for the same user - the CS tries to get jobs done as quickly as possible 
and thus it won't share resources for the same user.

However, you can submit to diff queues or as diff. users to get resource 
sharing.

Arun

Re: Can I use the cores of each CPU to be the datanodes instead of CPU?

2011-08-08 Thread Arun C Murthy
Jun,

On Aug 8, 2011, at 2:19 AM, 谭军 wrote:
> 2 computers with 2 CPUs.
> Each CPU has 2 cores.
> Now I have 2 physical datanodes.
> Can I get 4 physical datanodes?
> I don't know whether I've made my point clear.

Running multiple datanodes on the same machine really doesn't buy you anything 
- the underlying storage is just partitioned.

OTOH, if you just want to simulate a larger cluster you'll need to set up 2 
different HADOOP_HOMEs, configs etc. and also ensure the ports don't clash. 
The same goes for running multiple tasktrackers on the same node - again, it only 
makes sense while simulating.

hth,
Arun



Re: Performance of mappers

2011-08-05 Thread Arun C Murthy

On Aug 5, 2011, at 12:31 PM, Iman E wrote:
> 
> 
> The task tracker logs does not show any problem. These are the log entries 
> for a task attempt that is too slow
> 2011-08-05 14:28:01,644 INFO org.apache.hadoop.mapred.TaskTracker: 
> LaunchTaskAction (registerTask): attempt_201108041814_0035_m_00_0 task's 
> state:UNASSIGNED
> 2011-08-05 14:28:01,644 INFO org.apache.hadoop.mapred.TaskTracker: Trying to 
> launch : attempt_201108041814_0035_m_00_0
> 2011-08-05 14:28:01,644 INFO org.apache.hadoop.mapred.TaskTracker: In 
> TaskLauncher, current free slots : 2 and trying to launch 
> attempt_201108041814_0035_m_00_0
> 2011-08-05 14:28:03,097 INFO org.apache.hadoop.mapred.TaskTracker: JVM with 
> ID: jvm_201108041814_0035_m_1371719584 given task: 
> attempt_201108041814_0035_m_00_0

Are these all the logs in the TT for that duration or just those relevant to 
the task.

IAC, you should first try with -Djava.net.preferIPv4Stack=true. I've seen cases 
where the IPv6 stack causes the child map (or reduce) task to spend a long time trying to 
establish a connection to the TaskTracker on 127.0.0.1.
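
One way to pass that flag to the child JVMs, sketched here from the job driver 
(the heap size is illustrative; the same value can equally go into mapred-site.xml):

    JobConf conf = new JobConf(MyJobDriver.class);   // MyJobDriver is a placeholder
    conf.set("mapred.child.java.opts",
             "-Xmx512m -Djava.net.preferIPv4Stack=true");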

If that doesn't help, you should move to 0.20.203 which has another set of 
fixes for task launches. Moving to the current stable release is a good idea 
nevertheless! :)

Arun

> 2011-08-05 14:32:52,341 INFO org.apache.hadoop.mapred.TaskTracker: 
> attempt_201108041814_0035_m_00_0 0.07052739% 
> 2011-08-05 14:32:55,398 INFO org.apache.hadoop.mapred.TaskTracker: 
> attempt_201108041814_0035_m_00_0 0.123025686% 
> 2011-08-05 14:32:58,402 INFO org.apache.hadoop.mapred.TaskTracker: 
> attempt_201108041814_0035_m_00_0 0.16794641% 
> 2011-08-05 14:33:01,419 INFO org.apache.hadoop.mapred.TaskTracker: 
> attempt_201108041814_0035_m_00_0 0.41990894% 
> 2011-08-05 14:33:04,804 INFO org.apache.hadoop.mapred.TaskTracker: 
> attempt_201108041814_0035_m_00_0 0.8607056% 
> 2011-08-05 14:33:06,617 INFO org.apache.hadoop.mapred.TaskTracker: 
> attempt_201108041814_0035_m_00_0 1.0% 
> 2011-08-05 14:33:06,625 INFO org.apache.hadoop.mapred.TaskTracker: Task 
> attempt_201108041814_0035_m_00_0 is done.
> 
> Thanks
> Iman
> 
>  
> 
> From: Arun C Murthy 
> To: mapreduce-user@hadoop.apache.org; Iman E 
> Sent: Friday, August 5, 2011 2:05 PM
> Subject: Re: Performance of mappers
> 
> Which release of Hadoop are you running?
> 
> What do the logs on the TaskTracker tell you during the time the slow tasks 
> are getting launched?
> 
> hadoop-0.20.203 has a ton of bug fixes since hadoop-0.20.2 which help fix 
> issues with slow launches - you might want to upgrade.
> 
> Arun
> 
> On Aug 5, 2011, at 11:02 AM, Iman E wrote:
> 
>> Hello all,
>> I have a question regarding the mappers. I can see from the logs that the 
>> start time of the mapper is different from start time of logging. I am 
>> having a problem because that time difference sometimes is few seconds, but 
>> other times it is
>>  
>> For example, take one mapper that is supposed to read 65 MB. Its start time is 
>> 12:30:53 whereas the logging start time is 12:33:01 and the end time is 
>> 12:33:20. All the loaded data is local to the same rack.
>> In a perfect run, these numbers are as follows: the start time is 18:15:45, 
>> logging start time: 18:15:48, and end time: 18:16:02.
>>  
>>  
>> I am running a job of more than 2400 mappers. Because of the above problem, 
>> instead of the job taking 15-20 mins on 14 machines (as it did in a few 
>> runs), other times it is taking more than 70 mins. Any suggestions on how to 
>> fix this problem or what could possibly be causing it.
>>  
>> Thanks,
>> Iman
> 
> 
> 



Re: Performance of mappers

2011-08-05 Thread Arun C Murthy
Which release of Hadoop are you running?

What do the logs on the TaskTracker tell you during the time the slow tasks are 
getting launched?

hadoop-0.20.203 has a ton of bug fixes since hadoop-0.20.2 which help fix 
issues with slow launches - you might want to upgrade.

Arun

On Aug 5, 2011, at 11:02 AM, Iman E wrote:

> Hello all,
> I have a question regarding the mappers. I can see from the logs that the 
> start time of the mapper is different from start time of logging. I am having 
> a problem because that time difference sometimes is few seconds, but other 
> times it is
>  
> For example, take one mapper that is supposed to read 65 MB. Its start time is 
> 12:30:53 whereas the logging start time is 12:33:01 and the end time is 
> 12:33:20. All the loaded data is local to the same rack.
> In a perfect run, these numbers are as follows: the start time is 18:15:45, 
> logging start time: 18:15:48, and end time: 18:16:02.
>  
>  
> I am running a job of more than 2400 mappers. Because of the above problem, 
> instead of the job taking 15-20 mins on 14 machines (as it did in a few 
> runs), other times it is taking more than 70 mins. Any suggestions on how to fix 
> this problem or what could possibly be causing it.
>  
> Thanks,
> Iman



Re: Reducer Run on Which Machine?

2011-08-04 Thread Arun C Murthy
Nope, currently we don't do any smart scheduling for reduces since they need to 
fetch map outputs from many nodes anyway.

Arun

On Aug 4, 2011, at 10:24 PM, Suhendry Effendy wrote:

> I understand that we can decide which task is run by which reducer in Hadoop by 
> using a custom partitioner, but is there any way to decide which reducer runs on 
> which machine?
> 
> 
> Suhendry Effendy



Re: MapReduce jobs hanging or failing near completion

2011-08-03 Thread Arun C Murthy
How long was your job stuck?

The JT should have re-run the map on a different node. Do you see 'fetch 
failures' messages in the JT logs?

The upcoming hadoop-0.20.204 release (now under discussion/vote) has better 
logging to help diagnose this in the JT logs.

Arun

On Aug 3, 2011, at 10:30 AM, Kai Ju Liu wrote:

> Hi Arun. A funny thing happened this morning: one of my jobs got stuck with 
> the "fetch failures" messages that you mentioned. There was one pending map 
> task remaining and one failed map task that had that error, and the reducers 
> were stuck at just under 33.3% completion.
> 
> Is there a solution or diagnosis for this situation? I don't know if it's 
> related to the other issue I've been having, but it would be great to resolve 
> this one for now. Thanks!
> 
> Kai Ju
> 
> On Tue, Aug 2, 2011 at 10:18 AM, Kai Ju Liu  wrote:
> All of the reducers are complete, both on the job tracker page and the job 
> details page. I used to get "fetch failure" messages when HDFS was mounted on 
> EBS volumes, but I haven't seen any since I migrated to physical disks.
> 
> I'm currently using the fair scheduler, but it doesn't look like I've 
> specified any allocations. Perhaps I'll dig into this further with the 
> Cloudera team to see if there is indeed a problem with the job tracker or 
> scheduler. Otherwise, I'll give 0.20.203 + capacity scheduler a shot.
> 
> Thanks again for the pointers.
> 
> Kai Ju
> 
> 
> On Mon, Aug 1, 2011 at 10:08 PM, Arun C Murthy  wrote:
> On Aug 1, 2011, at 9:47 PM, Kai Ju Liu wrote:
> 
>> Hi Arun. Since migrating HDFS off EBS-mounted volumes and onto ephemeral 
>> disks, the problem has actually persisted. Now, however, there is no 
>> evidence of errors on any of the mappers. The job tracker lists one less map 
>> completed than the map total, while the job details show all mappers as 
>> having completed. The jobs "hang" in this state as before.
> 
> Are any of your job's reducers completing? Do you see 'fetch failures' 
> messages either in JT logs or reducers' (tasks) logs?
> 
> If not it's clear that the JobTracker/Scheduler (which Scheduler are you 
> using btw?) are 'losing' tasks which is a serious bug. You say that you are 
> running CDH - unfortunately I have no idea what patchsets you run with it. I 
> can't, at the top of my head, remember the JT/CapacityScheduler losing a task 
> - but I maintained Yahoo clusters which ran hadoop-0.20.203.
> 
> Here is something worth trying: 
> $ cat JOBTRACKER.log | grep Assigning | grep ___m_*
> 
> The JOBTRACKER.log is the JT's log file on the JT host and if your jobid is 
> job_12345342432_0001, then  == 12345342432 and  == 
> 0001.
> 
> Good luck.
> 
> Arun
> 
>> 
>> Is there something in particular I should be looking for on my local disks? 
>> Hadoop fsck shows all clear, but I'll have to wait until morning to take 
>> individual nodes offline to check their disks. Any further details you might 
>> have would be very helpful. Thanks!
>> 
>> Kai Ju
>> 
>> On Tue, Jul 19, 2011 at 1:50 PM, Arun C Murthy  wrote:
>> Is this reproducible? If so, I'd urge you to check your local disks...
>> 
>> Arun
>> 
>> On Jul 19, 2011, at 12:41 PM, Kai Ju Liu wrote:
>> 
>>> Hi Marcos. The issue appears to be the following. A reduce task is unable 
>>> to fetch results from a map task on HDFS. The map task is re-run, but the 
>>> map task is now unable to retrieve information that it needs to run. Here 
>>> is the error from the second map task:
>>> java.io.FileNotFoundException: 
>>> /mnt/hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201107171642_0560/attempt_201107171642_0560_m_000292_1/output/spill0.out
>>> at 
>>> org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:176)
>>> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456)
>>> at org.apache.hadoop.mapred.Merger$Segment.init(Merger.java:205)
>>> at org.apache.hadoop.mapred.Merger$Segment.access$100(Merger.java:165)
>>> at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:418)
>>> at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
>>> at org.apache.hadoop.mapred.Merger.merge(Merger.java:77)
>>> at 
>>> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1547)
>>> at 
>>> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1179)
>>> at org.apache.hadoop.mapred.MapTask.runO

Re: MapReduce jobs hanging or failing near completion

2011-08-01 Thread Arun C Murthy
On Aug 1, 2011, at 9:47 PM, Kai Ju Liu wrote:

> Hi Arun. Since migrating HDFS off EBS-mounted volumes and onto ephemeral 
> disks, the problem has actually persisted. Now, however, there is no evidence 
> of errors on any of the mappers. The job tracker lists one less map completed 
> than the map total, while the job details show all mappers as having 
> completed. The jobs "hang" in this state as before.

Are any of your job's reducers completing? Do you see 'fetch failures' messages 
either in JT logs or reducers' (tasks) logs?

If not it's clear that the JobTracker/Scheduler (which Scheduler are you using 
btw?) are 'losing' tasks which is a serious bug. You say that you are running 
CDH - unfortunately I have no idea what patchsets you run with it. I can't, at 
the top of my head, remember the JT/CapacityScheduler losing a task - but I 
maintained Yahoo clusters which ran hadoop-0.20.203.

Here is something worth trying: 
$ cat JOBTRACKER.log | grep Assigning | grep ___m_*

The JOBTRACKER.log is the JT's log file on the JT host and if your jobid is 
job_12345342432_0001, then  == 12345342432 and  == 
0001.

Good luck.

Arun

> 
> Is there something in particular I should be looking for on my local disks? 
> Hadoop fsck shows all clear, but I'll have to wait until morning to take 
> individual nodes offline to check their disks. Any further details you might 
> have would be very helpful. Thanks!
> 
> Kai Ju
> 
> On Tue, Jul 19, 2011 at 1:50 PM, Arun C Murthy  wrote:
> Is this reproducible? If so, I'd urge you to check your local disks...
> 
> Arun
> 
> On Jul 19, 2011, at 12:41 PM, Kai Ju Liu wrote:
> 
>> Hi Marcos. The issue appears to be the following. A reduce task is unable to 
>> fetch results from a map task on HDFS. The map task is re-run, but the map 
>> task is now unable to retrieve information that it needs to run. Here is the 
>> error from the second map task:
>> java.io.FileNotFoundException: 
>> /mnt/hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201107171642_0560/attempt_201107171642_0560_m_000292_1/output/spill0.out
>>  at 
>> org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:176)
>>  at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456)
>>  at org.apache.hadoop.mapred.Merger$Segment.init(Merger.java:205)
>>  at org.apache.hadoop.mapred.Merger$Segment.access$100(Merger.java:165)
>>  at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:418)
>>  at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
>>  at org.apache.hadoop.mapred.Merger.merge(Merger.java:77)
>>  at 
>> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1547)
>>  at 
>> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1179)
>>  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
>>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
>>  at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>>  at java.security.AccessController.doPrivileged(Native Method)
>>  at javax.security.auth.Subject.doAs(Subject.java:396)
>>  at 
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>>  at org.apache.hadoop.mapred.Child.main(Child.java:262)
>> 
>> I have been having general difficulties with HDFS on EBS, which pointed me 
>> in this direction. Does this sound like a possible hypothesis to you? Thanks!
>> 
>> 
>> 
>> Kai Ju
>> 
>> P.S. I am migrating off of HDFS on EBS, so I will post back with further 
>> results as soon as I have them.
>> On Thu, Jul 7, 2011 at 6:36 PM, Marcos Ortiz  wrote:
>> 
>> 
>> El 7/7/2011 8:43 PM, Kai Ju Liu escribió:
>> 
>> Over the past week or two, I've run into an issue where MapReduce jobs
>> hang or fail near completion. The percent completion of both map and
>> reduce tasks is often reported as 100%, but the actual number of
>> completed tasks is less than the total number. It appears that either
>> tasks backtrack and need to be restarted or the last few reduce tasks
>> hang interminably on the copy step.
>> 
>> In certain cases, the jobs actually complete. In other cases, I can't
>> wait long enough and have to kill the job manually.
>> 
>> My Hadoop cluster is hosted in EC2 on instances of type c1.xlarge with 4
>> attached EBS volumes. The instances run Ubuntu 10.04.1 with the
>> 2.6.32-309-ec2 kernel, and I'm currently using Cloudera's CDH3u0
>> distribution. Has anyone experienced similar behavior in their clusters,
>> and if so, had any luck resolving it? Thanks!
>> 
>> Can you post here your NN and DN logs files?
>> Regards
>> 
>> Kai Ju
>> 
>> -- 
>> Marcos Luís Ortíz Valmaseda
>>  Software Engineer (UCI)
>>  Linux User # 418229
>>  http://marcosluis2186.posterous.com
>>  http://twitter.com/marcosluis2186
>> 
> 
> 



Re: Chaining Map Jobs

2011-07-29 Thread Arun C Murthy
Moving to mapreduce-user@, bcc common-user@.

Use JobControl:

http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html#Job+Control
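
A minimal sketch of chaining two jobs with the old-API JobControl (the JobConfs 
and the group name are placeholders to be filled in with real job setup):

    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.jobcontrol.Job;
    import org.apache.hadoop.mapred.jobcontrol.JobControl;

    public class ChainSketch {
        public static void main(String[] args) throws Exception {
            JobConf conf1 = new JobConf();   // configure the first map job here
            JobConf conf2 = new JobConf();   // configure the second map job here

            Job job1 = new Job(conf1);
            Job job2 = new Job(conf2);
            job2.addDependingJob(job1);      // job2 runs only after job1 succeeds

            JobControl control = new JobControl("chain");
            control.addJob(job1);
            control.addJob(job2);

            Thread runner = new Thread(control);   // JobControl implements Runnable
            runner.start();
            while (!control.allFinished()) {
                Thread.sleep(5000);
            }
            control.stop();
        }
    }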

Arun

On Jul 29, 2011, at 4:24 PM, Roger Chen wrote:

> Has anyone had experience with chaining map jobs in Hadoop framework 0.20.2?
> Thanks.
> 
> -- 
> Roger Chen
> UC Davis Genome Center



Re: what happen in my hadoop cluster?

2011-07-27 Thread Arun C Murthy
Moving to hdfs-user@, bcc mapreduce-user@. 

Your NameNode isn't coming out of safe mode because not all of the datanodes have 
rejoined the cluster...

> The ratio of reported blocks 0.2915 has not reached the threshold 0.9990. 
> Safe mode will be turned off automatically.

Can you check why your datanodes aren't joining?

Arun

On Jul 27, 2011, at 12:28 AM, 周俊清 wrote:

> hello everyone,
> I got an exception from my jobtracker's log file as follows:
> 2011-07-27 01:58:04,197 INFO org.apache.hadoop.mapred.JobTracker: Cleaning up 
> the system directory
> 2011-07-27 01:58:04,230 INFO org.apache.hadoop.mapred.JobTracker: problem 
> cleaning system directory: 
> hdfs://dn224.pengyun.org:56900/home/hadoop/hadoop-tmp203/mapred/system
> org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete 
> /home/hadoop/hadoop-tmp203/mapred/system. Name node is in safe mode.
> The ratio of reported blocks 0.2915 has not reached the threshold 0.9990. 
> Safe mode will be turned off automatically.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:1851)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:1831)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.delete(NameNode.java:691)
> ……
> and 
>the log message of namenode:
> 2011-07-27 00:00:00,219 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 1 on 56900, call delete(/home/hadoop/hadoop-tmp203/mapred/system, true) from 
> 192.168.1.224:5131
> 2: error: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot 
> delete /home/hadoop/hadoop-tmp203/mapred/system. Name node is in safe mode.
> The ratio of reported blocks 0.2915 has not reached the threshold 0.9990. 
> Safe mode will be turned off automatically.
> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete 
> /home/hadoop/hadoop-tmp203/mapred/system. Name node is in safe mode.
> The ratio of reported blocks 0.2915 has not reached the threshold 0.9990. 
> Safe mode will be turned off automatically.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:1851)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:1831)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.delete(NameNode.java:691)
> at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)
> 
>  It means, I think, that the namenode is stuck in safe mode, so what 
> can I do about these exceptions? Can anyone tell me why? I don't find the 
> file "/home/hadoop/hadoop-tmp203/mapred/system" in my system. The exceptions 
> shown above keep repeating in the log file, even when I restart my 
> hadoop cluster.
>    Thanks for your concern.
> 
> 
> 
> Junqing Zhou
> 2ho...@163.com
> 
> 
> 
> 



Re: Merge Reducers Outputs

2011-07-26 Thread Arun C Murthy
No - either your data is small enough that it can all go to a single 
reducer, or you can set up a (sampling) partitioner so that the partitions themselves 
are ordered and you get globally sorted output across multiple reduces - take a 
look at the TeraSort example for this.
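
To make the two routes concrete, a rough sketch against the old mapred API 
(paths, sampler parameters and key/value types are illustrative; the TeraSort 
example remains the authoritative reference for the sampling approach):

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.InputSampler;
    import org.apache.hadoop.mapred.lib.TotalOrderPartitioner;

    public class SortedOutputSketch {
        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(SortedOutputSketch.class);
            // ... set input/output formats, paths and key/value classes here ...

            // Option 1: small data - one reducer, one output file.
            // conf.setNumReduceTasks(1);

            // Option 2: larger data - sample the input to build a partition file,
            // then use TotalOrderPartitioner so part-00000..part-0000N are globally
            // ordered and can simply be concatenated.
            conf.setNumReduceTasks(20);
            Path partitionFile = new Path("/tmp/partitions.lst");   // hypothetical path
            TotalOrderPartitioner.setPartitionFile(conf, partitionFile);
            InputSampler.writePartitionFile(conf,
                    new InputSampler.RandomSampler<Text, Text>(0.01, 1000, 10));
            conf.setPartitionerClass(TotalOrderPartitioner.class);

            JobClient.runJob(conf);
        }
    }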

Arun

On Jul 26, 2011, at 3:52 PM, Mohamed Riadh Trad wrote:

> Dear All,
> 
> Is it possible to set up a task with multiple reducers and merge reducers 
> outputs into one single file?
> 
> Bests,
> 
> Trad Mohamed Riadh, M.Sc, Ing.
> PhD. student
> INRIA-TELECOM PARISTECH - ENPC School of International Management
> 
> Office: 11-15
> Phone: (33)-1 39 63 59 33
> Fax: (33)-1 39 63 56 74
> Email: riadh.t...@inria.fr
> Home page: http://www-rocq.inria.fr/who/Mohamed.Trad/



Re: Job tracker error

2011-07-24 Thread Arun C Murthy

On Jul 24, 2011, at 2:34 AM, Joey Echeverria wrote:

> You're running out of memory trying to generate the splits. You need to set a 
> bigger heap for your driver program. Assuming you're using the hadoop jar 
> command to launch your job, you can do this by setting HADOOP_HEAPSIZE to a 
> larger value in $HADOOP_HOME/conf/hadoop-env.sh
> 

As Harsh pointed out, please use HADOOP_CLIENT_OPTS and not HADOOP_HEAPSIZE for 
the job-client.
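
For example (the heap size and jar name are illustrative):

    export HADOOP_CLIENT_OPTS="-Xmx2048m"
    hadoop jar hadoop-streaming.jar ...   # job submission now runs with a larger client heap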

Arun

> -Joey
> 
> On Jul 24, 2011 5:07 AM, "Gagan Bansal"  wrote:
> > Hi All,
> > 
> > I am getting the following error on running a job on about 12 TB of data.
> > This happens before any mappers or reducers are launched.
> > Also the job starts fine if I reduce the amount of input data. Any ideas as
> > to what may be the reason for this error?
> > 
> > Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit
> > exceeded
> > at java.util.Arrays.copyOf(Arrays.java:2786)
> > at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:71)
> > at java.io.DataOutputStream.writeByte(DataOutputStream.java:136)
> > at org.apache.hadoop.io.UTF8.writeChars(UTF8.java:278)
> > at org.apache.hadoop.io.UTF8.writeString(UTF8.java:250)
> > at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:131)
> > at org.apache.hadoop.ipc.RPC$Invocation.write(RPC.java:111)
> > at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:741)
> > at org.apache.hadoop.ipc.Client.call(Client.java:1011)
> > at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
> > at $Proxy6.getBlockLocations(Unknown Source)
> > at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
> > at
> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > at java.lang.reflect.Method.invoke(Method.java:597)
> > at
> > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> > at
> > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> > at $Proxy6.getBlockLocations(Unknown Source)
> > at
> > org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:359)
> > at org.apache.hadoop.hdfs.DFSClient.getBlockLocations(DFSClient.java:380)
> > at
> > org.apache.hadoop.hdfs.DistributedFileSystem.getFileBlockLocations(DistributedFileSystem.java:178)
> > at
> > org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:234)
> > at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:946)
> > at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:938)
> > at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
> > at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:854)
> > at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:807)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at javax.security.auth.Subject.doAs(Subject.java:396)
> > at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
> > at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:807)
> > at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:781)
> > at
> > org.apache.hadoop.streaming.StreamJob.submitAndMonitorJob(StreamJob.java:876)
> > 
> > Gagan Bansal



Re: MapReduce jobs hanging or failing near completion

2011-07-19 Thread Arun C Murthy
Is this reproducible? If so, I'd urge you to check your local disks...

Arun

On Jul 19, 2011, at 12:41 PM, Kai Ju Liu wrote:

> Hi Marcos. The issue appears to be the following. A reduce task is unable to 
> fetch results from a map task on HDFS. The map task is re-run, but the map 
> task is now unable to retrieve information that it needs to run. Here is the 
> error from the second map task:
> java.io.FileNotFoundException: 
> /mnt/hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201107171642_0560/attempt_201107171642_0560_m_000292_1/output/spill0.out
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:176)
>   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456)
>   at org.apache.hadoop.mapred.Merger$Segment.init(Merger.java:205)
>   at org.apache.hadoop.mapred.Merger$Segment.access$100(Merger.java:165)
>   at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:418)
>   at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
>   at org.apache.hadoop.mapred.Merger.merge(Merger.java:77)
>   at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1547)
>   at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1179)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>   at org.apache.hadoop.mapred.Child.main(Child.java:262)
> 
> I have been having general difficulties with HDFS on EBS, which pointed me in 
> this direction. Does this sound like a possible hypothesis to you? Thanks!
> 
> 
> Kai Ju
> 
> P.S. I am migrating off of HDFS on EBS, so I will post back with further 
> results as soon as I have them.
> On Thu, Jul 7, 2011 at 6:36 PM, Marcos Ortiz  wrote:
> 
> 
> El 7/7/2011 8:43 PM, Kai Ju Liu escribió:
> 
> Over the past week or two, I've run into an issue where MapReduce jobs
> hang or fail near completion. The percent completion of both map and
> reduce tasks is often reported as 100%, but the actual number of
> completed tasks is less than the total number. It appears that either
> tasks backtrack and need to be restarted or the last few reduce tasks
> hang interminably on the copy step.
> 
> In certain cases, the jobs actually complete. In other cases, I can't
> wait long enough and have to kill the job manually.
> 
> My Hadoop cluster is hosted in EC2 on instances of type c1.xlarge with 4
> attached EBS volumes. The instances run Ubuntu 10.04.1 with the
> 2.6.32-309-ec2 kernel, and I'm currently using Cloudera's CDH3u0
> distribution. Has anyone experienced similar behavior in their clusters,
> and if so, had any luck resolving it? Thanks!
> 
> Can you post here your NN and DN logs files?
> Regards
> 
> Kai Ju
> 
> -- 
> Marcos Luís Ortíz Valmaseda
>  Software Engineer (UCI)
>  Linux User # 418229
>  http://marcosluis2186.posterous.com
>  http://twitter.com/marcosluis2186
> 



Re: Too many fetch-failures

2011-07-18 Thread Arun C Murthy

On Jul 18, 2011, at 3:02 PM, Geoffry Roberts wrote:

> All,
> 
> I am getting the following errors during my MR jobs (see below). Ultimately 
> the jobs finish well enough, but these errors do slow things down.  I've done 
> some reading and I understand that this is all caused by failures in my 
> network.  Is there a way of determining which node(s) in my cluster are 
> causing the problem?  
> 

The TT running on 'localhost' ran attempt_201107180916_0030_m_03_0 whose 
output couldn't be fetched. Take a look at the TT logs and see what you find.

Arun



> Thanks
> 
> 11/07/18 14:53:06 INFO mapreduce.Job:  map 99% reduce 28%
> 11/07/18 14:53:10 INFO mapreduce.Job:  map 100% reduce 28%
> 11/07/18 14:53:15 INFO mapreduce.Job: Task Id : 
> attempt_201107180916_0030_m_03_0, Status : FAILED
> Too many fetch-failures
> 11/07/18 14:53:15 WARN mapreduce.Job: Error reading task 
> outputhttp://localhost:50060/tasklog?plaintext=true&attemptid=attempt_201107180916_0030_m_03_0&filter=stdout
> 11/07/18 14:53:15 WARN mapreduce.Job: Error reading task 
> outputhttp://localhost:50060/tasklog?plaintext=true&attemptid=attempt_201107180916_0030_m_03_0&filter=stderr
> 11/07/18 14:53:17 INFO mapreduce.Job:  map 100% reduce 29%
> 11/07/18 14:53:19 INFO mapreduce.Job:  map 96% reduce 29%
> 11/07/18 14:53:25 INFO mapreduce.Job:  map 98% reduce 29%
> 
> 
> -- 
> Geoffry Roberts
> 



Re: New to hadoop, trying to write a customary file split

2011-07-18 Thread Arun C Murthy
Hey Steve,

 Want to contribute it as an example to MR? Would love to help.

thanks,
Arun

On Jul 11, 2011, at 12:11 PM, Steve Lewis wrote:

> Look at this sample 
> =
> package org.systemsbiology.hadoop;
> 
> 
> 
> import org.apache.hadoop.conf.*;
> import org.apache.hadoop.fs.*;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.io.*;
> import org.apache.hadoop.io.compress.*;
> import org.apache.hadoop.mapreduce.*;
> import org.apache.hadoop.mapreduce.lib.input.*;
> 
> import java.io.*;
> import java.util.*;
> 
> /**
>  * org.systemsbiology.xtandem.hadoop.XMLTagInputFormat
>  * Splitter that reads scan tags from an XML file
>  * No assumption is made about lines, but the start and end tags MUST look like 
>  * "<tag>" and "</tag>" with no embedded spaces
>  * usually you will subclass and hard code the tag you want to split on
>  */
> public class XMLTagInputFormat extends FileInputFormat {
> public static final XMLTagInputFormat[] EMPTY_ARRAY = {};
> 
> 
> private static final double SPLIT_SLOP = 1.1;   // 10% slop
> 
> 
> public static final int BUFFER_SIZE = 4096;
> 
> private final String m_BaseTag;
> private final String m_StartTag;
> private final String m_EndTag;
> private String m_Extension;
> 
> public XMLTagInputFormat(final String pBaseTag) {
> m_BaseTag = pBaseTag;
> m_StartTag = "<" + pBaseTag;
> m_EndTag = "</" + pBaseTag + ">";
> 
> }
> 
> public String getExtension() {
> return m_Extension;
> }
> 
> public void setExtension(final String pExtension) {
> m_Extension = pExtension;
> }
> 
> public boolean isSplitReadable(InputSplit split) {
> if (!(split instanceof FileSplit))
> return true;
> FileSplit fsplit = (FileSplit) split;
> Path path1 = fsplit.getPath();
> return isPathAcceptable(path1);
> }
> 
> protected boolean isPathAcceptable(final Path pPath1) {
> String path = pPath1.toString().toLowerCase();
> if(path.startsWith("part-r-"))
> return true;
> String extension = getExtension();
> if (extension != null && path.endsWith(extension.toLowerCase()))
> return true;
> if (extension != null && path.endsWith(extension.toLowerCase() + 
> ".gz"))
> return true;
> if (extension == null )
> return true;
> return false;
> }
> 
> public String getStartTag() {
> return m_StartTag;
> }
> 
> public String getBaseTag() {
> return m_BaseTag;
> }
> 
> public String getEndTag() {
> return m_EndTag;
> }
> 
> @Override
> public RecordReader createRecordReader(InputSplit split,
>TaskAttemptContext 
> context) {
> if (isSplitReadable(split))
> return new MyXMLFileReader();
> else
> return NullRecordReader.INSTANCE; // do not read
> }
> 
> @Override
> protected boolean isSplitable(JobContext context, Path file) {
> String fname = file.getName().toLowerCase();
> if(fname.endsWith(".gz"))
> return false;
> return true;
> }
> 
> /**
>  * Generate the list of files and make them into FileSplits.
>  * This needs to be copied to insert a filter on acceptable data
>  */
> @Override
> public List<InputSplit> getSplits(JobContext job
> ) throws IOException {
> long minSize = Math.max(getFormatMinSplitSize(), 
> getMinSplitSize(job));
> long maxSize = getMaxSplitSize(job);
> 
> // generate splits
> List<InputSplit> splits = new ArrayList<InputSplit>();
> for (FileStatus file : listStatus(job)) {
> Path path = file.getPath();
> if (!isPathAcceptable(path))   // filter acceptable data
> continue;
> FileSystem fs = path.getFileSystem(job.getConfiguration());
> long length = file.getLen();
> BlockLocation[] blkLocations = fs.getFileBlockLocations(file, 0, 
> length);
> if ((length != 0) && isSplitable(job, path)) {
> long blockSize = file.getBlockSize();
> long splitSize = computeSplitSize(blockSize, minSize, 
> maxSize);
> 
> long bytesRemaining = length;
> while (((double) bytesRemaining) / splitSize > SPLIT_SLOP) {
> int blkIndex = getBlockIndex(blkLocations, length - 
> bytesRemaining);
> splits.add(new FileSplit(path, length - bytesRemaining, 
> splitSize,
> blkLocations[blkIndex].getHosts()));
> bytesRemaining -= splitSize;
> }
> 
> if (bytesRemaining != 0) {
> splits.add(new FileSplit(path, length - bytesRemaining, 
> bytesRemaining,
> blkLocations[blkLocations.length - 
> 1].getHosts()));
>

Re: Lack of data locality in Hadoop-0.20.2

2011-07-12 Thread Arun C Murthy
As Aaron mentioned, the scheduler has very little leeway when you have a single 
replica.

OTOH, schedulers equate rack-locality to node-locality - this makes sense 
for a large-scale system since intra-rack b/w is good enough for most installs 
of Hadoop.

Arun

On Jul 12, 2011, at 7:36 AM, Virajith Jalaparti wrote:

> I am using a replication factor of 1 since I don't want to incur the overhead of 
> replication and I am not much worried about reliability. 
> 
> I am just using the default Hadoop scheduler (FIFO, I think!). In case of a 
> single rack, rack-locality doesn't really have any meaning. Obviously 
> everything will run in the same rack. I am concerned about data-local maps. I 
> assumed that Hadoop would do a much better job at ensuring data-local maps 
> but it doesn't seem to be the case here.
> 
> -Virajith
> 
> On Tue, Jul 12, 2011 at 3:30 PM, Arun C Murthy  wrote:
> Why are you running with replication factor of 1?
> 
> Also, it depends on the scheduler you are using. The CapacityScheduler in 
> 0.20.203 (not 0.20.2) has much better locality for jobs, similarly with 
> FairScheduler.
> 
> IAC, running on a single rack with replication of 1 implies rack-locality for 
> all tasks which, in most cases, is good enough.
> 
> Arun
> 
> On Jul 12, 2011, at 5:45 AM, Virajith Jalaparti wrote:
> 
> > Hi,
> >
> > I was trying to run the Sort example in Hadoop-0.20.2 over 200GB of input 
> > data using a 20-node cluster. HDFS is configured to use a 128MB 
> > block size (so 1600 maps are created) and a replication factor of 1 is being 
> > used. All the 20 nodes are also hdfs datanodes. I was using a bandwidth 
> > value of 50Mbps between each of the nodes (this was configured using linux 
> > "tc"). I see that around 90% of the map tasks are reading data over the 
> > network i.e. most of the map tasks are not being scheduled at the nodes 
> > where the data to be processed by them is located.
> > My understanding was that Hadoop tries to schedule as many data-local maps 
> > as possible. But in this situation, this does not seem to happen. Any 
> > reason why this is happening? And is there a way to actually configure 
> > Hadoop to ensure the maximum possible node locality?
> > Any help regarding this is very much appreciated.
> >
> > Thanks,
> > Virajith
> 
> 



Re: Lack of data locality in Hadoop-0.20.2

2011-07-12 Thread Arun C Murthy
Why are you running with replication factor of 1?

Also, it depends on the scheduler you are using. The CapacityScheduler in 
0.20.203 (not 0.20.2) has much better locality for jobs, similarly with 
FairScheduler.

IAC, running on a single rack with replication of 1 implies rack-locality for 
all tasks which, in most cases, is good enough.

Arun

On Jul 12, 2011, at 5:45 AM, Virajith Jalaparti wrote:

> Hi,
> 
> I was trying to run the Sort example in Hadoop-0.20.2 over 200GB of input 
> data using a 20-node cluster. HDFS is configured to use a 128MB block 
> size (so 1600 maps are created) and a replication factor of 1 is being used. 
> All the 20 nodes are also hdfs datanodes. I was using a bandwidth value of 
> 50Mbps between each of the nodes (this was configured using linux "tc"). I 
> see that around 90% of the map tasks are reading data over the network i.e. 
> most of the map tasks are not being scheduled at the nodes where the data to 
> be processed by them is located. 
> My understanding was that Hadoop tries to schedule as many data-local maps as 
> possible. But in this situation, this does not seem to happen. Any reason why 
> this is happening? And is there a way to actually configure Hadoop to ensure 
> the maximum possible node locality?
> Any help regarding this is very much appreciated.
> 
> Thanks,
> Virajith
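
For anyone tuning this: the simplest lever is to raise the replication factor so the 
scheduler has more candidate nodes holding each block. A minimal sketch of bumping 
replication on input data that is already in HDFS - the path below is a placeholder, 
not taken from this thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RaiseReplication {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // With more replicas of each block, more nodes are eligible to run a
        // given map task data-locally.
        boolean scheduled = fs.setReplication(new Path("/data/sort-input"), (short) 3); // placeholder path
        System.out.println("re-replication scheduled: " + scheduled);
    }
}

Note this only changes existing files; new files pick up the dfs.replication value 
from the client configuration that writes them.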



Re: Distribute native library within a job jar

2011-07-10 Thread Arun C Murthy
Jarod,

On Jul 10, 2011, at 3:13 PM, Donghan (Jarod) Wang wrote:

> Hey Arun,
> 
> Thank you for the reply. The way you mentioned requires setting up
> native libraries somewhere on the hdfs before starting the job, which
> is what I am trying to avoid. What I want is bundling the libraries
> within the job JAR, in other words the libraries are shipped with the
> JAR and need not be pre-installed on the system. And once the job gets
> running, it extracts the lib from the job JAR and System.load it. I
> wonder if it is possible.
> 

It's possible, but very tedious.

Currently the framework (0.20.xxx) unjars the job.jar for you, but that is going 
away in 0.23 (maybe even 0.22). Even then, you'll have to manually figure out the 
path, load it, etc.

OTOH, using the DC is supported. Even better, the native .so will be shared 
across jobs - so it's downloaded into the DC only once and re-used. I'd highly 
recommend that.

hth,
Arun

> Thanks,
> Jarod
> 
> On Sat, Jul 9, 2011 at 3:20 PM, Arun C Murthy  wrote:
>> Jarod,
>> 
>> On Jul 9, 2011, at 12:08 PM, Donghan (Jarod) Wang wrote:
>> 
>>> Hey all,
>>> 
>>> I'm working on a project that uses a native c library. Although I can
>>> use DistributedCache as a way to distribute the c library, I'd like to
>>> use the jar to do the job. What I mean is packing the c library into
>>> the job jar, and writing code in a way that the job can find the
>>> library once it gets submitted. I wonder if this is possible. If so
>>> how can I obtain the path in the code.
>> 
>> 
>> Just add it as a cache-file in the distributed cache, enable the
>> symlink and just System.load the filename (of the symlink).
>> 
>> More details: 
>> http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html#DistributedCache
>> 
>> hth,
>> Arun



Re: About the combiner execution

2011-07-10 Thread Arun C Murthy
(Moving to mapreduce-user@, bcc hdfs-user@. Please use appropriate project 
lists - thanks)

On Jul 10, 2011, at 4:42 AM, Florin P wrote:

> Hello!
>  I've read on 
> http://www.fromdev.com/2010/12/interview-questions-hadoop-mapreduce.html 
> (cite):
> "The execution of combiner is not guaranteed, Hadoop may or may not execute a 
> combiner. Also, if required it may execute it more then 1 times. Therefore 
> your MapReduce jobs should not depend on the combiners execution. "
> Is it true? 

Right. The way to visualize it is that the MR framework in the map task collects 
the 'raw' (i.e. serialized) map-output key-values in the 'sort' buffer. When 
the buffer is full, it runs the combiner (if available) and then spills to 
disk - this happens even for the last (final) spill. The combiner is also run 
when multiple spills have to be merged from disk. 

However, the combiner execution also depends on having a sufficient number of 
records to combine - this is because combiner execution is somewhat expensive, 
since we need an extra serialize-deserialize pair.

Thus, the combiner may be run 0 or more times. 

> Also is it possible to use the Combiner without the Reducer? The framework 
> will take into the consideration the Combiner in this case?


No. When the job has no reduces, the map-outputs are (typically) written straight 
to HDFS without sorting them. Thus, combiners are never in that execution path.

hth,
Arun
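
To make that contract concrete, here is a minimal word-count style driver (new API; 
input/output paths are placeholders) showing that a combiner is just a Reducer wired 
in via setCombinerClass. Because the framework may run it zero or more times, it must 
be safe to apply repeatedly and must emit the same key/value types it consumes:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.map.TokenCounterMapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

public class CombinerExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "combiner example");
        job.setJarByClass(CombinerExample.class);
        job.setMapperClass(TokenCounterMapper.class);   // emits (word, 1)
        // The combiner reuses the reducer class: its input and output types
        // match the map output types, and it may run 0, 1 or many times.
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/tmp/wc-in"));    // placeholder
        FileOutputFormat.setOutputPath(job, new Path("/tmp/wc-out")); // placeholder
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}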

Re: Distribute native library within a job jar

2011-07-09 Thread Arun C Murthy
Jarod,

On Jul 9, 2011, at 12:08 PM, Donghan (Jarod) Wang wrote:

> Hey all,
> 
> I'm working on a project that uses a native c library. Although I can
> use DistributedCache as a way to distribute the c library, I'd like to
> use the jar to do the job. What I mean is packing the c library into
> the job jar, and writing code in a way that the job can find the
> library once it gets submitted. I wonder if this is possible. If so
> how can I obtain the path in the code.

Just add it as a cache-file in the distributed cache, enable the
symlink and just System.load the filename (of the symlink).

More details: 
http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html#DistributedCache

hth,
Arun
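
A minimal sketch of that recipe - the HDFS location and library name below are 
placeholders, not from this thread:

import java.io.File;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;

public class NativeLibSetup {

    // Call from the job driver, before submitting the job.
    public static void addNativeLib(Configuration conf) throws Exception {
        DistributedCache.createSymlink(conf);
        // The "#libfoo.so" fragment names the symlink the framework creates
        // in each task's working directory.
        DistributedCache.addCacheFile(new URI("hdfs:///libs/libfoo.so#libfoo.so"), conf);
    }

    // Call once from the task side, e.g. in Mapper.setup()/configure().
    public static void loadNativeLib() {
        // System.load wants an absolute path; the symlink is in the task cwd.
        System.load(new File("libfoo.so").getAbsolutePath());
    }
}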

