Re: MR job output report

2013-03-11 Thread Jagmohan Chauhan
The logs of any job can be seen with the following command: hadoop job -history all output-directory-for-the-job. For example: hadoop job -history all /user/hduser3/sort-output On Mon, Mar 11, 2013 at 3:32 PM, John Meza wrote: > All, > Q. How does everyone refer to the typical output report of a Ma
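The command from the reply can be written out as a short shell session (the output directory shown is the example quoted in the thread; substitute your own job's directory):

```shell
# Print the full history report for a completed job, given the job's
# HDFS output directory (path is the example from the thread):
hadoop job -history all /user/hduser3/sort-output
```

This reads the job history files the framework writes for the job (in Hadoop 1.x, typically under the output directory's _logs/history), so it only works once the job has finished and its history has been written.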

MR job output report

2013-03-11 Thread John Meza
All, Q. How does everyone refer to the typical output report of a MapReduce job (below)? What is it called? Q. Isn't that "report" saved somewhere? Is it reproducible? I've looked at the HDFS job output directory, which contains a log, but that doesn't have all the info in the report. Q. How can I

Configuring HDFS access over SSL / TLS

2013-03-11 Thread Michael Allen
Hi all, I'm new to the Hadoop community. I'm trying to figure out if I'm missing something, or if there is really no straightforward way to enable HDFS access over SSL / TLS. From the discussion I've seen on various sites and this mailing list, one may configure SSL for: - the HDFS proxy server

Re: Hadoop cluster hangs on big hive job

2013-03-11 Thread Luke Lu
You mean HDFS-4479? The log seems to indicate the infamous jetty hang issue (MAPREDUCE-2386) though. On Mon, Mar 11, 2013 at 1:52 PM, Suresh Srinivas wrote: > I have seen one such problem related to big hive jobs that open a lot of > files. See HDFS-4496 for more details. Snippet from the descr

Re: Hadoop cluster hangs on big hive job

2013-03-11 Thread Suresh Srinivas
I have seen one such problem related to big hive jobs that open a lot of files. See HDFS-4496 for more details. Snippet from the description: The following issue was observed in a cluster that was running a Hive job and was writing to 100,000 temporary files (each task is writing to 1000s of files)

Re: jetty7 and hadoop1.1.2

2013-03-11 Thread Алексей Бабутин
I have added a task in JIRA: HADOOP-9395 2013/3/11 Jean-Marc Spaggiari > Sorry, wrong link. > > For Hadoop please use this one: > https://issues.apache.org/jira/browse/HADOOP > > JM > > 2013/3/11 Jean-Marc Spaggiari : > > Hi Alexey, > > > > Can y

Re: Hadoop cluster hangs on big hive job

2013-03-11 Thread Daning Wang
[hive@mr3-033 ~]$ hadoop version Hadoop 1.0.4 Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290 Compiled by hortonfo on Wed Oct 3 05:13:58 UTC 2012 On Sun, Mar 10, 2013 at 8:16 AM, Suresh Srinivas wrote: > What is the version of hadoop? > > Sent from phone

Re: jetty7 and hadoop1.1.2

2013-03-11 Thread Jean-Marc Spaggiari
Sorry, wrong link. For Hadoop please use this one: https://issues.apache.org/jira/browse/HADOOP JM 2013/3/11 Jean-Marc Spaggiari : > Hi Alexey, > > Can you please go to JIRA: https://issues.apache.org/jira/browse/HBASE > , create a ticket, and attach your patch to it? > > Thanks, > > JM > > 20

Re: jetty7 and hadoop1.1.2

2013-03-11 Thread Jean-Marc Spaggiari
Hi Alexey, Can you please go to JIRA: https://issues.apache.org/jira/browse/HBASE , create a ticket, and attach your patch to it? Thanks, JM 2013/3/11 Алексей Бабутин : > Patch attached. > > After the patch, Hadoop compiles and works fine, but with one caveat. > In HttpServer.java I have commented out de

Re: jetty7 and hadoop1.1.2

2013-03-11 Thread Алексей Бабутин
Patch attached. After the patch, Hadoop compiles and works fine, but with one caveat. In HttpServer.java I have commented out defineFilter(webAppContext, KRB5_FILTER ... because it is not working properly with defineFilter(...) at line 435 in holder.setInitParameters(parameters). Then you can remove lines 448-450, i h

Namenode caching hostname resolution and tries to bind to an old ip address.

2013-03-11 Thread Michał Czerwiński
I had to change the instance type of a namenode running in EC2, which requires host shutdown/startup, and that involves an IP change unless you are using Elastic IP addresses. We are using FQDNs to access our Hadoop cluster, so after host startup I've updated this particular DNS entry to point to a new IP. H

Re: Unexpected Hadoop behavior: map task re-running after reducer has been running

2013-03-11 Thread Ravi Prakash
This is not unexpected behavior. If there are fetch failures on the Reduce (i.e. it's not able to get the map outputs), then a map may be rerun. From: David Parks To: "user@hadoop.apache.org" Sent: Monday, March 11, 2013 3:30 AM Subject: Re: Unexpected Hadoop

Re: OutOfMemory during Plain Java MapReduce

2013-03-11 Thread Christian Schneider
Thanks Paul and Harsh for your tips! I implemented the secondary sort and the related mapper. This is a very good idea to get a unique set. The original question of how to translate the "huge" values (in terms of a "large" list of users for one key) into the format I need is still "somehow" open. I

Re: Unexpected Hadoop behavior: map task re-running after reducer has been running

2013-03-11 Thread Bejoy Ks
Hi David, The issue with the maps getting re-triggered is that one of the nodes where map outputs are stored is getting lost during the reduce phase. As a result, the map outputs are no longer available from that node for the reduce to process, and the job is re-triggered again. Can you try re tri

Re:

2013-03-11 Thread Vikas Jadhav
Hello, you can modify the log4j.properties file in your conf folder and set an appropriate logging level. For modifying source code, I think you should follow the Hadoop wiki: http://wiki.apache.org/hadoop/HowToContribute Or, it is very easy; you can also refer to the post by me at http://ahadoop.blogspot.in/ Steps
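The conf/log4j.properties change described above might look like the following fragment (the package name and DEBUG level are illustrative choices, not something the thread prescribes):

```properties
# conf/log4j.properties -- raise the logging level for one package
# (org.apache.hadoop.mapred and DEBUG are illustrative choices):
log4j.logger.org.apache.hadoop.mapred=DEBUG
```

Narrowing the logger to a specific package keeps the rest of the daemon logs at their default verbosity.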

Re:

2013-03-11 Thread Vikas Jadhav
Hello, I think you should follow the Hadoop wiki: http://wiki.apache.org/hadoop/HowToContribute It is very easy; you can also refer to the post by me at http://ahadoop.blogspot.in/ Steps To build Hadoop Code using eclipse On

Re:

2013-03-11 Thread Ellis Miller
Sorry... was typing another email :) Anyway, the default config files across master and slave nodes: - src/core-site.xml, src/hdfs/hdfs-default.xml, and the one you would want to be standard, of course, src/mapred/mapred-default.xml For each DataNode / TaskTracker across however many Virt

RE: error while running reduce

2013-03-11 Thread Samir Kumar Das Mohapatra
mapred.reduce.child.java.opts -Xmx1024M You have to increase the memory with the above parameter, because it is showing only 1 GB. Also, a VM will consume more memory compared with a real physical system. Just try to increase the VM memory first, then increase the reduce memory. From: Arindam Choudh
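Expressed as a mapred-site.xml property block, the setting named in the reply would look like this (1024 MB is the value quoted above; tune it to the memory your VM actually has):

```xml
<!-- mapred-site.xml: heap size for reduce-task child JVMs -->
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1024M</value>
</property>
```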

Re:

2013-03-11 Thread Ellis Miller
Could use a UDF to extend Task.java, yet if it's a true cluster or appropriately simulated, then the basic standard is: 1. Master is comprised of the NameNode and JobTracker 2. All others are slave hosts or virtual machines consisting of DataNodes and TaskTrackers in particular So, on slave hosts could
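In Hadoop 1.x, the master/slave layout sketched above is reflected in two plain-text files under conf/ (the hostnames below are hypothetical). Note that, despite its name, conf/masters lists the secondary NameNode host(s); the slave daemons are started on the hosts listed in conf/slaves:

```text
# conf/masters -- host(s) on which to run the secondary namenode
master-host

# conf/slaves -- one DataNode/TaskTracker host per line
slave-01
slave-02
```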

Re:

2013-03-11 Thread ????????????
I think you are right. You will need to compile the source files. -- Original -- From: "preethi ganeshan"; Date: Mon, Mar 11, 2013 05:13 PM To: "user"; Subject: Hi, I want to modify Task.java so that it gives additional information in the usrlogs files. Ho

[no subject]

2013-03-11 Thread preethi ganeshan
Hi, I want to modify Task.java so that it gives additional information in the usrlogs files. How do I go about the modification? I am new to Hadoop. Shall I simply open the appropriate file under src/mapred in Eclipse, modify it, and save? Will that help? Thank you. Regards, Preethi Ganeshan

Re: Unexpected Hadoop behavior: map task re-running after reducer has been running

2013-03-11 Thread David Parks
I should have included the obvious log files here.
Task attempt_201303080219_0002_r_005846_0 failed to report status for 7201 seconds. Killing!
Task attempt_201303080219_0002_r_005857_0 failed to report status for 7203 seconds. Killing!
Task attempt_201303080219_0002_r_005858_0 failed to report

Unexpected Hadoop behavior: map task re-running after reducer has been running

2013-03-11 Thread David Parks
I can't explain this behavior, can someone help me here:
Kind    % Complete  Num Tasks  Pending  Running  Complete  Killed  Failed/Killed Task Attempts
map     100.00%     23547     0        1        23546     0       247 / 0
reduce  62.40%      13738     30       6232     0        336