[jira] [Commented] (MAPREDUCE-5709) Getting ClassNotFoundException even though jar is included in lib folder

2014-01-14 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13870505#comment-13870505
 ] 

Harsh J commented on MAPREDUCE-5709:


Ajesh,

Please respond to the version request above so we can determine whether you're 
reporting a new bug rather than something we've since fixed.

 Getting ClassNotFoundException even though jar is included in lib folder
 

 Key: MAPREDUCE-5709
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5709
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: task
 Environment: GNU/Linux 3.2.0-29-generic x86_64
Reporter: Ajesh Kumar

 In YARN, we are getting the exception below. The same job runs fine in 
 MRv1. The jar containing CSVReader.class is included in the lib folder.
 We also tried setting the jar in HADOOP_CLASSPATH, but still get the same 
 exception.
 13/12/31 09:26:37 INFO mapreduce.Job: map 0% reduce 0%
 13/12/31 09:26:42 INFO mapreduce.Job: Task Id : 
 attempt_1388462142258_0001_m_03_0, Status : FAILED
 Error: java.lang.RuntimeException: Error in configuring object
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
 at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:72)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
 ..
 ..
 Caused by: java.lang.NoClassDefFoundError: au/com/bytecode/opencsv/CSVReader
 at 
 com.tcs.nextgen.pe.wm.datagen.maptask.MPOCDataMapperNewest.configure(MPOCDataMapperNewest.java:56)
 ... 22 more
 Caused by: java.lang.ClassNotFoundException: au.com.bytecode.opencsv.CSVReader
 at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
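As a side note, a quick way to confirm whether the dependency is actually visible to the task JVM is to probe for it from the task's context classloader. The sketch below is illustrative only (the helper class and method names are made up); the class name comes from the stack trace above.

```java
// Illustrative fail-fast probe: check that a dependency class is visible to
// the current context classloader before the task body runs, so a missing
// jar surfaces as a clear message instead of a NoClassDefFoundError later.
public class ClasspathCheck {
    /** Returns true if className can be loaded by the context classloader. */
    public static boolean isLoadable(String className) {
        try {
            Class.forName(className, false,
                    Thread.currentThread().getContextClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Called from e.g. Mapper.configure()/setup() with the class in doubt:
        if (!isLoadable("au.com.bytecode.opencsv.CSVReader")) {
            System.err.println("opencsv is not on the task classpath; "
                    + "ship the jar with -libjars or add it to the job classpath");
        }
    }
}
```

On YARN, jars in a local lib folder are not shipped to task containers automatically; the usual routes are -libjars on the command line or bundling the dependency inside the job jar's own lib/ directory.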



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (MAPREDUCE-5720) RM occur exception while unregistering

2014-01-14 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13870572#comment-13870572
 ] 

Steve Loughran commented on MAPREDUCE-5720:
---

I'd flag up JRockit as a possible source of trouble too. Try to replicate on 
Oracle Java 6 or Java 7, from Oracle or OpenJDK.

 RM occur exception while unregistering
 --

 Key: MAPREDUCE-5720
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5720
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.2.0
 Environment: rhel 5.8_x86 ;jrockit 1.6.0_31-R28.2.3-4.1.0;hadoop 2.2.0
Reporter: chillon_m







[jira] [Commented] (MAPREDUCE-5663) Add an interface to Input/Output Formats to obtain delegation tokens

2014-01-14 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13870835#comment-13870835
 ] 

Alejandro Abdelnur commented on MAPREDUCE-5663:
---

The Oozie server is responsible for obtaining all the tokens the main job may 
need:

* tokens to run the job (working dir, jobtokens)
* tokens for the Input and Output data (typically HDFS tokens, but they can be 
for different file systems, for HBase, for HCatalog, etc.).

For the typical case of running an MR job (directly or via Pig/Hive), the 
tokens of the launcher job are sufficient for the main job. They just need to be 
propagated. The Oozie server makes sure the 
mapreduce.job.complete.cancel.delegation.tokens property is set to FALSE for 
the launcher job (Oozie gets rid of the launcher job for MR jobs once the main 
job is running).

For scenarios where the main job needs to interact with different services, 
Oozie must acquire the tokens in advance. For HDFS this is done simply by 
setting the MRJobConfig.JOB_NAMENODES property; the launcher job submission 
will then get those tokens. For HBase or HCatalog, Oozie has a 
CredentialsProvider that obtains those tokens (the requirement here is that 
Oozie is configured as a proxy user in those services in order to get tokens 
for the user submitting the job).

From what it seems, you are after generalizing this. I think we should do it 
with a slight twist on what you are proposing:

* DelegationTokens should always be requested by the client, security enabled 
or not, computing the splits on the client or not.
* DelegationTokens fetching should be done regardless of the IF/OF 
implementation (take the case of talking with HBase or HCatalog, job working 
dir service).
* DelegationTokens fetching should not be tied to split computation.

We could have a utility class that is passed a UGI and a list of service URIs 
and returns a populated Credentials with tokens for all the specified services.

The IF/OF/Job would have to be able to extract the required URIs for the job.

Also, this mechanism could be used to obtain ALL tokens the AM needs.


 Add an interface to Input/Output Formats to obtain delegation tokens
 ---

 Key: MAPREDUCE-5663
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5663
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Michael Weng
 Attachments: MAPREDUCE-5663.4.txt, MAPREDUCE-5663.5.txt, 
 MAPREDUCE-5663.6.txt, MAPREDUCE-5663.patch.txt, MAPREDUCE-5663.patch.txt2, 
 MAPREDUCE-5663.patch.txt3


 Currently, delegation tokens are obtained as part of the getSplits / 
 checkOutputSpecs calls to the InputFormat / OutputFormat respectively.
 This works as long as the splits are generated on a node with kerberos 
 credentials. For split generation elsewhere (AM for example), an explicit 
 interface is required.





[jira] [Updated] (MAPREDUCE-5718) MR AM should tolerate RM restart/failover during commit

2014-01-14 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-5718:


Attachment: mr-5718-0.patch

First-cut patch that deletes the startCommitFile if the commit is interrupted. 

However, in case of two AMs running during a partition, this can lead to one AM 
deleting the startCommitFile created by another AM. To avoid races in case of a 
partition, we might have to complicate this a little more. 

How about adding a .host.pid suffix to the name of the commit file? Each AM 
would write its own. When a subsequent AM comes up and verifies the state of 
commit from previous AMs, it would look for any of these files. [~vinodkv], 
[~revans2] - thoughts? 
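The naming scheme being floated might look like the following; the base name is a stand-in, and all file-system interaction is omitted.

```java
import java.net.InetAddress;

// Sketch of a host/pid-suffixed commit marker name, so that two AMs running
// concurrently during a partition each write their own file and never delete
// the other's. The base name "COMMIT_STARTED" is illustrative.
public class CommitFileName {
    /** e.g. "COMMIT_STARTED.host1.12345" */
    public static String suffixed(String base, String host, long pid) {
        return base + "." + host + "." + pid;
    }

    public static void main(String[] args) throws Exception {
        String host = InetAddress.getLocalHost().getHostName();
        long pid = ProcessHandle.current().pid();   // Java 9+
        System.out.println(suffixed("COMMIT_STARTED", host, pid));
    }
}
```

A recovering AM would then list the staging directory for any entries with the base-name prefix, rather than checking for one fixed file.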

 MR AM should tolerate RM restart/failover during commit
 ---

 Key: MAPREDUCE-5718
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5718
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.4.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
  Labels: ha
 Attachments: mr-5718-0.patch


 While testing RM HA, we ran into this issue where if the RM fails over while 
 an MR AM is in the middle of a commit, the subsequent AM gets spawned but 
 dies with the diagnostic message "We crashed durring a commit". 





[jira] [Commented] (MAPREDUCE-5663) Add an interface to Input/Output Formats to obtain delegation tokens

2014-01-14 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13870997#comment-13870997
 ] 

Siddharth Seth commented on MAPREDUCE-5663:
---

bq. DelegationTokens should always be requested by the client, security enabled 
or not, computing the splits on the client or not.
I think it is necessary for the client to request the required tokens (directly 
or indirectly). Whether this should be done independent of security is something 
I'm not too sure about - mainly because services may not handle getToken 
requests correctly if security is disabled. The JobClient currently doesn't do 
this, at least for HDFS.

bq. DelegationTokens fetching should be done regardless of the IF/OF 
implementation (take the case of talking with HBase or HCatalog, job working 
dir service).
The intent of adding this interface is to be able to fetch tokens irrespective 
of the IF/OF - assuming the IF/OF implement the interface. For HBase / HCatalog 
sources which are outside of the IF/OF for an MR job - I don't think we have the 
capability for fetching tokens, and rely on the user providing them up front. 
That seems like a reasonable approach for now. Alternately, we could add a 
config specifying a list of classes which implement this interface - and can be 
invoked by the client code.

bq. DelegationTokens fetching should not be tied to split computation.
Completely agree with this. I don't think we can do this though - without 
making an incompatible change. We could explicitly fetch Credentials (if the 
interface is implemented), but at least some existing IF/OFs will continue to 
rely on getSplits / checkOutputSpecs for tokens.

bq. We could have a utility class that is passed a UGI and a list of service 
URIs and returns a populated Credentials with tokens for all the specified 
services. The IF/OF/Job would have to be able to extract the required URIs for 
the job.
Would this utility class know how to handle all kinds of URIs? I think it's 
better to leave the implementation of the credentials-fetching code to the 
specific system (MR / HBase / HCatalog), and configure a list of 
CredentialProviders that know how to fetch Credentials for the specific 
system.
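The config-driven alternative could be as simple as reflectively instantiating a comma-separated list of provider class names. The interface and method names below are invented for illustration; they are not actual Hadoop API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a job property names CredentialProvider classes, and
// the client instantiates each one and asks it for credentials. The
// interface here is a stand-in, not a real Hadoop type.
public class ProviderLoader {
    public interface CredentialProvider {
        String fetchCredentials();
    }

    /** Instantiate every class named in a comma-separated list. */
    public static List<CredentialProvider> load(String classNames) {
        List<CredentialProvider> providers = new ArrayList<>();
        for (String name : classNames.split("\\s*,\\s*")) {
            try {
                providers.add((CredentialProvider) Class.forName(name)
                        .getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("bad provider: " + name, e);
            }
        }
        return providers;
    }

    /** Example of what an HBase- or HCatalog-side provider might supply. */
    public static class DummyProvider implements CredentialProvider {
        public String fetchCredentials() { return "dummy-token"; }
    }
}
```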

 Add an interface to Input/Output Formats to obtain delegation tokens
 ---

 Key: MAPREDUCE-5663
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5663
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Michael Weng
 Attachments: MAPREDUCE-5663.4.txt, MAPREDUCE-5663.5.txt, 
 MAPREDUCE-5663.6.txt, MAPREDUCE-5663.patch.txt, MAPREDUCE-5663.patch.txt2, 
 MAPREDUCE-5663.patch.txt3


 Currently, delegation tokens are obtained as part of the getSplits / 
 checkOutputSpecs calls to the InputFormat / OutputFormat respectively.
 This works as long as the splits are generated on a node with kerberos 
 credentials. For split generation elsewhere (AM for example), an explicit 
 interface is required.





[jira] [Created] (MAPREDUCE-5723) MR AM container log empty if exception occurs

2014-01-14 Thread Mohammad Kamrul Islam (JIRA)
Mohammad Kamrul Islam created MAPREDUCE-5723:


 Summary: MR AM container log empty if exception occurs
 Key: MAPREDUCE-5723
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5723
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam


It occurs when the property mapreduce.task.userlog.limit.kb is set to a 
non-zero value in mapred-site.xml.
The AM container syslog remains empty if any exception occurs. 

Bug details:
In MRAppMaster.java, the following code snippet shows the bug.

} catch (Throwable t) {
  LOG.fatal("Error starting MRAppMaster", t);
  System.exit(1);
} finally {
  LogManager.shutdown();
}

In the catch block, we exit the JVM, so the finally block (and therefore 
LogManager.shutdown()) is never executed.

Possible fix: 
Make sure LogManager.shutdown() is executed in all cases.
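A runnable model of that fix, using java.util.logging as a stand-in for log4j's LogManager: compute the exit code in the catch block, let the finally block flush, and call System.exit() only afterwards. The class and method names are illustrative, not the actual MRAppMaster code.

```java
import java.util.logging.Handler;
import java.util.logging.Logger;

// Model of the proposed fix (stand-in logging API): System.exit() inside
// catch prevents finally from running, so compute the exit code first and
// exit only after logging has been flushed.
public class SafeExitSketch {
    static final Logger LOG = Logger.getLogger("MRAppMasterSketch");
    static boolean flushed = false;

    public static int runAndReportExitCode(Runnable body) {
        try {
            body.run();
            return 0;
        } catch (Throwable t) {
            LOG.severe("Error starting MRAppMaster: " + t);
            return 1;              // do NOT call System.exit() here
        } finally {
            shutdownLogging();     // now guaranteed to run
        }
    }

    static void shutdownLogging() {
        for (Handler h : LOG.getHandlers()) {
            h.flush();
        }
        flushed = true;
    }

    public static void main(String[] args) {
        int code = runAndReportExitCode(() -> {
            throw new RuntimeException("boom");
        });
        System.exit(code);         // exit after the flush, not before
    }
}
```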
 





[jira] [Updated] (MAPREDUCE-5723) MR AM container log empty if exception occurs

2014-01-14 Thread Mohammad Kamrul Islam (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Kamrul Islam updated MAPREDUCE-5723:
-

Fix Version/s: 2.2.0
   trunk
   Status: Patch Available  (was: Open)

 MR AM container log empty if exception occurs
 -

 Key: MAPREDUCE-5723
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5723
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
 Fix For: trunk, 2.2.0

 Attachments: MAPREDUCE-5723.1.patch


 It occurs when the property mapreduce.task.userlog.limit.kb is set to a 
 non-zero value in mapred-site.xml.
 The AM container syslog remains empty if any exception occurs. 
 Bug details:
 In MRAppMaster.java, the following code snippet shows the bug.
 } catch (Throwable t) {
   LOG.fatal("Error starting MRAppMaster", t);
   System.exit(1);
 } finally {
   LogManager.shutdown();
 }
 In the catch block, we exit the JVM, so the finally block (and therefore 
 LogManager.shutdown()) is never executed.
 Possible fix: 
 Make sure LogManager.shutdown() is executed in all cases.
  





[jira] [Updated] (MAPREDUCE-5723) MR AM container log empty if exception occurs

2014-01-14 Thread Mohammad Kamrul Islam (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Kamrul Islam updated MAPREDUCE-5723:
-

Attachment: MAPREDUCE-5723.1.patch

Patch uploaded.

 MR AM container log empty if exception occurs
 -

 Key: MAPREDUCE-5723
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5723
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
 Fix For: trunk, 2.2.0

 Attachments: MAPREDUCE-5723.1.patch


 It occurs when the property mapreduce.task.userlog.limit.kb is set to a 
 non-zero value in mapred-site.xml.
 The AM container syslog remains empty if any exception occurs. 
 Bug details:
 In MRAppMaster.java, the following code snippet shows the bug.
 } catch (Throwable t) {
   LOG.fatal("Error starting MRAppMaster", t);
   System.exit(1);
 } finally {
   LogManager.shutdown();
 }
 In the catch block, we exit the JVM, so the finally block (and therefore 
 LogManager.shutdown()) is never executed.
 Possible fix: 
 Make sure LogManager.shutdown() is executed in all cases.
  





[jira] [Commented] (MAPREDUCE-5723) MR AM container log empty if exception occurs

2014-01-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13871402#comment-13871402
 ] 

Hadoop QA commented on MAPREDUCE-5723:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12623004/MAPREDUCE-5723.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app:

  org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4318//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4318//console

This message is automatically generated.

 MR AM container log empty if exception occurs
 -

 Key: MAPREDUCE-5723
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5723
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
 Fix For: trunk, 2.2.0

 Attachments: MAPREDUCE-5723.1.patch


 It occurs when the property mapreduce.task.userlog.limit.kb is set to a 
 non-zero value in mapred-site.xml.
 The AM container syslog remains empty if any exception occurs. 
 Bug details:
 In MRAppMaster.java, the following code snippet shows the bug.
 } catch (Throwable t) {
   LOG.fatal("Error starting MRAppMaster", t);
   System.exit(1);
 } finally {
   LogManager.shutdown();
 }
 In the catch block, we exit the JVM, so the finally block (and therefore 
 LogManager.shutdown()) is never executed.
 Possible fix: 
 Make sure LogManager.shutdown() is executed in all cases.
  





[jira] [Created] (MAPREDUCE-5724) JobHistoryServer does not start if HDFS is not running

2014-01-14 Thread Alejandro Abdelnur (JIRA)
Alejandro Abdelnur created MAPREDUCE-5724:
-

 Summary: JobHistoryServer does not start if HDFS is not running
 Key: MAPREDUCE-5724
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5724
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 3.0.0, 2.4.0
Reporter: Alejandro Abdelnur
Priority: Critical


Starting JHS without HDFS running fails with the following error:

{code}
STARTUP_MSG:   build = git://git.apache.org/hadoop-common.git -r 
ad74e8850b99e03b0b6435b04f5b3e9995bc3956; compiled by 'tucu' on 
2014-01-14T22:40Z
STARTUP_MSG:   java = 1.7.0_45
/
2014-01-14 16:47:40,264 INFO 
org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer: registered UNIX signal 
handlers for [TERM, HUP, INT]
2014-01-14 16:47:40,883 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to 
load native-hadoop library for your platform... using builtin-java classes 
where applicable
2014-01-14 16:47:41,101 INFO org.apache.hadoop.mapreduce.v2.hs.JobHistory: 
JobHistory Init
2014-01-14 16:47:41,710 INFO org.apache.hadoop.service.AbstractService: Service 
org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager failed in state INITED; 
cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Error creating 
done directory: [hdfs://localhost:8020/tmp/hadoop-yarn/staging/history/done]
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Error creating done 
directory: [hdfs://localhost:8020/tmp/hadoop-yarn/staging/history/done]
at 
org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.serviceInit(HistoryFileManager.java:505)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at 
org.apache.hadoop.mapreduce.v2.hs.JobHistory.serviceInit(JobHistory.java:94)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at 
org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
at 
org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.serviceInit(JobHistoryServer.java:143)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at 
org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.launchJobHistoryServer(JobHistoryServer.java:207)
at 
org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.main(JobHistoryServer.java:217)
Caused by: java.net.ConnectException: Call From dontknow.local/172.20.10.4 to 
localhost:8020 failed on connection exception: java.net.ConnectException: 
Connection refused; For more details see:  
http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
at org.apache.hadoop.ipc.Client.call(Client.java:1410)
at org.apache.hadoop.ipc.Client.call(Client.java:1359)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:185)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1722)
at org.apache.hadoop.fs.Hdfs.getFileStatus(Hdfs.java:124)
at org.apache.hadoop.fs.FileContext$14.next(FileContext.java:1106)
at org.apache.hadoop.fs.FileContext$14.next(FileContext.java:1102)
at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
at org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1102)
at org.apache.hadoop.fs.FileContext$Util.exists(FileContext.java:1514)
at 
org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.mkdir(HistoryFileManager.java:561)
at 

[jira] [Commented] (MAPREDUCE-5724) JobHistoryServer does not start if HDFS is not running

2014-01-14 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13871450#comment-13871450
 ] 

Alejandro Abdelnur commented on MAPREDUCE-5724:
---

YARN-24 fixed a similar issue for the NM; we should try doing something similar 
here.
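One possible shape for that, sketched generically (the retry policy and names are illustrative, not what YARN-24 actually implements): keep retrying the init action with a bounded sleep until HDFS comes up or attempts are exhausted.

```java
import java.util.function.Supplier;

// Illustrative retry wrapper for service init: rather than failing INITED
// permanently on a connection error, retry the done-directory creation a
// bounded number of times.
public class RetryingInit {
    public static <T> T retry(Supplier<T> action, int maxAttempts, long sleepMillis) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;          // e.g. a wrapped ConnectException to the NN
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        throw last;
    }
}
```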

 JobHistoryServer does not start if HDFS is not running
 --

 Key: MAPREDUCE-5724
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5724
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 3.0.0, 2.4.0
Reporter: Alejandro Abdelnur
Priority: Critical

 Starting JHS without HDFS running fails with the following error:
 {code}
 STARTUP_MSG:   build = git://git.apache.org/hadoop-common.git -r 
 ad74e8850b99e03b0b6435b04f5b3e9995bc3956; compiled by 'tucu' on 
 2014-01-14T22:40Z
 STARTUP_MSG:   java = 1.7.0_45
 /
 2014-01-14 16:47:40,264 INFO 
 org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer: registered UNIX signal 
 handlers for [TERM, HUP, INT]
 2014-01-14 16:47:40,883 WARN org.apache.hadoop.util.NativeCodeLoader: Unable 
 to load native-hadoop library for your platform... using builtin-java classes 
 where applicable
 2014-01-14 16:47:41,101 INFO org.apache.hadoop.mapreduce.v2.hs.JobHistory: 
 JobHistory Init
 2014-01-14 16:47:41,710 INFO org.apache.hadoop.service.AbstractService: 
 Service org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager failed in state 
 INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Error 
 creating done directory: 
 [hdfs://localhost:8020/tmp/hadoop-yarn/staging/history/done]
 org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Error creating done 
 directory: [hdfs://localhost:8020/tmp/hadoop-yarn/staging/history/done]
   at 
 org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.serviceInit(HistoryFileManager.java:505)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   at 
 org.apache.hadoop.mapreduce.v2.hs.JobHistory.serviceInit(JobHistory.java:94)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   at 
 org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
   at 
 org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.serviceInit(JobHistoryServer.java:143)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   at 
 org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.launchJobHistoryServer(JobHistoryServer.java:207)
   at 
 org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.main(JobHistoryServer.java:217)
 Caused by: java.net.ConnectException: Call From dontknow.local/172.20.10.4 to 
 localhost:8020 failed on connection exception: java.net.ConnectException: 
 Connection refused; For more details see:  
 http://wiki.apache.org/hadoop/ConnectionRefused
   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
   at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
   at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
   at org.apache.hadoop.ipc.Client.call(Client.java:1410)
   at org.apache.hadoop.ipc.Client.call(Client.java:1359)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:185)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
   at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1722)
   at org.apache.hadoop.fs.Hdfs.getFileStatus(Hdfs.java:124)
   at org.apache.hadoop.fs.FileContext$14.next(FileContext.java:1106)
   at org.apache.hadoop.fs.FileContext$14.next(FileContext.java:1102)
   at 

[jira] [Commented] (MAPREDUCE-5597) Missing alternatives in javadocs for deprecated API

2014-01-14 Thread Christopher Tubbs (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13871498#comment-13871498
 ] 

Christopher Tubbs commented on MAPREDUCE-5597:
--

A grep for deprecated code might identify more, but those fixed here were my 
primary concern.

 Missing alternatives in javadocs for deprecated API
 ---

 Key: MAPREDUCE-5597
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5597
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client, documentation, job submission
Affects Versions: 2.2.0
Reporter: Christopher Tubbs
Assignee: Akira AJISAKA
  Labels: documentation
 Attachments: MAPREDUCE-5597.patch


 Deprecated APIs, such as `new Job()`, don't have javadocs explaining what the 
 alternatives are. (It'd also help if the new methods had @since tags to help 
 determine whether one could safely use that API on older versions at runtime.)
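For illustration, the kind of javadoc being asked for might look like this. The class and methods are stand-ins, not the actual org.apache.hadoop.mapreduce.Job API, and the version number is invented.

```java
// Stand-in class showing a deprecated entry point whose javadoc names its
// replacement, plus a @since tag on the replacement.
public class JavadocExample {
    /**
     * @deprecated Use {@link #getInstance()} instead.
     */
    @Deprecated
    public static JavadocExample newJob() {
        return getInstance();
    }

    /**
     * Preferred factory method.
     *
     * @since 0.23 (illustrative version number)
     */
    public static JavadocExample getInstance() {
        return new JavadocExample();
    }
}
```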





[jira] [Commented] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value

2014-01-14 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13871700#comment-13871700
 ] 

Carlo Curino commented on MAPREDUCE-5028:
-

Hey Karthik, any news on this? 

I think we might have stumbled upon the same issue while running gridmix with 
some large parameters for io.sort.mb. 
This was with a very recent version of trunk (errors showing up in LoadJob 
during cleanup() and regular map() calls).
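One plausible failure mode with very large io.sort.mb values, which may or may not be the root cause here, is plain int overflow when the megabyte setting is converted to a byte count. The arithmetic below is illustrative, not the actual MapTask buffer-sizing code.

```java
// Illustration of the overflow hazard: io.sort.mb is configured in MB but
// buffers are sized in bytes, and sortMb * 1024 * 1024 overflows int once
// sortMb reaches 2048.
public class SortMbOverflow {
    /** Naive conversion: silently wraps negative for sortMb >= 2048. */
    public static int bytesNaive(int sortMb) {
        return sortMb * 1024 * 1024;
    }

    /** Checked conversion: widen to long first, fail loudly on overflow. */
    public static int bytesChecked(int sortMb) {
        return Math.toIntExact((long) sortMb * 1024 * 1024);
    }
}
```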

 Maps fail when io.sort.mb is set to high value
 --

 Key: MAPREDUCE-5028
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1, 2.0.3-alpha, 0.23.5
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Critical
 Fix For: 1.2.0, 2.4.0

 Attachments: MR-5028_testapp.patch, mr-5028-branch1.patch, 
 mr-5028-branch1.patch, mr-5028-branch1.patch, mr-5028-trunk.patch, 
 mr-5028-trunk.patch, mr-5028-trunk.patch, repro-mr-5028.patch


 Verified the problem exists on branch-1 with the following configuration:
 Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m, 
 io.sort.mb=1280, dfs.block.size=2147483648
 Run teragen to generate 4 GB data
 Maps fail when you run wordcount on this configuration with the following 
 error: 
 {noformat}
 java.io.IOException: Spill failed
   at 
 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031)
   at 
 org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
   at 
 org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
   at 
 org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45)
   at 
 org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34)
   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: java.io.EOFException
   at java.io.DataInputStream.readInt(DataInputStream.java:375)
   at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38)
   at 
 org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
   at 
 org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
   at 
 org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
   at 
 org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
   at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
   at 
 org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505)
   at 
 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438)
   at 
 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855)
   at 
 org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1346)
 {noformat}


