[jira] [Comment Edited] (MAPREDUCE-6638) Do not attempt to recover jobs if encrypted spill is enabled

2016-10-03 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15542845#comment-15542845
 ] 

Hitesh Shah edited comment on MAPREDUCE-6638 at 10/3/16 4:51 PM:
-

Haven't looked at the patch in detail, but [~haibochen]'s clarifying comments 
make sense. The Jira title could be modified accordingly. +0 from my side. 


was (Author: hitesh):
Haven't looked at the patch in detail, but [~haibochen]'s clarifying comments 
make sense. +0 from my side. 

> Do not attempt to recover jobs if encrypted spill is enabled
> 
>
> Key: MAPREDUCE-6638
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6638
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Affects Versions: 2.7.2
>Reporter: Karthik Kambatla
>Assignee: Haibo Chen
> Attachments: mapreduce6638.001.patch, mapreduce6638.002.patch, 
> mapreduce6638.003.patch, mapreduce6638.004.patch, mapreduce6683.005.patch
>
>
> Post the fix to CVE-2015-1776, jobs with encrypted spills enabled cannot be 
> recovered if the AM fails. We should store the key some place safe so they 
> can actually be recovered. If there is no "safe" place, at least we should 
> restart the job by re-running all mappers/reducers. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6638) Do not attempt to recover jobs if encrypted spill is enabled

2016-10-03 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15542845#comment-15542845
 ] 

Hitesh Shah commented on MAPREDUCE-6638:


Haven't looked at the patch in detail, but [~haibochen]'s clarifying comments 
make sense. +0 from my side. 

> Do not attempt to recover jobs if encrypted spill is enabled
> 
>
> Key: MAPREDUCE-6638
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6638
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Affects Versions: 2.7.2
>Reporter: Karthik Kambatla
>Assignee: Haibo Chen
> Attachments: mapreduce6638.001.patch, mapreduce6638.002.patch, 
> mapreduce6638.003.patch, mapreduce6638.004.patch, mapreduce6683.005.patch
>
>
> Post the fix to CVE-2015-1776, jobs with encrypted spills enabled cannot be 
> recovered if the AM fails. We should store the key some place safe so they 
> can actually be recovered. If there is no "safe" place, at least we should 
> restart the job by re-running all mappers/reducers. 






[jira] [Commented] (MAPREDUCE-6776) yarn.app.mapreduce.client.job.max-retries should have a more useful default

2016-09-30 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537287#comment-15537287
 ] 

Hitesh Shah commented on MAPREDUCE-6776:


FWIW, I do agree that this is a useful behavioral change that makes sense to 
push to branch-2, but it might be better to call it out as incompatible and, at 
the same time, release-note it carefully to indicate that it will improve user 
experience and not have any detrimental impact apart from the retry delay in 
some edge cases. 

> yarn.app.mapreduce.client.job.max-retries should have a more useful default
> ---
>
> Key: MAPREDUCE-6776
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6776
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Miklos Szegedi
> Attachments: MAPREDUCE-6776.001.patch, MAPREDUCE-6776.002.patch, 
> MAPREDUCE-6776.003.patch
>
>
> The default is 0, so any communication failure results in a client failure.  
> Oozie doesn't like that.  If the RM is failing over and Oozie gets a 
> communication failure, it assumes the target job has failed.  I propose 
> raising the default to something modest like 3 or 5.  The default retry 
> interval is 2s.






[jira] [Commented] (MAPREDUCE-6776) yarn.app.mapreduce.client.job.max-retries should have a more useful default

2016-09-30 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537275#comment-15537275
 ] 

Hitesh Shah commented on MAPREDUCE-6776:


From a practical standpoint, this is not really an incompatible change, as only 
some internal behavioral aspects are being changed to retry 3 times instead of 
not retrying at all. 

However, from a purely theoretical compat perspective, a public default value 
is being changed, as well as the value in mapred-default.xml. Tests that 
previously verified this behavior would expect immediate failures, whereas now 
the client might reconnect, or fail only after 6 seconds or so. 
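
The "6 seconds or so" figure follows from the numbers in the issue description: 3 retries at the 2s default interval. A minimal sketch of that arithmetic is below. The class and method names are hypothetical (not Hadoop APIs), and it assumes the client sleeps one full interval per retry and that connection attempts themselves take negligible time.

```java
/** Hypothetical helper for illustration only; not part of Hadoop. */
final class RetryBudget {
    private RetryBudget() {}

    /**
     * Worst-case time spent waiting between attempts before the client gives up,
     * assuming one fixed sleep of retryIntervalMillis per retry.
     */
    static long worstCaseDelayMillis(int maxRetries, long retryIntervalMillis) {
        if (maxRetries < 0 || retryIntervalMillis < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        return Math.multiplyExact((long) maxRetries, retryIntervalMillis);
    }

    public static void main(String[] args) {
        // Old default: no retries, so a failure surfaces immediately.
        System.out.println(worstCaseDelayMillis(0, 2000L));
        // Proposed default: 3 retries at 2s each.
        System.out.println(worstCaseDelayMillis(3, 2000L));
    }
}
```

A test that asserts on immediate failure would, under these assumptions, need to tolerate the longer wait or set the retry count back to 0 explicitly.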

I suggest pushing this to trunk for sure as we are still in the alpha stage of 
releases. As for branch-2, I would check with the 2.8 release manager. 

> yarn.app.mapreduce.client.job.max-retries should have a more useful default
> ---
>
> Key: MAPREDUCE-6776
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6776
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Miklos Szegedi
> Attachments: MAPREDUCE-6776.001.patch, MAPREDUCE-6776.002.patch, 
> MAPREDUCE-6776.003.patch
>
>
> The default is 0, so any communication failure results in a client failure.  
> Oozie doesn't like that.  If the RM is failing over and Oozie gets a 
> communication failure, it assumes the target job has failed.  I propose 
> raising the default to something modest like 3 or 5.  The default retry 
> interval is 2s.






[jira] [Comment Edited] (MAPREDUCE-6638) Do not attempt to recover jobs if encrypted spill is enabled

2016-09-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514897#comment-15514897
 ] 

Hitesh Shah edited comment on MAPREDUCE-6638 at 9/23/16 12:17 AM:
--

bq. (1) Avoid recovering an AM if encrypted spill is enabled

Encrypted spill with respect to recovery is not the same as a committer not 
supporting recovery. Is there any reason we cannot just re-run the job from 
scratch if not all reducers have completed (or re-run all maps and the 
incomplete reducers)?

Ideally, you could just re-run most of the job's tasks if needed to support 
proper fault tolerance, even in scenarios where the key cannot be stored 
securely. In this scenario, the new AM can generate a new key. I agree that 
this might not be a performant solution, but it at least spares the user from 
having to re-submit the job. If performance is an issue, users can turn off 
recovery when encryption is enabled for scenarios where the key cannot be 
stored securely.


was (Author: hitesh):
bq. (1) Avoid recovering an AM if encrypted spill is enabled

Encrypted spill with respect to recovery is not the same as a committer not 
supporting recovery. Is there any reason we cannot just re-run the job from 
scratch if not all reducers have completed?

Ideally, you could just re-run most of the job's tasks if needed to support 
proper fault tolerance, even in scenarios where the key cannot be stored 
securely. In this scenario, the new AM can generate a new key. I agree that 
this might not be a performant solution, but it at least spares the user from 
having to re-submit the job. If performance is an issue, users can turn off 
recovery when encryption is enabled for scenarios where the key cannot be 
stored securely.

> Do not attempt to recover jobs if encrypted spill is enabled
> 
>
> Key: MAPREDUCE-6638
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6638
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Affects Versions: 2.7.2
>Reporter: Karthik Kambatla
>Assignee: Haibo Chen
> Attachments: mapreduce6638.001.patch, mapreduce6638.002.patch, 
> mapreduce6638.003.patch, mapreduce6638.004.patch, mapreduce6683.005.patch
>
>
> Post the fix to CVE-2015-1776, jobs with encrypted spills enabled cannot be 
> recovered if the AM fails. We should store the key some place safe so they 
> can actually be recovered. If there is no "safe" place, at least we should 
> restart the job by re-running all mappers/reducers. 






[jira] [Commented] (MAPREDUCE-6638) Do not attempt to recover jobs if encrypted spill is enabled

2016-09-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514897#comment-15514897
 ] 

Hitesh Shah commented on MAPREDUCE-6638:


bq. (1) Avoid recovering an AM if encrypted spill is enabled

Encrypted spill with respect to recovery is not the same as a committer not 
supporting recovery. Is there any reason we cannot just re-run the job from 
scratch if not all reducers have completed?

Ideally, you could just re-run most of the job's tasks if needed to support 
proper fault tolerance, even in scenarios where the key cannot be stored 
securely. In this scenario, the new AM can generate a new key. I agree that 
this might not be a performant solution, but it at least spares the user from 
having to re-submit the job. If performance is an issue, users can turn off 
recovery when encryption is enabled for scenarios where the key cannot be 
stored securely.
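
The decision described in this comment can be sketched roughly as follows. This is only an illustration of the proposal, not the logic of any attached patch: the class, enum, and parameter names are hypothetical, and it assumes the restarted AM can tell whether the previous attempt's spill key is still available to it.

```java
/** Hypothetical sketch of the proposal in this comment; not Hadoop code. */
final class SpillRecoveryPolicy {
    private SpillRecoveryPolicy() {}

    enum Action {
        /** Reuse the output of tasks completed by the previous AM attempt. */
        RECOVER_COMPLETED_TASKS,
        /** Generate a new spill key in the new AM and re-run the tasks. */
        RERUN_WITH_NEW_KEY
    }

    static Action decide(boolean encryptedSpillEnabled, boolean previousKeyAvailable) {
        if (!encryptedSpillEnabled || previousKeyAvailable) {
            // Either nothing was encrypted, or the old key can still decrypt it.
            return Action.RECOVER_COMPLETED_TASKS;
        }
        // Old spills are unreadable without the key, so recovery is skipped,
        // but the job still continues without the user re-submitting it.
        return Action.RERUN_WITH_NEW_KEY;
    }
}
```

Either branch keeps the job alive across an AM failure; the second trades recomputation time for not having to store the key.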

> Do not attempt to recover jobs if encrypted spill is enabled
> 
>
> Key: MAPREDUCE-6638
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6638
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Affects Versions: 2.7.2
>Reporter: Karthik Kambatla
>Assignee: Haibo Chen
> Attachments: mapreduce6638.001.patch, mapreduce6638.002.patch, 
> mapreduce6638.003.patch, mapreduce6638.004.patch, mapreduce6683.005.patch
>
>
> Post the fix to CVE-2015-1776, jobs with encrypted spills enabled cannot be 
> recovered if the AM fails. We should store the key some place safe so they 
> can actually be recovered. If there is no "safe" place, at least we should 
> restart the job by re-running all mappers/reducers. 






[jira] [Commented] (MAPREDUCE-6484) Yarn Client uses local address instead of RM address as token renewer in a secure cluster when RM HA is enabled.

2016-09-14 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15491425#comment-15491425
 ] 

Hitesh Shah commented on MAPREDUCE-6484:


[~asuresh] thanks for the pointer. Any reason why MR does not use that function 
in that case? 

> Yarn Client uses local address instead of RM address as token renewer in a 
> secure cluster when RM HA is enabled.
> 
>
> Key: MAPREDUCE-6484
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6484
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, security
>Reporter: zhihai xu
>Assignee: zhihai xu
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: MAPREDUCE-6484.001.patch, YARN-4187.000.patch
>
>
> Yarn Client uses local address instead of RM address as token renewer in a 
> secure cluster when RM HA is enabled. This will cause HDFS token renew 
> failure for renewer "nobody"  if the rules from 
> {{hadoop.security.auth_to_local}} exclude the client address in HDFS 
> {{DelegationTokenIdentifier}}.
> The reason why the local address is returned is: When HA is enabled, 
> "yarn.resourcemanager.address" may not be set,  if 
> {{HOSTNAME_PATTERN}}("_HOST") is used in "yarn.resourcemanager.principal", 
> the default address "0.0.0.0:8032" will be used,  Based on the following code 
> at SecurityUtil.java, the local address will be used to replace "0.0.0.0".
> {code}
>   private static String replacePattern(String[] components, String hostname)
>       throws IOException {
>     String fqdn = hostname;
>     if (fqdn == null || fqdn.isEmpty() || fqdn.equals("0.0.0.0")) {
>       fqdn = getLocalHostName();
>     }
>     return components[0] + "/" + fqdn.toLowerCase(Locale.US) + "@" + 
>         components[2];
>   }
>   static String getLocalHostName() throws UnknownHostException {
>     return InetAddress.getLocalHost().getCanonicalHostName();
>   }
>   public static String getServerPrincipal(String principalConfig,
>       InetAddress addr) throws IOException {
>     String[] components = getComponents(principalConfig);
>     if (components == null || components.length != 3
>         || !components[1].equals(HOSTNAME_PATTERN)) {
>       return principalConfig;
>     } else {
>       if (addr == null) {
>         throw new IOException("Can't replace " + HOSTNAME_PATTERN
>             + " pattern since client address is null");
>       }
>       return replacePattern(components, addr.getCanonicalHostName());
>     }
>   }
> {code}
> The following is the exception which causes the job to fail:
> {code}
> 15/09/12 16:27:24 WARN security.UserGroupInformation: PriviledgedActionException as:t...@example.com (auth:KERBEROS) cause:java.io.IOException: Failed to run job : yarn tries to renew a token with renewer nobody
> at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:464)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:7109)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewDelegationToken(NameNodeRpcServer.java:512)
> at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.renewDelegationToken(AuthorizationProviderProxyClientProtocol.java:648)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.renewDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:975)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> java.io.IOException: Failed to run job : yarn tries to renew a token with renewer nobody
> at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:464)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:7109)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewDelegationToken(NameNodeRpcServer.java:512)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClie

[jira] [Commented] (MAPREDUCE-6776) yarn.app.mapreduce.client.job.max-retries should have a more useful default

2016-09-12 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15485365#comment-15485365
 ] 

Hitesh Shah commented on MAPREDUCE-6776:


Changing this in 2.x would be an incompatible change. 

> yarn.app.mapreduce.client.job.max-retries should have a more useful default
> ---
>
> Key: MAPREDUCE-6776
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6776
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>
> The default is 0, so any communication failure results in a client failure.  Oozie 
> doesn't like that.  If the RM is failing over and Oozie gets a communication 
> failure, it assumes the target job has failed.  I propose raising the default 
> to something modest like 3 or 5.  The default retry interval is 2s.






[jira] [Commented] (MAPREDUCE-6484) Yarn Client uses local address instead of RM address as token renewer in a secure cluster when RM HA is enabled.

2016-09-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478696#comment-15478696
 ] 

Hitesh Shah commented on MAPREDUCE-6484:


[~asuresh] [~zxu] It seems like the getMasterAddress() functionality ideally 
belongs in YARN and not in MR so that other applications that make use of YARN 
can always leverage the same functionality. Would you agree? 

> Yarn Client uses local address instead of RM address as token renewer in a 
> secure cluster when RM HA is enabled.
> 
>
> Key: MAPREDUCE-6484
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6484
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, security
>Reporter: zhihai xu
>Assignee: zhihai xu
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: MAPREDUCE-6484.001.patch, YARN-4187.000.patch
>
>
> Yarn Client uses local address instead of RM address as token renewer in a 
> secure cluster when RM HA is enabled. This will cause HDFS token renew 
> failure for renewer "nobody"  if the rules from 
> {{hadoop.security.auth_to_local}} exclude the client address in HDFS 
> {{DelegationTokenIdentifier}}.
> The reason why the local address is returned is: When HA is enabled, 
> "yarn.resourcemanager.address" may not be set,  if 
> {{HOSTNAME_PATTERN}}("_HOST") is used in "yarn.resourcemanager.principal", 
> the default address "0.0.0.0:8032" will be used,  Based on the following code 
> at SecurityUtil.java, the local address will be used to replace "0.0.0.0".
> {code}
>   private static String replacePattern(String[] components, String hostname)
>       throws IOException {
>     String fqdn = hostname;
>     if (fqdn == null || fqdn.isEmpty() || fqdn.equals("0.0.0.0")) {
>       fqdn = getLocalHostName();
>     }
>     return components[0] + "/" + fqdn.toLowerCase(Locale.US) + "@" + 
>         components[2];
>   }
>   static String getLocalHostName() throws UnknownHostException {
>     return InetAddress.getLocalHost().getCanonicalHostName();
>   }
>   public static String getServerPrincipal(String principalConfig,
>       InetAddress addr) throws IOException {
>     String[] components = getComponents(principalConfig);
>     if (components == null || components.length != 3
>         || !components[1].equals(HOSTNAME_PATTERN)) {
>       return principalConfig;
>     } else {
>       if (addr == null) {
>         throw new IOException("Can't replace " + HOSTNAME_PATTERN
>             + " pattern since client address is null");
>       }
>       return replacePattern(components, addr.getCanonicalHostName());
>     }
>   }
> {code}
> The following is the exception which causes the job to fail:
> {code}
> 15/09/12 16:27:24 WARN security.UserGroupInformation: PriviledgedActionException as:t...@example.com (auth:KERBEROS) cause:java.io.IOException: Failed to run job : yarn tries to renew a token with renewer nobody
> at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:464)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:7109)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewDelegationToken(NameNodeRpcServer.java:512)
> at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.renewDelegationToken(AuthorizationProviderProxyClientProtocol.java:648)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.renewDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:975)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> java.io.IOException: Failed to run job : yarn tries to renew a token with renewer nobody
> at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:464)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:7109)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.re

[jira] [Commented] (MAPREDUCE-6062) Use TestDFSIO test random read : job failed

2016-05-24 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15299224#comment-15299224
 ] 

Hitesh Shah commented on MAPREDUCE-6062:


[~tfukudom] please hit "submit patch" to trigger the pre-commit build. 
https://wiki.apache.org/hadoop/HowToContribute has more info on dos and don'ts 
when contributing patches. In this case, I will defer to someone who has been 
looking at MR code more recently to do a review. If you do not see any 
updates on the jira within the next couple of days, please feel free to drop a 
polite email on the mapreduce-dev list asking for review help. 

> Use TestDFSIO test random read : job failed
> ---
>
> Key: MAPREDUCE-6062
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6062
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: benchmarks
>Affects Versions: 2.2.0
> Environment: command : hadoop jar $JAR_PATH TestDFSIO-read -random 
> -nrFiles 12 -size 8000
>Reporter: chongyuanhuang
>Assignee: Takuya Fukudome
> Attachments: MAPREDUCE-6062.patch
>
>
> This is log:
> 2014-09-01 13:57:29,876 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.IllegalArgumentException: n must be positive
>   at java.util.Random.nextInt(Random.java:300)
>   at org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.nextOffset(TestDFSIO.java:601)
>   at org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.doIO(TestDFSIO.java:580)
>   at org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.doIO(TestDFSIO.java:546)
>   at org.apache.hadoop.fs.IOMapperBase.map(IOMapperBase.java:134)
>   at org.apache.hadoop.fs.IOMapperBase.map(IOMapperBase.java:37)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> 2014-09-01 13:57:29,886 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
> 2014-09-01 13:57:29,894 WARN [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Could not delete hdfs://m101:8020/benchmarks/TestDFSIO/io_random_read/_temporary/1/_temporary/attempt_1409538816633_0005_m_01_0
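
One plausible reading of the trace, not confirmed against the TestDFSIO source: `java.util.Random.nextInt(int)` rejects a non-positive bound, and a file size of 8000 MB expressed in bytes no longer fits in an `int`. The sketch below only demonstrates that JDK behavior. The class and method names are hypothetical, and the assumptions that `-size` is interpreted in megabytes and that the byte count is narrowed to `int` before the call are mine.

```java
import java.util.Random;

/** Hypothetical demo of int narrowing feeding Random.nextInt; not TestDFSIO code. */
final class RandomOffsetDemo {
    private RandomOffsetDemo() {}

    static final long MEGA = 1024L * 1024L;

    /** Narrows a size given in MB to an int byte count, as a careless cast would. */
    static int narrowedBound(long sizeMb) {
        return (int) (sizeMb * MEGA);
    }

    /** Returns true if Random.nextInt refuses the given bound. */
    static boolean nextIntRejects(int bound) {
        try {
            new Random(0L).nextInt(bound);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int bound = narrowedBound(8000L);
        // The cast wraps around to a negative value, which nextInt rejects.
        System.out.println(bound < 0);
        System.out.println(nextIntRejects(bound));
    }
}
```

If this is indeed the cause, drawing the offset from a `long` range instead of casting to `int` would avoid the wrap-around.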






[jira] [Assigned] (MAPREDUCE-6062) Use TestDFSIO test random read : job failed

2016-05-24 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah reassigned MAPREDUCE-6062:
--

Assignee: Takuya Fukudome

[~tfukudom] added you to MR contributors. Hopefully this should get you 
unblocked.

> Use TestDFSIO test random read : job failed
> ---
>
> Key: MAPREDUCE-6062
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6062
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: benchmarks
>Affects Versions: 2.2.0
> Environment: command : hadoop jar $JAR_PATH TestDFSIO-read -random 
> -nrFiles 12 -size 8000
>Reporter: chongyuanhuang
>Assignee: Takuya Fukudome
>
> This is log:
> 2014-09-01 13:57:29,876 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.IllegalArgumentException: n must be positive
>   at java.util.Random.nextInt(Random.java:300)
>   at org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.nextOffset(TestDFSIO.java:601)
>   at org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.doIO(TestDFSIO.java:580)
>   at org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.doIO(TestDFSIO.java:546)
>   at org.apache.hadoop.fs.IOMapperBase.map(IOMapperBase.java:134)
>   at org.apache.hadoop.fs.IOMapperBase.map(IOMapperBase.java:37)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> 2014-09-01 13:57:29,886 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
> 2014-09-01 13:57:29,894 WARN [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Could not delete hdfs://m101:8020/benchmarks/TestDFSIO/io_random_read/_temporary/1/_temporary/attempt_1409538816633_0005_m_01_0






[jira] [Commented] (MAPREDUCE-5785) Derive heap size or mapreduce.*.memory.mb automatically

2014-11-23 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222592#comment-14222592
 ] 

Hitesh Shah commented on MAPREDUCE-5785:


bq. I think we should commit this to branch-2 as well

This change is incompatible especially as it modifies mapred-default.xml. Not 
sure why it would be committed to branch-2. 


> Derive heap size or mapreduce.*.memory.mb automatically
> ---
>
> Key: MAPREDUCE-5785
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5785
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mr-am, task
>Reporter: Gera Shegalov
>Assignee: Gera Shegalov
> Fix For: 3.0.0
>
> Attachments: MAPREDUCE-5785.v01.patch, MAPREDUCE-5785.v02.patch, 
> MAPREDUCE-5785.v03.patch, mr-5785-4.patch, mr-5785-5.patch, mr-5785-6.patch
>
>
> Currently users have to set 2 memory-related configs per job / per task type. 
>  One first chooses some container size mapreduce.\*.memory.mb and then a 
> corresponding maximum Java heap size Xmx < mapreduce.\*.memory.mb. This 
> makes sure that the JVM's C-heap (native memory + Java heap) does not exceed 
> this mapreduce.\*.memory.mb. If one forgets to tune Xmx, the MR-AM might be 
> - allocating big containers whereas the JVM will only use the default 
> -Xmx200m.
> - allocating small containers that will OOM because Xmx is too high.
> With this JIRA, we propose to set Xmx automatically based on an empirical 
> ratio that can be adjusted. Xmx is not changed automatically if provided by 
> the user.
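
The derivation proposed in the description can be sketched as below. This is an illustration under stated assumptions, not the patch itself: the class and method names are hypothetical, the 0.8 ratio used in the example is an assumed value for the "empirical ratio", and the result is rounded down to whole megabytes.

```java
/** Hypothetical sketch of ratio-based heap sizing; not the MAPREDUCE-5785 patch. */
final class HeapSizing {
    private HeapSizing() {}

    /** Derives a Java heap size in MB from the container size and a ratio in (0, 1]. */
    static int deriveHeapMb(int containerMb, double heapRatio) {
        if (containerMb <= 0) {
            throw new IllegalArgumentException("containerMb must be positive");
        }
        if (heapRatio <= 0.0 || heapRatio > 1.0) {
            throw new IllegalArgumentException("heapRatio must be in (0, 1]");
        }
        return (int) (containerMb * heapRatio); // rounds down
    }

    /** Keeps a user-provided -Xmx untouched; otherwise derives one from the container size. */
    static String resolveXmx(String userXmx, int containerMb, double heapRatio) {
        if (userXmx != null && !userXmx.isEmpty()) {
            return userXmx;
        }
        return "-Xmx" + deriveHeapMb(containerMb, heapRatio) + "m";
    }

    public static void main(String[] args) {
        System.out.println(resolveXmx(null, 1024, 0.8));       // derived from the container
        System.out.println(resolveXmx("-Xmx200m", 4096, 0.8)); // user setting wins
    }
}
```

Leaving headroom below the container size is what keeps native memory plus Java heap within the container limit.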





[jira] [Commented] (MAPREDUCE-5956) MapReduce AM should not use maxAttempts to determine if this is the last retry

2014-07-10 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057866#comment-14057866
 ] 

Hitesh Shah commented on MAPREDUCE-5956:


To add to [~mayank_bansal]'s comment, this is the 4th ( and last ) attempt of 
the AM and there have been no preemptions. 

> MapReduce AM should not use maxAttempts to determine if this is the last retry
> --
>
> Key: MAPREDUCE-5956
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5956
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: applicationmaster, mrv2
>Affects Versions: 2.4.0
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: MR-5956.patch
>
>
> Found this while reviewing YARN-2074. The problem is that after YARN-2074, we 
> don't count AM preemption towards AM failures on RM side, but MapReduce AM 
> itself checks the attempt id against the max-attempt count to determine if 
> this is the last attempt.
> {code}
> public void computeIsLastAMRetry() {
>   isLastAMRetry = appAttemptID.getAttemptId() >= maxAppAttempts;
> }
> {code}
> This causes issues w.r.t. deletion of the staging directory, etc.
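
To make the mismatch concrete, the sketch below puts the quoted attempt-id check next to a failure-count check. The second method is a hypothetical illustration of what "not counting preemptions" implies, not the fix in the attached patch, and all names other than the quoted comparison are assumptions.

```java
/** Hypothetical comparison of two "is this the last retry?" checks; not Hadoop code. */
final class LastRetryCheck {
    private LastRetryCheck() {}

    /** Same comparison as the quoted computeIsLastAMRetry(): attempt id vs. max attempts. */
    static boolean isLastByAttemptId(int attemptId, int maxAppAttempts) {
        return attemptId >= maxAppAttempts;
    }

    /**
     * Counts only attempts that actually failed, mirroring an RM that ignores
     * preempted attempts. The current attempt would be failure number
     * countedFailures + 1 if it fails.
     */
    static boolean isLastByFailureCount(int countedFailures, int maxAppAttempts) {
        return countedFailures + 1 >= maxAppAttempts;
    }

    public static void main(String[] args) {
        // maxAppAttempts = 2; attempt 1 was preempted, attempt 2 is running.
        System.out.println(isLastByAttemptId(2, 2));    // AM believes it is the last retry
        System.out.println(isLastByFailureCount(0, 2)); // RM would still allow another attempt
    }
}
```

In that scenario an AM that trusts the attempt id may delete the staging directory even though the RM could still launch another attempt.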





[jira] [Commented] (MAPREDUCE-5956) MapReduce AM should not use maxAttempts to determine if this is the last retry

2014-07-08 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055396#comment-14055396
 ] 

Hitesh Shah commented on MAPREDUCE-5956:


[~vinodkv] By definition, if an AM calls unregister, it is telling the RM that 
this is its last attempt and the app should not be retried. Are you now saying 
that all attempts should call unregisterAttempt(), which will tell the app 
whether it is the final attempt and should call a final unregister()? If not, I 
think something else is needed, as an AM will only call unregister() on an 
error if it thinks it is the last attempt. 

 

> MapReduce AM should not use maxAttempts to determine if this is the last retry
> --
>
> Key: MAPREDUCE-5956
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5956
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: applicationmaster, mrv2
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Wangda Tan
>Priority: Blocker
>
> Found this while reviewing YARN-2074. The problem is that after YARN-2074, we 
> don't count AM preemption towards AM failures on RM side, but MapReduce AM 
> itself checks the attempt id against the max-attempt count to determine if 
> this is the last attempt.
> {code}
> public void computeIsLastAMRetry() {
>   isLastAMRetry = appAttemptID.getAttemptId() >= maxAppAttempts;
> }
> {code}
> This causes issues w.r.t. deletion of the staging directory, etc.





[jira] [Commented] (MAPREDUCE-5696) Add Localization counters to MR

2013-12-26 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856997#comment-13856997
 ] 

Hitesh Shah commented on MAPREDUCE-5696:


The introduction of localization counters in the env is akin to introducing a 
new API in YARN. Could you split this jira into two: one in YARN for the YARN 
changes where the new API/interface is introduced, with this jira used for the 
MR-specific changes? 

> Add Localization counters to MR
> ---
>
> Key: MAPREDUCE-5696
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5696
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Reporter: Gera Shegalov
>Assignee: Gera Shegalov
> Attachments: LocalizationCounters.png, MAPREDUCE-5696.v01.patch
>
>
> Users are often unaware of localization cost that their jobs incur. To 
> measure effectiveness of localization caches it is necessary to expose the 
> overhead in the form of user-visible metrics. The purpose of this JIRA is to 
> complement YARN-1529. While YARN-1529 attempts to provide a cluster-wide view 
> to cluster admins, this JIRA focuses on exposing the localization overhead on 
> per-job basis to the job owner/user.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (MAPREDUCE-5487) In task processes, JobConf is unnecessarily loaded again in Limits

2013-12-03 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838320#comment-13838320
 ] 

Hitesh Shah commented on MAPREDUCE-5487:


A little bit late on this. Did anyone look into how this affects jobs where a 
user modifies the counter limit to be higher than the cluster configured value 
and what happens in the case where the jobhistory server is configured with a 
limit less than the user supplied limit? 

> In task processes, JobConf is unnecessarily loaded again in Limits
> --
>
> Key: MAPREDUCE-5487
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5487
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: performance, task
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 2.4.0
>
> Attachments: MAPREDUCE-5487-1.patch, MAPREDUCE-5487.patch
>
>
> Limits statically loads a JobConf, which incurs costs of reading files from 
> disk and parsing XML.  The contents of this JobConf are identical to the one 
> loaded by YarnChild (before adding job.xml as a resource).  Allowing Limits 
> to initialize with the JobConf loaded in YarnChild would reduce task startup 
> time.
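
A rough sketch of the idea in the description, and of the question raised 
above about which limit wins: the limits holder is initialized once from a 
configuration the caller has already loaded, instead of loading its own. 
LimitsSketch, the Map stand-in for JobConf, the key name and the default of 
120 are all assumptions for illustration and should be checked against 
mapred-default.xml.

```java
import java.util.Map;

public class LimitsSketch {
    // Assumed key and default; verify against mapred-default.xml.
    static final String COUNTERS_MAX_KEY = "mapreduce.job.counters.max";
    static final int DEFAULT_COUNTERS_MAX = 120;

    private static boolean initialized = false;
    private static int countersMax = DEFAULT_COUNTERS_MAX;

    // Initialize once from a configuration the caller already holds,
    // so no second load from disk is needed. Later calls are ignored.
    static synchronized void init(Map<String, String> conf) {
        if (initialized) {
            return;
        }
        String value = conf.get(COUNTERS_MAX_KEY);
        if (value != null) {
            countersMax = Integer.parseInt(value.trim());
        }
        initialized = true;
    }

    static synchronized int getCountersMax() {
        return countersMax;
    }
}
```

In this sketch the first caller decides the limit, so a task initialized with 
a user-raised value and a history server initialized with a lower value would 
disagree, which is the case the comment asks about.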



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (MAPREDUCE-5633) Can Hadoop use multi-cores of a processor under single machine

2013-11-20 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved MAPREDUCE-5633.


Resolution: Invalid

Please ask such questions on the user list. 
http://hadoop.apache.org/mailing_lists.html#User

> Can Hadoop use multi-cores of a processor under single machine
> --
>
> Key: MAPREDUCE-5633
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5633
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>Reporter: Asif
>






[jira] [Commented] (MAPREDUCE-4421) Remove dependency on deployed MR jars

2013-10-01 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1378#comment-1378
 ] 

Hitesh Shah commented on MAPREDUCE-4421:


[~jlowe] Thanks for the clarification. I believe the performance issues should 
hold regardless of any filesystem implementation used as long as the 
distributed cache layer ends up correctly interpreting the permissions to the 
appropriate LocalResource visibility. 

+1. Latest patch looks good to me. 

Let me know if you are waiting on anyone else to chime in on this. If not, 
please feel free to go ahead and commit or I shall commit later today.  

> Remove dependency on deployed MR jars
> -
>
> Key: MAPREDUCE-4421
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4421
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Arun C Murthy
>Assignee: Jason Lowe
> Attachments: MAPREDUCE-4421-2.patch, MAPREDUCE-4421-3.patch, 
> MAPREDUCE-4421-4.patch, MAPREDUCE-4421.patch, MAPREDUCE-4421.patch
>
>
> Currently MR AM depends on MR jars being deployed on all nodes via implicit 
> dependency on YARN_APPLICATION_CLASSPATH. 
> We should stop adding mapreduce jars to YARN_APPLICATION_CLASSPATH and, 
> probably, just rely on adding a shaded MR jar along with job.jar to the 
> dist-cache.





[jira] [Commented] (MAPREDUCE-4421) Remove dependency on deployed MR jars

2013-09-30 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782438#comment-13782438
 ] 

Hitesh Shah commented on MAPREDUCE-4421:


Sorry for the delay in the review. 

Regarding addMRFrameworkToDistributedCache() - one minor question:  the code 
allows for a non-qualified URI. Should we enforce provision of a 
fully-qualified path always?
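
For illustration, enforcing a qualified path could look roughly like the 
sketch below. It is not the patch's code: FrameworkUriCheck and the example 
paths are hypothetical, and "qualified" is assumed here to mean only that the 
URI carries a scheme.

```java
import java.net.URI;

public class FrameworkUriCheck {

    // Assumption: a path counts as qualified when it names a scheme
    // (hdfs, file, ...). A stricter check could also require an authority.
    static boolean isQualified(String path) {
        return URI.create(path).getScheme() != null;
    }

    public static void main(String[] args) {
        // Hypothetical example locations.
        System.out.println(isQualified("hdfs://namenode:8020/mr/framework.tar.gz"));
        System.out.println(isQualified("/mr/framework.tar.gz"));
    }
}
```

The first path is accepted and the second rejected; whether a bare path should 
instead be resolved against the default filesystem is the open question.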

Minor nit: I believe there should be nothing in the implementation that 
requires HDFS as the storage for the MR tarball? Documentation needs to change 
as a result unless you believe there are reasons for not mentioning other 
filesystems ( except maybe from a testing point of view )?

Patch looks good otherwise. Thanks for adding the detailed docs.




> Remove dependency on deployed MR jars
> -
>
> Key: MAPREDUCE-4421
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4421
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Arun C Murthy
>Assignee: Jason Lowe
> Attachments: MAPREDUCE-4421-2.patch, MAPREDUCE-4421.patch, 
> MAPREDUCE-4421.patch
>
>
> Currently MR AM depends on MR jars being deployed on all nodes via implicit 
> dependency on YARN_APPLICATION_CLASSPATH. 
> We should stop adding mapreduce jars to YARN_APPLICATION_CLASSPATH and, 
> probably, just rely on adding a shaded MR jar along with job.jar to the 
> dist-cache.





[jira] [Commented] (MAPREDUCE-4421) Remove dependency on deployed MR jars

2013-09-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773326#comment-13773326
 ] 

Hitesh Shah commented on MAPREDUCE-4421:


[~jlowe] Thanks for the detailed answers to my queries. 

I believe this initial patch is a good start to making MR a user-land library. 
As it stands, it provides the additional flexibility which can be used by 
anyone to deploy MR with either the full tarball or a mix-match approach. 
Though it might be good to have some documentation on the 2 possible approaches 
( full tarball vs MR tarball ) and explain how the classpath should be setup. 

Depending on your viewpoint, the classpath-to-hdfs path mapping - whether it 
comes in from an additional file on HDFS could be considered in a follow-up 
jira if others believe this is a better solution. 

The one thing to change in the patch is the documentation for 
mapreduce.application.framework.path - it does not mention the use of the URI 
fragment and how that interacts with the configured classpath. 
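
As a sketch of the interaction that the documentation should describe: the 
URI fragment acts as the link name for the unpacked framework, and the 
configured classpath is expected to refer to that name. The class name, the 
example values and the comma-separated classpath format below are assumptions 
for illustration, not taken from the patch.

```java
import java.net.URI;

public class FrameworkFragmentCheck {

    // Returns the URI fragment (the link name), or null if none was given.
    static String linkName(String frameworkPath) {
        return URI.create(frameworkPath).getFragment();
    }

    // Sanity check: the link name must appear as a path component
    // of at least one classpath entry.
    static boolean classpathUsesLink(String classpath, String linkName) {
        if (linkName == null) {
            return false;
        }
        for (String entry : classpath.split(",")) {
            for (String part : entry.trim().split("/")) {
                if (part.equals(linkName)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical values for illustration only.
        String path = "hdfs://namenode:8020/apps/mr/hadoop-mr.tar.gz#mr-framework";
        String classpath = "$PWD/mr-framework/share/hadoop/mapreduce/*,"
                + "$PWD/mr-framework/share/hadoop/common/*";
        System.out.println(classpathUsesLink(classpath, linkName(path)));
    }
}
```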

Could you file a follow-up jira for the config handling? 




> Remove dependency on deployed MR jars
> -
>
> Key: MAPREDUCE-4421
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4421
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Arun C Murthy
>Assignee: Jason Lowe
> Attachments: MAPREDUCE-4421.patch, MAPREDUCE-4421.patch
>
>
> Currently MR AM depends on MR jars being deployed on all nodes via implicit 
> dependency on YARN_APPLICATION_CLASSPATH. 
> We should stop adding mapreduce jars to YARN_APPLICATION_CLASSPATH and, 
> probably, just rely on adding a shaded MR jar along with job.jar to the 
> dist-cache.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4421) Remove dependency on deployed MR jars

2013-09-12 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765961#comment-13765961
 ] 

Hitesh Shah commented on MAPREDUCE-4421:


s/Configuration/Jobconf/ in the previous comment.

> Remove dependency on deployed MR jars
> -
>
> Key: MAPREDUCE-4421
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4421
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Arun C Murthy
>Assignee: Jason Lowe
> Attachments: MAPREDUCE-4421.patch, MAPREDUCE-4421.patch
>
>
> Currently MR AM depends on MR jars being deployed on all nodes via implicit 
> dependency on YARN_APPLICATION_CLASSPATH. 
> We should stop adding mapreduce jars to YARN_APPLICATION_CLASSPATH and, 
> probably, just rely on adding a shaded MR jar along with job.jar to the 
> dist-cache.



[jira] [Commented] (MAPREDUCE-4421) Remove dependency on deployed MR jars

2013-09-12 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765889#comment-13765889
 ] 

Hitesh Shah commented on MAPREDUCE-4421:


[~jlowe] Had a few questions/comments related to the implementation/patch: 

- Why does classpath need to include all of common, hdfs and yarn jar 
locations? Assuming that MR is running on a YARN-based cluster, shouldn't the 
location of the core dependencies come from the cluster deployment, i.e. via the 
env that the NM sets for a container? I believe the only jars that MR should 
have in its uploaded tarball should be the client jars. I understand that there 
is no clear boundary for client-side only jars for common and hdfs today (for 
YARN, I believe it should be simple to split out the client-side 
requirements), but it is something we should aim for or assume that the jars 
deployed on the cluster are compatible. 
  - I guess the underlying question is why use the full hadoop tarball and not 
just the mapreduce-only tarball? If MR is truly a user-land library, it should 
be treated as such and have a separate deployment approach.

- I would vote to make the tar-ball in HDFS be the only way to run MR on YARN. 
Obviously, this cannot be done for 2.x but we should move to this model on 
trunk and not support the current approach at all there. Comments? 

- The other point is related to configs. Configuration still loads mapred-site 
and mapred-default files and new Configuration objects are created on the 
cluster. Are these files still expected on the cluster? job.xml does override 
these but cluster configs could still have final params. If this is meant to be 
addressed in a follow-up jira to ensure all MR configs come from the client, 
you can ignore this point for now.

- How do you see the framework name extracted from the path being used? Is it 
just a safety check to ensure that it is found in the classpath? Will it have 
any relation to a version? A minor nit - framework name seems confusing in 
relation to the framework name in use from earlier, i.e. yarn vs local framework. 

- Description in the default-xml for mapreduce.application.framework.path does 
not mention the need for the URI fragment and how the fragment is used as a 
sanity check to the classpath. 

- Regarding versions, it seems like users will need to do 2 things. Change the 
location of the tarball on HDFS and modify the classpath. Users will need to 
know the exact structure of the classpath. In such a scenario, do defaults even 
make sense? On the other hand, if we define a common standard i.e. a base path 
for all MR tarballs, with each tarball in a defined structure  ( possibly with 
version info added on later on for the code to infer the structure of the 
tarball ), all the user would need to do is specify the base path ( which could 
have a default value ) and a version which again has a default value. The 
latter approach would require the code to construct the necessary classpath if 
the upload path is in use. Do you have any comments on which of the 2 
approaches makes more sense? The former is way more flexible but a bit more 
complex. The latter is brittle/inflexible with respect to changing tarball 
structures but likely easier to enforce a standard on.


> Remove dependency on deployed MR jars
> -
>
> Key: MAPREDUCE-4421
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4421
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Arun C Murthy
>Assignee: Jason Lowe
> Attachments: MAPREDUCE-4421.patch, MAPREDUCE-4421.patch
>
>
> Currently MR AM depends on MR jars being deployed on all nodes via implicit 
> dependency on YARN_APPLICATION_CLASSPATH. 
> We should stop adding mapreduce jars to YARN_APPLICATION_CLASSPATH and, 
> probably, just rely on adding a shaded MR jar along with job.jar to the 
> dist-cache.



[jira] [Commented] (MAPREDUCE-5130) Add missing job config options to mapred-default.xml

2013-07-31 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725902#comment-13725902
 ] 

Hitesh Shah commented on MAPREDUCE-5130:


[~sandyr] Was a bit thrown off by the jira description which mentions 
documenting *child.java.opts instead of the property names not using "child".

> Add missing job config options to mapred-default.xml
> 
>
> Key: MAPREDUCE-5130
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5130
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.0.4-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-5130-1.patch, MAPREDUCE-5130-1.patch, 
> MAPREDUCE-5130-2.patch, MAPREDUCE-5130-3.patch, MAPREDUCE-5130-4.patch, 
> MAPREDUCE-5130-5.patch, MAPREDUCE-5130.patch, MAPREDUCE-5130.patch
>
>
> I came across that mapreduce.map.child.java.opts and 
> mapreduce.reduce.child.java.opts were missing in mapred-default.xml.  I'll do 
> a fuller sweep to see what else is missing before posting a patch.
> List so far:
> mapreduce.map/reduce.child.java.opts
> mapreduce.map/reduce.memory.mb
> mapreduce.job.jvm.numtasks
> mapreduce.input.lineinputformat.linespermap
> mapreduce.task.combine.progress.records
> mapreduce.map/reduce.env



[jira] [Commented] (MAPREDUCE-5130) Add missing job config options to mapred-default.xml

2013-07-31 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725728#comment-13725728
 ] 

Hitesh Shah commented on MAPREDUCE-5130:


Regarding mapreduce.map/reduce.child.java.opts, aren't they to be deprecated in 
favor of mapreduce.[map|reduce].java.opts?



> Add missing job config options to mapred-default.xml
> 
>
> Key: MAPREDUCE-5130
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5130
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.0.4-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-5130-1.patch, MAPREDUCE-5130-1.patch, 
> MAPREDUCE-5130-2.patch, MAPREDUCE-5130-3.patch, MAPREDUCE-5130-4.patch, 
> MAPREDUCE-5130-5.patch, MAPREDUCE-5130.patch, MAPREDUCE-5130.patch
>
>
> I came across that mapreduce.map.child.java.opts and 
> mapreduce.reduce.child.java.opts were missing in mapred-default.xml.  I'll do 
> a fuller sweep to see what else is missing before posting a patch.
> List so far:
> mapreduce.map/reduce.child.java.opts
> mapreduce.map/reduce.memory.mb
> mapreduce.job.jvm.numtasks
> mapreduce.input.lineinputformat.linespermap
> mapreduce.task.combine.progress.records
> mapreduce.map/reduce.env



[jira] [Created] (MAPREDUCE-5416) hadoop-mapreduce-client-common depends on hadoop-yarn-server-common

2013-07-24 Thread Hitesh Shah (JIRA)
Hitesh Shah created MAPREDUCE-5416:
--

 Summary: hadoop-mapreduce-client-common depends on 
hadoop-yarn-server-common
 Key: MAPREDUCE-5416
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5416
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah


mapreduce-client-app and mapreduce-client-jobclient modules also depend on 
yarn-server-common but only in test scope.




[jira] [Commented] (MAPREDUCE-5408) CLONE - The logging level of the tasks should be configurable by the job

2013-07-23 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716470#comment-13716470
 ] 

Hitesh Shah commented on MAPREDUCE-5408:


Mostly looks good. A couple of minor comments:

  - DEFAULT_LOG_LEVEL could be renamed to DEFAULT_TASK_LOG_LEVEL and the type 
changed to a string. Having the type as Level is not buying much as it always 
ends up being converted to a string when used. If the intention is to retain 
the backport as is, this comment can be ignored for now. 

  - Level.toLevel() has an api which takes in a default value. In the event 
that the user has a typo, the current usage falls back to using DEBUG, whereas 
the default-based api can be made to fall back to INFO.
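
To show the fallback behaviour being suggested without pulling in log4j, here 
is a self-contained stand-in. The enum and the toLevel helper below are 
hypothetical and only imitate a parse-with-default call; the real code would 
use log4j's own Level type.

```java
import java.util.Locale;

public class TaskLogLevelSketch {

    // Hypothetical stand-in for the logging library's Level type.
    enum Level { DEBUG, INFO, WARN, ERROR }

    // Unknown or missing names map to the supplied fallback
    // instead of silently becoming DEBUG.
    static Level toLevel(String name, Level fallback) {
        if (name == null) {
            return fallback;
        }
        try {
            return Level.valueOf(name.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(toLevel("warn", Level.INFO));  // WARN
        System.out.println(toLevel("INOF", Level.INFO));  // INFO (typo falls back)
    }
}
```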


 

> CLONE - The logging level of the tasks should be configurable by the job
> 
>
> Key: MAPREDUCE-5408
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5408
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Owen O'Malley
>Assignee: Arun C Murthy
> Fix For: 1.3.0
>
> Attachments: MAPREDUCE-336_branch1.patch
>
>
> It would be nice to be able to configure the logging level of the Task JVM's 
> separately from the server JVM's. Reducing logging substantially increases 
> performance and reduces the consumption of local disk on the task trackers.



[jira] [Resolved] (MAPREDUCE-5399) Large number of map tasks cause slow sort at reduce phase, invariant to amount of data to sort

2013-07-17 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved MAPREDUCE-5399.


Resolution: Invalid

If this is indeed an issue with Apache Hadoop-1.x, please feel free to file a 
jira with details specific to that. Issues with a particular vendor's distro 
should be redirected to the vendor in question. 

> Large number of map tasks cause slow sort at reduce phase, invariant to 
> amount of data to sort
> --
>
> Key: MAPREDUCE-5399
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5399
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Reporter: Stanislav Barton
>Priority: Critical
>
> We are using hadoop-2.0.0+1357-1.cdh4.3.0.p0.21 with MRv1. After upgrade from 
> 4.1.2 to 4.3.0, I have noticed some performance deterioration in our MR job 
> in the Reduce phase. The MR job has usually 10 000 map tasks (10 000 files on 
> input each about 100MB) and 6 000 reducers (one reducer per table region). I 
> was trying to figure out at which phase the slowdown appears (at first I 
> suspected that the slow gathering of the 1 map output files is the 
> culprit) and found out that the problem is not reading the map output (the 
> shuffle) but the sort/merge phase that follows - the last and actual reduce 
> phase is fast. I have tried to up the io.sort.factor because I thought that 
> lots of small files were being merged on disk, but again upping that to 1000 
> didn't make any difference. I have then printed the stack trace and found out 
> that the problem is initialization of the 
> org.apache.hadoop.mapred.IFileInputStream namely the creation of the 
> Configuration object which is not propagated along from earlier context, see 
> the stack trace:
> Thread 13332: (state = IN_NATIVE)
>  - java.io.UnixFileSystem.getBooleanAttributes0(java.io.File) @bci=0 
> (Compiled frame; information may be imprecise)
>  - java.io.UnixFileSystem.getBooleanAttributes(java.io.File) @bci=2, line=228 
> (Compiled frame)
>  - java.io.File.exists() @bci=20, line=733 (Compiled frame)
>  - sun.misc.URLClassPath$FileLoader.getResource(java.lang.String, boolean) 
> @bci=136, line=999 (Compiled frame)
>  - sun.misc.URLClassPath$FileLoader.findResource(java.lang.String, boolean) 
> @bci=3, line=966 (Compiled frame)
>  - sun.misc.URLClassPath.findResource(java.lang.String, boolean) @bci=17, 
> line=146 (Compiled frame)
>  - java.net.URLClassLoader$2.run() @bci=12, line=385 (Compiled frame)
>  - 
> java.security.AccessController.doPrivileged(java.security.PrivilegedAction, 
> java.security.AccessControlContext) @bci=0 (Compiled frame)
>  - java.net.URLClassLoader.findResource(java.lang.String) @bci=13, line=382 
> (Compiled frame)
>  - java.lang.ClassLoader.getResource(java.lang.String) @bci=30, line=1002 
> (Compiled frame)
>  - java.lang.ClassLoader.getResourceAsStream(java.lang.String) @bci=2, 
> line=1192 (Compiled frame)
>  - javax.xml.parsers.SecuritySupport$4.run() @bci=26, line=96 (Compiled frame)
>  - 
> java.security.AccessController.doPrivileged(java.security.PrivilegedAction) 
> @bci=0 (Compiled frame)
>  - 
> javax.xml.parsers.SecuritySupport.getResourceAsStream(java.lang.ClassLoader, 
> java.lang.String) @bci=10, line=89 (Compiled frame)
>  - javax.xml.parsers.FactoryFinder.findJarServiceProvider(java.lang.String) 
> @bci=38, line=250 (Interpreted frame)
>  - javax.xml.parsers.FactoryFinder.find(java.lang.String, java.lang.String) 
> @bci=273, line=223 (Interpreted frame)
>  - javax.xml.parsers.DocumentBuilderFactory.newInstance() @bci=4, line=123 
> (Compiled frame)
>  - org.apache.hadoop.conf.Configuration.loadResource(java.util.Properties, 
> org.apache.hadoop.conf.Configuration$Resource, boolean) @bci=16, line=1890 
> (Compiled frame)
>  - org.apache.hadoop.conf.Configuration.loadResources(java.util.Properties, 
> java.util.ArrayList, boolean) @bci=49, line=1867 (Compiled frame)
>  - org.apache.hadoop.conf.Configuration.getProps() @bci=43, line=1785 
> (Compiled frame)
>  - org.apache.hadoop.conf.Configuration.get(java.lang.String) @bci=35, 
> line=712 (Compiled frame)
>  - org.apache.hadoop.conf.Configuration.getTrimmed(java.lang.String) @bci=2, 
> line=731 (Compiled frame)
>  - org.apache.hadoop.conf.Configuration.getBoolean(java.lang.String, boolean) 
> @bci=2, line=1047 (Interpreted frame)
>  - org.apache.hadoop.mapred.IFileInputStream.<init>(java.io.InputStream, 
> long, org.apache.hadoop.conf.Configuration) @bci=111, line=93 (Interpreted 
> frame)
>  - 
> org.apache.hadoop.mapred.IFile$Reader.<init>(org.apache.hadoop.conf.Configuration,
>  org.apache.hadoop.fs.FSDataInputStream, long, 
> org.apache.hadoop.io.compress.CompressionCodec, 
> org.apache.hadoop.mapred.Counters$Counter) @bci=

[jira] [Resolved] (MAPREDUCE-5325) ClientRMProtocol.getAllApplications should accept ApplicationType as a parameter---MR changes

2013-07-09 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved MAPREDUCE-5325.


   Resolution: Fixed
Fix Version/s: 2.1.0-beta

Committed to trunk, branch-2, branch-2.1-beta and branch-2.1.0-beta. Thanks 
Xuan.

> ClientRMProtocol.getAllApplications should accept ApplicationType as a 
> parameter---MR changes
> -
>
> Key: MAPREDUCE-5325
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5325
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Fix For: 2.1.0-beta
>
> Attachments: MR-5325.1.patch, MR-5325.2.patch, MR-5325.3.patch, 
> MR-5325.4.patch, MR-5325.5.patch
>
>




[jira] [Commented] (MAPREDUCE-5325) ClientRMProtocol.getAllApplications should accept ApplicationType as a parameter---MR changes

2013-07-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703862#comment-13703862
 ] 

Hitesh Shah commented on MAPREDUCE-5325:


Overall patch being reviewed as part of YARN-727. Will be committed together to 
ensure build does not break.

> ClientRMProtocol.getAllApplications should accept ApplicationType as a 
> parameter---MR changes
> -
>
> Key: MAPREDUCE-5325
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5325
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: MR-5325.1.patch, MR-5325.2.patch, MR-5325.3.patch, 
> MR-5325.4.patch, MR-5325.5.patch
>
>




[jira] [Commented] (MAPREDUCE-5325) ClientRMProtocol.getAllApplications should accept ApplicationType as a parameter---MR changes

2013-06-15 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13684234#comment-13684234
 ] 

Hitesh Shah commented on MAPREDUCE-5325:


@Xuan, will mapreduce jobs have different application types or only a single 
fixed type for all MR jobs? If the latter, the getAllJobs() should not be 
taking application type as an argument.

> ClientRMProtocol.getAllApplications should accept ApplicationType as a 
> parameter---MR changes
> -
>
> Key: MAPREDUCE-5325
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5325
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: MR-5325.1.patch
>
>




[jira] [Created] (MAPREDUCE-5324) Admin-provided user environment can be overridden by user provided values for the AM

2013-06-14 Thread Hitesh Shah (JIRA)
Hitesh Shah created MAPREDUCE-5324:
--

 Summary: Admin-provided user environment can be overridden by user 
provided values for the AM
 Key: MAPREDUCE-5324
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5324
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah
Priority: Minor


MRJobConfig.MR_AM_ADMIN_USER_ENV can be overridden by MRJobConfig.MR_AM_ENV.

Either the variable should be renamed to something along the lines of 
DEFAULT_ENV or the code fixed to have the correct overrides. Current 
documentation clearly states user env overrides admin env.
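
The two possible orderings can be sketched as follows. AmEnvMerge is a 
hypothetical helper, not the MR code; it only shows that whichever map is 
applied last wins on a key collision.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AmEnvMerge {

    // Entries from 'second' replace entries from 'first' on key collisions.
    static Map<String, String> merge(Map<String, String> first,
                                     Map<String, String> second) {
        Map<String, String> env = new LinkedHashMap<>(first);
        env.putAll(second);
        return env;
    }

    public static void main(String[] args) {
        Map<String, String> adminEnv = new LinkedHashMap<>();
        adminEnv.put("LD_LIBRARY_PATH", "/admin/lib");
        Map<String, String> userEnv = new LinkedHashMap<>();
        userEnv.put("LD_LIBRARY_PATH", "/user/lib");

        // Behaviour the documentation describes: the user value wins.
        System.out.println(merge(adminEnv, userEnv).get("LD_LIBRARY_PATH"));
        // Behaviour the name "admin" may suggest: the admin value wins.
        System.out.println(merge(userEnv, adminEnv).get("LD_LIBRARY_PATH"));
    }
}
```

If the first ordering is the intended one, a name along the lines of 
DEFAULT_ENV describes the admin-provided values more accurately.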





[jira] [Commented] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

2013-05-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662321#comment-13662321
 ] 

Hitesh Shah commented on MAPREDUCE-5095:


Thanks Arpit. Committed to branch-1. 

> TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
> -
>
> Key: MAPREDUCE-5095
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1.1.2
> Environment: Open JDK7
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 1.3.0
>
> Attachments: MAPREDUCE-5095.patch
>
>   Original Estimate: 1h
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The test fails due to a test-order dependency that can be violated when running 
> with JDK 7.



[jira] [Updated] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-20 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5191:
---

Release Note:   (was: Thanks Ivan. Committed to trunk.)

> TestQueue#testQueue fails with timeout on Windows
> -
>
> Key: MAPREDUCE-5191
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Fix For: 3.0.0
>
> Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
> MAPREDUCE-5191.patch
>
>
> Test times out on my machine after 5 seconds always on the below stack:
> {code}
> testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec  <<< 
> ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:485)
>   at 
> sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
>   at 
> sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
>   at 
> sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
>   at 
> sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
>   at 
> sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
>   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
>   at java.security.SecureRandom.next(SecureRandom.java:455)
>   at java.util.Random.nextLong(Random.java:284)
>   at java.io.File.generateFile(File.java:1682)
>   at java.io.File.createTempFile(File.java:1791)
>   at java.io.File.createTempFile(File.java:1828)
>   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
>   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
> {code} 



[jira] [Updated] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

2013-05-20 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5095:
---

Release Note:   (was: Thanks Arpit. Committed to branch-1. )

> TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
> -
>
> Key: MAPREDUCE-5095
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1.1.2
> Environment: Open JDK7
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 1.3.0
>
> Attachments: MAPREDUCE-5095.patch
>
>   Original Estimate: 1h
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The test fails due to a test-order dependency that can be violated when 
> running with JDK 7.



[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662320#comment-13662320
 ] 

Hitesh Shah commented on MAPREDUCE-5191:


Thanks Ivan. Committed to trunk.

> TestQueue#testQueue fails with timeout on Windows
> -
>
> Key: MAPREDUCE-5191
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Fix For: 3.0.0
>
> Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
> MAPREDUCE-5191.patch
>
>
> Test times out on my machine after 5 seconds always on the below stack:
> {code}
> testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec  <<< 
> ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:485)
>   at 
> sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
>   at 
> sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
>   at 
> sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
>   at 
> sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
>   at 
> sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
>   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
>   at java.security.SecureRandom.next(SecureRandom.java:455)
>   at java.util.Random.nextLong(Random.java:284)
>   at java.io.File.generateFile(File.java:1682)
>   at java.io.File.createTempFile(File.java:1791)
>   at java.io.File.createTempFile(File.java:1828)
>   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
>   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
> {code} 



[jira] [Resolved] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

2013-05-20 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved MAPREDUCE-5095.


  Resolution: Fixed
Release Note: Thanks Arpit. Committed to branch-1. 

> TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
> -
>
> Key: MAPREDUCE-5095
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1.1.2
> Environment: Open JDK7
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 1.3.0
>
> Attachments: MAPREDUCE-5095.patch
>
>   Original Estimate: 1h
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The test fails due to a test-order dependency that can be violated when 
> running with JDK 7.



[jira] [Commented] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

2013-05-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662316#comment-13662316
 ] 

Hitesh Shah commented on MAPREDUCE-5095:


[~arpitagarwal] I should have reviewed the whole patch in context. Thanks for 
the clarification. +1. Will commit shortly. 

> TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
> -
>
> Key: MAPREDUCE-5095
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1.1.2
> Environment: Open JDK7
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 1.3.0
>
> Attachments: MAPREDUCE-5095.patch
>
>   Original Estimate: 1h
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The test fails due to a test-order dependency that can be violated when 
> running with JDK 7.



[jira] [Updated] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-20 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5191:
---

   Resolution: Fixed
Fix Version/s: 3.0.0
 Release Note: Thanks Ivan. Committed to trunk.
   Status: Resolved  (was: Patch Available)

> TestQueue#testQueue fails with timeout on Windows
> -
>
> Key: MAPREDUCE-5191
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Fix For: 3.0.0
>
> Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
> MAPREDUCE-5191.patch
>
>
> Test times out on my machine after 5 seconds always on the below stack:
> {code}
> testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec  <<< 
> ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:485)
>   at 
> sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
>   at 
> sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
>   at 
> sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
>   at 
> sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
>   at 
> sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
>   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
>   at java.security.SecureRandom.next(SecureRandom.java:455)
>   at java.util.Random.nextLong(Random.java:284)
>   at java.io.File.generateFile(File.java:1682)
>   at java.io.File.createTempFile(File.java:1791)
>   at java.io.File.createTempFile(File.java:1828)
>   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
>   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
> {code} 



[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662313#comment-13662313
 ] 

Hitesh Shah commented on MAPREDUCE-5191:


+1. Committing shortly. 

> TestQueue#testQueue fails with timeout on Windows
> -
>
> Key: MAPREDUCE-5191
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
> MAPREDUCE-5191.patch
>
>
> Test times out on my machine after 5 seconds always on the below stack:
> {code}
> testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec  <<< 
> ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:485)
>   at 
> sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
>   at 
> sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
>   at 
> sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
>   at 
> sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
>   at 
> sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
>   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
>   at java.security.SecureRandom.next(SecureRandom.java:455)
>   at java.util.Random.nextLong(Random.java:284)
>   at java.io.File.generateFile(File.java:1682)
>   at java.io.File.createTempFile(File.java:1791)
>   at java.io.File.createTempFile(File.java:1828)
>   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
>   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
> {code} 



[jira] [Updated] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-13 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5191:
---

Status: Open  (was: Patch Available)

> TestQueue#testQueue fails with timeout on Windows
> -
>
> Key: MAPREDUCE-5191
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.patch
>
>
> Test times out on my machine after 5 seconds always on the below stack:
> {code}
> testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec  <<< 
> ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:485)
>   at 
> sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
>   at 
> sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
>   at 
> sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
>   at 
> sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
>   at 
> sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
>   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
>   at java.security.SecureRandom.next(SecureRandom.java:455)
>   at java.util.Random.nextLong(Random.java:284)
>   at java.io.File.generateFile(File.java:1682)
>   at java.io.File.createTempFile(File.java:1791)
>   at java.io.File.createTempFile(File.java:1828)
>   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
>   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
> {code} 



[jira] [Updated] (MAPREDUCE-5240) inside of FileOutputCommitter the initialized Credentials cache appears to be empty

2013-05-11 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5240:
---

Component/s: (was: mrv1)
 mrv2

> inside of FileOutputCommitter the initialized Credentials cache appears to be 
> empty
> ---
>
> Key: MAPREDUCE-5240
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5240
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.4-alpha
>Reporter: Roman Shaposhnik
>Priority: Blocker
> Fix For: 2.0.5-beta
>
> Attachments: LostCreds.java
>
>
> I am attaching a modified wordcount job that clearly demonstrates the problem 
> we've encountered in running Sqoop2 on YARN (BIGTOP-949).
> Here's what running it produces:
> {noformat}
> $ hadoop fs -mkdir in
> $ hadoop fs -put /etc/passwd in
> $ hadoop jar ./bug.jar org.myorg.LostCreds
> 13/05/12 03:13:46 WARN mapred.JobConf: The variable mapred.child.ulimit is no 
> longer used.
> numberOfSecretKeys: 1
> numberOfTokens: 0
> ..
> ..
> ..
> 13/05/12 03:05:35 INFO mapreduce.Job: Job job_1368318686284_0013 failed with 
> state FAILED due to: Job commit failed: java.io.IOException:
> numberOfSecretKeys: 0
> numberOfTokens: 0
>   at 
> org.myorg.LostCreds$DestroyerFileOutputCommitter.commitJob(LostCreds.java:43)
>   at 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:249)
>   at 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:212)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:619)
> {noformat}
> As you can see, even though we've clearly initialized the creds via:
> {noformat}
> job.getCredentials().addSecretKey(new Text("mykey"), "mysecret".getBytes());
> {noformat}
> It doesn't seem to appear later in the job.
> This is a pretty critical issue for Sqoop 2 since it appears to be DOA for 
> YARN in Hadoop 2.0.4-alpha



[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-10 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654947#comment-13654947
 ] 

Hitesh Shah commented on MAPREDUCE-5191:


Would it make sense to avoid the temp-file method here, to cut the time the 
test takes to run? How about just creating a file under target/, using the 
name of the test as the filename? On a Mac, this test averaged about 1 second 
over multiple runs. 
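
The suggestion above can be sketched roughly as follows. This is only an illustration, not the committed patch: the class name `QueueTestFiles`, the method `writeNamedFile`, the `target/test-data` directory and the `.xml` suffix are all assumptions, and it uses `java.nio.file`, so it presumes Java 7 or later. The point is that the file name is chosen by the caller, so nothing has to generate a random suffix the way `File.createTempFile` does in the quoted stack.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class QueueTestFiles {
    // Hypothetical helper: writes content to <baseDir>/<testName>.xml.
    // The caller fixes the name, so no random suffix is generated.
    static File writeNamedFile(File baseDir, String testName, String content) {
        try {
            // Creates baseDir and any missing parents; no-op if it already exists.
            Files.createDirectories(baseDir.toPath());
            File file = new File(baseDir, testName + ".xml");
            // Creates the file or truncates an existing one, so reruns are safe.
            Files.write(file.toPath(), content.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new IllegalStateException("Could not write test file", e);
        }
    }

    public static void main(String[] args) {
        File dir = new File("target", "test-data");
        File file = writeNamedFile(dir, "TestQueue-testQueue", "<queues/>");
        System.out.println("wrote " + file.getPath());
        file.delete();
    }
}
```

One trade-off of a fixed name is that two concurrent runs in the same checkout would write the same file, which a temp file avoids.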

 

> TestQueue#testQueue fails with timeout on Windows
> -
>
> Key: MAPREDUCE-5191
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.patch
>
>
> Test times out on my machine after 5 seconds always on the below stack:
> {code}
> testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec  <<< 
> ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:485)
>   at 
> sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
>   at 
> sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
>   at 
> sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
>   at 
> sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
>   at 
> sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
>   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
>   at java.security.SecureRandom.next(SecureRandom.java:455)
>   at java.util.Random.nextLong(Random.java:284)
>   at java.io.File.generateFile(File.java:1682)
>   at java.io.File.createTempFile(File.java:1791)
>   at java.io.File.createTempFile(File.java:1828)
>   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
>   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
> {code} 



[jira] [Commented] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

2013-05-10 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654710#comment-13654710
 ] 

Hitesh Shah commented on MAPREDUCE-5095:


Should abortCalled also be changed to a non-static? 
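
For context, a minimal sketch of why a static flag can create a test-order dependency. Only the name `abortCalled` comes from this thread; the class `AbortTrackerSketch` and its methods are hypothetical and do not reflect the actual test code or patch.

```java
class AbortTrackerSketch {
    // Instance field: every test that builds its own tracker starts from false.
    // If this were static, a true set by one test would still be visible in the
    // next test, and the outcome would depend on execution order.
    private boolean abortCalled = false;

    // Hypothetical stand-in for whatever code path the real test expects to abort.
    void doAbort() {
        abortCalled = true;
    }

    boolean wasAbortCalled() {
        return abortCalled;
    }

    public static void main(String[] args) {
        AbortTrackerSketch first = new AbortTrackerSketch();
        first.doAbort();
        AbortTrackerSketch second = new AbortTrackerSketch();
        // The second tracker is unaffected by the first one's abort.
        System.out.println(first.wasAbortCalled() + " " + second.wasAbortCalled());
    }
}
```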


> TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
> -
>
> Key: MAPREDUCE-5095
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1.1.2
> Environment: Open JDK7
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 1.3.0
>
> Attachments: MAPREDUCE-5095.patch
>
>   Original Estimate: 1h
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The test fails due to a test-order dependency that can be violated when 
> running with JDK 7.



[jira] [Updated] (MAPREDUCE-5179) Change TestHSWebServices to do string equal check on hadoop build version similar to YARN-605

2013-04-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5179:
---

Status: Patch Available  (was: Open)

> Change TestHSWebServices to do string equal check on hadoop build version 
> similar to YARN-605
> -
>
> Key: MAPREDUCE-5179
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5179
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Attachments: MAPREDUCE-5179.1.patch
>
>




[jira] [Commented] (MAPREDUCE-5179) Change TestHSWebServices to do string equal check on hadoop build version similar to YARN-605

2013-04-25 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642073#comment-13642073
 ] 

Hitesh Shah commented on MAPREDUCE-5179:


[~vinodkv], no others found. 

> Change TestHSWebServices to do string equal check on hadoop build version 
> similar to YARN-605
> -
>
> Key: MAPREDUCE-5179
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5179
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Attachments: MAPREDUCE-5179.1.patch
>
>




[jira] [Commented] (MAPREDUCE-5179) Change TestHSWebServices to do string equal check on hadoop build version similar to YARN-605

2013-04-24 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640672#comment-13640672
 ] 

Hitesh Shah commented on MAPREDUCE-5179:


Needs YARN-605 to be committed before this can go in.

> Change TestHSWebServices to do string equal check on hadoop build version 
> similar to YARN-605
> -
>
> Key: MAPREDUCE-5179
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5179
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Attachments: MAPREDUCE-5179.1.patch
>
>




[jira] [Commented] (MAPREDUCE-5178) Fix use of BuilderUtils#newApplicationReport as a result of YARN-577.

2013-04-24 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640674#comment-13640674
 ] 

Hitesh Shah commented on MAPREDUCE-5178:


Needs YARN-577 to go in before this can be committed.

> Fix use of BuilderUtils#newApplicationReport as a result of YARN-577.
> -
>
> Key: MAPREDUCE-5178
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5178
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Attachments: MAPREDUCE-5178.1.patch
>
>




[jira] [Updated] (MAPREDUCE-5179) Change TestHSWebServices to do string equal check on hadoop build version similar to YARN-605

2013-04-24 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5179:
---

Attachment: MAPREDUCE-5179.1.patch

> Change TestHSWebServices to do string equal check on hadoop build version 
> similar to YARN-605
> -
>
> Key: MAPREDUCE-5179
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5179
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Attachments: MAPREDUCE-5179.1.patch
>
>




[jira] [Created] (MAPREDUCE-5179) Change TestHSWebServices to do string equal check on hadoop build version similar to YARN-605

2013-04-24 Thread Hitesh Shah (JIRA)
Hitesh Shah created MAPREDUCE-5179:
--

 Summary: Change TestHSWebServices to do string equal check on 
hadoop build version similar to YARN-605
 Key: MAPREDUCE-5179
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5179
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah






[jira] [Assigned] (MAPREDUCE-5178) Fix use of BuilderUtils#newApplicationReport as a result of YARN-577.

2013-04-24 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah reassigned MAPREDUCE-5178:
--

Assignee: Hitesh Shah

> Fix use of BuilderUtils#newApplicationReport as a result of YARN-577.
> -
>
> Key: MAPREDUCE-5178
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5178
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Attachments: MAPREDUCE-5178.1.patch
>
>




[jira] [Updated] (MAPREDUCE-5178) Fix use of BuilderUtils#newApplicationReport as a result of YARN-577.

2013-04-24 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5178:
---

Attachment: MAPREDUCE-5178.1.patch

> Fix use of BuilderUtils#newApplicationReport as a result of YARN-577.
> -
>
> Key: MAPREDUCE-5178
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5178
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Hitesh Shah
> Attachments: MAPREDUCE-5178.1.patch
>
>




[jira] [Created] (MAPREDUCE-5178) Fix use of BuilderUtils#newApplicationReport as a result of YARN-577.

2013-04-24 Thread Hitesh Shah (JIRA)
Hitesh Shah created MAPREDUCE-5178:
--

 Summary: Fix use of BuilderUtils#newApplicationReport as a result 
of YARN-577.
 Key: MAPREDUCE-5178
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5178
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah






[jira] [Updated] (MAPREDUCE-5142) MR AM unregisters with state KILLED when an error causes dispatcher to shutdown

2013-04-10 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5142:
---

Description: 
RMCommunicator sets final state to KILLED if the job is in a running state and 
isSignalled is set to true. 

{code}
  } else if (jobImpl.getInternalState() == JobStateInternal.KILLED
      || (jobImpl.getInternalState() == JobStateInternal.RUNNING && isSignalled)) {
    finishState = FinalApplicationStatus.KILLED;
  } else if (jobImpl.getInternalState() == JobStateInternal.FAILED
      || jobImpl.getInternalState() == JobStateInternal.ERROR) {
    finishState = FinalApplicationStatus.FAILED;
{code}

This happens when an uncaught exception in any event handler causes the 
AsyncDispatcher to trigger a shutdown. In that case, even though the AM 
actually failed due to an error, its final state ends up as KILLED.




  was:
RMCommunicator sets final state to KILLED if the job is in a running state and 
isSignalled is set to true. 

{code}
  } else if (jobImpl.getInternalState() == JobStateInternal.KILLED
  || (jobImpl.getInternalState() == JobStateInternal.RUNNING && 
isSignalled)) {
finishState = FinalApplicationStatus.KILLED;
  } else if (jobImpl.getInternalState() == JobStateInternal.FAILED
  || jobImpl.getInternalState() == JobStateInternal.ERROR) {
finishState = FinalApplicationStatus.FAILED;
{code}

This happens when for some reason, there is an exception in a state machine's 
event handler causing AsyncDispatcher to trigger a shutdown. In such a 
scenario, even though the AM actually failed due to some error, its actual 
state ends up as KILLED.





> MR AM unregisters with state KILLED when an error causes dispatcher to 
> shutdown
> ---
>
> Key: MAPREDUCE-5142
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5142
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.3-alpha, 0.23.5
>Reporter: Hitesh Shah
>
> RMCommunicator sets final state to KILLED if the job is in a running state 
> and isSignalled is set to true. 
> {code}
>   } else if (jobImpl.getInternalState() == JobStateInternal.KILLED
>   || (jobImpl.getInternalState() == JobStateInternal.RUNNING && 
> isSignalled)) {
> finishState = FinalApplicationStatus.KILLED;
>   } else if (jobImpl.getInternalState() == JobStateInternal.FAILED
>   || jobImpl.getInternalState() == JobStateInternal.ERROR) {
> finishState = FinalApplicationStatus.FAILED;
> {code}
> This happens when any uncaught exception in any event handler ends up causing 
> the AsyncDispatcher to trigger a shutdown. In such a scenario, even though 
> the AM actually failed due to some error, its actual state ends up as KILLED.



[jira] [Updated] (MAPREDUCE-5142) MR AM unregisters with state KILLED when an error causes dispatcher to shutdown

2013-04-10 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5142:
---

Affects Version/s: 2.0.3-alpha
   0.23.5

> MR AM unregisters with state KILLED when an error causes dispatcher to 
> shutdown
> ---
>
> Key: MAPREDUCE-5142
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5142
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.3-alpha, 0.23.5
>Reporter: Hitesh Shah
>
> RMCommunicator sets final state to KILLED if the job is in a running state 
> and isSignalled is set to true. 
> {code}
>   } else if (jobImpl.getInternalState() == JobStateInternal.KILLED
>   || (jobImpl.getInternalState() == JobStateInternal.RUNNING && 
> isSignalled)) {
> finishState = FinalApplicationStatus.KILLED;
>   } else if (jobImpl.getInternalState() == JobStateInternal.FAILED
>   || jobImpl.getInternalState() == JobStateInternal.ERROR) {
> finishState = FinalApplicationStatus.FAILED;
> {code}
> This happens when for some reason, there is an exception in a state machine's 
> event handler causing AsyncDispatcher to trigger a shutdown. In such a 
> scenario, even though the AM actually failed due to some error, its actual 
> state ends up as KILLED.



[jira] [Commented] (MAPREDUCE-5142) MR AM unregisters with state KILLED when an error causes dispatcher to shutdown

2013-04-10 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628187#comment-13628187
 ] 

Hitesh Shah commented on MAPREDUCE-5142:


@Jason, yes - definitely the same underlying issue. Addressing the CLC creation 
would address a part of the issue but currently all uncaught exceptions will 
end up with the AM in a KILLED state.   

> MR AM unregisters with state KILLED when an error causes dispatcher to 
> shutdown
> ---
>
> Key: MAPREDUCE-5142
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5142
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Hitesh Shah
>
> RMCommunicator sets final state to KILLED if the job is in a running state 
> and isSignalled is set to true. 
> {code}
>   } else if (jobImpl.getInternalState() == JobStateInternal.KILLED
>       || (jobImpl.getInternalState() == JobStateInternal.RUNNING
>           && isSignalled)) {
>     finishState = FinalApplicationStatus.KILLED;
>   } else if (jobImpl.getInternalState() == JobStateInternal.FAILED
>       || jobImpl.getInternalState() == JobStateInternal.ERROR) {
>     finishState = FinalApplicationStatus.FAILED;
> {code}
> This happens when an exception in a state machine's event handler causes 
> AsyncDispatcher to trigger a shutdown. In that scenario, even though the AM 
> actually failed because of an error, its reported final state ends up as 
> KILLED.



[jira] [Created] (MAPREDUCE-5142) MR AM unregisters with state KILLED when an error causes dispatcher to shutdown

2013-04-10 Thread Hitesh Shah (JIRA)
Hitesh Shah created MAPREDUCE-5142:
--

 Summary: MR AM unregisters with state KILLED when an error causes 
dispatcher to shutdown
 Key: MAPREDUCE-5142
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5142
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah


RMCommunicator sets final state to KILLED if the job is in a running state and 
isSignalled is set to true. 

{code}
  } else if (jobImpl.getInternalState() == JobStateInternal.KILLED
      || (jobImpl.getInternalState() == JobStateInternal.RUNNING
          && isSignalled)) {
    finishState = FinalApplicationStatus.KILLED;
  } else if (jobImpl.getInternalState() == JobStateInternal.FAILED
      || jobImpl.getInternalState() == JobStateInternal.ERROR) {
    finishState = FinalApplicationStatus.FAILED;
{code}

This happens when an exception in a state machine's event handler causes 
AsyncDispatcher to trigger a shutdown. In that scenario, even though the AM 
actually failed because of an error, its reported final state ends up as 
KILLED.
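
To make the reported behaviour concrete, here is a minimal, self-contained sketch of the mapping described above. The enums and the toFinalStatus helper are stand-ins invented for illustration (they are not the Hadoop classes), and the SUCCEEDED and UNDEFINED branches are assumptions, since the snippet only shows the KILLED and FAILED branches. It also assumes, as the description implies, that the dispatcher-triggered shutdown leaves the job RUNNING with isSignalled set.

```java
// Hypothetical stand-ins for the Hadoop types named in the snippet above.
enum InternalState { RUNNING, SUCCEEDED, KILLED, FAILED, ERROR }

enum FinalStatus { UNDEFINED, SUCCEEDED, KILLED, FAILED }

class FinalStateSketch {
  // Mirrors the two branches quoted in the issue; other branches are assumed.
  static FinalStatus toFinalStatus(InternalState state, boolean isSignalled) {
    if (state == InternalState.SUCCEEDED) {
      return FinalStatus.SUCCEEDED; // assumed branch, not shown in the issue
    } else if (state == InternalState.KILLED
        || (state == InternalState.RUNNING && isSignalled)) {
      return FinalStatus.KILLED;
    } else if (state == InternalState.FAILED
        || state == InternalState.ERROR) {
      return FinalStatus.FAILED;
    }
    return FinalStatus.UNDEFINED; // assumed fallback
  }

  public static void main(String[] args) {
    // Job still RUNNING when the shutdown signal arrives: reported as KILLED,
    // even if the shutdown was caused by an internal error.
    System.out.println(toFinalStatus(InternalState.RUNNING, true));
    // Only an explicit ERROR/FAILED internal state is reported as FAILED.
    System.out.println(toFinalStatus(InternalState.ERROR, true));
  }
}
```

In this sketch the first call prints KILLED and the second prints FAILED, which is the mismatch the issue describes: a job that never left RUNNING cannot be reported as FAILED.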






[jira] [Commented] (MAPREDUCE-5083) MiniMRCluster should use a random component when creating an actual cluster

2013-04-04 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622537#comment-13622537
 ] 

Hitesh Shah commented on MAPREDUCE-5083:


Minor clarification - changes.txt was modified in branch-2 and branch-2.0.4 - 
trunk has some additional mayhem to clear out first.

> MiniMRCluster should use a random component when creating an actual cluster
> ---
>
> Key: MAPREDUCE-5083
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5083
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.3-alpha
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: 2.0.4-alpha
>
> Attachments: MAPREDUCE-5083-branch2.txt, MAPREDUCE-5083-trunk_2.txt, 
> MAPREDUCE-5083-trunk.txt
>
>
> Currently all unit tests end up using the same work dir - which can affect 
> anyone trying to run parallel instances.



[jira] [Resolved] (MAPREDUCE-5083) MiniMRCluster should use a random component when creating an actual cluster

2013-04-04 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved MAPREDUCE-5083.


  Resolution: Fixed
   Fix Version/s: (was: 2.0.5-beta)
  2.0.4-alpha
Target Version/s:   (was: 2.0.5-beta)
Release Note: Committed to branch-2.0.4. Modified changes.txt in trunk, 
branch-2 and branch-2.0.4 accordingly.

> MiniMRCluster should use a random component when creating an actual cluster
> ---
>
> Key: MAPREDUCE-5083
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5083
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.3-alpha
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: 2.0.4-alpha
>
> Attachments: MAPREDUCE-5083-branch2.txt, MAPREDUCE-5083-trunk_2.txt, 
> MAPREDUCE-5083-trunk.txt
>
>
> Currently all unit tests end up using the same work dir - which can affect 
> anyone trying to run parallel instances.



[jira] [Reopened] (MAPREDUCE-5083) MiniMRCluster should use a random component when creating an actual cluster

2013-04-04 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah reopened MAPREDUCE-5083:



> MiniMRCluster should use a random component when creating an actual cluster
> ---
>
> Key: MAPREDUCE-5083
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5083
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.3-alpha
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: 2.0.5-beta
>
> Attachments: MAPREDUCE-5083-branch2.txt, MAPREDUCE-5083-trunk_2.txt, 
> MAPREDUCE-5083-trunk.txt
>
>
> Currently all unit tests end up using the same work dir - which can affect 
> anyone trying to run parallel instances.



[jira] [Commented] (MAPREDUCE-5083) MiniMRCluster should use a random component when creating an actual cluster

2013-04-04 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622520#comment-13622520
 ] 

Hitesh Shah commented on MAPREDUCE-5083:


@Stack, I will be committing shortly to branch-2.0.4.

> MiniMRCluster should use a random component when creating an actual cluster
> ---
>
> Key: MAPREDUCE-5083
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5083
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.3-alpha
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: 2.0.5-beta
>
> Attachments: MAPREDUCE-5083-branch2.txt, MAPREDUCE-5083-trunk_2.txt, 
> MAPREDUCE-5083-trunk.txt
>
>
> Currently all unit tests end up using the same work dir - which can affect 
> anyone trying to run parallel instances.



[jira] [Updated] (MAPREDUCE-5088) MR Client gets an renewer token exception while Oozie is submitting a job

2013-04-03 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5088:
---

Fix Version/s: (was: 2.0.5-beta)
   (was: 3.0.0)

> MR Client gets an renewer token exception while Oozie is submitting a job
> -
>
> Key: MAPREDUCE-5088
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5088
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.3-alpha
>Reporter: Roman Shaposhnik
>Assignee: Daryn Sharp
>Priority: Blocker
> Fix For: 2.0.4-alpha
>
> Attachments: HADOOP-9409.patch, HADOOP-9409.patch, 
> MAPREDUCE-5088.patch, MAPREDUCE-5088.patch, MAPREDUCE-5088.txt
>
>
> After the fix for HADOOP-9299 I'm now getting the following bizarre exception 
> in Oozie while trying to submit a job. This also seems to be KRB related:
> {noformat}
> 2013-03-15 13:34:16,555  WARN ActionStartXCommand:542 - USER[hue] GROUP[-] 
> TOKEN[] APP[MapReduce] JOB[001-130315123130987-oozie-oozi-W] 
> ACTION[001-130315123130987-oozie-oozi-W@Sleep] Error starting action 
> [Sleep]. ErrorType [ERROR], ErrorCode [UninitializedMessageException], 
> Message [UninitializedMessageException: Message missing required fields: 
> renewer]
> org.apache.oozie.action.ActionExecutorException: 
> UninitializedMessageException: Message missing required fields: renewer
>   at 
> org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:401)
>   at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:738)
>   at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:889)
>   at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:211)
>   at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:59)
>   at org.apache.oozie.command.XCommand.call(XCommand.java:277)
>   at 
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:326)
>   at 
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:255)
>   at 
> org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: com.google.protobuf.UninitializedMessageException: Message missing 
> required fields: renewer
>   at 
> com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:605)
>   at 
> org.apache.hadoop.security.proto.SecurityProtos$GetDelegationTokenRequestProto$Builder.build(SecurityProtos.java:973)
>   at 
> org.apache.hadoop.mapreduce.v2.api.protocolrecords.impl.pb.GetDelegationTokenRequestPBImpl.mergeLocalToProto(GetDelegationTokenRequestPBImpl.java:84)
>   at 
> org.apache.hadoop.mapreduce.v2.api.protocolrecords.impl.pb.GetDelegationTokenRequestPBImpl.getProto(GetDelegationTokenRequestPBImpl.java:67)
>   at 
> org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getDelegationToken(MRClientProtocolPBClientImpl.java:200)
>   at 
> org.apache.hadoop.mapred.YARNRunner.getDelegationTokenFromHS(YARNRunner.java:194)
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:273)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1439)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215)
>   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:581)
>   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1439)
>   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:576)
>   at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:723)
>   ... 10 more
> 2013-03-15 13:34:16,555  WARN ActionStartXCommand:542 - USER[hue] GROUP[-] 
> TOKEN[] APP[MapReduce] JOB[001-13031512313
> {noform

[jira] [Commented] (MAPREDUCE-5088) MR Client gets an renewer token exception while Oozie is submitting a job

2013-04-03 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621482#comment-13621482
 ] 

Hitesh Shah commented on MAPREDUCE-5088:


Updated the fix version to 2.0.4-alpha, on the assumption that anything 
committed to 2.0.4-alpha has also been committed to trunk and branch-2.

> MR Client gets an renewer token exception while Oozie is submitting a job
> -
>
> Key: MAPREDUCE-5088
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5088
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.3-alpha
>Reporter: Roman Shaposhnik
>Assignee: Daryn Sharp
>Priority: Blocker
> Fix For: 2.0.4-alpha
>
> Attachments: HADOOP-9409.patch, HADOOP-9409.patch, 
> MAPREDUCE-5088.patch, MAPREDUCE-5088.patch, MAPREDUCE-5088.txt
>
>
> After the fix for HADOOP-9299 I'm now getting the following bizarre exception 
> in Oozie while trying to submit a job. This also seems to be KRB related:
> {noformat}
> 2013-03-15 13:34:16,555  WARN ActionStartXCommand:542 - USER[hue] GROUP[-] 
> TOKEN[] APP[MapReduce] JOB[001-130315123130987-oozie-oozi-W] 
> ACTION[001-130315123130987-oozie-oozi-W@Sleep] Error starting action 
> [Sleep]. ErrorType [ERROR], ErrorCode [UninitializedMessageException], 
> Message [UninitializedMessageException: Message missing required fields: 
> renewer]
> org.apache.oozie.action.ActionExecutorException: 
> UninitializedMessageException: Message missing required fields: renewer
>   at 
> org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:401)
>   at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:738)
>   at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:889)
>   at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:211)
>   at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:59)
>   at org.apache.oozie.command.XCommand.call(XCommand.java:277)
>   at 
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:326)
>   at 
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:255)
>   at 
> org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: com.google.protobuf.UninitializedMessageException: Message missing 
> required fields: renewer
>   at 
> com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:605)
>   at 
> org.apache.hadoop.security.proto.SecurityProtos$GetDelegationTokenRequestProto$Builder.build(SecurityProtos.java:973)
>   at 
> org.apache.hadoop.mapreduce.v2.api.protocolrecords.impl.pb.GetDelegationTokenRequestPBImpl.mergeLocalToProto(GetDelegationTokenRequestPBImpl.java:84)
>   at 
> org.apache.hadoop.mapreduce.v2.api.protocolrecords.impl.pb.GetDelegationTokenRequestPBImpl.getProto(GetDelegationTokenRequestPBImpl.java:67)
>   at 
> org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getDelegationToken(MRClientProtocolPBClientImpl.java:200)
>   at 
> org.apache.hadoop.mapred.YARNRunner.getDelegationTokenFromHS(YARNRunner.java:194)
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:273)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1439)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215)
>   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:581)
>   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1439)
>   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:576)
>   at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:723)
>   ... 10 

[jira] [Commented] (MAPREDUCE-5094) Disable mem monitoring by default in MiniMRYarnCluster

2013-03-23 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13611837#comment-13611837
 ] 

Hitesh Shah commented on MAPREDUCE-5094:


@Stack, only max pmem is configurable directly but max vmem can be configured 
indirectly via the vmem-pmem ratio ( default ratio is 2.1 ).
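
As a rough worked example of the indirect limit: assuming the virtual memory ceiling is simply the physical memory limit multiplied by the ratio (the exact rounding used by the NodeManager is not shown here and is an assumption), a container allowed 1024 MB of physical memory gets about 2150 MB of virtual memory at the default ratio of 2.1. VmemSketch and vmemLimitMb are hypothetical names used only for this illustration.

```java
class VmemSketch {
  // Assumed relationship: vmem limit = pmem limit * vmem-pmem ratio,
  // truncated to whole megabytes.
  static long vmemLimitMb(long pmemLimitMb, double vmemPmemRatio) {
    return (long) (pmemLimitMb * vmemPmemRatio);
  }

  public static void main(String[] args) {
    // 1024 MB physical at the default ratio of 2.1
    System.out.println(vmemLimitMb(1024, 2.1)); // prints 2150
  }
}
```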

> Disable mem monitoring by default in MiniMRYarnCluster
> --
>
> Key: MAPREDUCE-5094
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5094
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>
> YARN-449. Some hbase tests were failing since containers were getting killed. 
> I believe these checks are disabled by default on the branch-1 MiniMRCluster.



[jira] [Updated] (MAPREDUCE-5083) MiniMRCluster should use a random component when creating an actual cluster

2013-03-22 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5083:
---

   Resolution: Fixed
Fix Version/s: 2.0.5-beta
   Status: Resolved  (was: Patch Available)

Thanks Sid. Committed to branch-2 and trunk.

> MiniMRCluster should use a random component when creating an actual cluster
> ---
>
> Key: MAPREDUCE-5083
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5083
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.3-alpha
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: 2.0.5-beta
>
> Attachments: MAPREDUCE-5083-branch2.txt, MAPREDUCE-5083-trunk_2.txt, 
> MAPREDUCE-5083-trunk.txt
>
>
> Currently all unit tests end up using the same work dir - which can affect 
> anyone trying to run parallel instances.



[jira] [Commented] (MAPREDUCE-5083) MiniMRCluster should use a random component when creating an actual cluster

2013-03-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13611084#comment-13611084
 ] 

Hitesh Shah commented on MAPREDUCE-5083:


+1. Will commit shortly.

> MiniMRCluster should use a random component when creating an actual cluster
> ---
>
> Key: MAPREDUCE-5083
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5083
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.3-alpha
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: MAPREDUCE-5083-branch2.txt, MAPREDUCE-5083-trunk_2.txt, 
> MAPREDUCE-5083-trunk.txt
>
>
> Currently all unit tests end up using the same work dir - which can affect 
> anyone trying to run parallel instances.



[jira] [Commented] (MAPREDUCE-5083) MiniMRCluster should use a random component when creating an actual cluster

2013-03-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608254#comment-13608254
 ] 

Hitesh Shah commented on MAPREDUCE-5083:


( above change needed in MiniMRCluster.java )

> MiniMRCluster should use a random component when creating an actual cluster
> ---
>
> Key: MAPREDUCE-5083
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5083
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.3-alpha
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: MAPREDUCE-5083-branch2.txt, MAPREDUCE-5083-trunk.txt
>
>
> Currently all unit tests end up using the same work dir - which can affect 
> anyone trying to run parallel instances.



[jira] [Commented] (MAPREDUCE-5083) MiniMRCluster should use a random component when creating an actual cluster

2013-03-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608252#comment-13608252
 ] 

Hitesh Shah commented on MAPREDUCE-5083:


{code}
String identifier = this.getClass().getName()
{code}

getClass().getName() should be replaced with getClass().getSimpleName() to 
ensure things don't break on Windows.
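
For reference, a small sketch of how the two calls differ. NameSketch and its nested MiniClusterTest class are hypothetical names used only for illustration. The comment above does not say what exactly breaks on Windows; one plausible reason (an assumption) is that the fully qualified name yields longer and less path-friendly directory names.

```java
class NameSketch {
  // Hypothetical nested test class, used only to show the difference.
  static class MiniClusterTest {}

  public static void main(String[] args) {
    Class<?> c = MiniClusterTest.class;
    // Binary name: includes the package (if any), the enclosing class and '$'.
    System.out.println(c.getName());
    // Simple name: just "MiniClusterTest".
    System.out.println(c.getSimpleName());
  }
}
```

For a class declared in a package such as org.apache.hadoop.mapred, getName() would also carry the whole package prefix, while getSimpleName() stays the same.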


> MiniMRCluster should use a random component when creating an actual cluster
> ---
>
> Key: MAPREDUCE-5083
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5083
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.3-alpha
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: MAPREDUCE-5083-branch2.txt, MAPREDUCE-5083-trunk.txt
>
>
> Currently all unit tests end up using the same work dir - which can affect 
> anyone trying to run parallel instances.



[jira] [Commented] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-15 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603885#comment-13603885
 ] 

Hitesh Shah commented on MAPREDUCE-5066:


Job end notification also exists in 2.x, which may face the same set of issues. 

> JobTracker should set a timeout when calling into job.end.notification.url
> --
>
> Key: MAPREDUCE-5066
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
>
> In current code, timeout is not specified when JobTracker (JobEndNotifier) 
> calls into the notification URL. When the given URL points to a server that 
> will not respond for a long time, job notifications are completely stuck 
> (given that we have only a single thread processing all notifications). We've 
> seen this cause noticeable delays in job execution in components that rely on 
> job end notifications (like Oozie workflows). 
> I propose we introduce a configurable timeout option and set a default to a 
> reasonably small value.
> If we want, we can also introduce a configurable number of workers processing 
> the notification queue (not sure if this is needed though at this point).
> I will prepare a patch soon. Please comment back.
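
A minimal sketch of the kind of timeout proposed above, using only the standard java.net API. This is not the JobEndNotifier code or the eventual patch; NotifyTimeoutSketch, the open helper, the example URL and the 5000 ms value are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

class NotifyTimeoutSketch {
  // Prepares (but does not open) a connection to the notification URL with
  // both connect and read timeouts set, so a silent server cannot block the
  // notifying thread indefinitely.
  static HttpURLConnection open(String url, int timeoutMs) {
    try {
      HttpURLConnection conn =
          (HttpURLConnection) URI.create(url).toURL().openConnection();
      conn.setConnectTimeout(timeoutMs); // limit on establishing the connection
      conn.setReadTimeout(timeoutMs);    // limit on waiting for response data
      return conn;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Hypothetical notification URL and timeout value.
    HttpURLConnection conn = open("http://localhost:8080/notify?jobId=job_1", 5000);
    System.out.println(conn.getConnectTimeout() + " " + conn.getReadTimeout());
  }
}
```

Without these calls both timeouts default to 0, which java.net treats as "wait forever"; that matches the stuck-notification behaviour described in the issue. In a real notifier the timeout would come from a configuration property rather than a literal.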



[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-15 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5066:
---

Affects Version/s: 2.0.3-alpha

> JobTracker should set a timeout when calling into job.end.notification.url
> --
>
> Key: MAPREDUCE-5066
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
>
> In current code, timeout is not specified when JobTracker (JobEndNotifier) 
> calls into the notification URL. When the given URL points to a server that 
> will not respond for a long time, job notifications are completely stuck 
> (given that we have only a single thread processing all notifications). We've 
> seen this cause noticeable delays in job execution in components that rely on 
> job end notifications (like Oozie workflows). 
> I propose we introduce a configurable timeout option and set a default to a 
> reasonably small value.
> If we want, we can also introduce a configurable number of workers processing 
> the notification queue (not sure if this is needed though at this point).
> I will prepare a patch soon. Please comment back.



[jira] [Updated] (MAPREDUCE-4442) Accessing hadoop counters from a job is unreliable in yarn during AM process cleanup window

2013-02-20 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4442:
---

Labels: usability  (was: )

> Accessing hadoop counters from a job is unreliable in yarn during AM process 
> cleanup  window
> 
>
> Key: MAPREDUCE-4442
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4442
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Rahul Jain
>  Labels: usability
> Attachments: am_logs_counter_failure.html, 
> rsrc_mgr_logs_counter_failed.txt
>
>
> We found this issue during our tests moving from MapReduceV1 to MapReduceV2. 
> A few of our applications access job counters multiple times:
> a) After submission of the job, while the job is executing (works fine)
> b) Right after job complete notification is received (works fine)
> c) Few seconds after job complete notification (fails most of the time).
> The error snippet is as follows:
> {code}
> 2012-07-12 19:12:29,039 WARN  [Client] Unexpected error reading responses on 
> connection Thread[IPC Client (1252749669) connection to 
> sjc1-ciq-ibm-grid07.carrieriq.com/10.202.50.187:47944 from hadoop,5,main]
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:852)
>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:781)
> 2012-07-12 19:12:29,044 INFO  [ClientServiceDelegate] Application state is 
> completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
> 2012-07-12 19:12:29,132 INFO  [ClientServiceDelegate] Application state is 
> completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
> 2012-07-12 19:12:29,216 ERROR [UserGroupInformation] 
> PriviledgedActionException as:hadoop (auth:SIMPLE) cause:java.io.IOException
> 2012-07-12 19:12:29,216 WARN  [BaseOutputStageJob] getJobCounters: Unable to 
> retrieve counters. null
> java.io.IOException
>   at 
> org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:315)
>   at 
> org.apache.hadoop.mapred.ClientServiceDelegate.getJobCounters(ClientServiceDelegate.java:335)
>   at 
> org.apache.hadoop.mapred.YARNRunner.getJobCounters(YARNRunner.java:470)
>   at org.apache.hadoop.mapreduce.Job$8.run(Job.java:719)
>   at org.apache.hadoop.mapreduce.Job$8.run(Job.java:716)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:716)
>   at 
> org.apache.hadoop.mapred.JobClient$NetworkedJob.getCounters(JobClient.java:396)
> {code}
> The connection to 10.202.50.187:47944 is actually the connection to the AM; 
> it appears that we are connecting to the AM to get the counters for the 
> successful job and not yet to the history server.
>  
> I'll attach the logs for AM and resource mgr separately, however no unusual 
> activity is seen in those.
> This makes me suspect that we have a race condition in the code trying to 
> access job counters when AM is finishing up and the job hasn't moved to 
> history server yet.



[jira] [Updated] (MAPREDUCE-4648) Diagnostics from AM are missing from job history

2013-02-12 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4648:
---

Labels: usability  (was: )

> Diagnostics from AM are missing from job history
> 
>
> Key: MAPREDUCE-4648
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4648
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.0, 2.0.0-alpha
>Reporter: Jason Lowe
>  Labels: usability
>
> When a job fails during setup or commit, any diagnostics from the MapReduce 
> ApplicationMaster are not available in the job history.  Currently the 
> diagnostics for the job are collected from the diagnostics of tasks run for 
> the job, but the AM has no corresponding task record in the job history.



[jira] [Updated] (MAPREDUCE-4693) Historyserver should provide counters for failed tasks

2013-02-12 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4693:
---

Labels: usability  (was: )

> Historyserver should provide counters for failed tasks
> --
>
> Key: MAPREDUCE-4693
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4693
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, mrv2
>Reporter: Jason Lowe
>  Labels: usability
>
> Currently the historyserver is not providing counters for failed tasks, even 
> though they are available via the AM as long as the job is still running.  
> Those counters are lost when the client needs to redirect to the 
> historyserver after the job completes.



[jira] [Updated] (MAPREDUCE-4692) Investigate and remove MR1 JTConfig and its constants use in the MR project on trunk

2013-02-12 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4692:
---

Labels: usability  (was: )

> Investigate and remove MR1 JTConfig and its constants use in the MR project 
> on trunk
> 
>
> Key: MAPREDUCE-4692
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4692
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Reporter: Harsh J
>Priority: Minor
>  Labels: usability
>
> Filed on behalf of Robert from MAPREDUCE-3223
> {quote}
> Are there any JIRAs to deprecate the configs from where they reside in the 
> code? 
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/server/jobtracker/JTConfig.java
>  for example. I know we cannot delete them just yet, because MRV1 code 
> still exists and may be using them, but it would be good to mark all of those 
> configs as deprecated, so that we can delete them in trunk once the MRV1 code 
> is completely removed.
> {quote}



[jira] [Updated] (MAPREDUCE-4704) TaskHeartbeatHandler misreports a ping timeout as a task timeout

2013-02-12 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4704:
---

Labels: usability  (was: )

> TaskHeartbeatHandler misreports a ping timeout as a task timeout
> 
>
> Key: MAPREDUCE-4704
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4704
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am, mrv2
>Affects Versions: 0.23.3
>Reporter: Jason Lowe
>Priority: Minor
>  Labels: usability
>
> When a task fails to ping within the hardcoded ping timeout of 5 minutes, 
> TaskHeartbeatHandler logs a message reporting the wrong timeout value.  It 
> reports a timeout of mapreduce.task.timeout seconds rather than the 5 minute 
> ping timeout.
> This can confuse users who increase mapreduce.task.timeout and see the log 
> message show the larger value while the task still times out after only 
> 5 minutes.
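
As a minimal sketch of the behaviour the report asks for, the two timeouts can be kept apart so that the diagnostic names the one that actually expired. The class and method names here are hypothetical and this is not the TaskHeartbeatHandler code; the 5 minute ping timeout comes from the description above, and the sketch assumes all values are in milliseconds.

```java
// Hypothetical sketch; not the actual TaskHeartbeatHandler implementation.
final class TimeoutReport {
    // Hardcoded ping timeout of 5 minutes, as stated in the issue description.
    static final long PING_TIMEOUT_MS = 5 * 60 * 1000L;

    /**
     * Returns a diagnostic naming the timeout that actually expired,
     * or null if neither timeout has expired.
     * Assumption: taskTimeoutMs <= 0 means the task timeout is disabled.
     */
    static String diagnose(long msSinceLastPing, long msSinceLastProgress, long taskTimeoutMs) {
        if (msSinceLastPing > PING_TIMEOUT_MS) {
            return "Timed out after " + (PING_TIMEOUT_MS / 1000) + " secs without a ping";
        }
        if (taskTimeoutMs > 0 && msSinceLastProgress > taskTimeoutMs) {
            return "Timed out after " + (taskTimeoutMs / 1000) + " secs without progress";
        }
        return null;
    }
}
```

With this split, raising the task timeout changes only the second message, so the log would no longer suggest that the larger value applies to a missed ping.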



[jira] [Updated] (MAPREDUCE-4818) Easier identification of tasks that timeout during localization

2013-02-12 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4818:
---

Labels: usability  (was: )

> Easier identification of tasks that timeout during localization
> ---
>
> Key: MAPREDUCE-4818
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4818
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Affects Versions: 0.23.3, 2.0.3-alpha
>Reporter: Jason Lowe
>  Labels: usability
>
> When a task is taking too long to localize and is killed by the AM due to 
> task timeout, the job UI/history is not very helpful.  The attempt simply 
> lists a diagnostic stating it was killed due to timeout, but there are no 
> logs for the attempt since it never actually got started.  There are log 
> messages on the NM that show the container never made it past localization by 
> the time it was killed, but users often do not have access to those logs.



[jira] [Updated] (MAPREDUCE-4794) DefaultSpeculator generates error messages on normal shutdown

2013-02-12 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4794:
---

Labels: usability  (was: )

> DefaultSpeculator generates error messages on normal shutdown
> -
>
> Key: MAPREDUCE-4794
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4794
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 0.23.3, 2.0.1-alpha
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>  Labels: usability
> Attachments: MAPREDUCE-4794.patch
>
>
> DefaultSpeculator can log the following error message on a normal shutdown of 
> the ApplicationMaster:
> {noformat}
> 2012-11-13 01:35:31,841 ERROR [DefaultSpeculator background processing] 
> org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: Background 
> thread returning, interrupted : java.lang.InterruptedException
> {noformat}
> and in addition for some reason it logs the corresponding backtrace to stdout.
> Like the errors fixed in MAPREDUCE-4741, this error message in the syslog and 
> backtrace on stdout can be confusing to users as to whether the job really 
> succeeded.
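
One common pattern for this kind of problem, shown here only as a sketch, is to record that shutdown was requested before interrupting the background thread, and to log the interrupt at INFO level in that case. The class below is hypothetical and is not the DefaultSpeculator code or the attached patch; the log strings are placeholders.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; not the actual DefaultSpeculator implementation.
final class BackgroundWorker implements Runnable {
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private volatile String lastLog = "";

    /** Requests shutdown: set the flag first, then interrupt the worker thread. */
    void stop(Thread worker) {
        stopped.set(true);
        worker.interrupt();
    }

    String lastLog() {
        return lastLog;
    }

    @Override
    public void run() {
        try {
            while (!stopped.get()) {
                Thread.sleep(50L); // stand-in for the periodic background work
            }
            lastLog = "INFO: background thread stopped";
        } catch (InterruptedException e) {
            // An interrupt after stop() is a normal shutdown, not an error.
            lastLog = stopped.get()
                ? "INFO: background thread stopped"
                : "ERROR: background thread interrupted unexpectedly";
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
    }
}
```

Only an interrupt that arrives without a prior stop request is reported as an error, which keeps a normal shutdown out of the error log.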



[jira] [Commented] (MAPREDUCE-4997) Deprecate mapreduce.jobtracker.address

2013-02-11 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13576013#comment-13576013
 ] 

Hitesh Shah commented on MAPREDUCE-4997:


For users to transition from MR1 to MR2, we are talking about a full cluster 
change - replacing the JT/TTs with new daemons - RM/NMs. In this scenario, the 
users would be well aware of the change and therefore have to make the 
necessary config changes too. Therefore, it seems preferable not to support 
mapreduce.jobtracker.address, so as not to give them the wrong impression 
that the RM is a JT replacement.

> Deprecate mapreduce.jobtracker.address
> --
>
> Key: MAPREDUCE-4997
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4997
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>
> mapreduce.jobtracker.address currently is not used, but users transitioning 
> from mr1 to mr2 may expect their previous job configs to work, so it should 
> be deprecated in favor of yarn.resourcemanager.address.



[jira] [Commented] (MAPREDUCE-4994) -jt generic command line option does not work

2013-02-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575341#comment-13575341
 ] 

Hitesh Shah commented on MAPREDUCE-4994:


Also, is there any reason to make the hadoop command line YARN- and 
resourcemanager-aware? Ignoring what was supported in earlier versions, would 
it be preferable going forward to make the local runner option part of, say, 
a mapred command-line option? 

> -jt generic command line option does not work
> -
>
> Key: MAPREDUCE-4994
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4994
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-4994-1.patch, MAPREDUCE-4994.patch
>
>
> hadoop jar myjar.jar MyDriver -fs file:/// -jt local input.txt output/
> should run a job using the local file system and the local job runner. 
> Instead it tries to connect to a jobtracker.
> hadoop jar myjar.jar MyDriver -fs file:/// -jt host:port input.txt output/
> does not use the given host/port
> This appears to be because Cluster#initialize, which loads the 
> ClientProtocol, contains no special handling for mapred.job.tracker.



[jira] [Commented] (MAPREDUCE-4994) -jt generic command line option does not work

2013-02-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575340#comment-13575340
 ] 

Hitesh Shah commented on MAPREDUCE-4994:


It makes sense to remove -jt as there is no notion of jobtracker anywhere in 
2.x 

> -jt generic command line option does not work
> -
>
> Key: MAPREDUCE-4994
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4994
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-4994-1.patch, MAPREDUCE-4994.patch
>
>
> hadoop jar myjar.jar MyDriver -fs file:/// -jt local input.txt output/
> should run a job using the local file system and the local job runner. 
> Instead it tries to connect to a jobtracker.
> hadoop jar myjar.jar MyDriver -fs file:/// -jt host:port input.txt output/
> does not use the given host/port
> This appears to be because Cluster#initialize, which loads the 
> ClientProtocol, contains no special handling for mapred.job.tracker.



[jira] [Commented] (MAPREDUCE-4143) ApplicationMaster retry times should be set by Client

2013-02-05 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572070#comment-13572070
 ] 

Hitesh Shah commented on MAPREDUCE-4143:


Seems like a reasonable feature to have with a slight caveat that the retry 
limit should be bounded by the limit configured on the RM. A client should not 
be able to set retry limit to 1000 for example.
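
The caveat above amounts to clamping the client's value to the RM's configured limit. A minimal sketch, with a hypothetical class and method name that are not part of any YARN API, and with the fallback for unset values being an assumption:

```java
// Hypothetical sketch; the names are illustrative and not part of YARN.
final class AmRetryLimit {
    /**
     * Returns the retry limit to apply: the client's request, capped by
     * the limit configured on the RM.
     */
    static int effectiveMaxRetries(int clientRequested, int rmConfiguredMax) {
        if (rmConfiguredMax < 1) {
            throw new IllegalArgumentException("RM limit must be at least 1");
        }
        // Assumption: a non-positive client value means "not set", so use the RM limit.
        if (clientRequested < 1) {
            return rmConfiguredMax;
        }
        return Math.min(clientRequested, rmConfiguredMax);
    }
}
```

Under this rule a client asking for 1000 retries on a cluster configured for 4 would get 4, while a client asking for 2 would get 2.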

> ApplicationMaster retry times should be set by Client
> -
>
> Key: MAPREDUCE-4143
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4143
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 0.23.1
> Environment: suse
>Reporter: xieguiming
>Priority: Minor
>
> We should allow different clients or users to have different 
> ApplicationMaster retry limits. In other words, 
> "yarn.resourcemanager.am.max-retries" should be settable by the client. 



[jira] [Updated] (MAPREDUCE-4837) Add webservices for jobtracker

2013-01-29 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4837:
---

   Resolution: Fixed
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

Thanks Arun. Committed to branch-1.

> Add webservices for jobtracker
> --
>
> Key: MAPREDUCE-4837
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4837
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Fix For: 1.2.0
>
> Attachments: MAPREDUCE-4837.patch
>
>
> Add MR-AM web-services to branch-1



[jira] [Updated] (MAPREDUCE-4837) Add webservices for jobtracker

2013-01-29 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4837:
---

Summary: Add webservices for jobtracker  (was: Add MR-AM web-services to 
branch-1)

> Add webservices for jobtracker
> --
>
> Key: MAPREDUCE-4837
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4837
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Attachments: MAPREDUCE-4837.patch
>
>
> Add MR-AM web-services to branch-1

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4837) Add MR-AM web-services to branch-1

2013-01-28 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564701#comment-13564701
 ] 

Hitesh Shah commented on MAPREDUCE-4837:


Code changes seem straightforward and look fine. Applied patch and verified 
"format=json"-based calls manually against branch-1. 

+1 assuming output of test-patch on branch-1 does not throw up any issues.



> Add MR-AM web-services to branch-1
> --
>
> Key: MAPREDUCE-4837
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4837
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Attachments: MAPREDUCE-4837.patch
>
>
> Add MR-AM web-services to branch-1



[jira] [Commented] (MAPREDUCE-4951) Container preemption interpreted as task failure

2013-01-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560174#comment-13560174
 ] 

Hitesh Shah commented on MAPREDUCE-4951:


@Jason, having the RM ask the AM to kill the container in case of preemption 
would likely not work as the AM cannot be trusted. Obviously, there could be a 
different approach where the RM informs the AM that a particular container will 
be preempted soon but the RM eventually would need to trigger a kill for that 
container after a certain delay if it is still up.


 

> Container preemption interpreted as task failure
> 
>
> Key: MAPREDUCE-4951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mr-am, mrv2
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-4951-1.patch, MAPREDUCE-4951.patch
>
>
> When YARN reports a completed container to the MR AM, it always interprets it 
> as a failure.  This can lead to a job failing because too many of its tasks 
> failed, when in fact they only failed because the scheduler preempted them.
> MR needs to recognize the special exit code value of -100 and interpret it as 
> a container being killed instead of a container failure.
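
The requested behaviour can be sketched as a small classification step. The exit code -100 comes from the description above; the class, enum, and method names are hypothetical, and treating 0 as success is an assumption made for the illustration. This is not the code from the attached patches.

```java
// Hypothetical sketch; names are illustrative, not the MR AM's actual code.
final class ContainerExit {
    enum Outcome { SUCCEEDED, KILLED, FAILED }

    // Special exit code named in the issue description for preempted containers.
    static final int PREEMPTED_EXIT_CODE = -100;

    static Outcome classify(int exitStatus) {
        if (exitStatus == PREEMPTED_EXIT_CODE) {
            return Outcome.KILLED; // preempted by the scheduler, not the task's fault
        }
        // Assumption: an exit status of 0 means the container finished normally.
        return exitStatus == 0 ? Outcome.SUCCEEDED : Outcome.FAILED;
    }

    /** Only genuine failures should count toward the job's failed-task limit. */
    static boolean countsAsTaskFailure(int exitStatus) {
        return classify(exitStatus) == Outcome.FAILED;
    }
}
```

A preempted container then no longer pushes the job toward its failure threshold, which is the symptom the report describes.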

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4508) YARN needs to properly check the NM,AM memory properties in yarn-site.xml and mapred.xml and report errors accordingly.

2012-08-28 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443685#comment-13443685
 ] 

Hitesh Shah commented on MAPREDUCE-4508:


Filed MAPREDUCE-4578 for the issue mentioned in the previous comment.

> YARN needs to properly check the NM,AM memory properties in yarn-site.xml and 
> mapred.xml and report errors accordingly.
> ---
>
> Key: MAPREDUCE-4508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: nodemanager, resourcemanager
>Affects Versions: 2.0.0-alpha
> Environment: CentOs6.0, Hadoop2.0.0 Alpha
>Reporter: Anil Gupta
>  Labels: Map, Reduce, YARN
>
> Please refer to this discussion on the Hadoop Mailing list:
> http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/33110
> Summary:
> I was running YARN (Hadoop 2.0.0 Alpha) on an 8 datanode, 4 admin node 
> Hadoop/HBase cluster. My datanodes had only 3.2 GB of memory, so I 
> configured the yarn.nodemanager.resource.memory-mb property in yarn-site.xml 
> to 1200. After setting the property, if I run any YARN job, the 
> NodeManager won't be able to start any Map task, since by default the 
> yarn.app.mapreduce.am.resource.mb property is set to 1500 MB in 
> mapred-site.xml. 
> Expected Behavior: NodeManager should give an error if 
> yarn.app.mapreduce.am.resource.mb >= yarn.nodemanager.resource.memory-mb.
> Please let me know if more information is required.



[jira] [Created] (MAPREDUCE-4578) Handle container requests that request more resources than available in the cluster

2012-08-22 Thread Hitesh Shah (JIRA)
Hitesh Shah created MAPREDUCE-4578:
--

 Summary: Handle container requests that request more resources 
than available in the cluster
 Key: MAPREDUCE-4578
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4578
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.0-alpha, 0.23.0
Reporter: Hitesh Shah


In heterogeneous clusters, a simple scheduler check that the allocation 
request is within the max allocatable range is not enough. 

If there are large nodes in the cluster which are not available, there may be 
situations where some allocation requests will never be fulfilled. We need an 
approach to decide when to invalidate such requests. For application 
submissions, there will need to be a feedback loop for applications that could 
not be launched. For running AMs, AllocationResponse may need to be augmented 
with information for invalidated/cancelled container requests.
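
To illustrate the gap described above, here is a minimal sketch that contrasts the static max-allocation check with a check against the nodes that are currently live. The class and method names are hypothetical and memory in MB is the only resource modelled; when and how a real scheduler should invalidate a request is exactly the open question this issue raises.

```java
import java.util.List;

// Hypothetical sketch; not the RM scheduler's actual logic.
final class RequestFeasibility {
    /** The simple check: is the request within the configured max allocation? */
    static boolean withinSchedulerMax(int requestedMb, int schedulerMaxMb) {
        return requestedMb <= schedulerMaxMb;
    }

    /** The stricter check: can at least one currently live node hold the request? */
    static boolean fitsOnSomeLiveNode(int requestedMb, List<Integer> liveNodeCapacitiesMb) {
        for (int capacityMb : liveNodeCapacitiesMb) {
            if (capacityMb >= requestedMb) {
                return true;
            }
        }
        return false;
    }
}
```

A request for 8192 MB passes the first check when the max allocation is 8192 MB, yet fails the second if the only large node is down and the remaining nodes offer 4096 MB each.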

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-4508) YARN needs to properly check the NM,AM memory properties in yarn-site.xml and mapred.xml and report errors accordingly.

2012-08-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439943#comment-13439943
 ] 

Hitesh Shah commented on MAPREDUCE-4508:


Sorry for the late reply. I don't believe that an error should be thrown when 
the AM requested memory is greater than the NM memory. I believe this is more 
of a configuration bug where the scheduler max allocation should be set such 
that an error is thrown for any AM requesting more than that. The RM should 
error out if the max scheduler allocation for a single container is less than 
the resources required to launch a new AM. Please let me know if you have seen 
something contrary to this. 

However, depending on how the scheduler max allocation is configured, there 
will be situations in heterogeneous clusters where certain nodes are down, 
creating holes that cause requests for large amounts of resources/memory to 
wait indefinitely. This needs to be addressed separately and is a bit more 
tricky in terms of when to decide that the allocation request cannot be 
fulfilled (whether for a new AM or for container requests by an AM). I will 
file a separate jira for that.  
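
The configuration check suggested in the first paragraph of this comment could look roughly like the sketch below. The class name and error message are hypothetical; the idea is only that the RM refuses a configuration in which the largest container it may grant is smaller than what a new AM needs.

```java
// Hypothetical sketch of an RM-side sanity check; not actual YARN code.
final class AllocationConfigCheck {
    /**
     * Throws if the scheduler's max allocation for a single container
     * cannot accommodate the resources required to launch a new AM.
     */
    static void validate(int schedulerMaxAllocationMb, int amResourceMb) {
        if (schedulerMaxAllocationMb < amResourceMb) {
            throw new IllegalStateException(
                "Scheduler max allocation (" + schedulerMaxAllocationMb
                + " MB) is less than the AM resource requirement ("
                + amResourceMb + " MB)");
        }
    }
}
```

With the numbers from the report, a max allocation of 1200 MB against an AM requirement of 1500 MB would fail this check with a clear message instead of leaving the job waiting.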



> YARN needs to properly check the NM,AM memory properties in yarn-site.xml and 
> mapred.xml and report errors accordingly.
> ---
>
> Key: MAPREDUCE-4508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: nodemanager, resourcemanager
>Affects Versions: 2.0.0-alpha
> Environment: CentOs6.0, Hadoop2.0.0 Alpha
>Reporter: Anil Gupta
>  Labels: Map, Reduce, YARN
>
> Please refer to this discussion on the Hadoop Mailing list:
> http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/33110
> Summary:
> I was running YARN (Hadoop 2.0.0 Alpha) on an 8 datanode, 4 admin node 
> Hadoop/HBase cluster. My datanodes had only 3.2 GB of memory, so I 
> configured the yarn.nodemanager.resource.memory-mb property in yarn-site.xml 
> to 1200. After setting the property, if I run any YARN job, the 
> NodeManager won't be able to start any Map task, since by default the 
> yarn.app.mapreduce.am.resource.mb property is set to 1500 MB in 
> mapred-site.xml. 
> Expected Behavior: NodeManager should give an error if 
> yarn.app.mapreduce.am.resource.mb >= yarn.nodemanager.resource.memory-mb.
> Please let me know if more information is required.





[jira] [Commented] (MAPREDUCE-4508) YARN needs to properly check the NM,AM memory properties in yarn-site.xml and mapred.xml and report errors accordingly.

2012-08-02 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427751#comment-13427751
 ] 

Hitesh Shah commented on MAPREDUCE-4508:


Seems like a dup of MAPREDUCE-3796

> YARN needs to properly check the NM,AM memory properties in yarn-site.xml and 
> mapred.xml and report errors accordingly.
> ---
>
> Key: MAPREDUCE-4508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: nodemanager, resourcemanager
>Affects Versions: 2.0.0-alpha
> Environment: CentOs6.0, Hadoop2.0.0 Alpha
>Reporter: Anil Gupta
>  Labels: Map, Reduce, YARN
>
> Please refer to this discussion on the Hadoop Mailing list:
> http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/33110
> Summary:
> I was running YARN (Hadoop 2.0.0 Alpha) on an 8 datanode, 4 admin node 
> Hadoop/HBase cluster. My datanodes had only 3.2 GB of memory, so I 
> configured the yarn.nodemanager.resource.memory-mb property in yarn-site.xml 
> to 1200. After setting the property, if I run any YARN job, the 
> NodeManager won't be able to start any Map task, since by default the 
> yarn.app.mapreduce.am.resource.mb property is set to 1500 MB in 
> mapred-site.xml. 
> Expected Behavior: NodeManager should give an error if 
> yarn.app.mapreduce.am.resource.mb >= yarn.nodemanager.resource.memory-mb.
> Please let me know if more information is required.





[jira] [Updated] (MAPREDUCE-2719) MR-279: Write a shell command application

2011-09-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-2719:
---

Status: Open  (was: Patch Available)

Cancelling patch as the unit test should fail until 3067 is addressed. 

> MR-279: Write a shell command application
> -
>
> Key: MAPREDUCE-2719
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2719
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv2
>Reporter: Sharad Agarwal
>Assignee: Hitesh Shah
> Fix For: 0.23.0
>
> Attachments: MR-2179.1.patch, MR-2179.2.patch, MR-2179.3.patch, 
> mr-2719.wip.patch
>
>
> With nextgen hadoop (mrv2), it is simple to write non-MR applications. Write 
> an ApplicationMaster (and a corresponding simple client) to submit and run a 
> shell command application in the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2719) MR-279: Write a shell command application

2011-09-25 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114388#comment-13114388
 ] 

Hitesh Shah commented on MAPREDUCE-2719:


Additional javac warnings due to: 

[WARNING]
[WARNING] Some problems were encountered while building the effective model for 
org.apache.hadoop:hadoop-yarn-applications-distributedshell:jar:0.24.0-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.rat:apache-rat-plugin 
is missing. @ org.apache.hadoop:hadoop-yarn:${yarn.version}, 
/Users/Hitesh/dev/hadoop-common/hadoop-mapreduce-project/hadoop-yarn/pom.xml, 
line 389, column 15
[WARNING]
[WARNING] Some problems were encountered while building the effective model for 
org.apache.hadoop:hadoop-yarn-applications:pom:0.24.0-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.rat:apache-rat-plugin 
is missing. @ org.apache.hadoop:hadoop-yarn:${yarn.version}, 
/Users/Hitesh/dev/hadoop-common/hadoop-mapreduce-project/hadoop-yarn/pom.xml, 
line 389, column 15



> MR-279: Write a shell command application
> -
>
> Key: MAPREDUCE-2719
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2719
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv2
>Reporter: Sharad Agarwal
>Assignee: Hitesh Shah
> Fix For: 0.23.0
>
> Attachments: MR-2179.1.patch, MR-2179.2.patch, MR-2179.3.patch, 
> mr-2719.wip.patch
>
>
> With nextgen hadoop (mrv2), it is simple to write non-MR applications. Write 
> an ApplicationMaster (and a corresponding simple client) to submit and run a 
> shell command application in the cluster.





[jira] [Updated] (MAPREDUCE-2719) MR-279: Write a shell command application

2011-09-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-2719:
---

Status: Patch Available  (was: Open)

Local build still shows some javadoc warnings:

[WARNING] 
hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationToken.java:33:
 warning - Tag @link: reference not found: HttpServletRequest  

- these are not addressed in the patch.



> MR-279: Write a shell command application
> -
>
> Key: MAPREDUCE-2719
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2719
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv2
>Reporter: Sharad Agarwal
>Assignee: Hitesh Shah
> Fix For: 0.23.0
>
> Attachments: MR-2179.1.patch, MR-2179.2.patch, MR-2179.3.patch, 
> mr-2719.wip.patch
>
>
> With nextgen hadoop (mrv2), it is simple to write non-MR applications. Write 
> an ApplicationMaster (and a corresponding simple client) to submit and run a 
> shell command application in the cluster.





[jira] [Updated] (MAPREDUCE-2719) MR-279: Write a shell command application

2011-09-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-2719:
---

Attachment: MR-2179.3.patch

Addressed patch warnings reported by automated build.

Raised the resource allocation because the container was being killed by the 
monitoring layer, causing the unit test to fail. 

> MR-279: Write a shell command application
> -
>
> Key: MAPREDUCE-2719
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2719
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv2
>Reporter: Sharad Agarwal
>Assignee: Hitesh Shah
> Fix For: 0.23.0
>
> Attachments: MR-2179.1.patch, MR-2179.2.patch, MR-2179.3.patch, 
> mr-2719.wip.patch
>
>
> With nextgen hadoop (mrv2), it is simple to write non-MR applications. Write 
> an ApplicationMaster (and a corresponding simple client) to submit and run a 
> shell command application in the cluster.





[jira] [Updated] (MAPREDUCE-2719) MR-279: Write a shell command application

2011-09-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-2719:
---

Status: Open  (was: Patch Available)

> MR-279: Write a shell command application
> -
>
> Key: MAPREDUCE-2719
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2719
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv2
>Reporter: Sharad Agarwal
>Assignee: Hitesh Shah
> Fix For: 0.23.0
>
> Attachments: MR-2179.1.patch, MR-2179.2.patch, mr-2719.wip.patch
>
>
> With nextgen hadoop (mrv2), it is simple to write non-MR applications. Write 
> an ApplicationMaster (and a corresponding simple client) to submit and run a 
> shell command application in the cluster.





[jira] [Updated] (MAPREDUCE-2719) MR-279: Write a shell command application

2011-09-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-2719:
---

Attachment: MR-2179.2.patch

Updated patch with a very simple integration test that deploys and runs the 
distributed shell (ds) app master on the MiniYARNCluster. 

> MR-279: Write a shell command application
> -
>
> Key: MAPREDUCE-2719
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2719
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv2
>Reporter: Sharad Agarwal
>Assignee: Hitesh Shah
> Attachments: MR-2179.1.patch, MR-2179.2.patch, mr-2719.wip.patch
>
>
> With nextgen hadoop (mrv2), it is simple to write non-MR applications. Write 
> an ApplicationMaster (and a corresponding simple client) to submit and run a 
> shell command application in the cluster.





[jira] [Updated] (MAPREDUCE-2719) MR-279: Write a shell command application

2011-09-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-2719:
---

Status: Patch Available  (was: Open)

> MR-279: Write a shell command application
> -
>
> Key: MAPREDUCE-2719
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2719
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: mrv2
> Reporter: Sharad Agarwal
> Assignee: Hitesh Shah
> Attachments: MR-2179.1.patch, MR-2179.2.patch, mr-2719.wip.patch
>
>
> With nextgen hadoop (mrv2), it is simple to write non-MR applications. Write 
> an ApplicationMaster (and a corresponding simple client) to submit and run a 
> shell command application in the cluster.





[jira] [Updated] (MAPREDUCE-2719) MR-279: Write a shell command application

2011-09-22 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-2719:
---

Attachment: MR-2179.1.patch

Attaching code with relevant pom files to create a new module. 

Current structure is 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/
 



> MR-279: Write a shell command application
> -
>
> Key: MAPREDUCE-2719
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2719
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: mrv2
> Reporter: Sharad Agarwal
> Assignee: Hitesh Shah
> Attachments: MR-2179.1.patch, mr-2719.wip.patch
>
>
> With nextgen hadoop (mrv2), it is simple to write non-MR applications. Write 
> an ApplicationMaster (and a corresponding simple client) to submit and run a 
> shell command application in the cluster.





[jira] [Commented] (MAPREDUCE-2719) MR-279: Write a shell command application

2011-09-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113067#comment-13113067
 ] 

Hitesh Shah commented on MAPREDUCE-2719:


Tests still pending. 

> MR-279: Write a shell command application
> -
>
> Key: MAPREDUCE-2719
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2719
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: mrv2
> Reporter: Sharad Agarwal
> Assignee: Hitesh Shah
> Attachments: MR-2179.1.patch, mr-2719.wip.patch
>
>
> With nextgen hadoop (mrv2), it is simple to write non-MR applications. Write 
> an ApplicationMaster (and a corresponding simple client) to submit and run a 
> shell command application in the cluster.





[jira] [Commented] (MAPREDUCE-3067) Container exit status not set properly to launched process's exit code on successful completion of process

2011-09-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112349#comment-13112349
 ] 

Hitesh Shah commented on MAPREDUCE-3067:


I dropped a couple of words in my previous comment. The code in 
RMContainerAllocator currently keeps a count of completed maps and reduces but 
does not seem to check the exit status. For the sake of documentation, it would 
be good if you could clarify why the exit status does not need to be checked 
for map/reduce task containers.
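
To make the question concrete, here is a minimal sketch of what checking the 
exit status while counting completed containers could look like. It is not the 
RMContainerAllocator code: the Status class, the tally method and the 
"zero means success" rule are all hypothetical stand-ins for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the MR/YARN code base.
public class CompletionCountSketch {

    // Stand-in for a completed-container report (id plus exit status).
    static final class Status {
        final String containerId;
        final int exitStatus;

        Status(String containerId, int exitStatus) {
            this.containerId = containerId;
            this.exitStatus = exitStatus;
        }
    }

    // Returns {succeeded, failed}; assumes exit status 0 denotes success.
    static int[] tally(List<Status> completed) {
        int succeeded = 0;
        int failed = 0;
        for (Status s : completed) {
            if (s.exitStatus == 0) {
                succeeded++;
            } else {
                failed++;
            }
        }
        return new int[] {succeeded, failed};
    }

    public static void main(String[] args) {
        List<Status> completed = Arrays.asList(
                new Status("container_a", 0),
                new Status("container_b", -1000));
        int[] counts = tally(completed);
        System.out.println("succeeded=" + counts[0] + " failed=" + counts[1]);
    }
}
```

A counter that ignores the exit status would report both containers above as 
completed without distinguishing them.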

> Container exit status not set properly to launched process's exit code on 
> successful completion of process
> --
>
> Key: MAPREDUCE-3067
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3067
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.0
> Reporter: Hitesh Shah
> Assignee: Hitesh Shah
> Fix For: 0.23.0
>
>
> When testing the distributed shell sample app master, the container exit 
> status was being returned incorrectly. 
> 11/09/21 11:32:58 INFO DistributedShell.ApplicationMaster: Got container 
> status for containerID= container_1316629955324_0001_01_02, 
> state=COMPLETE, exitStatus=-1000, diagnostics=





[jira] [Commented] (MAPREDUCE-3067) Container exit status not set properly to launched process's exit code on successful completion of process

2011-09-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112331#comment-13112331
 ] 

Hitesh Shah commented on MAPREDUCE-3067:


Second aspect to this is the exit status is checked on completion of map or 
reduce tasks.

> Container exit status not set properly to launched process's exit code on 
> successful completion of process
> --
>
> Key: MAPREDUCE-3067
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3067
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.0
> Reporter: Hitesh Shah
> Fix For: 0.23.0
>
>
> When testing the distributed shell sample app master, the container exit 
> status was being returned incorrectly. 
> 11/09/21 11:32:58 INFO DistributedShell.ApplicationMaster: Got container 
> status for containerID= container_1316629955324_0001_01_02, 
> state=COMPLETE, exitStatus=-1000, diagnostics=





[jira] [Commented] (MAPREDUCE-3067) Container exit status not set properly to launched process's exit code on successful completion of process

2011-09-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112329#comment-13112329
 ] 

Hitesh Shah commented on MAPREDUCE-3067:


Possible patch for addressing part of the issue. 

--- a/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
+++ b/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
@@ -554,6 +554,9 @@ public class ContainerImpl implements Container {
   static class ExitedWithSuccessTransition extends ContainerTransition {
     @Override
     public void transition(ContainerImpl container, ContainerEvent event) {
+      // Set exit code to 0 to denote success
+      container.exitCode = 0;
+
       // TODO: Add containerWorkDir to the deletion service.
 
       // Inform the localizer to decrement reference counts and cleanup
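
The effect of the change can be illustrated with a small stand-alone sketch. 
This is not the real ContainerImpl: SketchContainer, its methods and the 
INVALID_EXIT_CODE name are hypothetical, and the -1000 default is assumed only 
because that is the exitStatus value shown in the log in the issue description.

```java
// Hypothetical model of the behaviour; not the node manager implementation.
public class ExitCodeSketch {

    // Assumed sentinel, chosen to match exitStatus=-1000 from the reported log.
    static final int INVALID_EXIT_CODE = -1000;

    static class SketchContainer {
        int exitCode = INVALID_EXIT_CODE;

        // Mirrors the idea of the patch: record 0 when the process succeeds.
        // Without this assignment the sentinel would be reported instead.
        void onExitedWithSuccess() {
            exitCode = 0;
        }

        // On failure, keep whatever exit code the launched process returned.
        void onExitedWithFailure(int processExitCode) {
            exitCode = processExitCode;
        }
    }

    public static void main(String[] args) {
        SketchContainer container = new SketchContainer();
        System.out.println("before transition: " + container.exitCode);
        container.onExitedWithSuccess();
        System.out.println("after success transition: " + container.exitCode);
    }
}
```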


> Container exit status not set properly to launched process's exit code on 
> successful completion of process
> --
>
> Key: MAPREDUCE-3067
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3067
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.0
> Reporter: Hitesh Shah
> Fix For: 0.23.0
>
>
> When testing the distributed shell sample app master, the container exit 
> status was being returned incorrectly. 
> 11/09/21 11:32:58 INFO DistributedShell.ApplicationMaster: Got container 
> status for containerID= container_1316629955324_0001_01_02, 
> state=COMPLETE, exitStatus=-1000, diagnostics=




