[jira] [Comment Edited] (MAPREDUCE-6638) Do not attempt to recover jobs if encrypted spill is enabled

2016-10-03 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542845#comment-15542845
 ] 

Hitesh Shah edited comment on MAPREDUCE-6638 at 10/3/16 4:51 PM:
-

Haven't looked at the patch in detail, but [~haibochen]'s clarifying comments 
make sense. The Jira title could be modified accordingly. +0 from my side. 


was (Author: hitesh):
Haven't looked at the patch in detail, but [~haibochen]'s clarifying comments 
make sense. +0 from my side. 

> Do not attempt to recover jobs if encrypted spill is enabled
> 
>
> Key: MAPREDUCE-6638
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6638
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Affects Versions: 2.7.2
>Reporter: Karthik Kambatla
>Assignee: Haibo Chen
> Attachments: mapreduce6638.001.patch, mapreduce6638.002.patch, 
> mapreduce6638.003.patch, mapreduce6638.004.patch, mapreduce6683.005.patch
>
>
> Post the fix to CVE-2015-1776, jobs with encrypted spills enabled cannot be 
> recovered if the AM fails. We should store the key some place safe so they 
> can actually be recovered. If there is no "safe" place, at least we should 
> restart the job by re-running all mappers/reducers. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6638) Do not attempt to recover jobs if encrypted spill is enabled

2016-10-03 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542845#comment-15542845
 ] 

Hitesh Shah commented on MAPREDUCE-6638:


Haven't looked at the patch in detail, but [~haibochen]'s clarifying comments 
make sense. +0 from my side. 

> Do not attempt to recover jobs if encrypted spill is enabled
> 
>
> Key: MAPREDUCE-6638
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6638
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Affects Versions: 2.7.2
>Reporter: Karthik Kambatla
>Assignee: Haibo Chen
> Attachments: mapreduce6638.001.patch, mapreduce6638.002.patch, 
> mapreduce6638.003.patch, mapreduce6638.004.patch, mapreduce6683.005.patch
>
>
> Post the fix to CVE-2015-1776, jobs with encrypted spills enabled cannot be 
> recovered if the AM fails. We should store the key some place safe so they 
> can actually be recovered. If there is no "safe" place, at least we should 
> restart the job by re-running all mappers/reducers. 






[jira] [Commented] (MAPREDUCE-6776) yarn.app.mapreduce.client.job.max-retries should have a more useful default

2016-09-30 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15537287#comment-15537287
 ] 

Hitesh Shah commented on MAPREDUCE-6776:


FWIW, I do agree that this is a useful behavioral change that makes sense to 
push to branch-2, but it might be better to call it out as incompatible and, at 
the same time, release-note it carefully to indicate that it will improve user 
experience and not have any detrimental impact apart from the retry delay in 
some edge cases. 

> yarn.app.mapreduce.client.job.max-retries should have a more useful default
> ---
>
> Key: MAPREDUCE-6776
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6776
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Miklos Szegedi
> Attachments: MAPREDUCE-6776.001.patch, MAPREDUCE-6776.002.patch, 
> MAPREDUCE-6776.003.patch
>
>
> The default is 0, so any communication failure results in a client failure.  
> Oozie doesn't like that.  If the RM is failing over and Oozie gets a 
> communication failure, it assumes the target job has failed.  I propose 
> raising the default to something modest like 3 or 5.  The default retry 
> interval is 2s.






[jira] [Commented] (MAPREDUCE-6776) yarn.app.mapreduce.client.job.max-retries should have a more useful default

2016-09-30 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15537275#comment-15537275
 ] 

Hitesh Shah commented on MAPREDUCE-6776:


From a practical sense, this is not really an incompatible change, as it only 
changes some internal behavioral aspects to retry 3 times instead of not 
retrying at all. 

However, from a purely theoretical compat perspective, a public default value is 
being changed, as well as the value in mapred-default.xml. Tests that previously 
verified this behavior would expect immediate failures, whereas now the client 
might reconnect, or fail after 6 seconds or so. 

I suggest pushing this to trunk for sure as we are still in the alpha stage of 
releases. As for branch-2, I would check with the 2.8 release manager. 
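
The "6 seconds or so" figure follows from the numbers in the issue description: 3 retries with a 2s interval between attempts. A minimal sketch of that arithmetic and of a generic retry loop is shown here; the class and method names are hypothetical and this is not the actual MapReduce client code, only an illustration under the assumption that the client sleeps once before each retry.

```java
// Hypothetical illustration of the retry budget discussed above; not Hadoop code.
final class ClientRetrySketch {

    /** Worst-case time spent sleeping before the client finally gives up. */
    static long worstCaseDelayMs(int maxRetries, long retryIntervalMs) {
        return (long) maxRetries * retryIntervalMs;
    }

    /** Calls op once, then retries up to maxRetries times, sleeping before each retry. */
    static <T> T callWithRetries(java.util.concurrent.Callable<T> op,
                                 int maxRetries, long retryIntervalMs) throws Exception {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxRetries) {
                    Thread.sleep(retryIntervalMs); // wait before the next retry
                }
            }
        }
        throw last; // every attempt failed
    }

    public static void main(String[] args) {
        // Old default: 0 retries, so the first failure is reported immediately.
        System.out.println(worstCaseDelayMs(0, 2000L));
        // Proposed default: 3 retries at 2s each.
        System.out.println(worstCaseDelayMs(3, 2000L));
    }
}
```

With a budget of 0 the loop body runs exactly once and never sleeps, which is the "immediate failure" that existing tests may rely on.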

> yarn.app.mapreduce.client.job.max-retries should have a more useful default
> ---
>
> Key: MAPREDUCE-6776
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6776
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Miklos Szegedi
> Attachments: MAPREDUCE-6776.001.patch, MAPREDUCE-6776.002.patch, 
> MAPREDUCE-6776.003.patch
>
>
> The default is 0, so any communication failure results in a client failure.  
> Oozie doesn't like that.  If the RM is failing over and Oozie gets a 
> communication failure, it assumes the target job has failed.  I propose 
> raising the default to something modest like 3 or 5.  The default retry 
> interval is 2s.






[jira] [Comment Edited] (MAPREDUCE-6638) Do not attempt to recover jobs if encrypted spill is enabled

2016-09-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15514897#comment-15514897
 ] 

Hitesh Shah edited comment on MAPREDUCE-6638 at 9/23/16 12:17 AM:
--

bq. (1) Avoid recovering an AM if encrypted spill is enabled

Encrypted spill w.r.t recovery is not the same as a committer not supporting 
recovery. Any reason we cannot just re-run the job from scratch if all reducers 
have not completed ( or re-run all maps and incomplete reducers )?

Ideally speaking, you could just re-run most of the job tasks again if needed 
to support proper fault tolerance even in scenarios where the key cannot be 
stored securely. In this scenario, the new AM can generate a new key. I would 
agree that this might not be a performant solution, but it at least spares 
the user from having to re-submit the job. If performance is an 
issue, users can turn off recovery when encryption is enabled for scenarios 
where the key cannot be stored securely.
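
The suggestion above can be sketched as a small decision function. Everything here is hypothetical (the class, the enum, the 16-byte key length, and the idea of a single boolean for "the old key is still available"); it only illustrates the proposed behavior, not what the attached patches implement.

```java
// Hypothetical sketch of the restart policy proposed in this comment; not Hadoop code.
final class RecoveryDecisionSketch {

    enum Action { RECOVER_COMPLETED_TASKS, RERUN_TASKS_WITH_NEW_KEY }

    /**
     * Decides what a restarted AM does.
     * Without encrypted spill, or when the old key survived the AM failure,
     * completed task output is still usable. Otherwise the tasks are re-run
     * under a freshly generated key so the user does not have to re-submit.
     */
    static Action onAmRestart(boolean encryptedSpillEnabled, boolean oldKeyAvailable) {
        if (!encryptedSpillEnabled || oldKeyAvailable) {
            return Action.RECOVER_COMPLETED_TASKS;
        }
        return Action.RERUN_TASKS_WITH_NEW_KEY;
    }

    /** Generates a fresh random key for the new attempt (length is an assumption). */
    static byte[] newSpillKey() {
        byte[] key = new byte[16];
        new java.security.SecureRandom().nextBytes(key);
        return key;
    }

    public static void main(String[] args) {
        System.out.println(onAmRestart(false, false));
        System.out.println(onAmRestart(true, false));
    }
}
```

The trade-off is the one named in the comment: re-running tasks costs time, but the job still completes without user intervention.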


was (Author: hitesh):
bq. (1) Avoid recovering an AM if encrypted spill is enabled

Encrypted spill w.r.t recovery is not the same as a committer not supporting 
recovery. Any reason we cannot just re-run the job from scratch if all reducers 
have not completed?

Ideally speaking, you could just re-run most of the job tasks again if needed 
to support proper fault tolerance even in scenarios where the key cannot be 
stored securely. In this scenario, the new AM can generate a new key. I would 
agree that this might not be a performant solution, but it at least spares 
the user from having to re-submit the job. If performance is an 
issue, users can turn off recovery when encryption is enabled for scenarios 
where the key cannot be stored securely.

> Do not attempt to recover jobs if encrypted spill is enabled
> 
>
> Key: MAPREDUCE-6638
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6638
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Affects Versions: 2.7.2
>Reporter: Karthik Kambatla
>Assignee: Haibo Chen
> Attachments: mapreduce6638.001.patch, mapreduce6638.002.patch, 
> mapreduce6638.003.patch, mapreduce6638.004.patch, mapreduce6683.005.patch
>
>
> Post the fix to CVE-2015-1776, jobs with encrypted spills enabled cannot be 
> recovered if the AM fails. We should store the key some place safe so they 
> can actually be recovered. If there is no "safe" place, at least we should 
> restart the job by re-running all mappers/reducers. 






[jira] [Commented] (MAPREDUCE-6638) Do not attempt to recover jobs if encrypted spill is enabled

2016-09-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15514897#comment-15514897
 ] 

Hitesh Shah commented on MAPREDUCE-6638:


bq. (1) Avoid recovering an AM if encrypted spill is enabled

Encrypted spill w.r.t recovery is not the same as a committer not supporting 
recovery. Any reason we cannot just re-run the job from scratch if all reducers 
have not completed?

Ideally speaking, you could just re-run most of the job tasks again if needed 
to support proper fault tolerance even in scenarios where the key cannot be 
stored securely. In this scenario, the new AM can generate a new key. I would 
agree that this might not be a performant solution, but it at least spares 
the user from having to re-submit the job. If performance is an 
issue, users can turn off recovery when encryption is enabled for scenarios 
where the key cannot be stored securely.

> Do not attempt to recover jobs if encrypted spill is enabled
> 
>
> Key: MAPREDUCE-6638
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6638
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Affects Versions: 2.7.2
>Reporter: Karthik Kambatla
>Assignee: Haibo Chen
> Attachments: mapreduce6638.001.patch, mapreduce6638.002.patch, 
> mapreduce6638.003.patch, mapreduce6638.004.patch, mapreduce6683.005.patch
>
>
> Post the fix to CVE-2015-1776, jobs with encrypted spills enabled cannot be 
> recovered if the AM fails. We should store the key some place safe so they 
> can actually be recovered. If there is no "safe" place, at least we should 
> restart the job by re-running all mappers/reducers. 






[jira] [Commented] (MAPREDUCE-6484) Yarn Client uses local address instead of RM address as token renewer in a secure cluster when RM HA is enabled.

2016-09-14 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491425#comment-15491425
 ] 

Hitesh Shah commented on MAPREDUCE-6484:


[~asuresh] thanks for the pointer. Any reason why MR does not use that function 
in that case? 

> Yarn Client uses local address instead of RM address as token renewer in a 
> secure cluster when RM HA is enabled.
> 
>
> Key: MAPREDUCE-6484
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6484
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, security
>Reporter: zhihai xu
>Assignee: zhihai xu
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: MAPREDUCE-6484.001.patch, YARN-4187.000.patch
>
>
> Yarn Client uses local address instead of RM address as token renewer in a 
> secure cluster when RM HA is enabled. This will cause HDFS token renew 
> failure for renewer "nobody"  if the rules from 
> {{hadoop.security.auth_to_local}} exclude the client address in HDFS 
> {{DelegationTokenIdentifier}}.
> The local address is returned for the following reason: when HA is enabled, 
> "yarn.resourcemanager.address" may not be set. If 
> {{HOSTNAME_PATTERN}} ("_HOST") is used in "yarn.resourcemanager.principal", 
> the default address "0.0.0.0:8032" will be used. Based on the following code 
> in SecurityUtil.java, the local address will then replace "0.0.0.0".
> {code}
>   private static String replacePattern(String[] components, String hostname)
>       throws IOException {
>     String fqdn = hostname;
>     if (fqdn == null || fqdn.isEmpty() || fqdn.equals("0.0.0.0")) {
>       fqdn = getLocalHostName();
>     }
>     return components[0] + "/" + fqdn.toLowerCase(Locale.US) + "@"
>         + components[2];
>   }
>
>   static String getLocalHostName() throws UnknownHostException {
>     return InetAddress.getLocalHost().getCanonicalHostName();
>   }
>
>   public static String getServerPrincipal(String principalConfig,
>       InetAddress addr) throws IOException {
>     String[] components = getComponents(principalConfig);
>     if (components == null || components.length != 3
>         || !components[1].equals(HOSTNAME_PATTERN)) {
>       return principalConfig;
>     } else {
>       if (addr == null) {
>         throw new IOException("Can't replace " + HOSTNAME_PATTERN
>             + " pattern since client address is null");
>       }
>       return replacePattern(components, addr.getCanonicalHostName());
>     }
>   }
> {code}
> The following is the exception that causes the job to fail:
> {code}
> 15/09/12 16:27:24 WARN security.UserGroupInformation: 
> PriviledgedActionException as:t...@example.com (auth:KERBEROS) 
> cause:java.io.IOException: Failed to run job : yarn tries to renew a token 
> with renewer nobody
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:464)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:7109)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewDelegationToken(NameNodeRpcServer.java:512)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.renewDelegationToken(AuthorizationProviderProxyClientProtocol.java:648)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.renewDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:975)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> java.io.IOException: Failed to run job : yarn tries to renew a token with 
> renewer nobody
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:464)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:7109)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewDelegationToken(NameNodeRpcServer.java:512)
> at 
> 

[jira] [Commented] (MAPREDUCE-6776) yarn.app.mapreduce.client.job.max-retries should have a more useful default

2016-09-12 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485365#comment-15485365
 ] 

Hitesh Shah commented on MAPREDUCE-6776:


Changing this in 2.x would be an incompatible change. 

> yarn.app.mapreduce.client.job.max-retries should have a more useful default
> ---
>
> Key: MAPREDUCE-6776
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6776
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>
> The default is 0, so any communication failure results in a client failure.  Oozie 
> doesn't like that.  If the RM is failing over and Oozie gets a communication 
> failure, it assumes the target job has failed.  I propose raising the default 
> to something modest like 3 or 5.  The default retry interval is 2s.






[jira] [Commented] (MAPREDUCE-6484) Yarn Client uses local address instead of RM address as token renewer in a secure cluster when RM HA is enabled.

2016-09-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15478696#comment-15478696
 ] 

Hitesh Shah commented on MAPREDUCE-6484:


[~asuresh] [~zxu] It seems like the getMasterAddress() functionality ideally 
belongs in YARN and not in MR so that other applications that make use of YARN 
can always leverage the same functionality. Would you agree? 
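
For readers skimming the thread, the substitution described in the issue text quoted below can be reproduced in isolation. This is a simplified re-implementation for illustration only: the class name is hypothetical, the local host name is passed in as a parameter instead of being looked up, and splitting the principal on "/" and "@" is an assumption about how the components are obtained.

```java
// Simplified, hypothetical re-implementation of the "_HOST" substitution; not Hadoop code.
final class PrincipalPatternSketch {

    static final String HOSTNAME_PATTERN = "_HOST";

    /**
     * Replaces "_HOST" in a principal such as "yarn/_HOST@REALM".
     * If the target host is null, empty, or the wildcard "0.0.0.0",
     * the local host name is substituted instead, which is the
     * behavior that produces the wrong renewer on the client side.
     */
    static String resolve(String principalConfig, String hostname, String localHostName) {
        String[] components = principalConfig.split("[/@]");
        if (components.length != 3 || !components[1].equals(HOSTNAME_PATTERN)) {
            return principalConfig; // nothing to substitute
        }
        String fqdn = hostname;
        if (fqdn == null || fqdn.isEmpty() || fqdn.equals("0.0.0.0")) {
            fqdn = localHostName;
        }
        return components[0] + "/" + fqdn.toLowerCase(java.util.Locale.US)
            + "@" + components[2];
    }

    public static void main(String[] args) {
        // The RM address was never configured, so the default wildcard is used.
        System.out.println(
            resolve("yarn/_HOST@EXAMPLE.COM", "0.0.0.0", "client01.example.com"));
    }
}
```

In the wildcard case the resulting principal names the client machine rather than the ResourceManager, which is why a shared helper for resolving the RM address would be useful to any YARN application.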

> Yarn Client uses local address instead of RM address as token renewer in a 
> secure cluster when RM HA is enabled.
> 
>
> Key: MAPREDUCE-6484
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6484
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, security
>Reporter: zhihai xu
>Assignee: zhihai xu
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: MAPREDUCE-6484.001.patch, YARN-4187.000.patch
>
>
> Yarn Client uses local address instead of RM address as token renewer in a 
> secure cluster when RM HA is enabled. This will cause HDFS token renew 
> failure for renewer "nobody"  if the rules from 
> {{hadoop.security.auth_to_local}} exclude the client address in HDFS 
> {{DelegationTokenIdentifier}}.
> The local address is returned for the following reason: when HA is enabled, 
> "yarn.resourcemanager.address" may not be set. If 
> {{HOSTNAME_PATTERN}} ("_HOST") is used in "yarn.resourcemanager.principal", 
> the default address "0.0.0.0:8032" will be used. Based on the following code 
> in SecurityUtil.java, the local address will then replace "0.0.0.0".
> {code}
>   private static String replacePattern(String[] components, String hostname)
>       throws IOException {
>     String fqdn = hostname;
>     if (fqdn == null || fqdn.isEmpty() || fqdn.equals("0.0.0.0")) {
>       fqdn = getLocalHostName();
>     }
>     return components[0] + "/" + fqdn.toLowerCase(Locale.US) + "@"
>         + components[2];
>   }
>
>   static String getLocalHostName() throws UnknownHostException {
>     return InetAddress.getLocalHost().getCanonicalHostName();
>   }
>
>   public static String getServerPrincipal(String principalConfig,
>       InetAddress addr) throws IOException {
>     String[] components = getComponents(principalConfig);
>     if (components == null || components.length != 3
>         || !components[1].equals(HOSTNAME_PATTERN)) {
>       return principalConfig;
>     } else {
>       if (addr == null) {
>         throw new IOException("Can't replace " + HOSTNAME_PATTERN
>             + " pattern since client address is null");
>       }
>       return replacePattern(components, addr.getCanonicalHostName());
>     }
>   }
> {code}
> The following is the exception that causes the job to fail:
> {code}
> 15/09/12 16:27:24 WARN security.UserGroupInformation: 
> PriviledgedActionException as:t...@example.com (auth:KERBEROS) 
> cause:java.io.IOException: Failed to run job : yarn tries to renew a token 
> with renewer nobody
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:464)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:7109)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewDelegationToken(NameNodeRpcServer.java:512)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.renewDelegationToken(AuthorizationProviderProxyClientProtocol.java:648)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.renewDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:975)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> java.io.IOException: Failed to run job : yarn tries to renew a token with 
> renewer nobody
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:464)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:7109)
> at 
> 

[jira] [Commented] (MAPREDUCE-6062) Use TestDFSIO test random read : job failed

2016-05-24 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299224#comment-15299224
 ] 

Hitesh Shah commented on MAPREDUCE-6062:


[~tfukudom] please hit "submit patch" to trigger the pre-commit build. 
https://wiki.apache.org/hadoop/HowToContribute has more info on dos and don'ts 
when contributing patches. In this case, I will defer to someone who has been 
looking at MR code in more recent times to do a review. If you do not see any 
updates on the jira within the next couple of days, please feel free to drop a 
polite email on the mapreduce-dev list asking for review help. 

> Use TestDFSIO test random read : job failed
> ---
>
> Key: MAPREDUCE-6062
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6062
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: benchmarks
>Affects Versions: 2.2.0
> Environment: command : hadoop jar $JAR_PATH TestDFSIO-read -random 
> -nrFiles 12 -size 8000
>Reporter: chongyuanhuang
>Assignee: Takuya Fukudome
> Attachments: MAPREDUCE-6062.patch
>
>
> This is log:
> 2014-09-01 13:57:29,876 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.IllegalArgumentException: n must be 
> positive
>   at java.util.Random.nextInt(Random.java:300)
>   at 
> org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.nextOffset(TestDFSIO.java:601)
>   at 
> org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.doIO(TestDFSIO.java:580)
>   at 
> org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.doIO(TestDFSIO.java:546)
>   at org.apache.hadoop.fs.IOMapperBase.map(IOMapperBase.java:134)
>   at org.apache.hadoop.fs.IOMapperBase.map(IOMapperBase.java:37)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> 2014-09-01 13:57:29,886 INFO [main] org.apache.hadoop.mapred.Task: Runnning 
> cleanup for the task
> 2014-09-01 13:57:29,894 WARN [main] 
> org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Could not delete 
> hdfs://m101:8020/benchmarks/TestDFSIO/io_random_read/_temporary/1/_temporary/attempt_1409538816633_0005_m_01_0






[jira] [Assigned] (MAPREDUCE-6062) Use TestDFSIO test random read : job failed

2016-05-24 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah reassigned MAPREDUCE-6062:
--

Assignee: Takuya Fukudome

[~tfukudom] added you to MR contributors. Hopefully this should get you 
unblocked.

> Use TestDFSIO test random read : job failed
> ---
>
> Key: MAPREDUCE-6062
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6062
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: benchmarks
>Affects Versions: 2.2.0
> Environment: command : hadoop jar $JAR_PATH TestDFSIO-read -random 
> -nrFiles 12 -size 8000
>Reporter: chongyuanhuang
>Assignee: Takuya Fukudome
>
> This is log:
> 2014-09-01 13:57:29,876 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.IllegalArgumentException: n must be 
> positive
>   at java.util.Random.nextInt(Random.java:300)
>   at 
> org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.nextOffset(TestDFSIO.java:601)
>   at 
> org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.doIO(TestDFSIO.java:580)
>   at 
> org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.doIO(TestDFSIO.java:546)
>   at org.apache.hadoop.fs.IOMapperBase.map(IOMapperBase.java:134)
>   at org.apache.hadoop.fs.IOMapperBase.map(IOMapperBase.java:37)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> 2014-09-01 13:57:29,886 INFO [main] org.apache.hadoop.mapred.Task: Runnning 
> cleanup for the task
> 2014-09-01 13:57:29,894 WARN [main] 
> org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Could not delete 
> hdfs://m101:8020/benchmarks/TestDFSIO/io_random_read/_temporary/1/_temporary/attempt_1409538816633_0005_m_01_0






[jira] [Commented] (MAPREDUCE-5785) Derive heap size or mapreduce.*.memory.mb automatically

2014-11-23 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222592#comment-14222592
 ] 

Hitesh Shah commented on MAPREDUCE-5785:


bq. I think we should commit this to branch-2 as well

This change is incompatible especially as it modifies mapred-default.xml. Not 
sure why it would be committed to branch-2. 


 Derive heap size or mapreduce.*.memory.mb automatically
 ---

 Key: MAPREDUCE-5785
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5785
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mr-am, task
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Fix For: 3.0.0

 Attachments: MAPREDUCE-5785.v01.patch, MAPREDUCE-5785.v02.patch, 
 MAPREDUCE-5785.v03.patch, mr-5785-4.patch, mr-5785-5.patch, mr-5785-6.patch


 Currently users have to set 2 memory-related configs per job / per task type. 
 One first chooses some container size mapreduce.\*.memory.mb and then a 
 corresponding maximum Java heap size Xmx < mapreduce.\*.memory.mb. This 
 makes sure that the JVM's C-heap (native memory + Java heap) does not exceed 
 this mapreduce.\*.memory.mb. If one forgets to tune Xmx, MR-AM might be 
 - allocating big containers whereas the JVM will only use the default -Xmx200m.
 - allocating small containers that will OOM because Xmx is too high.
 With this JIRA, we propose to set Xmx automatically based on an empirical 
 ratio that can be adjusted. Xmx is not changed automatically if provided by 
 the user.
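
 The proposal in the description can be sketched as follows. The class and method names are hypothetical and the 0.8 ratio is only an example value for the adjustable "empirical ratio"; the sketch shows the two rules stated above: derive Xmx from the container size, and leave user-provided Xmx untouched.

```java
// Hypothetical sketch of deriving -Xmx from the container size; not Hadoop code.
final class HeapSizeSketch {

    /** Heap size in MB as a fraction of the container size, rounded down. */
    static int deriveHeapMb(int containerMb, double heapRatio) {
        if (containerMb <= 0 || heapRatio <= 0.0 || heapRatio > 1.0) {
            throw new IllegalArgumentException("invalid container size or ratio");
        }
        return (int) Math.floor(containerMb * heapRatio);
    }

    /** Appends a derived -Xmx unless the user already supplied one. */
    static String javaOpts(String userOpts, int containerMb, double heapRatio) {
        if (userOpts != null && userOpts.contains("-Xmx")) {
            return userOpts; // user-provided Xmx is never changed
        }
        String xmx = "-Xmx" + deriveHeapMb(containerMb, heapRatio) + "m";
        return (userOpts == null || userOpts.isEmpty()) ? xmx : userOpts + " " + xmx;
    }

    public static void main(String[] args) {
        System.out.println(javaOpts("", 1024, 0.8));
        System.out.println(javaOpts("-Xmx512m -verbose:gc", 4096, 0.8));
    }
}
```

 Rounding down keeps the heap strictly inside the container, leaving the remainder for native memory.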





[jira] [Commented] (MAPREDUCE-5956) MapReduce AM should not use maxAttempts to determine if this is the last retry

2014-07-10 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057866#comment-14057866
 ] 

Hitesh Shah commented on MAPREDUCE-5956:


To add to [~mayank_bansal]'s comment, this is the 4th ( and last ) attempt of 
the AM and there have been no preemptions. 

 MapReduce AM should not use maxAttempts to determine if this is the last retry
 --

 Key: MAPREDUCE-5956
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5956
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: applicationmaster, mrv2
Affects Versions: 2.4.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Wangda Tan
Priority: Blocker
 Attachments: MR-5956.patch


 Found this while reviewing YARN-2074. The problem is that after YARN-2074, we 
 don't count AM preemption towards AM failures on RM side, but MapReduce AM 
 itself checks the attempt id against the max-attempt count to determine if 
 this is the last attempt.
 {code}
 public void computeIsLastAMRetry() {
   isLastAMRetry = appAttemptID.getAttemptId() >= maxAppAttempts;
 }
 {code}
 This causes issues w.r.t deletion of staging directory etc..
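
 A small sketch of why comparing the attempt id against the maximum goes wrong once preempted attempts stop counting as failures. It assumes the quoted check is a `>=` comparison; the second method is a hypothetical alternative for illustration, not the fix adopted in this issue.

```java
// Hypothetical illustration of the last-retry check; not Hadoop code.
final class LastRetrySketch {

    /** The quoted check: the attempt id alone decides (comparison assumed to be >=). */
    static boolean isLastByAttemptId(int attemptId, int maxAppAttempts) {
        return attemptId >= maxAppAttempts;
    }

    /** Alternative: only earlier attempts that counted as failures use up the budget. */
    static boolean isLastByCountedFailures(int countedFailures, int maxAppAttempts) {
        return countedFailures + 1 >= maxAppAttempts;
    }

    public static void main(String[] args) {
        int maxAppAttempts = 2;
        // Attempt 1 was preempted, so the RM counted no failure; attempt 2 is running.
        System.out.println(isLastByAttemptId(2, maxAppAttempts));       // AM thinks it is last
        System.out.println(isLastByCountedFailures(0, maxAppAttempts)); // RM would still retry
    }
}
```

 In the scenario in `main`, an AM using the attempt id would clean up the staging directory even though another attempt could still be launched.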



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5956) MapReduce AM should not use maxAttempts to determine if this is the last retry

2014-07-08 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055396#comment-14055396
 ] 

Hitesh Shah commented on MAPREDUCE-5956:


[~vinodkv] By definition, if an AM calls unregister, it is telling the RM that 
this is my last attempt and the app should not be retried. Are you now saying 
that all attempts should now call unregisterAttempt() which will tell the app 
whether it is the final attempt and should call a final unregister()? If not, I 
think something else is needed as an AM will only call unregister() on an error 
if it thinks it is the last attempt. 

 

 MapReduce AM should not use maxAttempts to determine if this is the last retry
 --

 Key: MAPREDUCE-5956
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5956
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: applicationmaster, mrv2
Reporter: Vinod Kumar Vavilapalli
Assignee: Wangda Tan
Priority: Blocker

 Found this while reviewing YARN-2074. The problem is that after YARN-2074, we 
 don't count AM preemption towards AM failures on RM side, but MapReduce AM 
 itself checks the attempt id against the max-attempt count to determine if 
 this is the last attempt.
 {code}
 public void computeIsLastAMRetry() {
   isLastAMRetry = appAttemptID.getAttemptId() >= maxAppAttempts;
 }
 {code}
 This causes issues w.r.t deletion of staging directory etc..





[jira] [Commented] (MAPREDUCE-5696) Add Localization counters to MR

2013-12-26 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856997#comment-13856997
 ] 

Hitesh Shah commented on MAPREDUCE-5696:


The introduction of localization counters in the env is akin to introducing a 
new API in YARN. Could you split this jira into two: one in YARN for the YARN 
changes where the new API/interface is introduced, with this jira used for 
the MR-specific changes? 

 Add Localization counters to MR
 ---

 Key: MAPREDUCE-5696
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5696
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: LocalizationCounters.png, MAPREDUCE-5696.v01.patch


 Users are often unaware of localization cost that their jobs incur. To 
 measure effectiveness of localization caches it is necessary to expose the 
 overhead in the form of user-visible metrics. The purpose of this JIRA is to 
 complement YARN-1529. While YARN-1529 attempts to provide a cluster-wide view 
 to cluster admins, this JIRA focuses on exposing the localization overhead on 
 per-job basis to the job owner/user.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (MAPREDUCE-5487) In task processes, JobConf is unnecessarily loaded again in Limits

2013-12-03 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838320#comment-13838320
 ] 

Hitesh Shah commented on MAPREDUCE-5487:


A little bit late on this. Did anyone look into how this affects jobs where a 
user modifies the counter limit to be higher than the cluster configured value 
and what happens in the case where the jobhistory server is configured with a 
limit less than the user supplied limit? 

 In task processes, JobConf is unnecessarily loaded again in Limits
 --

 Key: MAPREDUCE-5487
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5487
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: performance, task
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 2.4.0

 Attachments: MAPREDUCE-5487-1.patch, MAPREDUCE-5487.patch


 Limits statically loads a JobConf, which incurs costs of reading files from 
 disk and parsing XML.  The contents of this JobConf are identical to the one 
 loaded by YarnChild (before adding job.xml as a resource).  Allowing Limits 
 to initialize with the JobConf loaded in YarnChild would reduce task startup 
 time.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (MAPREDUCE-5633) Can Hadoop use multi-cores of a processor under single machine

2013-11-20 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved MAPREDUCE-5633.


Resolution: Invalid

Please ask such questions on the user list. 
http://hadoop.apache.org/mailing_lists.html#User

 Can Hadoop use multi-cores of a processor under single machine
 --

 Key: MAPREDUCE-5633
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5633
 Project: Hadoop Map/Reduce
  Issue Type: Task
Reporter: Asif







[jira] [Commented] (MAPREDUCE-4421) Remove dependency on deployed MR jars

2013-10-01 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1378#comment-1378
 ] 

Hitesh Shah commented on MAPREDUCE-4421:


[~jlowe] Thanks for the clarification. I believe the performance issues should 
hold regardless of any filesystem implementation used as long as the 
distributed cache layer ends up correctly interpreting the permissions to the 
appropriate LocalResource visibility. 

+1. Latest patch looks good to me. 

Let me know if you are waiting on anyone else to chime in on this. If not, 
please feel free to go ahead and commit or I shall commit later today.  

 Remove dependency on deployed MR jars
 -

 Key: MAPREDUCE-4421
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4421
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4421-2.patch, MAPREDUCE-4421-3.patch, 
 MAPREDUCE-4421-4.patch, MAPREDUCE-4421.patch, MAPREDUCE-4421.patch


 Currently MR AM depends on MR jars being deployed on all nodes via implicit 
 dependency on YARN_APPLICATION_CLASSPATH. 
 We should stop adding mapreduce jars to YARN_APPLICATION_CLASSPATH and, 
 probably, just rely on adding a shaded MR jar along with job.jar to the 
 dist-cache.





[jira] [Commented] (MAPREDUCE-4421) Remove dependency on deployed MR jars

2013-09-30 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782438#comment-13782438
 ] 

Hitesh Shah commented on MAPREDUCE-4421:


Sorry for the delay in the review. 

Regarding addMRFrameworkToDistributedCache() - one minor question:  the code 
allows for a non-qualified URI. Should we enforce provision of a 
fully-qualified path always?
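
As a rough sketch of what enforcing a qualified URI could look like, the following uses only `java.net.URI`. The class `FrameworkUriCheck` and its method are hypothetical, and treating "has a scheme" as the test for qualification is an assumption; a stricter check might also require an authority such as a namenode host.

```java
import java.net.URI;

// Hypothetical helper; not part of the patch under review.
class FrameworkUriCheck {
    /** Assumed rule: a URI counts as qualified when it names a scheme such as hdfs. */
    static boolean hasScheme(URI uri) {
        return uri.getScheme() != null;
    }

    public static void main(String[] args) {
        // Example values are made up for illustration.
        URI qualified = URI.create("hdfs://nn.example.com:8020/mapred/framework/mr.tar.gz");
        URI unqualified = URI.create("/mapred/framework/mr.tar.gz");
        System.out.println(qualified + " qualified: " + hasScheme(qualified));
        System.out.println(unqualified + " qualified: " + hasScheme(unqualified));
    }
}
```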

Minor nit: I believe there should be nothing in the implementation that 
requires HDFS as the storage for the MR tarball? Documentation needs to change 
as a result unless you believe there are reasons for not mentioning other 
filesystems ( except maybe from a testing point of view )?

Patch looks good otherwise. Thanks for adding the detailed docs.




 Remove dependency on deployed MR jars
 -

 Key: MAPREDUCE-4421
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4421
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4421-2.patch, MAPREDUCE-4421.patch, 
 MAPREDUCE-4421.patch


 Currently MR AM depends on MR jars being deployed on all nodes via implicit 
 dependency on YARN_APPLICATION_CLASSPATH. 
 We should stop adding mapreduce jars to YARN_APPLICATION_CLASSPATH and, 
 probably, just rely on adding a shaded MR jar along with job.jar to the 
 dist-cache.





[jira] [Commented] (MAPREDUCE-4421) Remove dependency on deployed MR jars

2013-09-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773326#comment-13773326
 ] 

Hitesh Shah commented on MAPREDUCE-4421:


[~jlowe] Thanks for the detailed answers to my queries. 

I believe this initial patch is a good start to making MR a user-land library. 
As it stands, it provides the additional flexibility which can be used by 
anyone to deploy MR with either the full tarball or a mix-match approach. 
Though it might be good to have some documentation on the 2 possible approaches 
( full tarball vs MR tarball ) and explain how the classpath should be setup. 

Whether the classpath-to-HDFS path mapping should come from an additional file 
on HDFS could be considered in a follow-up jira if others believe that is a 
better solution. 

The one thing to change in the patch is the documentation for 
mapreduce.application.framework.path - it does not mention the use of the URI 
fragment and how that interacts with the configured classpath. 
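
A minimal sketch of the fragment-versus-classpath interaction mentioned above, assuming the fragment names the directory under which the framework archive is made available and that the configured classpath is a comma-separated list that should mention it. The class `FrameworkFragmentCheck`, its methods, and the example values are hypothetical and are not taken from the patch.

```java
import java.net.URI;

// Hypothetical sanity check; the real patch may implement this differently.
class FrameworkFragmentCheck {
    /** Returns the URI fragment (the part after '#'), or null if there is none. */
    static String alias(URI frameworkUri) {
        return frameworkUri.getFragment();
    }

    /** True when some comma-separated classpath entry has the alias as a path component. */
    static boolean classpathMentions(String classpath, String alias) {
        if (alias == null) {
            return false;
        }
        for (String entry : classpath.split(",")) {
            for (String component : entry.trim().split("/")) {
                if (component.equals(alias)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        URI framework = URI.create(
            "hdfs://nn.example.com:8020/mapred/framework/mr-framework.tar.gz#mrframework");
        String classpath = "$PWD/mrframework/share/hadoop/mapreduce/*,"
            + "$PWD/mrframework/share/hadoop/mapreduce/lib/*";
        System.out.println("alias = " + alias(framework));
        System.out.println("classpath mentions alias: "
            + classpathMentions(classpath, alias(framework)));
    }
}
```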

Could you file a follow-up jira for the config handling? 




 Remove dependency on deployed MR jars
 -

 Key: MAPREDUCE-4421
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4421
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4421.patch, MAPREDUCE-4421.patch


 Currently MR AM depends on MR jars being deployed on all nodes via implicit 
 dependency on YARN_APPLICATION_CLASSPATH. 
 We should stop adding mapreduce jars to YARN_APPLICATION_CLASSPATH and, 
 probably, just rely on adding a shaded MR jar along with job.jar to the 
 dist-cache.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4421) Remove dependency on deployed MR jars

2013-09-12 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765889#comment-13765889
 ] 

Hitesh Shah commented on MAPREDUCE-4421:


[~jlowe] Had a few questions/comments related to the implementation/patch: 

- Why does classpath need to include all of common, hdfs and yarn jar 
locations? Assuming that MR is running on a YARN-based cluster, shouldn't the 
location of the core dependencies come from the cluster deployment, i.e. via the 
env that the NM sets for a container? I believe the only jars that MR should 
have in its uploaded tarball should be the client jars. I understand that there 
is no clear boundary for client-side only jars for common and hdfs today ( for 
YARN, I believe it should be simple to split out the client-side 
requirements ) but it is something we should aim for; otherwise we have to 
assume that the jars deployed on the cluster are compatible. 
  - I guess the underlying question is why use the full hadoop tarball and not 
just the mapreduce-only tarball? If MR is truly a user-land library, it should 
be treated as such and have a separate deployment approach.

- I would vote to make the tar-ball in HDFS be the only way to run MR on YARN. 
Obviously, this cannot be done for 2.x but we should move to this model on 
trunk and not support the current approach at all there. Comments? 

- The other point is related to configs. Configuration still loads mapred-site 
and mapred-default files and new Configuration objects are created on the 
cluster. Are these files still expected on the cluster? job.xml does override 
these but cluster configs could still have final params. If this is meant to be 
addressed in a follow-up jira to ensure all MR configs come from the client, 
you can ignore this point for now.

- How do you see framework name extracted from the path to be used? Is it just 
a safety check to ensure that it is found in the classpath? Will it have any 
relation to a version? A minor nit - framework name seems confusing in relation 
to the framework name in use from earlier i.e yarn vs local framework. 

- Description in the default-xml for mapreduce.application.framework.path does 
not mention the need for the URI fragment and how the fragment is used as a 
sanity check to the classpath. 

- Regarding versions, it seems like users will need to do 2 things. Change the 
location of the tarball on HDFS and modify the classpath. Users will need to 
know the exact structure of the classpath. In such a scenario, do defaults even 
make sense? On the other hand, if we define a common standard i.e. a base path 
for all MR tarballs, with each tarball in a defined structure  ( possibly with 
version info added on later on for the code to infer the structure of the 
tarball ), all the user would need to do is specify the base path ( which could 
have a default value ) and a version which again has a default value. The 
latter approach would require the code to construct the necessary classpath if 
the upload path is in use. Do you have any comments on which of the 2 
approaches makes more sense? The former is way more flexible but a bit more 
complex. The latter is brittle/inflexible with respect to changing tarball 
structures but likely easier to enforce a standard on.
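
The second approach in the last point (base path plus version, with the code constructing the classpath) could be sketched roughly as follows. Everything here is hypothetical: the class `FrameworkLayout`, the tarball naming scheme, the `mrframework` alias, and the directory layout inside the tarball are assumptions chosen only to show how a fixed standard lets the user supply just two values.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical convention; the layout and names are assumptions for illustration.
class FrameworkLayout {
    static final String ALIAS = "mrframework";

    /** Builds the framework URI from a base path and a version under the assumed naming scheme. */
    static String frameworkPath(String basePath, String version) {
        String base = basePath.endsWith("/")
            ? basePath.substring(0, basePath.length() - 1)
            : basePath;
        return base + "/mapreduce-" + version + ".tar.gz#" + ALIAS;
    }

    /** Classpath entries derived from the assumed fixed layout inside the tarball. */
    static List<String> classpath() {
        return Arrays.asList(
            "$PWD/" + ALIAS + "/share/hadoop/mapreduce/*",
            "$PWD/" + ALIAS + "/share/hadoop/mapreduce/lib/*");
    }

    public static void main(String[] args) {
        System.out.println(frameworkPath("hdfs:///apps/mapreduce/", "2.1.0"));
        System.out.println(String.join(",", classpath()));
    }
}
```

The trade-off raised in the comment is visible here: the user only picks a base path and a version, but any change to the tarball structure requires changing `classpath()`.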


 Remove dependency on deployed MR jars
 -

 Key: MAPREDUCE-4421
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4421
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4421.patch, MAPREDUCE-4421.patch


 Currently MR AM depends on MR jars being deployed on all nodes via implicit 
 dependency on YARN_APPLICATION_CLASSPATH. 
 We should stop adding mapreduce jars to YARN_APPLICATION_CLASSPATH and, 
 probably, just rely on adding a shaded MR jar along with job.jar to the 
 dist-cache.



[jira] [Commented] (MAPREDUCE-4421) Remove dependency on deployed MR jars

2013-09-12 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765961#comment-13765961
 ] 

Hitesh Shah commented on MAPREDUCE-4421:


s/Configuration/Jobconf/ in the previous comment.

 Remove dependency on deployed MR jars
 -

 Key: MAPREDUCE-4421
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4421
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4421.patch, MAPREDUCE-4421.patch


 Currently MR AM depends on MR jars being deployed on all nodes via implicit 
 dependency on YARN_APPLICATION_CLASSPATH. 
 We should stop adding mapreduce jars to YARN_APPLICATION_CLASSPATH and, 
 probably, just rely on adding a shaded MR jar along with job.jar to the 
 dist-cache.



[jira] [Commented] (MAPREDUCE-5130) Add missing job config options to mapred-default.xml

2013-07-31 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725728#comment-13725728
 ] 

Hitesh Shah commented on MAPREDUCE-5130:


Regarding mapreduce.map/reduce.child.java.opts, aren't they to be deprecated in 
favor of mapreduce.[map|reduce].java.opts?



 Add missing job config options to mapred-default.xml
 

 Key: MAPREDUCE-5130
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5130
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.0.4-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-5130-1.patch, MAPREDUCE-5130-1.patch, 
 MAPREDUCE-5130-2.patch, MAPREDUCE-5130-3.patch, MAPREDUCE-5130-4.patch, 
 MAPREDUCE-5130-5.patch, MAPREDUCE-5130.patch, MAPREDUCE-5130.patch


 I came across that mapreduce.map.child.java.opts and 
 mapreduce.reduce.child.java.opts were missing in mapred-default.xml.  I'll do 
 a fuller sweep to see what else is missing before posting a patch.
 List so far:
 mapreduce.map/reduce.child.java.opts
 mapreduce.map/reduce.memory.mb
 mapreduce.job.jvm.numtasks
 mapreduce.input.lineinputformat.linespermap
 mapreduce.task.combine.progress.records
 mapreduce.map/reduce.env



[jira] [Commented] (MAPREDUCE-5130) Add missing job config options to mapred-default.xml

2013-07-31 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725902#comment-13725902
 ] 

Hitesh Shah commented on MAPREDUCE-5130:


[~sandyr] Was a bit thrown off by the jira description which mentions 
documenting *child.java.opts instead of the property names not using child.

 Add missing job config options to mapred-default.xml
 

 Key: MAPREDUCE-5130
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5130
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.0.4-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-5130-1.patch, MAPREDUCE-5130-1.patch, 
 MAPREDUCE-5130-2.patch, MAPREDUCE-5130-3.patch, MAPREDUCE-5130-4.patch, 
 MAPREDUCE-5130-5.patch, MAPREDUCE-5130.patch, MAPREDUCE-5130.patch


 I came across that mapreduce.map.child.java.opts and 
 mapreduce.reduce.child.java.opts were missing in mapred-default.xml.  I'll do 
 a fuller sweep to see what else is missing before posting a patch.
 List so far:
 mapreduce.map/reduce.child.java.opts
 mapreduce.map/reduce.memory.mb
 mapreduce.job.jvm.numtasks
 mapreduce.input.lineinputformat.linespermap
 mapreduce.task.combine.progress.records
 mapreduce.map/reduce.env



[jira] [Created] (MAPREDUCE-5416) hadoop-mapreduce-client-common depends on hadoop-yarn-server-common

2013-07-24 Thread Hitesh Shah (JIRA)
Hitesh Shah created MAPREDUCE-5416:
--

 Summary: hadoop-mapreduce-client-common depends on 
hadoop-yarn-server-common
 Key: MAPREDUCE-5416
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5416
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah


mapreduce-client-app and mapreduce-client-jobclient modules also depend on 
yarn-server-common but only in test scope.




[jira] [Commented] (MAPREDUCE-5408) CLONE - The logging level of the tasks should be configurable by the job

2013-07-23 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716470#comment-13716470
 ] 

Hitesh Shah commented on MAPREDUCE-5408:


Mostly looks good. A couple of minor comments:

  - DEFAULT_LOG_LEVEL could be renamed to DEFAULT_TASK_LOG_LEVEL and the type 
changed to a string. Having the type as Level is not buying much as it always 
ends up being converted to a string when used. If the intention is to retain 
the backport as is, this comment can be ignored for now. 

  - Level.toLevel() has an API which takes in a default value. In the event 
that the user has a typo, the current usage falls back to using DEBUG, whereas 
the default-based API can be made to fall back to INFO.
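
The fallback behaviour in the second comment can be sketched without a log4j dependency. In log4j 1.x, `Level.toLevel(String)` returns DEBUG for an unrecognized name, while `Level.toLevel(String, Level)` returns the supplied default. The stand-in below mimics the second form with a local enum; the class `TaskLogLevel`, its `Level` enum, and `parse` are hypothetical and are not the patch's code.

```java
import java.util.Locale;

// Stand-in for the log4j call; the enum and method here are illustrative assumptions.
class TaskLogLevel {
    enum Level { DEBUG, INFO, WARN, ERROR }

    /** Parses a level name, returning the fallback when the name is missing or unrecognized. */
    static Level parse(String name, Level fallback) {
        if (name == null) {
            return fallback;
        }
        try {
            return Level.valueOf(name.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            // A typo such as "INOF" lands here instead of silently becoming DEBUG.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("warn", Level.INFO));
        System.out.println(parse("INOF", Level.INFO));
    }
}
```

With a typo in the configured value, a job would then log at INFO rather than at the much noisier DEBUG level.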


 

 CLONE - The logging level of the tasks should be configurable by the job
 

 Key: MAPREDUCE-5408
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5408
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Arun C Murthy
 Fix For: 1.3.0

 Attachments: MAPREDUCE-336_branch1.patch


 It would be nice to be able to configure the logging level of the Task JVM's 
 separately from the server JVM's. Reducing logging substantially increases 
 performance and reduces the consumption of local disk on the task trackers.



[jira] [Resolved] (MAPREDUCE-5399) Large number of map tasks cause slow sort at reduce phase, invariant to amount of data to sort

2013-07-17 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved MAPREDUCE-5399.


Resolution: Invalid

If this is indeed an issue with Apache Hadoop-1.x, please feel free to file a 
jira with details specific to that. Issues with a particular vendor's distro 
should be redirected to the vendor in question. 

 Large number of map tasks cause slow sort at reduce phase, invariant to 
 amount of data to sort
 --

 Key: MAPREDUCE-5399
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5399
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Reporter: Stanislav Barton
Priority: Critical

 We are using hadoop-2.0.0+1357-1.cdh4.3.0.p0.21 with MRv1. After upgrade from 
 4.1.2 to 4.3.0, I have noticed some performance deterioration in our MR job 
 in the Reduce phase. The MR job has usually 10 000 map tasks (10 000 files on 
 input each about 100MB) and 6 000 reducers (one reducer per table region). I 
 was trying to figure out at which phase the slowdown appears (firstly I 
 suspected that the slow gathering of the 1 map output files is the 
 culprit) and found out that the problem is not reading the map output (the 
 shuffle) but the sort/merge phase that follows - the last and actual reduce 
 phase is fast. I have tried to up the io.sort.factor because I thought that 
 lots of small files are being merged on disk, but again upping that to 1000 
 didn't make any difference. I have then printed the stack trace and found out 
 that the problem is initialization of the 
 org.apache.hadoop.mapred.IFileInputStream namely the creation of the 
 Configuration object which is not propagated along from earlier context, see 
 the stack trace:
 Thread 13332: (state = IN_NATIVE)
  - java.io.UnixFileSystem.getBooleanAttributes0(java.io.File) @bci=0 
 (Compiled frame; information may be imprecise)
  - java.io.UnixFileSystem.getBooleanAttributes(java.io.File) @bci=2, line=228 
 (Compiled frame)
  - java.io.File.exists() @bci=20, line=733 (Compiled frame)
  - sun.misc.URLClassPath$FileLoader.getResource(java.lang.String, boolean) 
 @bci=136, line=999 (Compiled frame)
  - sun.misc.URLClassPath$FileLoader.findResource(java.lang.String, boolean) 
 @bci=3, line=966 (Compiled frame)
  - sun.misc.URLClassPath.findResource(java.lang.String, boolean) @bci=17, 
 line=146 (Compiled frame)
  - java.net.URLClassLoader$2.run() @bci=12, line=385 (Compiled frame)
  - 
 java.security.AccessController.doPrivileged(java.security.PrivilegedAction, 
 java.security.AccessControlContext) @bci=0 (Compiled frame)
  - java.net.URLClassLoader.findResource(java.lang.String) @bci=13, line=382 
 (Compiled frame)
  - java.lang.ClassLoader.getResource(java.lang.String) @bci=30, line=1002 
 (Compiled frame)
  - java.lang.ClassLoader.getResourceAsStream(java.lang.String) @bci=2, 
 line=1192 (Compiled frame)
  - javax.xml.parsers.SecuritySupport$4.run() @bci=26, line=96 (Compiled frame)
  - 
 java.security.AccessController.doPrivileged(java.security.PrivilegedAction) 
 @bci=0 (Compiled frame)
  - 
 javax.xml.parsers.SecuritySupport.getResourceAsStream(java.lang.ClassLoader, 
 java.lang.String) @bci=10, line=89 (Compiled frame)
  - javax.xml.parsers.FactoryFinder.findJarServiceProvider(java.lang.String) 
 @bci=38, line=250 (Interpreted frame)
  - javax.xml.parsers.FactoryFinder.find(java.lang.String, java.lang.String) 
 @bci=273, line=223 (Interpreted frame)
  - javax.xml.parsers.DocumentBuilderFactory.newInstance() @bci=4, line=123 
 (Compiled frame)
  - org.apache.hadoop.conf.Configuration.loadResource(java.util.Properties, 
 org.apache.hadoop.conf.Configuration$Resource, boolean) @bci=16, line=1890 
 (Compiled frame)
  - org.apache.hadoop.conf.Configuration.loadResources(java.util.Properties, 
 java.util.ArrayList, boolean) @bci=49, line=1867 (Compiled frame)
  - org.apache.hadoop.conf.Configuration.getProps() @bci=43, line=1785 
 (Compiled frame)
  - org.apache.hadoop.conf.Configuration.get(java.lang.String) @bci=35, 
 line=712 (Compiled frame)
  - org.apache.hadoop.conf.Configuration.getTrimmed(java.lang.String) @bci=2, 
 line=731 (Compiled frame)
  - org.apache.hadoop.conf.Configuration.getBoolean(java.lang.String, boolean) 
 @bci=2, line=1047 (Interpreted frame)
  - org.apache.hadoop.mapred.IFileInputStream.<init>(java.io.InputStream, 
 long, org.apache.hadoop.conf.Configuration) @bci=111, line=93 (Interpreted 
 frame)
  - 
 org.apache.hadoop.mapred.IFile$Reader.<init>(org.apache.hadoop.conf.Configuration,
  org.apache.hadoop.fs.FSDataInputStream, long, 
 org.apache.hadoop.io.compress.CompressionCodec, 
 org.apache.hadoop.mapred.Counters$Counter) @bci=60, line=303 (Interpreted 
 frame)
  - 
 

[jira] [Commented] (MAPREDUCE-5325) ClientRMProtocol.getAllApplications should accept ApplicationType as a parameter---MR changes

2013-07-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703862#comment-13703862
 ] 

Hitesh Shah commented on MAPREDUCE-5325:


Overall patch being reviewed as part of YARN-727. Will be committed together to 
ensure build does not break.

 ClientRMProtocol.getAllApplications should accept ApplicationType as a 
 parameter---MR changes
 -

 Key: MAPREDUCE-5325
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5325
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: MR-5325.1.patch, MR-5325.2.patch, MR-5325.3.patch, 
 MR-5325.4.patch, MR-5325.5.patch






[jira] [Resolved] (MAPREDUCE-5325) ClientRMProtocol.getAllApplications should accept ApplicationType as a parameter---MR changes

2013-07-09 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved MAPREDUCE-5325.


   Resolution: Fixed
Fix Version/s: 2.1.0-beta

Committed to trunk, branch-2, branch-2.1-beta and branch-2.1.0-beta. Thanks 
Xuan.

 ClientRMProtocol.getAllApplications should accept ApplicationType as a 
 parameter---MR changes
 -

 Key: MAPREDUCE-5325
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5325
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Xuan Gong
Assignee: Xuan Gong
 Fix For: 2.1.0-beta

 Attachments: MR-5325.1.patch, MR-5325.2.patch, MR-5325.3.patch, 
 MR-5325.4.patch, MR-5325.5.patch






[jira] [Commented] (MAPREDUCE-5325) ClientRMProtocol.getAllApplications should accept ApplicationType as a parameter---MR changes

2013-06-15 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13684234#comment-13684234
 ] 

Hitesh Shah commented on MAPREDUCE-5325:


@Xuan, will mapreduce jobs have different application types or only a single 
fixed type for all MR jobs? If the latter, the getAllJobs() should not be 
taking application type as an argument.

 ClientRMProtocol.getAllApplications should accept ApplicationType as a 
 parameter---MR changes
 -

 Key: MAPREDUCE-5325
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5325
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: MR-5325.1.patch






[jira] [Created] (MAPREDUCE-5324) Admin-provided user environment can be overridden by user provided values for the AM

2013-06-14 Thread Hitesh Shah (JIRA)
Hitesh Shah created MAPREDUCE-5324:
--

 Summary: Admin-provided user environment can be overridden by user 
provided values for the AM
 Key: MAPREDUCE-5324
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5324
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah
Priority: Minor


MRJobConfig.MR_AM_ADMIN_USER_ENV can be overridden by MRJobConfig.MR_AM_ENV.

Either the variable should be renamed to something along the lines of 
DEFAULT_ENV or the code fixed to have the correct overrides. Current 
documentation clearly states user env overrides admin env.
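
The override order at issue can be shown with plain maps. This is a sketch under assumptions: the class `AmEnvMerge`, its `merge` method, and the example variable are hypothetical, and the actual MR code builds the AM environment differently. The point is only that whichever map is applied last wins.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of override order; not the MR client code.
class AmEnvMerge {
    /** Entries in 'overrides' replace entries with the same key in 'base'. */
    static Map<String, String> merge(Map<String, String> base, Map<String, String> overrides) {
        Map<String, String> merged = new LinkedHashMap<>(base);
        merged.putAll(overrides);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> adminEnv = new HashMap<>();
        adminEnv.put("LD_LIBRARY_PATH", "/admin/lib");
        Map<String, String> userEnv = new HashMap<>();
        userEnv.put("LD_LIBRARY_PATH", "/user/lib");

        // Admin values applied first: the user value wins (the behaviour reported here).
        System.out.println(merge(adminEnv, userEnv).get("LD_LIBRARY_PATH"));
        // User values applied first: the admin value wins.
        System.out.println(merge(userEnv, adminEnv).get("LD_LIBRARY_PATH"));
    }
}
```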





[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662313#comment-13662313
 ] 

Hitesh Shah commented on MAPREDUCE-5191:


+1. Committing shortly. 

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
 MAPREDUCE-5191.patch


 Test times out on my machine after 5 seconds always on the below stack:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at 
 sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at 
 sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at 
 sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Updated] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-20 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5191:
---

   Resolution: Fixed
Fix Version/s: 3.0.0
 Release Note: Thanks Ivan. Committed to trunk.
   Status: Resolved  (was: Patch Available)

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Fix For: 3.0.0

 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
 MAPREDUCE-5191.patch


 The test always times out on my machine after 5 seconds, with the stack below:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Commented] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

2013-05-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662316#comment-13662316
 ] 

Hitesh Shah commented on MAPREDUCE-5095:


[~arpitagarwal] Should have reviewed the whole patch in context. Thanks for the 
clarification. +1. Will commit shortly. 

 TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
 -

 Key: MAPREDUCE-5095
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Open JDK7
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5095.patch

   Original Estimate: 1h
  Time Spent: 1h
  Remaining Estimate: 0h

 The test fails due to a test-order dependency that can be violated when running 
 with JDK 7.



[jira] [Resolved] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

2013-05-20 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved MAPREDUCE-5095.


  Resolution: Fixed
Release Note: Thanks Arpit. Committed to branch-1. 

 TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
 -

 Key: MAPREDUCE-5095
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Open JDK7
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5095.patch

   Original Estimate: 1h
  Time Spent: 1h
  Remaining Estimate: 0h

 The test fails due to a test-order dependency that can be violated when running 
 with JDK 7.



[jira] [Updated] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-20 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5191:
---

Release Note:   (was: Thanks Ivan. Committed to trunk.)

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Fix For: 3.0.0

 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
 MAPREDUCE-5191.patch


 The test always times out on my machine after 5 seconds, with the stack below:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Updated] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

2013-05-20 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5095:
---

Release Note:   (was: Thanks Arpit. Committed to branch-1. )

 TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
 -

 Key: MAPREDUCE-5095
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Open JDK7
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5095.patch

   Original Estimate: 1h
  Time Spent: 1h
  Remaining Estimate: 0h

 The test fails due to a test-order dependency that can be violated when running 
 with JDK 7.



[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662320#comment-13662320
 ] 

Hitesh Shah commented on MAPREDUCE-5191:


Thanks Ivan. Committed to trunk.

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Fix For: 3.0.0

 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
 MAPREDUCE-5191.patch


 The test always times out on my machine after 5 seconds, with the stack below:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Commented] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

2013-05-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662321#comment-13662321
 ] 

Hitesh Shah commented on MAPREDUCE-5095:


Thanks Arpit. Committed to branch-1. 

 TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
 -

 Key: MAPREDUCE-5095
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Open JDK7
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5095.patch

   Original Estimate: 1h
  Time Spent: 1h
  Remaining Estimate: 0h

 The test fails due to a test-order dependency that can be violated when running 
 with JDK 7.



[jira] [Updated] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-13 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5191:
---

Status: Open  (was: Patch Available)

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.patch


 The test always times out on my machine after 5 seconds, with the stack below:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Updated] (MAPREDUCE-5240) inside of FileOutputCommitter the initialized Credentials cache appears to be empty

2013-05-11 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5240:
---

Component/s: (was: mrv1)
 mrv2

 inside of FileOutputCommitter the initialized Credentials cache appears to be 
 empty
 ---

 Key: MAPREDUCE-5240
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5240
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.4-alpha
Reporter: Roman Shaposhnik
Priority: Blocker
 Fix For: 2.0.5-beta

 Attachments: LostCreds.java


 I am attaching a modified wordcount job that clearly demonstrates the problem 
 we've encountered in running Sqoop2 on YARN (BIGTOP-949).
 Here's what running it produces:
 {noformat}
 $ hadoop fs -mkdir in
 $ hadoop fs -put /etc/passwd in
 $ hadoop jar ./bug.jar org.myorg.LostCreds
 13/05/12 03:13:46 WARN mapred.JobConf: The variable mapred.child.ulimit is no longer used.
 numberOfSecretKeys: 1
 numberOfTokens: 0
 ..
 ..
 ..
 13/05/12 03:05:35 INFO mapreduce.Job: Job job_1368318686284_0013 failed with state FAILED due to: Job commit failed: java.io.IOException:
 numberOfSecretKeys: 0
 numberOfTokens: 0
   at org.myorg.LostCreds$DestroyerFileOutputCommitter.commitJob(LostCreds.java:43)
   at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:249)
   at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:212)
   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:619)
 {noformat}
 As you can see, even though we've clearly initialized the creds via:
 {noformat}
 job.getCredentials().addSecretKey(new Text(mykey), mysecret.getBytes());
 {noformat}
 It doesn't seem to appear later in the job.
 This is a pretty critical issue for Sqoop 2 since it appears to be DOA for 
 YARN in Hadoop 2.0.4-alpha



[jira] [Commented] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

2013-05-10 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654710#comment-13654710
 ] 

Hitesh Shah commented on MAPREDUCE-5095:


Should abortCalled also be changed to non-static? 
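
A minimal sketch of why a static flag can make tests order-dependent, assuming abortCalled is a flag that a test helper sets when abort() is invoked. The class names below are hypothetical stand-ins, not the classes touched by the patch.

```java
public class StaticFlagSketch {
    // Hypothetical stand-in for a test helper that records abort() calls in a static field.
    static class StaticRecorder {
        static boolean abortCalled = false;  // one copy shared by every instance
        void abort() { abortCalled = true; }
    }

    // Same helper with an instance field: each new object starts clean.
    static class InstanceRecorder {
        boolean abortCalled = false;
        void abort() { abortCalled = true; }
    }

    // Simulates "test A" calling abort() and "test B" then creating a fresh helper.
    static boolean staleWithStaticFlag() {
        new StaticRecorder().abort();
        new StaticRecorder();               // fresh helper for "test B"
        return StaticRecorder.abortCalled;  // still true from "test A"
    }

    static boolean staleWithInstanceFlag() {
        new InstanceRecorder().abort();
        InstanceRecorder fresh = new InstanceRecorder();
        return fresh.abortCalled;           // false: state did not leak
    }

    public static void main(String[] args) {
        System.out.println("static flag leaks:   " + staleWithStaticFlag());
        System.out.println("instance flag leaks: " + staleWithInstanceFlag());
    }
}
```

With the static field, whichever test runs second inherits the flag left by the first, so the outcome depends on the order in which the test methods execute.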


 TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
 -

 Key: MAPREDUCE-5095
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Open JDK7
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5095.patch

   Original Estimate: 1h
  Time Spent: 1h
  Remaining Estimate: 0h

 The test fails due to a test-order dependency that can be violated when running 
 with JDK 7.



[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-10 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654947#comment-13654947
 ] 

Hitesh Shah commented on MAPREDUCE-5191:


Would it make sense to avoid the temp file method in such a scenario to reduce 
the time the test takes to run? How about just creating a file under target/ 
with the name of the test as the filename? On a Mac, I saw this test take about 
1 second on average over multiple runs. 
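
A minimal sketch of that suggestion, assuming the test only needs a writable file with a predictable name. The stack trace in the issue shows File.createTempFile blocking in SecureRandom seed generation; a fixed name under target/ needs no random component. The helper name, the .xml suffix and the file contents are hypothetical, not taken from TestQueue.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class DeterministicTestFile {
    // Hypothetical helper: writes target/<testName>.xml instead of calling
    // File.createTempFile, so no random file name has to be generated.
    static File writeFile(String testName, String contents) throws IOException {
        File dir = new File("target");
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new IOException("Could not create " + dir.getAbsolutePath());
        }
        File f = new File(dir, testName + ".xml");
        // Overwrites any file left behind by a previous run of the same test.
        Files.write(f.toPath(), contents.getBytes(StandardCharsets.UTF_8));
        return f;
    }

    public static void main(String[] args) throws IOException {
        File f = writeFile("TestQueue-testQueue", "<queues/>");
        System.out.println("wrote " + f.getPath());
        f.delete();
    }
}
```

The trade-off is that two tests using the same name would overwrite each other's file, so the name has to be unique per test.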

 

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.patch


 Test times out on my machine after 5 seconds always on the below stack:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Commented] (MAPREDUCE-5179) Change TestHSWebServices to do string equal check on hadoop build version similar to YARN-605

2013-04-25 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642073#comment-13642073
 ] 

Hitesh Shah commented on MAPREDUCE-5179:


[~vinodkv], no others found. 

 Change TestHSWebServices to do string equal check on hadoop build version 
 similar to YARN-605
 -

 Key: MAPREDUCE-5179
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5179
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: MAPREDUCE-5179.1.patch






[jira] [Updated] (MAPREDUCE-5179) Change TestHSWebServices to do string equal check on hadoop build version similar to YARN-605

2013-04-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5179:
---

Status: Patch Available  (was: Open)

 Change TestHSWebServices to do string equal check on hadoop build version 
 similar to YARN-605
 -

 Key: MAPREDUCE-5179
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5179
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: MAPREDUCE-5179.1.patch






[jira] [Created] (MAPREDUCE-5178) Fix use of BuilderUtils#newApplicationReport as a result of YARN-577.

2013-04-24 Thread Hitesh Shah (JIRA)
Hitesh Shah created MAPREDUCE-5178:
--

 Summary: Fix use of BuilderUtils#newApplicationReport as a result 
of YARN-577.
 Key: MAPREDUCE-5178
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5178
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah






[jira] [Updated] (MAPREDUCE-5178) Fix use of BuilderUtils#newApplicationReport as a result of YARN-577.

2013-04-24 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5178:
---

Attachment: MAPREDUCE-5178.1.patch

 Fix use of BuilderUtils#newApplicationReport as a result of YARN-577.
 -

 Key: MAPREDUCE-5178
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5178
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah
 Attachments: MAPREDUCE-5178.1.patch






[jira] [Created] (MAPREDUCE-5179) Change TestHSWebServices to do string equal check on hadoop build version similar to YARN-605

2013-04-24 Thread Hitesh Shah (JIRA)
Hitesh Shah created MAPREDUCE-5179:
--

 Summary: Change TestHSWebServices to do string equal check on 
hadoop build version similar to YARN-605
 Key: MAPREDUCE-5179
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5179
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah






[jira] [Assigned] (MAPREDUCE-5178) Fix use of BuilderUtils#newApplicationReport as a result of YARN-577.

2013-04-24 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah reassigned MAPREDUCE-5178:
--

Assignee: Hitesh Shah

 Fix use of BuilderUtils#newApplicationReport as a result of YARN-577.
 -

 Key: MAPREDUCE-5178
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5178
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: MAPREDUCE-5178.1.patch






[jira] [Updated] (MAPREDUCE-5179) Change TestHSWebServices to do string equal check on hadoop build version similar to YARN-605

2013-04-24 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5179:
---

Attachment: MAPREDUCE-5179.1.patch

 Change TestHSWebServices to do string equal check on hadoop build version 
 similar to YARN-605
 -

 Key: MAPREDUCE-5179
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5179
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: MAPREDUCE-5179.1.patch






[jira] [Commented] (MAPREDUCE-5179) Change TestHSWebServices to do string equal check on hadoop build version similar to YARN-605

2013-04-24 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640672#comment-13640672
 ] 

Hitesh Shah commented on MAPREDUCE-5179:


Needs YARN-605 to be committed before this can go in.

 Change TestHSWebServices to do string equal check on hadoop build version 
 similar to YARN-605
 -

 Key: MAPREDUCE-5179
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5179
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: MAPREDUCE-5179.1.patch






[jira] [Commented] (MAPREDUCE-5178) Fix use of BuilderUtils#newApplicationReport as a result of YARN-577.

2013-04-24 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640674#comment-13640674
 ] 

Hitesh Shah commented on MAPREDUCE-5178:


Needs YARN-577 to go in before this can be committed.

 Fix use of BuilderUtils#newApplicationReport as a result of YARN-577.
 -

 Key: MAPREDUCE-5178
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5178
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: MAPREDUCE-5178.1.patch






[jira] [Created] (MAPREDUCE-5142) MR AM unregisters with state KILLED when an error causes dispatcher to shutdown

2013-04-10 Thread Hitesh Shah (JIRA)
Hitesh Shah created MAPREDUCE-5142:
--

 Summary: MR AM unregisters with state KILLED when an error causes 
dispatcher to shutdown
 Key: MAPREDUCE-5142
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5142
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah


RMCommunicator sets final state to KILLED if the job is in a running state and 
isSignalled is set to true. 

{code}
  } else if (jobImpl.getInternalState() == JobStateInternal.KILLED
      || (jobImpl.getInternalState() == JobStateInternal.RUNNING
          && isSignalled)) {
    finishState = FinalApplicationStatus.KILLED;
  } else if (jobImpl.getInternalState() == JobStateInternal.FAILED
      || jobImpl.getInternalState() == JobStateInternal.ERROR) {
    finishState = FinalApplicationStatus.FAILED;
{code}

This happens when for some reason, there is an exception in a state machine's 
event handler causing AsyncDispatcher to trigger a shutdown. In such a 
scenario, even though the AM actually failed due to some error, its actual 
state ends up as KILLED.
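
A minimal, self-contained sketch of the mapping described above. The enums are local stand-ins named after the Hadoop types in the snippet; the leading SUCCEEDED branch and the UNDEFINED fallback are assumptions, since the excerpt starts mid-chain. It also assumes that the dispatcher-triggered shutdown is what sets isSignalled.

```java
public class FinishStateSketch {
    // Local stand-ins for the Hadoop enums named in the excerpt.
    enum JobStateInternal { RUNNING, SUCCEEDED, FAILED, KILLED, ERROR }
    enum FinalApplicationStatus { UNDEFINED, SUCCEEDED, FAILED, KILLED }

    static FinalApplicationStatus finishState(JobStateInternal state, boolean isSignalled) {
        if (state == JobStateInternal.SUCCEEDED) {
            // Assumed first branch; it is not shown in the report.
            return FinalApplicationStatus.SUCCEEDED;
        } else if (state == JobStateInternal.KILLED
            || (state == JobStateInternal.RUNNING && isSignalled)) {
            return FinalApplicationStatus.KILLED;
        } else if (state == JobStateInternal.FAILED
            || state == JobStateInternal.ERROR) {
            return FinalApplicationStatus.FAILED;
        }
        return FinalApplicationStatus.UNDEFINED;  // assumed fallback
    }

    public static void main(String[] args) {
        // An event handler throws while the job is still RUNNING; the shutdown
        // that follows is assumed to set isSignalled, so the error is reported as KILLED.
        System.out.println(finishState(JobStateInternal.RUNNING, true));
        // Only a job that actually reached FAILED or ERROR is reported as FAILED.
        System.out.println(finishState(JobStateInternal.ERROR, false));
    }
}
```

Running the sketch prints KILLED and then FAILED: the job never leaves RUNNING when the handler throws, so the second branch is taken and the failure is masked.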






[jira] [Commented] (MAPREDUCE-5142) MR AM unregisters with state KILLED when an error causes dispatcher to shutdown

2013-04-10 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628187#comment-13628187
 ] 

Hitesh Shah commented on MAPREDUCE-5142:


@Jason, yes - definitely the same underlying issue. Addressing the CLC creation 
would address a part of the issue but currently all uncaught exceptions will 
end up with the AM in a KILLED state.   

 MR AM unregisters with state KILLED when an error causes dispatcher to 
 shutdown
 ---

 Key: MAPREDUCE-5142
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5142
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hitesh Shah

 RMCommunicator sets final state to KILLED if the job is in a running state 
 and isSignalled is set to true. 
 {code}
   } else if (jobImpl.getInternalState() == JobStateInternal.KILLED
       || (jobImpl.getInternalState() == JobStateInternal.RUNNING
           && isSignalled)) {
     finishState = FinalApplicationStatus.KILLED;
   } else if (jobImpl.getInternalState() == JobStateInternal.FAILED
       || jobImpl.getInternalState() == JobStateInternal.ERROR) {
     finishState = FinalApplicationStatus.FAILED;
 {code}
 This happens when for some reason, there is an exception in a state machine's 
 event handler causing AsyncDispatcher to trigger a shutdown. In such a 
 scenario, even though the AM actually failed due to some error, its actual 
 state ends up as KILLED.



[jira] [Updated] (MAPREDUCE-5142) MR AM unregisters with state KILLED when an error causes dispatcher to shutdown

2013-04-10 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5142:
---

Description: 
RMCommunicator sets final state to KILLED if the job is in a running state and 
isSignalled is set to true. 

{code}
  } else if (jobImpl.getInternalState() == JobStateInternal.KILLED
      || (jobImpl.getInternalState() == JobStateInternal.RUNNING
          && isSignalled)) {
    finishState = FinalApplicationStatus.KILLED;
  } else if (jobImpl.getInternalState() == JobStateInternal.FAILED
      || jobImpl.getInternalState() == JobStateInternal.ERROR) {
    finishState = FinalApplicationStatus.FAILED;
{code}

This happens when any uncaught exception in any event handler ends up causing 
the AsyncDispatcher to trigger a shutdown. In such a scenario, even though the 
AM actually failed due to some error, its actual state ends up as KILLED.




  was:
RMCommunicator sets final state to KILLED if the job is in a running state and 
isSignalled is set to true. 

{code}
  } else if (jobImpl.getInternalState() == JobStateInternal.KILLED
      || (jobImpl.getInternalState() == JobStateInternal.RUNNING
          && isSignalled)) {
    finishState = FinalApplicationStatus.KILLED;
  } else if (jobImpl.getInternalState() == JobStateInternal.FAILED
      || jobImpl.getInternalState() == JobStateInternal.ERROR) {
    finishState = FinalApplicationStatus.FAILED;
{code}

This happens when for some reason, there is an exception in a state machine's 
event handler causing AsyncDispatcher to trigger a shutdown. In such a 
scenario, even though the AM actually failed due to some error, its actual 
state ends up as KILLED.





 MR AM unregisters with state KILLED when an error causes dispatcher to 
 shutdown
 ---

 Key: MAPREDUCE-5142
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5142
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.3-alpha, 0.23.5
Reporter: Hitesh Shah

 RMCommunicator sets final state to KILLED if the job is in a running state 
 and isSignalled is set to true. 
 {code}
   } else if (jobImpl.getInternalState() == JobStateInternal.KILLED
       || (jobImpl.getInternalState() == JobStateInternal.RUNNING
           && isSignalled)) {
     finishState = FinalApplicationStatus.KILLED;
   } else if (jobImpl.getInternalState() == JobStateInternal.FAILED
       || jobImpl.getInternalState() == JobStateInternal.ERROR) {
     finishState = FinalApplicationStatus.FAILED;
 {code}
 This happens when any uncaught exception in any event handler ends up causing 
 the AsyncDispatcher to trigger a shutdown. In such a scenario, even though 
 the AM actually failed due to some error, its actual state ends up as KILLED.



[jira] [Updated] (MAPREDUCE-5142) MR AM unregisters with state KILLED when an error causes dispatcher to shutdown

2013-04-10 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5142:
---

Affects Version/s: 2.0.3-alpha
   0.23.5

 MR AM unregisters with state KILLED when an error causes dispatcher to 
 shutdown
 ---

 Key: MAPREDUCE-5142
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5142
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.3-alpha, 0.23.5
Reporter: Hitesh Shah

 RMCommunicator sets final state to KILLED if the job is in a running state 
 and isSignalled is set to true. 
 {code}
   } else if (jobImpl.getInternalState() == JobStateInternal.KILLED
       || (jobImpl.getInternalState() == JobStateInternal.RUNNING
           && isSignalled)) {
     finishState = FinalApplicationStatus.KILLED;
   } else if (jobImpl.getInternalState() == JobStateInternal.FAILED
       || jobImpl.getInternalState() == JobStateInternal.ERROR) {
     finishState = FinalApplicationStatus.FAILED;
 {code}
 This happens when for some reason, there is an exception in a state machine's 
 event handler causing AsyncDispatcher to trigger a shutdown. In such a 
 scenario, even though the AM actually failed due to some error, its actual 
 state ends up as KILLED.



[jira] [Commented] (MAPREDUCE-5083) MiniMRCluster should use a random component when creating an actual cluster

2013-04-04 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622520#comment-13622520
 ] 

Hitesh Shah commented on MAPREDUCE-5083:


@Stack, I will be committing shortly to branch-2.0.4.

 MiniMRCluster should use a random component when creating an actual cluster
 ---

 Key: MAPREDUCE-5083
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5083
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.3-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Fix For: 2.0.5-beta

 Attachments: MAPREDUCE-5083-branch2.txt, MAPREDUCE-5083-trunk_2.txt, 
 MAPREDUCE-5083-trunk.txt


 Currently all unit tests end up using the same work dir - which can affect 
 anyone trying to run parallel instances.



[jira] [Reopened] (MAPREDUCE-5083) MiniMRCluster should use a random component when creating an actual cluster

2013-04-04 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah reopened MAPREDUCE-5083:



 MiniMRCluster should use a random component when creating an actual cluster
 ---

 Key: MAPREDUCE-5083
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5083
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.3-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Fix For: 2.0.5-beta

 Attachments: MAPREDUCE-5083-branch2.txt, MAPREDUCE-5083-trunk_2.txt, 
 MAPREDUCE-5083-trunk.txt


 Currently all unit tests end up using the same work dir - which can affect 
 anyone trying to run parallel instances.



[jira] [Resolved] (MAPREDUCE-5083) MiniMRCluster should use a random component when creating an actual cluster

2013-04-04 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved MAPREDUCE-5083.


  Resolution: Fixed
   Fix Version/s: (was: 2.0.5-beta)
  2.0.4-alpha
Target Version/s:   (was: 2.0.5-beta)
Release Note: Committed to branch-2.0.4. Modified changes.txt in trunk, 
branch-2 and branch-2.0.4 accordingly.

 MiniMRCluster should use a random component when creating an actual cluster
 ---

 Key: MAPREDUCE-5083
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5083
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.3-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Fix For: 2.0.4-alpha

 Attachments: MAPREDUCE-5083-branch2.txt, MAPREDUCE-5083-trunk_2.txt, 
 MAPREDUCE-5083-trunk.txt


 Currently all unit tests end up using the same work dir - which can affect 
 anyone trying to run parallel instances.



[jira] [Commented] (MAPREDUCE-5083) MiniMRCluster should use a random component when creating an actual cluster

2013-04-04 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622537#comment-13622537
 ] 

Hitesh Shah commented on MAPREDUCE-5083:


Minor clarification - changes.txt was modified in branch-2 and branch-2.0.4 - 
trunk has some additional mayhem to clear out first.

 MiniMRCluster should use a random component when creating an actual cluster
 ---

 Key: MAPREDUCE-5083
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5083
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.3-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Fix For: 2.0.4-alpha

 Attachments: MAPREDUCE-5083-branch2.txt, MAPREDUCE-5083-trunk_2.txt, 
 MAPREDUCE-5083-trunk.txt


 Currently all unit tests end up using the same work dir - which can affect 
 anyone trying to run parallel instances.



[jira] [Updated] (MAPREDUCE-5088) MR Client gets an renewer token exception while Oozie is submitting a job

2013-04-03 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5088:
---

Fix Version/s: (was: 2.0.5-beta)
   (was: 3.0.0)

 MR Client gets an renewer token exception while Oozie is submitting a job
 -

 Key: MAPREDUCE-5088
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5088
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.3-alpha
Reporter: Roman Shaposhnik
Assignee: Daryn Sharp
Priority: Blocker
 Fix For: 2.0.4-alpha

 Attachments: HADOOP-9409.patch, HADOOP-9409.patch, 
 MAPREDUCE-5088.patch, MAPREDUCE-5088.patch, MAPREDUCE-5088.txt


 After the fix for HADOOP-9299 I'm now getting the following bizarre exception 
 in Oozie while trying to submit a job. This also seems to be KRB related:
 {noformat}
 2013-03-15 13:34:16,555  WARN ActionStartXCommand:542 - USER[hue] GROUP[-] 
 TOKEN[] APP[MapReduce] JOB[001-130315123130987-oozie-oozi-W] 
 ACTION[001-130315123130987-oozie-oozi-W@Sleep] Error starting action 
 [Sleep]. ErrorType [ERROR], ErrorCode [UninitializedMessageException], 
 Message [UninitializedMessageException: Message missing required fields: 
 renewer]
 org.apache.oozie.action.ActionExecutorException: 
 UninitializedMessageException: Message missing required fields: renewer
   at 
 org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:401)
   at 
 org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:738)
   at 
 org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:889)
   at 
 org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:211)
   at 
 org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:59)
   at org.apache.oozie.command.XCommand.call(XCommand.java:277)
   at 
 org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:326)
   at 
 org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:255)
   at 
 org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: com.google.protobuf.UninitializedMessageException: Message missing 
 required fields: renewer
   at 
 com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:605)
   at 
 org.apache.hadoop.security.proto.SecurityProtos$GetDelegationTokenRequestProto$Builder.build(SecurityProtos.java:973)
   at 
 org.apache.hadoop.mapreduce.v2.api.protocolrecords.impl.pb.GetDelegationTokenRequestPBImpl.mergeLocalToProto(GetDelegationTokenRequestPBImpl.java:84)
   at 
 org.apache.hadoop.mapreduce.v2.api.protocolrecords.impl.pb.GetDelegationTokenRequestPBImpl.getProto(GetDelegationTokenRequestPBImpl.java:67)
   at 
 org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getDelegationToken(MRClientProtocolPBClientImpl.java:200)
   at 
 org.apache.hadoop.mapred.YARNRunner.getDelegationTokenFromHS(YARNRunner.java:194)
   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:273)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1439)
   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215)
   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:581)
   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1439)
   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:576)
   at 
 org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:723)
   ... 10 more
 2013-03-15 13:34:16,555  WARN ActionStartXCommand:542 - USER[hue] GROUP[-] 
 TOKEN[] APP[MapReduce] JOB[001-13031512313
 {noformat}


[jira] [Commented] (MAPREDUCE-5088) MR Client gets an renewer token exception while Oozie is submitting a job

2013-04-03 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621482#comment-13621482
 ] 

Hitesh Shah commented on MAPREDUCE-5088:


Updated the fix version to 2.0.4-alpha, on the assumption that anything
committed to 2.0.4-alpha should also have been committed to trunk and branch-2.

 MR Client gets an renewer token exception while Oozie is submitting a job
 -

 Key: MAPREDUCE-5088
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5088
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.3-alpha
Reporter: Roman Shaposhnik
Assignee: Daryn Sharp
Priority: Blocker
 Fix For: 2.0.4-alpha

 Attachments: HADOOP-9409.patch, HADOOP-9409.patch, 
 MAPREDUCE-5088.patch, MAPREDUCE-5088.patch, MAPREDUCE-5088.txt


 After the fix for HADOOP-9299 I'm now getting the following bizarre exception 
 in Oozie while trying to submit a job. This also seems to be KRB related:
 {noformat}
 2013-03-15 13:34:16,555  WARN ActionStartXCommand:542 - USER[hue] GROUP[-] 
 TOKEN[] APP[MapReduce] JOB[001-130315123130987-oozie-oozi-W] 
 ACTION[001-130315123130987-oozie-oozi-W@Sleep] Error starting action 
 [Sleep]. ErrorType [ERROR], ErrorCode [UninitializedMessageException], 
 Message [UninitializedMessageException: Message missing required fields: 
 renewer]
 org.apache.oozie.action.ActionExecutorException: 
 UninitializedMessageException: Message missing required fields: renewer
   at 
 org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:401)
   at 
 org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:738)
   at 
 org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:889)
   at 
 org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:211)
   at 
 org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:59)
   at org.apache.oozie.command.XCommand.call(XCommand.java:277)
   at 
 org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:326)
   at 
 org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:255)
   at 
 org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: com.google.protobuf.UninitializedMessageException: Message missing 
 required fields: renewer
   at 
 com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:605)
   at 
 org.apache.hadoop.security.proto.SecurityProtos$GetDelegationTokenRequestProto$Builder.build(SecurityProtos.java:973)
   at 
 org.apache.hadoop.mapreduce.v2.api.protocolrecords.impl.pb.GetDelegationTokenRequestPBImpl.mergeLocalToProto(GetDelegationTokenRequestPBImpl.java:84)
   at 
 org.apache.hadoop.mapreduce.v2.api.protocolrecords.impl.pb.GetDelegationTokenRequestPBImpl.getProto(GetDelegationTokenRequestPBImpl.java:67)
   at 
 org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getDelegationToken(MRClientProtocolPBClientImpl.java:200)
   at 
 org.apache.hadoop.mapred.YARNRunner.getDelegationTokenFromHS(YARNRunner.java:194)
   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:273)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1439)
   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215)
   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:581)
   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1439)
   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:576)
   at 
 org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:723)
   ... 10 more
 2013-03-15 13:34:16,555  WARN ActionStartXCommand:542 - USER[hue] GROUP[-] 
 

[jira] [Commented] (MAPREDUCE-5083) MiniMRCluster should use a random component when creating an actual cluster

2013-03-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608252#comment-13608252
 ] 

Hitesh Shah commented on MAPREDUCE-5083:


{code}
String identifier = this.getClass().getName()
{code}

Should replace getClass().getName() with getClass().getSimpleName() to ensure
things don't break on Windows.
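
A small Java sketch of the suggestion, combined with the random component the
issue title asks for. The class and method names are hypothetical, and using a
UUID as the random part is an assumption, not what the attached patches do.

```java
import java.util.UUID;

final class WorkDirNameSketch {

  // Builds a per-instance identifier from the short class name plus a random
  // suffix, so parallel test runs do not share a work dir.
  static String identifier(Class<?> testClass) {
    // getSimpleName() drops the package prefix that getName() would include.
    return testClass.getSimpleName() + "-" + UUID.randomUUID();
  }

  public static void main(String[] args) {
    System.out.println(identifier(WorkDirNameSketch.class));
  }
}
```

The simple name keeps package dots out of the identifier and makes the
resulting directory name shorter than the fully qualified name would.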


 MiniMRCluster should use a random component when creating an actual cluster
 ---

 Key: MAPREDUCE-5083
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5083
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.3-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: MAPREDUCE-5083-branch2.txt, MAPREDUCE-5083-trunk.txt


 Currently all unit tests end up using the same work dir - which can affect 
 anyone trying to run parallel instances.



[jira] [Commented] (MAPREDUCE-5083) MiniMRCluster should use a random component when creating an actual cluster

2013-03-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608254#comment-13608254
 ] 

Hitesh Shah commented on MAPREDUCE-5083:


(The above change is needed in MiniMRCluster.java.)

 MiniMRCluster should use a random component when creating an actual cluster
 ---

 Key: MAPREDUCE-5083
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5083
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.3-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: MAPREDUCE-5083-branch2.txt, MAPREDUCE-5083-trunk.txt


 Currently all unit tests end up using the same work dir - which can affect 
 anyone trying to run parallel instances.



[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-15 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5066:
---

Affects Version/s: 2.0.3-alpha

 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic

 In current code, timeout is not specified when JobTracker (JobEndNotifier) 
 calls into the notification URL. When the given URL points to a server that 
 will not respond for a long time, job notifications are completely stuck 
 (given that we have only a single thread processing all notifications). We've 
 seen this cause noticeable delays in job execution in components that rely on 
 job end notifications (like Oozie workflows). 
 I propose we introduce a configurable timeout option and set a default to a 
 reasonably small value.
 If we want, we can also introduce a configurable number of workers processing 
 the notification queue (not sure if this is needed though at this point).
 I will prepare a patch soon. Please comment back.
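
 A minimal Java sketch of the proposed timeout, using only
 java.net.HttpURLConnection. The class name, method names and the single
 timeoutMs parameter are hypothetical; the real JobEndNotifier code and the
 name of any new configuration key are not shown in this issue.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class NotificationTimeoutSketch {

  // Opens (but does not yet connect) an HTTP connection with both the
  // connect and the read timeout set, so a silent server cannot block forever.
  static HttpURLConnection open(String url, int timeoutMs) {
    try {
      HttpURLConnection conn =
          (HttpURLConnection) URI.create(url).toURL().openConnection();
      conn.setConnectTimeout(timeoutMs);
      conn.setReadTimeout(timeoutMs);
      return conn;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Sends one notification; a timeout surfaces as an IOException and is
  // reported as a failed attempt instead of stalling the notification thread.
  static boolean notifyOnce(String url, int timeoutMs) {
    HttpURLConnection conn = open(url, timeoutMs);
    try {
      return conn.getResponseCode() == HttpURLConnection.HTTP_OK;
    } catch (IOException e) {
      return false;
    } finally {
      conn.disconnect();
    }
  }

  public static void main(String[] args) {
    HttpURLConnection conn = open("http://localhost/notify", 5000);
    System.out.println(conn.getConnectTimeout() + " " + conn.getReadTimeout());
  }
}
```

 With a single notification thread, bounding each call this way limits how
 long one unresponsive URL can delay the notifications queued behind it.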



[jira] [Commented] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-15 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603885#comment-13603885
 ] 

Hitesh Shah commented on MAPREDUCE-5066:


Job notification also exists in 2.x which may face the same set of issues. 

 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic

 In current code, timeout is not specified when JobTracker (JobEndNotifier) 
 calls into the notification URL. When the given URL points to a server that 
 will not respond for a long time, job notifications are completely stuck 
 (given that we have only a single thread processing all notifications). We've 
 seen this cause noticeable delays in job execution in components that rely on 
 job end notifications (like Oozie workflows). 
 I propose we introduce a configurable timeout option and set a default to a 
 reasonably small value.
 If we want, we can also introduce a configurable number of workers processing 
 the notification queue (not sure if this is needed though at this point).
 I will prepare a patch soon. Please comment back.



[jira] [Updated] (MAPREDUCE-4442) Accessing hadoop counters from a job is unreliable in yarn during AM process cleanup window

2013-02-20 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4442:
---

Labels: usability  (was: )

 Accessing hadoop counters from a job is unreliable in yarn during AM process 
 cleanup  window
 

 Key: MAPREDUCE-4442
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4442
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Rahul Jain
  Labels: usability
 Attachments: am_logs_counter_failure.html, 
 rsrc_mgr_logs_counter_failed.txt


 We found this issue during our tests moving from MapReduceV1 to MapReduceV2. 
 A few of our applications access job counters multiple times:
 a) After submission of the job, while the job is executing (works fine)
 b) Right after the job complete notification is received (works fine)
 c) A few seconds after the job complete notification (fails most of the time).
 The error snippet is as follows:
 {code}
 2012-07-12 19:12:29,039 WARN  [Client] Unexpected error reading responses on 
 connection Thread[IPC Client (1252749669) connection to 
 sjc1-ciq-ibm-grid07.carrieriq.com/10.202.50.187:47944 from hadoop,5,main]
 java.lang.NullPointerException
   at 
 org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:852)
   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:781)
 2012-07-12 19:12:29,044 INFO  [ClientServiceDelegate] Application state is 
 completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
 2012-07-12 19:12:29,132 INFO  [ClientServiceDelegate] Application state is 
 completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
 2012-07-12 19:12:29,216 ERROR [UserGroupInformation] 
 PriviledgedActionException as:hadoop (auth:SIMPLE) cause:java.io.IOException
 2012-07-12 19:12:29,216 WARN  [BaseOutputStageJob] getJobCounters: Unable to 
 retrieve counters. null
 java.io.IOException
   at 
 org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:315)
   at 
 org.apache.hadoop.mapred.ClientServiceDelegate.getJobCounters(ClientServiceDelegate.java:335)
   at 
 org.apache.hadoop.mapred.YARNRunner.getJobCounters(YARNRunner.java:470)
   at org.apache.hadoop.mapreduce.Job$8.run(Job.java:719)
   at org.apache.hadoop.mapreduce.Job$8.run(Job.java:716)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
   at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:716)
   at 
 org.apache.hadoop.mapred.JobClient$NetworkedJob.getCounters(JobClient.java:396)
 {code}
 The connection to 10.202.50.187:47944 is actually the connection to the AM; 
 it appears that we are connecting to the AM to get the counters for the 
 successful job and not yet to the history server.
  
 I'll attach the logs for AM and resource mgr separately, however no unusual 
 activity is seen in those.
 This makes me suspect that we have a race condition in the code trying to 
 access job counters when AM is finishing up and the job hasn't moved to 
 history server yet.



[jira] [Updated] (MAPREDUCE-4818) Easier identification of tasks that timeout during localization

2013-02-12 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4818:
---

Labels: usability  (was: )

 Easier identification of tasks that timeout during localization
 ---

 Key: MAPREDUCE-4818
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4818
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mr-am
Affects Versions: 0.23.3, 2.0.3-alpha
Reporter: Jason Lowe
  Labels: usability

 When a task is taking too long to localize and is killed by the AM due to 
 task timeout, the job UI/history is not very helpful.  The attempt simply 
 lists a diagnostic stating it was killed due to timeout, but there are no 
 logs for the attempt since it never actually got started.  There are log 
 messages on the NM that show the container never made it past localization by 
 the time it was killed, but users often do not have access to those logs.



[jira] [Updated] (MAPREDUCE-4794) DefaultSpeculator generates error messages on normal shutdown

2013-02-12 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4794:
---

Labels: usability  (was: )

 DefaultSpeculator generates error messages on normal shutdown
 -

 Key: MAPREDUCE-4794
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4794
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe
  Labels: usability
 Attachments: MAPREDUCE-4794.patch


 DefaultSpeculator can log the following error message on a normal shutdown of 
 the ApplicationMaster:
 {noformat}
 2012-11-13 01:35:31,841 ERROR [DefaultSpeculator background processing] 
 org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: Background 
 thread returning, interrupted : java.lang.InterruptedException
 {noformat}
 and in addition for some reason it logs the corresponding backtrace to stdout.
 Like the errors fixed in MAPREDUCE-4741, this error message in the syslog and 
 backtrace on stdout can be confusing to users as to whether the job really 
 succeeded.
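
 A minimal Java sketch of one way a background thread can tell an expected
 shutdown interrupt from an unexpected one. The class, its fields and the log
 strings are hypothetical; this is not the DefaultSpeculator code or the
 attached patch.

```java
final class QuietShutdownSketch implements Runnable {
  private volatile boolean stopped = false;
  private volatile String lastLog = "";
  private final Thread worker = new Thread(this, "background processing");

  void start() {
    worker.start();
  }

  // Marks the shutdown as intentional before interrupting the worker.
  void stop() {
    stopped = true;
    worker.interrupt();
    try {
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  String lastLog() {
    return lastLog;
  }

  @Override
  public void run() {
    while (!stopped && !Thread.currentThread().isInterrupted()) {
      try {
        Thread.sleep(1000); // stands in for one round of background work
      } catch (InterruptedException e) {
        // Only an interrupt that arrives without stop() is treated as an error.
        lastLog = stopped
            ? "INFO: background thread stopping"
            : "ERROR: background thread interrupted unexpectedly";
        return;
      }
    }
  }

  public static void main(String[] args) {
    QuietShutdownSketch sketch = new QuietShutdownSketch();
    sketch.start();
    sketch.stop();
    System.out.println(sketch.lastLog());
  }
}
```

 Setting the flag before calling interrupt() is what lets the worker classify
 the InterruptedException as part of a normal shutdown.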



[jira] [Updated] (MAPREDUCE-4704) TaskHeartbeatHandler misreports a ping timeout as a task timeout

2013-02-12 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4704:
---

Labels: usability  (was: )

 TaskHeartbeatHandler misreports a ping timeout as a task timeout
 

 Key: MAPREDUCE-4704
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4704
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.3
Reporter: Jason Lowe
Priority: Minor
  Labels: usability

 When a task fails to ping within the hardcoded ping timeout of 5 minutes, 
 TaskHeartbeatHandler logs a message reporting the wrong timeout value.  It 
 reports a timeout of mapreduce.task.timeout seconds rather than the 5 minute 
 ping timeout.
 This can confuse users who increase mapreduce.task.timeout and see the log 
 message showing the larger value, while the task continues to time out after 
 only 5 minutes.
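
 A minimal Java sketch of the reporting fix the description implies: the
 message uses whichever timeout actually fired. The class, method and message
 format are hypothetical; only the 5 minute ping timeout comes from the issue.

```java
final class TimeoutMessageSketch {
  // Hardcoded ping timeout of 5 minutes, as described in the issue.
  static final long PING_TIMEOUT_MS = 5 * 60 * 1000L;

  // Reports the limit that was exceeded: the ping timeout when the ping
  // check fired, otherwise the configured task timeout.
  static String describe(String attemptId, boolean pingTimedOut, long taskTimeoutMs) {
    long limitMs = pingTimedOut ? PING_TIMEOUT_MS : taskTimeoutMs;
    String kind = pingTimedOut ? "ping" : "task";
    return "AttemptID:" + attemptId + " timed out after "
        + (limitMs / 1000) + " secs (" + kind + " timeout)";
  }

  public static void main(String[] args) {
    // Task timeout raised to 20 minutes, but the ping check is what fired.
    System.out.println(describe("attempt_1", true, 20 * 60 * 1000L));
    System.out.println(describe("attempt_1", false, 20 * 60 * 1000L));
  }
}
```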

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4693) Historyserver should provide counters for failed tasks

2013-02-12 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4693:
---

Labels: usability  (was: )

 Historyserver should provide counters for failed tasks
 --

 Key: MAPREDUCE-4693
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4693
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mrv2
Reporter: Jason Lowe
  Labels: usability

 Currently the historyserver is not providing counters for failed tasks, even 
 though they are available via the AM as long as the job is still running.  
 Those counters are lost when the client needs to redirect to the 
 historyserver after the job completes.



[jira] [Updated] (MAPREDUCE-4692) Investigate and remove MR1 JTConfig and its constants use in the MR project on trunk

2013-02-12 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4692:
---

Labels: usability  (was: )

 Investigate and remove MR1 JTConfig and its constants use in the MR project 
 on trunk
 

 Key: MAPREDUCE-4692
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4692
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Reporter: Harsh J
Priority: Minor
  Labels: usability

 Filed on behalf of Robert from MAPREDUCE-3223
 {quote}
 Are there any JIRAs to deprecate the configs from where they reside in the 
 code? 
 ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/server/jobtracker/JTConfig.java
  for example. I know we cannot delete them out just yet, because MRV1 code 
 still exists and may be using it, but it would be good to mark all of those 
 configs as deprecated. So that we can delete them in trunk once the MRV1 code 
 is completely removed.
 {quote}



[jira] [Updated] (MAPREDUCE-4648) Diagnostics from AM are missing from job history

2013-02-12 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4648:
---

Labels: usability  (was: )

 Diagnostics from AM are missing from job history
 

 Key: MAPREDUCE-4648
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4648
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0, 2.0.0-alpha
Reporter: Jason Lowe
  Labels: usability

 When a job fails during setup or commit, any diagnostics from the MapReduce 
 ApplicationMaster are not available in the job history.  Currently the 
 diagnostics for the job are collected from the diagnostics of tasks run for 
 the job, but the AM has no corresponding task record in the job history.



[jira] [Commented] (MAPREDUCE-4997) Deprecate mapreduce.jobtracker.address

2013-02-11 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13576013#comment-13576013
 ] 

Hitesh Shah commented on MAPREDUCE-4997:


For users to transition from MR1 to MR2, we are talking about a full cluster 
change - replacing the JT/TTs with new daemons - RM/NMs. In this scenario, the 
users would be well aware of the change and would therefore have to make the 
necessary config changes too. It therefore seems preferable not to support 
mapreduce.jobtracker.address, so as not to give them the wrong impression that 
the RM is a JT replacement.

 Deprecate mapreduce.jobtracker.address
 --

 Key: MAPREDUCE-4997
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4997
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza

 mapreduce.jobtracker.address currently is not used, but users transitioning 
 from mr1 to mr2 may expect their previous job configs to work, so it should 
 be deprecated in favor of yarn.resourcemanager.address.
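
For readers unfamiliar with key deprecation, the mapping proposed in the 
description can be pictured roughly as below. This is only a sketch under 
stated assumptions: DeprecatedKeys, resolve and set are hypothetical names and 
not Hadoop APIs, and the address value is made up.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a tiny stand-in for a config deprecation table.
// DeprecatedKeys and its methods are hypothetical names, not Hadoop APIs.
final class DeprecatedKeys {
    private static final Map<String, String> REPLACEMENTS = new HashMap<>();

    static {
        // Old MR1 key mapped to the key proposed in the issue description.
        REPLACEMENTS.put("mapreduce.jobtracker.address",
                         "yarn.resourcemanager.address");
    }

    private DeprecatedKeys() {}

    /** Returns the replacement for a deprecated key, or the key itself. */
    static String resolve(String key) {
        String replacement = REPLACEMENTS.get(key);
        return replacement != null ? replacement : key;
    }

    /** Stores a value under the resolved (non-deprecated) key. */
    static void set(Map<String, String> conf, String key, String value) {
        conf.put(resolve(key), value);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // "rmhost:8032" is a made-up address used only for illustration.
        set(conf, "mapreduce.jobtracker.address", "rmhost:8032");
        System.out.println(conf);
    }
}
```

With a table like this, an old job config that still sets the MR1 key ends up 
populating the new key. That silent redirection is the behaviour the comment 
above argues against.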



[jira] [Commented] (MAPREDUCE-4994) -jt generic command line option does not work

2013-02-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575340#comment-13575340
 ] 

Hitesh Shah commented on MAPREDUCE-4994:


It makes sense to remove -jt as there is no notion of jobtracker anywhere in 
2.x 

 -jt generic command line option does not work
 -

 Key: MAPREDUCE-4994
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4994
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-4994-1.patch, MAPREDUCE-4994.patch


 hadoop jar myjar.jar MyDriver -fs file:/// -jt local input.txt output/
 should run a job using the local file system and the local job runner. 
 Instead it tries to connect to a jobtracker.
 hadoop jar myjar.jar MyDriver -fs file:/// -jt host:port input.txt output/
 does not use the given host/port
 This appears to be because Cluster#initialize, which loads the 
 ClientProtocol, contains no special handling for mapred.job.tracker.



[jira] [Commented] (MAPREDUCE-4994) -jt generic command line option does not work

2013-02-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575341#comment-13575341
 ] 

Hitesh Shah commented on MAPREDUCE-4994:


Also, is there any reason to make the hadoop command line YARN- and 
resourcemanager-aware? Ignoring what was supported in earlier versions, would 
it be preferable in future to have the local runner option be part of, say, a 
mapred command-line option? 

 -jt generic command line option does not work
 -

 Key: MAPREDUCE-4994
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4994
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-4994-1.patch, MAPREDUCE-4994.patch


 hadoop jar myjar.jar MyDriver -fs file:/// -jt local input.txt output/
 should run a job using the local file system and the local job runner. 
 Instead it tries to connect to a jobtracker.
 hadoop jar myjar.jar MyDriver -fs file:/// -jt host:port input.txt output/
 does not use the given host/port
 This appears to be because Cluster#initialize, which loads the 
 ClientProtocol, contains no special handling for mapred.job.tracker.



[jira] [Commented] (MAPREDUCE-4143) ApplicationMaster retry times should be set by Client

2013-02-05 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572070#comment-13572070
 ] 

Hitesh Shah commented on MAPREDUCE-4143:


Seems like a reasonable feature to have with a slight caveat that the retry 
limit should be bounded by the limit configured on the RM. A client should not 
be able to set retry limit to 1000 for example.
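
The bounding could be as simple as taking the smaller of the two values. A 
minimal sketch, assuming a hypothetical helper (AmRetryLimit is not a YARN 
class) and assuming that a non-positive client value means "not set":

```java
// Hypothetical helper, not part of YARN: clamps a client-requested AM retry
// count to the limit configured on the RM.
final class AmRetryLimit {
    private AmRetryLimit() {}

    /**
     * @param clientRequested retry count asked for by the client; values <= 0
     *                        are treated as "not set" (an assumption)
     * @param rmMax           upper bound configured on the RM, e.g. the value
     *                        of yarn.resourcemanager.am.max-retries
     * @return the retry count the RM would actually honour
     */
    static int effectiveRetries(int clientRequested, int rmMax) {
        if (rmMax < 1) {
            throw new IllegalArgumentException("RM limit must be >= 1: " + rmMax);
        }
        if (clientRequested <= 0) {
            return rmMax; // client did not ask for anything specific
        }
        return Math.min(clientRequested, rmMax);
    }

    public static void main(String[] args) {
        System.out.println(effectiveRetries(1000, 4)); // capped at the RM limit
        System.out.println(effectiveRetries(2, 4));    // within the limit, kept
    }
}
```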

 ApplicationMaster retry times should be set by Client
 -

 Key: MAPREDUCE-4143
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4143
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 0.23.1
 Environment: suse
Reporter: xieguiming
Priority: Minor

 We should support different clients or users having different 
 ApplicationMaster retry counts. In other words, 
 yarn.resourcemanager.am.max-retries should be settable by the client. 



[jira] [Updated] (MAPREDUCE-4837) Add webservices for jobtracker

2013-01-29 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4837:
---

Summary: Add webservices for jobtracker  (was: Add MR-AM web-services to 
branch-1)

 Add webservices for jobtracker
 --

 Key: MAPREDUCE-4837
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4837
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Attachments: MAPREDUCE-4837.patch


 Add MR-AM web-services to branch-1



[jira] [Updated] (MAPREDUCE-4837) Add webservices for jobtracker

2013-01-29 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-4837:
---

   Resolution: Fixed
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

Thanks Arun. Committed to branch-1.

 Add webservices for jobtracker
 --

 Key: MAPREDUCE-4837
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4837
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 1.2.0

 Attachments: MAPREDUCE-4837.patch


 Add MR-AM web-services to branch-1



[jira] [Commented] (MAPREDUCE-4837) Add MR-AM web-services to branch-1

2013-01-28 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564701#comment-13564701
 ] 

Hitesh Shah commented on MAPREDUCE-4837:


Code changes seem straightforward and look fine. Applied patch and verified 
format=json-based calls manually against branch-1. 

+1 assuming output of test-patch on branch-1 does not throw up any issues.



 Add MR-AM web-services to branch-1
 --

 Key: MAPREDUCE-4837
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4837
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Attachments: MAPREDUCE-4837.patch


 Add MR-AM web-services to branch-1



[jira] [Commented] (MAPREDUCE-4951) Container preemption interpreted as task failure

2013-01-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560174#comment-13560174
 ] 

Hitesh Shah commented on MAPREDUCE-4951:


@Jason, having the RM ask the AM to kill the container in case of preemption 
would likely not work as the AM cannot be trusted. Obviously, there could be a 
different approach where the RM informs the AM that a particular container will 
be preempted soon but the RM eventually would need to trigger a kill for that 
container after a certain delay if it is still up.


 

 Container preemption interpreted as task failure
 

 Key: MAPREDUCE-4951
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mr-am, mrv2
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-4951-1.patch, MAPREDUCE-4951.patch


 When YARN reports a completed container to the MR AM, it always interprets it 
 as a failure.  This can lead to a job failing because too many of its tasks 
 failed, when in fact they only failed because the scheduler preempted them.
 MR needs to recognize the special exit code value of -100 and interpret it as 
 a container being killed instead of a container failure.
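
The distinction asked for in the description might be sketched as follows. 
The class and enum names are hypothetical, the -100 value is the one quoted in 
the description, and treating exit status 0 as success is an assumption made 
here for completeness.

```java
// Sketch only: ContainerExitClassifier is a hypothetical name, not MR AM code.
final class ContainerExitClassifier {
    enum Outcome { SUCCEEDED, KILLED, FAILED }

    // Special exit code quoted in the issue description.
    static final int KILLED_BY_FRAMEWORK_EXIT_CODE = -100;

    private ContainerExitClassifier() {}

    static Outcome classify(int exitStatus) {
        if (exitStatus == 0) {
            return Outcome.SUCCEEDED; // assumption: 0 denotes a clean exit
        }
        if (exitStatus == KILLED_BY_FRAMEWORK_EXIT_CODE) {
            return Outcome.KILLED;    // e.g. preempted by the scheduler
        }
        return Outcome.FAILED;
    }

    /** Only genuine failures should count towards the job's failed-task limit. */
    static boolean countsTowardsFailureLimit(int exitStatus) {
        return classify(exitStatus) == Outcome.FAILED;
    }

    public static void main(String[] args) {
        System.out.println(classify(-100));
        System.out.println(countsTowardsFailureLimit(-100));
    }
}
```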



[jira] [Commented] (MAPREDUCE-4508) YARN needs to properly check the NM,AM memory properties in yarn-site.xml and mapred.xml and report errors accordingly.

2012-08-28 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443685#comment-13443685
 ] 

Hitesh Shah commented on MAPREDUCE-4508:


Filed MAPREDUCE-4578 for the issue mentioned in the previous comment.

 YARN needs to properly check the NM,AM memory properties in yarn-site.xml and 
 mapred.xml and report errors accordingly.
 ---

 Key: MAPREDUCE-4508
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4508
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Affects Versions: 2.0.0-alpha
 Environment: CentOs6.0, Hadoop2.0.0 Alpha
Reporter: Anil Gupta
  Labels: Map, Reduce, YARN

 Please refer to this discussion on the Hadoop Mailing list:
 http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/33110
 Summary:
 I was running YARN (Hadoop 2.0.0 Alpha) on an 8-datanode, 4-admin-node 
 Hadoop/HBase cluster. My datanodes had only 3.2GB of memory, so I 
 configured the yarn.nodemanager.resource.memory-mb property in yarn-site.xml 
 to 1200. After setting the property, if I run any YARN job then the 
 NodeManager won't be able to start any Map task since by default the 
 yarn.app.mapreduce.am.resource.mb property is set to 1500 MB in 
 mapred-site.xml. 
 Expected Behavior: NodeManager should give an error if 
 yarn.app.mapreduce.am.resource.mb >= yarn.nodemanager.resource.memory-mb.
 Please let me know if more information is required.



[jira] [Commented] (MAPREDUCE-4508) YARN needs to properly check the NM,AM memory properties in yarn-site.xml and mapred.xml and report errors accordingly.

2012-08-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439943#comment-13439943
 ] 

Hitesh Shah commented on MAPREDUCE-4508:


Sorry for the late reply. I don't believe that an error should be thrown when 
the AM requested memory is greater than the NM memory. I believe this is more 
of a configuration bug where the scheduler max allocation should be set such 
that an error is thrown for any AM requesting more than that. The RM should 
error out if the max scheduler allocation for a single container is less than 
the resources required to launch a new AM. Please let me know if you have seen 
something contrary to this. 

However, depending on how the scheduler max allocation is configured, there 
will be situations in heterogeneous clusters where certain nodes may be down, 
creating holes that cause requests for large amounts of resources/memory to 
wait for an indefinite amount of time. This is something which needs to be 
addressed separately and is a bit more tricky in terms of deciding when the 
allocation request cannot be fulfilled (both from a new AM perspective and for 
container requests by an AM). I will file a separate jira for that.  
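
To make the first point concrete, the RM-side check could look something like 
the sketch below. AmResourceCheck is a hypothetical name and this is not the 
RM's actual validation code; it only compares the AM request against the 
scheduler's per-container maximum.

```java
// Sketch of the check described in the comment above. AmResourceCheck is a
// hypothetical name; this is not the RM's actual validation code.
final class AmResourceCheck {
    private AmResourceCheck() {}

    /**
     * @param amResourceMb             memory the AM container asks for, e.g.
     *                                 yarn.app.mapreduce.am.resource.mb
     * @param schedulerMaxAllocationMb largest single-container allocation the
     *                                 scheduler is configured to grant
     * @throws IllegalArgumentException if the AM could never be allocated
     */
    static void validate(int amResourceMb, int schedulerMaxAllocationMb) {
        if (amResourceMb > schedulerMaxAllocationMb) {
            throw new IllegalArgumentException(
                "AM requests " + amResourceMb + " MB but the scheduler max "
                + "allocation is " + schedulerMaxAllocationMb + " MB");
        }
    }

    public static void main(String[] args) {
        validate(1024, 8192); // fits, no error
        try {
            validate(1500, 1200); // the numbers from the reporter's cluster
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Note that this uses the reporter's 1200 MB figure as if it were the scheduler 
maximum, purely for illustration; in the report it was the NM memory.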



 YARN needs to properly check the NM,AM memory properties in yarn-site.xml and 
 mapred.xml and report errors accordingly.
 ---

 Key: MAPREDUCE-4508
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4508
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Affects Versions: 2.0.0-alpha
 Environment: CentOs6.0, Hadoop2.0.0 Alpha
Reporter: Anil Gupta
  Labels: Map, Reduce, YARN

 Please refer to this discussion on the Hadoop Mailing list:
 http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/33110
 Summary:
 I was running YARN (Hadoop 2.0.0 Alpha) on an 8-datanode, 4-admin-node 
 Hadoop/HBase cluster. My datanodes had only 3.2GB of memory, so I 
 configured the yarn.nodemanager.resource.memory-mb property in yarn-site.xml 
 to 1200. After setting the property, if I run any YARN job then the 
 NodeManager won't be able to start any Map task since by default the 
 yarn.app.mapreduce.am.resource.mb property is set to 1500 MB in 
 mapred-site.xml. 
 Expected Behavior: NodeManager should give an error if 
 yarn.app.mapreduce.am.resource.mb >= yarn.nodemanager.resource.memory-mb.
 Please let me know if more information is required.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (MAPREDUCE-4578) Handle container requests that request more resources than available in the cluster

2012-08-22 Thread Hitesh Shah (JIRA)
Hitesh Shah created MAPREDUCE-4578:
--

 Summary: Handle container requests that request more resources 
than available in the cluster
 Key: MAPREDUCE-4578
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4578
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.0-alpha, 0.23.0
Reporter: Hitesh Shah


In heterogeneous clusters, a simple check at the scheduler to verify that the 
allocation request is within the max allocatable range is not enough. 

If there are large nodes in the cluster which are not available, there may be 
situations where some allocation requests will never be fulfilled. We need an 
approach to decide when to invalidate such requests. For application 
submissions, there will need to be a feedback loop for applications that could 
not be launched. For running AMs, AllocationResponse may need to be augmented 
with information for invalidated/cancelled container requests.
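
One naive way to spot such requests is sketched below. This is not a proposed 
design, all names are hypothetical, and it ignores timing entirely: a real 
approach would presumably wait some grace period in case the large nodes come 
back.

```java
import java.util.Arrays;
import java.util.List;

// Naive illustration only; UnsatisfiableRequestCheck is a hypothetical name.
final class UnsatisfiableRequestCheck {
    private UnsatisfiableRequestCheck() {}

    /** True if at least one currently live node is big enough for the request. */
    static boolean fitsOnSomeLiveNode(int requestMb, List<Integer> liveNodeCapacitiesMb) {
        for (int capacityMb : liveNodeCapacitiesMb) {
            if (capacityMb >= requestMb) {
                return true;
            }
        }
        return false;
    }

    /**
     * A request is a candidate for invalidation when it passes the simple
     * max-allocation check but no live node could ever host it.
     */
    static boolean shouldInvalidate(int requestMb, int maxAllocationMb,
                                    List<Integer> liveNodeCapacitiesMb) {
        return requestMb <= maxAllocationMb
            && !fitsOnSomeLiveNode(requestMb, liveNodeCapacitiesMb);
    }

    public static void main(String[] args) {
        // Two small nodes are up; the large nodes are assumed to be down.
        List<Integer> live = Arrays.asList(2048, 4096);
        System.out.println(shouldInvalidate(8192, 16384, live));
        System.out.println(shouldInvalidate(1024, 16384, live));
    }
}
```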





[jira] [Commented] (MAPREDUCE-4508) YARN needs to properly check the NM,AM memory properties in yarn-site.xml and mapred.xml and report errors accordingly.

2012-08-02 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427751#comment-13427751
 ] 

Hitesh Shah commented on MAPREDUCE-4508:


Seems like a dup of MAPREDUCE-3796

 YARN needs to properly check the NM,AM memory properties in yarn-site.xml and 
 mapred.xml and report errors accordingly.
 ---

 Key: MAPREDUCE-4508
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4508
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Affects Versions: 2.0.0-alpha
 Environment: CentOs6.0, Hadoop2.0.0 Alpha
Reporter: Anil Gupta
  Labels: Map, Reduce, YARN

 Please refer to this discussion on the Hadoop Mailing list:
 http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/33110
 Summary:
 I was running YARN (Hadoop 2.0.0 Alpha) on an 8-datanode, 4-admin-node 
 Hadoop/HBase cluster. My datanodes had only 3.2GB of memory, so I 
 configured the yarn.nodemanager.resource.memory-mb property in yarn-site.xml 
 to 1200. After setting the property, if I run any YARN job then the 
 NodeManager won't be able to start any Map task since by default the 
 yarn.app.mapreduce.am.resource.mb property is set to 1500 MB in 
 mapred-site.xml. 
 Expected Behavior: NodeManager should give an error if 
 yarn.app.mapreduce.am.resource.mb >= yarn.nodemanager.resource.memory-mb.
 Please let me know if more information is required.





[jira] [Updated] (MAPREDUCE-2719) MR-279: Write a shell command application

2011-09-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-2719:
---

Attachment: MR-2179.2.patch

Updated patch with a very simple integration test that deploys and runs the ds 
app master on the miniyarncluster. 

 MR-279: Write a shell command application
 -

 Key: MAPREDUCE-2719
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2719
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv2
Reporter: Sharad Agarwal
Assignee: Hitesh Shah
 Attachments: MR-2179.1.patch, MR-2179.2.patch, mr-2719.wip.patch


 With nextgen hadoop (mrv2), it is simple to write non-MR applications. Write 
 an ApplicationMaster (also corresponding simple client), to submit and run a 
 shell command application in the cluster.





[jira] [Updated] (MAPREDUCE-2719) MR-279: Write a shell command application

2011-09-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-2719:
---

Status: Patch Available  (was: Open)

 MR-279: Write a shell command application
 -

 Key: MAPREDUCE-2719
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2719
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv2
Reporter: Sharad Agarwal
Assignee: Hitesh Shah
 Attachments: MR-2179.1.patch, MR-2179.2.patch, mr-2719.wip.patch


 With nextgen hadoop (mrv2), it is simple to write non-MR applications. Write 
 an ApplicationMaster (also corresponding simple client), to submit and run a 
 shell command application in the cluster.





[jira] [Updated] (MAPREDUCE-2719) MR-279: Write a shell command application

2011-09-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-2719:
---

Status: Open  (was: Patch Available)

 MR-279: Write a shell command application
 -

 Key: MAPREDUCE-2719
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2719
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv2
Reporter: Sharad Agarwal
Assignee: Hitesh Shah
 Fix For: 0.23.0

 Attachments: MR-2179.1.patch, MR-2179.2.patch, mr-2719.wip.patch


 With nextgen hadoop (mrv2), it is simple to write non-MR applications. Write 
 an ApplicationMaster (also corresponding simple client), to submit and run a 
 shell command application in the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (MAPREDUCE-2719) MR-279: Write a shell command application

2011-09-22 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-2719:
---

Attachment: MR-2179.1.patch

Attaching code with relevant pom files to create a new module. 

Current structure is 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/
 



 MR-279: Write a shell command application
 -

 Key: MAPREDUCE-2719
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2719
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv2
Reporter: Sharad Agarwal
Assignee: Hitesh Shah
 Attachments: MR-2179.1.patch, mr-2719.wip.patch


 With nextgen hadoop (mrv2), it is simple to write non-MR applications. Write 
 an ApplicationMaster (also corresponding simple client), to submit and run a 
 shell command application in the cluster.





[jira] [Commented] (MAPREDUCE-2719) MR-279: Write a shell command application

2011-09-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113067#comment-13113067
 ] 

Hitesh Shah commented on MAPREDUCE-2719:


Tests still pending. 

 MR-279: Write a shell command application
 -

 Key: MAPREDUCE-2719
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2719
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv2
Reporter: Sharad Agarwal
Assignee: Hitesh Shah
 Attachments: MR-2179.1.patch, mr-2719.wip.patch


 With nextgen hadoop (mrv2), it is simple to write non-MR applications. Write 
 an ApplicationMaster (also corresponding simple client), to submit and run a 
 shell command application in the cluster.





[jira] [Commented] (MAPREDUCE-3067) Container exit status not set properly to launched process's exit code on successful completion of process

2011-09-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112329#comment-13112329
 ] 

Hitesh Shah commented on MAPREDUCE-3067:


Possible patch for addressing part of the issue. 

--- a/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
+++ b/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
@@ -554,6 +554,9 @@ public class ContainerImpl implements Container {
   static class ExitedWithSuccessTransition extends ContainerTransition {
     @Override
     public void transition(ContainerImpl container, ContainerEvent event) {
+      // Set exit code to 0 to denote success
+      container.exitCode = 0;
+
       // TODO: Add containerWorkDir to the deletion service.
 
       // Inform the localizer to decrement reference counts and cleanup


 Container exit status not set properly to launched process's exit code on 
 successful completion of process
 --

 Key: MAPREDUCE-3067
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3067
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Hitesh Shah
 Fix For: 0.23.0


 When testing the distributed shell sample app master, the container exit 
 status was being returned incorrectly. 
 11/09/21 11:32:58 INFO DistributedShell.ApplicationMaster: Got container 
 status for containerID= container_1316629955324_0001_01_02, 
 state=COMPLETE, exitStatus=-1000, diagnostics=





[jira] [Created] (MAPREDUCE-3067) Container exit status not set properly to launched process's exit code on successful completion of process

2011-09-21 Thread Hitesh Shah (JIRA)
Container exit status not set properly to launched process's exit code on 
successful completion of process
--

 Key: MAPREDUCE-3067
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3067
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Hitesh Shah
 Fix For: 0.23.0


When testing the distributed shell sample app master, the container exit status 
was being returned incorrectly. 

11/09/21 11:32:58 INFO DistributedShell.ApplicationMaster: Got container status 
for containerID= container_1316629955324_0001_01_02, state=COMPLETE, 
exitStatus=-1000, diagnostics=





[jira] [Commented] (MAPREDUCE-3067) Container exit status not set properly to launched process's exit code on successful completion of process

2011-09-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112331#comment-13112331
 ] 

Hitesh Shah commented on MAPREDUCE-3067:


Second aspect to this is the exit status is checked on completion of map or 
reduce tasks.

 Container exit status not set properly to launched process's exit code on 
 successful completion of process
 --

 Key: MAPREDUCE-3067
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3067
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Hitesh Shah
 Fix For: 0.23.0


 When testing the distributed shell sample app master, the container exit 
 status was being returned incorrectly. 
 11/09/21 11:32:58 INFO DistributedShell.ApplicationMaster: Got container 
 status for containerID= container_1316629955324_0001_01_02, 
 state=COMPLETE, exitStatus=-1000, diagnostics=





[jira] [Created] (MAPREDUCE-3055) Simplify parameter passing to Application Master from Client. SImplify approach to pass info such appId, ClusterTimestamp and failcount required by App Master.

2011-09-20 Thread Hitesh Shah (JIRA)
Simplify parameter passing to Application Master from Client. SImplify approach 
to pass info such  appId, ClusterTimestamp and failcount required by App Master.


 Key: MAPREDUCE-3055
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3055
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Hitesh Shah
Priority: Minor
 Fix For: 0.23.0


The Application Master needs the application attempt id to register with the 
Applications Manager. To create an appAttemptId object, the appId object (which 
needs the cluster timestamp and app id) and failCount are needed.

Currently, all clients need to pass in the appId, cluster timestamp and fail 
count to the app master for the required objects to be constructed. 

We could look at simplifying this by either providing placeholders whose 
values would be filled in by the app master launcher or setting the values in 
the environment (although that requires a set of whitelisted env vars that can 
only be set by the yarn framework). 
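
As a rough illustration of the environment option, an app master could rebuild 
the three values as below. The environment variable names and the AttemptInfo 
class are hypothetical, invented for this sketch; no such whitelist exists in 
the text above.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: AttemptInfo and the variable names below are hypothetical.
final class AttemptInfo {
    static final String ENV_APP_ID = "APP_ID";
    static final String ENV_CLUSTER_TIMESTAMP = "APP_CLUSTER_TIMESTAMP";
    static final String ENV_FAIL_COUNT = "APP_FAIL_COUNT";

    final int appId;
    final long clusterTimestamp;
    final int failCount;

    private AttemptInfo(int appId, long clusterTimestamp, int failCount) {
        this.appId = appId;
        this.clusterTimestamp = clusterTimestamp;
        this.failCount = failCount;
    }

    /** Builds the attempt info from an environment map such as System.getenv(). */
    static AttemptInfo fromEnv(Map<String, String> env) {
        return new AttemptInfo(
            Integer.parseInt(require(env, ENV_APP_ID)),
            Long.parseLong(require(env, ENV_CLUSTER_TIMESTAMP)),
            Integer.parseInt(require(env, ENV_FAIL_COUNT)));
    }

    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null) {
            throw new IllegalStateException("Missing env var: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        // A real app master would pass System.getenv(); a map keeps this runnable.
        Map<String, String> env = new HashMap<>();
        env.put(ENV_APP_ID, "1");
        env.put(ENV_CLUSTER_TIMESTAMP, "1316629955324");
        env.put(ENV_FAIL_COUNT, "0");
        AttemptInfo info = fromEnv(env);
        System.out.println(info.clusterTimestamp + " " + info.appId + " " + info.failCount);
    }
}
```

The client then no longer has to put these values on the app master's command 
line; the launcher sets them once for every application.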





[jira] [Updated] (MAPREDUCE-3004) sort example fails in shuffle/reduce stage as it assumes a local job by default

2011-09-19 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-3004:
---

Attachment: mapreduce-3004-branch-0.23.3.patch

Addressing code review comments. Added 2 constants to MRConfig to reference 
classic and yarn framework names instead of using hardcoded strings. 
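
The patch itself is attached rather than inlined, so the following is only a 
guess at the shape of the change, with hypothetical names and values: the 
framework names become named constants and callers compare against those 
instead of string literals.

```java
// Illustration of the idea only; the names and values here are assumptions
// and are not copied from the attached patch.
final class FrameworkNames {
    static final String CLASSIC_FRAMEWORK_NAME = "classic";
    static final String YARN_FRAMEWORK_NAME = "yarn";

    private FrameworkNames() {}

    /** Null-safe comparison against the yarn framework name. */
    static boolean isYarn(String configuredFramework) {
        return YARN_FRAMEWORK_NAME.equals(configuredFramework);
    }

    public static void main(String[] args) {
        System.out.println(isYarn(YARN_FRAMEWORK_NAME));
        System.out.println(isYarn(CLASSIC_FRAMEWORK_NAME));
    }
}
```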

 sort example fails in shuffle/reduce stage as it assumes a local job by 
 default 
 

 Key: MAPREDUCE-3004
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3004
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Hitesh Shah
Assignee: Hitesh Shah
Priority: Minor
 Fix For: 0.23.0

 Attachments: mapreduce-3004-branch-0.23.2.patch, 
 mapreduce-3004-branch-0.23.3.patch, mapreduce-3004-branch-0.23.patch


 Log trace when running sort on a single node setup:
 11/09/13 17:01:06 INFO mapreduce.Job:  map 100% reduce 0%
 11/09/13 17:01:10 INFO mapreduce.Job: Task Id : 
 attempt_1315949787252_0009_r_00_0, Status : FAILED
 java.lang.UnsupportedOperationException: Incompatible with LocalRunner
   at 
 org.apache.hadoop.mapred.YarnOutputFiles.getInputFile(YarnOutputFiles.java:200)
   at org.apache.hadoop.mapred.ReduceTask.getMapFiles(ReduceTask.java:183)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:365)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:148)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:143)





[jira] [Updated] (MAPREDUCE-3004) sort example fails in shuffle/reduce stage as it assumes a local job by default

2011-09-19 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-3004:
---

Status: Patch Available  (was: Open)

 sort example fails in shuffle/reduce stage as it assumes a local job by 
 default 
 

 Key: MAPREDUCE-3004
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3004
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Hitesh Shah
Assignee: Hitesh Shah
Priority: Minor
 Fix For: 0.23.0

 Attachments: mapreduce-3004-branch-0.23.2.patch, 
 mapreduce-3004-branch-0.23.3.patch, mapreduce-3004-branch-0.23.patch


 Log trace when running sort on a single node setup:
 11/09/13 17:01:06 INFO mapreduce.Job:  map 100% reduce 0%
 11/09/13 17:01:10 INFO mapreduce.Job: Task Id : 
 attempt_1315949787252_0009_r_00_0, Status : FAILED
 java.lang.UnsupportedOperationException: Incompatible with LocalRunner
   at 
 org.apache.hadoop.mapred.YarnOutputFiles.getInputFile(YarnOutputFiles.java:200)
   at org.apache.hadoop.mapred.ReduceTask.getMapFiles(ReduceTask.java:183)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:365)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:148)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:143)





[jira] [Created] (MAPREDUCE-3041) Enhance YARN Client-RM protocol to provide access to information such as cluster's Min/Max Resource capabilities similar to that of AM-RM protocol

2011-09-19 Thread Hitesh Shah (JIRA)
Enhance YARN Client-RM protocol to provide access to information such as 
cluster's Min/Max Resource capabilities similar to that of AM-RM protocol
--

 Key: MAPREDUCE-3041
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3041
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Fix For: 0.23.0


To request a container to launch an application master, the client needs to 
know the min/max resource capabilities so as to be able to make a proper 
resource request when submitting a new application.






[jira] [Commented] (MAPREDUCE-3041) Enhance YARN Client-RM protocol to provide access to information such as cluster's Min/Max Resource capabilities similar to that of AM-RM protocol

2011-09-19 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108108#comment-13108108
 ] 

Hitesh Shah commented on MAPREDUCE-3041:


One option is to modify GetNewApplicationIdResponse to return the min/max 
capabilities along with the new application id such that the client can use the 
information to submit the application request to RM/ASM.
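
A minimal sketch of that option, assuming memory in MB is the resource being 
bounded. The class NewApplicationResponseSketch, its fields, and clampAmMemory 
are hypothetical and do not reflect the actual GetNewApplicationIdResponse API.

```java
/** Hypothetical response shape; not the actual YARN GetNewApplicationIdResponse API. */
final class NewApplicationResponseSketch {
    final int applicationId;
    final int minMemoryMb; // assumed unit: MB
    final int maxMemoryMb; // assumed unit: MB

    NewApplicationResponseSketch(int applicationId, int minMemoryMb, int maxMemoryMb) {
        if (minMemoryMb > maxMemoryMb) {
            throw new IllegalArgumentException("min capability exceeds max capability");
        }
        this.applicationId = applicationId;
        this.minMemoryMb = minMemoryMb;
        this.maxMemoryMb = maxMemoryMb;
    }

    /** Clamps the memory the client would like for its AM container into [min, max]. */
    int clampAmMemory(int desiredMb) {
        return Math.max(minMemoryMb, Math.min(maxMemoryMb, desiredMb));
    }
}
```

With the bounds delivered alongside the new application id, the client can size 
its application master request before submitting, without a second round trip.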

 Enhance YARN Client-RM protocol to provide access to information such as 
 cluster's Min/Max Resource capabilities similar to that of AM-RM protocol
 --

 Key: MAPREDUCE-3041
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3041
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Fix For: 0.23.0


 To request a container to launch an application master, the client needs to 
 know the min/max resource capabilities so as to be able to make a proper 
 resource request when submitting a new application.





[jira] [Assigned] (MAPREDUCE-3033) JobClient requires mapreduce.jobtracker.address config even when mapreduce.framework.name is set to yarn

2011-09-19 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah reassigned MAPREDUCE-3033:
--

Assignee: Hitesh Shah

 JobClient requires mapreduce.jobtracker.address config even when 
 mapreduce.framework.name is set to yarn
 

 Key: MAPREDUCE-3033
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3033
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 0.23.0
Reporter: Karam Singh
Assignee: Hitesh Shah
 Fix For: 0.23.0


 If mapreduce.jobtracker.address is not set in mapred-site.xml and 
 mapreduce.framework.name is set to yarn, job submission fails.
 Tried to submit a sleep job with 1 map task. Job submission failed with the 
 following exception:
 {code}
 11/09/19 13:19:20 INFO ipc.YarnRPC: Creating YarnRPC for 
 org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
 11/09/19 13:19:20 INFO mapred.ResourceMgrDelegate: Connecting to 
 ResourceManager at RMHost:8040
 11/09/19 13:19:20 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy 
 for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
 11/09/19 13:19:20 INFO mapred.ResourceMgrDelegate: Connected to 
 ResourceManager at RMHost:8040
 11/09/19 13:19:21 INFO mapred.ResourceMgrDelegate: DEBUG --- 
 getStagingAreaDir: dir=/user/username/.staging
 11/09/19 13:19:21 INFO mapreduce.JobSubmitter: Cleaning up the staging area 
 /user/username/.staging/job_1316435926198_0004
 java.lang.RuntimeException: Not a host:port pair: local
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:148)
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:132)
   at org.apache.hadoop.mapred.Master.getMasterAddress(Master.java:42)
   at org.apache.hadoop.mapred.Master.getMasterPrincipal(Master.java:47)
   at 
 org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:104)
   at 
 org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:90)
   at 
 org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:83)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:346)
   at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1072)
   at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1069)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1069)
   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1089)
   at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:262)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
   at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
   at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
   at 
 org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:111)
   at 
 org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:118)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:189)
 {code}
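
 The trace above suggests the client parses the jobtracker address even when 
 yarn is selected. A minimal sketch of one possible guard follows; it is not the 
 fix that was committed. The class MasterAddressSketch and its method are 
 hypothetical, and the "local" default is inferred from the exception message.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical guard; not the actual org.apache.hadoop.mapred.Master code. */
final class MasterAddressSketch {
    static final String FRAMEWORK_NAME = "mapreduce.framework.name";
    static final String JOBTRACKER_ADDRESS = "mapreduce.jobtracker.address";

    private MasterAddressSketch() {}

    /** Returns the jobtracker host:port, or empty when the framework is yarn. */
    static Optional<String> jobTrackerAddress(Map<String, String> conf) {
        if ("yarn".equals(conf.get(FRAMEWORK_NAME))) {
            // Under yarn there is no jobtracker, so skip parsing entirely.
            return Optional.empty();
        }
        // Assumed default, inferred from "Not a host:port pair: local" in the trace.
        String addr = conf.getOrDefault(JOBTRACKER_ADDRESS, "local");
        if (!addr.contains(":")) {
            throw new IllegalArgumentException("Not a host:port pair: " + addr);
        }
        return Optional.of(addr);
    }
}
```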





[jira] [Created] (MAPREDUCE-3027) MR-279: Completed container exit state needs to be enhanced to differentiate between container aborts/failures and actual application process exit codes

2011-09-16 Thread Hitesh Shah (JIRA)
MR-279: Completed container exit state needs to be enhanced to differentiate 
between container aborts/failures and actual application process exit codes


 Key: MAPREDUCE-3027
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3027
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Fix For: 0.23.0


Currently, a completed container's exit status is set to -100 to denote that the 
container was killed by the framework, either because the application released 
the container or because of a node failure. An application process may also 
return an exit code of -100, which makes the two cases indistinguishable. 
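
One way to remove the ambiguity, sketched below under assumptions, is to report 
the cause of completion separately from the numeric exit code. The class 
ContainerCompletionSketch and its members are hypothetical and are not the 
change proposed or committed for this issue.

```java
/** Hypothetical completion record that keeps the cause apart from the exit code. */
final class ContainerCompletionSketch {
    enum Cause { PROCESS_EXITED, KILLED_BY_FRAMEWORK }

    /** Sentinel value currently used for framework kills, per the issue text. */
    static final int FRAMEWORK_KILL_CODE = -100;

    final Cause cause;
    final int exitCode;

    private ContainerCompletionSketch(Cause cause, int exitCode) {
        this.cause = cause;
        this.exitCode = exitCode;
    }

    /** The application process ended on its own with the given exit code. */
    static ContainerCompletionSketch processExited(int exitCode) {
        return new ContainerCompletionSketch(Cause.PROCESS_EXITED, exitCode);
    }

    /** The framework killed the container (release or node failure). */
    static ContainerCompletionSketch killedByFramework() {
        return new ContainerCompletionSketch(Cause.KILLED_BY_FRAMEWORK, FRAMEWORK_KILL_CODE);
    }

    boolean isFrameworkKill() {
        return cause == Cause.KILLED_BY_FRAMEWORK;
    }
}
```

A process that genuinely exits with -100 is then reported as PROCESS_EXITED, so 
the application master no longer has to guess from the code alone.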

 




