[jira] [Created] (MAPREDUCE-3667) Gridmix jobs are failing with OOM in reduce shuffle phase.

2012-01-13 Thread Amol Kekre (Created) (JIRA)
Gridmix jobs are failing with OOM in reduce shuffle phase.
--

 Key: MAPREDUCE-3667
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3667
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Amol Kekre
Priority: Blocker
 Fix For: 0.23.1


Roll up bug for gridmix3 benchmark

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (MAPREDUCE-3631) Corner case in headroom calculation

2012-01-05 Thread Amol Kekre (Created) (JIRA)
Corner case in headroom calculation
---

 Key: MAPREDUCE-3631
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3631
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Amol Kekre
 Fix For: 0.23.1


When there is a single queue and a large job fills up the whole cluster, a
lost NM can lead to a wrong headroom value when all slots are taken up by
reduces, since headroom isn't recomputed at that point.





[jira] [Created] (MAPREDUCE-3630) NullPointerException running teragen

2012-01-05 Thread Amol Kekre (Created) (JIRA)
NullPointerException running teragen


 Key: MAPREDUCE-3630
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3630
 Project: Hadoop Map/Reduce
  Issue Type: Task
Reporter: Amol Kekre


CMD = /grid/0/gs/gridre/yroot.omegab/share/hadoopcommon/bin/hadoop \
  --config /grid/0/gs/gridre/yroot.omegab/conf/hadoop/ \
  jar /grid/0/gs/gridre/yroot.omegab/share/hadoopmapred/hadoop-mapreduce-examples-*.jar \
  teragen \
  -Dmapred.job.queue.name=audience \
  -Dmapreduce.job.acl-view-job=* \
  -Dyarn.app.mapreduce.am.staging-dir=/user \
  -Dmapreduce.jobtracker.staging.root.dir=/user \
  1 teraInputDir

Error:
11/09/21 21:59:59 INFO mapreduce.Job:  map 50% reduce 0%
11/09/21 22:00:00 INFO mapreduce.Job: Task Id : attempt_1316132655177_0533_m_01_0, Status : FAILED
java.lang.NullPointerException
    at org.apache.hadoop.examples.terasort.TeraGen$SortGenMapper.cleanup(TeraGen.java:241)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:708)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:148)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:143)

11/09/21 22:00:00 WARN mapreduce.Job: Error reading task output Server returned HTTP response code: 400 for URL:
http://.../tasklog?plaintext=true&attemptid=attempt_1316132655177_0533_m_01_0&filter=stdout
11/09/21 22:00:00 WARN mapreduce.Job: Error reading task output Server returned HTTP response code: 400 for URL:
http://.../tasklog?plaintext=true&attemptid=attempt_1316132655177_0533_m_01_0&filter=stderr
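
The report does not state the root cause. One plausible reading, offered only
as an assumption: the command asks for a single row, so a map task whose split
is empty never runs map(), and cleanup() then dereferences a field that was
never initialized. The sketch below illustrates that pattern and a null guard.
It does not use the Hadoop API, and every class, field, and method name in it
is hypothetical.

```java
// Hypothetical stand-in for a mapper; this is not TeraGen's actual code.
public class CleanupGuardSketch {
    // Minimal counter type so the example is self-contained.
    static final class Counter {
        private long value;
        void increment(long delta) { value += delta; }
        long get() { return value; }
    }

    static final class RowMapper {
        private Counter checksumCounter; // assumed to be created on the first row
        private long total;

        void map(long row) {
            if (checksumCounter == null) {
                checksumCounter = new Counter();
            }
            total += row;
        }

        // Unguarded cleanup: throws NullPointerException if map() never ran.
        void cleanupUnguarded() {
            checksumCounter.increment(total);
        }

        // Guarded cleanup: a mapper with an empty split finishes cleanly.
        void cleanupGuarded() {
            if (checksumCounter != null) {
                checksumCounter.increment(total);
            }
        }
    }

    public static void main(String[] args) {
        RowMapper empty = new RowMapper(); // simulates a split with zero rows
        try {
            empty.cleanupUnguarded();
        } catch (NullPointerException e) {
            System.out.println("unguarded cleanup failed on an empty split");
        }
        empty.cleanupGuarded();
        System.out.println("guarded cleanup completed");
    }
}
```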






[jira] [Created] (MAPREDUCE-3629) Remove sleep from MRAppMaster during app-finish.

2012-01-05 Thread Amol Kekre (Created) (JIRA)
Remove sleep from MRAppMaster during app-finish.


 Key: MAPREDUCE-3629
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3629
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Amol Kekre
 Fix For: 0.23.1


MRAppMaster waits for 5 seconds during app-finish. This was needed before we
had client-side redirection. The wait affects app execution in that the
AppMaster will be killed by the NM once the NM gets confirmation from the RM.

The AppMaster should go away immediately. Also, the done call from the AM to
the RM should be the last thing the AM ever does. Otherwise, as happens today,
JobHistory writing gets interrupted if the AM is killed by the NM.
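
The ordering requested above can be sketched as follows. This is a minimal
illustration, not MRAppMaster's real code: the HistoryWriter and
ResourceManagerClient interfaces and the finish() method are hypothetical
names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; these are not MRAppMaster's real classes or methods.
public class AppFinishSketch {
    interface HistoryWriter { void flushAndClose(); }
    interface ResourceManagerClient { void finishApplication(); }

    // Finish without sleeping: make history durable first, then tell the RM.
    static void finish(HistoryWriter history, ResourceManagerClient rm) {
        history.flushAndClose(); // must complete before the AM can be killed
        rm.finishApplication();  // last action; the NM may kill the AM after this
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        finish(() -> events.add("history-flushed"),
               () -> events.add("rm-notified"));
        System.out.println(events);
    }
}
```

Because nothing runs after the call to the RM, a kill from the NM at that
point has no in-flight history write to interrupt.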





[jira] [Created] (MAPREDUCE-3628) DFSIO read throughput is decreased by 16% in 0.23 than Hadoop-0.20.204 on 350 nodes size cluster.

2012-01-05 Thread Amol Kekre (Created) (JIRA)
DFSIO read throughput is decreased by 16% in 0.23 than Hadoop-0.20.204 on 350 
nodes size cluster.
-

 Key: MAPREDUCE-3628
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3628
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Amol Kekre
Priority: Critical
 Fix For: 0.23.1


DFSIO read throughput is 16% lower in 0.23 than in Hadoop-0.20.204 on a
350-node cluster.





[jira] [Created] (MAPREDUCE-3127) Unable to restrict users based on resourcemanager.admin.acls value set

2011-09-30 Thread Amol Kekre (Created) (JIRA)
Unable to restrict users based on resourcemanager.admin.acls value set
--

 Key: MAPREDUCE-3127
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3127
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Amol Kekre


Setting the following property in yarn-site.xml with user ids, to restrict the
ability to run 'rmadmin -refreshQueues', is not honoured:

<property>
  <name>yarn.server.resourcemanager.admin.acls</name>
  <value>hadoop1</value>
</property>

Should it be the same for rmadmin -refreshNodes?
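
A minimal sketch of the expected behaviour, assuming the ACL value follows
the usual Hadoop convention of a comma-separated user list, optionally
followed by a space and a group list, with "*" meaning everyone. The class and
method names are hypothetical and are not the ResourceManager's actual API.
Groups are ignored here for brevity.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of an admin ACL check; not the ResourceManager's code.
public class AdminAclSketch {
    private final boolean allowAll;
    private final Set<String> users;

    public AdminAclSketch(String acl) {
        String trimmed = acl.trim();
        this.allowAll = trimmed.equals("*");
        // Only the user list before the first space is considered here.
        String userPart = trimmed.split(" ", 2)[0];
        this.users = new HashSet<>(Arrays.asList(userPart.split(",")));
        this.users.remove(""); // an empty ACL admits nobody
    }

    public boolean isAllowed(String user) {
        return allowAll || users.contains(user);
    }

    // Every admin operation is routed through the same check.
    public void checkAdmin(String user, String operation) {
        if (!isAllowed(user)) {
            throw new SecurityException(
                "User " + user + " may not run " + operation);
        }
    }

    public static void main(String[] args) {
        AdminAclSketch acl = new AdminAclSketch("hadoop1");
        acl.checkAdmin("hadoop1", "refreshQueues"); // permitted
        try {
            acl.checkAdmin("hadoop2", "refreshNodes");
        } catch (SecurityException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Routing refreshQueues and refreshNodes through one check is a design choice
of this sketch; it would give both commands the same restriction.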
