[jira] [Updated] (MAPREDUCE-5611) CombineFileInputFormat creates more rack-local tasks due to less split location info.

2013-11-06 Thread Karam Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karam Singh updated MAPREDUCE-5611:
---

Affects Version/s: (was: 2.2.0)

 CombineFileInputFormat creates more rack-local tasks due to less split 
 location info.
 -

 Key: MAPREDUCE-5611
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5611
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Chandra Prakash Bhagtani
Assignee: Chandra Prakash Bhagtani

 I have come across an issue with CombineFileInputFormat. I ran a hive query 
 on approx 1.2 GB of data with CombineHiveInputFormat, which internally uses 
 CombineFileInputFormat. My cluster has 9 datanodes and max.split.size is 
 256 MB.
 When I ran this query with replication factor 9, hive consistently created 
 all 6 tasks rack-local, and with replication factor 3 it created 5 rack-local 
 and 1 data-local task. 
  When the replication factor is 9 (equal to the cluster size), all the tasks 
 should be data-local, as each datanode contains all the replicas of the input 
 data, but that is not happening, i.e. all the tasks are rack-local. 
 When I dug into the CombineFileInputFormat.java code in the getMoreSplits 
 method, I found the issue in the following snippet (especially in the case of 
 a higher replication factor):
 {code:title=CombineFileInputFormat.java|borderStyle=solid}
 for (Iterator<Map.Entry<String,
      List<OneBlockInfo>>> iter = nodeToBlocks.entrySet().iterator();
      iter.hasNext();) {
   Map.Entry<String, List<OneBlockInfo>> one = iter.next();
   nodes.add(one.getKey());
   List<OneBlockInfo> blocksInNode = one.getValue();
   // for each block, copy it into validBlocks. Delete it from
   // blockToNodes so that the same block does not appear in
   // two different splits.
   for (OneBlockInfo oneblock : blocksInNode) {
     if (blockToNodes.containsKey(oneblock)) {
       validBlocks.add(oneblock);
       blockToNodes.remove(oneblock);
       curSplitSize += oneblock.length;
       // if the accumulated split size exceeds the maximum, then
       // create this split.
       if (maxSize != 0 && curSplitSize >= maxSize) {
         // create an input split and add it to the splits array
         addCreatedSplit(splits, nodes, validBlocks);
         curSplitSize = 0;
         validBlocks.clear();
       }
     }
   }
 {code}
 The first node in the map nodeToBlocks has all the replicas of the input 
 file, so the above code creates 6 splits, all with only one location. Now if 
 the JT doesn't schedule these tasks on that node, all the tasks will be 
 rack-local, even though all the other datanodes also hold replicas.
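To make the failure mode concrete, here is a small, self-contained simulation (simplified types and names, not the actual Hadoop code) of the loop above under the reported setup: 6 blocks, 9 datanodes, replication factor 9, and each ~256 MB block on its own already reaching max.split.size, so every block becomes its own split.

```java
import java.util.*;

public class SplitLocalitySketch {
    // Simplified model of getMoreSplits: blocks are plain strings, and each
    // block alone already reaches max.split.size, so every block becomes its
    // own split whose recorded location is only the node currently visited.
    static List<String> computeSplitLocations() {
        // 9 datanodes; with replication factor 9 every node holds every block.
        List<String> blocks = Arrays.asList("b1", "b2", "b3", "b4", "b5", "b6");
        Map<String, List<String>> nodeToBlocks = new LinkedHashMap<>();
        for (int i = 1; i <= 9; i++) {
            nodeToBlocks.put("node" + i, blocks);
        }
        // Tracks blocks not yet assigned to a split (mirrors blockToNodes).
        Set<String> unassigned = new HashSet<>(blocks);

        List<String> splitLocations = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : nodeToBlocks.entrySet()) {
            for (String b : e.getValue()) {
                if (unassigned.remove(b)) {
                    // The split records only the node being iterated.
                    splitLocations.add(e.getKey());
                }
            }
        }
        return splitLocations;
    }

    public static void main(String[] args) {
        // All six splits claim the SAME single node, so any task not scheduled
        // there can only be rack-local, even though every node has the data.
        System.out.println(computeSplitLocations());
        // prints: [node1, node1, node1, node1, node1, node1]
    }
}
```

The first node visited drains blockToNodes entirely, so the remaining eight nodes never contribute a split location.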



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (MAPREDUCE-3057) Job History Server goes of OutOfMemory with 1200 Jobs and Heap Size set to 10 GB

2011-09-21 Thread Karam Singh (JIRA)
Job History Server goes of OutOfMemory with 1200 Jobs and Heap Size set to 10 GB


 Key: MAPREDUCE-3057
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3057
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mrv2
Affects Versions: 0.23.0
Reporter: Karam Singh


History Server was started with a 10 GB max heap.
Ran GridMix V3 with a 1200-job trace in STRESS mode on 350 nodes, with 4 NMs 
per node.
All jobs finished, as reported by the RM Web UI and HADOOP_MAPRED_HOME/bin/mapred 
job -list all.
But found that the GridMix job client was stuck while trying to connect to the 
History Server.
Then tried HADOOP_MAPRED_HOME/bin/mapred job -status <jobid>.
The JobClient also got stuck while looking for a token to connect to the 
History Server.
Then looked at the History Server logs and found it was throwing 
java.lang.OutOfMemoryError: GC overhead limit exceeded.

With 10 GB of heap space and 1200 jobs, the History Server should not go out 
of memory, no matter what type of jobs they are.





--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (MAPREDUCE-3058) Sometimes task keeps on running while its Syslog says that it is shutdown

2011-09-21 Thread Karam Singh (JIRA)
Sometimes task keeps on running while its Syslog says that it is shutdown
-

 Key: MAPREDUCE-3058
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3058
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Karam Singh


While running GridMix V3, one job got stuck for 15 hrs.
After clicking on the job, found that one of its reduces was stuck.
Looking at the syslog of the reducer, it was found that:
The task started at:
2011-09-19 17:57:22,002 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: 
Scheduled snapshot period at 10 second(s).
2011-09-19 17:57:22,002 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: 
ReduceTask metrics system started

While the end of the syslog says:
2011-09-19 18:06:49,818 INFO org.apache.hadoop.hdfs.DFSClient: Exception in 
createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 
as DATANODE1
2011-09-19 18:06:49,818 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery 
for block BP-1405370709-NAMENODE-1316452621953:blk_-7004355226367468317_79871 
in pipeline  DATANODE2,  DATANODE1: bad datanode  DATANODE1
2011-09-19 18:06:49,818 DEBUG 
org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtocol: 
lastAckedSeqno = 26870
2011-09-19 18:06:49,820 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(26613121) connection to NAMENODE from gridperf sending #454
2011-09-19 18:06:49,826 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(26613121) connection to NAMENODE from gridperf got value #454
2011-09-19 18:06:49,827 DEBUG org.apache.hadoop.ipc.RPC: Call: 
getAdditionalDatanode 8
2011-09-19 18:06:49,827 DEBUG org.apache.hadoop.hdfs.DFSClient: Connecting to 
datanode DATANODE2
2011-09-19 18:06:49,827 DEBUG org.apache.hadoop.hdfs.DFSClient: Send buf size 
131071
2011-09-19 18:06:49,833 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
Exception
java.io.EOFException: Premature EOF: no length prefix available
at 
org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:158)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:860)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:838)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:929)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:740)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:415)
2011-09-19 18:06:49,837 WARN org.apache.hadoop.mapred.YarnChild: Exception 
running child : java.io.EOFException: Premature EOF: no length prefix available
at 
org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:158)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:860)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:838)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:929)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:740)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:415)

2011-09-19 18:06:49,837 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(26613121) connection to APPMASTER from job_1316452677984_0862 sending #455
2011-09-19 18:06:49,839 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(26613121) connection to APPMASTER from job_1316452677984_0862 got value #455
2011-09-19 18:06:49,840 DEBUG org.apache.hadoop.ipc.RPC: Call: statusUpdate 3
2011-09-19 18:06:49,840 INFO org.apache.hadoop.mapred.Task: Runnning cleanup 
for the task
2011-09-19 18:06:49,840 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(26613121) connection to NAMENODE from gridperf sending #456
2011-09-19 18:06:49,858 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(26613121) connection to NAMENODE from gridperf got value #456
2011-09-19 18:06:49,858 DEBUG org.apache.hadoop.ipc.RPC: Call: delete 18
2011-09-19 18:06:49,858 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(26613121) connection to APPMASTER from job_1316452677984_0862 sending #457
2011-09-19 18:06:49,859 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(26613121) connection to APPMASTER from job_1316452677984_0862 got value #457
2011-09-19 18:06:49,859 DEBUG org.apache.hadoop.ipc.RPC: Call: 
reportDiagnosticInfo 1
2011-09-19 18:06:49,859 DEBUG 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: refCount=1
2011-09-19 18:06:49,859 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: 
Stopping ReduceTask metrics system...
2011-09-19 18:06:49,859 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: 
Stopping 

[jira] [Created] (MAPREDUCE-3059) Resourcemanager metrices does not have aggregate containers allocated and containers eeleased Metrics

2011-09-21 Thread Karam Singh (JIRA)
Resourcemanager metrices does not have aggregate containers allocated and 
containers eeleased Metrics
-

 Key: MAPREDUCE-3059
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3059
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Karam Singh


Resourcemanager metrics does not have any aggregate containers-allocated or 
containers-released metrics.
If I want to know how many containers were allocated or released, I do not 
know any way to find out.
NodeManagers do have containers-launched and containers-released metrics, but 
that is not a central location, so to get an aggregate number, all NM metrics 
first need to be merged.
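As a sketch of the workaround described above (merging per-NM counters to get a cluster-wide total), here is a minimal, hypothetical example; the counter names and the plain-Map representation are illustrative, not the actual NodeManager metrics API.

```java
import java.util.*;

public class NMMetricsAggregator {
    // Sum one named counter across a list of per-NodeManager metric snapshots.
    // Counter names here ("containersLaunched"/"containersReleased") are
    // placeholders, not the real Hadoop metric identifiers.
    static long aggregate(List<Map<String, Long>> perNodeMetrics, String counter) {
        long total = 0;
        for (Map<String, Long> m : perNodeMetrics) {
            total += m.getOrDefault(counter, 0L);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> nm1 = new HashMap<>();
        nm1.put("containersLaunched", 40L);
        nm1.put("containersReleased", 38L);
        Map<String, Long> nm2 = new HashMap<>();
        nm2.put("containersLaunched", 25L);
        nm2.put("containersReleased", 25L);

        List<Map<String, Long>> all = Arrays.asList(nm1, nm2);
        System.out.println(aggregate(all, "containersLaunched"));  // prints: 65
        System.out.println(aggregate(all, "containersReleased"));  // prints: 63
    }
}
```

This is exactly the merge step the report complains about: without an RM-side aggregate metric, every consumer has to collect and sum the per-node values itself.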





[jira] [Updated] (MAPREDUCE-3059) Resourcemanager metrices does not have aggregate containers allocated and containers released Metrics

2011-09-21 Thread Karam Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karam Singh updated MAPREDUCE-3059:
---

Summary: Resourcemanager metrices does not have aggregate containers 
allocated and containers released Metrics  (was: Resourcemanager metrices does 
not have aggregate containers allocated and containers eeleased Metrics)

 Resourcemanager metrices does not have aggregate containers allocated and 
 containers released Metrics
 -

 Key: MAPREDUCE-3059
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3059
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Karam Singh

 Resourcemanager metrics does not have any aggregate containers-allocated or 
 containers-released metrics.
 If I want to know how many containers were allocated or released, I do not 
 know any way to find out.
 NodeManagers do have containers-launched and containers-released metrics, but 
 that is not a central location, so to get an aggregate number, all NM metrics 
 first need to be merged.





[jira] [Created] (MAPREDUCE-3031) Job Client goes into infinite loop when we kill AM

2011-09-19 Thread Karam Singh (JIRA)
Job Client goes into infinite loop when we kill AM
--

 Key: MAPREDUCE-3031
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3031
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Karam Singh


Started a cluster. Submitted a sleep job with around 1 maps and 1000 reduces.
Killed the AM with kill -9; around 7000 maps had completed.

The RM kept reporting the application as RUNNING,
and the JobClient went into an infinite loop trying to connect to the AM.





[jira] [Created] (MAPREDUCE-3033) JobClient requires mapreduce.jobtracker.address tag in mapred-site.xm even mapreduce.framework.name is set top yarn

2011-09-19 Thread Karam Singh (JIRA)
JobClient requires mapreduce.jobtracker.address tag in mapred-site.xm even 
mapreduce.framework.name is set top yarn
---

 Key: MAPREDUCE-3033
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3033
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Reporter: Karam Singh


If mapreduce.jobtracker.address is not set in mapred-site.xml
and mapreduce.framework.name is set to yarn,
job submission fails.

Tried to submit a sleep job with 1 map task. Job submission failed with the 
following exception:
11/09/19 13:19:20 INFO ipc.YarnRPC: Creating YarnRPC for 
org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
11/09/19 13:19:20 INFO mapred.ResourceMgrDelegate: Connecting to 
ResourceManager at RMHost:8040
11/09/19 13:19:20 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy 
for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
11/09/19 13:19:20 INFO mapred.ResourceMgrDelegate: Connected to ResourceManager 
at RMHost:8040
11/09/19 13:19:21 INFO mapred.ResourceMgrDelegate: DEBUG --- getStagingAreaDir: 
dir=/user/username/.staging
11/09/19 13:19:21 INFO mapreduce.JobSubmitter: Cleaning up the staging area 
/user/username/.staging/job_1316435926198_0004
java.lang.RuntimeException: Not a host:port pair: local
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:148)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:132)
at org.apache.hadoop.mapred.Master.getMasterAddress(Master.java:42)
at org.apache.hadoop.mapred.Master.getMasterPrincipal(Master.java:47)
at 
org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:104)
at 
org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:90)
at 
org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:83)
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:346)
at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1072)
at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1069)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1069)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1089)
at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:262)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
at 
org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:111)
at 
org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:118)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:189)
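The root cause is visible in the first frames of the trace: NetUtils.createSocketAddr receives the literal string "local" (the default value of mapreduce.jobtracker.address when the tag is absent) and rejects it. Below is a minimal stand-in for that host:port validation; the helper is hypothetical, not the real Hadoop method, and only mimics the error message seen above.

```java
public class HostPortCheck {
    // Hypothetical simplification of the validation done inside
    // org.apache.hadoop.net.NetUtils.createSocketAddr: a target with no
    // colon cannot be split into host and port, so it is rejected.
    static String[] parseHostPort(String target) {
        int colon = target.indexOf(':');
        if (colon < 0) {
            // "local" is the default for mapreduce.jobtracker.address when
            // the tag is missing, so the token-cache path parses a non
            // host:port value and blows up here.
            throw new RuntimeException("Not a host:port pair: " + target);
        }
        return new String[] { target.substring(0, colon), target.substring(colon + 1) };
    }

    public static void main(String[] args) {
        try {
            parseHostPort("local");
        } catch (RuntimeException e) {
            System.out.println(e.getMessage()); // prints: Not a host:port pair: local
        }
        String[] hp = parseHostPort("rmhost:8040");
        System.out.println(hp[0] + ":" + hp[1]); // prints: rmhost:8040
    }
}
```

This is why adding a dummy mapreduce.jobtracker.address entry to mapred-site.xml masks the bug even though YARN never uses a JobTracker.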







[jira] [Updated] (MAPREDUCE-3033) JobClient requires mapreduce.jobtracker.address tag in mapred-site.xm even mapreduce.framework.name is set to yarn

2011-09-19 Thread Karam Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karam Singh updated MAPREDUCE-3033:
---

Summary: JobClient requires mapreduce.jobtracker.address tag in 
mapred-site.xm even mapreduce.framework.name is set to yarn  (was: JobClient 
requires mapreduce.jobtracker.address tag in mapred-site.xm even 
mapreduce.framework.name is set top yarn)

 JobClient requires mapreduce.jobtracker.address tag in mapred-site.xm even 
 mapreduce.framework.name is set to yarn
 --

 Key: MAPREDUCE-3033
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3033
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Reporter: Karam Singh

 If mapreduce.jobtracker.address is not set in mapred-site.xml
 and mapreduce.framework.name is set to yarn,
 job submission fails.
 Tried to submit a sleep job with 1 map task. Job submission failed with the 
 following exception:
 11/09/19 13:19:20 INFO ipc.YarnRPC: Creating YarnRPC for 
 org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
 11/09/19 13:19:20 INFO mapred.ResourceMgrDelegate: Connecting to 
 ResourceManager at RMHost:8040
 11/09/19 13:19:20 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy 
 for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
 11/09/19 13:19:20 INFO mapred.ResourceMgrDelegate: Connected to 
 ResourceManager at RMHost:8040
 11/09/19 13:19:21 INFO mapred.ResourceMgrDelegate: DEBUG --- 
 getStagingAreaDir: dir=/user/username/.staging
 11/09/19 13:19:21 INFO mapreduce.JobSubmitter: Cleaning up the staging area 
 /user/username/.staging/job_1316435926198_0004
 java.lang.RuntimeException: Not a host:port pair: local
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:148)
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:132)
   at org.apache.hadoop.mapred.Master.getMasterAddress(Master.java:42)
   at org.apache.hadoop.mapred.Master.getMasterPrincipal(Master.java:47)
   at 
 org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:104)
   at 
 org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:90)
   at 
 org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:83)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:346)
   at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1072)
   at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1069)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1069)
   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1089)
   at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:262)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
   at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
   at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
   at 
 org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:111)
   at 
 org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:118)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:189)





[jira] [Commented] (MAPREDUCE-3006) MapReduce AM exits prematurely before completely writing and closing the JobHistory file

2011-09-16 Thread Karam Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13106887#comment-13106887
 ] 

Karam Singh commented on MAPREDUCE-3006:


Today did a fresh checkout of branch 0.23,
applied the patch provided by Vinod,
compiled yarn and deployed.
Ran the sleep job with 100K tasks; when the job completed, successfully 
accessed the job from the HistoryServer, whereas without the patch the job 
history did not get completed.


 MapReduce AM exits prematurely before completely writing and closing the 
 JobHistory file
 

 Key: MAPREDUCE-3006
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3006
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.0

 Attachments: MAPREDUCE-3006-20110915.txt, MAPREDUCE-3006-20110916.txt


 [~Karams] was executing a sleep job with 100,000 tasks on a 350 node cluster 
 to test MR AM's scalability and ran into this. The job ran successfully but 
 the history was not available.
 I debugged around and figured that the job is finishing prematurely before 
 the JobHistory is written. In most of the cases, we don't see this bug as we 
 have a 5 seconds sleep in AM towards the end.





[jira] [Commented] (MAPREDUCE-3007) JobClient cannot talk to JobHistory server in secure mode

2011-09-15 Thread Karam Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13105260#comment-13105260
 ] 

Karam Singh commented on MAPREDUCE-3007:


Applied MAPREDUCE-3007-20110914.2.txt.
Now I am no longer facing the problem of the JobClient not being able to 
connect to the HistoryServer.



 JobClient cannot talk to JobHistory server in secure mode
 -

 Key: MAPREDUCE-3007
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3007
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.0

 Attachments: MAPREDUCE-3007-20110914.2.txt, 
 MAPREDUCE-3007-20110914.txt


 In secure mode, Jobclient cannot connect to HistoryServer. Thanks to 
 [~karams] for finding this out.
 {code}
 11/09/14 09:57:51 INFO mapred.ClientServiceDelegate: Application state is 
 completed. Redirecting to job history server
 11/09/14 09:57:51 INFO security.ApplicationTokenSelector: Looking for a token 
 with service history-server:10020
 11/09/14 09:57:51 INFO security.ApplicationTokenSelector: Token kind is 
 YARN_APPLICATION_TOKEN and the token's service name is Am-ip:46257
 11/09/14 09:57:51 INFO security.UserGroupInformation: Initiating logout for 
 user-principal
 11/09/14 09:57:51 INFO security.UserGroupInformation: Initiating re-login for 
 user-principal
 11/09/14 09:57:55 WARN security.UserGroupInformation: Not attempting to 
 re-login since the last re-login was attempted less than 600 seconds before.
 11/09/14 09:57:56 WARN security.UserGroupInformation: Not attempting to 
 re-login since the last re-login was attempted less than 600 seconds before.
 11/09/14 09:58:00 WARN security.UserGroupInformation: Not attempting to 
 re-login since the last re-login was attempted less than 600 seconds before.
 11/09/14 09:58:05 WARN security.UserGroupInformation: Not attempting to 
 re-login since the last re-login was attempted less than 600 seconds before.
 11/09/14 09:58:05 WARN ipc.Client: Couldn't setup connection for 
 user-principal to null
 11/09/14 09:58:05 INFO mapred.ClientServiceDelegate: Failed to contact 
 AM/History for job job_1315993268700_0001  Will retry..
 {code}
 Am surprised no one working with YARN+MR ever ran into this!





[jira] [Commented] (MAPREDUCE-2947) Sort fails on YARN+MR with lots of task failures

2011-09-08 Thread Karam Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13100234#comment-13100234
 ] 

Karam Singh commented on MAPREDUCE-2947:


I am not seeing the error after applying the patch.

 Sort fails on YARN+MR with lots of task failures
 

 Key: MAPREDUCE-2947
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2947
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
 Fix For: 0.23.0

 Attachments: MAPREDUCE-2947-20110907.txt


 [~karams] (the great man the world hardly knows about) found lots of failing 
 tasks while running sort on a 350 node cluster. The failed tasks eventually 
 failed the job, and this was happening consistently on the big cluster.
 {quote}
 Container launch failed for container_1315410418107_0002_01_002511 : 
 RemoteTrace: java.lang.IllegalArgumentException at 
 java.nio.Buffer.position(Buffer.java:218) at 
 java.nio.HeapByteBuffer.get(HeapByteBuffer.java:129) at 
 java.nio.ByteBuffer.get(ByteBuffer.java:675) at 
 com.google.protobuf.ByteString.copyFrom(ByteString.java:108) at 
 com.google.protobuf.ByteString.copyFrom(ByteString.java:117) at 
 org.apache.hadoop.yarn.util.ProtoUtils.convertToProtoFormat(ProtoUtils.java:97)
  at 
 org.apache.hadoop.yarn.api.records.ProtoBase.convertToProtoFormat(ProtoBase.java:59)
  at 
 org.apache.hadoop.yarn.api.protocolrecords.impl.pb.StartContainerResponsePBImpl.access$100(StartContainerResponsePBImpl.java:35)
  at 
 org.apache.hadoop.yarn.api.protocolrecords.impl.pb.StartContainerResponsePBImpl$1$1.next(StartContainerResponsePBImpl.java:134)
  at 
 org.apache.hadoop.yarn.api.protocolrecords.impl.pb.StartContainerResponsePBImpl$1$1.next(StartContainerResponsePBImpl.java:122)
  at 
 com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:319)
  at 
 org.apache.hadoop.yarn.proto.YarnServiceProtos$StartContainerResponseProto$Builder.addAllServiceResponse(YarnServiceProtos.java:12620)
  at 
 org.apache.hadoop.yarn.api.protocolrecords.impl.pb.StartContainerResponsePBImpl.addServiceResponseToProto(StartContainerResponsePBImpl.java:144)
  at 
 org.apache.hadoop.yarn.api.protocolrecords.impl.pb.StartContainerResponsePBImpl.mergeLocalToBuilder(StartContainerResponsePBImpl.java:60)
  at 
 org.apache.hadoop.yarn.api.protocolrecords.impl.pb.StartContainerResponsePBImpl.mergeLocalToProto(StartContainerResponsePBImpl.java:68)
  at 
 org.apache.hadoop.yarn.api.protocolrecords.impl.pb.StartContainerResponsePBImpl.getProto(StartContainerResponsePBImpl.java:52)
  at 
 org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagerPBServiceImpl.startContainer(ContainerManagerPBServiceImpl.java:69)
  at 
 org.apache.hadoop.yarn.proto.ContainerManager$ContainerManagerService$2.callBlockingMethod(ContainerManager.java:83)
  at 
 org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Server.call(ProtoOverHadoopRpcEngine.java:337)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1496) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1492) at 
 java.security.AccessController.doPrivileged(Native Method) at 
 javax.security.auth.Subject.doAs(Subject.java:396) at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1490) at LocalTrace: 
 org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: at 
 org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:151)
  at $Proxy20.startContainer(Unknown Source) at 
 org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:81)
  at 
 org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:215)
  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:619) 
 {quote}





[jira] Commented: (MAPREDUCE-582) Streaming: if streaming command finds errors in the --config , it reports that the input file is not found and fails

2010-06-15 Thread Karam Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12878941#action_12878941
 ] 

Karam Singh commented on MAPREDUCE-582:
---

I tried to run
hadoop --config <config dir> jar hadoop-streaming.jar -jt local -fs local -files 
map.pl,dev-karams/red.pl -input data -output test1 -mapper map.pl -reducer 
red.pl.
The job ran fine. I did not see any error.


 Streaming: if streaming command finds errors in the --config , it reports 
 that the input file is not found and fails
 

 Key: MAPREDUCE-582
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-582
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming
Reporter: arkady borkovsky

 The error message is
  ERROR streaming.StreamJob: Error Launching job : Input Pattern ../* 
 matches 0 files
 which is quite confusing and scary.
 Needs better error handling.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-595) streaming command line does not honor -jt option

2010-06-15 Thread Karam Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12878944#action_12878944
 ] 

Karam Singh commented on MAPREDUCE-595:
---

This issue does exist anymore.

 streaming command line does not honor -jt option
 

 Key: MAPREDUCE-595
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-595
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming
Reporter: Karam Singh
Priority: Minor

 ran the hadoop streaming command as:
 bin/hadoop jar contrib/streaming/hadoop-*-streaming.jar -input <input path> 
 -mapper <mapper> -reducer <reducer> -output <output path> -dfs <h:p> -jt <h:p>
 (Make sure hadoop-site.xml is not in the config dir; DFS and JT are running.)
 Streaming will run as the local runner.
 On looking at StreamJob.java, the following was found:
 String jt = (String) cmdLine.getValue("mapred.job.tracker");
   if (null != jt) {
     userJobConfProps_.put("fs.default.name", jt);
   }
 whereas usage creates the option like:
 Option jt = createOption("jt", 
     "Optional. Override JobTracker configuration", 
     "<h:p>|local", 1, false);
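The snippet above points at the likely bug: the value of the -jt option is stored under the fs.default.name key instead of mapred.job.tracker, so the JobTracker override never takes effect and streaming falls back to the local runner. A minimal sketch of the intended behavior, using a plain Map in place of the real job configuration (everything except the two property keys is hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

public class StreamJobJtSketch {
    static Map<String, String> userJobConfProps = new HashMap<>();

    // Buggy behavior from the report: the -jt value lands on the
    // filesystem key, not the JobTracker key.
    static void applyJtOptionBuggy(String jt) {
        if (null != jt) {
            userJobConfProps.put("fs.default.name", jt);
        }
    }

    // Sketch of the fix: store the -jt value under mapred.job.tracker,
    // which is the property the framework actually consults.
    static void applyJtOptionFixed(String jt) {
        if (null != jt) {
            userJobConfProps.put("mapred.job.tracker", jt);
        }
    }

    public static void main(String[] args) {
        applyJtOptionFixed("jthost:9001");
        System.out.println(userJobConfProps.get("mapred.job.tracker"));
        // prints: jthost:9001
    }
}
```

With the buggy version, mapred.job.tracker keeps its default ("local"), which matches the observed local-runner behavior.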




[jira] Commented: (MAPREDUCE-595) streaming command line does not honor -jt option

2010-06-15 Thread Karam Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12878945#action_12878945
 ] 

Karam Singh commented on MAPREDUCE-595:
---

Ignore my previous comment. This issue does not exist anymore. 

 streaming command line does not honor -jt option
 

 Key: MAPREDUCE-595
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-595
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming
Reporter: Karam Singh
Priority: Minor

 ran the hadoop streaming command as:
 bin/hadoop jar contrib/streaming/hadoop-*-streaming.jar -input <input path> 
 -mapper <mapper> -reducer <reducer> -output <output path> -dfs <h:p> -jt <h:p>
 (Make sure hadoop-site.xml is not in the config dir; DFS and JT are running.)
 Streaming will run as the local runner.
 On looking at StreamJob.java, the following was found:
 String jt = (String) cmdLine.getValue("mapred.job.tracker");
   if (null != jt) {
     userJobConfProps_.put("fs.default.name", jt);
   }
 whereas usage creates the option like:
 Option jt = createOption("jt", 
     "Optional. Override JobTracker configuration", 
     "<h:p>|local", 1, false);




[jira] Created: (MAPREDUCE-1540) Sometimes JobTracker holds stale reference of JobInProgress.

2010-02-26 Thread Karam Singh (JIRA)
Sometimes JobTracker holds stale reference of JobInProgress.
--

 Key: MAPREDUCE-1540
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1540
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.2
Reporter: Karam Singh


Ran random writer, sort, and sort-validate jobs. Checked jmap -histo:live and 
verified that there is no reference to JobInProgress after the jobs are retired. 
Then submitted around 77 sleep jobs of around 1 map each; after 1 hr, killed 
all the jobs. When the jobs got retired, checked jmap -histo:live for the JT 
process again and found that 2 JobInProgress references were still there.
Found this while doing sanity testing of 1316




[jira] Updated: (MAPREDUCE-1540) Sometimes JobTracker holds stale reference of JobInProgress even after Job gets retired

2010-02-26 Thread Karam Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karam Singh updated MAPREDUCE-1540:
---

Summary: Sometimes JobTracker holds stale reference of JobInProgress even 
after Job gets retired  (was: Sometime JobTracker holds stale refrence of 
JobInProgress.)

 Sometimes JobTracker holds stale reference of JobInProgress even after Job 
 gets retired
 --

 Key: MAPREDUCE-1540
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1540
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.2
Reporter: Karam Singh

 Ran random writer, sort, and sort-validate jobs. Checked jmap -histo:live 
 and verified that there is no reference to JobInProgress after the jobs are 
 retired. 
 Then submitted around 77 sleep jobs of around 1 map each; after 1 hr, killed 
 all the jobs. When the jobs got retired, checked jmap -histo:live for the JT 
 process again and found that 2 JobInProgress references were still there.
 Found this while doing sanity testing of 1316




[jira] Updated: (MAPREDUCE-1540) Sometimes JobTracker holds stale reference of JobInProgress even after Job gets retired

2010-02-26 Thread Karam Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karam Singh updated MAPREDUCE-1540:
---

Description: 
Ran random writer, sort, and sort-validate jobs. Checked jmap -histo:live and 
verified that there is no reference to JobInProgress after the jobs are retired. 
Then submitted around 77 sleep jobs of around 1 map each; after 1 hr, killed 
all the jobs. When the jobs got retired, checked jmap -histo:live for the JT 
process again and found that 2 JobInProgress references were still there.
Found this while doing sanity testing of 1316

  was:
Ran random writer, sort, and sort-validate jobs. Checked jmap -histo:live and 
verified that there is no reference to JobInProgress after the jobs are retired. 
Then submitted around 77 sleep jobs of around 1 map each; after 1 hr, killed 
all the jobs. When the jobs got retired, checked jmap -histo:live for the JT 
process again and found that 2 JobInProgress references were still there.
Found this while doing snaity testing of 1316


 Sometimes JobTracker holds stale reference of JobInProgress even after Job 
 gets retired
 --

 Key: MAPREDUCE-1540
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1540
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.2
Reporter: Karam Singh

 Ran random writer, sort, and sort-validate jobs. Checked jmap -histo:live 
 and verified that there is no reference to JobInProgress after the jobs are 
 retired. 
 Then submitted around 77 sleep jobs of around 1 map each; after 1 hr, killed 
 all the jobs. When the jobs got retired, checked jmap -histo:live for the JT 
 process again and found that 2 JobInProgress references were still there.
 Found this while doing sanity testing of 1316




[jira] Created: (MAPREDUCE-1240) refreshQueues does not work correctly when dealing with maximum-capacity

2009-11-25 Thread Karam Singh (JIRA)
refreshQueues does not work correctly when dealing with maximum-capacity


 Key: MAPREDUCE-1240
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1240
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Karam Singh


If we comment out or remove the maximum-capacity property, or set 
maximum-capacity=-1, for a queue whose maximum-capacity currently has some 
value (say 60), and then run the command:
mapred mradmin -refreshQueues 

when we check the Queue Scheduling information from the Web UI or from the CLI, 
it still retains the old value and schedules tasks up to the old 
maximum-capacity if there is no other job in the cluster, whereas the expected 
behavior is that the old maximum-capacity is not retained.
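One plausible fix, sketched here with hypothetical names rather than the real CapacityScheduler classes, is to make the refresh path always adopt the configured value and treat a missing or -1 maximum-capacity as "no cap" instead of keeping the stale value:

```java
public class MaxCapacityRefreshSketch {
    static final float UNLIMITED = -1.0f;

    // On refreshQueues, always adopt the configured value; when the property
    // is absent (null here) or set to -1, the cap must be cleared instead of
    // the queue silently keeping its old maximum-capacity.
    static float refresh(Float configured) {
        if (configured == null || configured == -1.0f) {
            return UNLIMITED;
        }
        return configured;
    }
}
```

With this semantics, removing the property from the config file and running -refreshQueues drops the 60% cap rather than retaining it.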





[jira] Created: (MAPREDUCE-1232) Job info for root level queue does not print correct values

2009-11-23 Thread Karam Singh (JIRA)
Job info for root level queue does not print correct values
---

 Key: MAPREDUCE-1232
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1232
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched
Affects Versions: 0.21.0
Reporter: Karam Singh


Job info for a root-level queue does not print correct values.
Test case:
[
   Queue name = q1
   Its sub-queues are Sq1 and Sq2.
   Submit around 8 jobs to each of Sq1 and Sq2 (out of these, 2 jobs from 
each queue start running).
   Job Info for each of Sq1 and Sq2 prints:
  Job info
  Number of Waiting Jobs: 6
  Number of users who have submitted jobs: 1

  While Job Info for q1 (the parent of Sq1 and Sq2) prints:
Job info
Number of Waiting Jobs: 0
Number of users who have submitted jobs: 0
]
For a root-level queue we should either remove Job Info or print the cumulative 
values of waiting jobs and submitting users from the child queues.
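The cumulative option suggested above can be sketched as a recursive sum over child queues. The types here are illustrative, not the real capacity-scheduler classes; note that a user-count sum would only be an upper bound, since one user can have jobs in several child queues:

```java
import java.util.List;

public class QueueInfoSketch {
    // Minimal hypothetical queue node; the real capacity-scheduler types differ.
    static class Queue {
        final int waitingJobs;
        final List<Queue> children;
        Queue(int waitingJobs, List<Queue> children) {
            this.waitingJobs = waitingJobs;
            this.children = children;
        }
    }

    // A container queue's Job Info would report the sum over its subtree
    // instead of a flat 0.
    static int waitingJobs(Queue q) {
        int total = q.waitingJobs;
        for (Queue child : q.children) {
            total += waitingJobs(child);
        }
        return total;
    }

    // The scenario from the report: q1 with children Sq1 and Sq2, each
    // holding 6 waiting jobs.
    static int reportedScenario() {
        Queue sq1 = new Queue(6, List.of());
        Queue sq2 = new Queue(6, List.of());
        Queue q1 = new Queue(0, List.of(sq1, sq2));
        return waitingJobs(q1);
    }
}
```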






[jira] Updated: (MAPREDUCE-998) Wrong error message thrown when we try to submit to a container queue.

2009-09-17 Thread Karam Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karam Singh updated MAPREDUCE-998:
--

Description: 
The setup has multilevel queues:
parent queues a and b, where a has two child queues, a11 and a12. If we try to 
submit to queue a, the following error is thrown:
[
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Queue "a" does not 
exist
at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2758)
at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2740)

]
whereas it should have a proper message, such as that a user cannot submit a 
job to a container queue.


  was:
The setup has multilevel queues:
parant queues a and b, where a has two child queues, a11 and a12. If we try to 
submit to queue a, the following error is thrown:
[
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Queue "a" does not 
exist
at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2758)
at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2740)

]
whereas it should have a proper message, such as that a user cannot submit a 
job to a container queue.



 Wrong error message thrown when we try to submit to a container queue.
 -

 Key: MAPREDUCE-998
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-998
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched
Affects Versions: 0.21.0
Reporter: Karam Singh

 The setup has multilevel queues:
 parent queues a and b, where a has two child queues, a11 and a12. If we try 
 to submit to queue a, the following error is thrown:
 [
 org.apache.hadoop.ipc.RemoteException: java.io.IOException: Queue "a" does 
 not exist
 at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2758)
 at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2740)
 ]
 whereas it should have a proper message, such as that a user cannot submit a 
 job to a container queue.
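A sketch of the check the reporter is asking for: distinguish a container (non-leaf) queue from a nonexistent one before submission. The queue sets here are hypothetical stand-ins for the real JobTracker/queue-manager structures:

```java
import java.util.Set;

public class SubmitCheckSketch {
    // Hypothetical view of the queue hierarchy: all known queue names, plus
    // the subset that are container (non-leaf) queues.
    static String submitError(String queue, Set<String> allQueues,
                              Set<String> containerQueues) {
        if (!allQueues.contains(queue)) {
            return "Queue \"" + queue + "\" does not exist";
        }
        if (containerQueues.contains(queue)) {
            // The distinct message the report asks for.
            return "Queue \"" + queue
                + "\" is a container queue; jobs can only be submitted to leaf queues";
        }
        return null; // submission allowed
    }
}
```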




[jira] Commented: (MAPREDUCE-997) Acls are not working properly when they are set to user groups

2009-09-17 Thread Karam Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12756519#action_12756519
 ] 

Karam Singh commented on MAPREDUCE-997:
---

Yes, I used hadoop.job.ugi.

 Acls are not working properly when they are set to user groups 
 ---

 Key: MAPREDUCE-997
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-997
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Karam Singh

 When submit-job-acl is set to a user group (ug1),
 if a user submits a job using hadoop.job.ugi=u1,ug2, it also gets accepted 
 (user u1 is also part of ug1).
 In hadoop 0.20.0, the job gets rejected. It is a regression issue.




[jira] Commented: (MAPREDUCE-887) After 4491, task cleanup directory sometimes gets created under the ownership of the tasktracker user instead of the job-submitting user.

2009-08-28 Thread Karam Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12748815#action_12748815
 ] 

Karam Singh commented on MAPREDUCE-887:
---

It seems the issue observed is a timing one, as cleanup directories are removed 
very quickly, so by the time the ls -lR command was launched the ownership 
might already have changed. Checked by commenting out the cleanup code and 
found the permissions are set correctly, so the issue is being resolved as 
invalid.

 After 4491, task cleanup directory sometimes gets created under the ownership 
 of the tasktracker user instead of the job-submitting user.
 

 Key: MAPREDUCE-887
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-887
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 0.21.0
Reporter: Karam Singh

 Sometimes, when a task is killed, the task cleanup directory is created under 
 the ownership of the tasktracker-launching user instead of the job-submitting 
 user.
 [dr-xrws--- karams   hadoop  ]  job_200908170914_0020
  |-- [drwxr-sr-x mapred   hadoop  ]  
 attempt_200908170914_0020_m_02_0.cleanup
  `-- [drwxrws--- karams   hadoop  ]  
 attempt_200908170914_0020_m_12_0
 Here karams is the user who submitted the job and mapred is the user who 
 launched the TT. The task attempt's .cleanup directory is created as the 
 mapred user, not the karams user.
 This issue is intermittent and not always reproducible. 




[jira] Created: (MAPREDUCE-857) task fails with NPE when GzipCodec is used for mapred.map.output.compression.codec and the native library is not present

2009-08-13 Thread Karam Singh (JIRA)
task fails with NPE when GzipCodec is used for 
mapred.map.output.compression.codec and the native library is not present


 Key: MAPREDUCE-857
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-857
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: task
Reporter: Karam Singh
 Fix For: 0.21.0


Ran a job with 
mapred.map.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec.
When the maps of the job complete, they fail with the following NPE.
Task log:
2009-08-12 13:48:13,423 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 256
2009-08-12 13:48:13,611 INFO org.apache.hadoop.mapred.MapTask: data buffer = 
204010944/214748368
2009-08-12 13:48:13,611 INFO org.apache.hadoop.mapred.MapTask: record buffer = 
3187670/3355443
2009-08-12 13:49:45,473 INFO org.apache.hadoop.mapred.MapTask: Starting flush 
of map output
2009-08-12 13:49:45,544 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to 
load native-hadoop library for your platform... using builtin-java classes 
where applicable
2009-08-12 13:49:45,545 INFO org.apache.hadoop.io.compress.CodecPool: Got 
brand-new compressor
2009-08-12 13:49:45,546 WARN org.apache.hadoop.mapred.Child: Error running 
child : java.lang.NullPointerException
at org.apache.hadoop.mapred.IFile$Writer.init(IFile.java:105)
at 
org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1248)
at 
org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1146)
at 
org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:528)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:604)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:318)
at org.apache.hadoop.mapred.Child.main(Child.java:162)

Line 105 of IFile.java contains the following line in the trunk code on which 
the error was seen:
Line 104: this.compressor = CodecPool.getCompressor(codec);
Line 105: this.compressor.reset();


If the native library is available, the job runs successfully without any failures. 
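The stack trace implies CodecPool.getCompressor handed back null when GzipCodec's native library was absent, so the unconditional compressor.reset() on line 105 threw. A guard along these lines would avoid the NPE; this is a sketch with stand-in types, not the actual IFile.Writer code:

```java
public class CompressorGuardSketch {
    // Stand-in for org.apache.hadoop.io.compress.Compressor.
    interface Compressor {
        void reset();
    }

    // Stand-in for CodecPool.getCompressor(codec): per the stack trace, it
    // evidently returns null when the codec's native library is missing.
    static Compressor getCompressor(boolean nativeLoaded) {
        if (!nativeLoaded) {
            return null;
        }
        return new Compressor() {
            public void reset() { /* no-op for the sketch */ }
        };
    }

    // The quoted IFile.Writer lines call compressor.reset() unconditionally;
    // a null check lets the writer fall back to plain output instead.
    // Returns true when a compressor was obtained and reset.
    static boolean openWriter(boolean nativeLoaded) {
        Compressor compressor = getCompressor(nativeLoaded);
        if (compressor != null) {
            compressor.reset();
            return true;  // compressed stream
        }
        return false;     // uncompressed fall-back instead of an NPE
    }
}
```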





[jira] Updated: (MAPREDUCE-857) task fails with NPE when GzipCodec is used for mapred.map.output.compression.codec and the native library is not present

2009-08-13 Thread Karam Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karam Singh updated MAPREDUCE-857:
--

Affects Version/s: 0.21.0
Fix Version/s: (was: 0.21.0)

 task fails with NPE when GzipCodec is used for 
 mapred.map.output.compression.codec and the native library is not present
 

 Key: MAPREDUCE-857
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-857
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: task
Affects Versions: 0.21.0
Reporter: Karam Singh

 Ran a job with 
 mapred.map.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec.
 When the maps of the job complete, they fail with the following NPE.
 Task log:
 2009-08-12 13:48:13,423 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 
 256
 2009-08-12 13:48:13,611 INFO org.apache.hadoop.mapred.MapTask: data buffer = 
 204010944/214748368
 2009-08-12 13:48:13,611 INFO org.apache.hadoop.mapred.MapTask: record buffer 
 = 3187670/3355443
 2009-08-12 13:49:45,473 INFO org.apache.hadoop.mapred.MapTask: Starting flush 
 of map output
 2009-08-12 13:49:45,544 WARN org.apache.hadoop.util.NativeCodeLoader: Unable 
 to load native-hadoop library for your platform... using builtin-java classes 
 where applicable
 2009-08-12 13:49:45,545 INFO org.apache.hadoop.io.compress.CodecPool: Got 
 brand-new compressor
 2009-08-12 13:49:45,546 WARN org.apache.hadoop.mapred.Child: Error running 
 child : java.lang.NullPointerException
 at org.apache.hadoop.mapred.IFile$Writer.init(IFile.java:105)
 at 
 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1248)
 at 
 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1146)
 at 
 org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:528)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:604)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:318)
 at org.apache.hadoop.mapred.Child.main(Child.java:162)
 Line 105 of IFile.java contains the following line in the trunk code on which 
 the error was seen:
 Line 104: this.compressor = CodecPool.getCompressor(codec);
 Line 105: this.compressor.reset();
 If the native library is available, the job runs successfully without any 
 failures. 




[jira] Created: (MAPREDUCE-832) Too many WARN messages about deprecated memory config variables in JobTracker log

2009-08-07 Thread Karam Singh (JIRA)
Too many WARN messages about deprecated memory config variables in JobTracker 
log
--

 Key: MAPREDUCE-832
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-832
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Karam Singh


When a user submits a mapred job using an old memory config variable 
(mapred.task.maxvmem), the following message appears too many times in the 
JobTracker logs:
[
WARN org.apache.hadoop.mapred.JobConf: The variable mapred.task.maxvmem is no 
longer used instead use  mapred.job.map.memory.mb and 
mapred.job.reduce.memory.mb
]
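One way to keep the log readable, sketched here with a hypothetical helper rather than the actual JobConf logging code, is to emit each deprecation warning only once per variable:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class WarnOnceSketch {
    // Deprecated keys already reported; a concurrent set so that parallel job
    // submissions do not each re-log the warning.
    private static final Set<String> warned = ConcurrentHashMap.newKeySet();

    // True only the first time a deprecated key is seen, giving one WARN line
    // per variable in the JobTracker log instead of one per submission.
    static boolean shouldWarn(String deprecatedKey) {
        return warned.add(deprecatedKey);
    }
}
```

The caller would log the WARN line only when shouldWarn returns true.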





[jira] Created: (MAPREDUCE-833) Jobclient does not print any warning message when an old memory config variable is used with the -D option from the command line

2009-08-07 Thread Karam Singh (JIRA)
Jobclient does not print any warning message when an old memory config 
variable is used with the -D option from the command line
--

 Key: MAPREDUCE-833
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-833
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Karam Singh







[jira] Created: (MAPREDUCE-834) When the TaskTracker config uses old memory management values, its memory monitoring is disabled.

2009-08-07 Thread Karam Singh (JIRA)
When the TaskTracker config uses old memory management values, its memory 
monitoring is disabled.
--

 Key: MAPREDUCE-834
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-834
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Karam Singh


TaskTracker memory config values:
mapred.tasktracker.vmem.reserved=8589934592
mapred.task.default.maxvmem=2147483648
mapred.task.limit.maxvmem=4294967296
mapred.tasktracker.pmem.reserved=2147483648
TaskTracker starts as:
   2009-08-05 12:39:03,308 WARN 
org.apache.hadoop.mapred.TaskTracker: The variable 
mapred.tasktracker.vmem.reserved is no longer used
2009-08-05 12:39:03,308 WARN 
org.apache.hadoop.mapred.TaskTracker: The variable 
mapred.tasktracker.pmem.reserved is no longer used
2009-08-05 12:39:03,308 WARN 
org.apache.hadoop.mapred.TaskTracker: The variable mapred.task.default.maxvmem 
is no longer used
2009-08-05 12:39:03,308 WARN 
org.apache.hadoop.mapred.TaskTracker: The variable mapred.task.limit.maxvmem is 
no longer used
2009-08-05 12:39:03,308 INFO 
org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for 
all reduce tasks on tracker_name
2009-08-05 12:39:03,309 INFO 
org.apache.hadoop.mapred.TaskTracker:  Using MemoryCalculatorPlugin : 
org.apache.hadoop.util.linuxmemorycalculatorplu...@19be4777
2009-08-05 12:39:03,311 WARN 
org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks 
is -1. TaskMemoryManager is disabled.
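A sketch of a possible fallback, using the property names from the report (the real translation logic may differ): derive the per-task allotment from the old mapred.task.default.maxvmem instead of reporting totalMemoryAllottedForTasks = -1 and disabling the TaskMemoryManager when only deprecated variables are set:

```java
public class MemoryConfigFallbackSketch {
    static final long DISABLED = -1L;

    // Derive totalMemoryAllottedForTasks (in MB): prefer the new per-task
    // limit, otherwise fall back to the old mapred.task.default.maxvmem
    // (which the report shows in bytes), otherwise keep monitoring disabled.
    static long totalMemoryForTasksMb(Long newPerTaskLimitMb,
                                      Long oldDefaultMaxVmemBytes,
                                      int slots) {
        if (newPerTaskLimitMb != null) {
            return newPerTaskLimitMb * slots;
        }
        if (oldDefaultMaxVmemBytes != null) {
            return (oldDefaultMaxVmemBytes / (1024L * 1024L)) * slots;
        }
        return DISABLED; // neither old nor new limits configured
    }
}
```

With the report's mapred.task.default.maxvmem=2147483648 (2 GB) and, say, 4 slots, this yields 8192 MB instead of -1.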






[jira] Updated: (MAPREDUCE-832) Too many WARN messages about deprecated memory config variables in JobTracker log

2009-08-07 Thread Karam Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karam Singh updated MAPREDUCE-832:
--

Summary: Too many WARN messages about deprecated memory config variables 
in JobTracker log  (was: Too man y WARN messages about deprecated memorty config 
variables in JobTacker log)

 Too many WARN messages about deprecated memory config variables in JobTracker 
 log
 -

 Key: MAPREDUCE-832
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-832
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Karam Singh

 When a user submits a mapred job using an old memory config variable 
 (mapred.task.maxvmem), the following message appears too many times in the 
 JobTracker logs:
 [
 WARN org.apache.hadoop.mapred.JobConf: The variable mapred.task.maxvmem is no 
 longer used instead use  mapred.job.map.memory.mb and 
 mapred.job.reduce.memory.mb
 ]




[jira] Created: (MAPREDUCE-835) hadoop-mapred examples, test and tools jar files are not being packaged when running ant binary or bin-package

2009-08-07 Thread Karam Singh (JIRA)
hadoop-mapred examples, test and tools jar files are not being packaged when 
running ant binary or bin-package


 Key: MAPREDUCE-835
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-835
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Karam Singh


When checking mapreduce trunk:
if the ant binary or ant bin-package commands are run, 
hadoop-mapred-test-0.21.0-dev.jar, hadoop-mapred-examples-0.21.0-dev.jar, and 
hadoop-mapred-tools-0.21.0-dev.jar are not included in the tar or in the 
build/hadoop-mapred-0.21.0-dev package directory, although they are present 
under the build directory.

For ant tar and ant package they are packaged correctly into the 
build/hadoop-mapred-0.21.0-dev directory.




[jira] Created: (MAPREDUCE-836) Examples of hadoop pipes and python are not packaged even when the -Dcompile.native=yes -Dcompile.c++=yes options are used while running ant package or tar or similar commands.

2009-08-07 Thread Karam Singh (JIRA)
Examples of hadoop pipes and python are not packaged even when the 
-Dcompile.native=yes -Dcompile.c++=yes options are used while running ant 
package or tar or similar commands.
-

 Key: MAPREDUCE-836
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-836
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Affects Versions: 0.20.1, 0.21.0
Reporter: Karam Singh


Examples of hadoop pipes and python are not packaged even when the 
-Dcompile.native=yes -Dcompile.c++=yes options are used while running ant 
package or tar or similar commands. 
The pipes examples are compiled and copied under build/c++-examples but are not 
being packaged. The same is the case with the python examples.




[jira] Created: (MAPREDUCE-734) java.util.ConcurrentModificationException observed in unreserving slots for HiRam Jobs

2009-07-08 Thread Karam Singh (JIRA)
java.util.ConcurrentModificationException observed in unreserving slots for 
HiRam Jobs
--

 Key: MAPREDUCE-734
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-734
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched
Reporter: Karam Singh


Ran jobs, out of which 3 were HiRAM; the jobs were not removed from the 
scheduler queue even after they completed successfully.
hadoop queue -info queue -showJobs displays something like:
job_200907080724_0031   2   1247059146868   username  NORMAL  0 running map 
tasks using 0 map slots. 0 additional slots reserved. 0 running reduce tasks 
using 0 reduce slots. 60 additional slots reserved.
job_200907080724_0030   2   1247059146972   username  NORMAL  0 running map 
tasks using 0 map slots. 0 additional slots reserved. 0 running reduce tasks 
using 0 reduce slots. 60 additional slots reserved.

This does not block anything, but the entries look like zombie processes in 
the system.
The JobTracker log shows java.util.ConcurrentModificationException.
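The usual cause of a ConcurrentModificationException in this kind of cleanup is removing reservations from a collection while iterating over it directly. A sketch of the standard fix (iterate over a snapshot copy, or use Iterator.remove), with illustrative types rather than the real capacity-scheduler ones:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Set;

public class UnreserveSketch {
    // Removing entries from a collection while iterating it directly throws
    // java.util.ConcurrentModificationException; iterating a snapshot copy
    // avoids it while still mutating the live set.
    static int unreserveAll(Set<String> reservedTrackers) {
        int released = 0;
        for (String tracker : new ArrayList<>(reservedTrackers)) {
            reservedTrackers.remove(tracker); // safe: we iterate the copy
            released++;
        }
        return released;
    }

    // Tiny demonstration: three reservations released, live set emptied.
    static boolean demo() {
        Set<String> reserved = new HashSet<>();
        reserved.add("tt1");
        reserved.add("tt2");
        reserved.add("tt3");
        return unreserveAll(reserved) == 3 && reserved.isEmpty();
    }
}
```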






[jira] Created: (MAPREDUCE-722) More slots are getting reserved for HiRAM job tasks than required

2009-07-07 Thread Karam Singh (JIRA)
More slots are getting reserved for HiRAM job tasks than required
-

 Key: MAPREDUCE-722
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-722
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched
 Environment: Cluster MR capacity=248/248, map slot size=1500 MB and 
reduce slot size=2048 MB. Total number of nodes=124.
4 queues, each having Capacity=25% and User Limit=100%.


Reporter: Karam Singh


Submitted a normal job with maps=124=reduces.
Then submitted a High RAM job with maps=31=reduces, map.memory=1800, 
reduce.memory=2800.
Then again 3 jobs with maps=124=reduces.
A total of 248 slots were reserved for both maps and reduces for the High RAM 
job, which is much higher than required.
Observed in Hadoop 0.20.0.
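The expected accounting can be sketched as follows (illustrative names, not the actual scheduler code): a job with P pending tasks should never have more than P times the slots-per-task reserved, regardless of cluster capacity:

```java
public class ReservationCapSketch {
    // Cap reservations at what the job can still use; reserving up to the
    // full cluster capacity (248 slots for a 31-task high-RAM job, as in the
    // report) over-reserves.
    static int slotsToReserve(int pendingTasks, int slotsPerTask, int availableSlots) {
        int needed = pendingTasks * slotsPerTask;
        return Math.min(needed, availableSlots);
    }
}
```

For the reported case, 31 pending high-RAM tasks at 2 slots each would cap at 62 reserved slots rather than 248.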
