[jira] [Updated] (MAPREDUCE-5611) CombineFileInputFormat creates more rack-local tasks due to less split location info.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karam Singh updated MAPREDUCE-5611: --- Affects Version/s: (was: 2.2.0) CombineFileInputFormat creates more rack-local tasks due to less split location info. - Key: MAPREDUCE-5611 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5611 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Chandra Prakash Bhagtani Assignee: Chandra Prakash Bhagtani

I have come across an issue with CombineFileInputFormat. I ran a hive query on approx 1.2 GB of data with CombineHiveInputFormat, which internally uses CombineFileInputFormat. My cluster has 9 datanodes and max.split.size is 256 MB. When I ran this query with replication factor 9, hive consistently creates all 6 tasks rack-local, and with replication factor 3 it creates 5 rack-local and 1 data-local task. When the replication factor is 9 (equal to the cluster size), all the tasks should be data-local, as each datanode contains all the replicas of the input data, but that is not happening, i.e. all the tasks are rack-local. When I dug into the CombineFileInputFormat.java code in the getMoreSplits method, I found the issue in the following snippet (especially in the case of a higher replication factor):

{code:title=CombineFileInputFormat.java|borderStyle=solid}
for (Iterator<Map.Entry<String, List<OneBlockInfo>>> iter =
       nodeToBlocks.entrySet().iterator(); iter.hasNext();) {
  Map.Entry<String, List<OneBlockInfo>> one = iter.next();
  nodes.add(one.getKey());
  List<OneBlockInfo> blocksInNode = one.getValue();

  // for each block, copy it into validBlocks. Delete it from
  // blockToNodes so that the same block does not appear in
  // two different splits.
  for (OneBlockInfo oneblock : blocksInNode) {
    if (blockToNodes.containsKey(oneblock)) {
      validBlocks.add(oneblock);
      blockToNodes.remove(oneblock);
      curSplitSize += oneblock.length;

      // if the accumulated split size exceeds the maximum, then
      // create this split.
      if (maxSize != 0 && curSplitSize >= maxSize) {
        // create an input split and add it to the splits array
        addCreatedSplit(splits, nodes, validBlocks);
        curSplitSize = 0;
        validBlocks.clear();
      }
    }
  }
}
{code}

The first node in the nodeToBlocks map has all the replicas of the input file, so the above code creates 6 splits, all with only one location. Now if the JT doesn't schedule these tasks on that node, all the tasks will be rack-local, even though all the other datanodes have the other replicas. -- This message was sent by Atlassian JIRA (v6.1#6144)
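The effect can be reproduced in miniature. The sketch below uses hypothetical names (not the actual Hadoop classes) to model the greedy per-node pass: with full replication, the first node in nodeToBlocks consumes every unassigned block, so each split records that single node as its only location, matching the 6-rack-local-tasks observation above. (Leftover blocks smaller than a full split are handled elsewhere in the real code and omitted here.)

```java
import java.util.*;

public class SplitLocationDemo {
    // Greedy grouping modeled on CombineFileInputFormat.getMoreSplits:
    // each split records only the node it was formed on.
    static List<List<String>> assignSplits(Map<String, List<String>> nodeToBlocks,
                                           Set<String> unassigned, int blocksPerSplit) {
        List<List<String>> splitLocations = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : nodeToBlocks.entrySet()) {
            int cur = 0;
            for (String block : e.getValue()) {
                // remove() returns true only the first time a block is seen
                if (unassigned.remove(block) && ++cur == blocksPerSplit) {
                    // the split is "located" only on the current node
                    splitLocations.add(Collections.singletonList(e.getKey()));
                    cur = 0;
                }
            }
        }
        return splitLocations;
    }

    public static void main(String[] args) {
        // 3 nodes with replication = 3: every node sees every block
        List<String> blocks = Arrays.asList("b1", "b2", "b3", "b4", "b5", "b6");
        Map<String, List<String>> nodeToBlocks = new LinkedHashMap<>();
        for (String n : Arrays.asList("node1", "node2", "node3")) {
            nodeToBlocks.put(n, blocks);
        }
        List<List<String>> splits =
            assignSplits(nodeToBlocks, new HashSet<>(blocks), 2);
        // All blocks are consumed while iterating node1, so every split lists
        // only node1, even though node2 and node3 hold replicas of everything.
        System.out.println(splits); // [[node1], [node1], [node1]]
    }
}
```

Unless the scheduler happens to place those tasks on node1, every task is at best rack-local, which is exactly the reported behavior; carrying all nodes that still hold a split's blocks as its locations would avoid this.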
[jira] [Created] (MAPREDUCE-3057) Job History Server goes OutOfMemory with 1200 Jobs and Heap Size set to 10 GB
Job History Server goes OutOfMemory with 1200 Jobs and Heap Size set to 10 GB Key: MAPREDUCE-3057 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3057 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 0.23.0 Reporter: Karam Singh History server was started with -Xmx1m. Ran GridMix V3 with a 1200-job trace in STRESS mode on 350 nodes, with 4 NMs per node. All jobs finished, as reported by the RM Web UI and HADOOP_MAPRED_HOME/bin/mapred job -list all, but the GridMix job client was stuck while trying to connect to the HistoryServer. Then tried HADOOP_MAPRED_HOME/bin/mapred job -status jobid; the JobClient also got stuck while looking for a token to connect to the History server. Then looked at the History Server logs and found it is throwing java.lang.OutOfMemoryError: GC overhead limit exceeded. With 10 GB of heap space and 1200 jobs, the History Server should not go out of memory, no matter what type of jobs they are. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-3058) Sometimes task keeps on running while its Syslog says that it is shutdown
Sometimes task keeps on running while its Syslog says that it is shutdown - Key: MAPREDUCE-3058 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3058 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Karam Singh

While running GridMix V3, one job got stuck for 15 hrs. Clicking on the job showed that one of its reduces was stuck. The syslog of the reducer shows the task started at:

2011-09-19 17:57:22,002 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2011-09-19 17:57:22,002 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system started

while the end of the syslog says:

2011-09-19 18:06:49,818 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as DATANODE1
2011-09-19 18:06:49,818 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block BP-1405370709-NAMENODE-1316452621953:blk_-7004355226367468317_79871 in pipeline DATANODE2, DATANODE1: bad datanode DATANODE1
2011-09-19 18:06:49,818 DEBUG org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtocol: lastAckedSeqno = 26870
2011-09-19 18:06:49,820 DEBUG org.apache.hadoop.ipc.Client: IPC Client (26613121) connection to NAMENODE from gridperf sending #454
2011-09-19 18:06:49,826 DEBUG org.apache.hadoop.ipc.Client: IPC Client (26613121) connection to NAMENODE from gridperf got value #454
2011-09-19 18:06:49,827 DEBUG org.apache.hadoop.ipc.RPC: Call: getAdditionalDatanode 8
2011-09-19 18:06:49,827 DEBUG org.apache.hadoop.hdfs.DFSClient: Connecting to datanode DATANODE2
2011-09-19 18:06:49,827 DEBUG org.apache.hadoop.hdfs.DFSClient: Send buf size 131071
2011-09-19 18:06:49,833 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception java.io.EOFException: Premature EOF: no length prefix available
 at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:158)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:860)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:838)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:929)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:740)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:415)
2011-09-19 18:06:49,837 WARN org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.EOFException: Premature EOF: no length prefix available
 at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:158)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:860)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:838)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:929)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:740)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:415)
2011-09-19 18:06:49,837 DEBUG org.apache.hadoop.ipc.Client: IPC Client (26613121) connection to APPMASTER from job_1316452677984_0862 sending #455
2011-09-19 18:06:49,839 DEBUG org.apache.hadoop.ipc.Client: IPC Client (26613121) connection to APPMASTER from job_1316452677984_0862 got value #455
2011-09-19 18:06:49,840 DEBUG org.apache.hadoop.ipc.RPC: Call: statusUpdate 3
2011-09-19 18:06:49,840 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
2011-09-19 18:06:49,840 DEBUG org.apache.hadoop.ipc.Client: IPC Client (26613121) connection to NAMENODE from gridperf sending #456
2011-09-19 18:06:49,858 DEBUG org.apache.hadoop.ipc.Client: IPC Client (26613121) connection to NAMENODE from gridperf got value #456
2011-09-19 18:06:49,858 DEBUG org.apache.hadoop.ipc.RPC: Call: delete 18
2011-09-19 18:06:49,858 DEBUG org.apache.hadoop.ipc.Client: IPC Client (26613121) connection to APPMASTER from job_1316452677984_0862 sending #457
2011-09-19 18:06:49,859 DEBUG org.apache.hadoop.ipc.Client: IPC Client (26613121) connection to APPMASTER from job_1316452677984_0862 got value #457
2011-09-19 18:06:49,859 DEBUG org.apache.hadoop.ipc.RPC: Call: reportDiagnosticInfo 1
2011-09-19 18:06:49,859 DEBUG org.apache.hadoop.metrics2.impl.MetricsSystemImpl: refCount=1
2011-09-19 18:06:49,859 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping ReduceTask metrics system...
2011-09-19 18:06:49,859 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping
[jira] [Created] (MAPREDUCE-3059) Resourcemanager metrices does not have aggregate containers allocated and containers eeleased Metrics
Resourcemanager metrices does not have aggregate containers allocated and containers eeleased Metrics - Key: MAPREDUCE-3059 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3059 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Reporter: Karam Singh ResourceManager metrics do not have any aggregate containers-allocated or containers-released metrics. If I want to know how many containers were allocated or released, there is no way to find out. NodeManagers do have containers-launched and containers-released metrics, but that is not a central location; to get an aggregate number, all NM metrics first need to be merged. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3059) Resourcemanager metrices does not have aggregate containers allocated and containers released Metrics
[ https://issues.apache.org/jira/browse/MAPREDUCE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karam Singh updated MAPREDUCE-3059: --- Summary: Resourcemanager metrices does not have aggregate containers allocated and containers released Metrics (was: Resourcemanager metrices does not have aggregate containers allocated and containers eeleased Metrics) Resourcemanager metrices does not have aggregate containers allocated and containers released Metrics - Key: MAPREDUCE-3059 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3059 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Reporter: Karam Singh ResourceManager metrics do not have any aggregate containers-allocated or containers-released metrics. If I want to know how many containers were allocated or released, there is no way to find out. NodeManagers do have containers-launched and containers-released metrics, but that is not a central location; to get an aggregate number, all NM metrics first need to be merged. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
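Until such RM-side metrics exist, the aggregation described above has to be done by merging every NodeManager's counters. A minimal sketch (hypothetical counter shape, not actual Hadoop metric APIs), assuming each NM reports a (launched, released) pair:

```java
import java.util.Arrays;
import java.util.List;

public class AggregateContainerMetrics {
    // Element-wise sum over per-NodeManager (launched, released) pairs.
    // The counter names/shape are illustrative; the point of the report
    // is that the RM does not expose this aggregate itself.
    static long[] aggregate(List<long[]> perNodeLaunchedReleased) {
        long launched = 0, released = 0;
        for (long[] m : perNodeLaunchedReleased) {
            launched += m[0];
            released += m[1];
        }
        return new long[] { launched, released };
    }

    public static void main(String[] args) {
        // Three NMs report their own counters; the cluster-wide number
        // only exists after merging them all.
        long[] total = aggregate(Arrays.asList(
            new long[] {10, 8}, new long[] {7, 7}, new long[] {3, 1}));
        System.out.println(total[0] + " launched, " + total[1] + " released");
        // 20 launched, 16 released
    }
}
```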
[jira] [Created] (MAPREDUCE-3031) Job Client goes into infinite loop when we kill AM
Job Client goes into infinite loop when we kill AM -- Key: MAPREDUCE-3031 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3031 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Karam Singh Started a cluster. Submitted a sleep job with around 1 maps and 1000 reduces. Killed the AM with kill -9 after about 7000 maps got completed. The RM kept reporting the application as RUNNING, and the job client went into an infinite loop trying to connect to the AM. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-3033) JobClient requires mapreduce.jobtracker.address tag in mapred-site.xm even mapreduce.framework.name is set top yarn
JobClient requires mapreduce.jobtracker.address tag in mapred-site.xm even mapreduce.framework.name is set top yarn --- Key: MAPREDUCE-3033 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3033 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission Reporter: Karam Singh If mapreduce.jobtracker.address is not set in mapred-site.xml and mapreduce.framework.name is set to yarn, job submission fails. Tried to submit a sleep job with 1 map task. Job submission failed with the following exception:

11/09/19 13:19:20 INFO ipc.YarnRPC: Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
11/09/19 13:19:20 INFO mapred.ResourceMgrDelegate: Connecting to ResourceManager at RMHost:8040
11/09/19 13:19:20 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
11/09/19 13:19:20 INFO mapred.ResourceMgrDelegate: Connected to ResourceManager at RMHost:8040
11/09/19 13:19:21 INFO mapred.ResourceMgrDelegate: DEBUG --- getStagingAreaDir: dir=/user/username/.staging
11/09/19 13:19:21 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/username/.staging/job_1316435926198_0004
java.lang.RuntimeException: Not a host:port pair: local
 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:148)
 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:132)
 at org.apache.hadoop.mapred.Master.getMasterAddress(Master.java:42)
 at org.apache.hadoop.mapred.Master.getMasterPrincipal(Master.java:47)
 at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:104)
 at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:90)
 at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:83)
 at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:346)
 at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1072)
 at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1069)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:1069)
 at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1089)
 at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:262)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
 at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
 at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
 at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:111)
 at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:118)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:189)
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
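The failure in the stack trace comes from parsing the JobTracker address: when the tag is absent, the old default value local reaches NetUtils.createSocketAddr, which expects a host:port string. A minimal sketch of that check (a hypothetical re-implementation for illustration, not Hadoop's NetUtils):

```java
public class HostPortCheck {
    // Parse "host:port"; mirrors the failure mode in the trace above,
    // where the default value "local" has no port component.
    static String[] parseHostPort(String target) {
        int colon = target.indexOf(':');
        if (colon < 0) {
            throw new RuntimeException("Not a host:port pair: " + target);
        }
        return new String[] { target.substring(0, colon), target.substring(colon + 1) };
    }

    public static void main(String[] args) {
        String[] hp = parseHostPort("jthost:9001");   // parses fine
        System.out.println(hp[0] + " / " + hp[1]);    // jthost / 9001
        try {
            parseHostPort("local");  // the default when the tag is absent
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());  // Not a host:port pair: local
        }
    }
}
```

This is why setting the tag to any explicit host:port value sidesteps the error even though YARN never contacts a JobTracker.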
[jira] [Updated] (MAPREDUCE-3033) JobClient requires mapreduce.jobtracker.address tag in mapred-site.xm even mapreduce.framework.name is set to yarn
[ https://issues.apache.org/jira/browse/MAPREDUCE-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karam Singh updated MAPREDUCE-3033: --- Summary: JobClient requires mapreduce.jobtracker.address tag in mapred-site.xm even mapreduce.framework.name is set to yarn (was: JobClient requires mapreduce.jobtracker.address tag in mapred-site.xm even mapreduce.framework.name is set top yarn) JobClient requires mapreduce.jobtracker.address tag in mapred-site.xm even mapreduce.framework.name is set to yarn -- Key: MAPREDUCE-3033 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3033 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission Reporter: Karam Singh If mapreduce.jobtracker.address is set in mapred-site.xml And mapreduce.framework.name is set yarn job submission fails : Tried to submit sleep job with maps 1 task. Job submission failed with following exception -: 11/09/19 13:19:20 INFO ipc.YarnRPC: Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC 11/09/19 13:19:20 INFO mapred.ResourceMgrDelegate: Connecting to ResourceManager at RMHost:8040 11/09/19 13:19:20 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol 11/09/19 13:19:20 INFO mapred.ResourceMgrDelegate: Connected to ResourceManager at RMHost:8040 11/09/19 13:19:21 INFO mapred.ResourceMgrDelegate: DEBUG --- getStagingAreaDir: dir=/user/username/.staging 11/09/19 13:19:21 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/username/.staging/job_1316435926198_0004 java.lang.RuntimeException: Not a host:port pair: local at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:148) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:132) at org.apache.hadoop.mapred.Master.getMasterAddress(Master.java:42) at org.apache.hadoop.mapred.Master.getMasterPrincipal(Master.java:47) at 
org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:104) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:90) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:83) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:346) at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1072) at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1069) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1069) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1089) at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:262) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69) at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72) at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144) at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:111) at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:118) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:189) -- This message is automatically generated by 
JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3006) MapReduce AM exits prematurely before completely writing and closing the JobHistory file
[ https://issues.apache.org/jira/browse/MAPREDUCE-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13106887#comment-13106887 ] Karam Singh commented on MAPREDUCE-3006: Today did a fresh checkout of branch 0.23, applied the patch provided by Vinod, compiled yarn and deployed it. Ran the sleep job with 100K tasks; when the job completed, successfully accessed the job from the HistoryServer, whereas without the patch the job history did not get completed. MapReduce AM exits prematurely before completely writing and closing the JobHistory file Key: MAPREDUCE-3006 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3006 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Fix For: 0.23.0 Attachments: MAPREDUCE-3006-20110915.txt, MAPREDUCE-3006-20110916.txt [~Karams] was executing a sleep job with 100,000 tasks on a 350 node cluster to test MR AM's scalability and ran into this. The job ran successfully but the history was not available. I debugged around and figured that the job is finishing prematurely before the JobHistory is written. In most of the cases, we don't see this bug as we have a 5-second sleep in the AM towards the end. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3007) JobClient cannot talk to JobHistory server in secure mode
[ https://issues.apache.org/jira/browse/MAPREDUCE-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13105260#comment-13105260 ] Karam Singh commented on MAPREDUCE-3007: Applied MAPREDUCE-3007-20110914.2.txt Now I am not facing the problem of Jobclient not able to connect to Historyserver JobClient cannot talk to JobHistory server in secure mode - Key: MAPREDUCE-3007 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3007 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Fix For: 0.23.0 Attachments: MAPREDUCE-3007-20110914.2.txt, MAPREDUCE-3007-20110914.txt In secure mode, Jobclient cannot connect to HistoryServer. Thanks to [~karams] for finding this out. {code} 11/09/14 09:57:51 INFO mapred.ClientServiceDelegate: Application state is completed. Redirecting to job history server 11/09/14 09:57:51 INFO security.ApplicationTokenSelector: Looking for a token with service history-server:10020 11/09/14 09:57:51 INFO security.ApplicationTokenSelector: Token kind is YARN_APPLICATION_TOKEN and the token's service name is Am-ip:46257 11/09/14 09:57:51 INFO security.UserGroupInformation: Initiating logout for user-principal 11/09/14 09:57:51 INFO security.UserGroupInformation: Initiating re-login for user-principal 11/09/14 09:57:55 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before. 11/09/14 09:57:56 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before. 11/09/14 09:58:00 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before. 11/09/14 09:58:05 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before. 
11/09/14 09:58:05 WARN ipc.Client: Couldn't setup connection for user-principal to null 11/09/14 09:58:05 INFO mapred.ClientServiceDelegate: Failed to contact AM/History for job job_1315993268700_0001 Will retry.. {code} Am surprised no one working with YARN+MR ever ran into this! -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2947) Sort fails on YARN+MR with lots of task failures
[ https://issues.apache.org/jira/browse/MAPREDUCE-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13100234#comment-13100234 ] Karam Singh commented on MAPREDUCE-2947: I am not seeing the error after applying patch Sort fails on YARN+MR with lots of task failures Key: MAPREDUCE-2947 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2947 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Fix For: 0.23.0 Attachments: MAPREDUCE-2947-20110907.txt [~karams](the great man the world hardly knows about) found lots of failing tasks while running sort on a 350 node cluster. The failed tasks eventually failed the job and this happening consistently on the big cluster. {quote} Container launch failed for container_1315410418107_0002_01_002511 : RemoteTrace: java.lang.IllegalArgumentException at java.nio.Buffer.position(Buffer.java:218) at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:129) at java.nio.ByteBuffer.get(ByteBuffer.java:675) at com.google.protobuf.ByteString.copyFrom(ByteString.java:108) at com.google.protobuf.ByteString.copyFrom(ByteString.java:117) at org.apache.hadoop.yarn.util.ProtoUtils.convertToProtoFormat(ProtoUtils.java:97) at org.apache.hadoop.yarn.api.records.ProtoBase.convertToProtoFormat(ProtoBase.java:59) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.StartContainerResponsePBImpl.access$100(StartContainerResponsePBImpl.java:35) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.StartContainerResponsePBImpl$1$1.next(StartContainerResponsePBImpl.java:134) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.StartContainerResponsePBImpl$1$1.next(StartContainerResponsePBImpl.java:122) at com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:319) at org.apache.hadoop.yarn.proto.YarnServiceProtos$StartContainerResponseProto$Builder.addAllServiceResponse(YarnServiceProtos.java:12620) at 
org.apache.hadoop.yarn.api.protocolrecords.impl.pb.StartContainerResponsePBImpl.addServiceResponseToProto(StartContainerResponsePBImpl.java:144) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.StartContainerResponsePBImpl.mergeLocalToBuilder(StartContainerResponsePBImpl.java:60) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.StartContainerResponsePBImpl.mergeLocalToProto(StartContainerResponsePBImpl.java:68) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.StartContainerResponsePBImpl.getProto(StartContainerResponsePBImpl.java:52) at org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagerPBServiceImpl.startContainer(ContainerManagerPBServiceImpl.java:69) at org.apache.hadoop.yarn.proto.ContainerManager$ContainerManagerService$2.callBlockingMethod(ContainerManager.java:83) at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Server.call(ProtoOverHadoopRpcEngine.java:337) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1496) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1492) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1490) at LocalTrace: org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:151) at $Proxy20.startContainer(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:81) at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:215) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) {quote} 
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (MAPREDUCE-582) Streaming: if streaming command finds errors in the --config , it reports that the input file is not found and fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12878941#action_12878941 ] Karam Singh commented on MAPREDUCE-582: --- I tried to run hadoop --config jar hadoop-streaming.jar -jt local -fs local -files map.pl,dev-karams/red.pl -input data -output test1 -mapper map.pl -reducer red.pl. The job ran fine; I did not get any error. Streaming: if streaming command finds errors in the --config , it reports that the input file is not found and fails Key: MAPREDUCE-582 URL: https://issues.apache.org/jira/browse/MAPREDUCE-582 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming Reporter: arkady borkovsky The error message is ERROR streaming.StreamJob: Error Launching job : Input Pattern ../* matches 0 files which is quite confusing and scary. Needs better error handling. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-595) streaming command line does not honor -jt option
[ https://issues.apache.org/jira/browse/MAPREDUCE-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12878944#action_12878944 ] Karam Singh commented on MAPREDUCE-595: --- This issue does exist anymore. streaming command line does not honor -jt option Key: MAPREDUCE-595 URL: https://issues.apache.org/jira/browse/MAPREDUCE-595 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming Reporter: Karam Singh Priority: Minor Ran the hadoop streaming command as:

bin/hadoop jar contrib/streaming/hadoop-*-streaming.jar -input <input path> -mapper <mapper> -reducer <reducer> -output <output path> -dfs h:p -jt h:p

(Make sure hadoop-site.xml is not in the config dir; dfs and jt are running.) Streaming will run as the local runner. On looking at StreamJob.java, the following was found:

String jt = (String) cmdLine.getValue("mapred.job.tracker");
if (null != jt) {
  userJobConfProps_.put("fs.default.name", jt);
}

where usage creates the option like:

Option jt = createOption("jt", "Optional. Override JobTracker configuration", "h:p|local", 1, false);

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-595) streaming command line does not honor -jt option
[ https://issues.apache.org/jira/browse/MAPREDUCE-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12878945#action_12878945 ] Karam Singh commented on MAPREDUCE-595: --- Ignore my previous comment. This issue does not exist anymore. streaming command line does not honor -jt option Key: MAPREDUCE-595 URL: https://issues.apache.org/jira/browse/MAPREDUCE-595 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming Reporter: Karam Singh Priority: Minor Ran the hadoop streaming command as:

bin/hadoop jar contrib/streaming/hadoop-*-streaming.jar -input <input path> -mapper <mapper> -reducer <reducer> -output <output path> -dfs h:p -jt h:p

(Make sure hadoop-site.xml is not in the config dir; dfs and jt are running.) Streaming will run as the local runner. On looking at StreamJob.java, the following was found:

String jt = (String) cmdLine.getValue("mapred.job.tracker");
if (null != jt) {
  userJobConfProps_.put("fs.default.name", jt);
}

where usage creates the option like:

Option jt = createOption("jt", "Optional. Override JobTracker configuration", "h:p|local", 1, false);

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
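The snippet quoted in the description stores the -jt value under fs.default.name, so mapred.job.tracker is never overridden and streaming falls back to the local runner. A minimal sketch of the presumed one-line fix, with a plain Map standing in for userJobConfProps_ (illustrative only, not the actual StreamJob code):

```java
import java.util.HashMap;
import java.util.Map;

public class JtOptionDemo {
    // Apply the -jt value under the key the framework actually reads.
    static Map<String, String> applyJtOption(String jt) {
        Map<String, String> props = new HashMap<>();
        if (jt != null) {
            // was: props.put("fs.default.name", jt)  <- the reported bug
            props.put("mapred.job.tracker", jt);
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> props = applyJtOption("jthost:9001");
        System.out.println(props.get("mapred.job.tracker"));    // jthost:9001
        System.out.println(props.containsKey("fs.default.name")); // false
    }
}
```

With the value under the wrong key, the JobTracker setting silently stays at its default of local, which matches the observed local-runner behavior.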
[jira] Created: (MAPREDUCE-1540) Sometime JobTracker holds stale refrence of JobInProgress.
Sometime JobTracker holds stale refrence of JobInProgress. -- Key: MAPREDUCE-1540 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1540 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.2 Reporter: Karam Singh Ran random writer, sort and sort validate job. Checked the jmap -histo:live and verified that there is no reference of JobInProgress after Jobs are retired Now submitter around 77 sleeps of around 1 maps. then after 1 hr killed all the job when jobs got retired. again checked jmap -histo:live for JobInProgress for JT process found 2 references were there. Found this while doing snaity testing of 1316 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1540) Sometimes JobTracker holds stale reference of JobInProgress even after Job gets retired
[ https://issues.apache.org/jira/browse/MAPREDUCE-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karam Singh updated MAPREDUCE-1540: --- Summary: Sometimes JobTracker holds stale reference of JobInProgress even after Job gets retired (was: Sometime JobTracker holds stale reference of JobInProgress.) Sometimes JobTracker holds stale reference of JobInProgress even after Job gets retired -- Key: MAPREDUCE-1540 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1540 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.2 Reporter: Karam Singh Ran random writer, sort and sort-validate jobs. Checked jmap -histo:live and verified that there was no reference of JobInProgress after the jobs were retired. Then submitted around 77 sleep jobs of around 1 map each; after 1 hr, killed all the jobs. When the jobs got retired, again checked jmap -histo:live for JobInProgress in the JT process and found 2 references were there. Found this while doing snaity testing of 1316 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1540) Sometimes JobTracker holds stale reference of JobInProgress even after Job gets retired
[ https://issues.apache.org/jira/browse/MAPREDUCE-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karam Singh updated MAPREDUCE-1540: --- Description: Ran random writer, sort and sort-validate jobs. Checked jmap -histo:live and verified that there was no reference of JobInProgress after the jobs were retired. Then submitted around 77 sleep jobs of around 1 map each; after 1 hr, killed all the jobs. When the jobs got retired, again checked jmap -histo:live for JobInProgress in the JT process and found 2 references were there. Found this while doing sanity testing of 1316 was: Ran random writer, sort and sort-validate jobs. Checked jmap -histo:live and verified that there was no reference of JobInProgress after the jobs were retired. Then submitted around 77 sleep jobs of around 1 map each; after 1 hr, killed all the jobs. When the jobs got retired, again checked jmap -histo:live for JobInProgress in the JT process and found 2 references were there. Found this while doing snaity testing of 1316 Sometimes JobTracker holds stale reference of JobInProgress even after Job gets retired -- Key: MAPREDUCE-1540 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1540 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.2 Reporter: Karam Singh Ran random writer, sort and sort-validate jobs. Checked jmap -histo:live and verified that there was no reference of JobInProgress after the jobs were retired. Then submitted around 77 sleep jobs of around 1 map each; after 1 hr, killed all the jobs. When the jobs got retired, again checked jmap -histo:live for JobInProgress in the JT process and found 2 references were there. Found this while doing sanity testing of 1316 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1240) refreshQueues does not work correctly when dealing with maximum-capacity
refreshQueues does not work correctly when dealing with maximum-capacity Key: MAPREDUCE-1240 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1240 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Karam Singh If we comment out or remove the maximum-capacity property, or set maximum-capacity=-1, for a queue that previously had some maximum-capacity value (say 60), and then run: mapred mradmin -refreshQueues when we check the queue scheduling information from the web UI or the CLI, it still retains the old value and schedules tasks up to the old maximum-capacity when there is no other job in the cluster, whereas the expected behavior is that the old maximum-capacity is not retained. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
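The behavior described suggests the refresh path silently keeps the previous limit when maximum-capacity is absent or -1. A hedged sketch of what a correct refresh of that single value could look like (MaxCapacityRefresh and refreshedMaxCapacity are hypothetical names, not capacity-scheduler code; null stands in for a commented-out property):

```java
public class MaxCapacityRefresh {
    static final float UNBOUNDED = -1.0f;

    // When maximum-capacity is removed from the config (null here) or is
    // explicitly -1, the refreshed value must become "unbounded". The
    // reported bug is that `previous` (e.g. 60) was effectively returned
    // instead, so the old limit survived the refresh.
    static float refreshedMaxCapacity(Float configured, float previous) {
        if (configured == null || configured == UNBOUNDED) {
            return UNBOUNDED;
        }
        return configured;
    }
}
```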
[jira] Created: (MAPREDUCE-1232) Job info for root level queue does not print correct values
Job info for root level queue does not print correct values --- Key: MAPREDUCE-1232 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1232 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched Affects Versions: 0.21.0 Reporter: Karam Singh Job info for a root-level queue does not print correct values. Test case: [ Queue name = q1; its sub-queues are Sq1 and Sq2. Submit around 8 jobs to each of the queues Sq1 and Sq2 (out of these, 2 jobs from each queue start running). Job info for each of Sq1 and Sq2 prints: Job info Number of Waiting Jobs: 6 Number of users who have submitted jobs: 1 While job info for q1 (the parent of Sq1 and Sq2) prints: Job info Number of Waiting Jobs: 0 Number of users who have submitted jobs: 0 ] For a root-level queue, either we should remove the Job Info section or we should print the cumulative values of waiting jobs and users who have submitted jobs from the child queues. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
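The cumulative roll-up suggested in the last sentence can be sketched as a recursive sum over child queues (QueueInfoRollup is a hypothetical stand-in, not capacity-scheduler code):

```java
public class QueueInfoRollup {
    final String name;
    final int waitingJobs;              // jobs waiting directly in this queue
    final QueueInfoRollup[] children;

    QueueInfoRollup(String name, int waitingJobs, QueueInfoRollup... children) {
        this.name = name;
        this.waitingJobs = waitingJobs;
        this.children = children;
    }

    // Cumulative count for a container queue: its own waiting jobs plus
    // the totals of all descendant queues, instead of printing 0.
    int totalWaitingJobs() {
        int total = waitingJobs;
        for (QueueInfoRollup child : children) {
            total += child.totalWaitingJobs();
        }
        return total;
    }
}
```

With the numbers from the test case above (6 waiting jobs in each of Sq1 and Sq2), q1 would report 12 waiting jobs rather than 0.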
[jira] Updated: (MAPREDUCE-998) Wrong error message thrown when we try to submit to a container queue.
[ https://issues.apache.org/jira/browse/MAPREDUCE-998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karam Singh updated MAPREDUCE-998: -- Description: The setup has multilevel queues: parent queues a and b, where a has two child queues a11 and a12. If we try to submit to queue a, the following error is thrown: [ org.apache.hadoop.ipc.RemoteException: java.io.IOException: Queue a does not exist at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2758) at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2740) ] whereas it should have a proper message like "user cannot submit job to container queue". was: The setup has multilevel queues: parant queues a and b, where a has two child queues a11 and a12. If we try to submit to queue a, the following error is thrown: [ org.apache.hadoop.ipc.RemoteException: java.io.IOException: Queue a does not exist at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2758) at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2740) ] whereas it should have a proper message like "user cannot submit job to container queue". Wrong error message thrown when we try to submit to a container queue. - Key: MAPREDUCE-998 URL: https://issues.apache.org/jira/browse/MAPREDUCE-998 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched Affects Versions: 0.21.0 Reporter: Karam Singh The setup has multilevel queues: parent queues a and b, where a has two child queues a11 and a12. If we try to submit to queue a, the following error is thrown: [ org.apache.hadoop.ipc.RemoteException: java.io.IOException: Queue a does not exist at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2758) at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2740) ] whereas it should have a proper message like "user cannot submit job to container queue". -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
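A sketch of a validation order that would produce the right message: first check that the queue exists at all, then check whether it is a leaf queue (QueueSubmitCheck and validate are hypothetical names; simple sets stand in for the scheduler's queue hierarchy):

```java
import java.util.Set;

public class QueueSubmitCheck {

    // Distinguish an unknown queue from a container (non-leaf) queue, so
    // the error message matches the actual problem instead of claiming
    // the queue "does not exist".
    static String validate(Set<String> allQueues, Set<String> leafQueues, String queue) {
        if (!allQueues.contains(queue)) {
            return "Queue " + queue + " does not exist";
        }
        if (!leafQueues.contains(queue)) {
            return "User cannot submit job to container queue " + queue;
        }
        return null; // submission allowed
    }
}
```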
[jira] Commented: (MAPREDUCE-997) ACLs are not working properly when they are set to user groups
[ https://issues.apache.org/jira/browse/MAPREDUCE-997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12756519#action_12756519 ] Karam Singh commented on MAPREDUCE-997: --- Yes, I used hadoop.job.ugi ACLs are not working properly when they are set to user groups --- Key: MAPREDUCE-997 URL: https://issues.apache.org/jira/browse/MAPREDUCE-997 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Karam Singh When submit-job-acl is set to a user group (ug1), if a user submits a job using hadoop.job.ugi=u1,ug2 it also gets accepted (user u1 is also part of ug1). In Hadoop 0.20.0, the job gets rejected. It's a regression issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-887) After 4491, task cleanup directory sometimes gets created under the ownership of the tasktracker user instead of the job-submitting user.
[ https://issues.apache.org/jira/browse/MAPREDUCE-887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12748815#action_12748815 ] Karam Singh commented on MAPREDUCE-887: --- It seems the issue observed is a timing one, as cleanup directories are created and cleaned up very quickly. It seems that by the time the ls -lR command was launched, the ownership might have changed. Checked by commenting out the code that cleans up the files and found the permissions are set correctly, so marking the issue as resolved invalid. After 4491, task cleanup directory sometimes gets created under the ownership of the tasktracker user instead of the job-submitting user. Key: MAPREDUCE-887 URL: https://issues.apache.org/jira/browse/MAPREDUCE-887 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.21.0 Reporter: Karam Singh Sometimes, when a task is killed, the task cleanup directory is created under the ownership of the user who launched the TaskTracker instead of the job-submitting user. [dr-xrws--- karams hadoop ] job_200908170914_0020 |-- [drwxr-sr-x mapred hadoop ] attempt_200908170914_0020_m_02_0.cleanup `-- [drwxrws--- karams hadoop ] attempt_200908170914_0020_m_12_0 Here karams is the user who submitted the job and mapred is the user who launched the TT. The task attempt .cleanup directory was created as the mapred user, not as the karams user. This issue is intermittent, not always reproducible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-857) task fails with NPE when GzipCodec is used for mapred.map.output.compression.codec and native library is not present
task fails with NPE when GzipCodec is used for mapred.map.output.compression.codec and native library is not present Key: MAPREDUCE-857 URL: https://issues.apache.org/jira/browse/MAPREDUCE-857 Project: Hadoop Map/Reduce Issue Type: Bug Components: task Reporter: Karam Singh Fix For: 0.21.0 Ran a job with mapred.map.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec. When the maps of the job complete, they fail with the following NPE. tasklog: 2009-08-12 13:48:13,423 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 256 2009-08-12 13:48:13,611 INFO org.apache.hadoop.mapred.MapTask: data buffer = 204010944/214748368 2009-08-12 13:48:13,611 INFO org.apache.hadoop.mapred.MapTask: record buffer = 3187670/3355443 2009-08-12 13:49:45,473 INFO org.apache.hadoop.mapred.MapTask: Starting flush of map output 2009-08-12 13:49:45,544 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2009-08-12 13:49:45,545 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor 2009-08-12 13:49:45,546 WARN org.apache.hadoop.mapred.Child: Error running child : java.lang.NullPointerException at org.apache.hadoop.mapred.IFile$Writer.init(IFile.java:105) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1248) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1146) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:528) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:604) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:318) at org.apache.hadoop.mapred.Child.main(Child.java:162) Lines 104-105 of IFile.java in the trunk code on which the error was seen: Line 104: this.compressor = CodecPool.getCompressor(codec); Line 105: this.compressor.reset(); If the native library is available, the job runs successfully without any failures. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-857) task fails with NPE when GzipCodec is used for mapred.map.output.compression.codec and native library is not present
[ https://issues.apache.org/jira/browse/MAPREDUCE-857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karam Singh updated MAPREDUCE-857: -- Affects Version/s: 0.21.0 Fix Version/s: (was: 0.21.0) task fails with NPE when GzipCodec is used for mapred.map.output.compression.codec and native library is not present Key: MAPREDUCE-857 URL: https://issues.apache.org/jira/browse/MAPREDUCE-857 Project: Hadoop Map/Reduce Issue Type: Bug Components: task Affects Versions: 0.21.0 Reporter: Karam Singh Ran a job with mapred.map.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec. When the maps of the job complete, they fail with the following NPE. tasklog: 2009-08-12 13:48:13,423 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 256 2009-08-12 13:48:13,611 INFO org.apache.hadoop.mapred.MapTask: data buffer = 204010944/214748368 2009-08-12 13:48:13,611 INFO org.apache.hadoop.mapred.MapTask: record buffer = 3187670/3355443 2009-08-12 13:49:45,473 INFO org.apache.hadoop.mapred.MapTask: Starting flush of map output 2009-08-12 13:49:45,544 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2009-08-12 13:49:45,545 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor 2009-08-12 13:49:45,546 WARN org.apache.hadoop.mapred.Child: Error running child : java.lang.NullPointerException at org.apache.hadoop.mapred.IFile$Writer.init(IFile.java:105) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1248) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1146) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:528) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:604) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:318) at org.apache.hadoop.mapred.Child.main(Child.java:162) Lines 104-105 of IFile.java in the trunk code on which the error was seen: Line 104: this.compressor = CodecPool.getCompressor(codec); Line 105: this.compressor.reset(); If the native library is available, the job runs successfully without any failures. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
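The stack trace points at reset() being called on a compressor that CodecPool returned as null (GzipCodec with no native library loaded). A minimal sketch of the null guard that would avoid the NPE (CompressorGuard and its local Compressor interface are hypothetical stand-ins for the Hadoop types):

```java
public class CompressorGuard {

    // Stand-in for org.apache.hadoop.io.compress.Compressor; only the
    // reset() call that NPEs in the stack trace above matters here.
    interface Compressor {
        void reset();
    }

    // CodecPool.getCompressor(codec) can return null when the codec needs
    // a native library that is not loaded. The buggy line 105 called
    // reset() unconditionally; a null-guarded init avoids the NPE.
    static boolean initCompressor(Compressor compressor) {
        if (compressor == null) {
            return false; // no compressor available; caller can fall back
        }
        compressor.reset();
        return true;
    }
}
```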
[jira] Created: (MAPREDUCE-832) Too man y WARN messages about deprecated memory config variables in JobTracker log
Too man y WARN messages about deprecated memory config variables in JobTracker log -- Key: MAPREDUCE-832 URL: https://issues.apache.org/jira/browse/MAPREDUCE-832 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.1 Reporter: Karam Singh When a user submits a mapred job using the old memory config variable (mapred.task.maxvmem), the following message appears too many times in the JobTracker logs: [ WARN org.apache.hadoop.mapred.JobConf: The variable mapred.task.maxvmem is no longer used instead use mapred.job.map.memory.mb and mapred.job.reduce.memory.mb ] -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
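A common fix for this kind of log spam is to emit the deprecation warning only once per key. A hedged sketch (DeprecationWarnOnce and shouldWarn are hypothetical names, not the actual JobConf fix):

```java
import java.util.HashSet;
import java.util.Set;

public class DeprecationWarnOnce {
    private static final Set<String> warnedKeys = new HashSet<>();

    // Report a deprecated key the first time it is seen and stay quiet
    // on every later lookup, instead of logging on each access.
    static synchronized boolean shouldWarn(String deprecatedKey) {
        return warnedKeys.add(deprecatedKey); // true only on first call
    }
}
```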
[jira] Created: (MAPREDUCE-833) JobClient does not print any warning message when an old memory config variable is used with the -D option from the command line
JobClient does not print any warning message when an old memory config variable is used with the -D option from the command line -- Key: MAPREDUCE-833 URL: https://issues.apache.org/jira/browse/MAPREDUCE-833 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.1 Reporter: Karam Singh -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-834) When the TaskTracker config uses old memory management values, its memory monitoring is disabled.
When the TaskTracker config uses old memory management values, its memory monitoring is disabled. -- Key: MAPREDUCE-834 URL: https://issues.apache.org/jira/browse/MAPREDUCE-834 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Karam Singh TaskTracker memory config values: mapred.tasktracker.vmem.reserved=8589934592 mapred.task.default.maxvmem=2147483648 mapred.task.limit.maxvmem=4294967296 mapred.tasktracker.pmem.reserved=2147483648 The TaskTracker starts as: 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.tasktracker.vmem.reserved is no longer used 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.tasktracker.pmem.reserved is no longer used 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.task.default.maxvmem is no longer used 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.task.limit.maxvmem is no longer used 2009-08-05 12:39:03,308 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_name 2009-08-05 12:39:03,309 INFO org.apache.hadoop.mapred.TaskTracker: Using MemoryCalculatorPlugin : org.apache.hadoop.util.linuxmemorycalculatorplu...@19be4777 2009-08-05 12:39:03,311 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
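The log shows the deprecated keys being discarded outright, leaving totalMemoryAllottedForTasks at -1 and disabling TaskMemoryManager. A hedged sketch of bridging the old per-task vmem limit to a usable total instead of falling through to -1 (MemoryConfigBridge and its parameter names are hypothetical, not the Hadoop API):

```java
public class MemoryConfigBridge {
    static final long DISABLED = -1L;

    // If only the deprecated per-task vmem limit (in bytes) is configured,
    // derive the per-tracker total (in MB) from it rather than returning
    // -1, which is what disables TaskMemoryManager in the log above.
    static long totalMemoryForTasksMb(Long newStylePerSlotMb, Long oldStyleVmemBytes, int slots) {
        if (newStylePerSlotMb != null) {
            return newStylePerSlotMb * slots;
        }
        if (oldStyleVmemBytes != null) {
            return (oldStyleVmemBytes / (1024 * 1024)) * slots;
        }
        return DISABLED; // neither style configured: monitoring stays off
    }
}
```

With the values from this report, mapred.task.default.maxvmem=2147483648 (2048 MB) and 4 slots would yield 8192 MB instead of -1.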
[jira] Updated: (MAPREDUCE-832) Too many WARN messages about deprecated memory config variables in JobTracker log
[ https://issues.apache.org/jira/browse/MAPREDUCE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karam Singh updated MAPREDUCE-832: -- Summary: Too many WARN messages about deprecated memory config variables in JobTracker log (was: Too man y WARN messages about deprecated memory config variables in JobTracker log) Too many WARN messages about deprecated memory config variables in JobTracker log - Key: MAPREDUCE-832 URL: https://issues.apache.org/jira/browse/MAPREDUCE-832 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.1 Reporter: Karam Singh When a user submits a mapred job using the old memory config variable (mapred.task.maxvmem), the following message appears too many times in the JobTracker logs: [ WARN org.apache.hadoop.mapred.JobConf: The variable mapred.task.maxvmem is no longer used instead use mapred.job.map.memory.mb and mapred.job.reduce.memory.mb ] -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-835) hadoop-mapred examples, test and tools jar files are not being packaged when running ant binary or bin-package
hadoop-mapred examples, test and tools jar files are not being packaged when running ant binary or bin-package Key: MAPREDUCE-835 URL: https://issues.apache.org/jira/browse/MAPREDUCE-835 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Karam Singh When checking mapreduce trunk: if we run the ant binary or ant bin-package commands, hadoop-mapred-test-0.21.0-dev.jar, hadoop-mapred-examples-0.21.0-dev.jar and hadoop-mapred-tools-0.21.0-dev.jar are not included in the tar or the build/hadoop-mapred-0.21.0-dev package directory, but they are present under the build directory. For ant tar and ant package they are packaged correctly into the build/hadoop-mapred-0.21.0-dev directory. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-836) Examples of hadoop pipes are not packaged even when the -Dcompile.native=yes -Dcompile.c++=yes options are used while running ant package or tar or similar commands.
Examples of hadoop pipes are not packaged even when the -Dcompile.native=yes -Dcompile.c++=yes options are used while running ant package or tar or similar commands. - Key: MAPREDUCE-836 URL: https://issues.apache.org/jira/browse/MAPREDUCE-836 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Affects Versions: 0.20.1, 0.21.0 Reporter: Karam Singh The hadoop pipes and python examples are not packaged even when the -Dcompile.native=yes -Dcompile.c++=yes options are used while running ant package or tar or similar commands. The pipes examples are compiled and copied under build/c++-examples but are not being packaged. The same is the case with the python examples. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-734) java.util.ConcurrentModificationException observed in unreserving slots for HiRAM jobs
java.util.ConcurrentModificationException observed in unreserving slots for HiRAM jobs -- Key: MAPREDUCE-734 URL: https://issues.apache.org/jira/browse/MAPREDUCE-734 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched Reporter: Karam Singh Ran jobs, out of which 3 were HiRAM; the jobs were not removed from the scheduler queue even after they successfully completed. hadoop queue -info <queue> -showJobs displays something like: job_200907080724_0031 2 1247059146868 username NORMAL 0 running map tasks using 0 map slots. 0 additional slots reserved. 0 running reduce tasks using 0 reduce slots. 60 additional slots reserved. job_200907080724_0030 2 1247059146972 username NORMAL 0 running map tasks using 0 map slots. 0 additional slots reserved. 0 running reduce tasks using 0 reduce slots. 60 additional slots reserved. It does not block anything, but looks like a zombie entry in the system. The JobTracker log shows java.util.ConcurrentModificationException. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
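A java.util.ConcurrentModificationException while unreserving slots typically means the code removed reservations from a collection while a for-each loop was iterating it. A minimal sketch of the safe pattern (UnreserveSlots is a hypothetical stand-in; reservations are modeled as plain strings):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class UnreserveSlots {

    // Typical cause of the exception: structural removal during a
    // for-each loop, e.g.
    //   for (String r : reservations) { reservations.remove(r); }  // CME
    // Removal during iteration has to go through Iterator.remove().
    static void unreserveAll(List<String> reservations, String jobId) {
        for (Iterator<String> it = reservations.iterator(); it.hasNext(); ) {
            if (it.next().startsWith(jobId)) {
                it.remove(); // safe structural removal mid-iteration
            }
        }
    }

    public static void main(String[] args) {
        List<String> reservations = new ArrayList<>(
                Arrays.asList("job_0031/slot1", "job_0031/slot2", "job_0030/slot1"));
        unreserveAll(reservations, "job_0031");
        System.out.println(reservations);
    }
}
```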
[jira] Created: (MAPREDUCE-722) More slots are getting reserved for HiRAM job tasks than required
More slots are getting reserved for HiRAM job tasks than required - Key: MAPREDUCE-722 URL: https://issues.apache.org/jira/browse/MAPREDUCE-722 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched Environment: Cluster MR capacity=248/248. Map slot size=1500 MB and reduce slot size=2048 MB. Total number of nodes=124. 4 queues, each having Capacity=25% and User Limit=100%. Reporter: Karam Singh Submitted a normal job with maps=124 and reduces=124. After that, submitted a High RAM job with maps=31, reduces=31, map.memory=1800 and reduce.memory=2800. Then again 3 jobs with maps=124 and reduces=124. A total of 248 slots were reserved for both maps and reduces for the High RAM job, which is much higher than required. Observed in Hadoop 0.20.0. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
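Using the numbers in this report (map slot size 1500 MB, map.memory=1800, 31 maps), each high-RAM map task needs 2 slots, so only 62 map slots need reserving rather than 248. A hedged sketch of that cap (HiRamReservation is a hypothetical name, not scheduler code):

```java
public class HiRamReservation {

    // Slots a single high-RAM task occupies, given the slot memory size
    // (ceiling division, since a task cannot use a fraction of a slot).
    static int slotsPerTask(int taskMemMb, int slotMemMb) {
        return (taskMemMb + slotMemMb - 1) / slotMemMb;
    }

    // Cap the total reservation at what the pending tasks actually need,
    // instead of reserving slots far beyond the job's remaining work.
    static int slotsToReserve(int pendingTasks, int taskMemMb, int slotMemMb) {
        return pendingTasks * slotsPerTask(taskMemMb, slotMemMb);
    }
}
```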