[jira] [Commented] (MAPREDUCE-5379) Include token tracking ids in jobconf
[ https://issues.apache.org/jira/browse/MAPREDUCE-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766097#comment-13766097 ] Hadoop QA commented on MAPREDUCE-5379:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12602922/mr-5379-3.patch against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3999//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3999//console

This message is automatically generated.
Include token tracking ids in jobconf - Key: MAPREDUCE-5379 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5379 Project: Hadoop Map/Reduce Issue Type: Improvement Components: job submission, security Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Karthik Kambatla Attachments: MAPREDUCE-5379-1.patch, MAPREDUCE-5379-2.patch, MAPREDUCE-5379.patch, mr-5379-3.patch

HDFS-4680 enables audit logging of delegation tokens. By storing the tracking ids in the job conf, we can enable tracking of what files each job touches.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4680) Job history cleaner should only check timestamps of files in old enough directories
[ https://issues.apache.org/jira/browse/MAPREDUCE-4680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter updated MAPREDUCE-4680: - Attachment: MAPREDUCE-4680.patch

New patch suppresses the 3 new javac warnings (caused by a test) and fixes the test failure.

Job history cleaner should only check timestamps of files in old enough directories --- Key: MAPREDUCE-4680 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4680 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 2.0.0-alpha Reporter: Sandy Ryza Assignee: Robert Kanter Attachments: MAPREDUCE-4680.patch, MAPREDUCE-4680.patch

Job history files are stored in /mm/dd folders. Currently, the job history cleaner checks the modification date of each file in every one of these folders to see whether it's past the maximum age. The load on HDFS could be reduced by only checking the ages of files in directories that are old enough, as determined by their name.
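The directory-level optimization described above can be sketched as a stand-alone helper. This is not the actual JobHistoryServer code; it assumes a year/month/day directory-naming convention and an illustrative `maxAgeDays` retention parameter. The point is that the decision to skip a directory can be made from its name alone, without listing its files.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

public class HistoryDirFilter {
    // Returns true if the directory named by yyyy/mm/dd is old enough that
    // its files should be scanned for deletion; false means the whole
    // listing can be skipped, saving HDFS RPCs. Hypothetical helper, not
    // the real cleaner's API.
    static boolean dirOldEnough(String yyyy, String mm, String dd,
                                LocalDate today, long maxAgeDays) {
        LocalDate dirDate = LocalDate.of(Integer.parseInt(yyyy),
                Integer.parseInt(mm), Integer.parseInt(dd));
        // Files land in the directory on the day it names, so a directory
        // newer than the retention window cannot contain expired files.
        return ChronoUnit.DAYS.between(dirDate, today) > maxAgeDays;
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2013, 9, 13);
        // Old directory: its files must be checked.
        System.out.println(dirOldEnough("2013", "01", "01", today, 30));
        // Recent directory: skip the file listing entirely.
        System.out.println(dirOldEnough("2013", "09", "10", today, 30));
    }
}
```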
[jira] [Commented] (MAPREDUCE-4680) Job history cleaner should only check timestamps of files in old enough directories
[ https://issues.apache.org/jira/browse/MAPREDUCE-4680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766152#comment-13766152 ] Hadoop QA commented on MAPREDUCE-4680:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12602936/MAPREDUCE-4680.patch against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common and hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4000//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4000//console
[jira] [Commented] (MAPREDUCE-4680) Job history cleaner should only check timestamps of files in old enough directories
[ https://issues.apache.org/jira/browse/MAPREDUCE-4680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765857#comment-13765857 ] Hadoop QA commented on MAPREDUCE-4680:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12602866/MAPREDUCE-4680.patch against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:red}-1 javac{color}. The applied patch generated 1149 javac compiler warnings (more than the trunk's current 1146 warnings).
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common and hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs: org.apache.hadoop.mapreduce.v2.hs.TestJobHistory
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3997//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3997//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3997//console
[jira] [Created] (MAPREDUCE-5505) Clients should be notified job finished after job successfully unregistered
Jian He created MAPREDUCE-5505: -- Summary: Clients should be notified job finished after job successfully unregistered Key: MAPREDUCE-5505 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5505 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Jian He

This is to make sure the user is notified that the job finished only after the job is really done. This does increase client latency, but it can reduce some races during unregistration, like YARN-540.
[jira] [Commented] (MAPREDUCE-5329) APPLICATION_INIT is never sent to AuxServices other than the builtin ShuffleHandler
[ https://issues.apache.org/jira/browse/MAPREDUCE-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766295#comment-13766295 ] Avner BenHanoch commented on MAPREDUCE-5329:

Hi Siddharth, my patch is ready for your review. Its core part is ~25 lines; the rest is mainly tests. Thanks, Avner

APPLICATION_INIT is never sent to AuxServices other than the builtin ShuffleHandler --- Key: MAPREDUCE-5329 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5329 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 2.1.0-beta, 2.0.6-alpha Reporter: Avner BenHanoch Fix For: trunk Attachments: MAPREDUCE-5329.patch

APPLICATION_INIT is never sent to AuxServices other than the built-in ShuffleHandler. This means that 3rd-party ShuffleProvider(s) will not be able to function, because APPLICATION_INIT is what enables an AuxiliaryService to map jobId to userId, which is needed for properly finding the MOFs of a job per reducers' requests.

NOTE: The built-in ShuffleHandler does get APPLICATION_INIT events due to a hard-coded expression in the Hadoop code. The current TaskAttemptImpl.java code explicitly calls serviceData.put(ShuffleHandler.MAPREDUCE_SHUFFLE_SERVICEID, ...) and ignores any additional AuxiliaryService. As a result, only the built-in ShuffleHandler will get APPLICATION_INIT events; any 3rd-party AuxiliaryService will never get them.

I think a solution can go one of two ways:
1. Change TaskAttemptImpl.java to loop over all auxiliary services and register each of them, by calling serviceData.put(…) in a loop.
2. Change AuxServices.java similar to the fix in MAPREDUCE-2668 ("APPLICATION_STOP is never sent to AuxServices"), so that when the 'handle' method gets an APPLICATION_INIT event it demultiplexes it to all aux services regardless of the value in event.getServiceID().

I prefer the 2nd solution. I welcome any ideas, and I can provide the needed patch for any option that people like.
See [Pluggable Shuffle in Hadoop documentation|http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html]
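The first proposed fix above, registering service data for every configured auxiliary service instead of only the built-in shuffle handler, might look roughly like this. The types and names are simplified stand-ins for Hadoop's actual serviceData plumbing, not the real TaskAttemptImpl code:

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AuxServiceRegistration {
    // Hypothetical stand-in for the serviceData map the AM builds for a
    // container launch; keys are auxiliary service IDs.
    static Map<String, ByteBuffer> buildServiceData(List<String> auxServiceIds,
                                                    ByteBuffer jobToken) {
        Map<String, ByteBuffer> serviceData = new LinkedHashMap<>();
        // Instead of a single hard-coded put for the built-in shuffle
        // service ID, loop over every configured auxiliary service so each
        // one receives an APPLICATION_INIT with the job credentials.
        for (String id : auxServiceIds) {
            serviceData.put(id, jobToken.duplicate());
        }
        return serviceData;
    }

    public static void main(String[] args) {
        ByteBuffer token = ByteBuffer.wrap(new byte[] {1, 2, 3});
        Map<String, ByteBuffer> sd = buildServiceData(
                List.of("mapreduce_shuffle", "my_custom_shuffle"), token);
        // Both the built-in and the 3rd-party service are registered.
        System.out.println(sd.keySet());
    }
}
```

The second proposed fix would instead change the receiving side (AuxServices) to fan the event out to all services, which avoids touching the AM at all.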
[jira] [Created] (MAPREDUCE-5506) Hadoop-1.1.1 occurs ArrayIndexOutOfBoundsException with MultithreadedMapRunner
sam liu created MAPREDUCE-5506: -- Summary: Hadoop-1.1.1 occurs ArrayIndexOutOfBoundsException with MultithreadedMapRunner Key: MAPREDUCE-5506 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5506 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 1.1.1 Environment: RHEL 6.3 x86_64 Reporter: sam liu Priority: Blocker

After I set:
- 'jobConf.setMapRunnerClass(MultithreadedMapRunner.class);' in the MR app
- 'mapred.map.multithreadedrunner.threads = 2' in mapred-site.xml

a simple MR app failed because its map task encountered an ArrayIndexOutOfBoundsException as below (please ignore the line numbers in the exception, as I added some log-print code):

java.lang.ArrayIndexOutOfBoundsException
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1331)
    at java.io.DataOutputStream.write(DataOutputStream.java:101)
    at org.apache.hadoop.io.Text.write(Text.java:282)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1060)
    at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:591)
    at study.hadoop.mapreduce.sample.WordCount$Map.map(WordCount.java:41)
    at study.hadoop.mapreduce.sample.WordCount$Map.map(WordCount.java:1)
    at org.apache.hadoop.mapred.lib.MultithreadedMapRunner$MapperInvokeRunable.run(MultithreadedMapRunner.java:231)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
    at java.lang.Thread.run(Thread.java:738)

The exception happens on the line 'System.arraycopy(b, off, kvbuffer, bufindex, len)' in MapTask.java#MapOutputBuffer#Buffer#write(). When the exception occurs, 'b.length=4' but 'len=9'.
By the way, if I set 'mapred.map.multithreadedrunner.threads = 1', no exception happens, so it appears to be an issue caused by multiple threads.
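The symptom ('b.length=4' but 'len=9') is characteristic of unsynchronized concurrent writes into a shared output buffer: two map threads interleave their offset bookkeeping. A common remedy for this class of race is to serialize access to the shared collector. The sketch below is a stand-in illustration of that pattern with a plain list in place of Hadoop's MapOutputBuffer; it is not the actual fix in MultithreadedMapRunner:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SynchronizedCollect {
    // Stand-in for the shared map-output buffer; the real MapOutputBuffer
    // keeps byte offsets that corrupt under unsynchronized concurrent writes.
    static final List<String> buffer = new ArrayList<>();
    static final Object lock = new Object();

    static void collect(String key, String value) {
        // Serializing writes to the shared buffer avoids the interleaved
        // bookkeeping that produces the ArrayIndexOutOfBoundsException above.
        synchronized (lock) {
            buffer.add(key + "\t" + value);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Two worker threads, mirroring 'multithreadedrunner.threads = 2'.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 1000; i++) {
            final int n = i;
            pool.execute(() -> collect("word" + n, "1"));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(buffer.size());
    }
}
```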
[jira] [Commented] (MAPREDUCE-5164) command mapred job and mapred queue omit HADOOP_CLIENT_OPTS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766508#comment-13766508 ] Hudson commented on MAPREDUCE-5164:

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1547 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1547/]) MAPREDUCE-5164. mapred job and queue commands omit HADOOP_CLIENT_OPTS. Contributed by Nemon Lou. (devaraj: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1522595)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred
* /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred.cmd

command mapred job and mapred queue omit HADOOP_CLIENT_OPTS - Key: MAPREDUCE-5164 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5164 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.3-alpha Reporter: Nemon Lou Assignee: Nemon Lou Fix For: 2.1.1-beta Attachments: MAPREDUCE-5164.patch, MAPREDUCE-5164.patch, MAPREDUCE-5164.patch, MAPREDUCE-5164.patch

HADOOP_CLIENT_OPTS does not take effect when typing 'mapred job -list' or 'mapred queue -list'; the mapred script omits it.
[jira] [Commented] (MAPREDUCE-5164) command mapred job and mapred queue omit HADOOP_CLIENT_OPTS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766400#comment-13766400 ] Hudson commented on MAPREDUCE-5164:

SUCCESS: Integrated in Hadoop-Yarn-trunk #331 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/331/]) MAPREDUCE-5164. mapred job and queue commands omit HADOOP_CLIENT_OPTS. Contributed by Nemon Lou. (devaraj: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1522595)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred
* /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred.cmd
[jira] [Commented] (MAPREDUCE-5164) command mapred job and mapred queue omit HADOOP_CLIENT_OPTS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766470#comment-13766470 ] Hudson commented on MAPREDUCE-5164:

FAILURE: Integrated in Hadoop-Hdfs-trunk #1521 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1521/]) MAPREDUCE-5164. mapred job and queue commands omit HADOOP_CLIENT_OPTS. Contributed by Nemon Lou. (devaraj: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1522595)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred
* /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred.cmd
[jira] [Created] (MAPREDUCE-5507) MapReduce reducer preemption gets hanged
Omkar Vinit Joshi created MAPREDUCE-5507: Summary: MapReduce reducer preemption gets hanged Key: MAPREDUCE-5507 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5507 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Omkar Vinit Joshi

Today if we set yarn.app.mapreduce.am.job.reduce.rampup.limit and mapreduce.job.reduce.slowstart.completedmaps, then reducers are launched more aggressively. However, the calculation to either ramp up or ramp down reducers is not done in the most optimal way.
* If the MR AM at any point sees a situation like:
** scheduledMaps: 30
** scheduledReducers: 10
** assignedMaps: 0
** assignedReducers: 11
** finishedMaps: 120
** headroom: 756 (when your map/reduce task needs only 512 MB)
* then today it simply hangs, because it thinks there is sufficient room to launch one more mapper and therefore no need to ramp down. If this continues forever, that is not correct / optimal.
* Ideally, when the MR AM sees that assignedMaps has dropped to 0 and there are running reducers around, it should wait for a certain time (upper-bounded by the average map task completion time, for heuristic's sake), but if after that it still doesn't get a new container for a map task, it should preempt the reducers one by one at some interval and ramp back up slowly.
** Preemption of reducers can be done in a slightly smarter way:
*** preempt a reducer on a node manager for which there is a pending map request;
*** otherwise, preempt any other reducer.

The MR AM will contribute to getting a new mapper by releasing such a reducer/container, because that reduces its cluster consumption and it may thereby become a candidate for an allocation.
[jira] [Assigned] (MAPREDUCE-5507) MapReduce reducer preemption gets hanged
[ https://issues.apache.org/jira/browse/MAPREDUCE-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi reassigned MAPREDUCE-5507: Assignee: Omkar Vinit Joshi
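The hang condition and the proposed ramp-down from the MAPREDUCE-5507 description can be captured in a small decision function. Names, parameters, and thresholds here are illustrative stand-ins, not the AM's actual scheduling code:

```java
public class ReducerPreemption {
    // Decide whether the AM should preempt a reducer: maps are starved
    // (scheduled but none assigned), reducers are holding containers, and
    // the wait has exceeded a grace period derived from the average map
    // completion time (the heuristic suggested in the report).
    static boolean shouldPreemptReducer(int scheduledMaps, int assignedMaps,
                                        int assignedReducers,
                                        long waitedMs, long avgMapTimeMs) {
        return scheduledMaps > 0
                && assignedMaps == 0
                && assignedReducers > 0
                && waitedMs > avgMapTimeMs;
    }

    public static void main(String[] args) {
        // The stuck state from the report: 30 scheduled maps, none
        // assigned, 11 reducers holding containers, grace period expired.
        System.out.println(shouldPreemptReducer(30, 0, 11, 120_000, 60_000));
        // Maps are still being assigned: keep the reducers running.
        System.out.println(shouldPreemptReducer(30, 5, 11, 120_000, 60_000));
    }
}
```

The "smarter" victim selection from the report (prefer a reducer on a node with a pending map request) would layer on top of this yes/no decision.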
[jira] [Commented] (MAPREDUCE-5379) Include token tracking ids in jobconf
[ https://issues.apache.org/jira/browse/MAPREDUCE-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766901#comment-13766901 ] Andrew Wang commented on MAPREDUCE-5379:

Thanks Karthik, the patch looks good to me. As I'm not well-versed in the ways of MR, it'd be good to get confirmation from someone else as well.
[jira] [Commented] (MAPREDUCE-5379) Include token tracking ids in jobconf
[ https://issues.apache.org/jira/browse/MAPREDUCE-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766896#comment-13766896 ] Karthik Kambatla commented on MAPREDUCE-5379:

I verified this manually on both non-secure and secure clusters. On the secure cluster, the tracking id shows up in the jobconf. On the non-secure cluster, made sure there were no regressions and the jobs ran fine.
[jira] [Updated] (MAPREDUCE-5332) Support token-preserving restart of history server
[ https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-5332: -- Attachment: MAPREDUCE-5332-6.patch

Minor tweak to the patch to set the permissions on the file during the create, which should reduce the number of RPC calls when using HDFS as the filesystem.

Support token-preserving restart of history server -- Key: MAPREDUCE-5332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332 Project: Hadoop Map/Reduce Issue Type: New Feature Components: jobhistoryserver Reporter: Jason Lowe Assignee: Jason Lowe Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, MAPREDUCE-5332-4.patch, MAPREDUCE-5332-5.patch, MAPREDUCE-5332-5.patch, MAPREDUCE-5332-6.patch, MAPREDUCE-5332.patch

To better support rolling upgrades through a cluster, the history server needs the ability to restart without losing track of delegation tokens.
[jira] [Updated] (MAPREDUCE-5379) Include token tracking ids in jobconf
[ https://issues.apache.org/jira/browse/MAPREDUCE-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-5379: Status: Open (was: Patch Available)
[jira] [Updated] (MAPREDUCE-5379) Include token tracking ids in jobconf
[ https://issues.apache.org/jira/browse/MAPREDUCE-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-5379: Attachment: mr-5379-4.patch

Updated patch to not use a Joiner, and use conf.setStrings instead.
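The change mentioned above, dropping the Joiner in favour of Configuration.setStrings, amounts to letting the conf do the comma-joining of the tracking ids itself. A minimal stand-in sketch, using a plain Map instead of Hadoop's Configuration and a hypothetical property name (the patch's actual key may differ):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TrackingIdsConf {
    // Minimal stand-in for Configuration.setStrings(name, values...):
    // stores the values comma-joined under one key, which is what makes a
    // hand-rolled Joiner unnecessary.
    static void setStrings(Map<String, String> conf, String name,
                           String... values) {
        conf.put(name, String.join(",", values));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        // Hypothetical property name and tracking ids, for illustration.
        setStrings(conf, "mapreduce.job.token.tracking.ids",
                   "trackingId1", "trackingId2");
        System.out.println(conf.get("mapreduce.job.token.tracking.ids"));
    }
}
```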
[jira] [Commented] (MAPREDUCE-5379) Include token tracking ids in jobconf
[ https://issues.apache.org/jira/browse/MAPREDUCE-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767055#comment-13767055 ] Karthik Kambatla commented on MAPREDUCE-5379:

Manually verified the updated patch as well - the tracking id shows up in the jobconf.
[jira] [Commented] (MAPREDUCE-4421) Remove dependency on deployed MR jars
[ https://issues.apache.org/jira/browse/MAPREDUCE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767146#comment-13767146 ] Jason Lowe commented on MAPREDUCE-4421:

Thanks for the review, Hitesh!

bq. Why does classpath need to include all of common, hdfs and yarn jar locations? Assuming that MR is running on a YARN-based cluster, shouldn't the location of the core dependencies come from the cluster deployment i.e. via the env that the NM sets for a container. I believe the only jars that MR should have in its uploaded tarball should be the client jars. I understand that there is no clear boundary for client-side-only jars for common and hdfs today (for YARN, I believe it should be simple to split out the client-side requirements), but it is something we should aim for, or else assume that the jars deployed on the cluster are compatible.

This is primarily for avoiding jar conflicts and removing dependencies on the nodes. If the cluster upgrades and picks up a new version of jackson/jersey/guava/name-your-favorite-jar-that-breaks-apps-when-updated, then existing apps can suddenly break due to jar conflicts. Another case we've seen is when a dependency jar is dropped between versions, and apps were depending upon those to be provided by Hadoop. Having the apps provide all of their dependencies means we can focus on just the RPC-layer compatibilities (something we have to solve anyway) rather than also worry about the myriad combinations between jars within the app and those being picked up from the nodes. However, if desired, the user could configure it to work with just a partial tarball by setting the classpath to pick up the jars on the nodes via HADOOP_COMMON_HOME/HADOOP_HDFS_HOME/HADOOP_YARN_HOME references in the classpath, like MRApps is doing today.

bq. I would vote to make the tar-ball in HDFS be the only way to run MR on YARN.
Obviously, this cannot be done for 2.x but we should move to this model on trunk and not support the current approach at all there. Comments? I'm all for it, and I see this as being a stepping stone to getting there. We'd like to have the ability to run out of HDFS in 2.x as a potential way to do a rolling upgrade of bugfixes in the MR framework. It probably won't be a complete solution to all forms of upgrades (i.e.: what if the client code or ShuffleHandler needs the fix), but it could still be very useful in practice. bq. The other point is related to configs. Yes, final parameter configs on the nodes conflicting with the job.xml settings are another concern. In practice I don't expect that to be a common issue, but it is something we should try to address in a followup JIRA. bq. How do you see framework name extracted from the path to be used? Is it just a safety check to ensure that it is found in the classpath? Will it have any relation to a version? I see the framework fragment alias primarily used for sanity-checks in case the classpath wasn't updated when using a specified framework and to allow the classpath settings to be a bit more general. For example, ops could configure the classpath once based on an expected framework tarball layout (e.g.: mrframework/share/mapreduce/* : mrframework/share/mapreduce/lib/* etc) and different versions of the tarball can be used without modifying the classpath as long as they match that layout. e.g.: mrtarball-2.3.1.tgz#mrframework, mrtarball-2.3.4.tgz#mrframework, etc. It's sort of like the assumed-layout approach from your last comment. Ops could set the classpath and users could select the framework version without having to set the classpath as long as the layout is compatible. Users could still override the classpath if using a framework that isn't compatible with the assumed layout. 
One problem with the common classpath approach is that the archives need to have the same directory structure, so top-level directories with the version number in them break it. The tarballs deployed to HDFS would have to be reorganized to have a common dir name rather than the versioned name. Not difficult to do, but it is annoying. bq. A minor nit - framework name seems confusing in relation to the framework name in use from earlier i.e yarn vs local framework. Yeah, that's true. I'm open to suggestions for what to call this instead of framework. bq. Regarding versions, it seems like users will need to do 2 things. Change the location of the tarball on HDFS and modify the classpath. Users will need to know the exact structure of the classpath. In such a scenario, do defaults even make sense? I wanted this to be flexible so ops/users could decide how to organize the framework (i.e.: partial/complete tarball, monolithic jar, whatever) and be able to set the classpath accordingly. I thought about hardcoding the assumption of the layout, but then that
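The fragment-alias scheme described above can be sketched as configuration. The snippet below is a hypothetical example only: it assumes framework-path and classpath property names in the spirit of this JIRA's approach, an arbitrary HDFS location, and the illustrative mrframework layout from the comment.

```xml
<!-- Hypothetical sketch: property names, the HDFS path, and the layout
     under the #mrframework alias are illustrative assumptions. -->
<property>
  <name>mapreduce.application.framework.path</name>
  <!-- Any compatible tarball version can be swapped in; the #mrframework
       fragment keeps the alias, and thus the classpath, stable. -->
  <value>hdfs:///apps/mr/mrtarball-2.3.1.tgz#mrframework</value>
</property>
<property>
  <name>mapreduce.application.classpath</name>
  <value>mrframework/share/mapreduce/*:mrframework/share/mapreduce/lib/*</value>
</property>
```

Swapping mrtarball-2.3.1.tgz for mrtarball-2.3.4.tgz then requires no classpath change, as long as both tarballs share the mrframework-relative layout.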
[jira] [Updated] (MAPREDUCE-5504) mapred queue -info inconsistent with types
[ https://issues.apache.org/jira/browse/MAPREDUCE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated MAPREDUCE-5504: -- Attachment: MAPREDUCE-5504.patch Hi, I've created a patch for this issue. mapred queue -info inconsistent with types -- Key: MAPREDUCE-5504 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5504 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.23.9 Reporter: Thomas Graves Attachments: MAPREDUCE-5504.patch $ mapred queue -info default == Queue Name : default Queue State : running Scheduling Info : Capacity: 4.0, MaximumCapacity: 0.67, CurrentCapacity: 0.9309831 The capacity is displayed in % as 4, however maximum capacity is displayed as an absolute number 0.67 instead of 67%. We should make these consistent with the type we are displaying -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
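One way to make the three values consistent is to normalize every capacity to a fraction in [0, 1] and render all of them as percentages. A minimal sketch; the class and helper names are illustrative, not the actual Hadoop client code:

```java
// Illustrative helper: treat every capacity as a fraction and always
// print it as a percentage, so Capacity, MaximumCapacity and
// CurrentCapacity all use the same type of value.
public class QueueInfoFormat {
    static String asPercent(float fraction) {
        return String.format("%.1f%%", fraction * 100f);
    }

    public static void main(String[] args) {
        System.out.println("Capacity: " + asPercent(0.04f));             // 4.0%
        System.out.println("MaximumCapacity: " + asPercent(0.67f));      // 67.0%
        System.out.println("CurrentCapacity: " + asPercent(0.9309831f)); // 93.1%
    }
}
```

With this, the queue from the report would show 4.0%, 67.0% and 93.1% rather than mixing a bare percentage with absolute fractions.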
[jira] [Updated] (MAPREDUCE-5504) mapred queue -info inconsistent with types
[ https://issues.apache.org/jira/browse/MAPREDUCE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated MAPREDUCE-5504: -- Target Version/s: 3.0.0, 0.23.10 (was: 0.23.10) Status: Patch Available (was: Open) mapred queue -info inconsistent with types -- Key: MAPREDUCE-5504 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5504 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.23.9 Reporter: Thomas Graves Attachments: MAPREDUCE-5504.patch $ mapred queue -info default == Queue Name : default Queue State : running Scheduling Info : Capacity: 4.0, MaximumCapacity: 0.67, CurrentCapacity: 0.9309831 The capacity is displayed in % as 4, however maximum capacity is displayed as an absolute number 0.67 instead of 67%. We should make these consistent with the type we are displaying -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5379) Include token tracking ids in jobconf
[ https://issues.apache.org/jira/browse/MAPREDUCE-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13767170#comment-13767170 ] Alejandro Abdelnur commented on MAPREDUCE-5379: --- +1 LGTM Include token tracking ids in jobconf - Key: MAPREDUCE-5379 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5379 Project: Hadoop Map/Reduce Issue Type: Improvement Components: job submission, security Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Karthik Kambatla Attachments: MAPREDUCE-5379-1.patch, MAPREDUCE-5379-2.patch, MAPREDUCE-5379.patch, mr-5379-3.patch, mr-5379-4.patch HDFS-4680 enables audit logging delegation tokens. By storing the tracking ids in the job conf, we can enable tracking what files each job touches. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5504) mapred queue -info inconsistent with types
[ https://issues.apache.org/jira/browse/MAPREDUCE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13767188#comment-13767188 ] Hadoop QA commented on MAPREDUCE-5504: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12603154/MAPREDUCE-5504.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4002//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4002//console This message is automatically generated. 
mapred queue -info inconsistent with types -- Key: MAPREDUCE-5504 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5504 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.23.9 Reporter: Thomas Graves Attachments: MAPREDUCE-5504.patch $ mapred queue -info default == Queue Name : default Queue State : running Scheduling Info : Capacity: 4.0, MaximumCapacity: 0.67, CurrentCapacity: 0.9309831 The capacity is displayed in % as 4, however maximum capacity is displayed as an absolute number 0.67 instead of 67%. We should make these consistent with the type we are displaying -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-5508) Memory Leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
Xi Fang created MAPREDUCE-5508: -- Summary: Memory Leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob Key: MAPREDUCE-5508 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1-win Reporter: Xi Fang Assignee: Xi Fang Priority: Critical MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object that is properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5508: --- Summary: Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob (was: Memory Leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob --- Key: MAPREDUCE-5508 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1-win Reporter: Xi Fang Assignee: Xi Fang Priority: Critical MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object that is properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13767210#comment-13767210 ] Xi Fang commented on MAPREDUCE-5508: This bug was found in Microsoft's large scale test with about 200,000 job submissions. The memory usage is steadily growing up. There is a long discussion between Hortonworks (thanks [~cnauroth] and [~vinodkv]) and Microsoft on this issue. Here is the summary of the discussion. 1. The heap dumps are showing DistributedFileSystem instances that are only referred to from the cache's HashMap entries. Since nothing else has a reference, nothing else can ever attempt to close it, and therefore it will never be removed from the cache. 2. The special check for tempDirFS (see code in description) in the patch for MAPREDUCE-5351 is intended as an optimization so that CleanupQueue doesn't need to immediately reopen a FileSystem that was just closed. However, we observed that we're getting different identity hash code values on the subject in the key. The code is assuming that CleanupQueue will find the same Subject that was used inside JobInProgress. Unfortunately, this is not guaranteed, because we may have crossed into a different access control context at this point, via UserGroupInformation#doAs. Even though it's conceptually the same user, the Subject is a function of the current AccessControlContext: {code} public synchronized static UserGroupInformation getCurrentUser() throws IOException { AccessControlContext context = AccessController.getContext(); Subject subject = Subject.getSubject(context); {code} Even if the contexts are logically equivalent between JobInProgress and CleanupQueue, we see no guarantee that Java will give you the same Subject instance, which is required for successful lookup in the FileSystem cache (because of the use of identity hash code). 
A fix is to abandon this optimization and close the FileSystem within the same AccessControlContext that opened it. Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob --- Key: MAPREDUCE-5508 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1-win Reporter: Xi Fang Assignee: Xi Fang Priority: Critical MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object that is properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
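The identity-hash-code point above can be demonstrated without any Hadoop code. The map below is a simplified stand-in for the FileSystem cache, keyed on reference identity the way the cache's UGI/Subject comparison effectively is; two Subject instances with identical (empty) contents still miss:

```java
import javax.security.auth.Subject;
import java.util.IdentityHashMap;
import java.util.Map;

// Simplified stand-in for the FileSystem cache: keys are compared by
// reference identity, mirroring the identity-hash-based UGI comparison
// described in the discussion above.
public class SubjectCacheMiss {
    public static void main(String[] args) {
        Map<Subject, String> cache = new IdentityHashMap<>();

        Subject opener = new Subject(); // Subject in scope when the FS was opened
        Subject closer = new Subject(); // logically the same user, new instance

        cache.put(opener, "cached FileSystem");

        // Lookup with the original instance hits; a logically equivalent
        // Subject from a different AccessControlContext misses, so a new
        // FileSystem would be created and the cached one never closed.
        System.out.println(cache.containsKey(opener)); // true
        System.out.println(cache.containsKey(closer)); // false
    }
}
```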
[jira] [Updated] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5508: --- Description: MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object (see tempDirFs) that is not properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... {code} was: MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object that is properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... {code} Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob --- Key: MAPREDUCE-5508 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1-win Reporter: Xi Fang Assignee: Xi Fang Priority: Critical MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object (see tempDirFs) that is not properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5508: --- Description: MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object (see tempDirFs) that is not properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... if (tempDirFs != fs) { try { fs.close(); } catch (IOException ie) { ... } {code} was: MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object (see tempDirFs) that is not properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... {code} Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob --- Key: MAPREDUCE-5508 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1-win Reporter: Xi Fang Assignee: Xi Fang Priority: Critical MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object (see tempDirFs) that is not properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... if (tempDirFs != fs) { try { fs.close(); } catch (IOException ie) { ... } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5508: --- Attachment: MAPREDUCE-5508.patch Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob --- Key: MAPREDUCE-5508 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1-win Reporter: Xi Fang Assignee: Xi Fang Priority: Critical Attachments: MAPREDUCE-5508.patch MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object (see tempDirFs) that is not properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... if (tempDirFs != fs) { try { fs.close(); } catch (IOException ie) { ... } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Work started] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on MAPREDUCE-5508 started by Xi Fang. Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob --- Key: MAPREDUCE-5508 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1-win Reporter: Xi Fang Assignee: Xi Fang Priority: Critical Attachments: MAPREDUCE-5508.patch MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object (see tempDirFs) that is not properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... if (tempDirFs != fs) { try { fs.close(); } catch (IOException ie) { ... } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5508: --- Summary: JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob (was: Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob -- Key: MAPREDUCE-5508 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1-win Reporter: Xi Fang Assignee: Xi Fang Priority: Critical Attachments: MAPREDUCE-5508.patch MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object (see tempDirFs) that is not properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... if (tempDirFs != fs) { try { fs.close(); } catch (IOException ie) { ... } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated MAPREDUCE-5508: -- Affects Version/s: 1.2.1 JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob -- Key: MAPREDUCE-5508 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1-win, 1.2.1 Reporter: Xi Fang Assignee: Xi Fang Priority: Critical Attachments: MAPREDUCE-5508.patch MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object (see tempDirFs) that is not properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... if (tempDirFs != fs) { try { fs.close(); } catch (IOException ie) { ... } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13767308#comment-13767308 ] Sandy Ryza commented on MAPREDUCE-5508: --- Have you tested this fix? I took a deeper look into this and it doesn't appear that tempDirFs and fs are ever even ending up equal because tempDirFs is created with the wrong UGI. The deeper problem to me is that we are creating a new UGI, which can have a new subject, which can create a new entry in the FS cache, every time CleanupQueue#deletePath is called with a null UGI. This occurs here: {code} CleanupQueue.getInstance().addToQueue( new PathDeletionContext(tempDir, conf)); {code} A better fix would be to avoid this, either by having CleanupQueue hold a UGI of the login user for use in these situations or to avoid the doAs entirely when the given UGI is null. JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob -- Key: MAPREDUCE-5508 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1-win, 1.2.1 Reporter: Xi Fang Assignee: Xi Fang Priority: Critical Attachments: MAPREDUCE-5508.patch MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object (see tempDirFs) that is not properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... if (tempDirFs != fs) { try { fs.close(); } catch (IOException ie) { ... } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
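Sandy's suggested direction, holding one long-lived login-user identity (or skipping the doAs) instead of minting a fresh UGI on every deletePath call with a null UGI, can be contrasted with the leak in a plain-Java sketch. Subject stands in for the UGI and an identity-keyed map for the FileSystem cache; all names are illustrative:

```java
import javax.security.auth.Subject;
import java.util.IdentityHashMap;
import java.util.Map;

// Contrast of the leak with the suggested fix, using plain Java stand-ins:
// a fresh identity per call grows the identity-keyed cache without bound,
// while one reused identity keeps it at a single entry.
public class CleanupIdentityDemo {
    public static void main(String[] args) {
        Map<Subject, String> cache = new IdentityHashMap<>();

        // Leaky pattern: each deletePath-style call mints a new identity,
        // so each call adds (and strands) a cache entry.
        for (int i = 0; i < 5; i++) {
            cache.put(new Subject(), "leaked FileSystem");
        }
        System.out.println(cache.size()); // 5

        // Suggested pattern: hold one identity (the login user) and reuse it,
        // so repeated calls hit the same cached entry.
        cache.clear();
        Subject loginSubject = new Subject();
        for (int i = 0; i < 5; i++) {
            cache.put(loginSubject, "shared FileSystem");
        }
        System.out.println(cache.size()); // 1
    }
}
```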