[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816210#comment-13816210 ]

Xi Fang commented on MAPREDUCE-5508:
------------------------------------

One way to confirm the leak is to set mapred.jobtracker.completeuserjobs.maximum = 0 and run some jobs. After all the jobs are done, wait a while and check the number of FileSystem objects in FileSystem#Cache.

> JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
> ------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5508
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 1-win, 1.2.1
>            Reporter: Xi Fang
>            Assignee: Xi Fang
>            Priority: Critical
>             Fix For: 1-win, 1.3.0
>
>         Attachments: CleanupQueue.java, JobInProgress.java, MAPREDUCE-5508.1.patch, MAPREDUCE-5508.2.patch, MAPREDUCE-5508.3.patch, MAPREDUCE-5508.patch
>
>
> MAPREDUCE-5351 fixed a memory leak but introduced another FileSystem object (see "tempDirFs") that is not properly released.
> {code}
> JobInProgress#cleanupJob()
> void cleanupJob() {
>   ...
>   tempDirFs = jobTempDirPath.getFileSystem(conf);
>   CleanupQueue.getInstance().addToQueue(
>       new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
>   ...
>   if (tempDirFs != fs) {
>     try {
>       fs.close();
>     } catch (IOException ie) {
>       ...
>     }
>   }
> {code}

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800914#comment-13800914 ]

Xi Fang commented on MAPREDUCE-5508:
------------------------------------

Thanks, Chris and viswanathan. I think the three patches are what you need. This won't affect production environments because the change is internal to the JobTracker; users shouldn't notice any difference.
[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776023#comment-13776023 ]

Xi Fang commented on MAPREDUCE-5508:
------------------------------------

Thanks Chris and Sandy!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13775904#comment-13775904 ]

Xi Fang commented on MAPREDUCE-5508:
------------------------------------

Thanks Chris and Sandy. I just finished the large-scale test and found no memory leak. I removed the tabs and attached a new patch. So Chris, do you think we should file a new JIRA for the idempotent implementation?
[jira] [Updated] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xi Fang updated MAPREDUCE-5508:
-------------------------------

    Attachment: MAPREDUCE-5508.3.patch
[jira] [Updated] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xi Fang updated MAPREDUCE-5508:
-------------------------------

    Attachment: MAPREDUCE-5508.2.patch
[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774322#comment-13774322 ]

Xi Fang commented on MAPREDUCE-5508:
------------------------------------

Thanks Chris. I attached a new patch and will launch a large-scale test tomorrow.
[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774280#comment-13774280 ]

Xi Fang commented on MAPREDUCE-5508:
------------------------------------

[~cnauroth], thanks for your comments.

bq. Swallowing the InterruptedException is problematic if any upstream code depends on seeing the thread's interrupted status, so let's restore the interrupted status in the catch block by calling Thread.currentThread().interrupt().

If we call Thread.currentThread().interrupt(), is it possible that fs won't be closed in JobInProgress#cleanupJob()?

bq. If there is an InterruptedException, then we currently would pass a null tempDirFs to the CleanupQueue, where we'd once again risk leaking memory. I suggest that if there is an InterruptedException, then we skip adding to the CleanupQueue and log a warning. This is consistent with the error-handling strategy in the rest of the method. (It logs warnings.)

I think if the answer to my first question is "fs will still be closed in JobInProgress#cleanupJob()", then there is no memory leak: even if we pass null into CleanupQueue, the new fs created in CleanupQueue#deletePath() would be closed anyway. Thanks, Chris.
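The interrupt-handling pattern discussed above can be sketched as follows. This is a minimal, self-contained illustration (not the actual JobInProgress code; the sleep stands in for an interruptible call) of restoring the interrupted status so upstream callers still see it while the cleanup path is still reached:

```java
public class InterruptDemo {
    // Returns true if the interrupted status survives the catch block.
    static boolean restoreInterrupt() {
        Thread.currentThread().interrupt();      // simulate a pending interrupt
        try {
            Thread.sleep(10);                    // throws immediately and clears the status
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();  // restore status for upstream code
            // here we would skip adding to CleanupQueue and log a warning
        }
        // cleanup (e.g. fs.close()) is still reached on this path
        return Thread.interrupted();             // observe (and clear) the restored status
    }

    public static void main(String[] args) {
        System.out.println(restoreInterrupt());  // prints "true"
    }
}
```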
[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772073#comment-13772073 ]

Xi Fang commented on MAPREDUCE-5508:
------------------------------------

I set both the staging and system dirs to HDFS on my test cluster, ran 35,000 job submissions, and manually checked the number of DistributedFileSystem objects. No memory leak related to DistributedFileSystem was found.
[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771276#comment-13771276 ]

Xi Fang commented on MAPREDUCE-5508:
------------------------------------

Thanks Chris and Sandy. I made a draft patch for the proposal. I am thinking we should still pass "tempDirFs" into PathDeletionContext instead of "fs", to handle the case where fs has been closed by someone else. Although tempDirFs might differ from fs because of the different-Subject problem discussed above, in most cases they will be the same (I used "userUGI" to get tempDirFs), so this is still an optimization. Let me know your comments. Thanks.
[jira] [Updated] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xi Fang updated MAPREDUCE-5508:
-------------------------------

    Attachment: MAPREDUCE-5508.1.patch
[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767847#comment-13767847 ]

Xi Fang commented on MAPREDUCE-5508:
------------------------------------

Thanks Chris for filing HDFS-5211. That sounds good to me :)
[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767702#comment-13767702 ]

Xi Fang commented on MAPREDUCE-5508:
------------------------------------

Thanks [~sandyr] and [~cnauroth]. Actually, the above discussion gave me second thoughts on the attached patch. There is a race condition here. Suppose Path#getFileSystem in CleanupQueue#deletePath retrieves the same instance as JobInProgress#fs from FileSystem#Cache. Because of the race between DistributedFileSystem#close() and FileSystem#close(), it is possible that just after JobInProgress#cleanupJob closes JobInProgress#fs's DFSClient, the scheduler switches to CleanupQueue#deletePath, which calls fs.delete(). Because this fs's DFSClient has been closed, an exception is thrown and the staging directory is never deleted.
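The interleaving described in this comment can be shown serialized for determinism. This toy sketch (not Hadoop code; the SharedFs class is a stand-in for a cached FileSystem whose DFSClient has been closed) reproduces the failure mode: close happens first, then the queued delete fails:

```java
public class CloseRaceDemo {
    // Stand-in for a cached FileSystem shared by JobInProgress and CleanupQueue.
    static class SharedFs {
        volatile boolean clientClosed = false;
        void close() { clientClosed = true; }    // cleanupJob closes the client
        void delete(String path) {
            if (clientClosed) {
                throw new IllegalStateException("Filesystem closed");
            }
        }
    }

    static String runSerializedRace() {
        SharedFs fs = new SharedFs();    // same instance from the cache
        fs.close();                      // thread 1: JobInProgress#cleanupJob closes fs
        try {
            fs.delete("/staging/job_x"); // thread 2: CleanupQueue#deletePath uses it
            return "deleted";
        } catch (IllegalStateException e) {
            return "staging dir leaked: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(runSerializedRace());
    }
}
```

In the real JobTracker the two steps run on different threads, so whether the delete succeeds depends on timing; the sketch pins the losing order.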
[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767525#comment-13767525 ]

Xi Fang commented on MAPREDUCE-5508:
------------------------------------

Thanks Sandy for the information on HADOOP-6670. I think we still need to close fs anyway, because p.getFileSystem(conf) in CleanupQueue#deletePath may not find the FileSystem#Cache entry for JobInProgress#fs, because of the different-Subject problem we discussed above. In that case, nothing removes JobInProgress#fs from the FileSystem#Cache.
[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767402#comment-13767402 ]

Xi Fang commented on MAPREDUCE-5508:
------------------------------------

Just found Chris was also working on this thread :). I agree with Chris: changing the hash code could have a wide impact on existing code and would be risky.
[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767401#comment-13767401 ]

Xi Fang commented on MAPREDUCE-5508:
------------------------------------

[~sandyr] Thanks for your comments.

bq. Have you tested this fix?

Yes. We have tested this fix on our test cluster (about 130,000 submissions). After the workflow was done, we waited a couple of minutes (jobs were retiring), forced GC, dumped the memory, and manually checked the FileSystem#Cache. There was no memory leak.

For your analysis:

1. I agree that "it doesn't appear that tempDirFs and fs are ever even ending up equal because tempDirFs is created with the wrong UGI."

2. I think tempDir is fine because 1) JobInProgress#cleanupJob won't introduce a file system instance for tempDir, and 2) the fs in CleanupQueue#deletePath is reused (i.e. only one instance exists in FileSystem#Cache). My initial thought was that this part leaked, but a test shows there is no problem here.

3. The actual problem is:
{code}
tempDirFs = jobTempDirPath.getFileSystem(conf);
{code}
This call MAY (explained below) put a new entry in FileSystem#Cache. Note that it eventually goes into UserGroupInformation#getCurrentUser to get a UGI with the current AccessControlContext. CleanupQueue#deletePath won't close this entry, because a different UGI (the "userUGI" created in JobInProgress) is used there. Here is the tricky part, which we discussed at length with [~cnauroth] and [~vinodkv]: although we may have only one current user, the following code MAY return different Subjects.
{code}
static UserGroupInformation getCurrentUser() throws IOException {
  AccessControlContext context = AccessController.getContext();
  Subject subject = Subject.getSubject(context);  // may be a different Subject on each call
  ...
}
{code}
Because a FileSystem#Cache entry uses the identityHashCode of a Subject to construct its key, a file system object created by "jobTempDirPath.getFileSystem(conf)" may not be found when this code runs again, even though the principal (the current user) is the same. This eventually leads to an unbounded number of file system instances in FileSystem#Cache, and nothing ever removes them. Please let me know if you have any questions.
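The growth mechanism described in this analysis can be shown with a self-contained sketch. This is a toy, not Hadoop's actual Cache class: the Key type mimics a cache keyed on the Subject's identity rather than on principal equality, so every lookup with a fresh Subject instance misses and adds one more entry:

```java
import javax.security.auth.Subject;
import java.util.HashMap;
import java.util.Map;

public class CacheKeyDemo {
    // Toy cache key: scheme/authority plus Subject *identity* (not principals).
    static final class Key {
        final String schemeAuthority;
        final Subject subject;
        Key(String sa, Subject s) { schemeAuthority = sa; subject = s; }
        @Override public int hashCode() {
            return schemeAuthority.hashCode() ^ System.identityHashCode(subject);
        }
        @Override public boolean equals(Object o) {
            return o instanceof Key
                && schemeAuthority.equals(((Key) o).schemeAuthority)
                && subject == ((Key) o).subject;  // identity, not principal equality
        }
    }

    // Simulates n calls that each obtain a fresh Subject for the same user.
    static int entriesAfterNLookups(int n) {
        Map<Key, Object> cache = new HashMap<>();
        for (int i = 0; i < n; i++) {
            Subject freshSubject = new Subject();  // same logical user, new instance
            cache.putIfAbsent(new Key("hdfs://nn:8020", freshSubject), new Object());
        }
        return cache.size();  // one entry per call: every lookup missed
    }

    public static void main(String[] args) {
        System.out.println(entriesAfterNLookups(5));  // prints "5"
    }
}
```

With principal-based key equality the cache would stay at one entry; with identity-based equality it grows without bound, which is the leak described above.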
[jira] [Updated] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xi Fang updated MAPREDUCE-5508:
-------------------------------

    Summary: JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob  (was: Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob)
[jira] [Work started] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on MAPREDUCE-5508 started by Xi Fang.
[jira] [Updated] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xi Fang updated MAPREDUCE-5508:
-------------------------------

    Attachment: MAPREDUCE-5508.patch
[jira] [Updated] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5508: --- Description: MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object (see "tempDirFs") that is not properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... if (tempDirFs != fs) { try { fs.close(); } catch (IOException ie) { ... } {code} was: MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object (see "tempDirFs") that is not properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... {code} > Memory leak caused by unreleased FileSystem objects in > JobInProgress#cleanupJob > --- > > Key: MAPREDUCE-5508 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobtracker >Affects Versions: 1-win >Reporter: Xi Fang >Assignee: Xi Fang >Priority: Critical > > MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem > object (see "tempDirFs") that is not properly released. > {code} JobInProgress#cleanupJob() > void cleanupJob() { > ... > tempDirFs = jobTempDirPath.getFileSystem(conf); > CleanupQueue.getInstance().addToQueue( > new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); > ... > if (tempDirFs != fs) { > try { > fs.close(); > } catch (IOException ie) { > ... > } > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767210#comment-13767210 ] Xi Fang commented on MAPREDUCE-5508: This bug was found in Microsoft's large-scale test with about 200,000 job submissions, during which memory usage grew steadily. There was a long discussion between Hortonworks (thanks [~cnauroth] and [~vinodkv]) and Microsoft on this issue. Here is a summary of the discussion. 1. The heap dumps show DistributedFileSystem instances that are only referred to from the cache's HashMap entries. Since nothing else holds a reference, nothing else can ever attempt to close them, and therefore they will never be removed from the cache. 2. The special check for "tempDirFS" (see the code in the description) in the patch for MAPREDUCE-5351 is intended as an optimization so that CleanupQueue doesn't need to immediately reopen a FileSystem that was just closed. However, we observed that we get different identity hash code values for the Subject in the cache key. The code assumes that CleanupQueue will find the same Subject that was used inside JobInProgress. Unfortunately, this is not guaranteed, because we may have crossed into a different access control context at this point, via UserGroupInformation#doAs. Even though it's conceptually the same user, the Subject is a function of the current AccessControlContext: {code} public synchronized static UserGroupInformation getCurrentUser() throws IOException { AccessControlContext context = AccessController.getContext(); Subject subject = Subject.getSubject(context); {code} Even if the contexts are logically equivalent between JobInProgress and CleanupQueue, there is no guarantee that Java will return the same Subject instance, which is required for a successful lookup in the FileSystem cache (because of the use of identity hash code). 
A fix is to abandon this optimization and close the FileSystem within the same AccessControlContext that opened it. > Memory leak caused by unreleased FileSystem objects in > JobInProgress#cleanupJob > --- > > Key: MAPREDUCE-5508 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobtracker >Affects Versions: 1-win >Reporter: Xi Fang >Assignee: Xi Fang >Priority: Critical > > MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem > object that is properly released. > {code} JobInProgress#cleanupJob() > void cleanupJob() { > ... > tempDirFs = jobTempDirPath.getFileSystem(conf); > CleanupQueue.getInstance().addToQueue( > new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); > ... > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
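The Subject-identity behavior described in the comment can be reproduced with plain JDK classes. The sketch below is illustrative and not Hadoop's actual code: the names `CacheKeyDemo` and `Key` are invented; `Key` merely mimics how the FileSystem cache key effectively compares the `Subject` by reference identity, so a second, content-identical `Subject` instance misses the cache and a fresh `FileSystem` would be created and leaked.

```java
import javax.security.auth.Subject;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch (not Hadoop's real classes): the cache key compares the
// Subject by reference identity, mirroring the identity-hash-based lookup in
// the FileSystem cache that the comment describes.
public class CacheKeyDemo {
    static final class Key {
        final String scheme;
        final Subject subject;
        Key(String scheme, Subject subject) {
            this.scheme = scheme;
            this.subject = subject;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            // Reference identity, not Subject content equality.
            return scheme.equals(k.scheme) && subject == k.subject;
        }
        @Override public int hashCode() {
            return scheme.hashCode() * 31 + System.identityHashCode(subject);
        }
    }

    public static void main(String[] args) {
        Map<Key, String> cache = new HashMap<>();
        Subject s1 = new Subject();
        Subject s2 = new Subject(); // conceptually the same user, new instance
        cache.put(new Key("hdfs", s1), "fs-instance-1");
        // Same Subject instance: cache hit.
        System.out.println(cache.containsKey(new Key("hdfs", s1))); // true
        // A different instance, e.g. obtained in a different
        // AccessControlContext: cache miss, so a second FileSystem would be
        // opened and cached -- the leak pattern.
        System.out.println(cache.containsKey(new Key("hdfs", s2))); // false
    }
}
```

Under the fix described above, the close runs inside the same context that opened the FileSystem, so the lookup uses the identical Subject instance and the cached entry can actually be removed.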
[jira] [Updated] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5508: --- Description: MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object (see "tempDirFs") that is not properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... {code} was: MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object that is properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... {code} > Memory leak caused by unreleased FileSystem objects in > JobInProgress#cleanupJob > --- > > Key: MAPREDUCE-5508 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobtracker >Affects Versions: 1-win >Reporter: Xi Fang >Assignee: Xi Fang >Priority: Critical > > MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem > object (see "tempDirFs") that is not properly released. > {code} JobInProgress#cleanupJob() > void cleanupJob() { > ... > tempDirFs = jobTempDirPath.getFileSystem(conf); > CleanupQueue.getInstance().addToQueue( > new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); > ... > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5508: --- Summary: Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob (was: Memory Leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob) > Memory leak caused by unreleased FileSystem objects in > JobInProgress#cleanupJob > --- > > Key: MAPREDUCE-5508 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobtracker >Affects Versions: 1-win >Reporter: Xi Fang >Assignee: Xi Fang >Priority: Critical > > MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem > object that is properly released. > {code} JobInProgress#cleanupJob() > void cleanupJob() { > ... > tempDirFs = jobTempDirPath.getFileSystem(conf); > CleanupQueue.getInstance().addToQueue( > new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); > ... > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-5508) Memory Leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
Xi Fang created MAPREDUCE-5508: -- Summary: Memory Leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob Key: MAPREDUCE-5508 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1-win Reporter: Xi Fang Assignee: Xi Fang Priority: Critical MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem object that is properly released. {code} JobInProgress#cleanupJob() void cleanupJob() { ... tempDirFs = jobTempDirPath.getFileSystem(conf); CleanupQueue.getInstance().addToQueue( new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId)); ... {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5405) Job recovery can fail if task log directory symlink from prior run still exists
[ https://issues.apache.org/jira/browse/MAPREDUCE-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13714194#comment-13714194 ] Xi Fang commented on MAPREDUCE-5405: Sounds good to me! I also did some tests on Ubuntu and Windows. It passes consistently. Thanks Chris. > Job recovery can fail if task log directory symlink from prior run still > exists > --- > > Key: MAPREDUCE-5405 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5405 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1 >Affects Versions: 1-win, 1.3.0 >Reporter: Chris Nauroth >Assignee: Chris Nauroth > Attachments: MAPREDUCE-5405.branch-1.1.patch > > > During recovery, the task attempt log dir symlink from the prior run might > still exist. If it does, then the recovered attempt will fail while trying > to create a symlink at that path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5391) TestNonLocalJobJarSubmission fails on Windows due to missing classpath entries
[ https://issues.apache.org/jira/browse/MAPREDUCE-5391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708988#comment-13708988 ] Xi Fang commented on MAPREDUCE-5391: Thanks Chris. The patch looks good to me! > TestNonLocalJobJarSubmission fails on Windows due to missing classpath entries > -- > > Key: MAPREDUCE-5391 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5391 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Affects Versions: 1-win >Reporter: Chris Nauroth >Assignee: Chris Nauroth > Attachments: MAPREDUCE-5391.1.patch > > > This test works by having the mapper check all classpath entries loaded by > the classloader. On Windows, the classpath is packed into an intermediate > jar file with a manifest containing the classpath to work around command line > length limitation. The test needs to be updated to unpack the intermediate > jar file and read the manifest when running on Windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703695#comment-13703695 ] Xi Fang commented on MAPREDUCE-5278: Thanks, Chris > Distributed cache is broken when JT staging dir is not on the default FS > > > Key: MAPREDUCE-5278 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distributed-cache >Affects Versions: 1-win > Environment: Windows >Reporter: Xi Fang >Assignee: Xi Fang > Fix For: 1-win > > Attachments: MAPREDUCE-5278.2.patch, MAPREDUCE-5278.3.patch, > MAPREDUCE-5278.4.patch, MAPREDUCE-5278.5.patch, MAPREDUCE-5278.patch > > > Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir) is > set to point to HDFS, even though other file systems (e.g. Amazon S3 file > system and Windows ASV file system) are the default file systems. > For ASV, this config was chosen and there are a few reasons why: > 1. To prevent leak of the storage account credentials to the user's storage > account; > 2. It uses HDFS for the transient job files what is good for two reasons – a) > it does not flood the user's storage account with irrelevant data/files b) it > leverages HDFS locality for small files > However, this approach conflicts with how distributed cache caching works, > completely negating the feature's functionality. > When files are added to the distributed cache (thru files/achieves/libjars > hadoop generic options), they are copied to the job tracker staging dir only > if they reside on a file system different that the jobtracker's. Later on, > this path is used as a "key" to cache the files locally on the tasktracker's > machine, and avoid localization (download/unzip) of the distributed cache > files if they are already localized. 
> In this configuration the caching is completely disabled and we always end up > copying dist cache files to the job tracker's staging dir first and > localizing them on the task tracker machine second. > This is especially not good for Oozie scenarios as Oozie uses dist cache to > populate Hive/Pig jars throughout the cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703619#comment-13703619 ] Xi Fang commented on MAPREDUCE-5278: Thanks, Chris. A new patch has been attached. > Distributed cache is broken when JT staging dir is not on the default FS > > > Key: MAPREDUCE-5278 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distributed-cache >Affects Versions: 1-win > Environment: Windows >Reporter: Xi Fang >Assignee: Xi Fang > Fix For: 1-win > > Attachments: MAPREDUCE-5278.2.patch, MAPREDUCE-5278.3.patch, > MAPREDUCE-5278.4.patch, MAPREDUCE-5278.5.patch, MAPREDUCE-5278.patch > > > Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir) is > set to point to HDFS, even though other file systems (e.g. Amazon S3 file > system and Windows ASV file system) are the default file systems. > For ASV, this config was chosen and there are a few reasons why: > 1. To prevent leak of the storage account credentials to the user's storage > account; > 2. It uses HDFS for the transient job files what is good for two reasons – a) > it does not flood the user's storage account with irrelevant data/files b) it > leverages HDFS locality for small files > However, this approach conflicts with how distributed cache caching works, > completely negating the feature's functionality. > When files are added to the distributed cache (thru files/achieves/libjars > hadoop generic options), they are copied to the job tracker staging dir only > if they reside on a file system different that the jobtracker's. Later on, > this path is used as a "key" to cache the files locally on the tasktracker's > machine, and avoid localization (download/unzip) of the distributed cache > files if they are already localized. 
> In this configuration the caching is completely disabled and we always end up > copying dist cache files to the job tracker's staging dir first and > localizing them on the task tracker machine second. > This is especially not good for Oozie scenarios as Oozie uses dist cache to > populate Hive/Pig jars throughout the cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5278: --- Attachment: MAPREDUCE-5278.5.patch > Distributed cache is broken when JT staging dir is not on the default FS > > > Key: MAPREDUCE-5278 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distributed-cache >Affects Versions: 1-win > Environment: Windows >Reporter: Xi Fang >Assignee: Xi Fang > Fix For: 1-win > > Attachments: MAPREDUCE-5278.2.patch, MAPREDUCE-5278.3.patch, > MAPREDUCE-5278.4.patch, MAPREDUCE-5278.5.patch, MAPREDUCE-5278.patch > > > Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir) is > set to point to HDFS, even though other file systems (e.g. Amazon S3 file > system and Windows ASV file system) are the default file systems. > For ASV, this config was chosen and there are a few reasons why: > 1. To prevent leak of the storage account credentials to the user's storage > account; > 2. It uses HDFS for the transient job files what is good for two reasons – a) > it does not flood the user's storage account with irrelevant data/files b) it > leverages HDFS locality for small files > However, this approach conflicts with how distributed cache caching works, > completely negating the feature's functionality. > When files are added to the distributed cache (thru files/achieves/libjars > hadoop generic options), they are copied to the job tracker staging dir only > if they reside on a file system different that the jobtracker's. Later on, > this path is used as a "key" to cache the files locally on the tasktracker's > machine, and avoid localization (download/unzip) of the distributed cache > files if they are already localized. 
> In this configuration the caching is completely disabled and we always end up > copying dist cache files to the job tracker's staging dir first and > localizing them on the task tracker machine second. > This is especially not good for Oozie scenarios as Oozie uses dist cache to > populate Hive/Pig jars throughout the cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users
[ https://issues.apache.org/jira/browse/MAPREDUCE-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702176#comment-13702176 ] Xi Fang commented on MAPREDUCE-5371: Thanks Chris! > TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of > windows users > --- > > Key: MAPREDUCE-5371 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5371 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Affects Versions: 1-win > Environment: Windows >Reporter: Xi Fang >Assignee: Xi Fang >Priority: Minor > Fix For: 1-win > > Attachments: MAPREDUCE-5371.patch > > > The error message was: > Error Message > expected:<[sijenkins-vm2]jenkins> but was:<[]jenkins> > Stacktrace > at > org.apache.hadoop.security.TestProxyUserFromEnv.testProxyUserFromEnvironment(TestProxyUserFromEnv.java:45) > The root cause of this failure is the domain used on Windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699544#comment-13699544 ] Xi Fang commented on MAPREDUCE-5278: Thanks Bikas. A new patch was attached. > Distributed cache is broken when JT staging dir is not on the default FS > > > Key: MAPREDUCE-5278 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distributed-cache >Affects Versions: 1-win > Environment: Windows >Reporter: Xi Fang >Assignee: Xi Fang > Fix For: 1-win > > Attachments: MAPREDUCE-5278.2.patch, MAPREDUCE-5278.3.patch, > MAPREDUCE-5278.4.patch, MAPREDUCE-5278.patch > > > Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir) is > set to point to HDFS, even though other file systems (e.g. Amazon S3 file > system and Windows ASV file system) are the default file systems. > For ASV, this config was chosen and there are a few reasons why: > 1. To prevent leak of the storage account credentials to the user's storage > account; > 2. It uses HDFS for the transient job files what is good for two reasons – a) > it does not flood the user's storage account with irrelevant data/files b) it > leverages HDFS locality for small files > However, this approach conflicts with how distributed cache caching works, > completely negating the feature's functionality. > When files are added to the distributed cache (thru files/achieves/libjars > hadoop generic options), they are copied to the job tracker staging dir only > if they reside on a file system different that the jobtracker's. Later on, > this path is used as a "key" to cache the files locally on the tasktracker's > machine, and avoid localization (download/unzip) of the distributed cache > files if they are already localized. 
> In this configuration the caching is completely disabled and we always end up > copying dist cache files to the job tracker's staging dir first and > localizing them on the task tracker machine second. > This is especially not good for Oozie scenarios as Oozie uses dist cache to > populate Hive/Pig jars throughout the cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5278: --- Attachment: MAPREDUCE-5278.4.patch > Distributed cache is broken when JT staging dir is not on the default FS > > > Key: MAPREDUCE-5278 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distributed-cache >Affects Versions: 1-win > Environment: Windows >Reporter: Xi Fang >Assignee: Xi Fang > Fix For: 1-win > > Attachments: MAPREDUCE-5278.2.patch, MAPREDUCE-5278.3.patch, > MAPREDUCE-5278.4.patch, MAPREDUCE-5278.patch > > > Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir) is > set to point to HDFS, even though other file systems (e.g. Amazon S3 file > system and Windows ASV file system) are the default file systems. > For ASV, this config was chosen and there are a few reasons why: > 1. To prevent leak of the storage account credentials to the user's storage > account; > 2. It uses HDFS for the transient job files what is good for two reasons – a) > it does not flood the user's storage account with irrelevant data/files b) it > leverages HDFS locality for small files > However, this approach conflicts with how distributed cache caching works, > completely negating the feature's functionality. > When files are added to the distributed cache (thru files/achieves/libjars > hadoop generic options), they are copied to the job tracker staging dir only > if they reside on a file system different that the jobtracker's. Later on, > this path is used as a "key" to cache the files locally on the tasktracker's > machine, and avoid localization (download/unzip) of the distributed cache > files if they are already localized. > In this configuration the caching is completely disabled and we always end up > copying dist cache files to the job tracker's staging dir first and > localizing them on the task tracker machine second. 
> This is especially not good for Oozie scenarios as Oozie uses dist cache to > populate Hive/Pig jars throughout the cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users
[ https://issues.apache.org/jira/browse/MAPREDUCE-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698240#comment-13698240 ] Xi Fang commented on MAPREDUCE-5371: The attached patch removed the domains from user names. > TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of > windows users > --- > > Key: MAPREDUCE-5371 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5371 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 1-win > Environment: Windows >Reporter: Xi Fang >Assignee: Xi Fang >Priority: Minor > Fix For: 1-win > > Attachments: MAPREDUCE-5371.patch > > > The error message was: > Error Message > expected:<[sijenkins-vm2]jenkins> but was:<[]jenkins> > Stacktrace > at > org.apache.hadoop.security.TestProxyUserFromEnv.testProxyUserFromEnvironment(TestProxyUserFromEnv.java:45) > The root cause of this failure is the domain used on Windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Work started] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users
[ https://issues.apache.org/jira/browse/MAPREDUCE-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on MAPREDUCE-5371 started by Xi Fang. > TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of > windows users > --- > > Key: MAPREDUCE-5371 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5371 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 1-win > Environment: Windows >Reporter: Xi Fang >Assignee: Xi Fang >Priority: Minor > Fix For: 1-win > > Attachments: MAPREDUCE-5371.patch > > > The error message was: > Error Message > expected:<[sijenkins-vm2]jenkins> but was:<[]jenkins> > Stacktrace > at > org.apache.hadoop.security.TestProxyUserFromEnv.testProxyUserFromEnvironment(TestProxyUserFromEnv.java:45) > The root cause of this failure is the domain used on Windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users
[ https://issues.apache.org/jira/browse/MAPREDUCE-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5371: --- Attachment: MAPREDUCE-5371.patch > TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of > windows users > --- > > Key: MAPREDUCE-5371 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5371 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 1-win > Environment: Windows >Reporter: Xi Fang >Assignee: Xi Fang >Priority: Minor > Fix For: 1-win > > Attachments: MAPREDUCE-5371.patch > > > The error message was: > Error Message > expected:<[sijenkins-vm2]jenkins> but was:<[]jenkins> > Stacktrace > at > org.apache.hadoop.security.TestProxyUserFromEnv.testProxyUserFromEnvironment(TestProxyUserFromEnv.java:45) > The root cause of this failure is the domain used on Windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users
Xi Fang created MAPREDUCE-5371: -- Summary: TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users Key: MAPREDUCE-5371 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5371 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1-win Environment: Windows Reporter: Xi Fang Assignee: Xi Fang Priority: Minor Fix For: 1-win The error message was: Error Message expected:<[sijenkins-vm2]jenkins> but was:<[]jenkins> Stacktrace at org.apache.hadoop.security.TestProxyUserFromEnv.testProxyUserFromEnvironment(TestProxyUserFromEnv.java:45) The root cause of this failure is the domain used on Windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5330) JVM manager should not forcefully kill the process on Signal.TERM on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698205#comment-13698205 ] Xi Fang commented on MAPREDUCE-5330: Thanks Ivan and Chris! > JVM manager should not forcefully kill the process on Signal.TERM on Windows > > > Key: MAPREDUCE-5330 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5330 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 1-win > Environment: Windows >Reporter: Xi Fang >Assignee: Xi Fang > Fix For: 1-win > > Attachments: MAPREDUCE-5330.patch > > > In MapReduce, we sometimes kill a task's JVM before it naturally shuts down > if we want to launch other tasks (look in > JvmManager$JvmManagerForType.reapJvm). This behavior means that if the map > task process is in the middle of doing some cleanup/finalization after the > task is done, it might be interrupted/killed without giving it a chance. > In the Microsoft's Hadoop Service, after a Map/Reduce task is done and during > closing file systems in a special shutdown hook, we're typically uploading > storage (ASV in our context) usage metrics to Microsoft Azure Tables. So if > this kill happens these metrics get lost. The impact is that for many MR jobs > we don't see accurate metrics reported most of the time. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5109) Job view-acl should apply to job listing too
[ https://issues.apache.org/jira/browse/MAPREDUCE-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13695116#comment-13695116 ] Xi Fang commented on MAPREDUCE-5109: Hi Vinod, thanks for your patch. If Hadoop runs with this patch on Windows, there would be a problem because file names can't contain "*" on Windows. After discussing with Chris, we have two proposals specifically for Windows: 1. Use an entirely different wildcard character on Windows (for example: using "!" instead of "*") 2. Add an encoder and a decoder specifically for "*" in JobHistory#encodeJobHistoryFileName() and decodeJobHistoryFileName() respectively, on Windows. For example, we can encode "*" to "%20F". In this case, getNewJobHistoryFileName should also be changed accordingly. Do you have any suggestions on these two options? > Job view-acl should apply to job listing too > > > Key: MAPREDUCE-5109 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5109 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Arun C Murthy >Assignee: Vinod Kumar Vavilapalli > Attachments: MAPREDUCE-5109-20130405.2.txt > > > Job view-acl should apply to job listing too, currently it only applies to > job details pages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
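Option 2 from the comment above can be sketched as a simple, reversible substitution. This is an illustration under assumptions, not the actual JobHistory change: the class name is invented, and "%2A" (the standard percent-encoding of '*') is used here in place of the "%20F" token mentioned in the comment. Note that '%' itself must be escaped first so that decoding is unambiguous.

```java
// Hypothetical sketch of a '*' encoder/decoder for Windows-safe job history
// file names. Escaping '%' first makes the round trip unambiguous even for
// names that already contain the literal text "%2A".
public class WildcardEncoding {
    static String encode(String name) {
        return name.replace("%", "%25").replace("*", "%2A");
    }
    static String decode(String name) {
        return name.replace("%2A", "*").replace("%25", "%");
    }

    public static void main(String[] args) {
        String original = "job_201304051716_0001_ad*min";
        String encoded = encode(original);
        System.out.println(encoded);                          // '*' becomes "%2A"
        System.out.println(decode(encoded).equals(original)); // true
    }
}
```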
[jira] [Commented] (MAPREDUCE-5330) Killing M/R JVM's leads to metrics not being uploaded
[ https://issues.apache.org/jira/browse/MAPREDUCE-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687346#comment-13687346 ] Xi Fang commented on MAPREDUCE-5330: If Signal.TERM is sent to a process, we wait for a delay before force-killing it. On Windows, however, the signal kind is ignored and we simply kill the process (see Shell#getSignalKillProcessGroupCommand()):

{code}
public static String[] getSignalKillProcessGroupCommand(int code,
                                                        String groupId) {
  if (WINDOWS) {
    return new String[] { Shell.WINUTILS, "task", "kill", groupId };
  } else {
    return new String[] { "kill", "-" + code, "-" + groupId };
  }
}
{code}

Here is a fix: if the OS is Windows and the signal is TERM, return immediately and let a delayed process killer actually kill this process group. This gives the process group a grace period to clean up after itself.

> Killing M/R JVM's leads to metrics not being uploaded
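The proposed fix boils down to a single predicate. The sketch below uses assumed names (it is not the actual JvmManager/Shell code): on Windows, a TERM request is deferred to a delayed killer instead of being translated into an immediate forceful winutils "task kill", while on Linux "kill -TERM" already delivers a catchable signal, so no special handling is needed.

```java
// Hedged sketch of the TERM-deferral decision; names are assumptions.
public class TermSignalSketch {
    public enum Signal { TERM, KILL }

    // True when the kill should be handed to a delayed process killer
    // rather than executed immediately, giving shutdown hooks a chance.
    public static boolean shouldDeferKill(boolean isWindows, Signal signal) {
        // Windows + TERM is the only case where the underlying command
        // would otherwise kill forcefully despite a "soft" signal request.
        return isWindows && signal == Signal.TERM;
    }
}
```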
[jira] [Updated] (MAPREDUCE-5330) Killing M/R JVM's leads to metrics not being uploaded
[ https://issues.apache.org/jira/browse/MAPREDUCE-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5330: --- Attachment: MAPREDUCE-5330.patch
[jira] [Created] (MAPREDUCE-5330) Killing M/R JVM's leads to metrics not being uploaded
Xi Fang created MAPREDUCE-5330: --- Summary: Killing M/R JVM's leads to metrics not being uploaded Key: MAPREDUCE-5330 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5330 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1-win Environment: Windows Reporter: Xi Fang Assignee: Xi Fang
[jira] [Commented] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686233#comment-13686233 ] Xi Fang commented on MAPREDUCE-5278: Thanks Bikas. A config name was added in JobClient.java:

{code}
private static final String CLIENT_ACCESSIBLE_REMOTE_SCHEMES_KEY =
    "mapreduce.client.accessible.remote.schemes";
{code}

And in copyRemoteFiles(), I changed it to:

{code}
String[] accessibleSchemes = job.getStrings(
    CLIENT_ACCESSIBLE_REMOTE_SCHEMES_KEY, null);
{code}

> Distributed cache is broken when JT staging dir is not on the default FS
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: distributed-cache
> Affects Versions: 1-win
> Environment: Windows
> Reporter: Xi Fang
> Assignee: Xi Fang
> Fix For: 1-win
> Attachments: MAPREDUCE-5278.2.patch, MAPREDUCE-5278.3.patch, MAPREDUCE-5278.patch
>
> Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is set to point to HDFS, even though other file systems (e.g. the Amazon S3 file system and the Windows ASV file system) are the default file systems.
> For ASV, this config was chosen for a few reasons:
> 1. To prevent leaking the storage account credentials to the user's storage account;
> 2. It uses HDFS for the transient job files, which is good for two reasons: a) it does not flood the user's storage account with irrelevant data/files, and b) it leverages HDFS locality for small files.
> However, this approach conflicts with how distributed cache caching works, completely negating the feature's functionality.
> When files are added to the distributed cache (through the files/archives/libjars hadoop generic options), they are copied to the job tracker staging dir only if they reside on a file system different from the jobtracker's. Later on, this path is used as a "key" to cache the files locally on the tasktracker's machine, and to avoid localization (download/unzip) of the distributed cache files if they are already localized.
> In this configuration the caching is completely disabled, and we always end up copying dist cache files to the job tracker's staging dir first and localizing them on the task tracker machine second.
> This is especially bad for Oozie scenarios, as Oozie uses the dist cache to populate Hive/Pig jars throughout the cluster.
[jira] [Updated] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5278: --- Attachment: MAPREDUCE-5278.3.patch
[jira] [Commented] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686060#comment-13686060 ] Xi Fang commented on MAPREDUCE-5278: Thanks Bikas for your comments. For your question, "Is the following code (marked below) continuing to copy stuff to the default fs (fs) when the newPath points to a different filesystem?": No. Basically, the original code does this: if the JT staging dir is not on the default FS (for example, in our context it is ASV), copyRemoteFiles() will copy files in ASV to the JT. Note that these files are specified using generic options. After our change, when ASV is marked as "accessible" by specifying "mapreduce.client.accessible.remote.schemes", copyRemoteFiles() won't copy the files in ASV to the jobtracker. It just directly returns the path of that file, denoted by "newPath". In addition, no copy operation would happen in addArchiveToClassPath().

> Distributed cache is broken when JT staging dir is not on the default FS
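The short-circuit described above can be sketched as follows. The class and method names are hypothetical (this is not the actual JobClient code): copyRemoteFiles() would consult a check like this against the configured scheme list and, when it returns true, skip the copy and hand back the original path unchanged.

```java
import java.net.URI;

// Hedged sketch of the accessible-scheme check; names are assumptions.
public class AccessibleSchemeSketch {
    // True when the path's scheme appears in the configured list
    // ("mapreduce.client.accessible.remote.schemes"), meaning the job
    // client can skip copying the file to the JT staging dir.
    public static boolean isAccessible(URI path, String[] accessibleSchemes) {
        if (accessibleSchemes == null || path.getScheme() == null) {
            return false;  // property unset: preserve the old copy behavior
        }
        for (String scheme : accessibleSchemes) {
            if (scheme.equalsIgnoreCase(path.getScheme())) {
                return true;
            }
        }
        return false;
    }
}
```

Defaulting to false when the property is unset matches the "passive" design of the patch: without configuration, the original copy-to-staging behavior is preserved.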
[jira] [Commented] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13680075#comment-13680075 ] Xi Fang commented on MAPREDUCE-5278: Thanks Ivan. I have added a classpath check and am preparing a trunk version.
[jira] [Updated] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5278: --- Attachment: MAPREDUCE-5278.2.patch
[jira] [Updated] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5278: --- Fix Version/s: 1-win Target Version/s: 1-win Status: Patch Available (was: In Progress)
[jira] [Updated] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5278: --- Attachment: MAPREDUCE-5278.patch

A patch is attached. In this patch, we added a property called "mapreduce.client.accessible.remote.schemes". It specifies the schemes of the file systems that are accessible from all the nodes in the cluster. The job client uses it to avoid copying distributed cache entries to the job staging dir if the path is accessible (see JobClient#copyRemoteFiles()). For example, on Windows Azure, a path that has ASV as its scheme is accessible from all the nodes in the cluster, so "mapreduce.client.accessible.remote.schemes" can be set to "ASV". The change in this patch is passive, meaning that it won't take effect unless this property is set through configuration.

> Perf: Distributed cache is broken when JT staging dir is not on the default FS
[jira] [Work started] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on MAPREDUCE-5278 started by Xi Fang.
[jira] [Updated] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5278: --- Description: (minor wording edit to the issue description: "creds" expanded to "credentials"; the full text is otherwise unchanged)
[jira] [Updated] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5278: --- Description: Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is set to point to HDFS, even though other file systems (e.g. the Amazon S3 file system and the Windows ASV file system) are the default file systems. For ASV, this config was chosen for a few reasons: 1. To prevent leaking the storage account creds to the user's storage account; 2. It uses HDFS for the transient job files, which is good for two reasons: a) it does not flood the user's storage account with irrelevant data/files; b) it leverages HDFS locality for small files. However, this approach conflicts with how distributed cache caching works, completely negating the feature's functionality. When files are added to the distributed cache (through the files/archives/libjars Hadoop generic options), they are copied to the job tracker staging dir only if they reside on a file system different from the jobtracker's. Later on, this path is used as a "key" to cache the files locally on the tasktracker's machine and to avoid localization (download/unzip) of the distributed cache files if they are already localized. In this configuration the caching is completely disabled, and we always end up copying dist cache files to the job tracker's staging dir first and localizing them on the task tracker machine second. This is especially bad for Oozie scenarios, as Oozie uses the dist cache to populate Hive/Pig jars throughout the cluster. was: Today, we set the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") to point to HDFS even though ASV is the default file system. There are a few reasons why this config was chosen: 1. To prevent leaking the storage account creds to the user's storage account (IOW, keep job.xml in the cluster). 2. It uses HDFS for the transient job files, which is good for two reasons: a) it does not flood the user's storage account with irrelevant data/files; b) it leverages HDFS locality for small files. However, this approach conflicts with how distributed cache caching works, completely negating the feature's functionality. When files are added to the distributed cache (through the files/archives/libjars Hadoop generic options), they are copied to the job tracker staging dir only if they reside on a file system different from the jobtracker's. Later on, this path is used as a "key" to cache the files locally on the tasktracker's machine and to avoid localization (download/unzip) of the distributed cache files if they are already localized. In our configuration the caching is completely disabled, and we always end up copying dist cache files to the JT staging dir first and localizing them on the tasktracker machine second. This is especially bad for Oozie scenarios, as Oozie uses the dist cache to populate Hive/Pig jars throughout the cluster. An easy workaround is to configure mapreduce.jobtracker.staging.root.dir in mapred-site.xml to be on the default FS.
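The copy-versus-cache decision described above can be sketched as follows. This is a minimal, self-contained illustration using plain java.net.URI, not Hadoop's actual JobClient#copyRemoteFiles logic; the class and method names are hypothetical, and a "file system" is approximated here by the URI scheme + authority pair.

```java
import java.net.URI;

/**
 * Sketch (not the real Hadoop implementation): a distributed-cache file is
 * copied into the JobTracker staging dir only when it lives on a different
 * file system than the staging dir itself.
 */
public class DistCacheCopyDecision {

    /** Returns true when the two URIs point at different file systems. */
    static boolean needsCopyToStaging(URI file, URI stagingDir) {
        String fileScheme = file.getScheme();
        String stagingScheme = stagingDir.getScheme();
        if (fileScheme == null || stagingScheme == null) {
            // A relative path resolves against the default FS; treat as same FS.
            return false;
        }
        if (!fileScheme.equalsIgnoreCase(stagingScheme)) {
            return true;
        }
        String fileAuth = file.getAuthority();
        String stagingAuth = stagingDir.getAuthority();
        return fileAuth == null ? stagingAuth != null
                                : !fileAuth.equals(stagingAuth);
    }

    public static void main(String[] args) {
        URI staging = URI.create("hdfs://namenode:9000/user/staging");
        // Same FS as staging: no copy, so the path-based cache key is stable.
        System.out.println(needsCopyToStaging(
            URI.create("hdfs://namenode:9000/libs/pig.jar"), staging));    // false
        // Default FS differs from the HDFS staging dir: always copied, which is
        // exactly the cache-defeating behavior the description reports.
        System.out.println(needsCopyToStaging(
            URI.create("asv://container@account/libs/pig.jar"), staging)); // true
    }
}
```

Because the copy produces a fresh staging path per job, the "key" used for tasktracker-side caching changes every time, so previously localized files are never reused.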
[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668119#comment-13668119 ] Xi Fang commented on MAPREDUCE-5224: Thanks Ivan! > JobTracker should allow the system directory to be in non-default FS > > > Key: MAPREDUCE-5224 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobtracker >Reporter: Xi Fang >Assignee: Xi Fang >Priority: Minor > Fix For: 1-win > > Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.3.patch, > MAPREDUCE-5224.4.patch, MAPREDUCE-5224.5.patch, MAPREDUCE-5224.patch > > > JobTracker today expects the system directory to be in the default file > system: > if (fs == null) { > fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() { > public FileSystem run() throws IOException { > return FileSystem.get(conf); > }}); > } > ... > public String getSystemDir() { > Path sysDir = new Path(conf.get("mapred.system.dir", > "/tmp/hadoop/mapred/system")); > return fs.makeQualified(sysDir).toString(); > } > In a cloud like Azure the default file system is set to ASV (Windows Azure Blob > Storage), but we would still like the system directory to be in DFS. We > should change JobTracker to allow that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
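The direction of the fix quoted above can be sketched with plain java.net.URI rather than Hadoop's Path/FileSystem APIs: a fully qualified mapred.system.dir keeps its own scheme, and only a scheme-less path is resolved against the default FS. The class and method names below are hypothetical.

```java
import java.net.URI;

/**
 * Sketch: qualify the configured system dir against its own file system,
 * falling back to the default FS only when no scheme is given.
 */
public class SystemDirQualify {

    static URI qualifySystemDir(String configured, URI defaultFs) {
        URI dir = URI.create(configured);
        if (dir.getScheme() != null) {
            return dir;                 // e.g. hdfs://... stays on HDFS
        }
        return defaultFs.resolve(dir);  // bare path lands on the default FS
    }

    public static void main(String[] args) {
        URI asvDefault = URI.create("asv://container@account/");
        // An explicit HDFS system dir survives an ASV default FS.
        System.out.println(qualifySystemDir(
            "hdfs://namenode:9000/mapred/system", asvDefault));
        // A bare path resolves against the default FS, as before.
        System.out.println(qualifySystemDir(
            "/tmp/hadoop/mapred/system", asvDefault));
    }
}
```

In the real patch the same idea would mean obtaining the FileSystem from the system dir path itself (sysDir.getFileSystem(conf)) instead of qualifying it against FileSystem.get(conf).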
[jira] [Commented] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668117#comment-13668117 ] Xi Fang commented on MAPREDUCE-5278: Basically, if a remote file system is reachable from the task trackers, we don't have to copy the files on that file system to the job tracker's staging dir (see JobClient#copyRemoteFiles()). For example, in HDInsight, user storage would be ASV, which is different from HDFS, so by default these files would be copied to the JT. However, since ASV is supposed to be reachable from the tasktrackers, these copy operations are unnecessary, and they also disable the dist cache. A proposal is to add a configuration property (e.g. "mapred.tasktracker.scheme.accessible"). If we specify a scheme in this property, we won't do the copy operation even if the scheme is not equal to the scheme of the job tracker's staging dir. For example, in this context, mapred.tasktracker.scheme.accessible=ASV.
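The proposal above can be sketched as follows. This is a hedged illustration only: the "mapred.tasktracker.scheme.accessible" property is the commenter's proposal, not an existing Hadoop setting, and the parsing and method names are hypothetical.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

/**
 * Sketch of the proposed check: if a file's scheme is declared accessible
 * from the tasktrackers, skip the copy into the JT staging dir even though
 * the schemes differ. Property parsing is simplified to a comma-separated
 * string.
 */
public class AccessibleSchemeCheck {

    /** Schemes every tasktracker is assumed to reach directly. */
    static Set<String> parseAccessibleSchemes(String propertyValue) {
        Set<String> schemes = new HashSet<>();
        if (propertyValue != null) {
            for (String s : propertyValue.split(",")) {
                if (!s.trim().isEmpty()) {
                    schemes.add(s.trim().toLowerCase());
                }
            }
        }
        return schemes;
    }

    static boolean shouldCopyToStaging(URI file, URI stagingDir,
                                       Set<String> accessible) {
        String fileScheme = file.getScheme();
        if (fileScheme == null) {
            return false;                      // relative path: default FS
        }
        if (accessible.contains(fileScheme.toLowerCase())) {
            return false;                      // reachable from tasktrackers
        }
        return !fileScheme.equalsIgnoreCase(stagingDir.getScheme());
    }

    public static void main(String[] args) {
        Set<String> accessible = parseAccessibleSchemes("asv");
        URI staging = URI.create("hdfs://namenode:9000/user/staging");
        // ASV is declared accessible, so no copy even though staging is HDFS,
        // which keeps the dist cache key stable across jobs.
        System.out.println(shouldCopyToStaging(
            URI.create("asv://container@account/libs/hive.jar"),
            staging, accessible)); // false
    }
}
```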
[jira] [Updated] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5278: --- Description: Today, we set the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") to point to HDFS even though ASV is the default file system. There are a few reasons why this config was chosen: 1. To prevent leaking the storage account creds to the user's storage account (IOW, keep job.xml in the cluster). 2. It uses HDFS for the transient job files, which is good for two reasons: a) it does not flood the user's storage account with irrelevant data/files; b) it leverages HDFS locality for small files. However, this approach conflicts with how distributed cache caching works, completely negating the feature's functionality. When files are added to the distributed cache (through the files/archives/libjars Hadoop generic options), they are copied to the job tracker staging dir only if they reside on a file system different from the jobtracker's. Later on, this path is used as a "key" to cache the files locally on the tasktracker's machine and to avoid localization (download/unzip) of the distributed cache files if they are already localized. In our configuration the caching is completely disabled, and we always end up copying dist cache files to the JT staging dir first and localizing them on the tasktracker machine second. This is especially bad for Oozie scenarios, as Oozie uses the dist cache to populate Hive/Pig jars throughout the cluster. An easy workaround is to configure mapreduce.jobtracker.staging.root.dir in mapred-site.xml to be on the default FS. was: Today, we set the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") to point to HDFS even though ASV is the default file system. There are a few reasons why this config was chosen: To prevent leaking the storage account creds to the user's storage account (IOW, keep job.xml in the cluster). This is needed until HADOOP-444 is fixed. It uses HDFS for the transient job files, which is good for two reasons: a) it does not flood the user's storage account with irrelevant data/files; b) it leverages HDFS locality for small files. However, this approach conflicts with how distributed cache caching works, completely negating the feature's functionality. When files are added to the distributed cache (through the files/archives/libjars Hadoop generic options), they are copied to the job tracker staging dir only if they reside on a file system different from the jobtracker's. Later on, this path is used as a "key" to cache the files locally on the tasktracker's machine and to avoid localization (download/unzip) of the distributed cache files if they are already localized. In our configuration the caching is completely disabled, and we always end up copying dist cache files to the JT staging dir first and localizing them on the tasktracker machine second. This is especially bad for Oozie scenarios, as Oozie uses the dist cache to populate Hive/Pig jars throughout the cluster. An easy workaround is to configure mapreduce.jobtracker.staging.root.dir in mapred-site.xml to be on the default FS.
[jira] [Updated] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5278: --- Assignee: Xi Fang
[jira] [Created] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS
Xi Fang created MAPREDUCE-5278: -- Summary: Perf: Distributed cache is broken when JT staging dir is not on the default FS Key: MAPREDUCE-5278 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278 Project: Hadoop Map/Reduce Issue Type: Bug Components: distributed-cache Affects Versions: 1-win Environment: Windows Reporter: Xi Fang Today, we set the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") to point to HDFS even though ASV is the default file system. There are a few reasons why this config was chosen: To prevent leaking the storage account creds to the user's storage account (IOW, keep job.xml in the cluster). This is needed until HADOOP-444 is fixed. It uses HDFS for the transient job files, which is good for two reasons: a) it does not flood the user's storage account with irrelevant data/files; b) it leverages HDFS locality for small files. However, this approach conflicts with how distributed cache caching works, completely negating the feature's functionality. When files are added to the distributed cache (through the files/archives/libjars Hadoop generic options), they are copied to the job tracker staging dir only if they reside on a file system different from the jobtracker's. Later on, this path is used as a "key" to cache the files locally on the tasktracker's machine and to avoid localization (download/unzip) of the distributed cache files if they are already localized. In our configuration the caching is completely disabled, and we always end up copying dist cache files to the JT staging dir first and localizing them on the tasktracker machine second. This is especially bad for Oozie scenarios, as Oozie uses the dist cache to populate Hive/Pig jars throughout the cluster. An easy workaround is to configure mapreduce.jobtracker.staging.root.dir in mapred-site.xml to be on the default FS. -- This message is automatically generated by JIRA.
[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667437#comment-13667437 ] Xi Fang commented on MAPREDUCE-5224: Thanks, Ivan, for MAPREDUCE-5224.5.patch. Here are the reasons (from offline emails with Ivan) for posting this new patch: 1. Given that fs is indeed used in some other places, we have to account for that as well (these tests actually want to close the system dir fs). 2. There is no need to use the default file system for the job history. There is another (orthogonal) bug here: the job history completed location also assumes the default FS, which is not correct. This should be a separate Jira. 3. This would make the prod code change really simple. -- This message is automatically generated by JIRA.
[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5224: --- Attachment: MAPREDUCE-5224.5.patch
[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5224: --- Attachment: MAPREDUCE-5224.4.patch The above comments have been addressed. Thanks. BTW, I changed JobTracker#defaultFs back to fs, because some other code in the same package uses this "fs" (fs was originally defined with no access modifier).
[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665679#comment-13665679 ] Xi Fang commented on MAPREDUCE-5224: Thanks Ivan for your detailed comments. These are of great help!
[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663457#comment-13663457 ] Xi Fang commented on MAPREDUCE-5224: Thanks Chuan and Ivan. 1. For Chuan's comment: I added an assert to check that this dir indeed exists. This also addresses Ivan's 10th comment. 2. For Ivan's comments: a. For comments #1, 3, 5, 7, 9, and 10, I just followed the comments. b. For comment #2: I personally think it would be better to throw an exception rather than swallowing it and falling back to the default file system. As mentioned by Mostafa (offline), if someone configured the system dir as http://www.awesome.com/system, then with the fallback solution the exception saying "HTTP is not supported" would be swallowed and we'd set the system directory to just /system on the default file system, which doesn't seem like good behavior. We may want someone to explicitly know about/handle this at the moment it happens. c. For comment #4: I renamed JobTracker#fs to defaultFs and still keep it for possible future use/reference. d. For comments #6 and 8: I put the initialization of MiniDFSCluster and MiniMRCluster in the test case and let setUp() just construct a configuration. This way, we don't have to throw IOException in setUp(), and the test case will fail if my code changes are not applied.
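The fail-loudly behavior argued for in point b above can be sketched as follows. The supported-scheme list and the class/method names are illustrative assumptions, not Hadoop's actual validation code.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/**
 * Sketch: if the configured system dir uses a scheme the cluster cannot
 * serve, surface the misconfiguration immediately instead of silently
 * falling back to the default FS.
 */
public class SystemDirValidation {

    // Illustrative set of schemes this cluster is assumed to support.
    static final Set<String> SUPPORTED =
        new HashSet<>(Arrays.asList("hdfs", "asv", "s3", "file"));

    static URI validateSystemDir(String configured) {
        URI uri = URI.create(configured);
        String scheme = uri.getScheme();
        if (scheme != null && !SUPPORTED.contains(scheme.toLowerCase())) {
            // Fail at configuration time, e.g. for http://.../system.
            throw new IllegalArgumentException(
                "Unsupported file system scheme for mapred.system.dir: " + scheme);
        }
        return uri;
    }

    public static void main(String[] args) {
        System.out.println(validateSystemDir("hdfs://namenode:9000/mapred/system"));
        try {
            validateSystemDir("http://www.awesome.com/system");
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

With the fallback approach, the http:// URI above would silently become /system on the default FS; throwing makes the operator handle the error where it happens.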
[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5224: --- Attachment: MAPREDUCE-5224.3.patch
[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662400#comment-13662400 ] Xi Fang commented on MAPREDUCE-5224: Sorry for the formatting! The system mangled my text because of the special symbols.
[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662398#comment-13662398 ] Xi Fang commented on MAPREDUCE-5224: Hi Ivan, I was addressing your fourth comment, and I have one question. There are two methods: - /** * Grab the local fs name */ public synchronized String getFilesystemName() throws IOException { if (fs == null) { throw new IllegalStateException("FileSystem object not available yet"); } return fs.getUri().toString(); } - /** * Get JobTracker's FileSystem. This is the filesystem for mapred.system.dir. */ FileSystem getFileSystem() { return fs; } I am a little bit confused. For getFileSystem() it is clear: we still return the systemDir's file system, so we should change this fs to systemDirFs, which I omitted in my previous patch. For getFilesystemName(), what does fs stand for in this context: the default FS or the systemDir's file system? I guess it denotes the latter. Right? Thanks
[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661415#comment-13661415 ] Xi Fang commented on MAPREDUCE-5224: Thanks Ivan for the comments. That is of great help! I will check my code:)
[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661206#comment-13661206 ] Xi Fang commented on MAPREDUCE-5224: Thanks Chuan.
[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13656644#comment-13656644 ] Xi Fang commented on MAPREDUCE-5224: I updated the patch. It includes the unit test. I also made some changes to MAPREDUCE-5224.patch because the previous one was incomplete. In many places, the default file system is used to access the system directory, not only in getSystemDir(). Thus, this requires many more changes to the original code. I still have some questions I am not quite sure about. 1. In the constructor of JobTracker:
{code}
try {
  FileStatus systemDirStatus = systemDirFs.getFileStatus(systemDir);
  if (!systemDirStatus.isOwnedByUser(
      mrOwner.getShortUserName(), mrOwner.getGroupNames())) {
    throw new AccessControlException("The systemdir " + systemDir +
        " is not owned by " + mrOwner.getShortUserName());
  }
  if (!systemDirStatus.getPermission().equals(SYSTEM_DIR_PERMISSION)) {
    LOG.warn("Incorrect permissions on " + systemDir +
        ". Setting it to " + SYSTEM_DIR_PERMISSION);
    systemDirFs.setPermission(systemDir, new FsPermission(SYSTEM_DIR_PERMISSION));
  }
}
{code}
Basically, I have changed the file system used to access the system dir. But I am not quite sure if I should change the two IF statements, because the file permission might be a problem. 2. LocalJobRunner has a method getSystemDir() as well. It uses the default file system to access the system directory.
{code}
public String getSystemDir() {
  Path sysDir = new Path(conf.get("mapred.system.dir",
      "/tmp/hadoop/mapred/system"));
  return fs.makeQualified(sysDir).toString();
}
{code}
I am not quite sure if I need to change this as well. Thanks!
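The two IF statements in the constructor reduce to an ownership check plus a permission reset. A minimal stand-alone sketch, using plain strings instead of Hadoop's FileStatus/FsPermission types; the rwx string below is a placeholder, not the real SYSTEM_DIR_PERMISSION value, which is an FsPermission constant in JobTracker:

```java
public class SystemDirCheck {
    // Placeholder for JobTracker's SYSTEM_DIR_PERMISSION, written as a
    // plain rwx string; the actual Hadoop constant is an FsPermission.
    static final String SYSTEM_DIR_PERMISSION = "rwx------";

    // Mirrors the two IF statements: reject a system dir that is not owned
    // by mrOwner, and reset any permission that differs from the required one.
    // Owner and permission are plain strings; this is a sketch, not the patch.
    static String check(String owner, String mrOwner, String permission) {
        if (!owner.equals(mrOwner)) {
            throw new SecurityException(
                "The systemdir is not owned by " + mrOwner);
        }
        return permission.equals(SYSTEM_DIR_PERMISSION)
                ? permission
                : SYSTEM_DIR_PERMISSION; // "Setting it to" the required value
    }
}
```

The open question in the comment is only *which filesystem handle* (systemDirFs vs. the default fs) performs these checks; the check logic itself is unchanged by the patch.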
[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5224: --- Attachment: MAPREDUCE-5224.2.patch
[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653473#comment-13653473 ] Xi Fang commented on MAPREDUCE-5224: The original motivation of this JIRA is to fix the following scenario. In Azure, the default file system is set as ASV (Windows Azure Blob Storage), but we would still like the system directory to be in DFS, because we don't want to put such files in ASV, which would charge Azure customers fees. Thus, we want to change JobTracker.java to allow that. The problem in the current JobTracker.java is that we use makeQualified() to assemble a path, but getSystemDir() calls fs.makeQualified() on the wrong fs object if the default file system (e.g. ASV in our scenario) and "mapred.system.dir" use different file systems. In the proposed fix, we rely on FileSystem.get() to choose the appropriate file system according to "mapred.system.dir", falling back on the default file system if the scheme is not there. Although the original motivation was to fix the problem for Azure, this fix also applies to other scenarios where the default file system and "mapred.system.dir" are supposed to use different file systems. A unit test will follow.
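The scheme-fallback behavior described in the comment above (picking the filesystem from the path's own scheme, else falling back to the default) can be sketched with java.net.URI. This mimics, rather than uses, Hadoop's FileSystem.get() and fs.makeQualified(); the asv default-FS URI is a made-up example:

```java
import java.net.URI;

public class SchemeFallback {
    // Hypothetical default filesystem, standing in for fs.default.name.
    static final URI DEFAULT_FS = URI.create("asv://container@account/");

    // Mimics how a filesystem is picked for mapred.system.dir: use the
    // path's own scheme when present, otherwise fall back to the default
    // filesystem's scheme.
    static String schemeFor(String sysDir) {
        URI uri = URI.create(sysDir);
        return uri.getScheme() != null ? uri.getScheme() : DEFAULT_FS.getScheme();
    }

    // Mimics fs.makeQualified(): a scheme-less path is qualified against the
    // given filesystem's URI, so qualifying against the wrong fs silently
    // yields a path on the wrong filesystem -- the bug this JIRA fixes.
    static String makeQualified(URI fsUri, String path) {
        URI u = URI.create(path);
        return u.getScheme() != null ? path : fsUri.resolve(path).toString();
    }
}
```

For example, a scheme-less "/tmp/hadoop/mapred/system" qualified against the asv default lands on asv://, while a fully qualified hdfs:// system dir is left on HDFS regardless of the default filesystem.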
[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5224: --- Attachment: MAPREDUCE-5224.patch
[jira] [Work started] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on MAPREDUCE-5224 started by Xi Fang.
[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5224: --- Attachment: (was: MAPREDUCE-5224.1.patch)
[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5224: --- Attachment: MAPREDUCE-5224.1.patch
[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5224: --- Attachment: (was: MAPREDUCE-5224.patch)
[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5224: --- Attachment: MAPREDUCE-5224.patch
[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5224: --- Description: JobTracker today expects the system directory to be in the default file system if (fs == null) { fs = mrOwner.doAs(new PrivilegedExceptionAction() { public FileSystem run() throws IOException { return FileSystem.get(conf); }}); } ... public String getSystemDir() { Path sysDir = new Path(conf.get("mapred.system.dir", "/tmp/hadoop/mapred/system")); return fs.makeQualified(sysDir).toString(); } In Cloud like Azure the default file system is set as ASV (Windows Azure Blob Storage), but we would still like the system directory to be in DFS. We should change JobTracker to allow that. was: JobTracker today expects the system directory to be in the default file system if (fs == null) { fs = mrOwner.doAs(new PrivilegedExceptionAction() { public FileSystem run() throws IOException { return FileSystem.get(conf); }}); } ... public String getSystemDir() { Path sysDir = new Path(conf.get("mapred.system.dir", "/tmp/hadoop/mapred/system")); return fs.makeQualified(sysDir).toString(); } In Cloud like Azure the default file system is set as ASV (Windows Azure Blob Storage), but we would still like the system directory to be in DFS. We should change JobTracker to allow that.
[jira] [Created] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
Xi Fang created MAPREDUCE-5224: -- Summary: JobTracker should allow the system directory to be in non-default FS Key: MAPREDUCE-5224 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Xi Fang Assignee: Xi Fang Priority: Minor Fix For: 1-win JobTracker today expects the system directory to be in the default file system if (fs == null) { fs = mrOwner.doAs(new PrivilegedExceptionAction() { public FileSystem run() throws IOException { return FileSystem.get(conf); }}); } ... public String getSystemDir() { Path sysDir = new Path(conf.get("mapred.system.dir", "/tmp/hadoop/mapred/system")); return fs.makeQualified(sysDir).toString(); } In Cloud like Azure the default file system is set as ASV (Windows Azure Blob Storage), but we would still like the system directory to be in DFS. We should change JobTracker to allow that.