[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-11-07 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816210#comment-13816210
 ] 

Xi Fang commented on MAPREDUCE-5508:


One way to confirm that is to set mapred.jobtracker.completeuserjobs.maximum 
= 0 and run some jobs. After all the jobs are done, wait for a while and then 
check the number of FileSystem objects in FileSystem#Cache.

> JobTracker memory leak caused by unreleased FileSystem objects in 
> JobInProgress#cleanupJob
> --
>
> Key: MAPREDUCE-5508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1-win, 1.2.1
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Critical
> Fix For: 1-win, 1.3.0
>
> Attachments: CleanupQueue.java, JobInProgress.java, 
> MAPREDUCE-5508.1.patch, MAPREDUCE-5508.2.patch, MAPREDUCE-5508.3.patch, 
> MAPREDUCE-5508.patch
>
>
> MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
> object (see "tempDirFs") that is not properly released.
> {code}
> // JobInProgress#cleanupJob()
> void cleanupJob() {
>   ...
>   tempDirFs = jobTempDirPath.getFileSystem(conf);
>   CleanupQueue.getInstance().addToQueue(
>       new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
>   ...
>   if (tempDirFs != fs) {
>     try {
>       fs.close();
>     } catch (IOException ie) {
>       ...
>     }
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-10-21 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800914#comment-13800914
 ] 

Xi Fang commented on MAPREDUCE-5508:


Thanks Chris and viswanathan. I think the three patches are what you need. This 
won't affect the production environment because it is purely back-end work; I 
don't think users will notice any difference.



[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-23 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776023#comment-13776023
 ] 

Xi Fang commented on MAPREDUCE-5508:


Thanks Chris and Sandy!


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-23 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13775904#comment-13775904
 ] 

Xi Fang commented on MAPREDUCE-5508:


Thanks Chris and Sandy. I just finished the large-scale test and didn't find a 
memory leak. I removed the tabs and attached a new patch.

So Chris, do you think we should file a new Jira for the idempotent 
implementation?





[jira] [Updated] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-23 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5508:
---

Attachment: MAPREDUCE-5508.3.patch



[jira] [Updated] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-22 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5508:
---

Attachment: MAPREDUCE-5508.2.patch



[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-22 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774322#comment-13774322
 ] 

Xi Fang commented on MAPREDUCE-5508:


Thanks Chris. I attached a new patch and will launch a large scale test 
tomorrow.



[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-22 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774280#comment-13774280
 ] 

Xi Fang commented on MAPREDUCE-5508:


[~cnauroth], thanks for your comments. 
bq. Swallowing the InterruptedException is problematic if any upstream code 
depends on seeing the thread's interrupted status, so let's restore the 
interrupted status in the catch block by calling 
Thread.currentThread().interrupt().

If we call Thread.currentThread().interrupt(), is it possible that fs won't be 
closed in JobInProgress#cleanupJob()?

bq. If there is an InterruptedException, then we currently would pass a null 
tempDirFs to the CleanupQueue, where we'd once again risk leaking memory. I 
suggest that if there is an InterruptedException, then we skip adding to the 
CleanupQueue and log a warning. This is consistent with the error-handling 
strategy in the rest of the method. (It logs warnings.)

I think if the answer to my first question is "fs will be closed in 
JobInProgress#cleanupJob()", there will be no memory leak. This is because even 
if we pass null into CleanupQueue, the new fs created in 
CleanupQueue#deletePath() would be closed anyway.

Thanks Chris.
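
By the way, the interrupt-restoration idiom Chris describes can be illustrated 
with plain JDK types (no Hadoop involved; the CountDownLatch here is just a 
stand-in for whatever blocking call cleanupJob makes):

```java
import java.util.concurrent.CountDownLatch;

public class RestoreInterrupt {
    // Catch InterruptedException but re-assert the interrupt flag so that
    // upstream code can still observe the thread's interrupted status.
    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();  // restore status instead of swallowing it
        }
    }

    static boolean demo() {
        Thread.currentThread().interrupt();   // simulate an incoming interrupt
        awaitQuietly(new CountDownLatch(1));  // await() throws immediately; flag is restored
        return Thread.interrupted();          // true: callers can still see the interrupt
    }

    public static void main(String[] args) {
        System.out.println(demo());  // true
    }
}
```

Without the interrupt() call in the catch block, demo() would return false and 
upstream code could never tell the thread had been interrupted.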



[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-19 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772073#comment-13772073
 ] 

Xi Fang commented on MAPREDUCE-5508:


I set both staging and system dirs to hdfs on my test cluster. I ran 35,000 job 
submissions and manually checked the number of DistributedFileSystem objects. 
No memory leak related to DistributedFileSystem was found.



[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-18 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771276#comment-13771276
 ] 

Xi Fang commented on MAPREDUCE-5508:


Thanks Chris and Sandy. I made a draft patch for the proposal. I am thinking we 
should still pass "tempDirFs" into PathDeletionContext instead of passing "fs", 
in order to handle the case where fs has already been closed by someone else. 
Although tempDirFs might differ from fs because of the different-Subject problem 
discussed above, in most cases they will be the same (I used "userUGI" to get 
tempDirFs), so this is still an optimization. Let me know your comments. Thanks.



[jira] [Updated] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-18 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5508:
---

Attachment: MAPREDUCE-5508.1.patch



[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-15 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767847#comment-13767847
 ] 

Xi Fang commented on MAPREDUCE-5508:


Thanks Chris for filing HDFS-5211. That sounds good to me:)



[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-14 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767702#comment-13767702
 ] 

Xi Fang commented on MAPREDUCE-5508:


Thanks [~sandyr] and [~cnauroth]. Actually, the above discussion gave me second 
thoughts on the attached patch. There is a race condition here. Suppose that 
Path#getFileSystem in CleanupQueue#deletePath retrieved the same instance as 
JobInProgress#fs from FileSystem#Cache. Because of the race between 
DistributedFileSystem#close() and FileSystem#close(), it is possible that, just 
after JobInProgress#cleanupJob closed JobInProgress#fs's DFSClient, execution 
switched to CleanupQueue#deletePath and called fs.delete(). Because this fs's 
DFSClient has been closed, an exception would be thrown and the staging 
directory would not be deleted.
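
The failure mode can be sketched deterministically with a toy stand-in (this is 
not the real DFSClient API; the class and path names below are illustrative 
only, and the real bug depends on thread timing rather than this fixed ordering):

```java
import java.io.IOException;

public class CloseRaceSketch {
    // Toy stand-in for a cached FileSystem wrapping one shared client. The real
    // DFSClient similarly rejects operations after close(); everything here is
    // a simplified model, not Hadoop's actual code.
    static class SharedClient {
        private boolean closed = false;
        void close() { closed = true; }
        void delete(String path) throws IOException {
            if (closed) throw new IOException("Filesystem closed");
            System.out.println("deleted " + path);
        }
    }

    // Models the unlucky interleaving: cleanupJob closes the shared instance
    // just before deletePath runs against the very same cached instance.
    static String demo() {
        SharedClient shared = new SharedClient();
        shared.close();                       // JobInProgress#cleanupJob: fs.close()
        try {
            shared.delete("/staging/job_1");  // CleanupQueue#deletePath: fs.delete()
            return "deleted";
        } catch (IOException e) {
            return "staging dir left behind: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());  // staging dir left behind: Filesystem closed
    }
}
```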





[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-14 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767525#comment-13767525
 ] 

Xi Fang commented on MAPREDUCE-5508:


Thanks Sandy for the information on HADOOP-6670. I think we still need to close 
fs anyway, because p.getFileSystem(conf) in CleanupQueue#deletePath may not be 
able to find the FileSystem#Cache entry for JobInProgress#fs, due to the 
different-Subject problem we discussed above. In that case, nothing will ever 
remove JobInProgress#fs from the FileSystem#Cache.



[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-14 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767402#comment-13767402
 ] 

Xi Fang commented on MAPREDUCE-5508:


Just found Chris was also working on this thread :). I agree with Chris: 
changing the hash code could have a wide impact on existing code, which would 
be risky.



[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-14 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767401#comment-13767401
 ] 

Xi Fang commented on MAPREDUCE-5508:


[~sandyr] Thanks for your comments.

bq. Have you tested this fix.

Yes. We have tested this fix on our test cluster (about 130,000 job 
submissions). After the workflow was done, we waited a couple of minutes (while 
jobs were retiring), forced a GC, and then dumped the memory. We manually 
checked the FileSystem#Cache. There was no memory leak.

bq. For your analysis 

1. I agree that "it doesn't appear that tempDirFs and fs are ever even ending 
up equal because tempDirFs is created with the wrong UGI."
2. I think tempDir is fine because 1) JobInProgress#cleanupJob won't introduce 
a file system instance for tempDir, and 2) the fs in CleanupQueue#deletePath is 
reused (i.e. only one instance exists in FileSystem#Cache). My initial thought 
was that this part had a memory leak, but a test showed there is no problem 
here.
3. The actual problem is
{code}
tempDirFs = jobTempDirPath.getFileSystem(conf);
{code}
This call "MAY" (explained below) put a new entry in FileSystem#Cache. Note 
that it eventually goes into UserGroupInformation#getCurrentUser to get a UGI 
with the current AccessControlContext. CleanupQueue#deletePath won't close this 
entry because a different UGI (the "userUGI" created in JobInProgress) is used 
there. Here is the tricky part, which [~cnauroth], [~vinodkv], and I discussed 
at length: although we may only have one current user, the following code "MAY" 
return different Subject instances.
{code}
static UserGroupInformation getCurrentUser() throws IOException {
  AccessControlContext context = AccessController.getContext();
  Subject subject = Subject.getSubject(context);  // <-- may differ across calls
  ...
}
{code}
Because a FileSystem#Cache key is built from the identityHashCode of a Subject, 
a file system object created by "jobTempDirPath.getFileSystem(conf)" may not be 
found when this code runs again, even though the principal (the current user) 
is the same. This eventually leads to an unbounded number of file system 
instances in FileSystem#Cache, and nothing ever removes them from the cache.
 
Please let me know if you have any questions. 
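
The identity-keyed cache growth described in point 3 can be reproduced with 
plain JDK types. The Key class below is a simplified stand-in for 
FileSystem.Cache.Key, not Hadoop's actual implementation; it only mirrors the 
identity-based Subject comparison:

```java
import javax.security.auth.Subject;
import java.util.HashMap;
import java.util.Map;

public class IdentityKeyDemo {
    // Simplified stand-in for FileSystem.Cache.Key: like Hadoop's key, it
    // compares the Subject by identity (== / identityHashCode), not by contents.
    static final class Key {
        final String scheme;
        final Subject who;
        Key(String scheme, Subject who) { this.scheme = scheme; this.who = who; }
        @Override public int hashCode() {
            return scheme.hashCode() + System.identityHashCode(who);
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return scheme.equals(k.scheme) && who == k.who;  // identity comparison
        }
    }

    // Each call that manufactures a fresh Subject yields a new cache entry,
    // even though both Subjects represent the same principal.
    static int demo() {
        Map<Key, String> cache = new HashMap<>();
        Subject s1 = new Subject();  // same (empty) contents...
        Subject s2 = new Subject();  // ...but a distinct identity
        cache.put(new Key("hdfs", s1), "fs-1");
        cache.put(new Key("hdfs", s2), "fs-2");  // does NOT replace fs-1
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(demo());  // 2: one entry per distinct Subject
    }
}
```

With one such put per retired job and no matching remove, the map grows without 
bound, which is exactly the leak pattern observed in the JobTracker heap dumps.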



[jira] [Updated] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-13 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5508:
---

Summary: JobTracker memory leak caused by unreleased FileSystem objects in 
JobInProgress#cleanupJob  (was: Memory leak caused by unreleased FileSystem 
objects in JobInProgress#cleanupJob)



[jira] [Work started] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-13 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-5508 started by Xi Fang.

> Memory leak caused by unreleased FileSystem objects in 
> JobInProgress#cleanupJob
> ---
>
> Key: MAPREDUCE-5508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1-win
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Critical
> Attachments: MAPREDUCE-5508.patch
>
>
> MAPREDUCE-5351 fixed a memory leak problem but introducing another filesystem 
> object (see "tempDirFs") that is not properly released.
> {code} JobInProgress#cleanupJob()
>   void cleanupJob() {
> ...
>   tempDirFs = jobTempDirPath.getFileSystem(conf);
>   CleanupQueue.getInstance().addToQueue(
>   new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
> ...
>  if (tempDirFs != fs) {
>   try {
> fs.close();
>   } catch (IOException ie) {
> ...
> }
> {code}



[jira] [Updated] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-13 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5508:
---

Attachment: MAPREDUCE-5508.patch

> Memory leak caused by unreleased FileSystem objects in 
> JobInProgress#cleanupJob
> ---
>
> Key: MAPREDUCE-5508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1-win
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Critical
> Attachments: MAPREDUCE-5508.patch
>
>
> MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
> object (see "tempDirFs") that is not properly released.
> {code} JobInProgress#cleanupJob()
>   void cleanupJob() {
> ...
>   tempDirFs = jobTempDirPath.getFileSystem(conf);
>   CleanupQueue.getInstance().addToQueue(
>   new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
> ...
>  if (tempDirFs != fs) {
>   try {
> fs.close();
>   } catch (IOException ie) {
> ...
> }
> {code}



[jira] [Updated] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-13 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5508:
---

Description: 
MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
object (see "tempDirFs") that is not properly released.
{code} JobInProgress#cleanupJob()

  void cleanupJob() {
...
  tempDirFs = jobTempDirPath.getFileSystem(conf);
  CleanupQueue.getInstance().addToQueue(
  new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
...
 if (tempDirFs != fs) {
  try {
fs.close();
  } catch (IOException ie) {
...
}
{code}


  was:
MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
object (see "tempDirFs") that is not properly released.
{code} JobInProgress#cleanupJob()

  void cleanupJob() {
...
  tempDirFs = jobTempDirPath.getFileSystem(conf);
  CleanupQueue.getInstance().addToQueue(
  new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
...
{code}



> Memory leak caused by unreleased FileSystem objects in 
> JobInProgress#cleanupJob
> ---
>
> Key: MAPREDUCE-5508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1-win
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Critical
>
> MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
> object (see "tempDirFs") that is not properly released.
> {code} JobInProgress#cleanupJob()
>   void cleanupJob() {
> ...
>   tempDirFs = jobTempDirPath.getFileSystem(conf);
>   CleanupQueue.getInstance().addToQueue(
>   new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
> ...
>  if (tempDirFs != fs) {
>   try {
> fs.close();
>   } catch (IOException ie) {
> ...
> }
> {code}



[jira] [Commented] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-13 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767210#comment-13767210
 ] 

Xi Fang commented on MAPREDUCE-5508:


This bug was found in Microsoft's large-scale test with about 200,000 job 
submissions. The JobTracker's memory usage grew steadily.

There was a long discussion between Hortonworks (thanks [~cnauroth] and 
[~vinodkv]) and Microsoft on this issue. Here is a summary of that discussion.

1. The heap dumps show DistributedFileSystem instances that are referenced 
only from the cache's HashMap entries. Since nothing else holds a reference, 
nothing else can ever attempt to close them, and therefore they will never be 
removed from the cache.

2. The special check for "tempDirFs" (see the code in the description) in the 
patch for MAPREDUCE-5351 is intended as an optimization so that CleanupQueue 
doesn't need to immediately reopen a FileSystem that was just closed. However, 
we observed different identity hash code values for the Subject used in the 
cache key. The code assumes that CleanupQueue will find the same Subject that 
was used inside JobInProgress. Unfortunately, this is not guaranteed, because 
we may have crossed into a different access control context at this point, via 
UserGroupInformation#doAs. Even though it is conceptually the same user, the 
Subject is a function of the current AccessControlContext:
{code}
  public synchronized
  static UserGroupInformation getCurrentUser() throws IOException {
AccessControlContext context = AccessController.getContext();
Subject subject = Subject.getSubject(context);
{code}
Even if the contexts are logically equivalent between JobInProgress and 
CleanupQueue, we see no guarantee that Java will give you the same Subject 
instance, which is required for successful lookup in the FileSystem cache 
(because of the use of identity hash code).
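The identity-hash-code pitfall described above can be reproduced with plain JDK classes. The sketch below is a hypothetical stand-in for the FileSystem cache key (the real key also involves scheme, authority, and the UGI); the class and field names here are illustrative only:

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.Subject;

// Hypothetical stand-in for the FileSystem cache key: like the real cache,
// it compares the Subject by identity, not by logical equality.
class CacheKey {
    final Subject subject;
    CacheKey(Subject s) { this.subject = s; }
    @Override public int hashCode() { return System.identityHashCode(subject); }
    @Override public boolean equals(Object o) {
        return o instanceof CacheKey && ((CacheKey) o).subject == subject;
    }
}

public class IdentityKeyDemo {
    public static void main(String[] args) {
        Map<CacheKey, String> cache = new HashMap<>();
        Subject opened = new Subject();   // Subject seen inside JobInProgress
        Subject cleanup = new Subject();  // logically the same user, but a new instance
        cache.put(new CacheKey(opened), "fs-instance");
        // A lookup with a different Subject instance misses, so CleanupQueue
        // would open (and cache) a brand-new FileSystem object instead.
        System.out.println(cache.get(new CacheKey(cleanup))); // prints "null"
        System.out.println(cache.get(new CacheKey(opened)));  // prints "fs-instance"
    }
}
```

Because the miss leaves the original entry in place, each such cleanup leaks one cached instance.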

The fix is to abandon this optimization and close the FileSystem within the 
same AccessControlContext that opened it.
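A minimal sketch of that idea using only JDK classes (not the committed patch; FakeFileSystem and closeAs are made-up names): the resource is closed inside Subject.doAs, i.e. under the same Subject that opened it, so the cache lookup sees the same instance.

```java
import java.io.Closeable;
import java.io.IOException;
import java.security.PrivilegedAction;
import javax.security.auth.Subject;

// Made-up stand-in for a cached FileSystem handle.
class FakeFileSystem implements Closeable {
    boolean closed = false;
    @Override public void close() { closed = true; }
}

public class CloseInContext {
    // Close the resource while running as the given Subject, mirroring the idea
    // of closing a FileSystem inside the AccessControlContext that opened it.
    static void closeAs(Subject subject, Closeable resource) {
        Subject.doAs(subject, (PrivilegedAction<Void>) () -> {
            try {
                resource.close();
            } catch (IOException ignored) {
                // a real implementation would log this
            }
            return null;
        });
    }

    public static void main(String[] args) {
        FakeFileSystem fs = new FakeFileSystem();
        closeAs(new Subject(), fs);
        System.out.println(fs.closed); // prints "true"
    }
}
```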


> Memory leak caused by unreleased FileSystem objects in 
> JobInProgress#cleanupJob
> ---
>
> Key: MAPREDUCE-5508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1-win
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Critical
>
> MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
> object that is not properly released.
> {code} JobInProgress#cleanupJob()
>   void cleanupJob() {
> ...
>   tempDirFs = jobTempDirPath.getFileSystem(conf);
>   CleanupQueue.getInstance().addToQueue(
>   new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
> ...
> {code}



[jira] [Updated] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-13 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5508:
---

Description: 
MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
object (see "tempDirFs") that is not properly released.
{code} JobInProgress#cleanupJob()

  void cleanupJob() {
...
  tempDirFs = jobTempDirPath.getFileSystem(conf);
  CleanupQueue.getInstance().addToQueue(
  new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
...
{code}


  was:
MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
object that is not properly released.
{code} JobInProgress#cleanupJob()

  void cleanupJob() {
...
  tempDirFs = jobTempDirPath.getFileSystem(conf);
  CleanupQueue.getInstance().addToQueue(
  new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
...
{code}



> Memory leak caused by unreleased FileSystem objects in 
> JobInProgress#cleanupJob
> ---
>
> Key: MAPREDUCE-5508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1-win
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Critical
>
> MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
> object (see "tempDirFs") that is not properly released.
> {code} JobInProgress#cleanupJob()
>   void cleanupJob() {
> ...
>   tempDirFs = jobTempDirPath.getFileSystem(conf);
>   CleanupQueue.getInstance().addToQueue(
>   new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
> ...
> {code}



[jira] [Updated] (MAPREDUCE-5508) Memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-13 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5508:
---

Summary: Memory leak caused by unreleased FileSystem objects in 
JobInProgress#cleanupJob  (was: Memory Leak caused by unreleased FileSystem 
objects in JobInProgress#cleanupJob)

> Memory leak caused by unreleased FileSystem objects in 
> JobInProgress#cleanupJob
> ---
>
> Key: MAPREDUCE-5508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1-win
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Critical
>
> MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
> object that is not properly released.
> {code} JobInProgress#cleanupJob()
>   void cleanupJob() {
> ...
>   tempDirFs = jobTempDirPath.getFileSystem(conf);
>   CleanupQueue.getInstance().addToQueue(
>   new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
> ...
> {code}



[jira] [Created] (MAPREDUCE-5508) Memory Leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-13 Thread Xi Fang (JIRA)
Xi Fang created MAPREDUCE-5508:
--

 Summary: Memory Leak caused by unreleased FileSystem objects in 
JobInProgress#cleanupJob
 Key: MAPREDUCE-5508
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1-win
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Critical


MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
object that is not properly released.
{code} JobInProgress#cleanupJob()

  void cleanupJob() {
...
  tempDirFs = jobTempDirPath.getFileSystem(conf);
  CleanupQueue.getInstance().addToQueue(
  new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
...
{code}




[jira] [Commented] (MAPREDUCE-5405) Job recovery can fail if task log directory symlink from prior run still exists

2013-07-19 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13714194#comment-13714194
 ] 

Xi Fang commented on MAPREDUCE-5405:


Sounds good to me! I also did some tests on Ubuntu and Windows. It passes 
consistently. Thanks Chris. 

> Job recovery can fail if task log directory symlink from prior run still 
> exists
> ---
>
> Key: MAPREDUCE-5405
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5405
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: 1-win, 1.3.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: MAPREDUCE-5405.branch-1.1.patch
>
>
> During recovery, the task attempt log dir symlink from the prior run might 
> still exist.  If it does, then the recovered attempt will fail while trying 
> to create a symlink at that path.



[jira] [Commented] (MAPREDUCE-5391) TestNonLocalJobJarSubmission fails on Windows due to missing classpath entries

2013-07-15 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708988#comment-13708988
 ] 

Xi Fang commented on MAPREDUCE-5391:


Thanks Chris. The patch looks good to me!

> TestNonLocalJobJarSubmission fails on Windows due to missing classpath entries
> --
>
> Key: MAPREDUCE-5391
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5391
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1-win
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: MAPREDUCE-5391.1.patch
>
>
> This test works by having the mapper check all classpath entries loaded by 
> the classloader.  On Windows, the classpath is packed into an intermediate 
> jar file with a manifest containing the classpath, to work around the 
> command-line length limitation.  The test needs to be updated to unpack the intermediate 
> jar file and read the manifest when running on Windows.



[jira] [Commented] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS

2013-07-09 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703695#comment-13703695
 ] 

Xi Fang commented on MAPREDUCE-5278:


Thanks, Chris

> Distributed cache is broken when JT staging dir is not on the default FS
> 
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5278.2.patch, MAPREDUCE-5278.3.patch, 
> MAPREDUCE-5278.4.patch, MAPREDUCE-5278.5.patch, MAPREDUCE-5278.patch
>
>
> Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
> set to point to HDFS, even though other file systems (e.g. the Amazon S3 file 
> system and the Windows ASV file system) are the default file systems.
> For ASV, this configuration was chosen for a few reasons:
> 1. It prevents leaking the storage account credentials to the user's storage 
> account;
> 2. It uses HDFS for transient job files, which is good for two reasons: a) it 
> does not flood the user's storage account with irrelevant data/files; b) it 
> leverages HDFS locality for small files.
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through the files/archives/libjars 
> Hadoop generic options), they are copied to the JobTracker staging dir only 
> if they reside on a file system different from the JobTracker's. Later on, 
> this path is used as a "key" to cache the files locally on the TaskTracker's 
> machine and to avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In this configuration the caching is completely disabled, and we always end up 
> copying dist cache files to the JobTracker's staging dir first and 
> localizing them on the TaskTracker machine second.
> This is especially bad for Oozie scenarios, as Oozie uses the dist cache to 
> distribute Hive/Pig jars throughout the cluster.
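The copy decision described above can be sketched as a scheme/authority comparison between the file's URI and the staging dir's file system. This is a simplified illustration, not Hadoop's actual code; needsCopy is a hypothetical helper:

```java
import java.net.URI;

public class DistCacheCopyCheck {
    // Simplified sketch: a dist-cache file is copied into the JobTracker
    // staging dir only when it lives on a file system different from the
    // staging dir's, compared here by URI scheme and authority.
    static boolean needsCopy(URI file, URI stagingFs) {
        boolean sameScheme = String.valueOf(file.getScheme())
                .equalsIgnoreCase(String.valueOf(stagingFs.getScheme()));
        boolean sameAuthority = String.valueOf(file.getAuthority())
                .equalsIgnoreCase(String.valueOf(stagingFs.getAuthority()));
        return !(sameScheme && sameAuthority);
    }

    public static void main(String[] args) {
        URI staging = URI.create("hdfs://nn:8020/tmp/staging");
        // A file on a non-default FS (e.g. ASV) gets copied -- and the staging
        // path then becomes the cache key, defeating caching across jobs.
        System.out.println(needsCopy(URI.create("asv://account/container/pig.jar"), staging)); // true
        System.out.println(needsCopy(URI.create("hdfs://nn:8020/user/libs/pig.jar"), staging)); // false
    }
}
```

Because the copied-to staging path differs per job, the TaskTracker never sees the same "key" twice, which is why localization happens every time in this configuration.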



[jira] [Commented] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS

2013-07-09 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703619#comment-13703619
 ] 

Xi Fang commented on MAPREDUCE-5278:


Thanks, Chris. A new patch has been attached. 

> Distributed cache is broken when JT staging dir is not on the default FS
> 
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5278.2.patch, MAPREDUCE-5278.3.patch, 
> MAPREDUCE-5278.4.patch, MAPREDUCE-5278.5.patch, MAPREDUCE-5278.patch
>
>
> Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
> set to point to HDFS, even though other file systems (e.g. the Amazon S3 file 
> system and the Windows ASV file system) are the default file systems.
> For ASV, this configuration was chosen for a few reasons:
> 1. It prevents leaking the storage account credentials to the user's storage 
> account;
> 2. It uses HDFS for transient job files, which is good for two reasons: a) it 
> does not flood the user's storage account with irrelevant data/files; b) it 
> leverages HDFS locality for small files.
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through the files/archives/libjars 
> Hadoop generic options), they are copied to the JobTracker staging dir only 
> if they reside on a file system different from the JobTracker's. Later on, 
> this path is used as a "key" to cache the files locally on the TaskTracker's 
> machine and to avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In this configuration the caching is completely disabled, and we always end up 
> copying dist cache files to the JobTracker's staging dir first and 
> localizing them on the TaskTracker machine second.
> This is especially bad for Oozie scenarios, as Oozie uses the dist cache to 
> distribute Hive/Pig jars throughout the cluster.



[jira] [Updated] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS

2013-07-09 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5278:
---

Attachment: MAPREDUCE-5278.5.patch

> Distributed cache is broken when JT staging dir is not on the default FS
> 
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5278.2.patch, MAPREDUCE-5278.3.patch, 
> MAPREDUCE-5278.4.patch, MAPREDUCE-5278.5.patch, MAPREDUCE-5278.patch
>
>
> Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
> set to point to HDFS, even though other file systems (e.g. the Amazon S3 file 
> system and the Windows ASV file system) are the default file systems.
> For ASV, this configuration was chosen for a few reasons:
> 1. It prevents leaking the storage account credentials to the user's storage 
> account;
> 2. It uses HDFS for transient job files, which is good for two reasons: a) it 
> does not flood the user's storage account with irrelevant data/files; b) it 
> leverages HDFS locality for small files.
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through the files/archives/libjars 
> Hadoop generic options), they are copied to the JobTracker staging dir only 
> if they reside on a file system different from the JobTracker's. Later on, 
> this path is used as a "key" to cache the files locally on the TaskTracker's 
> machine and to avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In this configuration the caching is completely disabled, and we always end up 
> copying dist cache files to the JobTracker's staging dir first and 
> localizing them on the TaskTracker machine second.
> This is especially bad for Oozie scenarios, as Oozie uses the dist cache to 
> distribute Hive/Pig jars throughout the cluster.



[jira] [Commented] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users

2013-07-08 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702176#comment-13702176
 ] 

Xi Fang commented on MAPREDUCE-5371:


Thanks Chris!

> TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of 
> windows users
> ---
>
> Key: MAPREDUCE-5371
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5371
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Minor
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5371.patch
>
>
> The error message was:
> Error Message
> expected:<[sijenkins-vm2]jenkins> but was:<[]jenkins>
> Stacktrace
> at 
> org.apache.hadoop.security.TestProxyUserFromEnv.testProxyUserFromEnvironment(TestProxyUserFromEnv.java:45)
> The root cause of this failure is the domain prefix included in Windows user names.



[jira] [Commented] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS

2013-07-03 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699544#comment-13699544
 ] 

Xi Fang commented on MAPREDUCE-5278:


Thanks Bikas. A new patch was attached. 

> Distributed cache is broken when JT staging dir is not on the default FS
> 
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5278.2.patch, MAPREDUCE-5278.3.patch, 
> MAPREDUCE-5278.4.patch, MAPREDUCE-5278.patch
>
>
> Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
> set to point to HDFS, even though other file systems (e.g. the Amazon S3 file 
> system and the Windows ASV file system) are the default file systems.
> For ASV, this configuration was chosen for a few reasons:
> 1. It prevents leaking the storage account credentials to the user's storage 
> account;
> 2. It uses HDFS for transient job files, which is good for two reasons: a) it 
> does not flood the user's storage account with irrelevant data/files; b) it 
> leverages HDFS locality for small files.
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through the files/archives/libjars 
> Hadoop generic options), they are copied to the JobTracker staging dir only 
> if they reside on a file system different from the JobTracker's. Later on, 
> this path is used as a "key" to cache the files locally on the TaskTracker's 
> machine and to avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In this configuration the caching is completely disabled, and we always end up 
> copying dist cache files to the JobTracker's staging dir first and 
> localizing them on the TaskTracker machine second.
> This is especially bad for Oozie scenarios, as Oozie uses the dist cache to 
> distribute Hive/Pig jars throughout the cluster.



[jira] [Updated] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS

2013-07-03 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5278:
---

Attachment: MAPREDUCE-5278.4.patch

> Distributed cache is broken when JT staging dir is not on the default FS
> 
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5278.2.patch, MAPREDUCE-5278.3.patch, 
> MAPREDUCE-5278.4.patch, MAPREDUCE-5278.patch
>
>
> Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
> set to point to HDFS, even though other file systems (e.g. the Amazon S3 file 
> system and the Windows ASV file system) are the default file systems.
> For ASV, this configuration was chosen for a few reasons:
> 1. It prevents leaking the storage account credentials to the user's storage 
> account;
> 2. It uses HDFS for transient job files, which is good for two reasons: a) it 
> does not flood the user's storage account with irrelevant data/files; b) it 
> leverages HDFS locality for small files.
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through the files/archives/libjars 
> Hadoop generic options), they are copied to the JobTracker staging dir only 
> if they reside on a file system different from the JobTracker's. Later on, 
> this path is used as a "key" to cache the files locally on the TaskTracker's 
> machine and to avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In this configuration the caching is completely disabled, and we always end up 
> copying dist cache files to the JobTracker's staging dir first and 
> localizing them on the TaskTracker machine second.
> This is especially bad for Oozie scenarios, as Oozie uses the dist cache to 
> distribute Hive/Pig jars throughout the cluster.



[jira] [Commented] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users

2013-07-02 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698240#comment-13698240
 ] 

Xi Fang commented on MAPREDUCE-5371:


The attached patch removed the domains from user names.
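The idea can be illustrated with a small made-up helper (stripDomain is not the patch's actual code): the Windows "DOMAIN\" prefix is dropped so only the short user name is compared.

```java
public class StripDomain {
    // Hypothetical helper mirroring the idea in the patch: drop a Windows
    // "DOMAIN\" prefix, if present, before comparing user names.
    static String stripDomain(String user) {
        int i = user.indexOf('\\');
        return i >= 0 ? user.substring(i + 1) : user;
    }

    public static void main(String[] args) {
        System.out.println(stripDomain("sijenkins-vm2\\jenkins")); // prints "jenkins"
        System.out.println(stripDomain("jenkins"));                // prints "jenkins"
    }
}
```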

> TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of 
> windows users
> ---
>
> Key: MAPREDUCE-5371
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5371
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Minor
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5371.patch
>
>
> The error message was:
> Error Message
> expected:<[sijenkins-vm2]jenkins> but was:<[]jenkins>
> Stacktrace
> at 
> org.apache.hadoop.security.TestProxyUserFromEnv.testProxyUserFromEnvironment(TestProxyUserFromEnv.java:45)
> The root cause of this failure is the domain prefix included in Windows user names.



[jira] [Work started] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users

2013-07-02 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-5371 started by Xi Fang.

> TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of 
> windows users
> ---
>
> Key: MAPREDUCE-5371
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5371
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Minor
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5371.patch
>
>
> The error message was:
> Error Message
> expected:<[sijenkins-vm2]jenkins> but was:<[]jenkins>
> Stacktrace
> at 
> org.apache.hadoop.security.TestProxyUserFromEnv.testProxyUserFromEnvironment(TestProxyUserFromEnv.java:45)
> The root cause of this failure is the domain prefix included in Windows user names.



[jira] [Updated] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users

2013-07-02 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5371:
---

Attachment: MAPREDUCE-5371.patch

> TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of 
> windows users
> ---
>
> Key: MAPREDUCE-5371
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5371
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Minor
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5371.patch
>
>
> The error message was:
> Error Message
> expected:<[sijenkins-vm2]jenkins> but was:<[]jenkins>
> Stacktrace
> at 
> org.apache.hadoop.security.TestProxyUserFromEnv.testProxyUserFromEnvironment(TestProxyUserFromEnv.java:45)
> The root cause of this failure is the domain prefix included in Windows user names.



[jira] [Created] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users

2013-07-02 Thread Xi Fang (JIRA)
Xi Fang created MAPREDUCE-5371:
--

 Summary: TestProxyUserFromEnv#testProxyUserFromEnvironment failed 
caused by domains of windows users
 Key: MAPREDUCE-5371
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5371
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win
 Environment: Windows
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win


The error message was:
Error Message
expected:<[sijenkins-vm2]jenkins> but was:<[]jenkins>
Stacktrace
at 
org.apache.hadoop.security.TestProxyUserFromEnv.testProxyUserFromEnvironment(TestProxyUserFromEnv.java:45)

The root cause of this failure is the domain used on Windows.
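The mismatch can be illustrated with a small, hypothetical helper (not the actual fix): on Windows the environment-derived name typically carries a machine or domain prefix such as "sijenkins-vm2\jenkins", while the other side of the comparison sees the bare name "jenkins".

```java
// Hypothetical sketch: normalize a Windows-style "DOMAIN\user" name to the
// bare user name before comparing it with the name reported elsewhere.
// The actual patch may fix the test differently.
public class DomainStrip {
  static String stripDomain(String name) {
    int sep = name.indexOf('\\');
    // "sijenkins-vm2\jenkins" -> "jenkins"; names without a domain pass through
    return sep >= 0 ? name.substring(sep + 1) : name;
  }
}
```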

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5330) JVM manager should not forcefully kill the process on Signal.TERM on Windows

2013-07-02 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698205#comment-13698205
 ] 

Xi Fang commented on MAPREDUCE-5330:


Thanks Ivan and Chris!

> JVM manager should not forcefully kill the process on Signal.TERM on Windows
> 
>
> Key: MAPREDUCE-5330
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5330
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5330.patch
>
>
> In MapReduce, we sometimes kill a task's JVM before it naturally shuts down 
> if we want to launch other tasks (look in 
> JvmManager$JvmManagerForType.reapJvm). This behavior means that if the map 
> task process is in the middle of doing some cleanup/finalization after the 
> task is done, it might be interrupted/killed without being given a chance to 
> finish. 
> In Microsoft's Hadoop Service, after a Map/Reduce task is done and while file 
> systems are being closed in a special shutdown hook, we're typically uploading 
> storage (ASV in our context) usage metrics to Microsoft Azure Tables. So if 
> this kill happens, these metrics get lost. The impact is that for many MR jobs 
> we don't see accurate metrics reported most of the time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5109) Job view-acl should apply to job listing too

2013-06-27 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13695116#comment-13695116
 ] 

Xi Fang commented on MAPREDUCE-5109:


Hi Vinod, thanks for your patch. If Hadoop runs with this patch on Windows, 
there would be a problem because a file name can't contain "*" on Windows. 
After discussing with Chris, we have two proposals specifically for Windows:

1. Use an entirely different wildcard character on Windows (for example: using 
"!" instead of "*")
2. Add an encoder and a decoder specifically for "*" in 
JobHistory#encodeJobHistoryFileName() and decodeJobHistoryFileName() 
respectively, on Windows. For example, we can encode "*" to "%20F". In this 
case, getNewJobHistoryFileName should also be changed accordingly. 

Do you have any suggestion on these two options?
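Option 2 could be sketched roughly as follows (the class name and the "%2A" escape are illustrative assumptions; the actual JobHistory escape scheme may differ):

```java
// Illustrative encoder/decoder for option 2: replace "*" with a percent-style
// escape so the history file name is legal on Windows. "%" itself is escaped
// first so decoding is unambiguous.
public class WildcardEscape {
  static String encode(String name) {
    return name.replace("%", "%25").replace("*", "%2A");
  }
  static String decode(String name) {
    // reverse order of encode(): restore "*" before collapsing "%25" back to "%"
    return name.replace("%2A", "*").replace("%25", "%");
  }
}
```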

> Job view-acl should apply to job listing too
> 
>
> Key: MAPREDUCE-5109
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5109
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Arun C Murthy
>Assignee: Vinod Kumar Vavilapalli
> Attachments: MAPREDUCE-5109-20130405.2.txt
>
>
> Job view-acl should apply to job listing too, currently it only applies to 
> job details pages.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5330) Killing M/R JVM's leads to metrics not being uploaded

2013-06-18 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687346#comment-13687346
 ] 

Xi Fang commented on MAPREDUCE-5330:


When Signal.TERM is sent to a process, we wait for a delay before a hard kill. 
But on Windows the signal kind is ignored and the process group is killed 
immediately (see Shell#getSignalKillProcessGroupCommand()):
{code}
  public static String[] getSignalKillProcessGroupCommand(int code,
  String groupId) {
if (WINDOWS) {
  return new String[] { Shell.WINUTILS, "task", "kill", groupId };
} else {
  return new String[] { "kill", "-" + code , "-" + groupId };
}
  }
{code}

Here is a fix: if the OS is Windows and the signal is TERM, return immediately 
and let a delayed process killer kill the process group later. This gives the 
process group a grace period to clean up after itself.
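The proposed behavior can be sketched in isolation (class and method names below are hypothetical, not the actual patch; the real change lives in the JvmManager/Shell kill path):

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Self-contained sketch of the proposed fix: on Windows a TERM request does
// not kill the process group immediately; the hard kill is deferred so the
// task JVM's shutdown hooks get a grace period to run.
public class GracefulTermSketch {
  enum Signal { TERM, KILL }

  static final boolean WINDOWS = true; // stand-in for Shell.WINDOWS

  /** True when the immediate kill should be skipped in favor of a delayed one. */
  static boolean deferKill(Signal signal) {
    return WINDOWS && signal == Signal.TERM;
  }

  /** Runs the hard kill now, or schedules it after a grace period on Windows+TERM. */
  static void killProcessGroup(Signal signal, Runnable hardKill,
                               ScheduledExecutorService scheduler, long graceMillis) {
    if (deferKill(signal)) {
      scheduler.schedule(hardKill, graceMillis, TimeUnit.MILLISECONDS);
    } else {
      hardKill.run();
    }
  }
}
```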

> Killing M/R JVM's leads to metrics not being uploaded
> -
>
> Key: MAPREDUCE-5330
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5330
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
> Attachments: MAPREDUCE-5330.patch
>
>
> In MapReduce, we sometimes kill a task's JVM before it naturally shuts down 
> if we want to launch other tasks (look in 
> JvmManager$JvmManagerForType.reapJvm). This behavior means that if the map 
> task process is in the middle of doing some cleanup/finalization after the 
> task is done, it might be interrupted/killed without being given a chance to 
> finish. 
> In Microsoft's Hadoop Service, after a Map/Reduce task is done and while file 
> systems are being closed in a special shutdown hook, we're typically uploading 
> storage (ASV in our context) usage metrics to Microsoft Azure Tables. So if 
> this kill happens, these metrics get lost. The impact is that for many MR jobs 
> we don't see accurate metrics reported most of the time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5330) Killing M/R JVM's leads to metrics not being uploaded

2013-06-18 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5330:
---

Attachment: MAPREDUCE-5330.patch

> Killing M/R JVM's leads to metrics not being uploaded
> -
>
> Key: MAPREDUCE-5330
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5330
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
> Attachments: MAPREDUCE-5330.patch
>
>
> In MapReduce, we sometimes kill a task's JVM before it naturally shuts down 
> if we want to launch other tasks (look in 
> JvmManager$JvmManagerForType.reapJvm). This behavior means that if the map 
> task process is in the middle of doing some cleanup/finalization after the 
> task is done, it might be interrupted/killed without being given a chance to 
> finish. 
> In Microsoft's Hadoop Service, after a Map/Reduce task is done and while file 
> systems are being closed in a special shutdown hook, we're typically uploading 
> storage (ASV in our context) usage metrics to Microsoft Azure Tables. So if 
> this kill happens, these metrics get lost. The impact is that for many MR jobs 
> we don't see accurate metrics reported most of the time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-5330) Killing M/R JVM's leads to metrics not being uploaded

2013-06-18 Thread Xi Fang (JIRA)
Xi Fang created MAPREDUCE-5330:
--

 Summary: Killing M/R JVM's leads to metrics not being uploaded
 Key: MAPREDUCE-5330
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5330
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win
 Environment: Windows
Reporter: Xi Fang
Assignee: Xi Fang


In MapReduce, we sometimes kill a task's JVM before it naturally shuts down if 
we want to launch other tasks (look in JvmManager$JvmManagerForType.reapJvm). 
This behavior means that if the map task process is in the middle of doing some 
cleanup/finalization after the task is done, it might be interrupted/killed 
without being given a chance to finish. 

In Microsoft's Hadoop Service, after a Map/Reduce task is done and while file 
systems are being closed in a special shutdown hook, we're typically uploading 
storage (ASV in our context) usage metrics to Microsoft Azure Tables. So if 
this kill happens, these metrics get lost. The impact is that for many MR jobs 
we don't see accurate metrics reported most of the time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS

2013-06-17 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686233#comment-13686233
 ] 

Xi Fang commented on MAPREDUCE-5278:


Thanks Bikas. A config key was added in JobClient.java:
{code}
private static final String CLIENT_ACCESSIBLE_REMOTE_SCHEMES_KEY =
    "mapreduce.client.accessible.remote.schemes";
{code}
And in copyRemoteFiles(), I changed it to:
{code}
String [] accessibleSchemes = job.getStrings(
    CLIENT_ACCESSIBLE_REMOTE_SCHEMES_KEY, null);
{code}
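A minimal sketch of how the client-side decision could work (the helper below is illustrative, not the patch itself; only the config key and the job.getStrings() call above are from the patch):

```java
import java.net.URI;
import java.util.Arrays;

// Illustrative sketch: decide whether a dist-cache path on a remote FS can be
// used directly instead of being copied to the JobTracker staging dir. A
// scheme listed in "mapreduce.client.accessible.remote.schemes" is assumed to
// be reachable from all cluster nodes.
public class AccessibleSchemeCheck {
  // accessibleSchemes is what job.getStrings(CLIENT_ACCESSIBLE_REMOTE_SCHEMES_KEY,
  // null) would return: null when the property is unset.
  static boolean isAccessibleRemotePath(URI path, String[] accessibleSchemes) {
    if (accessibleSchemes == null) {
      return false; // property unset: keep the old copy-to-staging behavior
    }
    String scheme = path.getScheme();
    if (scheme == null) {
      return false; // relative path: resolved against the default FS anyway
    }
    return Arrays.stream(accessibleSchemes)
        .anyMatch(s -> s.equalsIgnoreCase(scheme));
  }
}
```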

> Distributed cache is broken when JT staging dir is not on the default FS
> 
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5278.2.patch, MAPREDUCE-5278.3.patch, 
> MAPREDUCE-5278.patch
>
>
> Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
> set to point to HDFS, even though other file systems (e.g. Amazon S3 file 
> system and Windows ASV file system) are the default file systems.
> For ASV, this config was chosen and there are a few reasons why:
> 1. To prevent leak of the storage account credentials to the user's storage 
> account; 
> 2. It uses HDFS for the transient job files, which is good for two reasons – a) 
> it does not flood the user's storage account with irrelevant data/files b) it 
> leverages HDFS locality for small files
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through files/archives/libjars 
> hadoop generic options), they are copied to the job tracker staging dir only 
> if they reside on a file system different from the jobtracker's. Later on, 
> this path is used as a "key" to cache the files locally on the tasktracker's 
> machine, and avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In this configuration the caching is completely disabled and we always end up 
> copying dist cache files to the job tracker's staging dir first and 
> localizing them on the task tracker machine second.
> This is especially not good for Oozie scenarios as Oozie uses dist cache to 
> populate Hive/Pig jars throughout the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS

2013-06-17 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5278:
---

Attachment: MAPREDUCE-5278.3.patch

> Distributed cache is broken when JT staging dir is not on the default FS
> 
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5278.2.patch, MAPREDUCE-5278.3.patch, 
> MAPREDUCE-5278.patch
>
>
> Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
> set to point to HDFS, even though other file systems (e.g. Amazon S3 file 
> system and Windows ASV file system) are the default file systems.
> For ASV, this config was chosen and there are a few reasons why:
> 1. To prevent leak of the storage account credentials to the user's storage 
> account; 
> 2. It uses HDFS for the transient job files, which is good for two reasons – a) 
> it does not flood the user's storage account with irrelevant data/files b) it 
> leverages HDFS locality for small files
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through files/archives/libjars 
> hadoop generic options), they are copied to the job tracker staging dir only 
> if they reside on a file system different from the jobtracker's. Later on, 
> this path is used as a "key" to cache the files locally on the tasktracker's 
> machine, and avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In this configuration the caching is completely disabled and we always end up 
> copying dist cache files to the job tracker's staging dir first and 
> localizing them on the task tracker machine second.
> This is especially not good for Oozie scenarios as Oozie uses dist cache to 
> populate Hive/Pig jars throughout the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS

2013-06-17 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686060#comment-13686060
 ] 

Xi Fang commented on MAPREDUCE-5278:


Thanks Bikas for your comments. For your question: "Is the following code 
(marked below) continuing to copy stuff to the default fs (fs) when the newPath 
points to a different filesystem?"

No. Basically, the original code does this: if the JT staging dir is not on the 
default FS (for example, in our context it is ASV), copyRemoteFiles() will copy 
files from ASV to the JT staging dir. Note that these files are specified using 
generic options. After our change, when ASV is marked as "accessible" by 
specifying "mapreduce.client.accessible.remote.schemes", copyRemoteFiles() 
won't copy the files from ASV to the jobtracker. It just directly returns the 
path of that file, denoted by "newPath". In addition, no copy operation happens 
in addArchiveToClassPath().

> Distributed cache is broken when JT staging dir is not on the default FS
> 
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5278.2.patch, MAPREDUCE-5278.patch
>
>
> Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
> set to point to HDFS, even though other file systems (e.g. Amazon S3 file 
> system and Windows ASV file system) are the default file systems.
> For ASV, this config was chosen and there are a few reasons why:
> 1. To prevent leak of the storage account credentials to the user's storage 
> account; 
> 2. It uses HDFS for the transient job files, which is good for two reasons – a) 
> it does not flood the user's storage account with irrelevant data/files b) it 
> leverages HDFS locality for small files
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through files/archives/libjars 
> hadoop generic options), they are copied to the job tracker staging dir only 
> if they reside on a file system different from the jobtracker's. Later on, 
> this path is used as a "key" to cache the files locally on the tasktracker's 
> machine, and avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In this configuration the caching is completely disabled and we always end up 
> copying dist cache files to the job tracker's staging dir first and 
> localizing them on the task tracker machine second.
> This is especially not good for Oozie scenarios as Oozie uses dist cache to 
> populate Hive/Pig jars throughout the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS

2013-06-10 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13680075#comment-13680075
 ] 

Xi Fang commented on MAPREDUCE-5278:


Thanks Ivan. I have added a classpath check and am preparing a trunk version. 

> Distributed cache is broken when JT staging dir is not on the default FS
> 
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5278.2.patch, MAPREDUCE-5278.patch
>
>
> Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
> set to point to HDFS, even though other file systems (e.g. Amazon S3 file 
> system and Windows ASV file system) are the default file systems.
> For ASV, this config was chosen and there are a few reasons why:
> 1. To prevent leak of the storage account credentials to the user's storage 
> account; 
> 2. It uses HDFS for the transient job files, which is good for two reasons – a) 
> it does not flood the user's storage account with irrelevant data/files b) it 
> leverages HDFS locality for small files
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through files/archives/libjars 
> hadoop generic options), they are copied to the job tracker staging dir only 
> if they reside on a file system different from the jobtracker's. Later on, 
> this path is used as a "key" to cache the files locally on the tasktracker's 
> machine, and avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In this configuration the caching is completely disabled and we always end up 
> copying dist cache files to the job tracker's staging dir first and 
> localizing them on the task tracker machine second.
> This is especially not good for Oozie scenarios as Oozie uses dist cache to 
> populate Hive/Pig jars throughout the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS

2013-06-10 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5278:
---

Attachment: MAPREDUCE-5278.2.patch

> Distributed cache is broken when JT staging dir is not on the default FS
> 
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5278.2.patch, MAPREDUCE-5278.patch
>
>
> Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
> set to point to HDFS, even though other file systems (e.g. Amazon S3 file 
> system and Windows ASV file system) are the default file systems.
> For ASV, this config was chosen and there are a few reasons why:
> 1. To prevent leak of the storage account credentials to the user's storage 
> account; 
> 2. It uses HDFS for the transient job files, which is good for two reasons – a) 
> it does not flood the user's storage account with irrelevant data/files b) it 
> leverages HDFS locality for small files
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through files/archives/libjars 
> hadoop generic options), they are copied to the job tracker staging dir only 
> if they reside on a file system different from the jobtracker's. Later on, 
> this path is used as a "key" to cache the files locally on the tasktracker's 
> machine, and avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In this configuration the caching is completely disabled and we always end up 
> copying dist cache files to the job tracker's staging dir first and 
> localizing them on the task tracker machine second.
> This is especially not good for Oozie scenarios as Oozie uses dist cache to 
> populate Hive/Pig jars throughout the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS

2013-06-05 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5278:
---

   Fix Version/s: 1-win
Target Version/s: 1-win
  Status: Patch Available  (was: In Progress)

> Perf: Distributed cache is broken when JT staging dir is not on the default FS
> --
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5278.patch
>
>
> Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
> set to point to HDFS, even though other file systems (e.g. Amazon S3 file 
> system and Windows ASV file system) are the default file systems.
> For ASV, this config was chosen and there are a few reasons why:
> 1. To prevent leak of the storage account credentials to the user's storage 
> account; 
> 2. It uses HDFS for the transient job files, which is good for two reasons – a) 
> it does not flood the user's storage account with irrelevant data/files b) it 
> leverages HDFS locality for small files
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through files/archives/libjars 
> hadoop generic options), they are copied to the job tracker staging dir only 
> if they reside on a file system different from the jobtracker's. Later on, 
> this path is used as a "key" to cache the files locally on the tasktracker's 
> machine, and avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In this configuration the caching is completely disabled and we always end up 
> copying dist cache files to the job tracker's staging dir first and 
> localizing them on the task tracker machine second.
> This is especially not good for Oozie scenarios as Oozie uses dist cache to 
> populate Hive/Pig jars throughout the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS

2013-06-05 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5278:
---

Attachment: MAPREDUCE-5278.patch

A patch is attached. 

In this patch, we added a property called 
"mapreduce.client.accessible.remote.schemes". It specifies the schemes of the 
file systems that are accessible from all the nodes in the cluster. The job 
client uses it to avoid copying distributed cache entries to the job staging 
dir if the path is already accessible (see JobClient#copyRemoteFiles()).

For example, on Windows Azure, a path that has ASV as its scheme is accessible 
from all the nodes in the cluster, so "mapreduce.client.accessible.remote.schemes" 
can be set to "ASV". 

The change in this patch is passive, meaning that it won't take effect unless 
this property is enabled through configuration. 
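For illustration, the property might be enabled in mapred-site.xml like this (the value is an example; it should list the scheme(s) reachable from every node in the cluster):

```xml
<property>
  <name>mapreduce.client.accessible.remote.schemes</name>
  <!-- Example: mark the ASV file system as reachable from all cluster nodes -->
  <value>asv</value>
</property>
```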

> Perf: Distributed cache is broken when JT staging dir is not on the default FS
> --
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
> Attachments: MAPREDUCE-5278.patch
>
>
> Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
> set to point to HDFS, even though other file systems (e.g. Amazon S3 file 
> system and Windows ASV file system) are the default file systems.
> For ASV, this config was chosen and there are a few reasons why:
> 1. To prevent leak of the storage account credentials to the user's storage 
> account; 
> 2. It uses HDFS for the transient job files, which is good for two reasons – a) 
> it does not flood the user's storage account with irrelevant data/files b) it 
> leverages HDFS locality for small files
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through files/archives/libjars 
> hadoop generic options), they are copied to the job tracker staging dir only 
> if they reside on a file system different from the jobtracker's. Later on, 
> this path is used as a "key" to cache the files locally on the tasktracker's 
> machine, and avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In this configuration the caching is completely disabled and we always end up 
> copying dist cache files to the job tracker's staging dir first and 
> localizing them on the task tracker machine second.
> This is especially not good for Oozie scenarios as Oozie uses dist cache to 
> populate Hive/Pig jars throughout the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Work started] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS

2013-06-05 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-5278 started by Xi Fang.

> Perf: Distributed cache is broken when JT staging dir is not on the default FS
> --
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
> Attachments: MAPREDUCE-5278.patch
>
>
> Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
> set to point to HDFS, even though other file systems (e.g. Amazon S3 file 
> system and Windows ASV file system) are the default file systems.
> For ASV, this config was chosen and there are a few reasons why:
> 1. To prevent leak of the storage account credentials to the user's storage 
> account; 
> 2. It uses HDFS for the transient job files, which is good for two reasons – a) 
> it does not flood the user's storage account with irrelevant data/files b) it 
> leverages HDFS locality for small files
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through files/archives/libjars 
> hadoop generic options), they are copied to the job tracker staging dir only 
> if they reside on a file system different from the jobtracker's. Later on, 
> this path is used as a "key" to cache the files locally on the tasktracker's 
> machine, and avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In this configuration the caching is completely disabled and we always end up 
> copying dist cache files to the job tracker's staging dir first and 
> localizing them on the task tracker machine second.
> This is especially not good for Oozie scenarios as Oozie uses dist cache to 
> populate Hive/Pig jars throughout the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS

2013-05-28 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5278:
---

Description: 
Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
set to point to HDFS, even though other file systems (e.g. Amazon S3 file 
system and Windows ASV file system) are the default file systems.

For ASV, this config was chosen and there are a few reasons why:

1. To prevent leak of the storage account credentials to the user's storage 
account; 
2. It uses HDFS for the transient job files, which is good for two reasons – a) 
it does not flood the user's storage account with irrelevant data/files b) it 
leverages HDFS locality for small files

However, this approach conflicts with how distributed cache caching works, 
completely negating the feature's functionality.

When files are added to the distributed cache (through files/archives/libjars 
hadoop generic options), they are copied to the job tracker staging dir only if 
they reside on a file system different from the jobtracker's. Later on, this 
path is used as a "key" to cache the files locally on the tasktracker's 
machine, and avoid localization (download/unzip) of the distributed cache files 
if they are already localized.

In this configuration the caching is completely disabled and we always end up 
copying dist cache files to the job tracker's staging dir first and localizing 
them on the task tracker machine second.

This is especially not good for Oozie scenarios as Oozie uses dist cache to 
populate Hive/Pig jars throughout the cluster.
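The copy decision described above can be sketched as follows. This is a simplified, self-contained model of the check performed around JobClient#copyRemoteFiles, not the real Hadoop code: URI scheme/authority comparison stands in for the actual FileSystem comparison, and all host names and paths are illustrative.

```java
import java.net.URI;

// Simplified model of the dist-cache copy decision: a file is copied
// into the JT staging dir only when it lives on a file system other
// than the staging dir's. The copy also changes the path that serves
// as the cache "key", defeating tasktracker-side localization caching.
public class DistCacheCopyDecision {

    static boolean eq(String a, String b) {
        return a == null ? b == null : a.equalsIgnoreCase(b);
    }

    // Stand-in for comparing the two FileSystem objects.
    static boolean sameFileSystem(URI a, URI b) {
        return eq(a.getScheme(), b.getScheme()) && eq(a.getAuthority(), b.getAuthority());
    }

    // true => the file is re-copied on every submission.
    public static boolean mustCopyToStaging(URI cacheFile, URI stagingDir) {
        return !sameFileSystem(cacheFile, stagingDir);
    }

    public static void main(String[] args) {
        URI staging = URI.create("hdfs://namenode:8020/user/hadoop/.staging");
        // Already on the staging FS: left in place, cache key stable.
        System.out.println(mustCopyToStaging(
            URI.create("hdfs://namenode:8020/libs/pig.jar"), staging)); // false
        // On ASV while staging is on HDFS: copied for every job.
        System.out.println(mustCopyToStaging(
            URI.create("asv://account/container/libs/pig.jar"), staging)); // true
    }
}
```

This is why pointing the staging dir at HDFS while user files live on ASV/S3 disables caching: the two sides of the comparison never match, so every submission triggers a copy.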



  was:
Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
set to point to HDFS, even though another file system (e.g. the Amazon S3 file 
system or the Windows ASV file system) may be the default file system. For ASV, 
this config was chosen for a few reasons:

1. To prevent leak of the storage account credentials to the user's storage account. 
2. It uses HDFS for the transient job files, which is good for two reasons: a) 
it does not flood the user's storage account with irrelevant data/files, and b) 
it leverages HDFS locality for small files.

However, this approach conflicts with how distributed cache caching works, 
completely negating the feature's functionality.

When files are added to the distributed cache (through the files/archives/libjars 
hadoop generic options), they are copied to the job tracker staging dir only if 
they reside on a file system different from the jobtracker's. Later on, this 
path is used as a "key" to cache the files locally on the tasktracker's 
machine and to avoid localization (download/unzip) of the distributed cache files 
if they are already localized.

In this configuration the caching is completely disabled and we always end up 
copying dist cache files to the job tracker's staging dir first and localizing 
them on the task tracker machine second.

This is especially not good for Oozie scenarios as Oozie uses dist cache to 
populate Hive/Pig jars throughout the cluster.




> Perf: Distributed cache is broken when JT staging dir is not on the default FS
> --
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
>
> Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
> set to point to HDFS, even though another file system (e.g. the Amazon S3 file 
> system or the Windows ASV file system) may be the default file system.
> For ASV, this config was chosen for a few reasons:
> 1. To prevent leak of the storage account credentials to the user's storage 
> account.
> 2. It uses HDFS for the transient job files, which is good for two reasons: a) 
> it does not flood the user's storage account with irrelevant data/files, and b) 
> it leverages HDFS locality for small files.
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through the files/archives/libjars 
> hadoop generic options), they are copied to the job tracker staging dir only 
> if they reside on a file system different from the jobtracker's. Later on, 
> this path is used as a "key" to cache the files locally on the tasktracker's 
> machine and to avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In this configuration the caching is completely disabled and we always end up 
> copying dist cache files to the job tracker's staging dir first and 
> localizing them on the task tracker machine second.
> This is especially not good for Oozie scenarios as Oozie uses dist cache to 
> populate Hive/Pig jars throughout the cluster.

[jira] [Updated] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS

2013-05-28 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5278:
---

Description: 
Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
set to point to HDFS, even though another file system (e.g. the Amazon S3 file 
system or the Windows ASV file system) may be the default file system. For ASV, 
this config was chosen for a few reasons:

1. To prevent leak of the storage account credentials to the user's storage account. 
2. It uses HDFS for the transient job files, which is good for two reasons: a) 
it does not flood the user's storage account with irrelevant data/files, and b) 
it leverages HDFS locality for small files.

However, this approach conflicts with how distributed cache caching works, 
completely negating the feature's functionality.

When files are added to the distributed cache (through the files/archives/libjars 
hadoop generic options), they are copied to the job tracker staging dir only if 
they reside on a file system different from the jobtracker's. Later on, this 
path is used as a "key" to cache the files locally on the tasktracker's 
machine and to avoid localization (download/unzip) of the distributed cache files 
if they are already localized.

In this configuration the caching is completely disabled and we always end up 
copying dist cache files to the job tracker's staging dir first and localizing 
them on the task tracker machine second.

This is especially not good for Oozie scenarios as Oozie uses dist cache to 
populate Hive/Pig jars throughout the cluster.



  was:
Today, we set the JobTracker staging dir 
("mapreduce.jobtracker.staging.root.dir") to point to HDFS even though ASV is 
the default file system. There are a few reasons why this config was chosen:

1. To prevent leak of the storage account credentials to the user's storage account 
(IOW, keep job.xml in the cluster). 
2. It uses HDFS for the transient job files, which is good for two reasons: a) 
it does not flood the user's storage account with irrelevant data/files, and b) 
it leverages HDFS locality for small files.

However, this approach conflicts with how distributed cache caching works, 
completely negating the feature's functionality.

When files are added to the distributed cache (through the files/archives/libjars 
hadoop generic options), they are copied to the job tracker staging dir only if 
they reside on a file system different from the jobtracker's. Later on, this 
path is used as a "key" to cache the files locally on the tasktracker's 
machine and to avoid localization (download/unzip) of the distributed cache files 
if they are already localized.

In our configuration the caching is completely disabled and we always end up 
copying dist cache files to the JT staging dir first and localizing them on the 
tasktracker machine second.

This is especially not good for Oozie scenarios as Oozie uses dist cache to 
populate Hive/Pig jars throughout the cluster.

Easy workaround is to config mapreduce.jobtracker.staging.root.dir in 
mapred-site.xml to be on the default FS.


> Perf: Distributed cache is broken when JT staging dir is not on the default FS
> --
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
>
> Today, the JobTracker staging dir ("mapreduce.jobtracker.staging.root.dir") is 
> set to point to HDFS, even though another file system (e.g. the Amazon S3 file 
> system or the Windows ASV file system) may be the default file system. For ASV, 
> this config was chosen for a few reasons:
> 1. To prevent leak of the storage account credentials to the user's storage 
> account.
> 2. It uses HDFS for the transient job files, which is good for two reasons: a) 
> it does not flood the user's storage account with irrelevant data/files, and b) 
> it leverages HDFS locality for small files.
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through the files/archives/libjars 
> hadoop generic options), they are copied to the job tracker staging dir only 
> if they reside on a file system different from the jobtracker's. Later on, 
> this path is used as a "key" to cache the files locally on the tasktracker's 
> machine and to avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In this configuration the caching is completely disabled and we always end up 
> copying dist cache files to the job tracker's staging dir first and 
> localizing them on the task tracker machine second.

[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-27 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668119#comment-13668119
 ] 

Xi Fang commented on MAPREDUCE-5224:


Thanks Ivan!

> JobTracker should allow the system directory to be in non-default FS
> 
>
> Key: MAPREDUCE-5224
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Minor
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.3.patch, 
> MAPREDUCE-5224.4.patch, MAPREDUCE-5224.5.patch, MAPREDUCE-5224.patch
>
>
>  JobTracker today expects the system directory to be in the default file 
> system
> if (fs == null) {
>   fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
> public FileSystem run() throws IOException {
>   return FileSystem.get(conf);
>   }});
> }
> ...
>   public String getSystemDir() {
> Path sysDir = new Path(conf.get("mapred.system.dir", 
> "/tmp/hadoop/mapred/system"));  
> return fs.makeQualified(sysDir).toString();
>   }
> In a cloud like Azure the default file system is set as ASV (Windows Azure Blob 
> Storage), but we would still like the system directory to be in DFS. We 
> should change JobTracker to allow that.
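The idea behind the fix above can be sketched as follows. This is a simplified, self-contained model, not the actual patch (the real change works on Hadoop's FileSystem/Path APIs): qualify mapred.system.dir against its own file system when the configured value already carries a scheme, and fall back to the default FS only for scheme-less paths. Host names and paths are illustrative.

```java
import java.net.URI;

// Sketch: a system dir configured with an explicit scheme
// (e.g. hdfs://...) keeps that file system, while a bare path is
// qualified against the default FS (e.g. ASV), mirroring what
// makeQualified does against a FileSystem in the real code.
public class SystemDirResolver {

    public static URI qualify(String sysDir, URI defaultFs) {
        URI u = URI.create(sysDir);
        if (u.getScheme() != null) {
            return u;                 // explicit scheme: stays on DFS
        }
        return defaultFs.resolve(u);  // bare path: lands on the default FS
    }

    public static void main(String[] args) {
        URI asv = URI.create("asv://account/");
        // Explicit HDFS system dir survives an ASV default FS.
        System.out.println(qualify("hdfs://nn:8020/mapred/system", asv));
        // Scheme-less default value is qualified against ASV.
        System.out.println(qualify("/tmp/hadoop/mapred/system", asv));
    }
}
```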



[jira] [Commented] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS

2013-05-27 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668117#comment-13668117
 ] 

Xi Fang commented on MAPREDUCE-5278:


Basically, if a remote file system is reachable from the task trackers, we don't 
have to copy the files on this file system to the job tracker's staging dir (see 
JobClient#copyRemoteFiles()). 

For example, in HDInsight, user storage would be ASV, which is different from 
HDFS. So by default these files would be copied to the JT staging dir. However, 
since ASV is supposed to be reachable from the tasktrackers, these copy 
operations are unnecessary, and they also disable the dist cache. A proposal is 
to add a configuration property (e.g. "mapred.tasktracker.scheme.accessible"). 
If we specify a scheme in this property, we won't do the copy operation even if 
this scheme is not equal to the scheme of the job tracker's staging dir. For 
example, in this context, mapred.tasktracker.scheme.accessible=ASV.
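The proposal above can be sketched as a small, self-contained model. Note that "mapred.tasktracker.scheme.accessible" is the property name proposed in this comment, not an existing Hadoop configuration key, and the URI-based comparison below merely stands in for the real FileSystem checks:

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

// Sketch of the proposed behavior: schemes listed in the (hypothetical)
// "mapred.tasktracker.scheme.accessible" property are reachable from all
// tasktrackers, so files on those file systems are never copied to the
// JT staging dir and the dist-cache key stays stable.
public class AccessibleSchemeCheck {

    private final Set<String> accessible = new HashSet<>();

    // confValue: comma-separated value of the proposed property, e.g. "asv".
    public AccessibleSchemeCheck(String confValue) {
        for (String s : confValue.split(",")) {
            String t = s.trim().toLowerCase();
            if (!t.isEmpty()) accessible.add(t);
        }
    }

    public boolean mustCopyToStaging(URI cacheFile, URI stagingDir) {
        String scheme = cacheFile.getScheme();
        if (scheme != null && accessible.contains(scheme.toLowerCase())) {
            return false; // reachable everywhere: leave in place, caching works
        }
        // Otherwise fall back to the existing rule: copy when the FS differs.
        return scheme == null || !scheme.equalsIgnoreCase(stagingDir.getScheme());
    }

    public static void main(String[] args) {
        AccessibleSchemeCheck check = new AccessibleSchemeCheck("asv");
        URI staging = URI.create("hdfs://namenode:8020/user/hadoop/.staging");
        // ASV is declared accessible: no copy despite the scheme mismatch.
        System.out.println(check.mustCopyToStaging(
            URI.create("asv://account/container/pig.jar"), staging)); // false
        // An undeclared scheme still follows the old rule and is copied.
        System.out.println(check.mustCopyToStaging(
            URI.create("s3://bucket/pig.jar"), staging)); // true
    }
}
```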

> Perf: Distributed cache is broken when JT staging dir is not on the default FS
> --
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
>
> Today, we set the JobTracker staging dir 
> ("mapreduce.jobtracker.staging.root.dir") to point to HDFS even though ASV is 
> the default file system. There are a few reasons why this config was chosen:
> 1. To prevent leak of the storage account credentials to the user's storage account 
> (IOW, keep job.xml in the cluster). 
> 2. It uses HDFS for the transient job files, which is good for two reasons: a) 
> it does not flood the user's storage account with irrelevant data/files, and b) it 
> leverages HDFS locality for small files.
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through the files/archives/libjars 
> hadoop generic options), they are copied to the job tracker staging dir only 
> if they reside on a file system different from the jobtracker's. Later on, 
> this path is used as a "key" to cache the files locally on the tasktracker's 
> machine and to avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In our configuration the caching is completely disabled and we always end up 
> copying dist cache files to the JT staging dir first and localizing them on 
> the tasktracker machine second.
> This is especially not good for Oozie scenarios as Oozie uses dist cache to 
> populate Hive/Pig jars throughout the cluster.
> Easy workaround is to config mapreduce.jobtracker.staging.root.dir in 
> mapred-site.xml to be on the default FS.



[jira] [Updated] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS

2013-05-27 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5278:
---

Description: 
Today, we set the JobTracker staging dir 
("mapreduce.jobtracker.staging.root.dir") to point to HDFS even though ASV is 
the default file system. There are a few reasons why this config was chosen:

1. To prevent leak of the storage account credentials to the user's storage account 
(IOW, keep job.xml in the cluster). 
2. It uses HDFS for the transient job files, which is good for two reasons: a) 
it does not flood the user's storage account with irrelevant data/files, and b) 
it leverages HDFS locality for small files.

However, this approach conflicts with how distributed cache caching works, 
completely negating the feature's functionality.

When files are added to the distributed cache (through the files/archives/libjars 
hadoop generic options), they are copied to the job tracker staging dir only if 
they reside on a file system different from the jobtracker's. Later on, this 
path is used as a "key" to cache the files locally on the tasktracker's 
machine and to avoid localization (download/unzip) of the distributed cache files 
if they are already localized.

In our configuration the caching is completely disabled and we always end up 
copying dist cache files to the JT staging dir first and localizing them on the 
tasktracker machine second.

This is especially not good for Oozie scenarios as Oozie uses dist cache to 
populate Hive/Pig jars throughout the cluster.

Easy workaround is to config mapreduce.jobtracker.staging.root.dir in 
mapred-site.xml to be on the default FS.

  was:
Today, we set the JobTracker staging dir 
("mapreduce.jobtracker.staging.root.dir") to point to HDFS even though ASV is 
the default file system. There are a few reasons why this config was chosen:
To prevent leak of the storage account credentials to the user's storage account 
(IOW, keep job.xml in the cluster). This is needed until HADOOP-444 is fixed.
It uses HDFS for the transient job files, which is good for two reasons: a) it 
does not flood the user's storage account with irrelevant data/files, and b) it 
leverages HDFS locality for small files.
However, this approach conflicts with how distributed cache caching works, 
completely negating the feature's functionality.
When files are added to the distributed cache (through the files/archives/libjars 
hadoop generic options), they are copied to the job tracker staging dir only if 
they reside on a file system different from the jobtracker's. Later on, this 
path is used as a "key" to cache the files locally on the tasktracker's 
machine and to avoid localization (download/unzip) of the distributed cache files 
if they are already localized.
In our configuration the caching is completely disabled and we always end up 
copying dist cache files to the JT staging dir first and localizing them on the 
tasktracker machine second.
This is especially not good for Oozie scenarios as Oozie uses dist cache to 
populate Hive/Pig jars throughout the cluster.
Easy workaround is to config mapreduce.jobtracker.staging.root.dir in 
mapred-site.xml to be on the default FS.


> Perf: Distributed cache is broken when JT staging dir is not on the default FS
> --
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
>
> Today, we set the JobTracker staging dir 
> ("mapreduce.jobtracker.staging.root.dir") to point to HDFS even though ASV is 
> the default file system. There are a few reasons why this config was chosen:
> 1. To prevent leak of the storage account credentials to the user's storage account 
> (IOW, keep job.xml in the cluster). 
> 2. It uses HDFS for the transient job files, which is good for two reasons: a) 
> it does not flood the user's storage account with irrelevant data/files, and b) 
> it leverages HDFS locality for small files.
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through the files/archives/libjars 
> hadoop generic options), they are copied to the job tracker staging dir only 
> if they reside on a file system different from the jobtracker's. Later on, 
> this path is used as a "key" to cache the files locally on the tasktracker's 
> machine and to avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In our configuration the caching is completely disabled and we always end up 
> copying dist cache files to the JT staging dir first and localizing them on 
> the tasktracker machine second.

[jira] [Updated] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS

2013-05-27 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5278:
---

Assignee: Xi Fang

> Perf: Distributed cache is broken when JT staging dir is not on the default FS
> --
>
> Key: MAPREDUCE-5278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache
>Affects Versions: 1-win
> Environment: Windows
>Reporter: Xi Fang
>Assignee: Xi Fang
>
> Today, we set the JobTracker staging dir 
> ("mapreduce.jobtracker.staging.root.dir") to point to HDFS even though ASV is 
> the default file system. There are a few reasons why this config was chosen:
> To prevent leak of the storage account credentials to the user's storage account 
> (IOW, keep job.xml in the cluster). This is needed until HADOOP-444 is fixed.
> It uses HDFS for the transient job files, which is good for two reasons: a) it 
> does not flood the user's storage account with irrelevant data/files, and b) it 
> leverages HDFS locality for small files.
> However, this approach conflicts with how distributed cache caching works, 
> completely negating the feature's functionality.
> When files are added to the distributed cache (through the files/archives/libjars 
> hadoop generic options), they are copied to the job tracker staging dir only 
> if they reside on a file system different from the jobtracker's. Later on, 
> this path is used as a "key" to cache the files locally on the tasktracker's 
> machine and to avoid localization (download/unzip) of the distributed cache 
> files if they are already localized.
> In our configuration the caching is completely disabled and we always end up 
> copying dist cache files to the JT staging dir first and localizing them on 
> the tasktracker machine second.
> This is especially not good for Oozie scenarios as Oozie uses dist cache to 
> populate Hive/Pig jars throughout the cluster.
> Easy workaround is to config mapreduce.jobtracker.staging.root.dir in 
> mapred-site.xml to be on the default FS.



[jira] [Created] (MAPREDUCE-5278) Perf: Distributed cache is broken when JT staging dir is not on the default FS

2013-05-27 Thread Xi Fang (JIRA)
Xi Fang created MAPREDUCE-5278:
--

 Summary: Perf: Distributed cache is broken when JT staging dir is 
not on the default FS
 Key: MAPREDUCE-5278
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache
Affects Versions: 1-win
 Environment: Windows
Reporter: Xi Fang


Today, we set the JobTracker staging dir 
("mapreduce.jobtracker.staging.root.dir") to point to HDFS even though ASV is 
the default file system. There are a few reasons why this config was chosen:
To prevent leak of the storage account credentials to the user's storage account 
(IOW, keep job.xml in the cluster). This is needed until HADOOP-444 is fixed.
It uses HDFS for the transient job files, which is good for two reasons: a) it 
does not flood the user's storage account with irrelevant data/files, and b) it 
leverages HDFS locality for small files.
However, this approach conflicts with how distributed cache caching works, 
completely negating the feature's functionality.
When files are added to the distributed cache (through the files/archives/libjars 
hadoop generic options), they are copied to the job tracker staging dir only if 
they reside on a file system different from the jobtracker's. Later on, this 
path is used as a "key" to cache the files locally on the tasktracker's 
machine and to avoid localization (download/unzip) of the distributed cache files 
if they are already localized.
In our configuration the caching is completely disabled and we always end up 
copying dist cache files to the JT staging dir first and localizing them on the 
tasktracker machine second.
This is especially not good for Oozie scenarios as Oozie uses dist cache to 
populate Hive/Pig jars throughout the cluster.
Easy workaround is to config mapreduce.jobtracker.staging.root.dir in 
mapred-site.xml to be on the default FS.



[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-26 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667437#comment-13667437
 ] 

Xi Fang commented on MAPREDUCE-5224:


Thanks, Ivan, for MAPREDUCE-5224.5.patch. 
Here is the reasoning (from offline emails with Ivan) behind posting this new patch:

1. Given that fs is indeed used in some other places, we have to account for 
that as well (these tests actually want to close the system dir fs). 
2. There is no need to use the default file system for the job history. There is 
another (orthogonal) bug here: the completed job history location also assumes 
the default FS, which is not correct. This should be a separate Jira. 
3. This would make the prod code change really simple.



> JobTracker should allow the system directory to be in non-default FS
> 
>
> Key: MAPREDUCE-5224
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Minor
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.3.patch, 
> MAPREDUCE-5224.4.patch, MAPREDUCE-5224.5.patch, MAPREDUCE-5224.patch
>
>
>  JobTracker today expects the system directory to be in the default file 
> system
> if (fs == null) {
>   fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
> public FileSystem run() throws IOException {
>   return FileSystem.get(conf);
>   }});
> }
> ...
>   public String getSystemDir() {
> Path sysDir = new Path(conf.get("mapred.system.dir", 
> "/tmp/hadoop/mapred/system"));  
> return fs.makeQualified(sysDir).toString();
>   }
> In a cloud like Azure the default file system is set as ASV (Windows Azure Blob 
> Storage), but we would still like the system directory to be in DFS. We 
> should change JobTracker to allow that.



[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-26 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5224:
---

Attachment: MAPREDUCE-5224.5.patch

> JobTracker should allow the system directory to be in non-default FS
> 
>
> Key: MAPREDUCE-5224
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Minor
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.3.patch, 
> MAPREDUCE-5224.4.patch, MAPREDUCE-5224.5.patch, MAPREDUCE-5224.patch
>
>
>  JobTracker today expects the system directory to be in the default file 
> system
> if (fs == null) {
>   fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
> public FileSystem run() throws IOException {
>   return FileSystem.get(conf);
>   }});
> }
> ...
>   public String getSystemDir() {
> Path sysDir = new Path(conf.get("mapred.system.dir", 
> "/tmp/hadoop/mapred/system"));  
> return fs.makeQualified(sysDir).toString();
>   }
> In a cloud like Azure the default file system is set as ASV (Windows Azure Blob 
> Storage), but we would still like the system directory to be in DFS. We 
> should change JobTracker to allow that.



[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-23 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5224:
---

Attachment: MAPREDUCE-5224.4.patch

The above comments have been addressed. Thanks.

BTW, I changed JobTracker#defaultFs back to fs, because some other code in the 
same package uses this "fs" (fs was originally defined with no access modifier).

> JobTracker should allow the system directory to be in non-default FS
> 
>
> Key: MAPREDUCE-5224
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Minor
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.3.patch, 
> MAPREDUCE-5224.4.patch, MAPREDUCE-5224.patch
>
>
>  JobTracker today expects the system directory to be in the default file 
> system
> if (fs == null) {
>   fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
> public FileSystem run() throws IOException {
>   return FileSystem.get(conf);
>   }});
> }
> ...
>   public String getSystemDir() {
> Path sysDir = new Path(conf.get("mapred.system.dir", 
> "/tmp/hadoop/mapred/system"));  
> return fs.makeQualified(sysDir).toString();
>   }
> In a cloud like Azure the default file system is set as ASV (Windows Azure Blob 
> Storage), but we would still like the system directory to be in DFS. We 
> should change JobTracker to allow that.



[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-23 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665679#comment-13665679
 ] 

Xi Fang commented on MAPREDUCE-5224:


Thanks Ivan for your detailed comments. These are of great help!

> JobTracker should allow the system directory to be in non-default FS
> 
>
> Key: MAPREDUCE-5224
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Minor
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.3.patch, 
> MAPREDUCE-5224.patch
>
>
>  JobTracker today expects the system directory to be in the default file 
> system
> if (fs == null) {
>   fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
> public FileSystem run() throws IOException {
>   return FileSystem.get(conf);
>   }});
> }
> ...
>   public String getSystemDir() {
> Path sysDir = new Path(conf.get("mapred.system.dir", 
> "/tmp/hadoop/mapred/system"));  
> return fs.makeQualified(sysDir).toString();
>   }
> In a cloud like Azure the default file system is set as ASV (Windows Azure Blob 
> Storage), but we would still like the system directory to be in DFS. We 
> should change JobTracker to allow that.



[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-21 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663457#comment-13663457
 ] 

Xi Fang commented on MAPREDUCE-5224:


Thanks, Chuan and Ivan.
1. For Chuan's comment: I added an assert to check that this dir indeed exists. 
This also addresses Ivan's 10th comment.
2. For Ivan's comments: 
a. For comments #1, 3, 5, 7, 9, and 10, I just followed the comments.
b. For comment #2: I personally think it would be better to throw an exception 
rather than swallow it and fall back to the default file system. As mentioned by 
Mostafa (offline), if someone configured the system dir as 
http://www.awesome.com/system, then with the fallback solution the exception 
saying "HTTP is not supported" would be swallowed and we would set the system 
directory to just /system on the default file system, which doesn't seem like 
good behavior. We may want someone to explicitly know about and handle this at 
the moment it happens.
c. For comment #4: I renamed JobTracker#fs to defaultFs and still keep it, just 
for possible future use/reference of this variable. 
d. For comments #6 and 8: I put the initialization of MiniDFSCluster and 
MiniMRCluster in the test case and let setUp() just construct a configuration. 
This way, we don't have to throw IOException in setUp(), and the test case 
would fail if my code changes are not applied.


> JobTracker should allow the system directory to be in non-default FS
> 
>
> Key: MAPREDUCE-5224
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Minor
> Fix For: 1-win
>
> Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.3.patch, 
> MAPREDUCE-5224.patch
>
>
>  JobTracker today expects the system directory to be in the default file 
> system
> if (fs == null) {
>   fs = mrOwner.doAs(new PrivilegedExceptionAction() {
> public FileSystem run() throws IOException {
>   return FileSystem.get(conf);
>   }});
> }
> ...
>   public String getSystemDir() {
> Path sysDir = new Path(conf.get("mapred.system.dir", 
> "/tmp/hadoop/mapred/system"));  
> return fs.makeQualified(sysDir).toString();
>   }
> In Cloud like Azure the default file system is set as ASV (Windows Azure Blob 
> Storage), but we would still like the system directory to be in DFS. We 
> should change JobTracker to allow that.



[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-21 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5224:
---

Attachment: MAPREDUCE-5224.3.patch



[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-20 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662400#comment-13662400
 ] 

Xi Fang commented on MAPREDUCE-5224:


Sorry for the formatting! JIRA mangled my text because of the special symbols. 



[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-20 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662398#comment-13662398
 ] 

Xi Fang commented on MAPREDUCE-5224:


Hi Ivan, I was addressing your fourth comment, and I have one question.
There are two methods:

  /**
   * Grab the local fs name
   */
  public synchronized String getFilesystemName() throws IOException {
    if (fs == null) {
      throw new IllegalStateException("FileSystem object not available yet");
    }
    return fs.getUri().toString();
  }

and

  /**
   * Get JobTracker's FileSystem. This is the filesystem for mapred.system.dir.
   */
  FileSystem getFileSystem() {
    return fs;
  }

I am a little bit confused. For getFileSystem() it is clear: we still return 
the systemDir's file system, so we should change this fs to systemDirFs, which 
I omitted in my previous patch.

For getFilesystemName(), what does fs stand for in this context: the default fs 
or the systemDir's file system? I guess it denotes the latter. Right?

Thanks
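As a side note on why the distinction matters, this small sketch (plain java.net.URI rather than Hadoop's FileSystem#makeQualified; the class and method names are invented for illustration) shows how qualifying the same relative path against two different file-system URIs yields two different absolute locations:

```java
import java.net.URI;

// Illustrative only: mimics the effect of fs.makeQualified(path) using
// java.net.URI. Qualifying against the wrong fs object yields a path on
// the wrong file system.
public class MakeQualifiedSketch {
    public static URI qualify(URI fsUri, String absolutePath) {
        // Keeps the fs URI's scheme and authority, swaps in the path.
        return fsUri.resolve(absolutePath);
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("asv://container@account/");
        URI dfs = URI.create("hdfs://nn:9000/");
        String sysDir = "/tmp/hadoop/mapred/system";
        System.out.println(qualify(defaultFs, sysDir)); // asv://container@account/tmp/hadoop/mapred/system
        System.out.println(qualify(dfs, sysDir));       // hdfs://nn:9000/tmp/hadoop/mapred/system
    }
}
```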




[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-18 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661415#comment-13661415
 ] 

Xi Fang commented on MAPREDUCE-5224:


Thanks Ivan for the comments. That is of great help! I will check my code:)




[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-17 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661206#comment-13661206
 ] 

Xi Fang commented on MAPREDUCE-5224:


Thanks Chuan.



[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-13 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13656644#comment-13656644
 ] 

Xi Fang commented on MAPREDUCE-5224:


I updated the patch; it now includes the unit test. I also made some changes to 
MAPREDUCE-5224.patch because the previous one was not complete: in many places 
the default file system is used to access the system directory, not only in 
getSystemDir(). This requires many more changes to the original code. 

I still have some questions I am not quite sure about.
1. In the constructor of JobTracker:

    try {
      FileStatus systemDirStatus = systemDirFs.getFileStatus(systemDir);
      if (!systemDirStatus.isOwnedByUser(
          mrOwner.getShortUserName(), mrOwner.getGroupNames())) {
        throw new AccessControlException("The systemdir " + systemDir
            + " is not owned by " + mrOwner.getShortUserName());
      }
      if (!systemDirStatus.getPermission().equals(SYSTEM_DIR_PERMISSION)) {
        LOG.warn("Incorrect permissions on " + systemDir + ". Setting it to "
            + SYSTEM_DIR_PERMISSION);
        systemDirFs.setPermission(systemDir,
            new FsPermission(SYSTEM_DIR_PERMISSION));
      }
    }

Basically, I have changed the file system used to access the system dir, but I 
am not quite sure whether I should also change the two if statements, because 
the file permissions might be a problem.
2. LocalJobRunner has a getSystemDir() method as well; it uses the default file 
system to access the system directory:

    public String getSystemDir() {
      Path sysDir = new Path(conf.get("mapred.system.dir",
          "/tmp/hadoop/mapred/system"));
      return fs.makeQualified(sysDir).toString();
    }

I am not quite sure whether I need to change this as well.
Thanks!



[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-13 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5224:
---

Attachment: MAPREDUCE-5224.2.patch



[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-09 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653473#comment-13653473
 ] 

Xi Fang commented on MAPREDUCE-5224:


The original motivation of this JIRA is the following scenario. In Azure, the 
default file system is set to ASV (Windows Azure Blob Storage), but we would 
still like the system directory to be in DFS, because we don't want to put such 
files in ASV, which charges Azure customers fees. Thus, we want to change 
JobTracker.java to allow that.

The problem in the current JobTracker.java is that getSystemDir() uses the 
wrong fs object to call fs.makeQualified() when the default file system (e.g. 
ASV in our scenario) and "mapred.system.dir" use different file systems. In the 
proposed fix, we rely on FileSystem.get() to choose the appropriate file system 
according to "mapred.system.dir"; it falls back to the default file system if 
no scheme is given. 

Although the original motivation was to fix the problem for Azure, this fix 
also applies to other scenarios where the default file system and 
"mapred.system.dir" are supposed to use different file systems.

A unit test will follow.
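The fallback rule described above can be sketched in isolation (plain java.net.URI; the resolveFsUri helper and the asv:// default URI are assumptions for illustration, not the actual FileSystem.get() code):

```java
import java.net.URI;

// Hypothetical sketch of scheme-driven file system selection: a scheme in
// "mapred.system.dir" picks that file system; no scheme falls back to the
// configured default. Names here are illustrative, not Hadoop APIs.
public class SysDirFsResolution {
    static final URI DEFAULT_FS = URI.create("asv://container@account/");

    public static URI resolveFsUri(String sysDir) {
        URI u = URI.create(sysDir);
        return (u.getScheme() == null) ? DEFAULT_FS : u;
    }

    public static void main(String[] args) {
        // An explicit "hdfs" scheme wins over the ASV default:
        System.out.println(resolveFsUri("hdfs://nn:9000/mapred/system").getScheme()); // hdfs
        // A scheme-less path falls back to the default (ASV here):
        System.out.println(resolveFsUri("/tmp/hadoop/mapred/system").getScheme());    // asv
    }
}
```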




[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-09 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5224:
---

Attachment: MAPREDUCE-5224.patch



[jira] [Work started] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-09 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-5224 started by Xi Fang.



[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-09 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5224:
---

Attachment: (was: MAPREDUCE-5224.1.patch)



[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-09 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5224:
---

Attachment: MAPREDUCE-5224.1.patch



[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-09 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5224:
---

Attachment: (was: MAPREDUCE-5224.patch)



[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-09 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5224:
---

Attachment: MAPREDUCE-5224.patch



[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-08 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5224:
---

Description: 
 JobTracker today expects the system directory to be in the default file system
if (fs == null) {
  fs = mrOwner.doAs(new PrivilegedExceptionAction() {
public FileSystem run() throws IOException {
  return FileSystem.get(conf);
  }});
}


...

  public String getSystemDir() {
Path sysDir = new Path(conf.get("mapred.system.dir", 
"/tmp/hadoop/mapred/system"));  
return fs.makeQualified(sysDir).toString();
  }
In Cloud like Azure the default file system is set as ASV (Windows Azure Blob 
Storage), but we would still like the system directory to be in DFS. We should 
change JobTracker to allow that.







[jira] [Created] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-08 Thread Xi Fang (JIRA)
Xi Fang created MAPREDUCE-5224:
--

 Summary: JobTracker should allow the system directory to be in 
non-default FS
 Key: MAPREDUCE-5224
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win

