date:20130827

[jira] [Commented] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.

2013-08-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752082#comment-13752082
 ] 

Hadoop QA commented on MAPREDUCE-5441:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12600323/MAPREDUCE-5441.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3966//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3966//console

This message is automatically generated.

> JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
> --
>
> Key: MAPREDUCE-5441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client
>Affects Versions: 2.1.0-beta, 2.0.5-alpha, 2.1.1-beta
>Reporter: Rohith Sharma K S
>Assignee: Jian He
> Attachments: MAPREDUCE-5441.1.patch, MAPREDUCE-5441.2.patch, 
> MAPREDUCE-5441.patch
>
>
> When the RM issues a Reboot command to the app master, the app master shuts down gracefully.
> All the history events are written to HDFS with the job status set to ERROR.
> JobClient gets the job state as ERROR and exits.
> But the RM launches a 2nd-attempt app master, and no client is there to get the job
> status. In the RM UI, the job status is displayed as SUCCESS, but for the client the job has
> Failed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.

2013-08-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752076#comment-13752076
 ] 

Hadoop QA commented on MAPREDUCE-5441:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12600321/MAPREDUCE-5441.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3965//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3965//console

This message is automatically generated.

> JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
> --
>
> Key: MAPREDUCE-5441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client
>Affects Versions: 2.1.0-beta, 2.0.5-alpha, 2.1.1-beta
>Reporter: Rohith Sharma K S
>Assignee: Jian He
> Attachments: MAPREDUCE-5441.1.patch, MAPREDUCE-5441.2.patch, 
> MAPREDUCE-5441.patch
>
>

[jira] [Commented] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.

2013-08-27 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752074#comment-13752074
 ] 

Rohith Sharma K S commented on MAPREDUCE-5441:
--

+1 I am OK with the patch.

> JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
> --
>
> Key: MAPREDUCE-5441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client
>Affects Versions: 2.1.0-beta, 2.0.5-alpha, 2.1.1-beta
>Reporter: Rohith Sharma K S
>Assignee: Jian He
> Attachments: MAPREDUCE-5441.1.patch, MAPREDUCE-5441.2.patch, 
> MAPREDUCE-5441.patch
>
>

[jira] [Updated] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.

2013-08-27 Thread Jian He (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated MAPREDUCE-5441:
---

Attachment: MAPREDUCE-5441.2.patch

Updated the patch again to return the ERROR state in the case of the last retry.

> JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
> --
>
> Key: MAPREDUCE-5441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client
>Affects Versions: 2.1.0-beta, 2.0.5-alpha, 2.1.1-beta
>Reporter: Rohith Sharma K S
>Assignee: Jian He
> Attachments: MAPREDUCE-5441.1.patch, MAPREDUCE-5441.2.patch, 
> MAPREDUCE-5441.patch
>
>

[jira] [Commented] (MAPREDUCE-5460) MR AppMaster command options does not replace @taskid@ with the current task ID.

2013-08-27 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752066#comment-13752066
 ] 

Rohith Sharma K S commented on MAPREDUCE-5460:
--

The fix for this issue needs to be made in the YARN framework, at the time the master
container is launched, so the issue needs to be moved to YARN.

> MR AppMaster command options does not replace @taskid@ with the current task 
> ID.
> 
>
> Key: MAPREDUCE-5460
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5460
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 3.0.0, 2.1.1-beta
>Reporter: Chris Nauroth
>Assignee: Rohith Sharma K S
>
> The description of {{yarn.app.mapreduce.am.command-opts}} in 
> mapred-default.xml states that occurrences of {{@taskid@}} will be replaced 
> by the current task ID.  This substitution is not happening.

[jira] [Updated] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.

2013-08-27 Thread Jian He (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated MAPREDUCE-5441:
---

Status: Open  (was: Patch Available)

> JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
> --
>
> Key: MAPREDUCE-5441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client
>Affects Versions: 2.0.5-alpha, 2.1.0-beta, 2.1.1-beta
>Reporter: Rohith Sharma K S
>Assignee: Jian He
> Attachments: MAPREDUCE-5441.1.patch, MAPREDUCE-5441.patch
>
>

[jira] [Updated] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.

2013-08-27 Thread Jian He (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated MAPREDUCE-5441:
---

Attachment: MAPREDUCE-5441.1.patch

The new patch handles the scenario where this is the last AM retry (lastAMRetry): in that case it returns the FAILED state.

> JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
> --
>
> Key: MAPREDUCE-5441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client
>Affects Versions: 2.1.0-beta, 2.0.5-alpha, 2.1.1-beta
>Reporter: Rohith Sharma K S
>Assignee: Jian He
> Attachments: MAPREDUCE-5441.1.patch, MAPREDUCE-5441.patch
>
>

[jira] [Updated] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.

2013-08-27 Thread Jian He (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated MAPREDUCE-5441:
---

Status: Patch Available  (was: Open)

> JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
> --
>
> Key: MAPREDUCE-5441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client
>Affects Versions: 2.0.5-alpha, 2.1.0-beta, 2.1.1-beta
>Reporter: Rohith Sharma K S
>Assignee: Jian He
> Attachments: MAPREDUCE-5441.1.patch, MAPREDUCE-5441.patch
>
>

[jira] [Commented] (MAPREDUCE-5460) MR AppMaster command options does not replace @taskid@ with the current task ID.

2013-08-27 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752050#comment-13752050
 ] 

Rohith Sharma K S commented on MAPREDUCE-5460:
--

As far as I can see in the code, there is no logic for replacing @taskid@. It looks like the
description for "yarn.app.mapreduce.am.command-opts" was copied directly from
mapred.child.java.opts.
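
For context, the substitution that the property description promises could look like the sketch below. This is not the Hadoop code: the class and method names are hypothetical, and it only shows a literal replacement of the @taskid@ token in an options string.

```java
// Hypothetical helper, for illustration only (not part of Hadoop).
final class TaskIdOptsSketch {
    // Token named in the mapred-default.xml description quoted in this thread.
    static final String TASK_ID_TOKEN = "@taskid@";

    // Returns opts with every occurrence of @taskid@ replaced by taskId.
    // Strings without the token are returned unchanged.
    static String substituteTaskId(String opts, String taskId) {
        return opts.replace(TASK_ID_TOKEN, taskId);
    }
}
```

Calling substituteTaskId("-Xloggc:/tmp/@taskid@.gc", id) would place the given id into the GC log path; the option string here is only an example input.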


> MR AppMaster command options does not replace @taskid@ with the current task 
> ID.
> 
>
> Key: MAPREDUCE-5460
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5460
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 3.0.0, 2.1.1-beta
>Reporter: Chris Nauroth
>Assignee: Rohith Sharma K S
>

[jira] [Commented] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.

2013-08-27 Thread Jian He (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752051#comment-13752051
 ] 

Jian He commented on MAPREDUCE-5441:


Good point! It just came to my mind as well, and I was already working on it :)

> JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
> --
>
> Key: MAPREDUCE-5441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client
>Affects Versions: 2.1.0-beta, 2.0.5-alpha, 2.1.1-beta
>Reporter: Rohith Sharma K S
>Assignee: Jian He
> Attachments: MAPREDUCE-5441.patch
>
>

[jira] [Assigned] (MAPREDUCE-5460) MR AppMaster command options does not replace @taskid@ with the current task ID.

2013-08-27 Thread Rohith Sharma K S (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S reassigned MAPREDUCE-5460:


Assignee: Rohith Sharma K S

> MR AppMaster command options does not replace @taskid@ with the current task 
> ID.
> 
>
> Key: MAPREDUCE-5460
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5460
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 3.0.0, 2.1.1-beta
>Reporter: Chris Nauroth
>Assignee: Rohith Sharma K S
>

[jira] [Commented] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.

2013-08-27 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752045#comment-13752045
 ] 

Rohith Sharma K S commented on MAPREDUCE-5441:
--

I have a doubt about one scenario: what if the currently running application
master attempt is the last one (i.e. isLastAMRetry is true)? I think it is better
to check isLastAMRetry and set the external state accordingly.

> JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
> --
>
> Key: MAPREDUCE-5441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client
>Affects Versions: 2.1.0-beta, 2.0.5-alpha, 2.1.1-beta
>Reporter: Rohith Sharma K S
>Assignee: Jian He
> Attachments: MAPREDUCE-5441.patch
>
>

[jira] [Commented] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.

2013-08-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751983#comment-13751983
 ] 

Hadoop QA commented on MAPREDUCE-5441:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12600306/MAPREDUCE-5441.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3964//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3964//console

This message is automatically generated.

> JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
> --
>
> Key: MAPREDUCE-5441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client
>Affects Versions: 2.1.0-beta, 2.0.5-alpha, 2.1.1-beta
>Reporter: Rohith Sharma K S
>Assignee: Jian He
> Attachments: MAPREDUCE-5441.patch
>
>

[jira] [Commented] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.

2013-08-27 Thread Jian He (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751973#comment-13751973
 ] 

Jian He commented on MAPREDUCE-5441:


Thanks [~rohithsharma] for reporting this problem.

Earlier, this problem was not easily reproduced on my side because at that time
MR chose to ignore the Invalid AMRMToken exception after the RM restarted. It never
explicitly sent the JOB_AM_REBOOT event and stayed alive until it was killed by a
signal from the NM. After that, JobClient could just quickly switch to the new AM.

Now MR has been changed to explicitly send the JOB_AM_REBOOT event in case of an Invalid
AMRMToken exception (this should be fixed later), so JobClient has more chance to see
the ERROR state of the job, which leads JobClient to exit prematurely.
I reproduced this problem by putting a long sleep in MRAppMaster.shutDownJob() for
the normal shutdown and in MRAppMasterShutdownHook for the signal-triggered shutdown,
so that JobClient has a good chance to see the ERROR state.

I uploaded a patch that, when the job is in the REBOOT state, returns the external
state as RUNNING to prevent JobClient from exiting prematurely.
The above manual test passed with the patch and failed without it.
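
To illustrate the behavior discussed in this thread, here is a minimal sketch of the state mapping. It is not the actual patch: the class, enum, and method names are hypothetical, and only the REBOOT, RUNNING, and ERROR state names and the isLastAMRetry flag come from the discussion.

```java
// Hypothetical sketch of mapping an internal job state to the state
// reported to clients. Names are illustrative, not Hadoop's API.
final class ExternalStateSketch {
    enum InternalState { RUNNING, SUCCEEDED, FAILED, ERROR, REBOOT }
    enum ExternalState { RUNNING, SUCCEEDED, FAILED, ERROR }

    static ExternalState toExternal(InternalState state, boolean isLastAMRetry) {
        if (state == InternalState.REBOOT) {
            // On the last AM attempt no new AM will take over, so report ERROR.
            // Otherwise report RUNNING so the client keeps waiting for the
            // next attempt instead of exiting.
            return isLastAMRetry ? ExternalState.ERROR : ExternalState.RUNNING;
        }
        // Every other internal state has an external state of the same name.
        return ExternalState.valueOf(state.name());
    }
}
```

With a mapping like this, a client polling the job while the AM reboots keeps seeing RUNNING, unless the rebooting attempt was the last retry.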

> JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
> --
>
> Key: MAPREDUCE-5441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client
>Affects Versions: 2.1.0-beta, 2.0.5-alpha, 2.1.1-beta
>Reporter: Rohith Sharma K S
>Assignee: Jian He
> Attachments: MAPREDUCE-5441.patch
>
>

[jira] [Updated] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.

2013-08-27 Thread Jian He (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated MAPREDUCE-5441:
---

Status: Patch Available  (was: Open)

> JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
> --
>
> Key: MAPREDUCE-5441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client
>Affects Versions: 2.0.5-alpha, 2.1.0-beta, 2.1.1-beta
>Reporter: Rohith Sharma K S
>Assignee: Jian He
> Attachments: MAPREDUCE-5441.patch
>
>

[jira] [Updated] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.

2013-08-27 Thread Jian He (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated MAPREDUCE-5441:
---

Attachment: MAPREDUCE-5441.patch

> JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
> --
>
> Key: MAPREDUCE-5441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client
>Affects Versions: 2.1.0-beta, 2.0.5-alpha, 2.1.1-beta
>Reporter: Rohith Sharma K S
>Assignee: Jian He
> Attachments: MAPREDUCE-5441.patch
>
>

[jira] [Updated] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.

2013-08-27 Thread Jian He (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated MAPREDUCE-5441:
---

Attachment: MAPREDUCE-5441.patch

> JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
> --
>
> Key: MAPREDUCE-5441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client
>Affects Versions: 2.1.0-beta, 2.0.5-alpha, 2.1.1-beta
>Reporter: Rohith Sharma K S
>Assignee: Jian He
>

[jira] [Updated] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.

2013-08-27 Thread Jian He (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated MAPREDUCE-5441:
---

Attachment: (was: MAPREDUCE-5441.patch)

> JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
> --
>
> Key: MAPREDUCE-5441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client
>Affects Versions: 2.1.0-beta, 2.0.5-alpha, 2.1.1-beta
>Reporter: Rohith Sharma K S
>Assignee: Jian He
>

[jira] [Assigned] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.

2013-08-27 Thread Jian He (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He reassigned MAPREDUCE-5441:
--

Assignee: Jian He

> JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
> --
>
> Key: MAPREDUCE-5441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client
>Affects Versions: 2.1.0-beta, 2.0.5-alpha, 2.1.1-beta
>Reporter: Rohith Sharma K S
>Assignee: Jian He
>

[jira] [Resolved] (MAPREDUCE-3841) Broken Server metrics and Local logs link under the tools menu

2013-08-27 Thread Vinod Kumar Vavilapalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved MAPREDUCE-3841.


Resolution: Duplicate
  Assignee: Jian He

Closed as duplicate.

> Broken Server metrics and Local logs link under the tools menu
> --
>
> Key: MAPREDUCE-3841
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3841
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.1
>Reporter: Ramya Sunil
>Assignee: Jian He
>
> Local logs link redirects to the cluster page and Server metrics opens an
> empty page on the RM/JHS homepage. So do the links from the nodemanager UI.

[jira] [Updated] (MAPREDUCE-5170) incorrect exception message if min node size > min rack size

2013-08-27 Thread Arun C Murthy (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-5170:
-

Fix Version/s: (was: 2.1.0-beta)
   2.3.0

> incorrect exception message if min node size > min rack size
> 
>
> Key: MAPREDUCE-5170
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5170
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.3-alpha
>Reporter: Sangjin Lee
>Priority: Trivial
> Fix For: 2.3.0
>
> Attachments: MAPREDUCE-5170.patch
>
>
> The exception message for CombineFileInputFormat if min node size > min rack 
> size is worded backwards.
> Currently it reads "Minimum split size per node... cannot be smaller than the 
> minimum split size per rack..."
> It should be "Minimum split size per node... cannot be LARGER than the 
> minimum split size per rack..."
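
The check being described can be sketched in plain Java as follows. This is a minimal illustration, not the CombineFileInputFormat source: the class and method names (SplitSizeCheck, validate, check) are hypothetical, and treating 0 as "not set" is an assumption of the sketch. The point is only that the wording of the message should match the direction of the comparison.

```java
import java.io.IOException;

// Hypothetical stand-in for the validation discussed above; not Hadoop source.
final class SplitSizeCheck {

  // Rejects a configuration where the per-node minimum exceeds the per-rack
  // minimum. A value of 0 is treated as "not set" (assumption of this sketch).
  static void validate(long minSizeNode, long minSizeRack) throws IOException {
    if (minSizeRack != 0 && minSizeNode > minSizeRack) {
      // The condition fires when the node value is LARGER, so the message says so.
      throw new IOException("Minimum split size per node " + minSizeNode
          + " cannot be larger than the minimum split size per rack " + minSizeRack);
    }
  }

  // Returns the error message for a configuration, or null if it is accepted.
  static String check(long minSizeNode, long minSizeRack) {
    try {
      validate(minSizeNode, minSizeRack);
      return null;
    } catch (IOException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(check(128, 64)); // rejected: node minimum above rack minimum
    System.out.println(check(64, 128)); // accepted: prints null
  }
}
```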


[jira] [Updated] (MAPREDUCE-4253) Tests for mapreduce-client-core are lying under mapreduce-client-jobclient

2013-08-27 Thread Arun C Murthy (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4253:
-

Fix Version/s: (was: 2.1.0-beta)
   2.3.0

> Tests for mapreduce-client-core are lying under mapreduce-client-jobclient
> --
>
> Key: MAPREDUCE-4253
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4253
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: client
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Tsuyoshi OZAWA
> Fix For: 2.3.0
>
> Attachments: crossing_project_checker.rb, MR-4253.1.patch, 
> MR-4253.2.patch, result.txt
>
>
> Many of the tests for client libs from mapreduce-client-core are lying under 
> mapreduce-client-jobclient.
> We should investigate if this is the right thing to do and if not, move the 
> tests back into client-core.


[jira] [Updated] (MAPREDUCE-4462) Enhance readability of TestFairScheduler.java

2013-08-27 Thread Arun C Murthy (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4462:
-

Fix Version/s: (was: 2.1.0-beta)
   2.3.0

> Enhance readability of TestFairScheduler.java
> -
>
> Key: MAPREDUCE-4462
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4462
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: scheduler, test
>Reporter: Ryan Hennig
>Priority: Minor
>  Labels: comments, test
> Fix For: 2.3.0
>
> Attachments: MAPREDUCE-4462.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> While reading over the unit tests for the Fair Scheduler introduced by 
> MAPREDUCE-3451, I added comments to make the logic of the test easier to grok 
> quickly.


[jira] [Updated] (MAPREDUCE-4868) Allow multiple iteration for map

2013-08-27 Thread Arun C Murthy (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4868:
-

Fix Version/s: (was: 2.1.0-beta)
   2.3.0

> Allow multiple iteration for map
> 
>
> Key: MAPREDUCE-4868
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4868
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: Jerry Chen
> Fix For: 3.0.0, 2.3.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Currently, the Mapper class allows advanced users to override the "public void 
> run(Context context)" method for more control over the execution of the 
> mapper, while the Context interface limits the operations over the data, which are 
> the foundation of "more control".
> One use case: while considering a Hive optimization problem, I 
> want to make two passes over the input data instead of using another job or 
> task (which may slow down the whole process). Each pass does the same thing but 
> with different parameters.
> This is a new paradigm of Map Reduce usage and can be achieved easily by 
> extending the Context interface a little with more control over the data, such as 
> resetting the input.
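
The two-pass idea above can be sketched in plain Java, independent of Hadoop. Everything here is hypothetical: ResettableInput stands in for a Context extended with a reset operation, ListInput is an in-memory test double, and none of these names exist in the MapReduce API. Each pass runs the same logic with a different parameter, and reset() rewinds the input between passes.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: these types are assumptions, not part of Hadoop.
final class TwoPassSketch {

  // Stand-in for a Context that can rewind its input.
  interface ResettableInput<T> {
    boolean next();   // advance to the next record; false at end of input
    T current();      // the record most recently advanced to
    void reset();     // rewind so the next call to next() yields the first record
  }

  // In-memory input backed by a list.
  static final class ListInput<T> implements ResettableInput<T> {
    private final List<T> records;
    private int pos = -1;

    ListInput(List<T> records) {
      this.records = records;
    }

    public boolean next() {
      if (pos + 1 >= records.size()) {
        return false;
      }
      pos++;
      return true;
    }

    public T current() {
      return records.get(pos);
    }

    public void reset() {
      pos = -1;
    }
  }

  // One pass: counts the records strictly above the given threshold.
  static int countAbove(ResettableInput<Integer> in, int threshold) {
    int count = 0;
    while (in.next()) {
      if (in.current() > threshold) {
        count++;
      }
    }
    return count;
  }

  // Two passes over the same input, same logic, different parameter.
  static int[] twoPasses(ResettableInput<Integer> in, int first, int second) {
    int a = countAbove(in, first);
    in.reset();
    int b = countAbove(in, second);
    return new int[] {a, b};
  }

  public static void main(String[] args) {
    ResettableInput<Integer> in = new ListInput<Integer>(Arrays.asList(1, 5, 9, 12));
    System.out.println(Arrays.toString(twoPasses(in, 4, 10)));
  }
}
```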


[jira] [Updated] (MAPREDUCE-5036) Default shuffle handler port should not be 8080

2013-08-27 Thread Arun C Murthy (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-5036:
-

Fix Version/s: (was: 2.1.0-beta)
   2.3.0

> Default shuffle handler port should not be 8080
> ---
>
> Key: MAPREDUCE-5036
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5036
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 2.3.0
>
> Attachments: MAPREDUCE-5036-13562.patch, MAPREDUCE-5036-2.patch, 
> MAPREDUCE-5036.patch
>
>
> The shuffle handler port (mapreduce.shuffle.port) defaults to 8080.  This is 
> a pretty common port for web services, and is likely to cause unnecessary 
> port conflicts.


[jira] [Updated] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value

2013-08-27 Thread Arun C Murthy (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-5028:
-

Fix Version/s: (was: 2.1.0-beta)
   2.3.0

> Maps fail when io.sort.mb is set to high value
> --
>
> Key: MAPREDUCE-5028
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1.1.1, 2.0.3-alpha, 0.23.5
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Critical
> Fix For: 1.2.0, 2.3.0
>
> Attachments: mr-5028-branch1.patch, mr-5028-branch1.patch, 
> mr-5028-branch1.patch, MR-5028_testapp.patch, mr-5028-trunk.patch, 
> mr-5028-trunk.patch, mr-5028-trunk.patch, repro-mr-5028.patch
>
>
> Verified the problem exists on branch-1 with the following configuration:
> Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m, 
> io.sort.mb=1280, dfs.block.size=2147483648
> Run teragen to generate 4 GB data
> Maps fail when you run wordcount on this configuration with the following 
> error: 
> {noformat}
> java.io.IOException: Spill failed
>   at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031)
>   at 
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
>   at 
> org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>   at 
> org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45)
>   at 
> org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.io.EOFException
>   at java.io.DataInputStream.readInt(DataInputStream.java:375)
>   at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38)
>   at 
> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>   at 
> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>   at 
> org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
>   at 
> org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
>   at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
>   at 
> org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505)
>   at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438)
>   at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855)
>   at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1346)
> {noformat}
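
The report above does not state a root cause, so the following is only an arithmetic illustration of the configured values, not a reproduction of MapTask internals. It shows that io.sort.mb=1280 still fits in a Java int when converted to bytes, that the configured dfs.block.size does not, and that adding two int offsets of the buffer's magnitude wraps to a negative value. Whether any of this is related to the EOFException is an open question here; the class name SortBufferArithmetic is hypothetical.

```java
// Arithmetic illustration only; it does not model Hadoop's sort buffer.
final class SortBufferArithmetic {

  // Converts megabytes to bytes using long arithmetic to avoid overflow.
  static long megabytesToBytes(int mb) {
    return (long) mb << 20;
  }

  // True when the value is representable as a Java int.
  static boolean fitsInInt(long value) {
    return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
  }

  public static void main(String[] args) {
    long sortBytes = megabytesToBytes(1280); // io.sort.mb from the report
    long blockSize = 2147483648L;            // dfs.block.size from the report

    System.out.println(sortBytes + " fits in int: " + fitsInInt(sortBytes));
    System.out.println(blockSize + " fits in int: " + fitsInInt(blockSize));

    // Two int offsets of this size cannot be added without wrapping around.
    int offset = (int) sortBytes;
    System.out.println("offset + offset as int = " + (offset + offset));
  }
}
```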


[jira] [Updated] (MAPREDUCE-4468) Encapsulate FairScheduler preemption logic into helper class

2013-08-27 Thread Arun C Murthy (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4468:
-

Fix Version/s: (was: 2.1.0-beta)
   2.3.0

> Encapsulate FairScheduler preemption logic into helper class
> 
>
> Key: MAPREDUCE-4468
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4468
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: scheduler
>Reporter: Ryan Hennig
>Priority: Minor
>  Labels: refactoring, scheduler
> Fix For: 2.3.0
>
> Attachments: MAPREDUCE-4468.patch
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> I've extracted the preemption logic from the Fair Scheduler into a helper 
> class so that FairScheduler is closer to following the Single Responsibility 
> Principle.  This may eventually evolve into a generalized preemption module 
> which could be leveraged by other schedulers.


[jira] [Updated] (MAPREDUCE-5439) mapred-default.xml has missing properties

2013-08-27 Thread Arun C Murthy (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-5439:
-

Fix Version/s: (was: 2.1.0-beta)
   2.3.0

> mapred-default.xml has missing properties
> -
>
> Key: MAPREDUCE-5439
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5439
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.1.0-beta
>Reporter: Siddharth Wagle
> Fix For: 2.3.0
>
>
> Properties that need to be added:
> mapreduce.map.memory.mb
> mapreduce.map.java.opts
> mapreduce.reduce.memory.mb
> mapreduce.reduce.java.opts
> Properties that need to be fixed:
> mapred.child.java.opts should not be in mapred-default.
> yarn.app.mapreduce.am.command-opts description needs fixing


[jira] [Commented] (MAPREDUCE-1729) Distributed cache should provide an option to fail the job or not, if cache file gets modified on the fly.

2013-08-27 Thread Akira AJISAKA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751722#comment-13751722
 ] 

Akira AJISAKA commented on MAPREDUCE-1729:
--

[~yamashitasni], thanks for your comment.
I'll try to implement your 2nd option.

> Distributed cache should provide an option to fail the job or not, if cache 
> file gets modified on the fly.
> --
>
> Key: MAPREDUCE-1729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1729
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: distributed-cache
>Reporter: Amareshwari Sriramadasu
>Assignee: Akira AJISAKA
>
> Currently, distributed cache fails the job if the cache file gets modified on 
> the fly. But there should be an option to fail a job or not.
> See discussions in MAPREDUCE-1288.


[jira] [Commented] (MAPREDUCE-5483) revert MAPREDUCE-5357

2013-08-27 Thread Chuan Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751696#comment-13751696
 ] 

Chuan Liu commented on MAPREDUCE-5483:
--

OK. This makes sense. I am also +1.

We can work on a separate Windows fix in another JIRA.

> revert MAPREDUCE-5357
> -
>
> Key: MAPREDUCE-5483
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5483
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Robert Kanter
> Fix For: 2.1.1-beta
>
> Attachments: MAPREDUCE-5483.patch
>
>
> MAPREDUCE-5357 does a filesystem chown() operation. chown() is not valid 
> unless you are the superuser. A chown() to yourself is a NOP, which is why this 
> has not been detected in Hadoop testcases, where the user is running as itself. 
> However, in distcp testcases run by Oozie, which use test users/groups from 
> UGI for the minicluster, it is failing because of this chown(), either because the 
> test user does not exist or because the current user does not have privileges 
> to do a chown().
> We should revert MAPREDUCE-5357. Windows should handle this with some 
> conditional logic used only when running on Windows.
> Opening a new JIRA and not reverting directly because MAPREDUCE-5357 went into 
> 2.1.0-beta.

[jira] [Commented] (MAPREDUCE-5483) revert MAPREDUCE-5357

2013-08-27 Thread Alejandro Abdelnur (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751681#comment-13751681
 ] 

Alejandro Abdelnur commented on MAPREDUCE-5483:
---

UGI and minicluster have support for adding test users which do not map to OS 
users. When using such test users, things blow up in the local file system. 
Before MAPREDUCE-5357 (without the chown), things were working fine in such 
scenarios. MAPREDUCE-5357 introduced a regression.

I'm planning to commit the current patch tomorrow. If you want to do special 
handling for Windows (which I would not recommend), please upload a patch. The 
patch should have the effect of a 'revert' for non-Windows platforms.

> revert MAPREDUCE-5357
> -
>
> Key: MAPREDUCE-5483
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5483
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Robert Kanter
> Fix For: 2.1.1-beta
>
> Attachments: MAPREDUCE-5483.patch

[jira] [Commented] (MAPREDUCE-5483) revert MAPREDUCE-5357

2013-08-27 Thread Robert Kanter (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751676#comment-13751676
 ] 

Robert Kanter commented on MAPREDUCE-5483:
--

In the testcase, the user is named 'test'.  The user doesn't exist, so it 
complains about an "Invalid argument".  

> revert MAPREDUCE-5357
> -
>
> Key: MAPREDUCE-5483
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5483
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Robert Kanter
> Fix For: 2.1.1-beta
>
> Attachments: MAPREDUCE-5483.patch

[jira] [Commented] (MAPREDUCE-5483) revert MAPREDUCE-5357

2013-08-27 Thread Chuan Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751667#comment-13751667
 ] 

Chuan Liu commented on MAPREDUCE-5483:
--

Who is the directory owner in your test case?
We are setting owner to a newly created directory.
To my understanding, the creator should be the owner under POSIX mode including 
HDFS. 

> revert MAPREDUCE-5357
> -
>
> Key: MAPREDUCE-5483
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5483
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Robert Kanter
> Fix For: 2.1.1-beta
>
> Attachments: MAPREDUCE-5483.patch

[jira] [Commented] (MAPREDUCE-5483) revert MAPREDUCE-5357

2013-08-27 Thread Alejandro Abdelnur (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751603#comment-13751603
 ] 

Alejandro Abdelnur commented on MAPREDUCE-5483:
---

+1 from my side. [~chuanliu], are you OK with the revert?

> revert MAPREDUCE-5357
> -
>
> Key: MAPREDUCE-5483
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5483
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Robert Kanter
> Fix For: 2.1.1-beta
>
> Attachments: MAPREDUCE-5483.patch

[jira] [Commented] (MAPREDUCE-5483) revert MAPREDUCE-5357

2013-08-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751594#comment-13751594
 ] 

Hadoop QA commented on MAPREDUCE-5483:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12600219/MAPREDUCE-5483.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3963//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3963//console

This message is automatically generated.

> revert MAPREDUCE-5357
> -
>
> Key: MAPREDUCE-5483
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5483
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Robert Kanter
> Fix For: 2.1.1-beta
>
> Attachments: MAPREDUCE-5483.patch

[jira] [Commented] (MAPREDUCE-5483) revert MAPREDUCE-5357

2013-08-27 Thread Alejandro Abdelnur (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751578#comment-13751578
 ] 

Alejandro Abdelnur commented on MAPREDUCE-5483:
---

If you run builds in the same directory as different users, you'll run into 
permission issues deleting files from the previous run unless the user running the 
second time is a superuser. That seems like the wrong thing to do.

> revert MAPREDUCE-5357
> -
>
> Key: MAPREDUCE-5483
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5483
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Robert Kanter
> Fix For: 2.1.1-beta
>
> Attachments: MAPREDUCE-5483.patch

[jira] [Commented] (MAPREDUCE-5483) revert MAPREDUCE-5357

2013-08-27 Thread Alejandro Abdelnur (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751580#comment-13751580
 ] 

Alejandro Abdelnur commented on MAPREDUCE-5483:
---

If we revert this patch, you don't do a chown() in a dir you created.

> revert MAPREDUCE-5357
> -
>
> Key: MAPREDUCE-5483
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5483
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Robert Kanter
> Fix For: 2.1.1-beta
>
> Attachments: MAPREDUCE-5483.patch

[jira] [Commented] (MAPREDUCE-5483) revert MAPREDUCE-5357

2013-08-27 Thread Chuan Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751574#comment-13751574
 ] 

Chuan Liu commented on MAPREDUCE-5483:
--

I want to understand the scenario a little bit more. After reverting this 
change, it means:
# If the directory already exists, we will enforce that it belongs to the current 
user.
# Otherwise, we create a new directory. Ownership does not matter in this case.

> revert MAPREDUCE-5357
> -
>
> Key: MAPREDUCE-5483
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5483
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Robert Kanter
> Fix For: 2.1.1-beta
>
> Attachments: MAPREDUCE-5483.patch

[jira] [Commented] (MAPREDUCE-5483) revert MAPREDUCE-5357

2013-08-27 Thread Chuan Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751570#comment-13751570
 ] 

Chuan Liu commented on MAPREDUCE-5483:
--

Who is the owner of the staging area directory? My original thought was that it 
is always the user who submits jobs, because we enforce the ownership 
explicitly in another code path (the if clause) of the same method. Thus 
chown() will always be a NOP, and the Windows Administrators user case is the 
only exception. Judging from the Oozie test, my assumption was wrong. However, 
this seems to suggest that the Oozie test may also fail if you re-run it without 
deleting the staging directory.

> revert MAPREDUCE-5357
> -
>
> Key: MAPREDUCE-5483
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5483
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Robert Kanter
> Fix For: 2.1.1-beta
>
> Attachments: MAPREDUCE-5483.patch

[jira] [Updated] (MAPREDUCE-5483) revert MAPREDUCE-5357

2013-08-27 Thread Robert Kanter (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated MAPREDUCE-5483:
-

Status: Patch Available  (was: Open)

> revert MAPREDUCE-5357
> -
>
> Key: MAPREDUCE-5483
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5483
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Robert Kanter
> Fix For: 2.1.1-beta
>
> Attachments: MAPREDUCE-5483.patch


[jira] [Updated] (MAPREDUCE-5483) revert MAPREDUCE-5357

2013-08-27 Thread Robert Kanter (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated MAPREDUCE-5483:
-

Attachment: MAPREDUCE-5483.patch

Patch reverts the change.

> revert MAPREDUCE-5357
> -
>
> Key: MAPREDUCE-5483
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5483
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Robert Kanter
> Fix For: 2.1.1-beta
>
> Attachments: MAPREDUCE-5483.patch


[jira] [Commented] (MAPREDUCE-5483) revert MAPREDUCE-5357

2013-08-27 Thread Alejandro Abdelnur (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751558#comment-13751558
 ] 

Alejandro Abdelnur commented on MAPREDUCE-5483:
---

I guess we could check whether the platform is Windows before doing the 
chown(), but the fix was made because testcases were failing on Windows when 
run as admin. It seems fishy to me that Windows would silently fail the 
chown(). Regardless, either we guard this code to run only on Windows or we 
revert it. I'd prefer reverting it.
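
As an illustration only, the "guard this code to run only on Windows" option could look roughly like the sketch below. The class and method names (ChownGuard, isWindows, chownIfWindows) are hypothetical and are not part of Hadoop or of either patch; the sketch detects the platform from the os.name system property and takes the chown step as a Runnable so that it stays self-contained.

```java
import java.util.Locale;

// Hypothetical sketch of a platform guard; not code from MAPREDUCE-5357 or MAPREDUCE-5483.
final class ChownGuard {
    private ChownGuard() {}

    /** Returns true if the given os.name value looks like a Windows platform. */
    static boolean isWindows(String osName) {
        return osName != null
            && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /**
     * Runs the supplied ownership fix-up only on Windows.
     * Returns true if the action was run, false if it was skipped.
     */
    static boolean chownIfWindows(String osName, Runnable chownAction) {
        if (!isWindows(osName)) {
            return false; // non-Windows: skip, so unprivileged users never call chown()
        }
        chownAction.run();
        return true;
    }

    public static void main(String[] args) {
        boolean ran = chownIfWindows(
            System.getProperty("os.name"),
            () -> System.out.println("would chown() the staging directory here"));
        System.out.println("chown attempted: " + ran);
    }
}
```

With a guard like this, the Oozie-style test users on non-Windows platforms would never reach the chown() call, which is the behaviour a plain revert also restores.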

> revert MAPREDUCE-5357
> -
>
> Key: MAPREDUCE-5483
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5483
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Robert Kanter
> Fix For: 2.1.1-beta


[jira] [Assigned] (MAPREDUCE-5483) revert MAPREDUCE-5357

2013-08-27 Thread Robert Kanter (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter reassigned MAPREDUCE-5483:


Assignee: Robert Kanter

> revert MAPREDUCE-5357
> -
>
> Key: MAPREDUCE-5483
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5483
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Robert Kanter
> Fix For: 2.1.1-beta


[jira] [Created] (MAPREDUCE-5483) revert MAPREDUCE-5357

2013-08-27 Thread Alejandro Abdelnur (JIRA)

Alejandro Abdelnur created MAPREDUCE-5483:
-

 Summary: revert MAPREDUCE-5357
 Key: MAPREDUCE-5483
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5483
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distcp
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
 Fix For: 2.1.1-beta


MAPREDUCE-5357 does a filesystem chown() operation. chown() is not valid unless 
you are superuser. A chown() to yourself is a NOP, which is why this has not 
been detected in Hadoop testcases where the user is running as itself. However, 
in distcp testcases run by Oozie, which use test users/groups from UGI for the 
minicluster, it is failing because of this chown(), either because the test user 
does not exist or because the current user does not have privileges to do a 
chown().

We should revert MAPREDUCE-5357. Windows should handle this with some 
conditional logic used only when running in Windows.

Opening a new JIRA and not reverting directly because MAPREDUCE-5357 went in 
2.1.0-beta.


[jira] [Commented] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows

2013-08-27 Thread Alejandro Abdelnur (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751547#comment-13751547
 ] 

Alejandro Abdelnur commented on MAPREDUCE-5357:
---

FYI, opened MAPREDUCE-5483 to revert this JIRA.

> Job staging directory owner checking could fail on Windows
> --
>
> Key: MAPREDUCE-5357
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5357
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.1.0-beta
>Reporter: Chuan Liu
>Assignee: Chuan Liu
>Priority: Minor
> Fix For: 3.0.0, 2.1.0-beta
>
> Attachments: MAPREDUCE-5357-trunk.patch
>
>
> In {{JobSubmissionFiles.getStagingDir()}}, we have the following code, which 
> throws an exception if the directory owner is not the current user.
> {code:java}
>   String owner = fsStatus.getOwner();
>   if (!(owner.equals(currentUser) || owner.equals(realUser))) {
>     throw new IOException("The ownership on the staging directory " +
>         stagingArea + " is not as expected. " +
>         "It is owned by " + owner + ". The directory must " +
>         "be owned by the submitter " + currentUser + " or " +
>         "by " + realUser);
>   }
> {code}
> This check will fail on Windows when the underlying file system is 
> LocalFileSystem, because on Windows the default file or directory owner 
> could be the "Administrators" group if the user belongs to that group.
> Quite a few MR unit tests that run an MR mini cluster with localFs as the 
> underlying file system fail because of this.
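
To make the failure mode concrete, here is a minimal, self-contained sketch of the same comparison outside Hadoop. The class StagingOwnerCheck, the method ownerMatches, and the user name "alice" are assumptions made for illustration; only the equals() comparison itself is taken from the snippet quoted above.

```java
// Hypothetical stand-alone version of the ownership comparison quoted above.
final class StagingOwnerCheck {
    private StagingOwnerCheck() {}

    /** True if the directory owner is the submitting user or the real user. */
    static boolean ownerMatches(String owner, String currentUser, String realUser) {
        return owner.equals(currentUser) || owner.equals(realUser);
    }

    public static void main(String[] args) {
        // Typical case: the submitting user owns the directory, so the check passes.
        System.out.println(ownerMatches("alice", "alice", "alice"));

        // Case described in the report: the local file system reports the
        // "Administrators" group as owner, so the check fails for a member of that group.
        System.out.println(ownerMatches("Administrators", "alice", "alice"));
    }
}
```

The second call returns false, which is the condition under which getStagingDir() throws the IOException shown in the issue description.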


[jira] [Assigned] (MAPREDUCE-5362) clean up POM dependencies

2013-08-27 Thread Alejandro Abdelnur (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur reassigned MAPREDUCE-5362:
-

Assignee: Roman Shaposhnik

all yours, thx

> clean up POM dependencies
> -
>
> Key: MAPREDUCE-5362
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5362
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Roman Shaposhnik
>
> Intermediate 'pom' modules define dependencies inherited by leaf modules.
> This is causing issues in the IntelliJ IDE.
> We should normalize the leaf modules as in common, hdfs and tools, where all 
> dependencies are defined in each leaf module and the intermediate 'pom' 
> modules do not define any dependencies.


[jira] [Updated] (MAPREDUCE-5480) TestJSHSecurity.testDelegationToken is breaking after YARN-1085

2013-08-27 Thread Arun C Murthy (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-5480:
-

Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Fixed via YARN-1085

> TestJSHSecurity.testDelegationToken is breaking after YARN-1085
> ---
>
> Key: MAPREDUCE-5480
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5480
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.1.1-beta
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Omkar Vinit Joshi
>Priority: Blocker
> Attachments: MAPREDUCE-5480.20130824.1.patch
>
>
> See https://builds.apache.org/job/PreCommit-YARN-Build/1755//testReport/.
> {code}
> org.apache.hadoop.yarn.webapp.WebAppException: Error starting http server
>   at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:251)
>   at 
> org.apache.hadoop.mapreduce.v2.hs.HistoryClientService.initializeWebApp(HistoryClientService.java:152)
>  ---
> Caused by: javax.servlet.ServletException: javax.servlet.ServletException: 
> Principal not defined in configuration
>   at 
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.init(KerberosAuthenticationHandler.java:203)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.init(AuthenticationFilter.java:146)
>  ---
> Caused by: javax.servlet.ServletException: Principal not defined in 
> configuration
>   at 
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.init(KerberosAuthenticationHandler.java:164)
>   ... 53 more
> {code}


[jira] [Commented] (MAPREDUCE-5362) clean up POM dependencies

2013-08-27 Thread Roman Shaposhnik (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751399#comment-13751399
 ] 

Roman Shaposhnik commented on MAPREDUCE-5362:
-

I'm cleaning up a few build issues (Bigtop 0.8.0 related) and I was wondering 
whether you'd think it would be a good idea for me to tackle this one as well. 
Please assign it to me if you think it is.

> clean up POM dependencies
> -
>
> Key: MAPREDUCE-5362
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5362
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur


[jira] [Commented] (MAPREDUCE-5402) DynamicInputFormat should allow overriding of MAX_CHUNKS_TOLERABLE

2013-08-27 Thread David Rosenstrauch (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751376#comment-13751376
 ] 

David Rosenstrauch commented on MAPREDUCE-5402:
---

Hi.  Just wondering if there's been any progress on getting this fix released.  
We're still running into issues in production with the "long tail" of distcp 
jobs taking hours to complete.  I'd love to deploy a fix to our system soon to 
solve that, if possible.

> DynamicInputFormat should allow overriding of MAX_CHUNKS_TOLERABLE
> --
>
> Key: MAPREDUCE-5402
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5402
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: distcp, mrv2
>Reporter: David Rosenstrauch
>Assignee: Tsuyoshi OZAWA
> Attachments: MAPREDUCE-5402.1.patch, MAPREDUCE-5402.2.patch, 
> MAPREDUCE-5402.3.patch
>
>
> In MAPREDUCE-2765, which provided the design spec for DistCpV2, the author 
> describes the implementation of DynamicInputFormat, with one of the main 
> motivations cited being to reduce the chance of long-tails where a few 
> leftover mappers run much longer than the rest.
> However, I today ran into a situation where I experienced exactly such a long 
> tail using DistCpV2 and DynamicInputFormat.  And when I tried to alleviate 
> the problem by overriding the number of mappers and the split ratio used by 
> the DynamicInputFormat, I was prevented from doing so by the hard-coded limit 
> set in the code by the MAX_CHUNKS_TOLERABLE constant.  (Currently set to 400.)
> This constant is actually set quite low for production use.  (See a 
> description of my use case below.)  And although MAPREDUCE-2765 states that 
> this is an "overridable maximum", when reading through the code there does 
> not actually appear to be any mechanism available to override it.
> This should be changed.  It should be possible to expand the maximum # of 
> chunks beyond this arbitrary limit.
> For example, here is the situation I ran into today:
> I ran a distcpv2 job on a cluster with 8 machines containing 128 map slots.  
> The job consisted of copying ~2800 files from HDFS to Amazon S3.  I overrode 
> the number of mappers for the job from the default of 20 to 128, so as to 
> more properly parallelize the copy across the cluster.  The number of chunk 
> files created was calculated as 241, and mapred.num.entries.per.chunk was 
> calculated as 12.
> As the job ran on, it reached a point where there were only 4 remaining map 
> tasks, which had each been running for over 2 hours.  The reason for this was 
> that each of the 12 files that those mappers were copying was quite large 
> (several hundred megabytes in size) and took ~20 minutes to copy.  However, 
> during this time, all the other 124 mappers sat idle.
> In theory I should be able to alleviate this problem with DynamicInputFormat. 
>  If I were able to, say, quadruple the number of chunk files created, that 
> would have made each chunk contain only 3 files, and these large files would 
> have gotten distributed better around the cluster and copied in parallel.
> However, when I tried to do that - by overriding mapred.listing.split.ratio 
> to, say, 10 - DynamicInputFormat responded with an exception ("Too many 
> chunks created with splitRatio:10, numMaps:128. Reduce numMaps or decrease 
> split-ratio to proceed.") - presumably because I exceeded the 
> MAX_CHUNKS_TOLERABLE value of 400.
> Is there any particular logic behind this MAX_CHUNKS_TOLERABLE limit?  I 
> can't personally see any.
> If this limit has no particular logic behind it, then it should be 
> overridable - or even better:  removed altogether.  After all, I'm not sure I 
> see any need for it.  Even if numMaps * splitRatio resulted in an 
> extraordinarily large number, if the code were modified so that the number of 
> chunks got calculated as Math.min( numMaps * splitRatio, numFiles), then 
> there would be no need for MAX_CHUNKS_TOLERABLE.  In this worst-case scenario 
> where the product of numMaps and splitRatio is large, capping the number of 
> chunks at the number of files (numberOfChunks = numberOfFiles) would result 
> in 1 file per chunk - the maximum parallelization possible.  That may not be 
> the best-tuned solution for some users, but I would think that it should be 
> left up to the user to deal with the potential consequence of not having 
> tuned their job properly.  Certainly that would be better than having an 
> arbitrary hard-coded limit that *prevents* proper parallelization when 
> dealing with large files and/or large numbers of mappers.
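
The arithmetic in the description can be checked with a small sketch. ChunkMath and its methods are hypothetical helpers and not the DynamicInputFormat implementation; the file count of 2800 is the approximate figure from the report, and the ceiling division used for files per chunk is an assumption that happens to reproduce the reported value of 12.

```java
// Hypothetical helper illustrating the proposed cap; not the actual DistCp code.
final class ChunkMath {
    private ChunkMath() {}

    /** Number of chunks capped at the number of files, as proposed in the report. */
    static int numChunks(int numMaps, int splitRatio, int numFiles) {
        long requested = (long) numMaps * splitRatio; // long avoids int overflow
        return (int) Math.min(requested, numFiles);
    }

    /** Files per chunk, rounded up so that every file is assigned to a chunk. */
    static int entriesPerChunk(int numFiles, int numChunks) {
        return (numFiles + numChunks - 1) / numChunks;
    }

    public static void main(String[] args) {
        int numFiles = 2800; // "~2800 files" in the report

        // Reported run: 241 chunks gives 12 files per chunk.
        System.out.println(entriesPerChunk(numFiles, 241));     // 12

        // Quadrupling the chunk count gives 3 files per chunk.
        System.out.println(entriesPerChunk(numFiles, 4 * 241)); // 3

        // numMaps = 128, splitRatio = 10: 1280 chunks, above the limit of 400.
        System.out.println(numChunks(128, 10, numFiles));       // 1280

        // A very large product is capped at one file per chunk.
        System.out.println(numChunks(128, 100, numFiles));      // 2800
    }
}
```

The third result shows why the reported attempt was rejected: 128 maps with a split ratio of 10 asks for 1280 chunks, well above MAX_CHUNKS_TOLERABLE (400), even though it is still below the number of files.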


reduce job hung in pending state: "No room for reduce task"

2013-08-27 Thread Jim Colestock

Hello All, 

We're running into the following 2 bugs again: 
https://issues.apache.org/jira/browse/HADOOP-5241
https://issues.apache.org/jira/browse/MAPREDUCE-2324

Both of them are listed as closed/fixed.  (I was actually the one who got 
Cloudera to submit MAPREDUCE-2324.)  Does anyone know whether anyone else is 
seeing these in later releases?  We're running the following on various 
versions of CentOS with Java 1.6:

hadoop-2.0.0+1357-1.cdh4.3.0.p0.21.el5

hadoop-0.20-mapreduce-jobtracker-2.0.0+1357-1.cdh4.3.0.p0.21.el5
hadoop-0.20-mapreduce-2.0.0+1357-1.cdh4.3.0.p0.21.el5
hadoop-0.20-mapreduce-tasktracker-2.0.0+1357-1.cdh4.3.0.p0.21.el5

hadoop-hdfs-namenode-2.0.0+1357-1.cdh4.3.0.p0.21.el5
hadoop-hdfs-secondarynamenode-2.0.0+1357-1.cdh4.3.0.p0.21.el5
hadoop-hdfs-2.0.0+1357-1.cdh4.3.0.p0.21.el5
hadoop-hdfs-datanode-2.0.0+1357-1.cdh4.3.0.p0.21.el5

As a quick summary: a reduce task gets hung in pending while trying to find 
room on a task tracker. It keeps trying over and over and never fails, so you 
end up with a whole bunch of these in the logs: 

2013-08-27 00:48:01,412 WARN org.apache.hadoop.mapred.JobInProgress: No room 
for reduce task. Node tracker_104.sm.tld:127.0.0.1/127.0.0.1:43723 has 
250176954368 bytes free; but we expect reduce input to take 283580756533

Thanks in advance for any help on the issue.

JC
