[jira] [Updated] (MAPREDUCE-6611) Localization timeout diagnostic for taskattempts

2016-01-19 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6611:

Attachment: MAPREDUCE-6611.patch

> Localization timeout diagnostic for taskattempts
> 
>
> Key: MAPREDUCE-6611
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6611
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6611.patch
>
>
> When a container takes too long to localize, it manifests as a timeout, and 
> there's no indication that localization was the issue. We need diagnostics 
> for timeouts to indicate the container was still localizing when the timeout 
> occurred. Dependent upon YARN-4589.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6611) Localization timeout diagnostic for taskattempts

2016-01-19 Thread Chang Li (JIRA)
Chang Li created MAPREDUCE-6611:
---

 Summary: Localization timeout diagnostic for taskattempts
 Key: MAPREDUCE-6611
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6611
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Chang Li
Assignee: Chang Li


When a container takes too long to localize, it manifests as a timeout, and 
there's no indication that localization was the issue. We need diagnostics for 
timeouts to indicate the container was still localizing when the timeout 
occurred. Dependent upon YARN-4589.
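
A minimal sketch of the kind of diagnostic described above, not the attached patch. The ContainerPhase enum, the timeoutDiagnostic helper, and the message wording are all hypothetical; how the AM actually learns that a container is still localizing (presumably what YARN-4589 provides) is not shown in this thread.

```java
import java.util.concurrent.TimeUnit;

class LocalizationTimeoutDiagnostic {
    // Hypothetical, simplified container phases; not YARN's real state enum.
    enum ContainerPhase { LOCALIZING, RUNNING }

    // Builds the diagnostic text for a task attempt that hit the timeout.
    // When the container had not finished localizing, the message says so.
    static String timeoutDiagnostic(String attemptId, long timeoutMs, ContainerPhase phase) {
        StringBuilder sb = new StringBuilder();
        sb.append("AttemptID:").append(attemptId)
          .append(" Timed out after ")
          .append(TimeUnit.MILLISECONDS.toSeconds(timeoutMs))
          .append(" secs");
        if (phase == ContainerPhase.LOCALIZING) {
            sb.append(" (container was still localizing when the timeout occurred)");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(timeoutDiagnostic("attempt_0", 600_000L, ContainerPhase.LOCALIZING));
        System.out.println(timeoutDiagnostic("attempt_1", 600_000L, ContainerPhase.RUNNING));
    }
}
```

The point of the sketch is only that the timeout message carries the localization hint, so a user reading the diagnostics can tell a slow localization apart from a hung task.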





[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken

2015-11-11 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6533:

Attachment: MAPREDUCE-6533.6.3.patch

Thanks for the very patient and careful review, [~jlowe]! I have updated the 
.6.3 patch accordingly.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
> -
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.3.patch, 
> MAPREDUCE-6533.4.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.5.patch, 
> MAPREDUCE-6533.6.2.patch, MAPREDUCE-6533.6.3.patch, MAPREDUCE-6533.6.patch, 
> MAPREDUCE-6533.patch
>
>






[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken

2015-11-11 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6533:

Attachment: MAPREDUCE-6533.6.2.patch

The .6.2 patch fixes tearDown as well.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
> -
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.3.patch, 
> MAPREDUCE-6533.4.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.5.patch, 
> MAPREDUCE-6533.6.2.patch, MAPREDUCE-6533.6.patch, MAPREDUCE-6533.patch
>
>






[jira] [Commented] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken

2015-11-11 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15000502#comment-15000502
 ] 

Chang Li commented on MAPREDUCE-6533:
-

Sorry, I overlooked the comment about the tearDown part. I will fix that.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
> -
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.3.patch, 
> MAPREDUCE-6533.4.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.5.patch, 
> MAPREDUCE-6533.6.patch, MAPREDUCE-6533.patch
>
>






[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken

2015-11-11 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6533:

Attachment: MAPREDUCE-6533.6.patch

Thanks [~jlowe] for the further review! I have updated the .6 patch accordingly.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
> -
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.3.patch, 
> MAPREDUCE-6533.4.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.5.patch, 
> MAPREDUCE-6533.6.patch, MAPREDUCE-6533.patch
>
>






[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken

2015-11-11 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6533:

Attachment: MAPREDUCE-6533.5.patch

Thanks [~djp] for the review! I have updated the .5 patch according to your suggestion.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
> -
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.3.patch, 
> MAPREDUCE-6533.4.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.5.patch, 
> MAPREDUCE-6533.patch
>
>






[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken

2015-11-10 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6533:

Attachment: MAPREDUCE-6533.4.patch

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
> -
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.3.patch, 
> MAPREDUCE-6533.4.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.patch
>
>






[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken

2015-11-09 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6533:

Attachment: MAPREDUCE-6533.4.patch

Thanks [~jlowe] for the further review! I have updated the .4 patch to address your concern.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
> -
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.3.patch, 
> MAPREDUCE-6533.4.patch, MAPREDUCE-6533.patch
>
>






[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken

2015-11-04 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6533:

Attachment: MAPREDUCE-6533.3.patch

The .3 patch fixes the whitespace issue.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
> -
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.3.patch, 
> MAPREDUCE-6533.patch
>
>






[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken

2015-11-04 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6533:

Attachment: MAPREDUCE-6533.2.patch

Thanks [~jlowe] for the review! I have updated the .2 patch according to your 
suggestion to create them as Path objects.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
> -
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.patch
>
>






[jira] [Commented] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken

2015-11-02 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986258#comment-14986258
 ] 

Chang Li commented on MAPREDUCE-6533:
-

[~jlowe], thanks for the review and comments! Right, this broken test used 
to work; its current failure seems to be caused by a JDK change. When I run 
the test on my machine, {{TEST_ROOT_DIR}} is 
{code}file:/Users/changli/hadoop/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/target/test-dir/{code}
But {{TEST_VISIBILITY_DIR}} is 
{code}file:/Users/changli/hadoop/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/file:/Users/changli/hadoop/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/target/test-dir/TestCacheVisibility/{code}
which used to be 
{code}file:/Users/changli/hadoop/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/target/test-dir/TestCacheVisibility{code}
When a file is created with {{File(String parent, String child)}}, its absolute 
pathname and its pathname differ as shown above. {{testDetermineCacheVisibilities}} 
is intended to run under 
{{file:/Users/changli/hadoop/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/target/test-dir/TestCacheVisibility}}, 
which is the TestCacheVisibility dir under {{TEST_ROOT_DIR}}, so I reconstruct the 
{{TEST_VISIBILITY_DIR}} string in this simple way rather than through the 
unnecessarily complicated conversion from File to URI and then to String.
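
The path doubling described in this comment can be reproduced with a small standalone program. This is only a sketch and assumes a Unix-like filesystem; the /tmp/test-dir/ root and the viaFile and viaString helpers are hypothetical stand-ins, not code from the patch. Because a parent such as "file:/tmp/test-dir/" does not start with "/", java.io.File treats it as a relative path, so the absolute pathname gets the current working directory prepended.

```java
import java.io.File;

class FileParentChildDemo {
    // Goes through java.io.File: the "file:" prefix is treated as an ordinary
    // relative path segment, so the working directory is prepended.
    static String viaFile(String root, String child) {
        return new File(root, child).getAbsolutePath();
    }

    // Plain string concatenation keeps the URI-style root untouched.
    static String viaString(String root, String child) {
        return root.endsWith("/") ? root + child : root + "/" + child;
    }

    public static void main(String[] args) {
        String root = "file:/tmp/test-dir/"; // hypothetical stand-in for TEST_ROOT_DIR
        File f = new File(root, "TestCacheVisibility");
        System.out.println("path:     " + f.getPath());
        System.out.println("absolute: " + f.getAbsolutePath());
        System.out.println("uri:      " + f.toURI());
        System.out.println("string:   " + viaString(root, "TestCacheVisibility"));
    }
}
```

On such a system the pathname and the absolute pathname differ, and converting the File to a URI yields a string with a second "file:" embedded in it, which matches the shape of the broken TEST_VISIBILITY_DIR value quoted above.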

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
> -
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6533.patch
>
>






[jira] [Commented] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-11-02 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14985440#comment-14985440
 ] 

Chang Li commented on MAPREDUCE-5003:
-

I have opened MAPREDUCE-6533 for the TestClientDistributedCacheManager failure.

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.10.patch, 
> MAPREDUCE-5003.2.patch, MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, 
> MAPREDUCE-5003.5.patch, MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch, 
> MAPREDUCE-5003.7.patch, MAPREDUCE-5003.8.patch, MAPREDUCE-5003.9.patch, 
> MAPREDUCE-5003.9.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate it is no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.





[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories

2015-11-02 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6512:

Attachment: MAPREDUCE-6512.2.patch

> FileOutputCommitter tasks unconditionally create parent directories
> ---
>
> Key: MAPREDUCE-6512
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6512.2.patch, MAPREDUCE-6512.2.patch, 
> MAPREDUCE-6512.patch, MAPREDUCE-6512.patch
>
>
> If the output directory is deleted then subsequent tasks should fail. Instead 
> they blindly create the missing parent directories, leading the job to be 
> "successful" despite potentially missing almost all of the output. Task 
> attempts should fail if the parent app attempt directory is missing when they 
> go to create their task attempt directory.





[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken

2015-10-31 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6533:

Status: Patch Available  (was: Open)

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
> -
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6533.patch
>
>






[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken

2015-10-31 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6533:

Attachment: MAPREDUCE-6533.patch

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
> -
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6533.patch
>
>






[jira] [Created] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken

2015-10-31 Thread Chang Li (JIRA)
Chang Li created MAPREDUCE-6533:
---

 Summary: testDetermineCacheVisibilities of 
TestClientDistributedCacheManager is broken
 Key: MAPREDUCE-6533
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Chang Li
Assignee: Chang Li








[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-10-30 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5003:

Attachment: MAPREDUCE-5003.10.patch

The .10 patch fixes some checkstyle issues.
The broken test in TestJobHistoryEventHandler is not related to my change. It may 
be transient, since it passes on my local machine with my patch applied. The 
broken test in TestRecovery also appears to be transient, because it passes on 
my local machine with my patch applied. I updated testMultipleCrashes of 
TestRecovery to improve its stability. 
testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken 
even without my patch applied. I will file a JIRA for that broken test.

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.10.patch, 
> MAPREDUCE-5003.2.patch, MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, 
> MAPREDUCE-5003.5.patch, MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch, 
> MAPREDUCE-5003.7.patch, MAPREDUCE-5003.8.patch, MAPREDUCE-5003.9.patch, 
> MAPREDUCE-5003.9.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate it is no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.





[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories

2015-10-30 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6512:

Attachment: MAPREDUCE-6512.2.patch

The .2 patch fixes the broken tests, which were caused by a missing parent dir.

> FileOutputCommitter tasks unconditionally create parent directories
> ---
>
> Key: MAPREDUCE-6512
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6512.2.patch, MAPREDUCE-6512.patch, 
> MAPREDUCE-6512.patch
>
>
> If the output directory is deleted then subsequent tasks should fail. Instead 
> they blindly create the missing parent directories, leading the job to be 
> "successful" despite potentially missing almost all of the output. Task 
> attempts should fail if the parent app attempt directory is missing when they 
> go to create their task attempt directory.





[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-10-29 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5003:

Attachment: MAPREDUCE-5003.9.patch

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, 
> MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, 
> MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch, MAPREDUCE-5003.7.patch, 
> MAPREDUCE-5003.8.patch, MAPREDUCE-5003.9.patch, MAPREDUCE-5003.9.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate it is no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.





[jira] [Updated] (MAPREDUCE-6518) Set SO_KEEPALIVE on shuffle connections

2015-10-21 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6518:

Attachment: MAPREDUCE-6518.4.patch

Thanks [~jlowe] for the review! I have addressed your concerns in the .4 patch.

> Set SO_KEEPALIVE on shuffle connections
> ---
>
> Key: MAPREDUCE-6518
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6518
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, nodemanager
>Affects Versions: 2.7.1
>Reporter: Nathan Roberts
>Assignee: Chang Li
> Attachments: MAPREDUCE-6518.4.patch, YARN-4052.2.patch, 
> YARN-4052.3.patch, YARN-4052.patch
>
>
> Shuffle handler does not set SO_KEEPALIVE so we've seen cases where 
> FDs/sockets get stuck in ESTABLISHED state indefinitely because the server 
> did not see the client leave (network cut or otherwise). 
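
As an illustration of the option named in the issue title, here is a minimal sketch using plain java.net sockets. It is a stand-in, not the shuffle handler's real networking code, which is not shown in this thread; the KeepAliveDemo class and the acceptWithKeepAlive helper are hypothetical. With SO_KEEPALIVE set, the operating system periodically probes an idle connection, so a peer that vanished without closing can eventually be detected instead of leaving the socket in ESTABLISHED state indefinitely.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class KeepAliveDemo {
    // Accepts one loopback connection, enables SO_KEEPALIVE on the accepted
    // (server-side) socket, and reports whether the option is now set.
    static boolean acceptWithKeepAlive() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            accepted.setKeepAlive(true);
            return accepted.getKeepAlive();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("SO_KEEPALIVE enabled: " + acceptWithKeepAlive());
    }
}
```

How quickly a dead peer is noticed depends on the operating system's keepalive timers, which this sketch does not configure.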





[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories

2015-10-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6512:

Attachment: (was: YARN-4269.2.patch)

> FileOutputCommitter tasks unconditionally create parent directories
> ---
>
> Key: MAPREDUCE-6512
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6512.patch, MAPREDUCE-6512.patch
>
>
> If the output directory is deleted then subsequent tasks should fail. Instead 
> they blindly create the missing parent directories, leading the job to be 
> "successful" despite potentially missing almost all of the output. Task 
> attempts should fail if the parent app attempt directory is missing when they 
> go to create their task attempt directory.





[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories

2015-10-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6512:

Attachment: YARN-4269.2.patch

> FileOutputCommitter tasks unconditionally create parent directories
> ---
>
> Key: MAPREDUCE-6512
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6512.patch, MAPREDUCE-6512.patch
>
>
> If the output directory is deleted then subsequent tasks should fail. Instead 
> they blindly create the missing parent directories, leading the job to be 
> "successful" despite potentially missing almost all of the output. Task 
> attempts should fail if the parent app attempt directory is missing when they 
> go to create their task attempt directory.





[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories

2015-10-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6512:

Attachment: MAPREDUCE-6512.patch

> FileOutputCommitter tasks unconditionally create parent directories
> ---
>
> Key: MAPREDUCE-6512
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6512.patch, MAPREDUCE-6512.patch
>
>
> If the output directory is deleted then subsequent tasks should fail. Instead 
> they blindly create the missing parent directories, leading the job to be 
> "successful" despite potentially missing almost all of the output. Task 
> attempts should fail if the parent app attempt directory is missing when they 
> go to create their task attempt directory.





[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories

2015-10-14 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6512:

Status: Patch Available  (was: Open)

> FileOutputCommitter tasks unconditionally create parent directories
> ---
>
> Key: MAPREDUCE-6512
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6512.patch
>
>
> If the output directory is deleted then subsequent tasks should fail. Instead 
> they blindly create the missing parent directories, leading the job to be 
> "successful" despite potentially missing almost all of the output. Task 
> attempts should fail if the parent app attempt directory is missing when they 
> go to create their task attempt directory.





[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories

2015-10-14 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6512:

Attachment: MAPREDUCE-6512.patch

> FileOutputCommitter tasks unconditionally create parent directories
> ---
>
> Key: MAPREDUCE-6512
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6512.patch
>
>
> If the output directory is deleted then subsequent tasks should fail. Instead 
> they blindly create the missing parent directories, leading the job to be 
> "successful" despite potentially missing almost all of the output. Task 
> attempts should fail if the parent app attempt directory is missing when they 
> go to create their task attempt directory.





[jira] [Created] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories

2015-10-14 Thread Chang Li (JIRA)
Chang Li created MAPREDUCE-6512:
---

 Summary: FileOutputCommitter tasks unconditionally create parent 
directories
 Key: MAPREDUCE-6512
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Chang Li
Assignee: Chang Li


If the output directory is deleted then subsequent tasks should fail. Instead 
they blindly create the missing parent directories, leading the job to be 
"successful" despite potentially missing almost all of the output. Task attempts 
should fail if the parent app attempt directory is missing when they go to 
create their task attempt directory.
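
The difference between creating a directory with and without its missing parents can be sketched with java.nio.file on a local filesystem. This is an analogy under stated assumptions, not FileOutputCommitter's code or the Hadoop FileSystem API; the TaskAttemptDirDemo class, its helpers, and the directory names are hypothetical. Files.createDirectories would silently recreate a deleted parent, while Files.createDirectory fails when the parent is missing, which is the behavior the issue asks for.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

class TaskAttemptDirDemo {
    // Returns true if the task attempt dir was created, false if the parent
    // app attempt dir is missing. createDirectory never creates parents.
    static boolean createAttemptDir(Path appAttemptDir, String taskAttemptName) {
        try {
            Files.createDirectory(appAttemptDir.resolve(taskAttemptName));
            return true;
        } catch (NoSuchFileException e) {
            return false; // parent missing: the task attempt should fail here
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs both cases: parent present, then parent absent (as if deleted).
    static boolean[] demo() {
        try {
            Path output = Files.createTempDirectory("output");
            Path present = Files.createDirectory(output.resolve("app_attempt_present"));
            Path missing = output.resolve("app_attempt_deleted");
            return new boolean[] {
                createAttemptDir(present, "task_attempt_0"),
                createAttemptDir(missing, "task_attempt_1")
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] results = demo();
        System.out.println("parent present: " + results[0]);
        System.out.println("parent missing: " + results[1]);
    }
}
```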





[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-09-25 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5003:

Attachment: MAPREDUCE-5003.9.patch

Thanks for the review, [~jlowe]! I have uploaded the .9 patch to address your 
concerns. About the nit you mentioned: when a previously running task gets 
recovered, its state will be null, which is why I do the null check. It is null 
because the jobhistory server only records the state of a task in its completion 
events, so recovery will not get a state value for those previously running 
tasks. The .9 patch deals with this problem by recording the task state in the 
task start event. 
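
A minimal sketch of the null check discussed above, under loud assumptions: the AttemptState enum and the recoveredState helper are hypothetical and are not the patch's code. Mapping a missing or still-running state to KILLED follows the issue description's suggestion that incomplete attempts be "marked as killed or something similar".

```java
class RecoveredAttemptStateDemo {
    // Hypothetical, simplified attempt states; not the MapReduce enum.
    enum AttemptState { RUNNING, SUCCEEDED, FAILED, KILLED }

    // An attempt whose history has no recorded state (null), or only the
    // state recorded at start (RUNNING), never reached a completion event,
    // so after recovery it is reported as KILLED.
    static AttemptState recoveredState(AttemptState recorded) {
        if (recorded == null || recorded == AttemptState.RUNNING) {
            return AttemptState.KILLED;
        }
        return recorded;
    }

    public static void main(String[] args) {
        System.out.println(recoveredState(null));
        System.out.println(recoveredState(AttemptState.RUNNING));
        System.out.println(recoveredState(AttemptState.SUCCEEDED));
    }
}
```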
I checked backward compatibility by first checking whether old jobhistory 
files could be parsed by my modified jobhistory server. To do this, I started 
a single-node cluster without my patch applied and ran some jobs. Then I shut 
down the jobhistory server, applied my patch, compiled the new code, and 
started the jobhistory server with my changes. I checked that the new 
jobhistory server could load and parse those old jobhistory files, and I 
verified that I could visit all the old job history in the UI after the restart.
I also checked the reverse: whether the old history server is compatible with 
jobhistory files generated by the jobhistory server with my change. I followed 
the steps above, except that I first ran jobs with my patch applied, then shut 
down the history server, removed my patch, recompiled the code, and started the 
jobhistory server without my patch. I verified that the jobhistory files 
generated by the new jobhistory server could be parsed by the old jobhistory 
server.

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, 
> MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, 
> MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch, MAPREDUCE-5003.7.patch, 
> MAPREDUCE-5003.8.patch, MAPREDUCE-5003.9.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate it is no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.





[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear

2015-09-18 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5982:

Attachment: MAPREDUCE-5982.branch.2.7.patch

[~jlowe] I uploaded the 2.7 patch.

> Task attempts that fail from the ASSIGNED state can disappear
> -
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 0.23.10, 2.2.1, 2.7.1
>Reporter: Jason Lowe
>Assignee: Chang Li
> Fix For: 2.8.0
>
> Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, 
> MAPREDUCE-5982.4.patch, MAPREDUCE-5982.5.patch, MAPREDUCE-5982.6.patch, 
> MAPREDUCE-5982.branch.2.7.patch, MAPREDUCE-5982.patch
>
>
> If a task attempt fails in the ASSIGNED state, e.g.: container launch fails,  
> then it can disappear from the job history.  The task overview page will show 
> subsequent attempts but the attempt in question is simply missing.  For 
> example attempt ID 1 appears but the attempt ID 0 is missing.  Similarly in 
> the job overview page the task attempt doesn't appear in any of the 
> failed/killed/succeeded counts or pages.  It's as if the task attempt never 
> existed, but the AM logs show otherwise.
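
The symptom described above can be modelled with a small sketch. It is a toy model, not the MR AM state machine: `AttemptHistoryModel`, its states, and the event string are hypothetical, and it assumes that the history pages only show attempts for which some event was written.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only; the class, enum, and event names are hypothetical.
final class AttemptHistoryModel {
  enum State { ASSIGNED, RUNNING, FAILED }

  // Each entry is "<attemptId>:<eventName>".
  private final List<String> history = new ArrayList<>();
  private final boolean recordFailureFromAssigned;

  AttemptHistoryModel(boolean recordFailureFromAssigned) {
    this.recordFailureFromAssigned = recordFailureFromAssigned;
  }

  // Moves an attempt to FAILED. A failure from RUNNING is always recorded;
  // a failure from ASSIGNED is recorded only when the flag is set.
  State fail(int attemptId, State from) {
    if (from == State.RUNNING || recordFailureFromAssigned) {
      history.add(attemptId + ":FAILED");
    }
    return State.FAILED;
  }

  // Attempt IDs visible to a reader that only sees the history log.
  Set<Integer> visibleAttempts() {
    Set<Integer> ids = new LinkedHashSet<>();
    for (String entry : history) {
      ids.add(Integer.parseInt(entry.substring(0, entry.indexOf(':'))));
    }
    return ids;
  }

  public static void main(String[] args) {
    for (boolean fixed : new boolean[] {false, true}) {
      AttemptHistoryModel model = new AttemptHistoryModel(fixed);
      model.fail(0, State.ASSIGNED); // e.g. container launch fails
      model.fail(1, State.RUNNING);
      System.out.println("recordFailureFromAssigned=" + fixed
          + " -> visible attempts " + model.visibleAttempts());
    }
  }
}
```

With the flag off, attempt 0 is missing while attempt 1 appears, which matches the report; with the flag on, both are listed. Whether the actual patch works this way is not stated in this thread.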





[jira] [Commented] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear

2015-09-17 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14804585#comment-14804585
 ] 

Chang Li commented on MAPREDUCE-5982:
-

Thanks [~jlowe] for the review and commit! I will work on a patch for branch-2.7.

> Task attempts that fail from the ASSIGNED state can disappear
> -
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 0.23.10, 2.2.1, 2.7.1
>Reporter: Jason Lowe
>Assignee: Chang Li
> Fix For: 2.8.0
>
> Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, 
> MAPREDUCE-5982.4.patch, MAPREDUCE-5982.5.patch, MAPREDUCE-5982.6.patch, 
> MAPREDUCE-5982.patch
>
>
> If a task attempt fails in the ASSIGNED state, e.g.: container launch fails,  
> then it can disappear from the job history.  The task overview page will show 
> subsequent attempts but the attempt in question is simply missing.  For 
> example attempt ID 1 appears but the attempt ID 0 is missing.  Similarly in 
> the job overview page the task attempt doesn't appear in any of the 
> failed/killed/succeeded counts or pages.  It's as if the task attempt never 
> existed, but the AM logs show otherwise.





[jira] [Commented] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-09-14 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744644#comment-14744644
 ] 

Chang Li commented on MAPREDUCE-5003:
-

The failing unit test is unrelated to my change. I ran this test on my local 
machine with my patch applied, and it passed. [~jlowe], please help review the 
latest patch. Thanks!

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, 
> MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, 
> MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch, MAPREDUCE-5003.7.patch, 
> MAPREDUCE-5003.8.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate it is no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.





[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-09-14 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5003:

Attachment: MAPREDUCE-5003.8.patch

Fixed whitespace and some checkstyle issues.

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, 
> MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, 
> MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch, MAPREDUCE-5003.7.patch, 
> MAPREDUCE-5003.8.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate it is no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.





[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-09-14 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5003:

Attachment: MAPREDUCE-5003.7.patch

Thanks [~jlowe] for the review! I have reproduced the problem with the broken link for 
recovered incomplete task attempts, and the .7 patch fixes it. I also addressed your 
other suggestions. Because I touched the TaskAttemptStarted event, I have checked 
and made sure that there is no backward compatibility issue.

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, 
> MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, 
> MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch, MAPREDUCE-5003.7.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate it is no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.





[jira] [Commented] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear

2015-09-11 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741606#comment-14741606
 ] 

Chang Li commented on MAPREDUCE-5982:
-

[~jlowe] please help review the latest patch. Thanks!

> Task attempts that fail from the ASSIGNED state can disappear
> -
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 0.23.10, 2.2.1, 2.7.1
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, 
> MAPREDUCE-5982.4.patch, MAPREDUCE-5982.5.patch, MAPREDUCE-5982.6.patch, 
> MAPREDUCE-5982.patch
>
>
> If a task attempt fails in the ASSIGNED state, e.g.: container launch fails,  
> then it can disappear from the job history.  The task overview page will show 
> subsequent attempts but the attempt in question is simply missing.  For 
> example attempt ID 1 appears but the attempt ID 0 is missing.  Similarly in 
> the job overview page the task attempt doesn't appear in any of the 
> failed/killed/succeeded counts or pages.  It's as if the task attempt never 
> existed, but the AM logs show otherwise.





[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear

2015-09-11 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5982:

Attachment: MAPREDUCE-5982.6.patch

Fixed broken tests and checkstyle issues.

> Task attempts that fail from the ASSIGNED state can disappear
> -
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 0.23.10, 2.2.1, 2.7.1
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, 
> MAPREDUCE-5982.4.patch, MAPREDUCE-5982.5.patch, MAPREDUCE-5982.6.patch, 
> MAPREDUCE-5982.patch
>
>
> If a task attempt fails in the ASSIGNED state, e.g.: container launch fails,  
> then it can disappear from the job history.  The task overview page will show 
> subsequent attempts but the attempt in question is simply missing.  For 
> example attempt ID 1 appears but the attempt ID 0 is missing.  Similarly in 
> the job overview page the task attempt doesn't appear in any of the 
> failed/killed/succeeded counts or pages.  It's as if the task attempt never 
> existed, but the AM logs show otherwise.





[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear

2015-09-11 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5982:

Status: Patch Available  (was: Open)

Thanks [~jlowe] for the review! I have updated my patch according to your 
suggestion.

> Task attempts that fail from the ASSIGNED state can disappear
> -
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.7.1, 2.2.1, 0.23.10
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, 
> MAPREDUCE-5982.4.patch, MAPREDUCE-5982.5.patch, MAPREDUCE-5982.patch
>
>
> If a task attempt fails in the ASSIGNED state, e.g.: container launch fails,  
> then it can disappear from the job history.  The task overview page will show 
> subsequent attempts but the attempt in question is simply missing.  For 
> example attempt ID 1 appears but the attempt ID 0 is missing.  Similarly in 
> the job overview page the task attempt doesn't appear in any of the 
> failed/killed/succeeded counts or pages.  It's as if the task attempt never 
> existed, but the AM logs show otherwise.





[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear

2015-09-11 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5982:

Attachment: MAPREDUCE-5982.5.patch

> Task attempts that fail from the ASSIGNED state can disappear
> -
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 0.23.10, 2.2.1, 2.7.1
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, 
> MAPREDUCE-5982.4.patch, MAPREDUCE-5982.5.patch, MAPREDUCE-5982.patch
>
>
> If a task attempt fails in the ASSIGNED state, e.g.: container launch fails,  
> then it can disappear from the job history.  The task overview page will show 
> subsequent attempts but the attempt in question is simply missing.  For 
> example attempt ID 1 appears but the attempt ID 0 is missing.  Similarly in 
> the job overview page the task attempt doesn't appear in any of the 
> failed/killed/succeeded counts or pages.  It's as if the task attempt never 
> existed, but the AM logs show otherwise.





[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt

2015-09-10 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5002:

Attachment: MAPREDUCE-5002.6.patch

Fixed a whitespace issue.

> AM could potentially allocate a reduce container to a map attempt
> -
>
> Key: MAPREDUCE-5002
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.3-alpha, 0.23.7, 2.7.0
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, 
> MAPREDUCE-5002.2.patch, MAPREDUCE-5002.3.patch, MAPREDUCE-5002.4.patch, 
> MAPREDUCE-5002.5.patch, MAPREDUCE-5002.6.patch
>
>
> As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically 
> possible for the AM to accidentally assign a reducer container to a map 
> attempt if the AM doesn't find a reduce attempt actively looking for the 
> container (e.g.: the RM accidentally allocated too many reducer containers).
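
One possible guard against the mismatch described above, as a minimal sketch. It is not the allocator code from the patch: `ContainerAssigner`, `Kind`, and the two pending queues are hypothetical, and returning `null` to mean "release the container" is an assumption made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only; names are hypothetical, not the MR AM allocator.
final class ContainerAssigner {
  enum Kind { MAP, REDUCE }

  final Deque<String> pendingMaps = new ArrayDeque<>();
  final Deque<String> pendingReduces = new ArrayDeque<>();

  // Returns the attempt that receives the container, or null when no attempt
  // of the same kind is waiting. A reduce container is never handed to a
  // map attempt, even if map attempts are pending.
  String assign(Kind containerKind) {
    Deque<String> queue =
        (containerKind == Kind.MAP) ? pendingMaps : pendingReduces;
    return queue.poll(); // poll() returns null on an empty queue
  }

  public static void main(String[] args) {
    ContainerAssigner assigner = new ContainerAssigner();
    assigner.pendingMaps.add("map_attempt_0");
    // The RM allocated a reduce container, but no reduce attempt is waiting.
    System.out.println("reduce container -> " + assigner.assign(Kind.REDUCE));
    System.out.println("map container -> " + assigner.assign(Kind.MAP));
  }
}
```

The design choice in the sketch is to look only in the queue that matches the container's kind instead of falling back to any pending attempt.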





[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt

2015-09-10 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5002:

Attachment: MAPREDUCE-5002.5.patch

The .5 patch improves some naming and comments.

> AM could potentially allocate a reduce container to a map attempt
> -
>
> Key: MAPREDUCE-5002
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.3-alpha, 0.23.7, 2.7.0
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, 
> MAPREDUCE-5002.2.patch, MAPREDUCE-5002.3.patch, MAPREDUCE-5002.4.patch, 
> MAPREDUCE-5002.5.patch
>
>
> As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically 
> possible for the AM to accidentally assign a reducer container to a map 
> attempt if the AM doesn't find a reduce attempt actively looking for the 
> container (e.g.: the RM accidentally allocated too many reducer containers).





[jira] [Commented] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt

2015-09-10 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739369#comment-14739369
 ] 

Chang Li commented on MAPREDUCE-5002:
-

Thanks [~jlowe] for the review! I have reworked my unit test. Please help review the 
updated patch. Thanks!

> AM could potentially allocate a reduce container to a map attempt
> -
>
> Key: MAPREDUCE-5002
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.3-alpha, 0.23.7, 2.7.0
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, 
> MAPREDUCE-5002.2.patch, MAPREDUCE-5002.3.patch, MAPREDUCE-5002.4.patch
>
>
> As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically 
> possible for the AM to accidentally assign a reducer container to a map 
> attempt if the AM doesn't find a reduce attempt actively looking for the 
> container (e.g.: the RM accidentally allocated too many reducer containers).





[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt

2015-09-10 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5002:

Attachment: MAPREDUCE-5002.4.patch

There was a debugging print in the .3 patch that I forgot to delete. I uploaded 
the .4 patch to fix that.

> AM could potentially allocate a reduce container to a map attempt
> -
>
> Key: MAPREDUCE-5002
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.3-alpha, 0.23.7, 2.7.0
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, 
> MAPREDUCE-5002.2.patch, MAPREDUCE-5002.3.patch, MAPREDUCE-5002.4.patch
>
>
> As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically 
> possible for the AM to accidentally assign a reducer container to a map 
> attempt if the AM doesn't find a reduce attempt actively looking for the 
> container (e.g.: the RM accidentally allocated too many reducer containers).





[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt

2015-09-10 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5002:

Attachment: MAPREDUCE-5002.3.patch

> AM could potentially allocate a reduce container to a map attempt
> -
>
> Key: MAPREDUCE-5002
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.3-alpha, 0.23.7, 2.7.0
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, 
> MAPREDUCE-5002.2.patch, MAPREDUCE-5002.3.patch
>
>
> As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically 
> possible for the AM to accidentally assign a reducer container to a map 
> attempt if the AM doesn't find a reduce attempt actively looking for the 
> container (e.g.: the RM accidentally allocated too many reducer containers).





[jira] [Commented] (MAPREDUCE-6442) Stack trace missing for client protocol provider creation error

2015-09-03 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14729241#comment-14729241
 ] 

Chang Li commented on MAPREDUCE-6442:
-

Hi [~ozawa], thanks for the review. Usually a log-only change doesn't need a 
unit test. Also, in my change
{code}
catch (Exception e) {
  LOG.info("Failed to use " + provider.getClass().getName()
  + " due to error: ", e);
}
{code}
the exception is caught and the error message is merely logged. The exception 
is not rethrown. Do you have any recommendation for how to test that in a unit test?
Moreover, I have manually tested this change. I verified that the stack trace is 
printed, and I posted the stack trace in the comment above.
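
One possible way to check a log-only change like this in a test is to attach a capturing handler to the logger and inspect what was logged. The sketch below uses `java.util.logging` from the JDK as a stand-in for the `LOG` object in the snippet above, which is an assumption; `ProviderLoader`, `CapturingHandler`, and `ProviderLoaderCheck` are hypothetical names, not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical stand-in for the code under test; not the Hadoop Cluster class.
final class ProviderLoader {
  static final Logger LOG = Logger.getLogger(ProviderLoader.class.getName());

  // Tries to create a provider. On failure, logs the exception object itself
  // (so the stack trace is available) and returns null instead of rethrowing.
  static Object tryCreate(String providerName, Supplier<Object> factory) {
    try {
      return factory.get();
    } catch (Exception e) {
      LOG.log(Level.INFO, "Failed to use " + providerName + " due to error: ", e);
      return null;
    }
  }
}

// Test helper: collects every LogRecord published to the logger.
final class CapturingHandler extends Handler {
  final List<LogRecord> records = new ArrayList<>();

  @Override public void publish(LogRecord record) { records.add(record); }
  @Override public void flush() { }
  @Override public void close() { }
}

final class ProviderLoaderCheck {
  // Runs the failing path and returns the Throwable attached to the first
  // captured log record, or null if nothing was logged.
  static Throwable captureLoggedThrowable() {
    CapturingHandler handler = new CapturingHandler();
    ProviderLoader.LOG.addHandler(handler);
    try {
      ProviderLoader.tryCreate("ExampleProvider", () -> {
        throw new NullPointerException("conf is null");
      });
    } finally {
      ProviderLoader.LOG.removeHandler(handler);
    }
    return handler.records.isEmpty() ? null : handler.records.get(0).getThrown();
  }

  public static void main(String[] args) {
    System.out.println("Logged throwable: " + captureLoggedThrowable());
  }
}
```

A test can then assert that the captured record carries the exception, rather than only a message string. Whether something similar is practical with the logging setup used in the real code is not something this thread settles.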

> Stack trace missing for client protocol provider creation error
> ---
>
> Key: MAPREDUCE-6442
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6442
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6442.2.patch, MAPREDUCE-6442.patch
>
>
> When provider creation fails, dump the stack trace rather than just printing 
> out the message.





[jira] [Commented] (MAPREDUCE-6442) Stack trace missing for client protocol provider creation error

2015-09-02 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727836#comment-14727836
 ] 

Chang Li commented on MAPREDUCE-6442:
-

[~jlowe] thanks for the review! I tested by passing a null conf to initialize the 
cluster, and the log showed the stack trace:
{code}
Failed to use org.apache.hadoop.mapred.YarnClientProtocolProvider due to error:
java.lang.NullPointerException
at org.apache.hadoop.mapred.YarnClientProtocolProvider.create(YarnClientProtocolProvider.java:33)
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:95)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
{code}

> Stack trace missing for client protocol provider creation error
> ---
>
> Key: MAPREDUCE-6442
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6442
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6442.2.patch, MAPREDUCE-6442.patch
>
>
> When provider creation fails, dump the stack trace rather than just printing 
> out the message.





[jira] [Updated] (MAPREDUCE-6442) Stack trace missing for client protocol provider creation error

2015-08-03 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6442:

Status: Patch Available  (was: Open)

> Stack trace missing for client protocol provider creation error
> ---
>
> Key: MAPREDUCE-6442
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6442
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6442.2.patch, MAPREDUCE-6442.patch
>
>
> When provider creation fails, dump the stack trace rather than just printing 
> out the message.





[jira] [Updated] (MAPREDUCE-6442) Stack trace missing for client protocol provider creation error

2015-08-03 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6442:

Attachment: MAPREDUCE-6442.2.patch

[~jlowe] thanks for review. Updated my patch. 

> Stack trace missing for client protocol provider creation error
> ---
>
> Key: MAPREDUCE-6442
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6442
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6442.2.patch, MAPREDUCE-6442.patch
>
>
> When provider creation fails, dump the stack trace rather than just printing 
> out the message.





[jira] [Commented] (MAPREDUCE-6442) when provider creation fail dump the stack trace

2015-07-31 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649933#comment-14649933
 ] 

Chang Li commented on MAPREDUCE-6442:
-

[~jlowe] please help review. Thanks!

> when provider creation fail dump the stack trace
> 
>
> Key: MAPREDUCE-6442
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6442
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6442.patch
>
>
> When provider creation fails, dump the stack trace rather than just printing 
> out the message.





[jira] [Updated] (MAPREDUCE-6442) when provider creation fail dump the stack trace

2015-07-31 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6442:

Attachment: MAPREDUCE-6442.patch

> when provider creation fail dump the stack trace
> 
>
> Key: MAPREDUCE-6442
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6442
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6442.patch
>
>
> When provider creation fails, dump the stack trace rather than just printing 
> out the message.





[jira] [Created] (MAPREDUCE-6442) when provider creation fail dump the stack trace

2015-07-31 Thread Chang Li (JIRA)
Chang Li created MAPREDUCE-6442:
---

 Summary: when provider creation fail dump the stack trace
 Key: MAPREDUCE-6442
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6442
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Chang Li
Assignee: Chang Li


When provider creation fails, dump the stack trace rather than just printing out 
the message.
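
To illustrate the difference this request is about, here is a minimal sketch using only the JDK. `StackTraceDemo` and the message prefix are made up for illustration; the real code logs through its own logger.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Illustration only; not code from the patch.
final class StackTraceDemo {
  // Message-only rendering: loses the exception type and where it happened.
  static String messageOnly(Throwable t) {
    return "Failed to use provider due to error: " + t.getMessage();
  }

  // Full rendering: includes the exception class and the stack frames.
  static String withStackTrace(Throwable t) {
    StringWriter buffer = new StringWriter();
    t.printStackTrace(new PrintWriter(buffer, true));
    return "Failed to use provider due to error: " + buffer;
  }

  public static void main(String[] args) {
    Throwable failure = new NullPointerException();
    System.out.println(messageOnly(failure));
    System.out.println(withStackTrace(failure));
  }
}
```

For an exception created without a message, such as the `NullPointerException` here, the message-only form ends in `null`, which says nothing about the cause. The stack trace form at least names the exception class and the failing frames.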





[jira] [Commented] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear

2015-07-31 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649778#comment-14649778
 ] 

Chang Li commented on MAPREDUCE-5982:
-

[~jlowe] please help review.

> Task attempts that fail from the ASSIGNED state can disappear
> -
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 0.23.10, 2.2.1, 2.7.1
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, 
> MAPREDUCE-5982.4.patch, MAPREDUCE-5982.patch
>
>
> If a task attempt fails in the ASSIGNED state, e.g.: container launch fails,  
> then it can disappear from the job history.  The task overview page will show 
> subsequent attempts but the attempt in question is simply missing.  For 
> example attempt ID 1 appears but the attempt ID 0 is missing.  Similarly in 
> the job overview page the task attempt doesn't appear in any of the 
> failed/killed/succeeded counts or pages.  It's as if the task attempt never 
> existed, but the AM logs show otherwise.





[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear

2015-07-31 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5982:

Attachment: MAPREDUCE-5982.4.patch

> Task attempts that fail from the ASSIGNED state can disappear
> -
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 0.23.10, 2.2.1, 2.7.1
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, 
> MAPREDUCE-5982.4.patch, MAPREDUCE-5982.patch
>
>
> If a task attempt fails in the ASSIGNED state, e.g.: container launch fails,  
> then it can disappear from the job history.  The task overview page will show 
> subsequent attempts but the attempt in question is simply missing.  For 
> example attempt ID 1 appears but the attempt ID 0 is missing.  Similarly in 
> the job overview page the task attempt doesn't appear in any of the 
> failed/killed/succeeded counts or pages.  It's as if the task attempt never 
> existed, but the AM logs show otherwise.





[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear

2015-07-31 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5982:

Attachment: MAPREDUCE-5982.3.patch

checkstyle fix

> Task attempts that fail from the ASSIGNED state can disappear
> -
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 0.23.10, 2.2.1, 2.7.1
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, 
> MAPREDUCE-5982.patch
>
>
> If a task attempt fails in the ASSIGNED state, e.g.: container launch fails,  
> then it can disappear from the job history.  The task overview page will show 
> subsequent attempts but the attempt in question is simply missing.  For 
> example attempt ID 1 appears but the attempt ID 0 is missing.  Similarly in 
> the job overview page the task attempt doesn't appear in any of the 
> failed/killed/succeeded counts or pages.  It's as if the task attempt never 
> existed, but the AM logs show otherwise.





[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear

2015-07-30 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5982:

Attachment: MAPREDUCE-5982.2.patch

whitespace fix

> Task attempts that fail from the ASSIGNED state can disappear
> -
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 0.23.10, 2.2.1, 2.7.1
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.patch
>
>
> If a task attempt fails in the ASSIGNED state, e.g.: container launch fails,  
> then it can disappear from the job history.  The task overview page will show 
> subsequent attempts but the attempt in question is simply missing.  For 
> example attempt ID 1 appears but the attempt ID 0 is missing.  Similarly in 
> the job overview page the task attempt doesn't appear in any of the 
> failed/killed/succeeded counts or pages.  It's as if the task attempt never 
> existed, but the AM logs show otherwise.





[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear

2015-07-30 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5982:

Affects Version/s: 2.7.1

> Task attempts that fail from the ASSIGNED state can disappear
> -
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 0.23.10, 2.2.1, 2.7.1
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5982.patch
>
>
> If a task attempt fails in the ASSIGNED state, e.g.: container launch fails,  
> then it can disappear from the job history.  The task overview page will show 
> subsequent attempts but the attempt in question is simply missing.  For 
> example attempt ID 1 appears but the attempt ID 0 is missing.  Similarly in 
> the job overview page the task attempt doesn't appear in any of the 
> failed/killed/succeeded counts or pages.  It's as if the task attempt never 
> existed, but the AM logs show otherwise.





[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear

2015-07-30 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5982:

Attachment: MAPREDUCE-5982.patch

> Task attempts that fail from the ASSIGNED state can disappear
> -
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 0.23.10, 2.2.1
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5982.patch
>
>
> If a task attempt fails in the ASSIGNED state, e.g.: container launch fails,  
> then it can disappear from the job history.  The task overview page will show 
> subsequent attempts but the attempt in question is simply missing.  For 
> example attempt ID 1 appears but the attempt ID 0 is missing.  Similarly in 
> the job overview page the task attempt doesn't appear in any of the 
> failed/killed/succeeded counts or pages.  It's as if the task attempt never 
> existed, but the AM logs show otherwise.





[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear

2015-07-30 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5982:

Status: Patch Available  (was: Open)

> Task attempts that fail from the ASSIGNED state can disappear
> -
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.2.1, 0.23.10
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5982.patch
>
>
> If a task attempt fails in the ASSIGNED state, e.g.: container launch fails,  
> then it can disappear from the job history.  The task overview page will show 
> subsequent attempts but the attempt in question is simply missing.  For 
> example attempt ID 1 appears but the attempt ID 0 is missing.  Similarly in 
> the job overview page the task attempt doesn't appear in any of the 
> failed/killed/succeeded counts or pages.  It's as if the task attempt never 
> existed, but the AM logs show otherwise.





[jira] [Assigned] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear

2015-07-30 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li reassigned MAPREDUCE-5982:
---

Assignee: Chang Li

> Task attempts that fail from the ASSIGNED state can disappear
> -
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 0.23.10, 2.2.1
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5982.patch
>
>
> If a task attempt fails in the ASSIGNED state, e.g.: container launch fails,  
> then it can disappear from the job history.  The task overview page will show 
> subsequent attempts but the attempt in question is simply missing.  For 
> example attempt ID 1 appears but the attempt ID 0 is missing.  Similarly in 
> the job overview page the task attempt doesn't appear in any of the 
> failed/killed/succeeded counts or pages.  It's as if the task attempt never 
> existed, but the AM logs show otherwise.





[jira] [Updated] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure

2015-06-29 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6384:

Attachment: MAPREDUCE-6384.10.patch

Thanks [~jlowe] for the further review! I have fixed the typo and replaced all 
the tabs in testFetchFailureAttemptFinishTime() with spaces.

> add the last reporting reducer host info and attempt id on the map error 
> message due to too many fetch failure
> --
>
> Key: MAPREDUCE-6384
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6384.10.patch, MAPREDUCE-6384.2.patch, 
> MAPREDUCE-6384.3.patch, MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, 
> MAPREDUCE-6384.5.patch, MAPREDUCE-6384.6.patch, MAPREDUCE-6384.7.patch, 
> MAPREDUCE-6384.8.patch, MAPREDUCE-6384.9.patch, MAPREDUCE-6384.9.patch
>
>






[jira] [Updated] (MAPREDUCE-6420) Interrupted Exception in LocalContainerLauncher should be logged in warn/info level

2015-06-29 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6420:

Status: Patch Available  (was: Open)

[~knoguchi], [~jlowe] please help review. Thanks!

> Interrupted Exception in LocalContainerLauncher should be logged in warn/info 
> level
> ---
>
> Key: MAPREDUCE-6420
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6420
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6420.1.patch
>
>
> Interrupted Exception in LocalContainerLauncher should be logged at warn/info 
> level instead of error because it won't fail the job. Otherwise it will cause 
> confusion during debugging.





[jira] [Updated] (MAPREDUCE-6420) Interrupted Exception in LocalContainerLauncher should be logged in warn/info level

2015-06-29 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6420:

Attachment: MAPREDUCE-6420.1.patch

> Interrupted Exception in LocalContainerLauncher should be logged in warn/info 
> level
> ---
>
> Key: MAPREDUCE-6420
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6420
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6420.1.patch
>
>
> Interrupted Exception in LocalContainerLauncher should be logged at warn/info 
> level instead of error because it won't fail the job. Otherwise it will cause 
> confusion during debugging.





[jira] [Created] (MAPREDUCE-6420) Interrupted Exception in LocalContainerLauncher should be logged in warn/info level

2015-06-29 Thread Chang Li (JIRA)
Chang Li created MAPREDUCE-6420:
---

 Summary: Interrupted Exception in LocalContainerLauncher should be 
logged in warn/info level
 Key: MAPREDUCE-6420
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6420
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Chang Li
Assignee: Chang Li


Interrupted Exception in LocalContainerLauncher should be logged at warn/info 
level instead of error because it won't fail the job.





[jira] [Updated] (MAPREDUCE-6420) Interrupted Exception in LocalContainerLauncher should be logged in warn/info level

2015-06-29 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6420:

Description: Interrupted Exception in LocalContainerLauncher should be 
logged at warn/info level instead of error because it won't fail the job. 
Otherwise it will cause confusion during debugging  (was: Interrupted 
Exception in LocalContainerLauncher should be logged at warn/info level instead 
of error because it won't fail the job)

> Interrupted Exception in LocalContainerLauncher should be logged in warn/info 
> level
> ---
>
> Key: MAPREDUCE-6420
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6420
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
>
> Interrupted Exception in LocalContainerLauncher should be logged at warn/info 
> level instead of error because it won't fail the job. Otherwise it will cause 
> confusion during debugging.





[jira] [Updated] (MAPREDUCE-6418) MRApp should not shutdown LogManager during shutdown

2015-06-29 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6418:

Attachment: MAPREDUCE-6418.4.patch

The .4 patch fixes some whitespace issues I just noticed.

> MRApp should not shutdown LogManager during shutdown
> 
>
> Key: MAPREDUCE-6418
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6418
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6418.1.patch, MAPREDUCE-6418.2.patch, 
> MAPREDUCE-6418.3.patch, MAPREDUCE-6418.4.patch
>
>
> Tests in TestRecovery.java lose their logs after recovery because of the 
> change in MAPREDUCE-5694. MRApp should override that change so that logs 
> written after AM recovery are shown.





[jira] [Updated] (MAPREDUCE-6418) MRApp should not shutdown LogManager during shutdown

2015-06-29 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6418:

Attachment: MAPREDUCE-6418.3.patch

Thanks [~jlowe] for the review and the valuable suggestions! I just updated my 
patch to improve the comments and move the comment inside the function body.

> MRApp should not shutdown LogManager during shutdown
> 
>
> Key: MAPREDUCE-6418
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6418
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6418.1.patch, MAPREDUCE-6418.2.patch, 
> MAPREDUCE-6418.3.patch
>
>
> Tests in TestRecovery.java lose their logs after recovery because of the 
> change in MAPREDUCE-5694. MRApp should override that change so that logs 
> written after AM recovery are shown.





[jira] [Commented] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-06-29 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606139#comment-14606139
 ] 

Chang Li commented on MAPREDUCE-5003:
-

[~jlowe] thanks for the review! I have updated my patch. Could you please review 
the latest patch? Thanks!

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, 
> MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, 
> MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate they are no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.





[jira] [Updated] (MAPREDUCE-6418) MRApp should not shutdown LogManager during shutdown

2015-06-29 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6418:

Attachment: MAPREDUCE-6418.2.patch

[~jlowe] thanks for the review! I just updated my patch according to your 
suggestions.

> MRApp should not shutdown LogManager during shutdown
> 
>
> Key: MAPREDUCE-6418
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6418
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6418.1.patch, MAPREDUCE-6418.2.patch
>
>
> Tests in TestRecovery.java lose their logs after recovery because of the 
> change in MAPREDUCE-5694. MRApp should override that change so that logs 
> written after AM recovery are shown.





[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-06-29 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5003:

Attachment: MAPREDUCE-5003.6.patch

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, 
> MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, 
> MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate they are no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.





[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-06-29 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5003:

Attachment: MAPREDUCE-5003.5.patch

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, 
> MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, 
> MAPREDUCE-5003.5.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate they are no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.





[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-06-29 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5003:

Attachment: MAPREDUCE-5003.5.patch

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, 
> MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate they are no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.





[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-06-28 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5003:

Status: Patch Available  (was: Open)

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, 
> MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate they are no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.





[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-06-28 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5003:

Attachment: MAPREDUCE-5003.4.patch

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, 
> MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate they are no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.





[jira] [Updated] (MAPREDUCE-6418) MRApp should not shutdown LogManager during shutdown

2015-06-26 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6418:

Status: Patch Available  (was: Open)

[~jlowe] please help review. Thanks! 

> MRApp should not shutdown LogManager during shutdown
> 
>
> Key: MAPREDUCE-6418
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6418
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6418.1.patch
>
>
> Tests in TestRecovery.java lose their logs after recovery because of the 
> change in MAPREDUCE-5694. MRApp should override that change so that logs 
> written after AM recovery are shown.





[jira] [Updated] (MAPREDUCE-6418) MRApp should not shutdown LogManager during shutdown

2015-06-26 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6418:

Attachment: MAPREDUCE-6418.1.patch

> MRApp should not shutdown LogManager during shutdown
> 
>
> Key: MAPREDUCE-6418
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6418
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6418.1.patch
>
>
> Tests in TestRecovery.java lose their logs after recovery because of the 
> change in MAPREDUCE-5694. MRApp should override that change so that logs 
> written after AM recovery are shown.





[jira] [Created] (MAPREDUCE-6418) MRApp should not shutdown LogManager during shutdown

2015-06-26 Thread Chang Li (JIRA)
Chang Li created MAPREDUCE-6418:
---

 Summary: MRApp should not shutdown LogManager during shutdown
 Key: MAPREDUCE-6418
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6418
 Project: Hadoop Map/Reduce
  Issue Type: Bug
 Environment: Tests in TestRecovery.java lose their logs after recovery 
because of the change in MAPREDUCE-5694. MRApp should override that change so 
that logs written after AM recovery are shown.
Reporter: Chang Li
Assignee: Chang Li








[jira] [Updated] (MAPREDUCE-6418) MRApp should not shutdown LogManager during shutdown

2015-06-26 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6418:

Description: Tests in TestRecovery.java lose their logs after recovery because 
of the change in MAPREDUCE-5694. MRApp should override that change so that logs 
written after AM recovery are shown.

> MRApp should not shutdown LogManager during shutdown
> 
>
> Key: MAPREDUCE-6418
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6418
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
>
> Tests in TestRecovery.java lose their logs after recovery because of the 
> change in MAPREDUCE-5694. MRApp should override that change so that logs 
> written after AM recovery are shown.





[jira] [Updated] (MAPREDUCE-6418) MRApp should not shutdown LogManager during shutdown

2015-06-26 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6418:

Environment: (was: Tests in TestRecovery.java lose their logs after 
recovery because of the change in MAPREDUCE-5694. MRApp should override that 
change so that logs written after AM recovery are shown.)

> MRApp should not shutdown LogManager during shutdown
> 
>
> Key: MAPREDUCE-6418
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6418
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
>






[jira] [Commented] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure

2015-06-26 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603413#comment-14603413
 ] 

Chang Li commented on MAPREDUCE-6384:
-

[~jlowe],  please help review the latest patch. Thanks!

> add the last reporting reducer host info and attempt id on the map error 
> message due to too many fetch failure
> --
>
> Key: MAPREDUCE-6384
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, 
> MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, MAPREDUCE-6384.5.patch, 
> MAPREDUCE-6384.6.patch, MAPREDUCE-6384.7.patch, MAPREDUCE-6384.8.patch, 
> MAPREDUCE-6384.9.patch, MAPREDUCE-6384.9.patch
>
>






[jira] [Updated] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure

2015-06-26 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6384:

Attachment: MAPREDUCE-6384.9.patch

> add the last reporting reducer host info and attempt id on the map error 
> message due to too many fetch failure
> --
>
> Key: MAPREDUCE-6384
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, 
> MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, MAPREDUCE-6384.5.patch, 
> MAPREDUCE-6384.6.patch, MAPREDUCE-6384.7.patch, MAPREDUCE-6384.8.patch, 
> MAPREDUCE-6384.9.patch, MAPREDUCE-6384.9.patch
>
>






[jira] [Updated] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure

2015-06-26 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6384:

Attachment: MAPREDUCE-6384.9.patch

whitespace fix

> add the last reporting reducer host info and attempt id on the map error 
> message due to too many fetch failure
> --
>
> Key: MAPREDUCE-6384
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, 
> MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, MAPREDUCE-6384.5.patch, 
> MAPREDUCE-6384.6.patch, MAPREDUCE-6384.7.patch, MAPREDUCE-6384.8.patch, 
> MAPREDUCE-6384.9.patch
>
>






[jira] [Updated] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure

2015-06-26 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6384:

Attachment: MAPREDUCE-6384.8.patch

[~jlowe] thanks for the review! I have updated my patch according to your suggestion.

> add the last reporting reducer host info and attempt id on the map error 
> message due to too many fetch failure
> --
>
> Key: MAPREDUCE-6384
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, 
> MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, MAPREDUCE-6384.5.patch, 
> MAPREDUCE-6384.6.patch, MAPREDUCE-6384.7.patch, MAPREDUCE-6384.8.patch
>
>






[jira] [Updated] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure

2015-06-22 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6384:

Attachment: MAPREDUCE-6384.7.patch

whitespace fix

> add the last reporting reducer host info and attempt id on the map error 
> message due to too many fetch failure
> --
>
> Key: MAPREDUCE-6384
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, 
> MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, MAPREDUCE-6384.5.patch, 
> MAPREDUCE-6384.6.patch, MAPREDUCE-6384.7.patch
>
>






[jira] [Commented] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt

2015-06-22 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596589#comment-14596589
 ] 

Chang Li commented on MAPREDUCE-5002:
-

[~jlowe] could you please help review? Thanks!

> AM could potentially allocate a reduce container to a map attempt
> -
>
> Key: MAPREDUCE-5002
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.3-alpha, 0.23.7, 2.7.0
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, 
> MAPREDUCE-5002.2.patch
>
>
> As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically 
> possible for the AM to accidentally assign a reducer container to a map 
> attempt if the AM doesn't find a reduce attempt actively looking for the 
> container (e.g.: the RM accidentally allocated too many reducer containers).





[jira] [Commented] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure

2015-06-22 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596586#comment-14596586
 ] 

Chang Li commented on MAPREDUCE-6384:
-

Thanks [~zxu] for the review, and sorry about the delay in responding. I have 
fixed the indentation issue in my patch and those two whitespace issues.

> add the last reporting reducer host info and attempt id on the map error 
> message due to too many fetch failure
> --
>
> Key: MAPREDUCE-6384
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, 
> MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, MAPREDUCE-6384.5.patch, 
> MAPREDUCE-6384.6.patch
>
>






[jira] [Updated] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure

2015-06-22 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6384:

Attachment: MAPREDUCE-6384.6.patch

> add the last reporting reducer host info and attempt id on the map error 
> message due to too many fetch failure
> --
>
> Key: MAPREDUCE-6384
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, 
> MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, MAPREDUCE-6384.5.patch, 
> MAPREDUCE-6384.6.patch
>
>






[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt

2015-06-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5002:

Affects Version/s: 2.7.0

> AM could potentially allocate a reduce container to a map attempt
> -
>
> Key: MAPREDUCE-5002
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.3-alpha, 0.23.7, 2.7.0
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, 
> MAPREDUCE-5002.2.patch
>
>
> As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically 
> possible for the AM to accidentally assign a reducer container to a map 
> attempt if the AM doesn't find a reduce attempt actively looking for the 
> container (e.g.: the RM accidentally allocated too many reducer containers).





[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt

2015-06-18 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5002:

Attachment: MAPREDUCE-5002.2.patch

> AM could potentially allocate a reduce container to a map attempt
> -
>
> Key: MAPREDUCE-5002
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.3-alpha, 0.23.7
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, 
> MAPREDUCE-5002.2.patch
>
>
> As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically 
> possible for the AM to accidentally assign a reducer container to a map 
> attempt if the AM doesn't find a reduce attempt actively looking for the 
> container (e.g.: the RM accidentally allocated too many reducer containers).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt

2015-06-18 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5002:

Attachment: MAPREDUCE-5002.2.patch

> AM could potentially allocate a reduce container to a map attempt
> -
>
> Key: MAPREDUCE-5002
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.3-alpha, 0.23.7
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch
>
>
> As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically 
> possible for the AM to accidentally assign a reducer container to a map 
> attempt if the AM doesn't find a reduce attempt actively looking for the 
> container (e.g.: the RM accidentally allocated too many reducer containers).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt

2015-06-17 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5002:

Status: Patch Available  (was: Open)

> AM could potentially allocate a reduce container to a map attempt
> -
>
> Key: MAPREDUCE-5002
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 0.23.7, 2.0.3-alpha
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5002.1.patch
>
>
> As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically 
> possible for the AM to accidentally assign a reducer container to a map 
> attempt if the AM doesn't find a reduce attempt actively looking for the 
> container (e.g.: the RM accidentally allocated too many reducer containers).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt

2015-06-17 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5002:

Attachment: MAPREDUCE-5002.1.patch

> AM could potentially allocate a reduce container to a map attempt
> -
>
> Key: MAPREDUCE-5002
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.3-alpha, 0.23.7
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5002.1.patch
>
>
> As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically 
> possible for the AM to accidentally assign a reducer container to a map 
> attempt if the AM doesn't find a reduce attempt actively looking for the 
> container (e.g.: the RM accidentally allocated too many reducer containers).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt

2015-06-17 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li reassigned MAPREDUCE-5002:
---

Assignee: Chang Li

> AM could potentially allocate a reduce container to a map attempt
> -
>
> Key: MAPREDUCE-5002
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.3-alpha, 0.23.7
>Reporter: Jason Lowe
>Assignee: Chang Li
>
> As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically 
> possible for the AM to accidentally assign a reducer container to a map 
> attempt if the AM doesn't find a reduce attempt actively looking for the 
> container (e.g.: the RM accidentally allocated too many reducer containers).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-06-12 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583815#comment-14583815
 ] 

Chang Li commented on MAPREDUCE-5003:
-

[~jlowe] could you please help review? Thanks!

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, 
> MAPREDUCE-5003.3.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate it is no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-06-12 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5003:

Attachment: MAPREDUCE-5003.3.patch

checkstyle fix

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, 
> MAPREDUCE-5003.3.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate it is no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-06-12 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5003:

Attachment: MAPREDUCE-5003.2.patch

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate it is no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-06-10 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5003:

Status: Patch Available  (was: Open)

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate it is no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-06-10 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5003:

Attachment: MAPREDUCE-5003.1.patch

The AM hanging issue mentioned in MAPREDUCE-4992 no longer exists in the 
current recovery implementation. Moreover, the current recovery implementation 
recovers incomplete task attempts to the killed state, so we no longer need to 
remove those incomplete task attempts during recovery. 

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate it is no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-06-10 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li reassigned MAPREDUCE-5003:
---

Assignee: Chang Li

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate it is no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure

2015-06-05 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6384:

Attachment: MAPREDUCE-6384.5.patch

Hi [~zxu], thanks for the review and good catch! I just updated my patch 
according to your suggestion. 

> add the last reporting reducer host info and attempt id on the map error 
> message due to too many fetch failure
> --
>
> Key: MAPREDUCE-6384
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, 
> MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, MAPREDUCE-6384.5.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6354) shuffle handler should log connection info

2015-06-04 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6354:

Attachment: MAPREDUCE-6354.8.patch

> shuffle handler should log connection info
> --
>
> Key: MAPREDUCE-6354
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6354
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6354.2.patch, MAPREDUCE-6354.3.patch, 
> MAPREDUCE-6354.4.patch, MAPREDUCE-6354.5.patch, MAPREDUCE-6354.6.patch, 
> MAPREDUCE-6354.7.patch, MAPREDUCE-6354.8.patch, MAPREDUCE-6354.patch
>
>
> Currently, the shuffle handler only logs connection info in debug mode; we 
> want to log that info in a more concise way.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure

2015-06-04 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6384:

Attachment: MAPREDUCE-6384.4.2.patch

> add the last reporting reducer host info and attempt id on the map error 
> message due to too many fetch failure
> --
>
> Key: MAPREDUCE-6384
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, 
> MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6354) shuffle handler should log connection info

2015-06-04 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573047#comment-14573047
 ] 

Chang Li commented on MAPREDUCE-6354:
-

[~jlowe] thanks a lot for the thoughtful review and hearty discussion of how to 
make this logging more efficient! I have made changes to the debug-level 
logging. As for the trace-level logging, I think we could wait until this gets 
committed and file another JIRA to address that issue. Let me know what you 
think of the latest patch. Thanks! 

> shuffle handler should log connection info
> --
>
> Key: MAPREDUCE-6354
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6354
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6354.2.patch, MAPREDUCE-6354.3.patch, 
> MAPREDUCE-6354.4.patch, MAPREDUCE-6354.5.patch, MAPREDUCE-6354.6.patch, 
> MAPREDUCE-6354.7.patch, MAPREDUCE-6354.patch
>
>
> Currently, the shuffle handler only logs connection info in debug mode; we 
> want to log that info in a more concise way.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

