[jira] [Updated] (MAPREDUCE-7259) testSpeculateSuccessfulWithUpdateEvents fails Intermittently

2020-01-28 Thread Jonathan Turner Eagles (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Turner Eagles updated MAPREDUCE-7259:
--
Fix Version/s: 2.10.1
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

+1 on the branch-2.10 patch. Thanks again, [~ahussein]

> testSpeculateSuccessfulWithUpdateEvents fails Intermittently  
> --
>
> Key: MAPREDUCE-7259
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7259
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: MAPREDUCE-7259-branch-2.10.005.patch, 
> MAPREDUCE-7259.001.patch, MAPREDUCE-7259.002.patch, MAPREDUCE-7259.003.patch, 
> MAPREDUCE-7259.004.patch, MAPREDUCE-7259.005.patch
>
>
> {{TestSpeculativeExecutionWithMRApp.testSpeculateSuccessfulWithUpdateEvents}} 
> fails intermittently with the exponential estimator. The failure occurs 
> because an assertion fails while waiting for the MRApp to stop.
> The test case may need to be redesigned, because it is sensitive to races 
> and timing between the speculator and the tasks. It works fine with the 
> legacy estimator because that estimate is based on the start-end rate.
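
One way to read the last sentence of the description, as a hedged illustration only: a start-end rate depends just on the total elapsed time and the overall progress, while an exponentially smoothed rate also depends on when each progress update arrives, so a test that races the speculator can see different estimates from run to run. The class and method names below are hypothetical and do not reproduce Hadoop's estimator classes.

```java
// Hypothetical sketch; not Hadoop's estimator implementation.
final class EstimatorSketch {
  private EstimatorSketch() {}

  /** Start-end rate: extrapolates total runtime from elapsed time and overall progress. */
  static double startEndEstimateMs(long startMs, long nowMs, double progress) {
    // The floor on progress avoids dividing by zero before the first progress report.
    return (nowMs - startMs) / Math.max(0.0001, progress);
  }

  /** Simple exponential smoothing: folds the latest interval's rate into the running rate. */
  static double smoothedRate(double previousRate, double intervalRate, double alpha) {
    return alpha * intervalRate + (1.0 - alpha) * previousRate;
  }

  public static void main(String[] args) {
    // Half of the work done after 1000 ms of elapsed time.
    System.out.println(startEndEstimateMs(0L, 1000L, 0.5));

    // Two runs with the same average interval rate but different update timing.
    double steady = smoothedRate(smoothedRate(0.0005, 0.0005, 0.5), 0.0005, 0.5);
    double bursty = smoothedRate(smoothedRate(0.0005, 0.0001, 0.5), 0.0009, 0.5);
    System.out.println(steady + " vs " + bursty);
  }
}
```

The smoothed value weights recent intervals more heavily, which is why the order and timing of updates can change the result even when the overall progress is the same.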



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7259) testSpeculateSuccessfulWithUpdateEvents fails Intermittently

2020-01-28 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025465#comment-17025465
 ] 

Hadoop QA commented on MAPREDUCE-7259:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} branch-2.10 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 55s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 51s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 25s{color} | {color:green} branch-2.10 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 11s{color} | {color:green} branch-2.10 passed with JDK v1.8.0_232 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 9s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 15s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} branch-2.10 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green} branch-2.10 passed with JDK v1.8.0_232 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s{color} | {color:green} the patch passed with JDK v1.8.0_232 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} hadoop-mapreduce-project/hadoop-mapreduce-client: The patch generated 0 new + 58 unchanged - 5 fixed = 58 total (was 63) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} the patch passed with JDK v1.8.0_232 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 42s{color} | {color:green} hadoop-mapreduce-client-app in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 116m 0s{color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 39s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 163m 49s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:a969cad0a12 |
| JIRA Issue | MAPREDUCE-7259 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12992058/MAPREDUCE-7259-branch-2.10.005.patch |
| Optional Tests |  dupname  

[jira] [Commented] (MAPREDUCE-7079) JobHistory#ServiceStop implementation is incorrect

2020-01-28 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025417#comment-17025417
 ] 

Ahmed Hussein commented on MAPREDUCE-7079:
--

Thanks [~epayne]!

> JobHistory#ServiceStop implementation is incorrect
> --
>
> Key: MAPREDUCE-7079
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7079
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jason Darrell Lowe
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: 2020-01-10-MRApp-stack-dump.txt, 
> 2020-01-10-org.apache.hadoop.mapred.TestMRIntermediateDataEncryption-version-14.txt,
>  MAPREDUCE-7079.001.patch, MAPREDUCE-7079.002.patch, 
> MAPREDUCE-7079.003.patch, MAPREDUCE-7079.004.patch, MAPREDUCE-7079.005.patch, 
> MAPREDUCE-7079.006.patch, MAPREDUCE-7079.007.patch, MAPREDUCE-7079.008.patch, 
> MAPREDUCE-7079.009.patch, MAPREDUCE-7079.010.patch
>
>
> {{JobHistory.serviceStop}} skips waiting for the thread pool to terminate. 
> The problem is due to an incorrect {{while}} condition that evaluates to 
> false on the first iteration of the loop.
> {code:java}
>   scheduledExecutor.shutdown();
>   boolean interrupted = false;
>   long currentTime = System.currentTimeMillis();
>   while (!scheduledExecutor.isShutdown()
>       && System.currentTimeMillis() > currentTime + 1000l && !interrupted) {
>     try {
>       Thread.sleep(20);
>     } catch (InterruptedException e) {
>       interrupted = true;
>     }
>   }
> {code}
> The expression "{{System.currentTimeMillis() > currentTime + 1000L}}" is 
> false because currentTime was just initialized with 
> {{System.currentTimeMillis()}}. As a result, the thread won't wait until 
> the executor is terminated. Instead, it will force a shutdown immediately.
> *TestMRIntermediateDataEncryption is failing in precommit builds*
> TestMRIntermediateDataEncryption is either timing out or tearing down the JVM 
> which causes the unit tests in jobclient to not pass cleanly during precommit 
> builds. From sample precommit console output, note the lack of a test results 
> line when the test is run:
> {noformat}
> [INFO] Running org.apache.hadoop.mapred.TestSequenceFileInputFormat
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.976 
> s - in org.apache.hadoop.mapred.TestSequenceFileInputFormat
> [INFO] Running org.apache.hadoop.mapred.TestMRIntermediateDataEncryption
> [INFO] Running org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.659 
> s - in org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
> [...]
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 02:14 h
> [INFO] Finished at: 2018-04-12T04:27:06+00:00
> [INFO] Final Memory: 24M/594M
> [INFO] 
> 
> [WARNING] The requested profile "parallel-tests" could not be activated 
> because it does not exist.
> [WARNING] The requested profile "native" could not be activated because it 
> does not exist.
> [WARNING] The requested profile "yarn-ui" could not be activated because it 
> does not exist.
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
> project hadoop-mapreduce-client-jobclient: There was a timeout or other error 
> in the fork -> [Help 1]
> {noformat}
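
A hedged sketch of the waiting logic the description above calls for, not the committed patch: the quoted loop compares the clock in the wrong direction, so it never sleeps. Note also that isShutdown() returns true as soon as shutdown() has been called, so a wait loop needs isTerminated() or awaitTermination() to observe completion. The class and method names are hypothetical; the executor calls are the standard java.util.concurrent API.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; illustrates waiting for termination before forcing a shutdown.
final class GracefulStop {
  private GracefulStop() {}

  /** Returns true if the executor terminated within timeoutMs; otherwise forces a shutdown. */
  static boolean stop(ScheduledExecutorService executor, long timeoutMs) {
    executor.shutdown(); // stop accepting new tasks
    boolean terminated = false;
    try {
      // Blocks until all tasks finish, the timeout elapses, or the thread is interrupted.
      terminated = executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
    }
    if (!terminated) {
      executor.shutdownNow(); // force a shutdown only after the grace period
    }
    return terminated;
  }

  public static void main(String[] args) {
    ScheduledExecutorService executor = Executors.newScheduledThreadPool(1);
    executor.execute(() -> {
      try {
        Thread.sleep(100);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    System.out.println("terminated cleanly: " + stop(executor, 1000L));
  }
}
```

Using awaitTermination avoids the hand-written deadline comparison altogether, which is where the quoted condition went wrong.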






[jira] [Commented] (MAPREDUCE-7079) JobHistory#ServiceStop implementation is incorrect

2020-01-28 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025412#comment-17025412
 ] 

Eric Payne commented on MAPREDUCE-7079:
---

It backports cleanly to branch-2.10.
I'll wait for additional comments and then, if no objections, commit tomorrow.







[jira] [Commented] (MAPREDUCE-7079) JobHistory#ServiceStop implementation is incorrect

2020-01-28 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025405#comment-17025405
 ] 

Eric Payne commented on MAPREDUCE-7079:
---

[~ahussein],
+1. Latest patch LGTM.
I assume we want to pull this back to branch-2.10. I will check to see if it 
comes back cleanly.







[jira] [Updated] (MAPREDUCE-7259) testSpeculateSuccessfulWithUpdateEvents fails Intermittently

2020-01-28 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated MAPREDUCE-7259:
-
Attachment: MAPREDUCE-7259-branch-2.10.005.patch

> testSpeculateSuccessfulWithUpdateEvents fails Intermittently  
> --
>
> Key: MAPREDUCE-7259
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7259
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: MAPREDUCE-7259-branch-2.10.005.patch, 
> MAPREDUCE-7259.001.patch, MAPREDUCE-7259.002.patch, MAPREDUCE-7259.003.patch, 
> MAPREDUCE-7259.004.patch, MAPREDUCE-7259.005.patch
>
>
> {{TestSpeculativeExecutionWithMRApp.testSpeculateSuccessfulWithUpdateEvents}} 
> fails intermittently with the exponential estimator. The failure occurs 
> because an assertion fails while waiting for the MRApp to stop.
> The test case may need to be redesigned, because it is sensitive to races 
> and timing between the speculator and the tasks. It works fine with the 
> legacy estimator because that estimate is based on the start-end rate.






[jira] [Commented] (MAPREDUCE-7259) testSpeculateSuccessfulWithUpdateEvents fails Intermittently

2020-01-28 Thread Jonathan Turner Eagles (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025302#comment-17025302
 ] 

Jonathan Turner Eagles commented on MAPREDUCE-7259:
---

+1. Thanks for fixing this flaky test. Committed to trunk, branch-3.2, and 
branch-3.1. However, a special patch is needed for branch-2.10 as lambda 
expressions are not supported in this branch.
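
To illustrate the constraint, here is a sketch under stated assumptions rather than the actual branch-2.10 patch: a condition that would be a one-line lambda on Java 8 has to be spelled out as an anonymous inner class to compile at Java 7 source level. The class and method names are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical example; the names are illustrative and not taken from the patch.
final class Java7Condition {
  private Java7Condition() {}

  /** Java 7 compatible: the condition is an anonymous inner class, not a lambda. */
  static Callable<Boolean> isDone(final AtomicBoolean done) {
    // On Java 8 and later this could be written as: return () -> done.get();
    return new Callable<Boolean>() {
      @Override
      public Boolean call() {
        return done.get();
      }
    };
  }

  public static void main(String[] args) throws Exception {
    AtomicBoolean done = new AtomicBoolean(false);
    Callable<Boolean> check = isDone(done);
    System.out.println(check.call());
    done.set(true);
    System.out.println(check.call());
  }
}
```

The captured variable is declared final because Java 7 requires that for locals referenced from an inner class.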

> testSpeculateSuccessfulWithUpdateEvents fails Intermittently  
> --
>
> Key: MAPREDUCE-7259
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7259
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: MAPREDUCE-7259.001.patch, MAPREDUCE-7259.002.patch, 
> MAPREDUCE-7259.003.patch, MAPREDUCE-7259.004.patch, MAPREDUCE-7259.005.patch
>
>
> {{TestSpeculativeExecutionWithMRApp.testSpeculateSuccessfulWithUpdateEvents}} 
> fails intermittently with the exponential estimator. The failure occurs 
> because an assertion fails while waiting for the MRApp to stop.
> The test case may need to be redesigned, because it is sensitive to races 
> and timing between the speculator and the tasks. It works fine with the 
> legacy estimator because that estimate is based on the start-end rate.






[jira] [Updated] (MAPREDUCE-7259) testSpeculateSuccessfulWithUpdateEvents fails Intermittently

2020-01-28 Thread Jonathan Turner Eagles (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Turner Eagles updated MAPREDUCE-7259:
--
Fix Version/s: 3.2.2
   3.1.4
   3.3.0

> testSpeculateSuccessfulWithUpdateEvents fails Intermittently  
> --
>
> Key: MAPREDUCE-7259
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7259
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: MAPREDUCE-7259.001.patch, MAPREDUCE-7259.002.patch, 
> MAPREDUCE-7259.003.patch, MAPREDUCE-7259.004.patch, MAPREDUCE-7259.005.patch
>
>
> {{TestSpeculativeExecutionWithMRApp.testSpeculateSuccessfulWithUpdateEvents}} 
> fails intermittently with the exponential estimator. The failure occurs 
> because an assertion fails while waiting for the MRApp to stop.
> The test case may need to be redesigned, because it is sensitive to races 
> and timing between the speculator and the tasks. It works fine with the 
> legacy estimator because that estimate is based on the start-end rate.






[jira] [Commented] (MAPREDUCE-7259) testSpeculateSuccessfulWithUpdateEvents fails Intermittently

2020-01-28 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025301#comment-17025301
 ] 

Hudson commented on MAPREDUCE-7259:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17906 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17906/])
MAPREDUCE-7259. testSpeculateSuccessfulWithUpdateEvents fails (jeagles: rev 
08251538fe2550d9dd86f9daf79994f5b8bdf7fa)
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestSpeculativeExecutionWithMRApp.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java


> testSpeculateSuccessfulWithUpdateEvents fails Intermittently  
> --
>
> Key: MAPREDUCE-7259
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7259
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: MAPREDUCE-7259.001.patch, MAPREDUCE-7259.002.patch, 
> MAPREDUCE-7259.003.patch, MAPREDUCE-7259.004.patch, MAPREDUCE-7259.005.patch
>
>
> {{TestSpeculativeExecutionWithMRApp.testSpeculateSuccessfulWithUpdateEvents}} 
> fails intermittently with the exponential estimator. The failure occurs 
> because an assertion fails while waiting for the MRApp to stop.
> The test case may need to be redesigned, because it is sensitive to races 
> and timing between the speculator and the tasks. It works fine with the 
> legacy estimator because that estimate is based on the start-end rate.






[jira] [Commented] (MAPREDUCE-7262) MRApp helpers block for long intervals (500ms)

2020-01-28 Thread Jonathan Turner Eagles (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025201#comment-17025201
 ] 

Jonathan Turner Eagles commented on MAPREDUCE-7262:
---

+1 on the branch-2.10 patch. Committed that patch to branch-2.10. Thanks again, 
[~ahussein]


[jira] [Updated] (MAPREDUCE-7262) MRApp helpers block for long intervals (500ms)

2020-01-28 Thread Jonathan Turner Eagles (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Turner Eagles updated MAPREDUCE-7262:
--
Fix Version/s: 2.10.1

> MRApp helpers block for long intervals (500ms)
> --
>
> Key: MAPREDUCE-7262
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7262
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: MAPREDUCE-7262-branch-2.10.002.patch, 
> MAPREDUCE-7262-elapsedTimes.pdf, MAPREDUCE-7262.001.patch, 
> MAPREDUCE-7262.002.patch
>
>
> MRApp has a set of methods used as helpers in test cases, such as 
> {{waitForInternalState(TA)}}, {{waitForState(TA)}}, {{waitForState(Job)}}, etc.
> When the condition fails, the thread sleeps for a minimum of 500ms before 
> rechecking the new state of the Job/TA.
> Example:
> {code:java}
>   public void waitForState(Task task, TaskState finalState) throws Exception {
>     int timeoutSecs = 0;
>     TaskReport report = task.getReport();
>     while (!finalState.equals(report.getTaskState()) &&
>         timeoutSecs++ < 20) {
>       System.out.println("Task State for " + task.getID() + " is : "
>           + report.getTaskState() + " Waiting for state : " + finalState
>           + "   progress : " + report.getProgress());
>       report = task.getReport();
>       Thread.sleep(500);
>     }
>     System.out.println("Task State is : " + report.getTaskState());
>     Assert.assertEquals("Task state is not correct (timedout)", finalState,
>         report.getTaskState());
>   }
> {code}
> I suggest reducing the interval from 500ms to 50ms while increasing the 
> number of retries from 20 to 200, which keeps the overall timeout at 10 
> seconds. This will potentially make the test cases run faster. Also, the 
> {{System.out}} calls need to be removed because dumping the current state 
> on every iteration adds no useful information.
> A tentative list of JUnit tests affected by the change:
> {code:bash}
> Method
>     waitForInternalState(JobImpl, JobStateInternal)
> Found usages  (12 usages found)
>     org.apache.hadoop.mapreduce.v2.app  (10 usages found)
>         TestJobEndNotifier  (3 usages found)
>             testNotificationOnLastRetry(boolean)  (1 usage found)
>                 214 app.waitForInternalState(job, JobStateInternal.SUCCEEDED);
>             testAbsentNotificationOnNotLastRetryUnregistrationFailure()  (1 usage found)
>                 256 app.waitForInternalState(job, JobStateInternal.REBOOT);
>             testNotificationOnLastRetryUnregistrationFailure()  (1 usage found)
>                 289 app.waitForInternalState(job, JobStateInternal.REBOOT);
>         TestKill  (5 usages found)
>             testKillJob()  (1 usage found)
>                 70 app.waitForInternalState((JobImpl) job, JobStateInternal.RUNNING);
>             testKillTask()  (1 usage found)
>                 108 app.waitForInternalState((JobImpl) job, JobStateInternal.RUNNING);
>             testKillTaskWait()  (1 usage found)
>                 219 app.waitForInternalState((JobImpl) job, JobStateInternal.KILLED);
>             testKillTaskWaitKillJobAfterTA_DONE()  (1 usage found)
>                 266 app.waitForInternalState((JobImpl)job, JobStateInternal.KILLED);
>             testKillTaskWaitKillJobBeforeTA_DONE()  (1 usage found)
>                 316 app.waitForInternalState((JobImpl)job, JobStateInternal.KILLED);
>         TestMRApp  (2 usages found)
>             testJobSuccess()  (1 usage found)
>                 494 app.waitForInternalState(job, JobStateInternal.SUCCEEDED);
>             testJobRebootOnLastRetryOnUnregistrationFailure()  (1 usage found)
>                 542 app.waitForInternalState((JobImpl) job, JobStateInternal.REBOOT);
>     org.apache.hadoop.mapreduce.v2.app.rm  (2 usages found)
>         TestRMContainerAllocator  (2 usages found)
>             testReportedAppProgress()  (1 usage found)
>                 1050 mrApp.waitForInternalState((JobImpl) job, JobStateInternal.RUNNING);
>             testReportedAppProgressWithOnlyMaps()  (1 usage found)
>                 1202 mrApp.waitForInternalState((JobImpl)job, JobStateInternal.RUNNING);
> --
> Method
>     waitForState(TaskAttempt, TaskAttemptState)
> Found usages  (72 usages found)
>     org.apache.hadoop.mapreduce.v2  (2 usages found)
>         TestSpeculativeExecutionWithMRApp  (2 usages found)
>             testSpeculateSuccessfulWithoutUpdateEvents()  (1 usage found)
>                 212 app.waitForState(taskAttempt.getValue(), TaskAttemptState.SUCCEEDED);
>
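
The polling change proposed in the MAPREDUCE-7262 description above could be sketched roughly as follows. This is a minimal, hypothetical helper, not MRApp's code or the committed patch: the class name, method name, and signature are assumptions, and the condition is passed as a Callable so the same loop can serve any state check.

```java
import java.util.concurrent.Callable;

// Hypothetical polling helper; names and signature are illustrative, not MRApp's API.
final class PollingWait {
  private PollingWait() {}

  /**
   * Checks the condition, sleeping intervalMs after each failed check,
   * for at most maxRetries sleeps. Returns whether the condition was met.
   */
  static boolean waitFor(Callable<Boolean> condition, long intervalMs, int maxRetries)
      throws Exception {
    for (int attempt = 0; attempt < maxRetries; attempt++) {
      if (condition.call()) {
        return true;
      }
      Thread.sleep(intervalMs);
    }
    return condition.call(); // final check after the last sleep
  }

  public static void main(String[] args) throws Exception {
    final long readyAt = System.currentTimeMillis() + 120L;
    boolean reached = waitFor(new Callable<Boolean>() {
      @Override
      public Boolean call() {
        return System.currentTimeMillis() >= readyAt;
      }
    }, 50L, 200);
    System.out.println("condition reached: " + reached);
  }
}
```

With a 50ms interval and 200 retries the worst-case wait stays at about 10 seconds, but a state change is noticed within roughly 50ms instead of 500ms.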