[jira] [Commented] (YARN-5436) Race in AsyncDispatcher can cause random test failures in Tez (probably YARN also)

2016-07-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398491#comment-15398491
 ] 

Hudson commented on YARN-5436:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10175 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10175/])
YARN-5436. Race in AsyncDispatcher can cause random test failures in Tez 
(gtcarrera9: rev 7086fc72eebc41fd174d91839ed703c014aac920)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java


> Race in AsyncDispatcher can cause random test failures in Tez (probably YARN 
> also)
> --
>
> Key: YARN-5436
> URL: https://issues.apache.org/jira/browse/YARN-5436
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Zhiyuan Yang
> Assignee: Zhiyuan Yang
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5436.1.patch, YARN-5436.2.patch, YARN-5436.3.patch, 
> YARN-5436.4.patch
>
>
> In YARN-2264, a race in DrainDispatcher was fixed. Unfortunately, the same 
> race also exists in AsyncDispatcher (this was found and ignored in YARN-3878 
> but never documented...). In YARN-2991, another DrainDispatcher bug was fixed 
> by letting DrainDispatcher reuse some AsyncDispatcher methods, because 
> AsyncDispatcher doesn't have that issue. However, this shadows the YARN-2264 
> fix, and now a similar race reappears in Tez unit tests (and probably in YARN 
> unit tests too).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5436) Race in AsyncDispatcher can cause random test failures in Tez (probably YARN also)

2016-07-28 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398440#comment-15398440
 ] 

Li Lu commented on YARN-5436:
-

Will commit this patch shortly. 




[jira] [Commented] (YARN-5436) Race in AsyncDispatcher can cause random test failures in Tez (probably YARN also)

2016-07-28 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397858#comment-15397858
 ] 

Rohith Sharma K S commented on YARN-5436:
-

Thanks for the clarification, especially the point that *the race can still 
happen without invoking dispatcher.serviceStop()*, which is the main reason for 
the test failures. This fix should solve the YARN test failures as well. 

+1 LGTM




[jira] [Commented] (YARN-5436) Race in AsyncDispatcher can cause random test failures in Tez (probably YARN also)

2016-07-28 Thread Zhiyuan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397739#comment-15397739
 ] 

Zhiyuan Yang commented on YARN-5436:


[~rohithsharma] Thanks for reviewing! You are right in the sense that this 
patch mostly makes DrainDispatcher stop reusing AsyncDispatcher's drained 
field, but the fix for YARN-2991 is still there.

bq. Is this tiny race really causing the Tez test failures?
Yes. In the Tez unit tests, dispatcher.await() returned before all events had 
been handled, so the assertion after dispatcher.await() failed. This race 
condition only happens when the queue is almost empty, which is exactly the 
case in the Tez unit tests.
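
To make this failure mode concrete, here is a minimal single-threaded replay of one unlucky ordering. It is only a sketch, not the Hadoop code: the class `DrainedFlagReplay` and its step methods are hypothetical, and the assumption that the producer clears `drained` before enqueuing while the dispatcher thread recomputes it from `isEmpty()` is a simplification chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Simplified, single-threaded model of a dispatcher; NOT the Hadoop classes.
// Each method stands for one step that a real thread would execute.
class DrainedFlagReplay {
    final Queue<String> eventQueue = new ArrayDeque<>();
    final List<String> handled = new ArrayList<>();
    boolean drained = true;

    // Producer step 1 (assumed order): mark the dispatcher as busy.
    void producerClearsFlag() { drained = false; }

    // Producer step 2: actually enqueue the event.
    void producerEnqueues(String event) { eventQueue.add(event); }

    // Dispatcher step: recompute the flag from the queue state.
    void dispatcherRecomputesFlag() { drained = eventQueue.isEmpty(); }

    // Dispatcher step: take and handle one event, if any.
    void dispatcherHandlesOne() {
        String event = eventQueue.poll();
        if (event != null) {
            handled.add(event);
        }
    }

    // What a test's await() would poll for.
    boolean awaitWouldReturn() { return drained; }

    // Unlucky ordering: the dispatcher recomputes the flag between the two
    // producer steps. Returns true if await() would return while the event
    // is still unhandled.
    static boolean replayBadInterleaving() {
        DrainedFlagReplay d = new DrainedFlagReplay();
        d.producerClearsFlag();        // drained = false
        d.dispatcherRecomputesFlag();  // queue still empty, so drained = true
        d.producerEnqueues("event-1"); // event lands after the check
        return d.awaitWouldReturn() && d.handled.isEmpty();
    }

    // Lucky ordering: both producer steps finish before the dispatcher looks.
    static boolean replayGoodInterleaving() {
        DrainedFlagReplay d = new DrainedFlagReplay();
        d.producerClearsFlag();
        d.producerEnqueues("event-1");
        d.dispatcherRecomputesFlag();  // queue non-empty, drained stays false
        boolean blockedWhilePending = !d.awaitWouldReturn();
        d.dispatcherHandlesOne();
        d.dispatcherRecomputesFlag();  // queue empty again, drained = true
        return blockedWhilePending && d.awaitWouldReturn() && d.handled.size() == 1;
    }

    public static void main(String[] args) {
        System.out.println("bad ordering returns early: " + replayBadInterleaving());
        System.out.println("good ordering waits: " + replayGoodInterleaving());
    }
}
```

With many events in flight the queue is rarely empty when the dispatcher recomputes the flag, which fits the observation that the problem shows up mainly when the queue is almost empty.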

bq. If so, would it be better to fix this in AsyncDispatcher rather than adding 
fully duplicated code? 
The root cause of the race is that we cannot guarantee that enqueuing an event 
and updating drained happen atomically. I didn't find a way to fix this without 
adding more synchronization, which is a very expensive fix for minimal benefit. 
YARN-3878 discussed this race and decided to ignore it for the same reason.  
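
For illustration, the "more synchronization" option could look roughly like the sketch below, where one lock covers the flag update plus enqueue on the producer side and the flag recomputation on the dispatcher side. `LockedDrainSketch` and its methods are hypothetical names, not Hadoop's API, and this is not necessarily what the attached patch does. It only shows that a reliable `await()` costs a lock acquisition on every `handle()` call, which may be acceptable in a test-only dispatcher but is harder to justify on a production path.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical test-only dispatcher; class and method names are illustrative.
class LockedDrainSketch {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Object mutex = new Object();
    private boolean drained = true; // guarded by mutex

    LockedDrainSketch() {
        Thread worker = new Thread(this::dispatchLoop, "sketch-dispatcher");
        worker.setDaemon(true); // let the JVM exit without an explicit stop
        worker.start();
    }

    // Producer side: the flag update and the enqueue happen under one lock.
    void handle(Runnable event) {
        synchronized (mutex) {
            drained = false;
            queue.add(event);
        }
    }

    private void dispatchLoop() {
        try {
            while (true) {
                Runnable event = queue.take(); // block outside the lock
                event.run();
                synchronized (mutex) {
                    // Recomputed under the same lock, so a producer cannot
                    // slip in between its own flag update and its enqueue.
                    drained = queue.isEmpty();
                    if (drained) {
                        mutex.notifyAll();
                    }
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Blocks until every event passed to handle() so far has been run.
    void await() throws InterruptedException {
        synchronized (mutex) {
            while (!drained) {
                mutex.wait();
            }
        }
    }

    // Enqueues the given number of counting events, waits for the drain,
    // and reports how many were actually handled.
    static int runOnce(int events) {
        LockedDrainSketch dispatcher = new LockedDrainSketch();
        AtomicInteger handled = new AtomicInteger();
        for (int i = 0; i < events; i++) {
            dispatcher.handle(handled::incrementAndGet);
        }
        try {
            dispatcher.await();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return handled.get();
    }

    public static void main(String[] args) {
        System.out.println("handled after await: " + runOnce(1000));
    }
}
```

Because `drained` only becomes true after the just-taken event has run and the queue was observed empty under the lock, `await()` in this sketch cannot return while an event is pending.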

bq. How about adding an additional check before adding to the event queue to 
avoid the race?
While this may avoid enqueuing the last event, the race can still happen 
without invoking dispatcher.serviceStop(). In fact, the Tez unit tests never 
invoke dispatcher.serviceStop().




[jira] [Commented] (YARN-5436) Race in AsyncDispatcher can cause random test failures in Tez (probably YARN also)

2016-07-28 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397378#comment-15397378
 ] 

Rohith Sharma K S commented on YARN-5436:
-

Thanks Zhiyuan for providing the patch! Basically, I see that the patch 
reverts YARN-2991. 

A couple of questions: is this tiny race really causing the Tez test failures? 
If so, would it be better to fix this in AsyncDispatcher rather than adding 
fully duplicated code? How about adding an additional check before adding to 
the event queue to avoid the race?
{code}
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
index f5361c8..a162690 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
@@ -247,6 +247,12 @@ public void handle(Event event) {
         LOG.warn("Very low remaining capacity in the event-queue: "
             + remCapacity);
       }
+
+      if (blockNewEvents) {
+        drained = eventQueue.isEmpty();
+        return;
+      }
+
       try {
         eventQueue.put(event);
       } catch (InterruptedException e) {
{code}




[jira] [Commented] (YARN-5436) Race in AsyncDispatcher can cause random test failures in Tez (probably YARN also)

2016-07-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396608#comment-15396608
 ] 

Hadoop QA commented on YARN-5436:
-

| (/) *+1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 38s | trunk passed |
| +1 | compile | 0m 26s | trunk passed |
| +1 | checkstyle | 0m 18s | trunk passed |
| +1 | mvnsite | 0m 30s | trunk passed |
| +1 | mvneclipse | 0m 12s | trunk passed |
| +1 | findbugs | 0m 55s | trunk passed |
| +1 | javadoc | 0m 27s | trunk passed |
| +1 | mvninstall | 0m 26s | the patch passed |
| +1 | compile | 0m 24s | the patch passed |
| +1 | javac | 0m 24s | the patch passed |
| +1 | checkstyle | 0m 16s | the patch passed |
| +1 | mvnsite | 0m 27s | the patch passed |
| +1 | mvneclipse | 0m 11s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 4s | the patch passed |
| +1 | javadoc | 0m 29s | the patch passed |
| +1 | unit | 2m 19s | hadoop-yarn-common in the patch passed. |
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 16m 14s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12820582/YARN-5436.4.patch |
| JIRA Issue | YARN-5436 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 1241e8c12390 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / eb7ff0c |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12531/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12531/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (YARN-5436) Race in AsyncDispatcher can cause random test failures in Tez (probably YARN also)

2016-07-27 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396576#comment-15396576
 ] 

Li Lu commented on YARN-5436:
-

Fix LGTM, +1. I'll wait for 24 hrs for more comments. 




[jira] [Commented] (YARN-5436) Race in AsyncDispatcher can cause random test failures in Tez (probably YARN also)

2016-07-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396570#comment-15396570
 ] 

Hadoop QA commented on YARN-5436:
-

| (/) *+1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 47s | trunk passed |
| +1 | compile | 0m 32s | trunk passed |
| +1 | checkstyle | 0m 22s | trunk passed |
| +1 | mvnsite | 0m 37s | trunk passed |
| +1 | mvneclipse | 0m 15s | trunk passed |
| +1 | findbugs | 1m 6s | trunk passed |
| +1 | javadoc | 0m 32s | trunk passed |
| +1 | mvninstall | 0m 29s | the patch passed |
| +1 | compile | 0m 28s | the patch passed |
| +1 | javac | 0m 28s | the patch passed |
| +1 | checkstyle | 0m 18s | the patch passed |
| +1 | mvnsite | 0m 34s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 15s | the patch passed |
| +1 | javadoc | 0m 30s | the patch passed |
| +1 | unit | 2m 36s | hadoop-yarn-common in the patch passed. |
| +1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
| | | 19m 50s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12820572/YARN-5436.3.patch |
| JIRA Issue | YARN-5436 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux f1b6e06f22a9 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / eb7ff0c |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12529/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12529/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (YARN-5436) Race in AsyncDispatcher can cause random test failures in Tez (probably YARN also)

2016-07-27 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396512#comment-15396512
 ] 

Li Lu commented on YARN-5436:
-

Thanks for the work [~aplusplus]! The {{FIXME}} part in AsyncDispatcher appears 
to be confusing: there is no data race (per the Java memory model's definition) 
on the volatile variable {{drained}}. Maybe you'd like to rephrase it a little 
to express the potential nondeterminism? 

The other changes in {{DrainDispatcher}} appear to be fine to me. 
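
The distinction drawn here between a data race and nondeterminism can be seen in a small, generic example that is unrelated to the dispatcher code. Every access to a `volatile` field is visible and ordered, so there is no data race in the Java memory model's sense, yet a compound read-modify-write on it can still interleave and give different results from run to run. `VolatileIsNotAtomic` is a hypothetical class written only for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Generic illustration: volatile gives visibility, not atomicity.
class VolatileIsNotAtomic {
    static volatile int volatileCounter = 0;
    static final AtomicInteger atomicCounter = new AtomicInteger();

    // Runs the given number of threads, each incrementing both counters.
    // Returns {volatile result, atomic result}.
    static int[] run(int threads, int perThread) {
        volatileCounter = 0;
        atomicCounter.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    volatileCounter++;               // read, add, write: three steps
                    atomicCounter.incrementAndGet(); // one atomic step
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return new int[] {volatileCounter, atomicCounter.get()};
    }

    public static void main(String[] args) {
        int[] result = run(4, 50000);
        // The volatile total may fall short of 200000 and can vary per run;
        // the atomic total is always exactly 200000.
        System.out.println("volatile=" + result[0] + " atomic=" + result[1]);
    }
}
```

The same reasoning applies to a volatile flag that is updated in one step and acted on in another: each access is well defined, but the outcome depends on how the steps interleave.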




[jira] [Commented] (YARN-5436) Race in AsyncDispatcher can cause random test failures in Tez (probably YARN also)

2016-07-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396457#comment-15396457
 ] 

Hadoop QA commented on YARN-5436:
-

| (/) *+1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 7m 48s | trunk passed |
| +1 | compile | 0m 27s | trunk passed |
| +1 | checkstyle | 0m 17s | trunk passed |
| +1 | mvnsite | 0m 33s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 0m 56s | trunk passed |
| +1 | javadoc | 0m 27s | trunk passed |
| +1 | mvninstall | 0m 29s | the patch passed |
| +1 | compile | 0m 26s | the patch passed |
| +1 | javac | 0m 26s | the patch passed |
| +1 | checkstyle | 0m 15s | the patch passed |
| +1 | mvnsite | 0m 27s | the patch passed |
| +1 | mvneclipse | 0m 10s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 5s | the patch passed |
| +1 | javadoc | 0m 24s | the patch passed |
| +1 | unit | 2m 18s | hadoop-yarn-common in the patch passed. |
| +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 17m 26s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12820553/YARN-5436.2.patch |
| JIRA Issue | YARN-5436 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux b5dd8fb851ba 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 54fe17a |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12528/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12528/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (YARN-5436) Race in AsyncDispatcher can cause random test failures in Tez (probably YARN also)

2016-07-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396390#comment-15396390
 ] 

Hadoop QA commented on YARN-5436:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 12s | trunk passed |
| +1 | compile | 0m 33s | trunk passed |
| +1 | checkstyle | 0m 21s | trunk passed |
| +1 | mvnsite | 0m 34s | trunk passed |
| +1 | mvneclipse | 0m 12s | trunk passed |
| +1 | findbugs | 1m 6s | trunk passed |
| +1 | javadoc | 0m 31s | trunk passed |
| +1 | mvninstall | 0m 31s | the patch passed |
| +1 | compile | 0m 30s | the patch passed |
| -1 | javac | 0m 30s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| -1 | checkstyle | 0m 17s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: The patch generated 2 new + 7 unchanged - 0 fixed = 9 total (was 7) |
| +1 | mvnsite | 0m 32s | the patch passed |
| +1 | mvneclipse | 0m 11s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 7s | the patch passed |
| +1 | javadoc | 0m 25s | the patch passed |
| +1 | unit | 2m 15s | hadoop-yarn-common in the patch passed. |
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 18m 24s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12820544/YARN-5436.1.patch |
| JIRA Issue | YARN-5436 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 2325af36e225 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 54fe17a |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| javac | https://builds.apache.org/job/PreCommit-YARN-Build/12526/artifact/patchprocess/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/12526/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12526/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12526/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (YARN-5436) Race in AsyncDispatcher can cause random test failures in Tez(probably YARN also )

2016-07-27 Thread Zhiyuan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396345#comment-15396345
 ] 

Zhiyuan Yang commented on YARN-5436:


Uploaded a patch that fixes the problem only in DrainDispatcher and documents
the minor race condition in AsyncDispatcher. Please help review. [~jianhe],
[~rohithsharma], [~varun_saxena].







[jira] [Commented] (YARN-5436) Race in AsyncDispatcher can cause random test failures in Tez(probably YARN also )

2016-07-27 Thread Zhiyuan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396348#comment-15396348
 ] 

Zhiyuan Yang commented on YARN-5436:


The race in AsyncDispatcher was already found and ignored in YARN-3878. Leaving
it as is for now.







[jira] [Commented] (YARN-5436) Race in AsyncDispatcher can cause random test failures in Tez(probably YARN also )

2016-07-27 Thread Zhiyuan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396054#comment-15396054
 ] 

Zhiyuan Yang commented on YARN-5436:


The data race can cause the RM to stop without handling the last enqueued event.
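
A minimal sketch of how a race of this kind can arise, using a hypothetical toy dispatcher (`ToyDispatcher`, `naiveIsDrained`, and `trackedIsDrained` are illustrative names, not the Hadoop AsyncDispatcher API and not the YARN-5436 patch). Once the worker thread has taken the last event off the queue, an emptiness check reports "drained" even though the handler is still running, so a stopper relying on that check could shut down before the event is handled. Counting events from enqueue until their handler returns is one possible way to close that window.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class DrainRaceSketch {

  /** Hypothetical toy dispatcher for illustration; not Hadoop code. */
  static final class ToyDispatcher {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    // Events that were enqueued but whose handler has not returned yet.
    private final AtomicInteger pending = new AtomicInteger();
    private final Thread worker;

    ToyDispatcher() {
      worker = new Thread(() -> {
        try {
          while (true) {
            Runnable event = queue.take(); // the queue may look empty from here on...
            try {
              event.run();                 // ...while the event is still being handled
            } finally {
              pending.decrementAndGet();
            }
          }
        } catch (InterruptedException e) {
          // Interrupted by stop(): exit the loop.
        }
      });
      worker.setDaemon(true);
      worker.start();
    }

    void enqueue(Runnable event) {
      pending.incrementAndGet(); // mark "not drained" before the event becomes visible
      queue.add(event);
    }

    /** Racy check: only looks at the queue. */
    boolean naiveIsDrained() {
      return queue.isEmpty();
    }

    /** Safer check: also covers the event currently being handled. */
    boolean trackedIsDrained() {
      return pending.get() == 0;
    }

    void stop() {
      worker.interrupt();
    }
  }

  /** Returns {naive, tracked} as observed while the only event is mid-handling. */
  static boolean[] observeWhileHandling() throws InterruptedException {
    ToyDispatcher dispatcher = new ToyDispatcher();
    CountDownLatch started = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);
    dispatcher.enqueue(() -> {
      started.countDown();
      try {
        release.await(); // hold the handler open so the window is observable
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    started.await(); // the event has left the queue and its handler is running
    boolean naive = dispatcher.naiveIsDrained();
    boolean tracked = dispatcher.trackedIsDrained();
    release.countDown();
    dispatcher.stop();
    return new boolean[] {naive, tracked};
  }

  public static void main(String[] args) throws InterruptedException {
    boolean[] result = observeWhileHandling();
    System.out.println("naive drained=" + result[0] + ", tracked drained=" + result[1]);
  }
}
```

Running `main` prints `naive drained=true, tracked drained=false`: the queue-only check claims the dispatcher is drained while the last event is still in its handler. The latches only make the interleaving deterministic for the demonstration; in a real test the same window would be hit intermittently.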

> Race in AsyncDispatcher can cause random test failures in Tez (probably YARN
> also)
> --
>
> Key: YARN-5436
> URL: https://issues.apache.org/jira/browse/YARN-5436
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Zhiyuan Yang
> Assignee: Zhiyuan Yang
>
> In YARN-2264, a race in DrainDispatcher was fixed. Unfortunately, the same
> race also exists in AsyncDispatcher but wasn't found. In YARN-2991, another
> DrainDispatcher bug was fixed by letting DrainDispatcher extend
> AsyncDispatcher because AsyncDispatcher doesn't have that issue. However,
> this shadows the YARN-2264 fix, and now a similar race reappears in Tez unit
> tests (and probably in YARN unit tests as well).






[jira] [Commented] (YARN-5436) Race in AsyncDispatcher can cause random test failures in Tez(probably YARN also )

2016-07-27 Thread Zhiyuan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396052#comment-15396052
 ] 

Zhiyuan Yang commented on YARN-5436:


The data race can cause the RM to stop without handling the last enqueued event.



