[jira] [Commented] (TEZ-4103) Progress in DAG, Vertex, and tasks is incorrect

2019-12-02 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986442#comment-16986442
 ] 

Ahmed Hussein commented on TEZ-4103:


{quote}I can see this patch goes through great effort to centralize logging 
into the ProgressHelper. However, it adds IMHO unnecessarily complex code by 
using lambdas in log statements as well as separates the condition checking and 
logging from its origin of error. Unless I'm missing the necessity, I think 
this code becomes much simpler with a simple if isDebugEnabled() check followed 
by parameterized LOG.debug statement. Once this is done we can remove the 
logDebug helper methods.
{quote}
I thought that lambda expressions reduce the overhead because the expression 
(i.e., parameters to the lambda expression and string formatting) won't be 
evaluated until the {{fn.apply()}} is called. I will replace the lambda with 
simple {{isDebugEnabled()}}. Yet, we need a way to aggregate the progress 
logging to make it easy to debug. For example, when we use {{isDebugEnabled()}} 
we will need to enable the logging for all classes that have {{getProgress()}} 
method. On the other hand, logging in one class makes it easy to enable/disable 
the debugging of {{getProgress()}}.

{quote}I also wondered about the thread monitoring. Can you help me to 
understand why a catch (Throwable) wasn't sufficient. As per 
https://stackoverflow.com/a/24902026. Seems like (though I am not positive) we 
have created a thread to monitor the other thread.{quote}

I was confused by the java doc thinking that the future invocation will halt as 
long as the thread exception in the JVM has been set. I will simplify the code 
by removing the re-launching piece.

{quote}Functionally, it isn't incorrect to use a LogicalInput that isn't 
AbstractLogicalInput. While I like logging the non-compliant class as 
speculative execution is very limited in that scenario, is it too excessive to 
log that condition every time?{quote}
I saw in the javaDoc that {{AbstractLogicalInput}} has to be the base for all 
implementations. If that's the design, then it should be incorrect to have 
different implementations.

{code:java}
/**
 * An abstract class which should be the base class for all implementations of 
LogicalInput.
 *
 * This class implements the framework facing as well as user facing methods 
which need to be
 * implemented by all LogicalInputs.
{code}



 
 

> Progress in DAG, Vertex, and tasks is incorrect
> ---
>
> Key: TEZ-4103
> URL: https://issues.apache.org/jira/browse/TEZ-4103
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: TEZ-4103.001.patch, TEZ-4103.002.patch, 
> TEZ-4103.003.patch
>
>
> Looking at the progress code, there some few issues that could lead to some 
> problems calculating the progress.
>  There are some cases when the progress never reach 1.0.
>  This is a list of issues that need to be fixed in the progress code:
>  * After TEZ-3982, since values are skipped in the In some cases, the 
> progress of DAG or a vertex may never reach 1.0f. this is in both 
> "{{DAGImpl.java}}" and "{{ProgressHelper.java}}"
>  * {{ProgressHelper}} schedules a service to update the progress, dubbed 
> `{{ProgressHelper.monitorProgress}}`. According to Java Documentation:
> {quote}If any execution of the task encounters an exception,
>  subsequent executions are suppressed.
>  Otherwise, the task will only terminate via cancellation
>  or termination of the executor.
> {quote}
> In other words, if the service dies, there is no way to catch that in the 
> code and the progress will never be updated.
>  * The `{{SimpleProcessor.inputMap}}` is not thread-safe. They are 
> initialized as `{{LinkedHashMap}}` and there is no synchronization on the 
> field objects in the map. This could be problematic in concurrent context.
>  * `{{VertexImpl.getProgress()}}` does not check the range of the progress 
> calculated in `{{VertexImpl.computeProgress()}}`
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (TEZ-4103) Progress in DAG, Vertex, and tasks is incorrect

2019-12-02 Thread Jonathan Turner Eagles (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986418#comment-16986418
 ] 

Jonathan Turner Eagles commented on TEZ-4103:
-

I think we need a few improvements before this patch is ready.

I can see this patch goes through great effort to centralize logging into the 
ProgressHelper. However, it adds IMHO unnecessarily complex code by using 
lambdas in log statements as well as separates the condition checking and 
logging from its origin of error. Unless I'm missing the necessity, I think 
this code becomes much simpler with a simple if isDebugEnabled() check followed 
by parameterized LOG.debug statement. Once this is done we can remove the 
logDebug helper methods.

I also wondered about the thread monitoring. Can you help me to understand why 
a catch (Throwable) wasn't sufficient. As per 
https://stackoverflow.com/a/24902026. Seems like (though I am not positive) we 
have created a thread to monitor the other thread.

Functionally, it isn't incorrect to use a LogicalInput that isn't 
AbstractLogicalInput. While I like logging the non-compliant class as 
speculative execution is very limited in that scenario, is it too excessive to 
log that condition every time?

> Progress in DAG, Vertex, and tasks is incorrect
> ---
>
> Key: TEZ-4103
> URL: https://issues.apache.org/jira/browse/TEZ-4103
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: TEZ-4103.001.patch, TEZ-4103.002.patch, 
> TEZ-4103.003.patch
>
>
> Looking at the progress code, there some few issues that could lead to some 
> problems calculating the progress.
>  There are some cases when the progress never reach 1.0.
>  This is a list of issues that need to be fixed in the progress code:
>  * After TEZ-3982, since values are skipped in the In some cases, the 
> progress of DAG or a vertex may never reach 1.0f. this is in both 
> "{{DAGImpl.java}}" and "{{ProgressHelper.java}}"
>  * {{ProgressHelper}} schedules a service to update the progress, dubbed 
> `{{ProgressHelper.monitorProgress}}`. According to Java Documentation:
> {quote}If any execution of the task encounters an exception,
>  subsequent executions are suppressed.
>  Otherwise, the task will only terminate via cancellation
>  or termination of the executor.
> {quote}
> In other words, if the service dies, there is no way to catch that in the 
> code and the progress will never be updated.
>  * The `{{SimpleProcessor.inputMap}}` is not thread-safe. They are 
> initialized as `{{LinkedHashMap}}` and there is no synchronization on the 
> field objects in the map. This could be problematic in concurrent context.
>  * `{{VertexImpl.getProgress()}}` does not check the range of the progress 
> calculated in `{{VertexImpl.computeProgress()}}`
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (TEZ-4103) Progress in DAG, Vertex, and tasks is incorrect

2019-12-02 Thread TezQA (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986149#comment-16986149
 ] 

TezQA commented on TEZ-4103:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} tez-api: The patch generated 0 new + 2 unchanged - 5 
fixed = 2 total (was 7) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} tez-dag: The patch generated 0 new + 379 unchanged - 
1 fixed = 379 total (was 380) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
56s{color} | {color:green} tez-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m  
4s{color} | {color:green} tez-dag in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 35s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/tez:d4a62deee |
| JIRA Issue | TEZ-4103 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12987308/TEZ-4103.003.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
checkstyle  compile  |
| uname | Linux 5990cf679540 4.15.0-70-generic #79-Ubuntu SMP Tue Nov 12 
10:36:11 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | master / 1af0897 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.0.1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-TEZ-Build/208/testReport/ |
| Max. process+thread count | 220 (vs. ulimit of 5500) |
| modules | C: tez-api tez-dag U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-TEZ-Build/208/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Progress in DAG, Vertex, and tasks is incorrect
> ---
>
> Key: TEZ-4103
> URL: https://issues.apache.org/jira/browse/TEZ-4103
> Project: Apache Tez
>   

[jira] [Updated] (TEZ-4103) Progress in DAG, Vertex, and tasks is incorrect

2019-12-02 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated TEZ-4103:
---
Attachment: TEZ-4103.003.patch

> Progress in DAG, Vertex, and tasks is incorrect
> ---
>
> Key: TEZ-4103
> URL: https://issues.apache.org/jira/browse/TEZ-4103
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: TEZ-4103.001.patch, TEZ-4103.002.patch, 
> TEZ-4103.003.patch
>
>
> Looking at the progress code, there some few issues that could lead to some 
> problems calculating the progress.
>  There are some cases when the progress never reach 1.0.
>  This is a list of issues that need to be fixed in the progress code:
>  * After TEZ-3982, since values are skipped in the In some cases, the 
> progress of DAG or a vertex may never reach 1.0f. this is in both 
> "{{DAGImpl.java}}" and "{{ProgressHelper.java}}"
>  * {{ProgressHelper}} schedules a service to update the progress, dubbed 
> `{{ProgressHelper.monitorProgress}}`. According to Java Documentation:
> {quote}If any execution of the task encounters an exception,
>  subsequent executions are suppressed.
>  Otherwise, the task will only terminate via cancellation
>  or termination of the executor.
> {quote}
> In other words, if the service dies, there is no way to catch that in the 
> code and the progress will never be updated.
>  * The `{{SimpleProcessor.inputMap}}` is not thread-safe. They are 
> initialized as `{{LinkedHashMap}}` and there is no synchronization on the 
> field objects in the map. This could be problematic in concurrent context.
>  * `{{VertexImpl.getProgress()}}` does not check the range of the progress 
> calculated in `{{VertexImpl.computeProgress()}}`
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (TEZ-4103) Progress in DAG, Vertex, and tasks is incorrect

2019-12-02 Thread TezQA (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986112#comment-16986112
 ] 

TezQA commented on TEZ-4103:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 21s{color} | {color:orange} tez-api: The patch generated 2 new + 3 unchanged 
- 4 fixed = 5 total (was 7) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 33s{color} | {color:orange} tez-dag: The patch generated 2 new + 379 
unchanged - 1 fixed = 381 total (was 380) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m  
9s{color} | {color:red} tez-api generated 1 new + 0 unchanged - 0 fixed = 1 
total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
53s{color} | {color:green} tez-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m  
1s{color} | {color:green} tez-dag in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 48s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:tez-api |
|  |  Dead store to invalidInput in 
org.apache.tez.common.ProgressHelper$1.run()  At 
ProgressHelper.java:org.apache.tez.common.ProgressHelper$1.run()  At 
ProgressHelper.java:[line 171] |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/tez:d4a62deee |
| JIRA Issue | TEZ-4103 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12987301/TEZ-4103.002.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
checkstyle  compile  |
| uname | Linux a3fb6fe4f16e 4.15.0-70-generic #79-Ubuntu SMP Tue Nov 12 
10:36:11 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | master / 1af0897 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.0.1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-TEZ-Build/207/artifact/out/diff-checkstyle-tez-api.txt
 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-TEZ-Build/207/artifact/out/diff-checkstyle-tez-dag.txt
 |
| findbugs | 

[jira] [Updated] (TEZ-4103) Progress in DAG, Vertex, and tasks is incorrect

2019-12-02 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated TEZ-4103:
---
Attachment: TEZ-4103.002.patch

> Progress in DAG, Vertex, and tasks is incorrect
> ---
>
> Key: TEZ-4103
> URL: https://issues.apache.org/jira/browse/TEZ-4103
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: TEZ-4103.001.patch, TEZ-4103.002.patch
>
>
> Looking at the progress code, there some few issues that could lead to some 
> problems calculating the progress.
>  There are some cases when the progress never reach 1.0.
>  This is a list of issues that need to be fixed in the progress code:
>  * After TEZ-3982, since values are skipped in the In some cases, the 
> progress of DAG or a vertex may never reach 1.0f. this is in both 
> "{{DAGImpl.java}}" and "{{ProgressHelper.java}}"
>  * {{ProgressHelper}} schedules a service to update the progress, dubbed 
> `{{ProgressHelper.monitorProgress}}`. According to Java Documentation:
> {quote}If any execution of the task encounters an exception,
>  subsequent executions are suppressed.
>  Otherwise, the task will only terminate via cancellation
>  or termination of the executor.
> {quote}
> In other words, if the service dies, there is no way to catch that in the 
> code and the progress will never be updated.
>  * The `{{SimpleProcessor.inputMap}}` is not thread-safe. They are 
> initialized as `{{LinkedHashMap}}` and there is no synchronization on the 
> field objects in the map. This could be problematic in concurrent context.
>  * `{{VertexImpl.getProgress()}}` does not check the range of the progress 
> calculated in `{{VertexImpl.computeProgress()}}`
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)