[jira] [Created] (TEZ-4103) Progress in DAG, Vertex, and tasks is incorrect

2019-11-27 Thread Ahmed Hussein (Jira)
Ahmed Hussein created TEZ-4103:
--

 Summary: Progress in DAG, Vertex, and tasks is incorrect
 Key: TEZ-4103
 URL: https://issues.apache.org/jira/browse/TEZ-4103
 Project: Apache Tez
  Issue Type: Bug
Reporter: Ahmed Hussein
Assignee: Ahmed Hussein


Looking at the progress code, there are a few issues that could lead to 
problems calculating the progress.
 There are some cases in which the progress never reaches 1.0.
 This is a list of issues that need to be fixed in the progress code:
 * After TEZ-3982, because some values are skipped, the progress of a DAG or 
a vertex may in some cases never reach 1.0f. This affects both 
"{{DAGImpl.java}}" and "{{ProgressHelper.java}}".
 * {{ProgressHelper}} schedules a service to update the progress, dubbed 
`{{ProgressHelper.monitorProgress}}`. According to the Java documentation:
{quote}If any execution of the task encounters an exception,
 subsequent executions are suppressed.
 Otherwise, the task will only terminate via cancellation
 or termination of the executor.
{quote}
In other words, if the scheduled task throws an exception, it silently stops 
running: nothing in the code catches the failure, and the progress is never 
updated again.
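
A minimal sketch of the behavior quoted above, and of one possible guard against it. This is not the Tez code: the class {{GuardedMonitor}} and its methods are hypothetical names used only for illustration. It assumes the periodic task is scheduled with {{scheduleWithFixedDelay}} on a {{ScheduledExecutorService}}, and it wraps the task so that an exception is counted instead of ending the schedule.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; none of these names come from Tez.
final class GuardedMonitor {

  /** Wraps a periodic task so that a runtime exception is recorded, not propagated. */
  static Runnable guarded(Runnable task, AtomicInteger failures) {
    return () -> {
      try {
        task.run();
      } catch (RuntimeException e) {
        failures.incrementAndGet(); // a real monitor would also log the exception
      }
    };
  }

  /** Schedules a task that always throws and returns how many times it was invoked. */
  static int countRuns(boolean guard) {
    AtomicInteger runs = new AtomicInteger();
    AtomicInteger failures = new AtomicInteger();
    Runnable failing = () -> {
      runs.incrementAndGet();
      throw new IllegalStateException("simulated failure");
    };
    ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
    try {
      Runnable task = guard ? guarded(failing, failures) : failing;
      executor.scheduleWithFixedDelay(task, 0, 10, TimeUnit.MILLISECONDS);
      Thread.sleep(300); // give the schedule time to fire repeatedly
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      executor.shutdownNow();
    }
    return runs.get();
  }

  public static void main(String[] args) {
    System.out.println("unguarded runs: " + countRuns(false));
    System.out.println("guarded runs:   " + countRuns(true));
  }
}
```

The unguarded task is invoked once and then never again, with no error surfaced to the caller. The guarded task keeps being invoked; the exact count depends on timing.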

 * The `{{SimpleProcessor.inputMap}}` is not thread-safe. It is initialized 
as a `{{LinkedHashMap}}`, and there is no synchronization on the field objects 
in the map. This could be problematic in a concurrent context.
 * `{{VertexImpl.getProgress()}}` does not check the range of the progress 
calculated in `{{VertexImpl.computeProgress()}}`.
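
As an illustration of the range check mentioned in the last item, a computed value could be clamped before it is reported. This is only a sketch: {{ProgressRange.clamp}} is a hypothetical helper, not an existing Tez method, and mapping NaN to 0.0f is an assumption made here for the example.

```java
// Hypothetical helper; not part of Tez.
final class ProgressRange {

  /** Clamps a progress value into [0.0f, 1.0f]. NaN is mapped to 0.0f (an assumption). */
  static float clamp(float progress) {
    if (Float.isNaN(progress)) {
      return 0.0f;
    }
    return Math.max(0.0f, Math.min(1.0f, progress));
  }

  public static void main(String[] args) {
    System.out.println(clamp(1.2f));      // above range, reported as 1.0
    System.out.println(clamp(-0.5f));     // below range, reported as 0.0
    System.out.println(clamp(0.25f));     // in range, unchanged
    System.out.println(clamp(Float.NaN)); // invalid, reported as 0.0
  }
}
```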
  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (TEZ-4103) Progress in DAG, Vertex, and tasks is incorrect

2019-11-27 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated TEZ-4103:
---
Attachment: TEZ-4103.001.patch

> Progress in DAG, Vertex, and tasks is incorrect
> ---
>
> Key: TEZ-4103
> URL: https://issues.apache.org/jira/browse/TEZ-4103
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: TEZ-4103.001.patch





[jira] [Commented] (TEZ-4103) Progress in DAG, Vertex, and tasks is incorrect

2019-11-27 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16983810#comment-16983810
 ] 

Ahmed Hussein commented on TEZ-4103:


Changing the data structure of the inputs into a thread-safe implementation 
will need lots of changes across the source code. It is better to handle that 
in a separate Jira.
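
For context, a rough sketch of what a thread-safe variant could look like in isolation. The class {{SynchronizedInputs}} and its fields are hypothetical and do not reflect the actual types used by {{SimpleProcessor}}. The sketch assumes insertion order must be preserved, so it wraps a {{LinkedHashMap}} with {{Collections.synchronizedMap}}; note that iteration still has to lock on the map explicitly, and every caller that touches the map would need the same treatment.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; the key and value types are assumptions.
final class SynchronizedInputs {

  // Individual calls such as put/get are synchronized by the wrapper.
  private final Map<String, Float> progressByInput =
      Collections.synchronizedMap(new LinkedHashMap<>());

  void update(String inputName, float progress) {
    progressByInput.put(inputName, progress);
  }

  /** Returns the input names in insertion order. */
  List<String> names() {
    // Iteration is not atomic on a synchronized map, so lock on the map itself.
    synchronized (progressByInput) {
      return new ArrayList<>(progressByInput.keySet());
    }
  }

  /** Returns the mean progress over all inputs, or 0.0f when there are none. */
  float average() {
    synchronized (progressByInput) {
      if (progressByInput.isEmpty()) {
        return 0.0f;
      }
      float sum = 0.0f;
      for (float p : progressByInput.values()) {
        sum += p;
      }
      return sum / progressByInput.size();
    }
  }
}
```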



