[ 
https://issues.apache.org/jira/browse/OOZIE-547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663594#comment-13663594
 ] 

Mona Chitnis commented on OOZIE-547:
------------------------------------

Some review comments:

# In JobXCommand, you mark actions as done (doneActions) only when their status 
is "OK" or "DONE". What about the "error" states? They also indicate completed 
execution, albeit unsuccessful. You could perhaps use positive and negative 
values for the progress to indicate success and failure respectively. If this 
needs to be color coded in the future, the +/- sign can be used.
# Among the wfprogress-test-*.xml test resources and the testcase, I see two 
XML files both being used to "check orphan nodes", and only one of them had 
orphan nodes from what I saw. Please consolidate them into a single file and 
test function if possible, unless their structure differs significantly.
# In LiteWorkflowAppParser isControlNode(), just check for 'instanceof 
ActionNodeDef' and return false, instead of having so many OR clauses for the 
true case.
# Can you explain what you are doing in the 'topological sort'?
# It is incorrect to simply count all actions in the presence of fork and 
decision nodes. Even in general use cases, the workflow actions can form a 
large tree after a fork or decision. Consider performing a depth-first search 
to determine the longest path.
# Ensure that this works correctly even in job rerun cases, when the config 
property for wf.action.skip.nodes is provided.
# So progress for MR (non-opaque) actions is going to be a different JIRA?
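
To illustrate the depth-first search suggested above, here is a minimal sketch. 
It assumes the workflow is an acyclic graph and models it as a plain map from 
node name to transition targets. The class, method and node names are 
hypothetical and are not taken from the patch or from the Oozie code base. 
Only action nodes are counted; control nodes (start, fork, join, decision, 
end) contribute zero to the path length.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class LongestPathSketch {

    // Returns the largest number of action nodes on any path starting at 'start'.
    // Assumption: 'transitions' describes a DAG (no cycles).
    static int longestActionPath(Map<String, List<String>> transitions,
                                 Set<String> actions, String start) {
        return dfs(transitions, actions, start, new HashMap<String, Integer>());
    }

    // Depth-first search with memoization, so each node is expanded only once.
    private static int dfs(Map<String, List<String>> transitions, Set<String> actions,
                           String node, Map<String, Integer> memo) {
        Integer cached = memo.get(node);
        if (cached != null) {
            return cached;
        }
        int best = 0;
        List<String> targets = transitions.get(node);
        if (targets == null) {
            targets = Collections.emptyList(); // terminal node, e.g. "end"
        }
        for (String next : targets) {
            best = Math.max(best, dfs(transitions, actions, next, memo));
        }
        int length = best + (actions.contains(node) ? 1 : 0);
        memo.put(node, length);
        return length;
    }

    public static void main(String[] args) {
        // start -> a -> fork -> (b -> c | d) -> join -> end
        Map<String, List<String>> t = new HashMap<String, List<String>>();
        t.put("start", Arrays.asList("a"));
        t.put("a", Arrays.asList("fork"));
        t.put("fork", Arrays.asList("b", "d"));
        t.put("b", Arrays.asList("c"));
        t.put("c", Arrays.asList("join"));
        t.put("d", Arrays.asList("join"));
        t.put("join", Arrays.asList("end"));
        Set<String> actions = new HashSet<String>(Arrays.asList("a", "b", "c", "d"));
        // The workflow has 4 actions, but the longest path (a, b, c) has 3.
        System.out.println(longestActionPath(t, actions, "start"));
    }
}
```

In this example a plain count gives 4 actions while the longest path contains 
3, which is the kind of difference the comment is about.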
                
> build workflow progress information in Oozie
> --------------------------------------------
>
>                 Key: OOZIE-547
>                 URL: https://issues.apache.org/jira/browse/OOZIE-547
>             Project: Oozie
>          Issue Type: New Feature
>            Reporter: Hadoop QA
>            Assignee: zhu jin wei
>         Attachments: oozie-547.patch
>
>
> For a user, knowing progress of her workflow is always desirable. This ticket 
> is to introduce this support to Oozie.
> I know it's a hard problem. For my initial effort, I plan to start with 
> simple workflows that do not contain decision nodes or fork/join nodes, i.e., 
> chain type workflows. I plan to use percentage of finished actions as the 
> overall wf progress estimate.
> Going forward we can improve the estimation by:
> 1) handling general workflows that contain decision, fork/join nodes;
> 2) incorporating the action level progress into wf level progress estimation 
> to make the estimate better. To be more specific:
> In the case of "opaque" actions like pig/hive/jaql where the status can only 
> be 0% or 100% (or failure) we plug that value into the overall DAG status of 
> 0-100%. If a DAG had, say, 4 opaque actions, the progress would move in 
> discrete steps: 0, 25, 50, 75, 100%. For m/r actions, where the JobTracker 
> gives values between 0 and 100% for an action, the overall progress will be 
> smoother. We can do the same thing for pig/hive/jaql actions as well if they 
> expose their own progress info.
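
As a rough illustration of the estimate described in the issue, the sketch 
below averages per-action progress over all actions of a chain-type workflow. 
The class and method names are hypothetical, and returning 0 for a workflow 
with no actions is an assumption. An opaque action contributes either 0.0 or 
1.0, while an m/r action may contribute any fraction in between.

```java
public class ProgressSketch {

    // Each array entry is one action's progress in the range [0.0, 1.0].
    // Returns the overall workflow progress as a whole percentage.
    static int workflowProgressPercent(double[] actionProgress) {
        if (actionProgress.length == 0) {
            return 0; // assumption: an empty workflow reports 0%
        }
        double sum = 0.0;
        for (double p : actionProgress) {
            // Clamp so a bad value cannot push the total outside 0-100%.
            sum += Math.max(0.0, Math.min(1.0, p));
        }
        return (int) Math.round(100.0 * sum / actionProgress.length);
    }

    public static void main(String[] args) {
        // Four opaque actions, two finished: prints 50.
        System.out.println(workflowProgressPercent(new double[] {1.0, 1.0, 0.0, 0.0}));
        // One finished action plus an m/r action at 60%: prints 40.
        System.out.println(workflowProgressPercent(new double[] {1.0, 0.6, 0.0, 0.0}));
    }
}
```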

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
