[ 
https://issues.apache.org/jira/browse/OOZIE-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980123#comment-16980123
 ] 

Peter Bacsko commented on OOZIE-3561:
-------------------------------------

[~dionusos] thanks for the patch, I believe this is the approach that we need.
As we discussed in person, let's improve this further:

1. Just store the {{NodeDef}} object in the set, not a string. That should 
exhibit the exact same behavior.
2. Call the set something like "seenNodes" or "visitedNodes".
3. Next week, let's come up with some more edge cases, e.g. the "errorTo" of a 
node inside a fork pointing to a node that is located in another fork.
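
A minimal sketch of points 1 and 2, not the actual Oozie validator code. 
{{Node}} is a hypothetical stand-in for {{NodeDef}}, and the chain shape (both 
the "ok" and the "error" transition of each action lead to the next action) is 
an assumption chosen only to make repeated visits visible. It compares a 
depth-first walk that revisits nodes with one that keeps a "visitedNodes" set 
of node objects:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class VisitedNodesSketch {

    /** Hypothetical stand-in for NodeDef: only a name and outgoing transitions. */
    static final class Node {
        final String name;
        final List<Node> transitions = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
        // No equals/hashCode override: the set below uses object identity.
        // A real NodeDef may define equality differently (assumption).
    }

    /** Depth-first walk without bookkeeping: every path is followed again. */
    static long countVisitsNaive(Node node) {
        long visits = 1;
        for (Node next : node.transitions) {
            visits += countVisitsNaive(next);
        }
        return visits;
    }

    /** Depth-first walk that stores node objects (not strings) in a set. */
    static long countVisitsOnce(Node node, Set<Node> visitedNodes) {
        if (!visitedNodes.add(node)) {
            return 0; // already seen, do not descend again
        }
        long visits = 1;
        for (Node next : node.transitions) {
            visits += countVisitsOnce(next, visitedNodes);
        }
        return visits;
    }

    /** Builds a1 -> a2 -> ... -> aN; "ok" and "error" both go to the next action. */
    static Node buildChain(int length) {
        Node next = null;
        for (int i = length; i >= 1; i--) {
            Node current = new Node("a" + i);
            if (next != null) {
                current.transitions.add(next); // assumed "ok" transition
                current.transitions.add(next); // assumed "error" transition
            }
            next = current;
        }
        return next;
    }

    public static void main(String[] args) {
        Node start = buildChain(20);
        System.out.println(countVisitsNaive(start));                 // prints 1048575
        System.out.println(countVisitsOnce(start, new HashSet<>())); // prints 20
    }
}
```

In this toy graph the walk without the set grows exponentially with the chain 
length, while the walk with the set touches each node once. The growth rate of 
the real validator may differ.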

> Forkjoin validation is slow when there are many actions in chain
> ----------------------------------------------------------------
>
>                 Key: OOZIE-3561
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3561
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 5.1.0
>            Reporter: Denes Bodo
>            Assignee: Denes Bodo
>            Priority: Critical
>              Labels: performance
>         Attachments: OOZIE-3561_001.patch
>
>
> Suppose we have a workflow with, let's say, 80 actions chained one after another:
> {{a1 -> a2 -> ... a80}}
> then the validator code "never" finishes.
> Currently the validation (in my understanding) does depth-first checks from 
> the start node and runs in roughly n! time. This is supported by the fact 
> that when we split this huge workflow into two 40-element workflows, we get 
> 2x ~40! steps in validation instead of ~80! steps.
> Guys, could you please confirm or disprove my theory?
> Thanks



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
