[ https://issues.apache.org/jira/browse/OOZIE-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979310#comment-16979310 ]
Peter Bacsko commented on OOZIE-3561:
-------------------------------------

I refactored the validator 3 years ago, so I had to check again how it works:

1. Basic validation makes sure that the workflow is acyclic. That's definitely fast.
2. Fork-join validation: this was trickier. Multiple fork-joins did cause problems because paths were re-walked unnecessarily, which gave exponential runtime with respect to the number of fork-join pairs. However, OOZIE-1978 made sure that no unnecessary walks take place by stopping the recursion when we encounter a join.

Right now I don't see what could go wrong.

> Forkjoin validation is slow when there are many actions in chain
> ----------------------------------------------------------------
>
>                 Key: OOZIE-3561
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3561
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 5.1.0
>            Reporter: Denes Bodo
>            Assignee: Denes Bodo
>            Priority: Critical
>              Labels: performance
>
> In case we have a workflow which has, let's say, 80 actions after each other:
> {{a1 -> a2 -> ... a80}}
> then the validator code "never" finishes.
> Currently the validation (in my understanding) does depth-first checks from
> the start node and runs in n! time. This seems to be confirmed: when we split
> this huge workflow into two 40-element workflows, we get 2x ~40! steps in
> validation instead of ~80! steps.
> Guys, could you please confirm or disprove my theory?
> Thanks



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
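
To illustrate the fork-join point in the comment above, here is a minimal Python sketch. It is not Oozie's validator code: the graph builders and counting functions are hypothetical names made up for this example. It counts node visits for a depth-first walk that re-walks every path versus one that visits each node only once. The visited set used here is a generic stand-in technique, not necessarily how OOZIE-1978 stops the recursion at a join.

```python
def build_diamonds(k):
    """Toy workflow (assumption): start -> k chained fork-join pairs -> end."""
    graph = {}
    prev = "start"
    for i in range(k):
        fork, join = f"fork{i}", f"join{i}"
        left, right = f"a{i}", f"b{i}"
        graph[prev] = [fork]
        graph[fork] = [left, right]   # two parallel paths out of the fork
        graph[left] = [join]
        graph[right] = [join]         # both paths meet at the join
        prev = join
    graph[prev] = ["end"]
    graph["end"] = []
    return graph


def build_chain(n):
    """Toy workflow (assumption): a1 -> a2 -> ... -> an, no forks."""
    graph = {f"a{i}": [f"a{i + 1}"] for i in range(1, n)}
    graph[f"a{n}"] = []
    return graph


def count_naive(graph, node):
    """Count visits when everything after a join is re-walked per path."""
    return 1 + sum(count_naive(graph, nxt) for nxt in graph[node])


def count_once(graph, node, seen=None):
    """Count visits when the walk stops at nodes it has already seen."""
    if seen is None:
        seen = set()
    if node in seen:
        return 0
    seen.add(node)
    return 1 + sum(count_once(graph, nxt, seen) for nxt in graph[node])


if __name__ == "__main__":
    diamonds = build_diamonds(10)
    chain = build_chain(80)
    print("10 fork-join pairs, naive:", count_naive(diamonds, "start"))
    print("10 fork-join pairs, once: ", count_once(diamonds, "start"))
    print("80-action chain, naive:   ", count_naive(chain, "a1"))
    print("80-action chain, once:    ", count_once(chain, "a1"))
```

In this toy model, ten chained fork-join pairs cost 6140 visits when paths are re-walked and 42 when each node is visited once. A plain 80-action chain costs 80 visits either way, because it has only one path. That says nothing about the real validator, only about how these two graph shapes behave under the two walking strategies.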