[
https://issues.apache.org/jira/browse/CRUNCH-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992486#comment-13992486
]
Josh Wills commented on CRUNCH-390:
-----------------------------------
[~cmarius] good looking patch, thank you so much! I'm running it through
integration tests now and will commit it when it passes.
> Planner is not adding dependencies between jobs when planning is done in more
> than one stage.
> ---------------------------------------------------------------------------------------------
>
> Key: CRUNCH-390
> URL: https://issues.apache.org/jira/browse/CRUNCH-390
> Project: Crunch
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.8.2
> Reporter: Ioan Marius Curelariu
> Assignee: Josh Wills
> Attachments:
> 0001-Patched-the-MSCRPlanner-to-correctly-add-dependencie.patch
>
>
> The planner does the planning in multiple stages when it finds job
> dependencies on ReadableData. One example of this case is when using the
> BloomFilterJoinStrategy.
> While the generated plan dot file looks good, the planner actually does not
> add dependencies between jobs that are created in different planning stages.
> I have a pipeline that reads 3 input sources. It joins 2 of them using a
> bloom filter join strategy. Later on, it joins this with the output of a job
> coming from the third source path.
> If the jobs on the branch using the bloom filter finish before the
> one reading the third source, the executor attempts to start the 4th job,
> which is supposed to join everything, before the 3rd one finishes, resulting
> in an input Path not found exception.
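
The failure mode described above can be illustrated with a small toy model. This is only a sketch and not Crunch's MSCRPlanner or executor code: the class StagedPlanSketch, its methods, and the job names job1..job4 are all hypothetical, and which exact edge the real planner dropped is an assumption made for illustration. It shows that if the final join job lacks a dependency on a job created in a different planning stage, a scheduler that only checks recorded dependencies will consider it runnable too early.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Toy dependency model (hypothetical, not Crunch code). */
public class StagedPlanSketch {
    // Maps each job name to the set of jobs that must finish before it starts.
    private final Map<String, Set<String>> deps = new LinkedHashMap<>();

    /** Registers a job together with the jobs it depends on. */
    public void addJob(String job, String... prerequisites) {
        deps.computeIfAbsent(job, k -> new LinkedHashSet<>())
            .addAll(Arrays.asList(prerequisites));
        for (String p : prerequisites) {
            deps.computeIfAbsent(p, k -> new LinkedHashSet<>());
        }
    }

    /** Returns unfinished jobs whose recorded prerequisites have all finished. */
    public Set<String> runnable(Set<String> finished) {
        Set<String> ready = new LinkedHashSet<>();
        for (Map.Entry<String, Set<String>> e : deps.entrySet()) {
            if (!finished.contains(e.getKey()) && finished.containsAll(e.getValue())) {
                ready.add(e.getKey());
            }
        }
        return ready;
    }

    public static void main(String[] args) {
        // Assumed scenario: job1 and job2 form the bloom filter branch,
        // job3 reads the third source, job4 joins everything.
        Set<String> finished = new LinkedHashSet<>(Arrays.asList("job1", "job2"));

        // Plan missing the cross-stage edge job4 -> job3 (assumed bug shape).
        StagedPlanSketch buggy = new StagedPlanSketch();
        buggy.addJob("job1");
        buggy.addJob("job2", "job1");
        buggy.addJob("job3");
        buggy.addJob("job4", "job2");

        // Plan with the cross-stage edge recorded.
        StagedPlanSketch fixed = new StagedPlanSketch();
        fixed.addJob("job1");
        fixed.addJob("job2", "job1");
        fixed.addJob("job3");
        fixed.addJob("job4", "job2", "job3");

        System.out.println("buggy: " + buggy.runnable(finished));
        System.out.println("fixed: " + fixed.runnable(finished));
    }
}
```

In the plan without the cross-stage edge, job4 is reported as runnable while job3 is still pending, which corresponds to the missing input path described in the report. With the edge recorded, only job3 is runnable at that point.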