Josh Wills created CRUNCH-128:
---------------------------------

             Summary: Allow one stage of an MR pipeline to depend on another target being created
                 Key: CRUNCH-128
                 URL: https://issues.apache.org/jira/browse/CRUNCH-128
             Project: Crunch
          Issue Type: Bug
            Reporter: Josh Wills


Some problems (e.g., map-side joins and total orderings) require a guarantee 
that one PCollection has been written to the FileSystem before another 
MapReduce pipeline that depends on that file is allowed to run. This doesn't 
fit cleanly into Crunch's current set of abstractions, which is why we force 
pipelines to execute via the run command: it guarantees that the files have 
been created before the second stage is run.

We should add the ability for a particular PCollection to require that a 
SourceTarget instance has been created before it can be executed, and the 
planner should incorporate this information into the MR pipeline planning 
process.
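
To make the proposal concrete, here is a minimal sketch of how a planner could 
order stages once each stage declares the targets it requires. It is not Crunch 
code: StageOrderSketch, Stage, plan, and the example paths are all hypothetical 
names invented for illustration, and a real planner would work on the MR job 
graph rather than on plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names are part of the Crunch API.
final class StageOrderSketch {

    /** A stage that writes some targets and requires others to exist first. */
    static final class Stage {
        final String name;
        final Set<String> requires; // targets that must exist before this stage runs
        final Set<String> produces; // targets this stage writes

        Stage(String name, Set<String> requires, Set<String> produces) {
            this.name = name;
            this.requires = requires;
            this.produces = produces;
        }
    }

    /**
     * Returns stage names in an order where every required target either
     * already exists or is produced by an earlier stage.
     */
    static List<String> plan(List<Stage> stages, Set<String> existing) {
        Set<String> available = new HashSet<>(existing);
        List<Stage> pending = new ArrayList<>(stages);
        List<String> order = new ArrayList<>();
        while (!pending.isEmpty()) {
            // Pick the first pending stage whose requirements are all satisfied.
            Stage next = null;
            for (Stage s : pending) {
                if (available.containsAll(s.requires)) {
                    next = s;
                    break;
                }
            }
            if (next == null) {
                // Nothing can run: a target is missing or the dependencies are cyclic.
                throw new IllegalStateException("Unsatisfiable target dependency");
            }
            pending.remove(next);
            available.addAll(next.produces);
            order.add(next.name);
        }
        return order;
    }

    public static void main(String[] args) {
        List<Stage> stages = Arrays.asList(
            new Stage("join",
                Collections.singleton("/tmp/small-table"),
                Collections.singleton("/out/joined")),
            new Stage("write-small-table",
                Collections.<String>emptySet(),
                Collections.singleton("/tmp/small-table")));
        // The join is declared first but is scheduled second.
        System.out.println(plan(stages, Collections.<String>emptySet()));
    }
}
```

In this sketch the map-side join is listed first but is only scheduled after 
the stage that writes the table it reads, which is the ordering guarantee that 
currently requires an explicit run between the two stages.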
