[
https://issues.apache.org/jira/browse/PIG-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Olga Natkovich updated PIG-1734:
--------------------------------
Fix Version/s: 0.10
> Pig needs a more efficient DAG execution
> ----------------------------------------
>
> Key: PIG-1734
> URL: https://issues.apache.org/jira/browse/PIG-1734
> Project: Pig
> Issue Type: Improvement
> Reporter: Olga Natkovich
> Fix For: 0.10
>
>
> The current code uses Hadoop's JobControl to execute one stage at a time.
> The first stage includes all jobs with no dependencies, the second stage
> includes the jobs that depend only on jobs completed in the first stage, the
> third stage contains the jobs that depend on jobs from stages 1 and 2, etc.
> The problem with this simplistic approach is that each stage starts only
> when the previous stage is over, which means that some branches of the DAG
> are unnecessarily blocked.
> We would need to do our own DAG management to solve this issue, which would
> be a pretty significant undertaking. Something we should look at in the future.
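
To illustrate the blocking described above, here is a minimal sketch that compares stage-at-a-time execution with dependency-driven execution on a small DAG. It is not Pig or Hadoop code: the function names, the job names, and the durations are all hypothetical, and it assumes unlimited parallelism so that only the scheduling policy affects the finish times.

```python
def assign_stages(deps):
    """Stage of a job = 1 + the deepest stage among its dependencies."""
    stages = {}

    def visit(job):
        if job not in stages:
            stages[job] = 1 + max((visit(d) for d in deps[job]), default=0)
        return stages[job]

    for job in deps:
        visit(job)
    return stages


def staged_finish_times(deps, duration):
    """Each stage starts only after every job in the previous stage is done."""
    stages = assign_stages(deps)
    finish = {}
    stage_start = 0
    for s in range(1, max(stages.values()) + 1):
        jobs = [j for j, st in stages.items() if st == s]
        for j in jobs:
            finish[j] = stage_start + duration[j]
        # The next stage waits for the slowest job of this stage.
        stage_start = max(finish[j] for j in jobs)
    return finish


def dependency_driven_finish_times(deps, duration):
    """Each job starts as soon as its own dependencies are done."""
    finish = {}

    def visit(job):
        if job not in finish:
            start = max((visit(d) for d in deps[job]), default=0)
            finish[job] = start + duration[job]
        return finish[job]

    for job in deps:
        visit(job)
    return finish


# Hypothetical DAG with two independent branches: A -> C and B -> D.
deps = {"A": [], "B": [], "C": ["A"], "D": ["B"]}
duration = {"A": 1, "B": 10, "C": 1, "D": 1}

staged = staged_finish_times(deps, duration)
driven = dependency_driven_finish_times(deps, duration)
print("staged:", staged)
print("dependency-driven:", driven)
```

In this example job C depends only on the short job A, yet under staged execution it cannot start until the long job B has also finished. With dependency-driven scheduling C finishes much earlier, while the overall completion time of the longest branch is unchanged.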