> On Oct. 21, 2015, 11:19 a.m., Rohini Palaniswamy wrote:
> > Overall approach in general looks good and viable. There might be more of 
> > condition checks to be added and corner cases to handle as you go along and 
> > test out the implementation. I have commented on whatever gaps I could 
> > think of at the moment.

Major changes 
- Use logical signature instead of script hash
- Code style changes
- Storing the success/failure state of MRJob nodes

I have replied to the comments not falling in above category. Please check.


- Abhishek


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39226/#review103384
-----------------------------------------------------------


On Oct. 12, 2015, 11:30 a.m., Abhishek Agarwal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39226/
> -----------------------------------------------------------
> 
> (Updated Oct. 12, 2015, 11:30 a.m.)
> 
> 
> Review request for pig and Rohini Palaniswamy.
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Pig scripts can have multiple ETL jobs in the DAG which may take hours to 
> finish. In case of transient errors, the job fails. When the job is rerun, 
> all the nodes in Job graph will rerun. Some of these nodes may have already 
> run successfully. Redundant runs lead to wastage of cluster capacity and 
> pipeline delays.
> 
> In case of failure, we can persist the graph state. In next run, only the 
> failed nodes and their successors will rerun. This is of course subject to 
> preconditions such as
>          > Pig script has not changed
>          > Input locations have not changed
>          > Output data from previous run is intact
>          > Configuration has not changed
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/PigConfiguration.java 03b36a5 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java
>  595e68c 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/MRIntermediateDataVisitor.java
>  4b62112 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/MRJobRecovery.java
>  PRE-CREATION 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/MRJobState.java
>  PRE-CREATION 
>   src/org/apache/pig/impl/io/FileLocalizer.java f0f9b43 
>   src/org/apache/pig/tools/grunt/GruntParser.java 439d087 
>   src/org/apache/pig/tools/pigstats/ScriptState.java 03a12b1 
> 
> Diff: https://reviews.apache.org/r/39226/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Abhishek Agarwal
> 
>

Reply via email to