> On Oct. 21, 2015, 11:19 a.m., Rohini Palaniswamy wrote:
> > Overall approach in general looks good and viable. There might be more of
> > condition checks to be added and corner cases to handle as you go along and
> > test out the implementation. I have commented on whatever gaps I could
> > think of at the moment.

Major changes
- Use logical signature instead of script hash
- Code style changes
- Storing the success/failure state of MRJob nodes

I have replied to the comments that do not fall into the above categories. Please check.


- Abhishek


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39226/#review103384
-----------------------------------------------------------


On Oct. 12, 2015, 11:30 a.m., Abhishek Agarwal wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39226/
> -----------------------------------------------------------
>
> (Updated Oct. 12, 2015, 11:30 a.m.)
>
>
> Review request for pig and Rohini Palaniswamy.
>
>
> Repository: pig-git
>
>
> Description
> -------
>
> Pig scripts can have multiple ETL jobs in the DAG which may take hours to
> finish. In case of transient errors, the job fails. When the job is rerun,
> all the nodes in Job graph will rerun. Some of these nodes may have already
> run successfully. Redundant runs lead to wastage of cluster capacity and
> pipeline delays.
>
> In case of failure, we can persist the graph state. In next run, only the
> failed nodes and their successors will rerun. This is of course subject to
> preconditions such as
>
> - Pig script has not changed
> - Input locations have not changed
> - Output data from previous run is intact
> - Configuration has not changed
>
>
> Diffs
> -----
>
>   src/org/apache/pig/PigConfiguration.java 03b36a5
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java 595e68c
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/MRIntermediateDataVisitor.java 4b62112
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/MRJobRecovery.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/MRJobState.java PRE-CREATION
>   src/org/apache/pig/impl/io/FileLocalizer.java f0f9b43
>   src/org/apache/pig/tools/grunt/GruntParser.java 439d087
>   src/org/apache/pig/tools/pigstats/ScriptState.java 03a12b1
>
> Diff: https://reviews.apache.org/r/39226/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Abhishek Agarwal
>
>
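
The recovery rule in the quoted description (rerun only the failed nodes and their successors, provided the preconditions hold) can be sketched as below. This is a minimal illustration under stated assumptions, not the code in MRJobRecovery.java or MRJobState.java from the patch. The class RecoveryPlanner, the method jobsToRerun, and the single fingerprint string standing in for "script, input locations and configuration have not changed" are all hypothetical. The reusable set is assumed to hold only jobs that succeeded in the previous run and whose output is still intact.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the rerun decision; not taken from the patch. */
final class RecoveryPlanner {

    /**
     * @param successors         job id -> ids of the jobs that consume its output;
     *                           every job in the graph must appear as a key
     * @param reusable           jobs that succeeded last time and whose output is intact
     * @param savedFingerprint   fingerprint persisted by the failed run (assumed to
     *                           cover script, input locations and configuration)
     * @param currentFingerprint fingerprint computed for the current run
     * @return ids of the jobs that have to run again
     */
    static Set<String> jobsToRerun(Map<String, List<String>> successors,
                                   Set<String> reusable,
                                   String savedFingerprint,
                                   String currentFingerprint) {
        Set<String> rerun = new LinkedHashSet<>();

        // A changed script, input or configuration invalidates the saved state,
        // so fall back to running the whole graph.
        if (!savedFingerprint.equals(currentFingerprint)) {
            rerun.addAll(successors.keySet());
            return rerun;
        }

        // Seed the work list with every job that cannot be reused.
        Deque<String> pending = new ArrayDeque<>();
        for (String job : successors.keySet()) {
            if (!reusable.contains(job)) {
                pending.add(job);
            }
        }

        // Propagate downstream: a successor of a rerun job must rerun as well.
        while (!pending.isEmpty()) {
            String job = pending.poll();
            if (rerun.add(job)) {
                pending.addAll(successors.getOrDefault(job, Collections.<String>emptyList()));
            }
        }
        return rerun;
    }
}
```

For a graph where "load" and "filter" both feed "join", which feeds "store", and only "load" and "filter" are reusable, the call returns "join" and "store" when the fingerprints match and all four jobs when they differ. Treating any fingerprint mismatch as a full rerun is the conservative choice here; a finer-grained check per job would be possible but is outside what the description states.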
