Compute LogicalPlan signature and store in job conf ---------------------------------------------------
Key: PIG-2587 URL: https://issues.apache.org/jira/browse/PIG-2587 Project: Pig Issue Type: Improvement Reporter: Bill Graham Assignee: Bill Graham We'd like to be able to uniquely identify a re-executed script (possibly with different inputs/outputs) by creating a signature of the {{LogicalPlan}}. Here's the proposal: # Add a new method {{LogicalPlan.getSignature()}} that returns a hash of its {{LogicalPlanPrinter}} output. # In {{PigServer.execute()}} set the signature on the job conf after the LP is compiled, but before it's executed. (1) would allow an impl of {{PigProgressNotificationListener.setScriptPlan()}} to save the LP signature with the script metadata. Upon subsequent runs (2) would allow an impl of {{PigReducerEstimator}} (see PIG-2574) to retrieve the current LP signature and fetch the historical data for the script. It could then use the previous run data to better estimate the number of reducers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira