juripetersen opened a new pull request, #396:
URL: https://github.com/apache/incubator-wayang/pull/396

   This PR proposes an abstraction of the current cost model.
   As of now, the cost model uses a somewhat _hardcoded_ calculation.
   To make the optimizer's cost calculation more pluggable and allow users to 
optimize for their desired cost metric, this PR provides an interface called 
_EstimatableCost_ that can be implemented to define custom ways of calculating 
costs.
   
   The current _hardcoded_ calculation was moved to _DefaultEstimatableCost_ 
and is used by default.
   
   # Proposed way of providing a cost model:
   Example from `guides/WordCount.java`:
   
   ```java
   public class CustomEstimatableCost implements EstimatableCost {
       /* Provide concrete implementations to match desired cost function(s)
        * by implementing the interface in this class.
        */
   }
   public class WordCount {
       public static void main(String[] args) {
           /* Create a Wayang context and specify the platforms Wayang will
            * consider. */
           Configuration config = new Configuration();
           /* Provide an EstimatableCost that implements the interface. */
           config.setCostModel(new CustomEstimatableCost());
           WayangContext wayangContext = new WayangContext(config)
                   .withPlugin(Java.basicPlugin())
                   .withPlugin(Spark.basicPlugin());
           /*... omitted */
       }
   }
   ```
   In a project, we used this abstraction to implement a proof-of-concept 
ML-based runtime estimator as a cost model. Because the ML model only needed to 
decide between (sub-)plans, we modified the _Job_ class so that an 
implementation of _EstimatableCost_ can override the decision between these 
plans.
   
   Thus, extending the `EstimatableCost` interface with the following
   method seems logical:
   ```java
       public PlanImplementation pickBestExecutionPlan(
           Collection<PlanImplementation> executionPlans,
           ExecutionPlan existingPlan,
           Set<Channel> openChannels,
           Set<ExecutionStage> executedStages);
   ```
   This method could then be invoked in `Job.java` like this:
   ```java
       public PlanImplementation pickBestExecutionPlan(
           Collection<PlanImplementation> executionPlans,
           ExecutionPlan existingPlan,
           Set<Channel> openChannels,
           Set<ExecutionStage> executedStages) {
           return this.configuration
               .getCostModel()
               .getFactory()
               .makeCost()
               .pickBestExecutionPlan(
                   executionPlans,
                   existingPlan,
                   openChannels,
                   executedStages
               );
       }
   ```
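
To illustrate what an implementation of such a method might look like, here is a minimal, self-contained sketch. It does not use the Wayang API: `CandidatePlan`, `PlanChooser`, and `PredictedRuntimeChooser` are hypothetical stand-ins for `PlanImplementation` and the proposed `EstimatableCost` method, the extra parameters (`existingPlan`, `openChannels`, `executedStages`) are omitted, and the "model" is a toy function rather than a trained ML model.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.ToDoubleFunction;

public class PickBestPlanSketch {

    /** Hypothetical stand-in for Wayang's PlanImplementation. */
    static final class CandidatePlan {
        final String name;
        final int operatorCount;

        CandidatePlan(String name, int operatorCount) {
            this.name = name;
            this.operatorCount = operatorCount;
        }
    }

    /** Hypothetical, reduced form of the proposed method (extra parameters omitted). */
    interface PlanChooser {
        CandidatePlan pickBestExecutionPlan(Collection<CandidatePlan> executionPlans);
    }

    /** Picks the candidate with the lowest runtime predicted by a pluggable model. */
    static final class PredictedRuntimeChooser implements PlanChooser {
        private final ToDoubleFunction<CandidatePlan> runtimePredictor;

        PredictedRuntimeChooser(ToDoubleFunction<CandidatePlan> runtimePredictor) {
            this.runtimePredictor = runtimePredictor;
        }

        @Override
        public CandidatePlan pickBestExecutionPlan(Collection<CandidatePlan> executionPlans) {
            // Compare plans by predicted runtime and return the cheapest one.
            return executionPlans.stream()
                    .min(Comparator.comparingDouble(runtimePredictor))
                    .orElseThrow(() -> new NoSuchElementException("no candidate plans"));
        }
    }

    public static void main(String[] args) {
        // Toy predictor standing in for a trained model: 10 ms per operator.
        PlanChooser chooser = new PredictedRuntimeChooser(plan -> 10.0 * plan.operatorCount);
        List<CandidatePlan> plans = Arrays.asList(
                new CandidatePlan("spark-only", 5),
                new CandidatePlan("java-only", 3));
        System.out.println(chooser.pickBestExecutionPlan(plans).name); // prints "java-only"
    }
}
```

Keeping the predictor behind a functional interface means the selection logic stays the same whether the estimate comes from a hand-written formula or a learned model.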

