juripetersen opened a new pull request, #396:
URL: https://github.com/apache/incubator-wayang/pull/396
This PR proposes an abstraction of the current cost model.
As of now, the cost model's calculation is largely _hardcoded_.
To make the optimizer's cost calculation pluggable and let users
optimize for their desired cost metric, this PR provides an interface called
_EstimatableCost_ that can be implemented to define custom ways of calculating
costs.
The existing _hardcoded_ calculation was moved to _DefaultEstimatableCost_
and is used by default.
# Proposed way to provide a cost model:
Example from `guides/WordCount.java`:
```java
public class CustomEstimatableCost implements EstimatableCost {
    /* Provide concrete implementations to match the desired cost function(s)
     * by implementing the interface's methods in this class.
     */
}

public class WordCount {
    public static void main(String[] args) {
        /* Create a configuration and provide an EstimatableCost
           implementation as the cost model. */
        Configuration config = new Configuration();
        config.setCostModel(new CustomEstimatableCost());

        /* Create a Wayang context and specify the platforms Wayang will
           consider. */
        WayangContext wayangContext = new WayangContext(config)
                .withPlugin(Java.basicPlugin())
                .withPlugin(Spark.basicPlugin());
        /*... omitted */
    }
}
```
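Since `CustomEstimatableCost` above is left empty, the following self-contained sketch illustrates the general idea of swapping the cost metric. It does not use Wayang's classes: `PlanSketch`, `CostSketch`, `RuntimeCostSketch`, `MonetaryCostSketch`, and `ConfigSketch` are hypothetical stand-ins, and the single `estimate` method is an assumed simplification rather than the actual method set of `EstimatableCost`.

```java
// Hypothetical stand-in for a plan, reduced to two numbers. NOT a Wayang class.
final class PlanSketch {
    final double runtimeSeconds;
    final double dollars;

    PlanSketch(double runtimeSeconds, double dollars) {
        this.runtimeSeconds = runtimeSeconds;
        this.dollars = dollars;
    }
}

// Simplified analogue of a pluggable cost interface (assumed shape).
interface CostSketch {
    double estimate(PlanSketch plan);
}

// Plays the role of the default model: cost is the estimated runtime.
final class RuntimeCostSketch implements CostSketch {
    @Override
    public double estimate(PlanSketch plan) {
        return plan.runtimeSeconds;
    }
}

// A user-defined metric: monetary cost instead of runtime.
final class MonetaryCostSketch implements CostSketch {
    @Override
    public double estimate(PlanSketch plan) {
        return plan.dollars;
    }
}

// Stand-in for a configuration that carries the cost model.
final class ConfigSketch {
    private CostSketch costModel = new RuntimeCostSketch();

    void setCostModel(CostSketch costModel) {
        this.costModel = costModel;
    }

    CostSketch getCostModel() {
        return costModel;
    }
}
```

With this shape, code that asks the configuration for a cost never needs to know which metric is active; only the object passed to `setCostModel` changes.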
In a project, we used this abstraction to implement a proof-of-concept ML-based
runtime estimation as a cost model. Because the ML model only needed to decide
between (sub-)plans, we modified the _Job_ class so that an implementation of
_EstimatableCost_ can override the decision between these plans.
It therefore seems logical to extend the `EstimatableCost` interface with the
following method:
```java
public PlanImplementation pickBestExecutionPlan(
        Collection<PlanImplementation> executionPlans,
        ExecutionPlan existingPlan,
        Set<Channel> openChannels,
        Set<ExecutionStage> executedStages);
```
This method could then be invoked in `Job.java` like this:
```java
public PlanImplementation pickBestExecutionPlan(
        Collection<PlanImplementation> executionPlans,
        ExecutionPlan existingPlan,
        Set<Channel> openChannels,
        Set<ExecutionStage> executedStages) {
    return this.configuration
            .getCostModel()
            .getFactory()
            .makeCost()
            .pickBestExecutionPlan(
                    executionPlans,
                    existingPlan,
                    openChannels,
                    executedStages
            );
}
```
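The delegation above can be illustrated with a small self-contained sketch. Everything in it is a hypothetical stand-in rather than Wayang code: `CandidatePlan` replaces `PlanImplementation`, the method takes only the candidate plans, and `prefers` is a placeholder for whatever a learned model would return. The point is only that a default "smallest estimate wins" rule can live in the interface while an implementation overrides the decision itself.

```java
import java.util.Collection;
import java.util.Comparator;

// Hypothetical stand-in for a candidate plan. NOT Wayang's PlanImplementation.
final class CandidatePlan {
    final String name;
    final double estimatedRuntime;

    CandidatePlan(String name, double estimatedRuntime) {
        this.name = name;
        this.estimatedRuntime = estimatedRuntime;
    }
}

// Assumed shape: the cost model owns both the estimate and the final choice.
interface PickingCostModel {
    double estimate(CandidatePlan plan);

    // Default decision rule: the plan with the smallest estimate wins.
    default CandidatePlan pickBest(Collection<CandidatePlan> plans) {
        return plans.stream()
                .min(Comparator.comparingDouble(this::estimate))
                .orElseThrow(() -> new IllegalArgumentException("no plans to choose from"));
    }
}

// Keeps the default rule.
final class RuntimePickingModel implements PickingCostModel {
    @Override
    public double estimate(CandidatePlan plan) {
        return plan.estimatedRuntime;
    }
}

// Overrides the decision itself, as a model that only ranks plans might do.
final class PairwisePickingModel implements PickingCostModel {
    @Override
    public double estimate(CandidatePlan plan) {
        return plan.estimatedRuntime;
    }

    // Placeholder for a model call: here it simply prefers the shorter name.
    private boolean prefers(CandidatePlan a, CandidatePlan b) {
        return a.name.length() <= b.name.length();
    }

    @Override
    public CandidatePlan pickBest(Collection<CandidatePlan> plans) {
        return plans.stream()
                .reduce((best, next) -> prefers(best, next) ? best : next)
                .orElseThrow(() -> new IllegalArgumentException("no plans to choose from"));
    }
}

// Stand-in for the job: it only delegates the choice to the configured model.
final class JobSketch {
    private final PickingCostModel costModel;

    JobSketch(PickingCostModel costModel) {
        this.costModel = costModel;
    }

    CandidatePlan pickBestExecutionPlan(Collection<CandidatePlan> plans) {
        return costModel.pickBest(plans);
    }
}
```

A default method keeps existing implementations working unchanged, while a model that can only compare plans pairwise can replace the selection without having to produce a numeric cost.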