Hi all,

We are a group of researchers from the Database group (DIMA) at TU Berlin. We
would like to add Apache Flink as an execution backend to SystemML, in addition
to Hadoop MR and Spark.
To this end, we have started implementing a proof of concept consisting of
several instructions together with the necessary serialization,
deserialization, and execution logic.
The current state is available in our fork [1], including two test cases that
show what we currently support [2][3].

While implementing our simple POC, we realized that we had to duplicate a lot
of functionality (especially from the Spark instructions). We saw that people
have already raised the question of refactoring the runtime package [4][5],
which could make it easier to integrate further backend systems.
Given that this would be a bigger change, it would be helpful to get some input
from the SystemML community regarding this effort.

In particular, we would like to discuss the following questions:

  * How should we deal with functionality shared between the different
    backends (Flink, Spark, etc.) so that we avoid code duplication,
    especially in instructions, and also introduce modularity? Is this
    modularization even desired?
  * How should we integrate Flink into the different runtime modes
    (Flink-only, Flink-hybrid, etc.)?
  * How should we structure the integration (multiple commits or a single
    commit)?

We’re looking forward to feedback and hope the community likes the idea of 
adding Flink as an execution backend to SystemML.

Best,
Andreas Kunft
Christoph Brücke
Felix Schüler

[1] https://github.com/stratosphere/incubator-systemml/tree/flink-integration
[2] https://github.com/stratosphere/incubator-systemml/blob/flink-integration/src/test/java/org/apache/sysml/runtime/instructions/flink/TsmmFLInstructionTest.java
[3] https://github.com/stratosphere/incubator-systemml/blob/flink-integration/src/test/java/org/apache/sysml/runtime/instructions/flink/utils/DataSetConverterUtilsTest.java
[4] https://issues.apache.org/jira/browse/SYSTEMML-33
[5] https://www.mail-archive.com/search?l=dev%40systemml.incubator.apache.org&q=subject%3A%22Runtime+package+refactoring%22&o=newest&f=1
