Hi Andreas,

I too like the idea of having Flink as a backend for SystemML, and the POC is a step in the right direction. I agree with Matthias's comments regarding shared functionality and multi-stage integration. I would also recommend using naming conventions similar to our Hadoop and Spark backends, i.e., hybrid_flink and flink. Looking forward to more discussions on this topic :)
Thanks,

Niketan Pansare
IBM Almaden Research Center
E-mail: npansar At us.ibm.com
http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar

From: "Kunft, Andreas" <[email protected]>
To: "[email protected]" <[email protected]>
Date: 03/03/2016 09:47 AM
Subject: Re: Add Apache Flink as new backend

Hello,

thank you for the fast reply. We are glad you like the idea! As a next step, we will focus on implementing an end-to-end integration based on your suggestions. We think this initial integration is a good starting point for further discussions based on the concrete implementation in the pull request.

Best
Andreas

________________________________
From: Matthias Boehm <[email protected]>
Sent: Thursday, March 3, 2016, 06:44
To: [email protected]
Subject: Re: Add Apache Flink as new backend

Thanks guys, for sharing the details of this prototype. In general, I really like the idea of having a Flink backend in SystemML. We just need to structure the code (similar to our Spark backend) in a way that Flink libraries are not necessarily required when running in Spark or MapReduce execution modes. To answer your questions in detail:

1) Shared functionality: I would recommend reusing the upper levels (i.e., language, hops, lops, etc.) and core block operations, but keeping the instructions (and everything that accesses Flink APIs) independent. Yes, this separation comes at the cost of code duplication, but it allows running each backend without the libraries of the other backends. Note that we did the same for our Spark backend, which allows us to run the same jar in old MapReduce v1 environments where these libraries are not available at runtime. Down the road we might consolidate common functionality such as runtime maintenance of matrix characteristics.

2) Execution modes: Yes, please add two new execution modes in DMLScript.RUNTIME_PLATFORM and one new execution type in Lop.ExecType. Once this is done, you can already run end-to-end scripts with '-exec hybrid_flink'. Of course, we can have more detailed discussions about how and when to select Flink operators during operator selection in hybrid_flink mode. As a start, I would recommend our default heuristic of compiling Flink operators whenever the memory estimate of an operation exceeds the local memory budget of the driver/client process (see the first sketch at the end of this mail).

3) Pull requests: I would recommend multiple stages: (1) initially a minimal end-to-end integration, (2) multiple packages of "instruction sets" incl. tests, (3) specific rewrites / optimizer extensions, and later (4) continuous improvements. For the initial end-to-end integration, I would focus on two or three simple yet very important instructions (tsmm, mapmm, mapmmchain), basic converter utils, and a basic end-to-end integration (execution types, serialization, buffer pool, etc.). Having tsmm and mapmm (plus optionally mapmmchain) would already allow you to run end-to-end algorithms like LinregDS, LinregCG, GLM, L2SVM, PageRank, etc. for common scenarios where only transpose-self matrix multiplications or matrix-vector multiplications are compiled to distributed operations, while the remaining operators are executed in the driver/client and vectors are small enough to be broadcast to mapmm/mapmmchain (see the second and third sketches below).

4) Refactoring the runtime package: Again, don't worry about the refactoring of our runtime packages. The focus is mainly our block runtime, restructuring everything such that this runtime can be easily packaged as an individual jar and distributed/consumed independently of SystemML.
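For illustration, here is a minimal sketch of point 2, assuming the hybrid_flink/flink naming convention suggested above. The enum constants HYBRID_FLINK and FLINK and the helper chooseFlinkExecType are assumptions that simply mirror the existing Spark entries; they are not existing API.

    import org.apache.sysml.hops.Hop;
    import org.apache.sysml.hops.OptimizerUtils;
    import org.apache.sysml.lops.Lop.ExecType;

    // Sketch: new entries mirroring the Spark modes (names are assumptions).
    // In org.apache.sysml.api.DMLScript:
    public enum RUNTIME_PLATFORM {
        HADOOP,        // MR only
        SINGLE_NODE,   // CP only
        HYBRID,        // CP + MR
        HYBRID_SPARK,  // CP + Spark
        SPARK,         // Spark only
        HYBRID_FLINK,  // CP + Flink (new)
        FLINK          // Flink only (new)
    }

    // In org.apache.sysml.lops.Lop:
    public enum ExecType { CP, CP_FILE, MR, SPARK, FLINK, INVALID }

    // Default heuristic for operator selection in hybrid_flink mode:
    // compile a Flink operator whenever the memory estimate of an operation
    // exceeds the local memory budget of the driver/client process.
    // (chooseFlinkExecType is a hypothetical helper, not existing API.)
    private ExecType chooseFlinkExecType(Hop hop) {
        double memEstimate = hop.getMemEstimate();               // worst-case estimate in bytes
        double localBudget = OptimizerUtils.getLocalMemBudget(); // driver/client budget in bytes
        return (memEstimate > localBudget) ? ExecType.FLINK : ExecType.CP;
    }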
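A second sketch, for the tsmm instruction from point 3: since t(X) %*% X decomposes into a sum of per-row-block partial products t(X_i) %*% X_i, it maps to a Flink map followed by a reduce. The helpers tsmmBlock and addBlocks are hypothetical placeholders for SystemML's block-level tsmm and block-addition kernels.

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.common.functions.ReduceFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.sysml.runtime.matrix.data.MatrixBlock;
    import org.apache.sysml.runtime.matrix.data.MatrixIndexes;

    // Input: X as a block-partitioned matrix, DataSet<Tuple2<MatrixIndexes, MatrixBlock>>.
    // The result t(X) %*% X is a small (ncols x ncols) block, so the partials
    // can be folded into a single output block by the reduce.
    DataSet<MatrixBlock> out = X
        .map(new MapFunction<Tuple2<MatrixIndexes, MatrixBlock>, MatrixBlock>() {
            public MatrixBlock map(Tuple2<MatrixIndexes, MatrixBlock> in) throws Exception {
                return tsmmBlock(in.f1); // hypothetical: block-level t(X_i) %*% X_i
            }
        })
        .reduce(new ReduceFunction<MatrixBlock>() {
            public MatrixBlock reduce(MatrixBlock a, MatrixBlock b) throws Exception {
                return addBlocks(a, b); // hypothetical: element-wise block addition
            }
        });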
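And a third sketch, for broadcast-based mapmm: the small vector is shipped to every task as a Flink broadcast set, so each row-block computes its product locally without a shuffle. The broadcast name "vector" and the helper multiplyBlocks are assumptions.

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.configuration.Configuration;
    import org.apache.sysml.runtime.matrix.data.MatrixBlock;
    import org.apache.sysml.runtime.matrix.data.MatrixIndexes;

    // X: block-partitioned matrix; vec: single-block vector small enough to broadcast.
    DataSet<Tuple2<MatrixIndexes, MatrixBlock>> out = X
        .map(new RichMapFunction<Tuple2<MatrixIndexes, MatrixBlock>,
                                 Tuple2<MatrixIndexes, MatrixBlock>>() {
            private MatrixBlock v;

            @Override
            public void open(Configuration conf) {
                // fetch the broadcast vector once per task ("vector" is an assumed name)
                v = getRuntimeContext().<MatrixBlock>getBroadcastVariable("vector").get(0);
            }

            @Override
            public Tuple2<MatrixIndexes, MatrixBlock> map(Tuple2<MatrixIndexes, MatrixBlock> in) {
                // hypothetical helper for the block-level product X_i %*% v
                return new Tuple2<>(in.f0, multiplyBlocks(in.f1, v));
            }
        })
        .withBroadcastSet(vec, "vector");

Compared to a join-based mapmm, broadcasting avoids shuffling the large matrix side, which matches the matrix-vector scenarios described in point 3.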
I'm looking forward to many more discussions on this topic.

Regards
Matthias

From: "Kunft, Andreas" <[email protected]>
To: "[email protected]" <[email protected]>
Date: 03/02/2016 11:51 AM
Subject: Add Apache Flink as new backend

________________________________

Hi all,

we are a group of researchers from the Database group (DIMA) at TU Berlin. We would like to add Apache Flink as an execution backend to SystemML, in addition to Hadoop MR and Spark. To this end, we started implementing a proof of concept consisting of several instructions together with the necessary de-/serialization and execution logic. You can see the current state of our fork [1], including two test cases showing what we currently support [2][3].

For our simple POC implementation, we realized that we had to duplicate a lot of functionality (especially from the Spark instructions). We saw that people have already raised concerns regarding the refactoring of the runtime package [4][5], which would potentially make it easier to integrate further backend systems. Given that this would be a bigger change, it would be helpful to get some input from the SystemML community regarding this effort. In particular, we would like to discuss the following questions:

* How should we deal with shared functionality between the different backends (Flink, Spark, etc.) to avoid code duplication, especially in instructions, but also to introduce modularity? And is this modularization even desired?
* How should we integrate Flink into the different runtime modes (Flink-only, Flink-hybrid, etc.)?
* How should we structure the integration (multiple/single commits)?

We're looking forward to feedback and hope the community likes the idea of adding Flink as an execution backend to SystemML.

Best,
Andreas Kunft
Christoph Brücke
Felix Schüler

[1] https://github.com/stratosphere/incubator-systemml/tree/flink-integration
[2] https://github.com/stratosphere/incubator-systemml/blob/flink-integration/src/test/java/org/apache/sysml/runtime/instructions/flink/TsmmFLInstructionTest.java
[3] https://github.com/stratosphere/incubator-systemml/blob/flink-integration/src/test/java/org/apache/sysml/runtime/instructions/flink/utils/DataSetConverterUtilsTest.java
[4] https://issues.apache.org/jira/browse/SYSTEMML-33
[5] https://www.mail-archive.com/search?l=dev%40systemml.incubator.apache.org&q=subject%3A%22Runtime+package+refactoring%22&o=newest&f=1
