Hi all,

Zhipeng, Fan (cc'ed) and I are opening this thread to discuss two different designs for extending the Flink ML API to support more use cases, e.g. expressing a DAG of preprocessing and training logic. Both designs are documented in FLIP-173 <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=184615783>.
We have different opinions on the usability and ease of understanding of the proposed APIs. It would be really useful to hear the open source community's comments on these designs and to learn your preferences.

To facilitate the discussion, we have summarized our design principles and opinions in this Google doc <https://docs.google.com/document/d/1L3aI9LjkcUPoM52liEY6uFktMnFMNFQ6kXAjnz_11do>. The doc also provides code snippets for a few example use cases to demonstrate the differences between the two solutions.

This API is very important to the future of the Flink ML library. Please feel free to reply to this email thread or comment in the Google doc directly.

Thank you!

Dong, Zhipeng, Fan