Hi everyone,
My colleagues (in cc) and I would like to propose this FLIP for
discussion. In short, we want to reduce the number of APIs that we have
by deprecating the DataSet API. This is a big step for Flink, which is
why I'm also cross-posting this to the User Mailing List.
FLIP-131: http://s.apache.org/FLIP-131
I'm posting the introduction of the FLIP below, but please refer to the
document linked above for the full details:
--
Flink provides three main SDKs/APIs for writing Dataflow Programs: Table
API/SQL, the DataStream API, and the DataSet API. We believe that this
is one API too many and propose to deprecate the DataSet API in favor of
the Table API/SQL and the DataStream API. Of course, this is easier said
than done, so in the following, we will outline why we think that having
too many APIs is detrimental to the project and community. We will then
describe how we can enhance the Table API/SQL and the DataStream API to
subsume the DataSet API's functionality.
In this FLIP, we will not describe all the technical details of how the
Table API/SQL and the DataStream API will be enhanced. The goal is to achieve
consensus on the idea of deprecating the DataSet API. There will have to
be follow-up FLIPs that describe the necessary changes for the APIs that
we maintain.
--
Please let us know if you have any concerns or comments. Also, please
keep the discussion to this ML thread instead of commenting in the Wiki so
that we can have a consistent view of the discussion.
Best,
Aljoscha