Hi all,

I'd like to start a discussion about introducing a few convenient operations in the Table API to improve ease of use.

Currently, some tasks are not easy to express in the Table API, e.g. deduplication and top-N. Others become hard to express when a table has hundreds of columns, e.g. null data handling.

I'd like to propose introducing a few operations in the Table API with the following purposes:

- Allow Table API users to easily leverage the powerful features already available in SQL, e.g. deduplication, top-N, etc.
- Provide some convenient operations, e.g. a series of operations for null data handling (which becomes a problem when there are hundreds of columns), and data sampling and splitting (a very common use case in ML, which usually needs to split a table into multiple tables for training and validation separately).

Please refer to FLIP-155 [1] for more details.

Looking forward to your feedback!

Regards,
Dian

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-155%3A+Introduce+a+few+convenient+operations+in+Table+API