Re: [DISCUSS] Introduce partitioning strategies to Table/SQL

2020-09-02 Thread Benchao Li
Hi Jingsong, Thanks for the clarification, and sorry for misunderstanding your original intention. What I was talking about is indeed another topic; we can leave it for the future and see whether other people have the same scenarios. Jingsong Li wrote on Thu, Sep 3, 2020 at 10:56 AM: > Thanks Timo for

Re: [DISCUSS] Introduce partitioning strategies to Table/SQL

2020-09-02 Thread Jingsong Li
Thanks Timo for working on FLIP-107. Agreed, I think it is good. I'll spend more time drafting a detailed FLIP later. Best, Jingsong On Wed, Sep 2, 2020 at 7:12 PM Timo Walther wrote: > Hi Jingsong, > > I haven't looked at your proposal but I think it makes sense to have a > separate FLIP for

Re: [DISCUSS] Introduce partitioning strategies to Table/SQL

2020-09-02 Thread Timo Walther
Hi Jingsong, I haven't looked at your proposal but I think it makes sense to have a separate FLIP for the partitioning topic. I'm currently working on an update to FLIP-107 and would suggest removing the partitioning topic from it. FLIP-107 will only focus on accessing metadata and expressing

Re: [DISCUSS] Introduce partitioning strategies to Table/SQL

2020-08-31 Thread Jingsong Li
Thanks Konstantin and Benchao for your response. If we need to push forward the implementation, it should be a FLIP. My original intention was to unify the partition definitions for batches and streams: - What is "PARTITION" on a table? Partitions define the physical storage form of a table.

Re: [DISCUSS] Introduce partitioning strategies to Table/SQL

2020-08-31 Thread Benchao Li
Hi Jingsong, Thanks for bringing up this discussion. I like this idea generally. I'd like to add some cases we have met in our scenarios. ## Source Partition By There is a use case where users want to do lookups in a UDF, much like a dimension table. It's common for them to cache

Re: [DISCUSS] Introduce partitioning strategies to Table/SQL

2020-08-31 Thread Konstantin Knauf
Hi Jingsong, I would like to understand this FLIP (?) a bit better, but I am missing some background, I believe. So, some basic questions: 1) Does the PARTITION BY clause only have an effect for sink tables, defining how data should be partitioned in the sink system, or does it also make a

[DISCUSS] Introduce partitioning strategies to Table/SQL

2020-08-24 Thread Jingsong Li
Hi all, ## Motivation FLIP-63 [1] introduced initial support for the PARTITIONED BY clause, to an extent that lets us support Hive's partitioning. But this partition definition is completely specific to Hive/file systems. With the continuous development of the system, there are new requirements: -