Hi Godfrey, thanks for driving this meaningful topic.
I agree that statistics are essential for the optimizer; I'm just
wondering in which situations users would need this. From the user side, the
optimizer should be driven by the framework, and users may not want to think
too much about it. Could you share more scenarios for using 'ANALYZE TABLE'
from the user side?

nit: There may be a mistake in Examples#partition table;
the partition info should be:

Partition1: (ds='2022-06-01', hr=1)

Partition2: (ds='2022-06-01', hr=2)

Partition3: (ds='2022-06-02', hr=1)

Partition4: (ds='2022-06-02', hr=2)

Best,
zoucao


godfrey he <godfre...@gmail.com> wrote on Fri, Jun 10, 2022 at 15:54:

> Hi all,
>
> I would like to open a discussion on FLIP-240: Introduce "ANALYZE
> TABLE" Syntax.
>
> As FLIP-231 mentioned, statistics are one of the most important inputs
> to the optimizer. Accurate and complete statistics allow the
> optimizer to be more powerful. The "ANALYZE TABLE" syntax is a very common
> and effective approach to gathering statistics, and it has already been
> adopted by many compute engines and databases.
>
> The main purpose of this discussion is to introduce "ANALYZE TABLE" syntax
> for Flink SQL.
>
> You can find more details in FLIP-240 document[1]. Looking forward to
> your feedback.
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=217386481
> [2] POC: https://github.com/godfreyhe/flink/tree/FLIP-240
>
>
> Best,
> Godfrey
>
