Hi Godfrey,

Compared to the solution proposed in the FLIP (using a SELECT statement), have you considered adding APIs to catalogs / connectors to perform this task as an alternative? I could imagine that many connectors could collect statistics less expensively by leveraging the underlying system (e.g. a JDBC connector can get a row count estimate without performing a SELECT COUNT(1)).
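
To make the idea a bit more concrete, here is a rough sketch of what such a hook could look like. The names `RowCountEstimator`, `PostgresEstimator` and `rowCount` are made up for illustration and are not part of Flink or the FLIP. The PostgreSQL part assumes that the planner estimate in `pg_class.reltuples` is accurate enough, and it ignores schema qualification for brevity.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.OptionalLong;
import java.util.function.ToLongFunction;

final class RowCountSketch {

    /** Hypothetical connector-side hook; NOT an existing Flink interface. */
    interface RowCountEstimator {
        /** Returns a cheap estimate, or empty if the source cannot provide one. */
        OptionalLong estimateRowCount(String tableName);
    }

    /** Reads PostgreSQL's catalog estimate instead of scanning the table. */
    static final class PostgresEstimator implements RowCountEstimator {
        private final Connection connection;

        PostgresEstimator(Connection connection) {
            this.connection = connection;
        }

        @Override
        public OptionalLong estimateRowCount(String tableName) {
            // Simplification: matches on the bare table name only (no schema).
            String sql = "SELECT CAST(reltuples AS bigint) FROM pg_class WHERE relname = ?";
            try (PreparedStatement stmt = connection.prepareStatement(sql)) {
                stmt.setString(1, tableName);
                try (ResultSet rs = stmt.executeQuery()) {
                    if (rs.next()) {
                        long estimate = rs.getLong(1);
                        // Treat a negative value as "no statistics collected yet".
                        if (estimate >= 0) {
                            return OptionalLong.of(estimate);
                        }
                    }
                    return OptionalLong.empty();
                }
            } catch (SQLException e) {
                // No estimate available; the caller falls back to a full scan.
                return OptionalLong.empty();
            }
        }
    }

    /** Prefers the cheap estimate; otherwise runs the SELECT COUNT(1)-style fallback. */
    static long rowCount(
            RowCountEstimator estimator,
            String tableName,
            ToLongFunction<String> fullScanCount) {
        OptionalLong estimate = estimator.estimateRowCount(tableName);
        return estimate.isPresent()
                ? estimate.getAsLong()
                : fullScanCount.applyAsLong(tableName);
    }
}
```

The point is only that ANALYZE TABLE could ask the connector first and fall back to the SELECT-based approach when no cheaper source of statistics exists.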


Best
Ingo


On 10.06.22 09:53, godfrey he wrote:
Hi all,

I would like to open a discussion on FLIP-240: Introduce "ANALYZE
TABLE" Syntax.

As FLIP-231 mentioned, statistics are one of the most important inputs
to the optimizer. Accurate and complete statistics allow the
optimizer to be more powerful. "ANALYZE TABLE" syntax is a common
and effective approach to gathering statistics, and it is already
supported by many compute engines and databases.

The main purpose of this discussion is to introduce "ANALYZE TABLE" syntax
for Flink SQL.

You can find more details in the FLIP-240 document [1]. Looking forward to
your feedback.

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=217386481
[2] POC: https://github.com/godfreyhe/flink/tree/FLIP-240


Best,
Godfrey
