Unsubscribe

At 2022-06-13 22:44:24, "cao zou" <zoucao...@gmail.com> wrote:
>Hi godfrey, thanks for your detailed explanation.
>After your explanation and a glance over FLIP-231, I think it is
>really needed. +1 for this, and looking forward to it.
>
>best
>zoucao
>
>godfrey he <godfre...@gmail.com> wrote on Mon, Jun 13, 2022 at 14:43:
>
>> Hi Ingo,
>>
>> The semantics do not distinguish between batch and streaming.
>> It works for both, but the result for
>> unbounded sources is meaningless.
>> Currently, I throw an exception in streaming mode,
>> and we can support streaming mode with bounded sources
>> in the future.
>>
>> Best,
>> Godfrey
>>
>> Ingo Bürk <airbla...@apache.org> wrote on Mon, Jun 13, 2022 at 14:17:
>> >
>> > Hi Godfrey,
>> >
>> > thank you for the explanation. A SELECT is definitely more generic and
>> > will work for all connectors automatically. As such I think it's a good
>> > baseline solution regardless.
>> >
>> > We can also think about allowing connector-specific optimizations in the
>> > future, but I do like your idea of letting the optimizer rules perform a
>> > lot of the work here already by leveraging existing optimizations.
>> > Similarly, things like non-null counts of non-nullable columns would (or
>> > at least could) be handled by the optimizer rules already.
>> >
>> > So as far as that point goes, +1 to the generic approach.
>> >
>> > One more point, though: In general we should avoid supporting features
>> > only in specific modes as it breaks the unification promise. Given that
>> > ANALYZE is a manual and completely optional operation I'm OK with doing
>> > that here in principle. However, I wonder what will happen in the
>> > streaming / unbounded case. Do you plan to throw an error? Or do we
>> > complete the command as successful but without doing anything?
>> >
>> >
>> > Best
>> > Ingo
>> >
>> > On 13.06.22 05:50, godfrey he wrote:
>> > > Hi Ingo,
>> > >
>> > > Thanks for the inputs.
>> > >
>> > > I think converting `ANALYZE TABLE` to a `SELECT` statement is the
>> > > more generic approach. Because query plan optimization is more generic,
>> > > we can provide more optimization rules that optimize not only the `SELECT`
>> > > statement converted from `ANALYZE TABLE` but also `SELECT` statements
>> > > written by users.
>> > >
>> > >> JDBC connector can get a row count estimate without performing a
>> > >> SELECT COUNT(1)
>> > > To optimize such cases, we can implement a rule to push the aggregate into
>> > > the table source.
>> > > Currently there is a similar rule, SupportsAggregatePushDown, which
>> > > so far supports pushing only the local aggregate into the source.
>> > >
>> > >
>> > > Best,
>> > > Godfrey
>> > >
>> > > Ingo Bürk <airbla...@apache.org> wrote on Fri, Jun 10, 2022 at 17:15:
>> > >>
>> > >> Hi Godfrey,
>> > >>
>> > >> compared to the solution proposed in the FLIP (using a SELECT
>> > >> statement), I wonder if you have considered adding APIs to catalogs /
>> > >> connectors to perform this task as an alternative?
>> > >> I could imagine that for many connectors, statistics could be
>> > >> implemented in a less expensive way by leveraging the underlying system
>> > >> (e.g. a JDBC connector can get a row count estimate without performing a
>> > >> SELECT COUNT(1)).
>> > >>
>> > >>
>> > >> Best
>> > >> Ingo
>> > >>
>> > >>
>> > >> On 10.06.22 09:53, godfrey he wrote:
>> > >>> Hi all,
>> > >>>
>> > >>> I would like to open a discussion on FLIP-240: Introduce "ANALYZE
>> > >>> TABLE" Syntax.
>> > >>>
>> > >>> As FLIP-231 mentioned, statistics are one of the most important inputs
>> > >>> to the optimizer. Accurate and complete statistics allow the
>> > >>> optimizer to be more powerful. "ANALYZE TABLE" syntax is a very common
>> > >>> and effective approach to gathering statistics, and it has already been
>> > >>> introduced by many compute engines and databases.
>> > >>>
>> > >>> The main purpose of this discussion is to introduce "ANALYZE TABLE" syntax
>> > >>> for Flink SQL.
>> > >>>
>> > >>> You can find more details in FLIP-240 document[1]. Looking forward to
>> > >>> your feedback.
>> > >>>
>> > >>> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=217386481
>> > >>> [2] POC: https://github.com/godfreyhe/flink/tree/FLIP-240
>> > >>>
>> > >>>
>> > >>> Best,
>> > >>> Godfrey
>>
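
For readers following the thread: the idea discussed above, translating `ANALYZE TABLE` into an ordinary `SELECT` that the optimizer can then rewrite, can be illustrated with a small sketch. This is not the statement FLIP-240 or the POC actually generates. The function name `analyze_to_select`, the chosen statistics, and the output column aliases are all assumptions made up for illustration.

```python
def analyze_to_select(table, columns):
    """Build a SELECT that gathers basic statistics for `table`.

    Hypothetical helper: the statistics collected here (row count,
    distinct count, null count, min, max) and the alias names are
    illustrative assumptions, not what Flink generates.
    """
    # Table-level statistic: total row count.
    parts = ["COUNT(1) AS row_count"]
    for col in columns:
        # Column-level statistics, one group of aggregates per column.
        parts.append(f"COUNT(DISTINCT `{col}`) AS `{col}_ndv`")
        parts.append(f"COUNT(1) - COUNT(`{col}`) AS `{col}_null_count`")
        parts.append(f"MIN(`{col}`) AS `{col}_min`")
        parts.append(f"MAX(`{col}`) AS `{col}_max`")
    return "SELECT " + ", ".join(parts) + f" FROM {table}"


if __name__ == "__main__":
    print(analyze_to_select("orders", ["amount"]))
```

Because the result is a plain aggregate query, any rule that already applies to user-written `SELECT` statements (for example, pushing an aggregate into the source) would apply to it as well, which is the argument made in the thread for the generic approach.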
