Hi Zhanghao,

Certainly, I think we should keep this FLIP focused on setting the source
parallelism via DDL properties. I just want to clarify that, in my experience,
setting parallelism for individual operators is also valuable, and that point
is underplayed in your FLIP.

@ConradJam BTW, compared with a SQL hint, I think a `scan.parallelism` option
is better because it aligns with the existing `sink.parallelism`. And once we
introduce such an option, we can also use the dynamic table options SQL hint
[1] to configure the source parallelism.

[1] 
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/hints/#dynamic-table-options
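
For illustration only, here is a minimal Python sketch of how the resolution
could work. It is not Flink code. It assumes the option is named
`scan.parallelism` as proposed, that options given through a dynamic table
options hint override those in the DDL, and that a source-specific setting
overrides the global default parallelism. The function name and signature are
hypothetical.

```python
from typing import Mapping

# Option name as proposed in this thread (an assumption, not a released API).
SCAN_PARALLELISM = "scan.parallelism"


def resolve_source_parallelism(
    ddl_options: Mapping[str, str],
    hint_options: Mapping[str, str],
    default_parallelism: int,
) -> int:
    """Return the parallelism a source would get under the assumed rules."""
    # Options from a dynamic table options hint override the DDL options.
    merged = {**ddl_options, **hint_options}
    value = merged.get(SCAN_PARALLELISM)
    if value is None:
        # No source-specific setting: fall back to the global default.
        return default_parallelism
    parallelism = int(value)
    if parallelism <= 0:
        raise ValueError(f"{SCAN_PARALLELISM} must be positive, got {value}")
    return parallelism
```

With a global default of 8, a DDL that sets `scan.parallelism` to 4 would
resolve to 4, and a hint that sets it to 2 on top of that DDL would resolve
to 2.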


Best
Yun Tang
________________________________
From: ConradJam <jam.gz...@gmail.com>
Sent: Friday, September 15, 2023 22:52
To: dev@flink.apache.org <dev@flink.apache.org>
Subject: Re: [DISCUSS] FLIP-367: Support Setting Parallelism for Table/SQL 
Sources

+1. Thanks for the FLIP and the discussion. I would like to ask whether we
should use SQL hint syntax to set this parallelism.

On Fri, Sep 15, 2023 at 20:52, Martijn Visser <martijnvis...@apache.org> wrote:

> Hi everyone,
>
> Thanks for the FLIP and the discussion. I find it exciting. Thanks for
> pushing for this.
>
> Best regards,
>
> Martijn
>
> On Fri, Sep 15, 2023 at 2:25 PM Chen Zhanghao <zhanghao.c...@outlook.com>
> wrote:
>
> > Hi Jane,
> >
> > Thanks for the valuable suggestions.
> >
> > For Q1, it's indeed an issue. Some possible ideas include introducing a
> > fake transformation after the source that takes the global default
> > parallelism, or simply making exec nodes take the global default
> > parallelism, but both approaches prevent potential chaining
> > opportunities, and I'm not sure that is acceptable. We'll need to give
> > it deeper thought and polish our proposal. We're also more than glad to
> > hear your input on it.
> >
> > For Q2, scan.parallelism will take precedence, as the more specific
> > config should take higher precedence.
> >
> > Best,
> > Zhanghao Chen
> > ________________________________
> > From: Jane Chan <qingyue....@gmail.com>
> > Sent: September 15, 2023 11:56
> > To: dev@flink.apache.org <dev@flink.apache.org>
> > Cc: dewe...@outlook.com <dewe...@outlook.com>
> > Subject: Re: [DISCUSS] FLIP-367: Support Setting Parallelism for Table/SQL
> > Sources
> >
> > Hi, Zhanghao, Dewei,
> >
> > Thanks for initiating this discussion. This feature is valuable in
> > providing more flexibility for performance tuning for SQL pipelines.
> >
> > Here are my two cents,
> >
> > 1. In the FLIP, you mentioned concerns about the parallelism of the calc
> > node and concluded to "leave the behavior unchanged for now." This means
> > that the calc node will use the parallelism of the source operator,
> > regardless of whether the source parallelism is configured or not. If I
> > understand correctly, currently, except for the sink exec node (which has
> > the ability to configure its own parallelism), the rest of the exec nodes
> > accept their input parallelism. In the design, I didn't see details on
> > how the rest of the exec nodes will cope with input and default
> > parallelism. Could you elaborate on this?
> >
> > 2. Does the configuration `table.exec.resource.default-parallelism` take
> > precedence over `scan.parallelism`?
> >
> > Best,
> > Jane
> >
> > On Fri, Sep 15, 2023 at 10:43 AM Yun Tang <myas...@live.com> wrote:
> >
> > > Thanks for creating this FLIP,
> > >
> > > Many users want to configure the source parallelism via DDL, just as
> > > they configure the sink parallelism. Looking forward to this feature.
> > >
> > > BTW, I think setting parallelism for each operator would also be
> > > valuable. That should work with the compiled plan [1] instead of SQL
> > > DDL.
> > >
> > >
> > > [1]
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-292%3A+Enhance+COMPILED+PLAN+to+support+operator-level+state+TTL+configuration
> > >
> > > Best
> > > Yun Tang
> > > ________________________________
> > > From: Benchao Li <libenc...@apache.org>
> > > Sent: Thursday, September 14, 2023 19:53
> > > To: dev@flink.apache.org <dev@flink.apache.org>
> > > Cc: dewe...@outlook.com <dewe...@outlook.com>
> > > Subject: Re: [DISCUSS] FLIP-367: Support Setting Parallelism for
> > > Table/SQL Sources
> > >
> > > Thanks Zhanghao, Dewei for preparing the FLIP,
> > >
> > > I think this is a long-awaited feature, and I appreciate your effort,
> > > especially the "Other concerns" part you listed.
> > >
> > > Regarding the parallelism of transformations following the source
> > > transformation, it's indeed a problem that we initially wanted to solve
> > > when we introduced this feature internally. I'd like to hear more
> > > opinions on this. Personally, I'm OK with leaving it out of this FLIP
> > > for the time being.
> > >
> > > On Thu, Sep 14, 2023 at 14:46, Chen Zhanghao <zhanghao.c...@outlook.com> wrote:
> > > >
> > > > Hi Devs,
> > > >
> > > > Dewei (cced) and I would like to start a discussion on FLIP-367:
> > > > Support Setting Parallelism for Table/SQL Sources [1].
> > > >
> > > > Currently, Flink Table/SQL jobs do not expose fine-grained control of
> > > > operator parallelism to users. FLIP-146 [2] added support for setting
> > > > parallelism for sinks, but apart from that, one can only set a global
> > > > default parallelism, which all other operators share. However, in
> > > > many cases, setting parallelism for sources individually is
> > > > preferable:
> > > >
> > > > - Many connectors have an upper bound on the parallelism at which
> > > >   they can ingest data efficiently. For example, the parallelism of a
> > > >   Kafka source is bounded by the number of partitions; any extra
> > > >   tasks would be idle.
> > > > - Other operators may involve intensive computation and need a larger
> > > >   parallelism.
> > > >
> > > > We propose to improve the current situation by extending the current
> > > > table source API to support setting parallelism for Table/SQL sources
> > > > via connector options.
> > > >
> > > > Looking forward to your feedback.
> > > >
> > > > [1] FLIP-367: Support Setting Parallelism for Table/SQL Sources
> > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263429150
> > > >
> > > > [2] FLIP-146: Improve new TableSource and TableSink interfaces
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-146%3A+Improve+new+TableSource+and+TableSink+interfaces
> > > >
> > > > Best,
> > > > Zhanghao Chen
> > >
> > >
> > >
> > > --
> > >
> > > Best,
> > > Benchao Li
> > >
> >
>


--
Best

ConradJam
