Hi, Xia

Thanks for driving this FLIP.

This proposal looks good to me overall. However, I have the following minor
questions:

1. FLIP introduced `table.exec.hive.infer-source-parallelism.mode` as a new
parameter, and the value is the enum class `InferMode`, I think the
InferMode class should also be introduced in the Public Interfaces section!
2. You mentioned in FLIP that the default value of
`table.exec.hive.infer-source-parallelism.max` is 1024, I checked through
the code that the default value is 1000?
3. I also agree with Muhammet's idea that there is no need to introduce the
option `table.exec.hive.infer-source-parallelism.enabled`, and that
expanding the InferMode values will fulfill the need. There is another
issue to consider here though, how are
`table.exec.hive.infer-source-parallelism` and
`table.exec.hive.infer-source-parallelism.mode` compatible?
4. In FLIP-367 it is supported to be able to set the Source's parallelism
individually, if in the future HiveSource also supports this feature,
however, the default value of
`table.exec.hive.infer-source-parallelism.mode` is `InferMode. DYNAMIC`, at
this point will the parallelism be dynamically derived or will the manually
set parallelism take effect, and who has the higher priority?

Best,
Ron

Xia Sun <xingbe...@gmail.com> 于2024年4月17日周三 12:08写道:

> Hi Jeyhun, Muhammet,
> Thanks for all the feedback!
>
> > Could you please mention the default values for the new configurations
> >     (e.g., table.exec.hive.infer-source-parallelism.mode,
> >     table.exec.hive.infer-source-parallelism.enabled,
> >     etc) ?
>
>
> Thanks for your suggestion. I have supplemented the explanation regarding
> the default values.
>
> > Since we are introducing the mode as a configuration option,
> >     could it make sense to have `InferMode.NONE` option also?
> >     The `NONE` option would disable the inference.
>
>
> This is a good idea. Looking ahead, it could eliminate the need for
> introducing
> a new configuration option. I haven't identified any potential
> compatibility issues
> as yet. If there are no further ideas from others, I'll go ahead and update
> the FLIP to
> introducing InferMode.NONE.
>
> Best,
> Xia
>
> Muhammet Orazov <mor+fl...@morazow.com.invalid> 于2024年4月17日周三 10:31写道:
>
> > Hello Xia,
> >
> > Thanks for the FLIP!
> >
> > Since we are introducing the mode as a configuration option,
> > could it make sense to have `InferMode.NONE` option also?
> > The `NONE` option would disable the inference.
> >
> > This way we deprecate the `table.exec.hive.infer-source-parallelism`
> > and no additional `table.exec.hive.infer-source-parallelism.enabled`
> > option is required.
> >
> > What do you think?
> >
> > Best,
> > Muhammet
> >
> > On 2024-04-16 07:07, Xia Sun wrote:
> > > Hi everyone,
> > > I would like to start a discussion on FLIP-445: Support dynamic
> > > parallelism
> > > inference for HiveSource[1].
> > >
> > > FLIP-379[2] has introduced dynamic source parallelism inference for
> > > batch
> > > jobs, which can utilize runtime information to more accurately decide
> > > the
> > > source parallelism. As a follow-up task, we plan to implement the
> > > dynamic
> > > parallelism inference interface for HiveSource, and also switch the
> > > default
> > > static parallelism inference to dynamic parallelism inference.
> > >
> > > Looking forward to your feedback and suggestions, thanks.
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-445%3A+Support+dynamic+parallelism+inference+for+HiveSource
> > > [2]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-379%3A+Dynamic+source+parallelism+inference+for+batch+jobs
> > >
> > > Best regards,
> > > Xia
> >
>

Reply via email to