Hi Venkat,
Thanks for joining the discussion.
Based on our understanding, there are still a significant number of
existing tasks using Hive. Indeed, many companies are now migrating their
data to the lakehouse, but due to historical reasons, a substantial amount
of data still resides in Hive.
Bes
Hi Xia,
+1 on introducing dynamic parallelism inference for HiveSource.
Orthogonal to this discussion, I am curious how commonly HiveSource is used
these days in the industry, given the popularity of table formats/sources
like Iceberg, Hudi and Delta Lake?
Thanks
Venkat
On Wed, Apr 24, 2024, 7:41 PM
Hi everyone,
Thanks for all the feedback!
If there are no more comments, I would like to start the vote thread,
thanks again!
Best,
Xia
On Thu, Apr 18, 2024 at 21:31, Ahmed Hamdy wrote:
> Hi Xia,
> I have read through the FLIP and discussion and the new version of the FLIP
> looks better.
> +1 for the propo
Hi Xia,
I have read through the FLIP and discussion and the new version of the FLIP
looks better.
+1 for the proposal.
Best Regards
Ahmed Hamdy
On Thu, 18 Apr 2024 at 12:21, Ron Liu wrote:
> Hi, Xia
>
> Thanks for updating, looks good to me.
>
> Best,
> Ron
>
> On Thu, Apr 18, 2024 at 19:11, Xia Sun wrote:
>
Hi, Xia
Thanks for updating, looks good to me.
Best,
Ron
On Thu, Apr 18, 2024 at 19:11, Xia Sun wrote:
> Hi Ron,
> Yes, presenting it in a table might be more intuitive. I have already added
> the table in the "Public Interfaces | New Config Option" chapter of FLIP.
> PTAL~
>
> On Thu, Apr 18, 2024 at 18:10, Ron Liu wrote
Hi Ron,
Yes, presenting it in a table might be more intuitive. I have already added
the table in the "Public Interfaces | New Config Option" chapter of FLIP.
PTAL~
On Thu, Apr 18, 2024 at 18:10, Ron Liu wrote:
> Hi, Xia
>
> Thanks for your reply.
>
> > That means, in terms
> of priority, `table.exec.hive.infer
Hi, Xia
Thanks for your reply.
> That means, in terms of priority,
> `table.exec.hive.infer-source-parallelism` >
> `table.exec.hive.infer-source-parallelism.mode`.
I still have some confusion, if the
`table.exec.hive.infer-source-parallelism`
>`table.exec.hive.infer-source-parallelism.mode`, curren
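A minimal sketch of the priority rule quoted above, assuming the legacy boolean option wins over the new mode option whenever it is set to false. The two option keys and `InferMode.NONE` come from the thread; the `STATIC` and `DYNAMIC` member names, both default values, and the helper `effective_infer_mode` are hypothetical and only serve the illustration. This is plain Python over a dict, not the connector's actual code.

```python
from enum import Enum

# Option keys as named in the thread.
LEGACY_KEY = "table.exec.hive.infer-source-parallelism"
MODE_KEY = "table.exec.hive.infer-source-parallelism.mode"


class InferMode(Enum):
    # NONE is named in the thread; STATIC and DYNAMIC are assumed names
    # for the static and dynamic inference discussed there.
    STATIC = "static"
    DYNAMIC = "dynamic"
    NONE = "none"


def effective_infer_mode(conf, default_mode=InferMode.DYNAMIC):
    """Resolve the inference mode from a plain dict of option strings.

    Hypothetical helper. Assumptions: the legacy option defaults to
    "true" and the mode defaults to DYNAMIC when it is not set.
    """
    legacy_enabled = str(conf.get(LEGACY_KEY, "true")).lower() == "true"
    if not legacy_enabled:
        # The legacy switch has priority: inference is off whatever the mode says.
        return InferMode.NONE
    raw_mode = conf.get(MODE_KEY)
    if raw_mode is None:
        return default_mode
    return InferMode(str(raw_mode).lower())
```

Under these assumptions, a configuration with the legacy option set to `false` and the mode set to `dynamic` resolves to `InferMode.NONE`, which is the case the question about priority is concerned with.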
Hi Ron and Lijie,
Thanks for joining the discussion and sharing your suggestions.
> the InferMode class should also be introduced in the Public Interfaces
> section!
Thanks for the reminder, I have now added the InferMode class to the Public
Interfaces section as well.
> `table.exec.hive.infer-
Thanks for driving the discussion.
+1 for the proposal and +1 for the `InferMode.NONE` option.
Best,
Lijie
On Thu, Apr 18, 2024 at 11:36, Ron Liu wrote:
> Hi, Xia
>
> Thanks for driving this FLIP.
>
> This proposal looks good to me overall. However, I have the following minor
> questions:
>
> 1. FLIP introdu
Hi, Xia
Thanks for driving this FLIP.
This proposal looks good to me overall. However, I have the following minor
questions:
1. The FLIP introduces `table.exec.hive.infer-source-parallelism.mode` as a
new parameter whose value is the enum class `InferMode`. I think the
InferMode class should also
Hi Jeyhun, Muhammet,
Thanks for all the feedback!
> Could you please mention the default values for the new configurations
> (e.g., table.exec.hive.infer-source-parallelism.mode,
> table.exec.hive.infer-source-parallelism.enabled,
> etc) ?
Thanks for your suggestion. I have supplemen
Hello Xia,
Thanks for the FLIP!
Since we are introducing the mode as a configuration option,
would it make sense to also have an `InferMode.NONE` option?
The `NONE` option would disable the inference.
This way we deprecate the `table.exec.hive.infer-source-parallelism`
and no additional `table.exe
Hi Xia,
Thanks for driving this FLIP. +1 from my side.
I have one comment.
Could you please mention the default values for the new configurations
(e.g., table.exec.hive.infer-source-parallelism.mode,
table.exec.hive.infer-source-parallelism.enabled,
etc) ?
Regards,
Jeyhun
On Tue, Apr 16, 2024 a
Thanks for creating this FLIP. @Xia
+1 for this proposal. Dynamic parallelism inference can help
decide a better parallelism, and it's good to unify the settings
of static & dynamic parallelism inference.
Thanks,
Zhu
On Tue, Apr 16, 2024 at 15:12, Xia Sun wrote:
> Hi everyone,
> I would like to sta
Hi everyone,
I would like to start a discussion on FLIP-445: Support dynamic parallelism
inference for HiveSource[1].
FLIP-379[2] has introduced dynamic source parallelism inference for batch
jobs, which can utilize runtime information to more accurately decide the
source parallelism. As a follow-