Hi Junrui,

The FLIP proposal looks good to me.

I have the same question as Fabian:

> For join strategies, they are only
> applicable when using an optimizer (that's currently not part of Flink's
> runtime) with the Table API or Flink SQL. How do we plan to connect the
> optimizer with Flink's runtime?

For batch scenarios, if we want to better support dynamic plan tuning
strategies, the fundamental solution is still to move the SQL optimizer
into flink-runtime.
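
To make the point concrete, the kind of hook a runtime-resident optimizer
would need could look roughly like the following (purely hypothetical names,
nothing like this exists in Flink today):

    import org.apache.flink.streaming.api.graph.StreamGraph;

    /**
     * Hypothetical runtime-side re-optimization hook (illustration only).
     * A SQL optimizer living in flink-runtime could implement this to
     * rewrite the plan, e.g. switching the join strategy once the actual
     * input sizes are known.
     */
    public interface RuntimePlanOptimizer {

        /** Returns a possibly rewritten StreamGraph based on runtime statistics. */
        StreamGraph reoptimize(StreamGraph currentGraph, Statistics statistics);

        /** Hypothetical container for statistics gathered while the job runs. */
        interface Statistics {
            long estimatedRecordCount(String intermediateResultId);
        }
    }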

Best,
Ron

David Morávek <d...@apache.org> wrote on Thu, Jul 11, 2024, at 19:17:

> Hi Junrui,
>
> Thank you for drafting the FLIP. I really appreciate the direction it’s
> taking. We’ve discussed similar approaches multiple times, and it’s great
> to see this progress.
>
> I have a few questions and thoughts:
>
>
> *1. Transformations in StreamGraphGenerator:*
> Should we consider taking this a step further by working on a list of
> transformations (inputs of StreamGraphGenerator)?
>
>     public StreamGraphGenerator(
>             List<Transformation<?>> transformations,
>             ExecutionConfig executionConfig,
>             CheckpointConfig checkpointConfig,
>             ReadableConfig configuration) {
>
> We could potentially merge ExecutionConfig and CheckpointConfig into
> ReadableConfig. This approach might offer us even more flexibility.
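>
> Just to illustrate the idea, a hypothetical signature (not the current API;
> the config handling shown is an assumption) could be:
>
>     public StreamGraphGenerator(
>             List<Transformation<?>> transformations,
>             ReadableConfig configuration) {
>         this.transformations = checkNotNull(transformations);
>         this.configuration = checkNotNull(configuration);
>         // ExecutionConfig / CheckpointConfig would be derived from the
>         // unified ReadableConfig here rather than passed in separately.
>     }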
>
>
> *2. StreamGraph for Recovery Purposes:*
> Should we avoid using StreamGraph for recovery purposes? The existing
> JG-based recovery code paths took years to perfect, and it doesn’t seem
> necessary to replace them. We only need SG for cases where we want to
> regenerate the JG.
> Additionally, translating SG into JG before persisting it in HA could be
> beneficial, as it allows us to catch potential issues early on.
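>
> As a rough sketch (the store interaction is an assumption about how the HA
> path could be wired, not something the FLIP prescribes), the submission side
> could translate eagerly before persisting:
>
>     // Translate the StreamGraph right away so that plan problems surface
>     // at submission time rather than during a later recovery.
>     JobGraph jobGraph = StreamingJobGraphGenerator.createJobGraph(streamGraph);
>
>     // Persist only the JobGraph in HA; keep the StreamGraph around solely
>     // for the cases where the JG has to be regenerated.
>     jobGraphStore.putJobGraph(jobGraph);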
>
>
> *3. Moving Away from Java Serializables:*
> It would be great to start moving away from Java Serializables as much as
> possible. Could we instead define proper versioned serializers, possibly
> based on a well-defined protobuf blueprint? This change could help us avoid
> ongoing issues associated with Serializables.
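>
> For illustration, a minimal versioned-serializer contract (hypothetical
> naming, similar in spirit to Flink's existing SimpleVersionedSerializer;
> the protobuf wiring is left out) might look like:
>
>     /** Hypothetical versioned serializer for graph artifacts. */
>     public interface VersionedGraphSerializer<T> {
>
>         /** Version written into the header of every serialized blob. */
>         int getVersion();
>
>         /** Encodes the value, e.g. via a well-defined protobuf message. */
>         byte[] serialize(T value) throws IOException;
>
>         /** Dispatches to the reader matching the version found in the blob. */
>         T deserialize(int version, byte[] serialized) throws IOException;
>     }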
>
> Looking forward to your thoughts.
>
> Best,
> D.
>
> On Thu, Jul 11, 2024 at 12:58 PM Fabian Paul <fp...@apache.org> wrote:
>
> > Thanks for drafting this FLIP. I really like the idea of introducing a
> > concept in Flink that is close to a logical plan submission.
> >
> > I have a few questions about the proposal and its future evolvability.
> >
> > - What is the future plan for job submissions in Flink? With the current
> > proposal, will Flink support JobGraph, StreamGraph, and compiled plan
> > submissions? That might be confusing for users and could significantly
> > complicate the existing job submission logic.
> > - The FLIP mentions multiple areas of optimization: first, operator
> > chaining, and second, dynamic switches between join strategies. I think
> > from a Flink perspective, these are, at the moment, separate concerns.
> > For operator chaining, I agree with the current proposal, which is a
> > concept that applies generally to Flink's runtime. For join strategies,
> > they are only applicable when using an optimizer (that's currently not
> > part of Flink's runtime) with the Table API or Flink SQL. How do we plan
> > to connect the optimizer with Flink's runtime?
> > - With the Table/SQL API we already expose a compiled plan to support
> > stable version upgrades. It would be great to explore a joint plan to
> > also offer stable version upgrades with a potentially persisted
> > StreamGraph.
> >
> > Best,
> > Fabian
> >
>
