Re: [DISCUSS] FLIP-325: Support configuring end-to-end allowed latency

Dong Lin Thu, 06 Jul 2023 18:12:48 -0700

Hi Jing,

Thanks for the comments. Please see my reply inline.


On Fri, Jul 7, 2023 at 5:40 AM Jing Ge <[email protected]> wrote:

> Hi,
>
> Thank you all for the inspired discussion. Really appreciate it!
>
> @Dong I'd like to ask some (stupid) questions to make sure I understand
> your thoughts correctly.
>
> 1. It will make no sense to send the same type of RecordAttributes right?
> e.g.  if one RecordAttributes(isBacklog=true) has been sent, a new
> RecordAttributes will be only sent when isBacklog is changed to be false,
> and vice versa. In this way, the number of RecordAttributes will be very
> limited.
>

Yes, you are right. Actually, this is what we plan to do when we update
operators to emit RecordAttributes via `Output#emitRecordAttributes()`.

Note that the FLIP does not specify the frequency of how operators should
invoke `Output#emitRecordAttributes()`. It is up to the operator
to decide when to emit RecordAttributes.


> 2. Since source readers can invoke Output#emitRecordAttributes to emit
> RecordAttributes(isBacklog=true/false), it might be weird to send
> RecordAttributes with different isBacklog back and forth too often. Devs
> and users should pay attention to it. Something is wrong when such a thing
> happens(metrics for monitoring?). Is this correct?
>

Actually, I think it could make sense to toggle isBacklog between true and
false while the job is running.

Suppose the job is reading from user-action data from Kafka and there is a
traffic spike for 2 hours. If the job keeps running in pure stream mode,
the watermark lag might keep increasing during this period because the
job's processing capability can not catch up with the Kafka input
throughput. In this case, it can be beneficial to dynamically switch
isBacklog to true when watermarkLag exceeds a given threshold (e.g. 5
minutes), and switch isBacklog to false again when the watermarkLag is low
enough (30 seconds).


> 3. Is there any relationship between end-to-end-latency and checkpoint
> interval that users should pay attention to? In the example described in
> the FLIP, both have the same value, 2 min. What about end-to-end-latency is
> configured bigger than checkpoint interval? Could checkpoint between
> end-to-end-latency be skipped?
>

This FLIP would not enforce any relationship between end-to-end latency and
checkpoint interval. Users are free to configure end-to-end latency to be
bigger than checkpoint interval.

I don't think there exists any use-case which requires end-to-end latency
to be higher than the checkpoint interval. Note that introducing a
relationship between these two configs would increase code complexity and
also make the documentation of these configs a bit more complex for users
to understand.

Since there is no correctness when a user sets end-to-end latency to be
bigger than the checkpointing interval, I think it is simpler to just let
the user decide how to configure them.


> 4. Afaiu, one major discussion point is that isBacklog can be derived from
> back pressure and there will be no need of RecordAttributes. In case a
> Flink job has rich resources that there is no back pressure (it will be
> difficult to perfectly have just enough resources that everything is fine
> but will have back pressure only for backlog) but we want to improve the
> throughput. We then need some other ways to derive isBacklog. That is the
> reason why RecordAttributes has been introduced. Did I understand it
> correctly?
>

I think there can be multiple ways to derive isBackog, including:
1) Based on the source operator's state. For example, when MySQL CDC source
is reading snapshot, it can claim isBacklog=true.
2) Based on the watermarkLag in the source. For example, when system_time -
watermark > user_specified_threshold, then isBacklog=true.
3) Based on metrics. For example, when busyTimeMsPerSecond (or
backPressuredTimeMsPerSecond) > user_specified_threshold, then
isBacklog=true.

Note that there are pros/cons between these choices and none of them can
best fit all use-cases. For example, since option-1 does not require any
extra user-specified threshold, it can be the best choice when we want to
improve user's existing job (e.g. the one in the motivation section)
without extra user configuration.

For use-cases which want to increase throughput when reading backlog data
from Kafka, option-2 can be the best choice because a threshold based on
the watermark lag is easier to understand and configure than configuring
threshold based on the backPressuredTimeMsPerSecond.

option-3 might be the only choice when option-1 and option-2 are not
available for the given use-cases. But it might be harder to configure a
threshold against backPressuredTimeMsPerSecond. This is because the choice
of the percentage (or ms per second) threshold will mostly be empirical and
approximate. For example, should the user configure this to be 100%, 99%,
or 90%? I would prefer not to have user worry about this if option-1 or
option-2 can be used.

RecordAttributes would be necessary in order to support option-1 and
option-2 well.


> 5. NIT: Just like we talked about in another thread, JavaBean naming
> convention is recommended, i.e. isBacklog() & setBacklog() instead of
> getIsBacklog() and setIsBacklog().
>

Yeah, thanks for the suggestion. I have updated the FLIP as suggested.

Best,
Dong


>
> Best regards,
> Jing
>
> On Thu, Jul 6, 2023 at 2:38 PM Dong Lin <[email protected]> wrote:
>
> > Hi Shammon,
> >
> > Thanks for your comments. Please see my reply inline.
> >
> >
> > On Thu, Jul 6, 2023 at 12:47 PM Shammon FY <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > Thanks for your replay @Dong. I really agree with Piotr's points and I
> > > would like to share some thoughts from my side.
> > >
> > > About the latency for mini-batch mechanism in Flink SQL, I still think
> > the
> > > description in the FLIP is not right. If there are N operators and the
> > > whole process time for data in the job is `t`, then the latency in
> > > mini-batch will be `table.exec.mini-batch.allow-latency`+`t`, not `
> > > table.exec.mini-batch.allow-latency`*N. I think this is one of the
> > > foundations of this FLIP, and you may need to confirm it again.
> > >
> >
> > Given that we agree to have a mechanism to support end-to-end latency for
> > DataStream programs, I think the exact semantics of
> > table.exec.mini-batch.allow-latency will not affect the motivation or API
> > design of this FLIP. I have updated the FLIP to remove any mention of
> > table.exec.mini-batch.allow-latency.
> >
> >
> > >
> > > I think supporting similar mechanisms in the runtime and balance
> latency
> > > and throughput dynamically for all flink jobs is a very good idea, and
> I
> > > have some questions for that.
> > >
> > > 1. We encounter a situation where the workload is high when processing
> > > snapshot data and we need mini-batch in sql for performance reason. But
> > the
> > > workload is low when processing delta data, we need to automatically
> > adjust
> > > the mini-batch SQL for them, or even cancel the mini-batch during delta
> > > processing. I think this FLIP meets our needs, but I think we need a
> > > general solution which covers all source types in flink, and the
> > > `isBacklog` in the FLIP is only one strategy.
> > >
> >
> > The focus of this FLIP is to allow Flink runtime to adjust the behavior
> of
> > operators (e.g. the buffer time) based on the IsBacklog status of sources
> > and the user-specified execution.end-to-end-latency (effective only when
> > there is no backlog). The FLIP assumes there is already a strategy for
> > sources to determine the IsProcessingBacklog status without adding more
> > strategies.
> >
> > I agree it is useful to introduce more strategies to determine the the
> > IsProcessingBacklog status for sources. We can determine the
> > IsProcessingBacklog status based on the backpressure metrics, the
> > event-time watermark lag, or anything we find reasonable. I would like to
> > work on this in follow-up FLIPs and that we don't work on too many things
> > in the same FLIP.
> >
> > Would this be OK with you?
> >
> >
> > > From the FLIP I think there should be two parts: dynamic trigger flush
> > > event in JM and dynamic trigger flush operations in Operator. We need
> to
> > > introduce much more general interfaces for them, such as
> > > `DynamicFlushStrategy` in JM and `DynamicFlushOperation` in TM? As
> Piotr
> > > mentioned above, we can collect many information from TM locally such
> as
> > > backpressure, queue size and `Operator` can decide whether to buffer
> data
> > > or process it immediately.  JM is also the same, it can decide to send
> > > flush events on a regular basis or send them based on the collected
> > metrics
> > > information and other information, such as the isBacklog in the FLIP.
> > >
> > > 2. I really don't get enough benefits for `RecordAttribute` in the FLIP
> > and
> > > as Piotr mentioned above too, it will generate a large number of
> > messages,
> > >
> >
> > If there is any sentence in the FLIP that suggests we will emit a lot of
> > RecordAttribute, sorry for that and I would fix it.
> >
> > Currently, the FLIP provides the `Output#emitRecordAttributes()` for
> > operators (e.g. source reader) to emit RecordAttributes. The FLIP leaves
> > the operator to decide the frequency and value of the emitted
> > RecordAttributes.
> >
> > Our plan is to let SourceReader emit RecordAttributes only when its value
> > (e.g. isBacklog) differs from the value of the RecordAttributes it has
> > emitted earlier. It should avoid resending RecordAttributes with the same
> > value, similar to how Flink currently avoids resending
> > Watermark/WatermarkStatus with the same value.
> >
> > Would it address your concern?
> >
> >
> > > affecting performance. FLIP mentions that it will be applied to
> Operator
> > > and Sink, I try to understand it's role and please correct me if I'm
> > wrong.
> > > a) It tells the Operator and Sink that current most of data they are
> > > processing are from snapshot and are "insert" data? For the out of
> order
> > in
> > > flink, the Operator and Sink may receive "upsert" data from other
> > sources.
> > >
> >
> > The RecordAttributes currently proposed in the FLIP only provides the
> > IsBacklog information, which tells the operator (including sink operator)
> > whether the records received after this RercordAttributes event are
> > "backlog". Note that snapshot (e.g. MySQL CDC snapshot) is one particular
> > case which can be classified as backlog. But we might introduce more
> > strategies to classify recods as backlog in the future.
> >
> > Currently, RecordAttributes does not specify whether the following
> records
> > are insert-only or upsert. We might introduce such an atttribute if there
> > is a good use-case for having it.
> >
> >
> > > b) Do Operators and Sink perform any very special operations in the
> above
> > > situations? What are the benefits of this special operations for "most
> > data
> > > are insert"?
> > >
> >
> > Hmm.. I don't think the FLIP says something like "most data are insert".
> > Could you clarify which part of the FLIP you are talking about?
> >
> > I was told that Hudi Sink can have much higher throughput if all its
> inputs
> > are insert-only. One point in the FLIP is that Hudi Sink can take
> advantage
> > of the features proposed in the FLIP to increase its throughput when
> source
> > are reading backlog data (e.g. MySQL CDC snapshot).
> >
> >
> >
> > > c) I think the operator and sink can collect the above information
> > locally
> > > when it receives each record without the need for `RecordAttribute`
> even
> > > when we need some special operations.
> > >
> >
> > Let's say the job is reading from MySQL CDC, we want to process records
> in
> > snapshot phase with high throughput (and likely high latency), and
> process
> > records in the binlog phase with low processing latency.
> >
> > We can achieve this goal by propagating RecordAttributes from source to
> > downstream operators. This allows an operator to know exactly that those
> > records received before a RecordAttributes(isBacklog=false) can be
> > processed with high latency, and those records received after a
> > RecordAttributes(isBacklog=false) should be processed with low latency.
> >
> > Could you help explain how to achieve this goal without
> `RecordAttribute`?
> >
> >
> > > d) Even if we do need a `RecordAttribute` events in Operator and Sink,
> I
> > > think broadcast them from JM is a better choice.
> > >
> >
> > Suppose we broadcast it from JM. When MySQL CDC source switches from the
> > snapshot phase to binlog phase, the RecordAttributes(IsBacklog=false)
> might
> > arrive at an operator while it is still processing a queue of records
> from
> > the snapshot phase, causing the operator to reduce its buffer time (and
> > throughput) earlier than expected. It might not be a big deal when this
> > interval is short. But given that propagating RecordAttributes from
> sources
> > to downstream operators does not have much overhead (no more than the
> > periodic Watermark), it seems useful to make the operator behavior more
> > accurate.
> >
> > What do you think?
> >
> >
> > >
> > > 3. For the flush event, I also have some questions. What type of
> > operators
> > > need to buffer data and flush them based on the flush events? In SQL
> > >
> >
> > As you mentioned above, typically those operators that rely heavily on
> > statebackend (e.g. co-group, aggregate, join) can increase throughput by
> > buffering data. Operators which can not benefit from flush, such as Map,
> do
> > not need to be updated.
> >
> > It is mentioned in the FLIP that we will "update operators in Flink to
> > override processRecordAttributes() if that helps improve job performance
> > without sacrificing correctness (e.g. processing latency)".
> >
> >
> > > mini-batch mechanism, similar processing will be added for the
> aggregate
> > > and join operators, while for operators such as map, it is not
> necessary.
> > > How can we identify different operator in the runtime layer (`Input`
> and
> > > `TwoInputStreamOperator`)? I think buffer data in Map/FlatMap/Filter
> > >
> >
> > I am not sure I understand this question. Are you asking how to
> > automatically identify the operators that can benefit from the flush
> > operation?
> >
> > I think we will need to manually identify and optimize operators on a
> > case-by-case basis. There is no good systematic way to automatically
> > identify and optimize all such operators. An operator's implementation
> > needs to be manually updated and benchmarked before we commit the
> > corresponding code change.
> >
> > What do you think?
> >
> > Best,
> > Dong
> >
> >
> > > Operator is not a good idea which makes data no longer flowing.
> > >
> >
> > >
> > > Best,
> > > Shammon FY
> > >
> > >
> > > On Thu, Jul 6, 2023 at 1:54 AM Piotr Nowojski <
> [email protected]>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Thanks for this proposal, this is a very much needed thing that
> should
> > be
> > > > addressed in Flink.
> > > >
> > > > I think there is one thing that hasn't been discussed neither here
> nor
> > in
> > > > FLIP-309. Given that we have
> > > > three dimensions:
> > > > - e2e latency/checkpointing interval
> > > > - enabling some kind of batching/buffering on the operator level
> > > > - how much resources we want to allocate to the job
> > > >
> > > > How do we want Flink to adjust itself between those three? For
> example:
> > > > a) Should we assume that given Job has a fixed amount of assigned
> > > > resources and make it paramount that
> > > >   Flink doesn't exceed those available resources? So in case of
> > > > backpressure, we
> > > >   should extend checkpointing intervals, emit records less frequently
> > and
> > > > in batches.
> > > > b) Or should we assume that the amount of resources is flexible (up
> to
> > a
> > > > point?), and the desired e2e latency
> > > >   is the paramount aspect? So in case of backpressure, we should
> still
> > > > adhere to the configured e2e latency,
> > > >   and wait for the user or autoscaler to scale up the job?
> > > >
> > > > In case of a), I think the concept of "isProcessingBacklog" is not
> > > needed,
> > > > we could steer the behaviour only
> > > > using the backpressure information.
> > > >
> > > > On the other hand, in case of b), "isProcessingBacklog" information
> > might
> > > > be helpful, to let Flink know that
> > > > we can safely decrease the e2e latency/checkpoint interval even if
> > there
> > > > is no backpressure, to use fewer
> > > > resources (and let the autoscaler scale down the job).
> > > >
> > > > Do we want to have both, or only one of those? Do a) and b)
> complement
> > > one
> > > > another? If job is backpressured,
> > > > we should follow a) and expose to autoscaler/users information "Hey!
> > I'm
> > > > barely keeping up! I need more resources!".
> > > > While, when there is no backpressure and latency doesn't matter
> > > > (isProcessingBacklog=true), we can limit the resource
> > > > usage.
> > > >
> > > > And a couple of more concrete remarks about the current proposal.
> > > >
> > > > 1.
> > > >
> > > > > I think the goal is to allow users to specify an end-to-end latency
> > > > budget for the job.
> > > >
> > > > I fully agree with this, but in that case, why are you proposing to
> add
> > > > `execution.flush.interval`? That's
> > > > yet another parameter that would affect e2e latency, without actually
> > > > defining it. We already have things
> > > > like: execution.checkpointing.interval, execution.buffer-timeout. I'm
> > > > pretty sure very few Flink users would be
> > > > able to configure or understand all of them.
> > > >
> > > > I think we should simplify configuration and try to define
> > > > "execution.end-to-end-latency" so the runtime
> > > > could derive other things from this new configuration.
> > > >
> > > > 2. How do you envision `#flush()` and `#snapshotState()` to be
> > connected?
> > > > So far, `#snapshotState()`
> > > > was considered as a kind of `#flush()` signal. Do we need both?
> > Shouldn't
> > > > `#flush()` be implicitly or
> > > > explicitly attached to the `#snapshotState()` call?
> > > >
> > > > 3. What about unaligned checkpoints if we have separate `#flush()`
> > > > event/signal?
> > > >
> > > > 4. How should this be working in at-least-once mode (especially
> sources
> > > > that are configured to be working
> > > > in at-least-once mode)?.
> > > >
> > > > 5. How is this FLIP connected with FLI-327? I think they are trying
> to
> > > > achieve basically the same thing:
> > > > optimise when data should be flushed/committed to balance between
> > > > throughput and latency.
> > > >
> > > > 6.
> > > >
> > > > > Add RecordAttributesBuilder and RecordAttributes that extends
> > > > StreamElement to provide operator with essential
> > > > > information about the records they receive, such as whether the
> > records
> > > > are already stale due to backlog.
> > > >
> > > > Passing along `RecordAttribute` for every `StreamElement` would be an
> > > > extremely inefficient solution.
> > > >
> > > > If at all, this should be a marker propagated through the JobGraph
> vie
> > > > Events or sent from JM to TMs via an RPC
> > > > that would mark "backlog processing started/ended". Note that Events
> > > might
> > > > be costly, as they need to be
> > > > broadcasted. So with a job having 5 keyBy exchanges and parallelism
> of
> > > > 1000, the number of events sent is
> > > > ~4 000 000, while the number of RPCs would be only 5000.
> > > >
> > > > In case we want to only check for the backpressure, we don't need any
> > > > extra signal. Operators/subtasks can
> > > > get that information very easily from the TMs runtime.
> > > >
> > > > Best,
> > > > Piotrek
> > > >
> > > > czw., 29 cze 2023 o 17:19 Dong Lin <[email protected]> napisał(a):
> > > >
> > > >> Hi Shammon,
> > > >>
> > > >> Thanks for your comments. Please see my reply inline.
> > > >>
> > > >> On Thu, Jun 29, 2023 at 6:01 PM Shammon FY <[email protected]>
> wrote:
> > > >>
> > > >> > Hi Dong and Yunfeng,
> > > >> >
> > > >> > Thanks for bringing up this discussion.
> > > >> >
> > > >> > As described in the FLIP, the differences between `end-to-end
> > latency`
> > > >> and
> > > >> > `table.exec.mini-batch.allow-latency` are: "It allows users to
> > specify
> > > >> the
> > > >> > end-to-end latency, whereas table.exec.mini-batch.allow-latency
> > > applies
> > > >> to
> > > >> > each operator. If there are N operators on the path from source to
> > > sink,
> > > >> > the end-to-end latency could be up to
> > > >> table.exec.mini-batch.allow-latency *
> > > >> > N".
> > > >> >
> > > >> > If I understand correctly, `table.exec.mini-batch.allow-latency`
> is
> > > also
> > > >> > applied to the end-to-end latency for a job, maybe @Jack Wu can
> give
> > > >> more
> > > >> > information.
> > > >> >
> > > >>
> > > >> Based on what I can tell from the doc/code and offline discussion, I
> > > >> believe table.exec.mini-batch.allow-latency is not applied to the
> > > >> end-to-end latency for a job.
> > > >>
> > > >> It is mentioned here
> > > >> <
> > > >>
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/dev/table/config/
> > > >> >
> > > >> that
> > > >> table.exec.mini-batch.allow-latency is "the maximum latency can be
> > used
> > > >> for
> > > >> MiniBatch to buffer input records". I think we should have mentioned
> > > that
> > > >> the config is applied to the end-to-end latency in this doc if it is
> > > >> indeed
> > > >> the case.
> > > >>
> > > >>
> > > >> > So, from my perspective, and please correct me if I'm
> misunderstand,
> > > the
> > > >> > targets of this FLIP may include the following:
> > > >> >
> > > >> > 1. Support a mechanism like  `mini-batch` in SQL for `DataStream`,
> > > which
> > > >> > will collect data in the operator and flush data when it receives
> a
> > > >> `flush`
> > > >> > event, in the FLIP it is `FlushEvent`.
> > > >> >
> > > >>
> > > >> I think the goal is to allow users to specify an end-to-end latency
> > > budget
> > > >> for the job. IMO it is quite different from the `mini-batch` in SQL.
> > > >>
> > > >>
> > > >> >
> > > >> > 2. Support dynamic `latency` according to the progress of job,
> such
> > as
> > > >> > snapshot stage and after that.
> > > >> >
> > > >> > To do that, I have some questions:
> > > >> >
> > > >> > 1. I didn't understand the purpose of public interface
> > > >> `RecordAttributes`.
> > > >> > I think `FlushEvent` in the FLIP is enough, and different
> > > >> > `DynamicFlushStrategy` can be added to generate flush events to
> > > address
> > > >> > different needs, such as a static interval similar to mini-batch
> in
> > > SQL
> > > >> or
> > > >> > collect the information `isProcessingBacklog` and metrics to
> > generate
> > > >> > `FlushEvent` which is described in your FLIP? If hudi sink needs
> the
> > > >> > `isBacklog` flag, the hudi `SplitEnumerator` can create an
> operator
> > > >> event
> > > >> > and send it to hudi source reader.
> > > >> >
> > > >>
> > > >> Suppose we only have FlushEvent, then operators (e.g. Hudi Sink)
> will
> > > not
> > > >> know they can buffer data in the following scenario:
> > > >>
> > > >> - execution.allowed-latency is not configured and use the default
> > value
> > > >> null.
> > > >> - The job is reading from HybridSource and HybridSource says
> > > >> isBacklog=true.
> > > >>
> > > >> Also note that Hudi Sink might not be the only operators that can
> > > benefit
> > > >> from knowing isBacklog=true. Other sinks and aggregation operators
> > (e.g.
> > > >> CoGroup) can also increase throughput by buffering/sorting records
> > when
> > > >> there is backlog. So it seems simpler to pass RecordAttributes to
> > these
> > > >> operators than asking every operator developer to create operator
> > event
> > > >> etc.
> > > >>
> > > >>
> > > >> >
> > > >> > 2. How is this new mechanism unified with SQL's mini-batch
> > mechanism?
> > > As
> > > >> > far as I am concerned, SQL implements mini-batch mechanism based
> on
> > > >> > watermark, I think it is very unreasonable to have two different
> > > >> > implementation in SQL and DataStream.
> > > >> >
> > > >>
> > > >> I think we can deprecate table.exec.mini-batch.allow-latency later
> > > >> once execution.allowed-latency is ready for production usage. This
> is
> > > >> mentioned in the "Compatibility, Deprecation, and Migration Plan"
> > > section.
> > > >>
> > > >> If there is a config that supports user specifying the e2e latency,
> it
> > > is
> > > >> probably reasonable for this config to work for both DataStream and
> > SQL.
> > > >>
> > > >>
> > > >> > 3. I notice that the `CheckpointCoordinator` will generate
> > > `FlushEvent`,
> > > >> > which information about `FlushEvent` will be stored in
> > > >> >
> > > >>
> > > >> CheckpointCoordinator might need to send FlushEvent before
> triggering
> > > >> checkpoint in order to deal with the two-phase commit sinks. The
> > > algorithm
> > > >> is specified in the "Proposed Changes" section.
> > > >>
> > > >>
> > > >> > `Checkpoint`? What is the alignment strategy for FlushEvent in the
> > > >> > operator? The operator will flush the data when it receives all
> > > >> > `FlushEvent` from upstream with the same ID or do flush for each
> > > >> > `FlushEvent`? Can you give more detailed proposal about that? We
> > also
> > > >> have
> > > >> > a demand for this piece, thanks
> > > >> >
> > > >>
> > > >> After an operator has received a FlushEvent:
> > > >> - If the ID of the received FlushEvent is larger than the largest ID
> > > this
> > > >> operator has received, then flush() is triggered for this operator
> and
> > > the
> > > >> operator should broadcast FlushEvent to downstream operators.
> > > >> - Otherwise, this FlushEvent is ignored.
> > > >>
> > > >> This behavior is specified in the Java doc of the FlushEvent.
> > > >>
> > > >> Can you see if this answers your questions?
> > > >>
> > > >> Best,
> > > >> Dong
> > > >>
> > > >>
> > > >> >
> > > >> >
> > > >> > Best,
> > > >> > Shammon FY
> > > >> >
> > > >> >
> > > >> >
> > > >> > On Thu, Jun 29, 2023 at 4:35 PM Martijn Visser <
> > > >> [email protected]>
> > > >> > wrote:
> > > >> >
> > > >> >> Hi Dong and Yunfeng,
> > > >> >>
> > > >> >> Thanks for the FLIP. What's not clear for me is what's the
> expected
> > > >> >> behaviour when the allowed latency can't be met, for whatever
> > reason.
> > > >> >> Given that we're talking about an "allowed latency", it implies
> > that
> > > >> >> something has gone wrong and should fail? Isn't this more a
> minimum
> > > >> >> latency that you're proposing?
> > > >> >>
> > > >> >> There's also the part about the Hudi Sink processing records
> > > >> >> immediately upon arrival. Given that the SinkV2 API provides the
> > > >> >> ability for custom post and pre-commit topologies [1],
> specifically
> > > >> >> targeted to avoid generating multiple small files, why isn't that
> > > >> >> sufficient for the Hudi Sink? It would be great to see that added
> > > >> >> under Rejected Alternatives if this is indeed not sufficient.
> > > >> >>
> > > >> >> Best regards,
> > > >> >>
> > > >> >> Martijn
> > > >> >>
> > > >> >> [1]
> > > >> >>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > >> >>
> > > >> >> On Sun, Jun 25, 2023 at 4:25 AM Yunfeng Zhou
> > > >> >> <[email protected]> wrote:
> > > >> >> >
> > > >> >> > Hi all,
> > > >> >> >
> > > >> >> > Dong(cc'ed) and I are opening this thread to discuss our
> proposal
> > > to
> > > >> >> > support configuring end-to-end allowed latency for Flink jobs,
> > > which
> > > >> >> > has been documented in FLIP-325
> > > >> >> > <
> > > >> >>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-325%3A+Support+configuring+end-to-end+allowed+latency
> > > >> >> >.
> > > >> >> >
> > > >> >> > By configuring the latency requirement for a Flink job, users
> > would
> > > >> be
> > > >> >> > able to optimize the throughput and overhead of the job while
> > still
> > > >> >> > acceptably increasing latency. This approach is particularly
> > useful
> > > >> >> > when dealing with records that do not require immediate
> > processing
> > > >> and
> > > >> >> > emission upon arrival.
> > > >> >> >
> > > >> >> > Please refer to the FLIP document for more details about the
> > > proposed
> > > >> >> > design and implementation. We welcome any feedback and opinions
> > on
> > > >> >> > this proposal.
> > > >> >> >
> > > >> >> > Best regards.
> > > >> >> >
> > > >> >> > Dong and Yunfeng
> > > >> >>
> > > >> >
> > > >>
> > > >
> > >
> >
>

Re: [DISCUSS] FLIP-325: Support configuring end-to-end allowed latency

Reply via email to