Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

2023-07-18 Thread Guowei Ma
+1(binding)
Best,
Guowei


On Wed, Jul 19, 2023 at 11:18 AM Hang Ruan  wrote:

> +1 (non-binding)
>
> Thanks for driving.
>
> Best,
> Hang
>
> Leonard Xu  于2023年7月19日周三 10:42写道:
>
> > Thanks Dong for the continuous work.
> >
> > +1(binding)
> >
> > Best,
> > Leonard
> >
> > > On Jul 18, 2023, at 10:16 PM, Jingsong Li 
> > wrote:
> > >
> > > +1 binding
> > >
> > > Thanks Dong for continuous driving.
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Tue, Jul 18, 2023 at 10:04 PM Jark Wu  wrote:
> > >>
> > >> +1 (binding)
> > >>
> > >> Best,
> > >> Jark
> > >>
> > >> On Tue, 18 Jul 2023 at 20:30, Piotr Nowojski 
> > wrote:
> > >>
> > >>> +1 (binding)
> > >>>
> > >>> Piotrek
> > >>>
> > >>> wt., 18 lip 2023 o 08:51 Jing Ge 
> > napisał(a):
> > >>>
> >  +1(binding)
> > 
> >  Best regards,
> >  Jing
> > 
> >  On Tue, Jul 18, 2023 at 8:31 AM Rui Fan <1996fan...@gmail.com>
> wrote:
> > 
> > > +1(binding)
> > >
> > > Best,
> > > Rui Fan
> > >
> > >
> > > On Tue, Jul 18, 2023 at 12:04 PM Dong Lin 
> > wrote:
> > >
> > >> Hi all,
> > >>
> > >> We would like to start the vote for FLIP-309: Support using larger
> > >> checkpointing interval when source is processing backlog [1]. This
> > >>> FLIP
> > > was
> > >> discussed in this thread [2].
> > >>
> > >> The vote will be open until at least July 21st (at least 72
> hours),
> > >> following
> > >> the consensus voting process.
> > >>
> > >> Cheers,
> > >> Yunfeng and Dong
> > >>
> > >> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-309
> > >>
> > >>
> > >
> > 
> > >>>
> >
> %3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog
> > >> [2]
> > https://lists.apache.org/thread/l1l7f30h7zldjp6ow97y70dcthx7tl37
> > >>
> > >
> > 
> > >>>
> >
> >
>


Re: [DISCUSS] Deprecate multiple APIs in 1.18

2023-07-18 Thread Xintong Song
Thanks for the beautiful sheets, Jane.

1. This sheet <
> https://docs.google.com/spreadsheets/d/1dZBNHLuAHYJt3pFU8ZtfUzrYyf2ZFQ6wybDXGS1bHno/edit?usp=sharing>
> summarizes the user-facing classes and methods that need to be deprecated
> under the flink-table module, some of which are marked with a red
> background and belong to the APIs that need to be deprecated but are not
> explicitly marked in the code. This mainly includes legacy table
> source/sink, legacy table schema, legacy SQL function, and some internal
> APIs designed for Paimon but are now obsolete.
>
IIUC, you are suggesting to mark the classes with a red background as
`@Deprecated` in 1.18?

   - +1 for deprecating `StreamRecordTimestamp` & `ExistingField` in 1.18.
   Based on your description, it seems these were left unmarked by mistake. Let's
   fix them.
   - For the ManagedTable-related classes, is there any FLIP that explicitly
   decides to deprecate them? If not, I think it would be nice to have
   one, to formally decide the deprecation with a vote. I'd expect such a FLIP
   could be quite lightweight.
   - For `SourceFunctionProvider`, please be aware that there have recently been
   some objections to deprecating `SourceFunction`, and the code changes
   marking it as `@Deprecated` might be reverted. See [1][2] for more details.
   So my question is, if `SourceFunction` will not be deprecated until all
   sub-tasks in FLINK-28045 are resolved, do you still think we should
   deprecate `SourceFunctionProvider` now?
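
A minimal sketch of what explicitly marking one of these legacy classes could
look like; the interface name below is hypothetical, and the assumed
replacement is the DynamicTableSource stack:

    /**
     * @deprecated Use the {@code DynamicTableSource} stack instead. Planned for removal in 2.0.
     */
    @Deprecated
    @PublicEvolving
    public interface LegacyTableSourceFactory {
        // existing methods stay unchanged; only the annotation and Javadoc are added
    }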

2. In addition, during the process of organizing, it was found that some
> APIs under the flink-table-api-java and flink-table-common modules do not
> have an explicit API annotation (you can find detailed information in this
> sheet <
> https://docs.google.com/spreadsheets/d/1e8M0tUtKkZXEd8rCZtZ0C6Ty9QkNaPySsrCgz0vEID4/edit?usp=sharing>).
> I suggest explicitly marking the level for these APIs.

I'm in general +1 to adding the missing API annotations. However, I don't have
the expertise to comment on the classes and suggested API levels being
listed.

3. As there are still some internal and test code dependencies on these
> APIs,  can we first gradually migrate these dependencies to alternative
> APIs to make the deprecation process relatively easy?
>
That makes sense to me. I think the purpose of trying to mark the APIs as
deprecated in 1.18 is to send users the signal early that these APIs will
be removed. As for the internal and test code dependencies, I don't see any
problem in gradually migrating them.

Best,

Xintong


[1] https://lists.apache.org/thread/734zhkvs59w2o4d1rsnozr1bfqlr6rgm

[2] https://issues.apache.org/jira/browse/FLINK-28046



On Wed, Jul 19, 2023 at 11:41 AM Jane Chan  wrote:

> Hi Xintong,
>
> Thanks for driving this topic. Regarding the Table API deprecation, I can
> provide some details to help with the process.
>
>1. This sheet
><
> https://docs.google.com/spreadsheets/d/1dZBNHLuAHYJt3pFU8ZtfUzrYyf2ZFQ6wybDXGS1bHno/edit?usp=sharing
> >
> summarizes
>the user-facing classes and methods that need to be deprecated under the
>flink-table module, some of which are marked with a red background and
>belong to the APIs that need to be depreciated but are not explicitly
>marked in the code. This mainly includes legacy table source/sink,
> legacy
>table schema, legacy SQL function, and some internal APIs designed for
>Paimon but are now obsolete.
>2. In addition, during the process of organizing, it was found that some
>APIs under the flink-table-api-java and flink-table-common modules do
> not
>have an explicit API annotation (you can find detailed information in
> this
>sheet
><
> https://docs.google.com/spreadsheets/d/1e8M0tUtKkZXEd8rCZtZ0C6Ty9QkNaPySsrCgz0vEID4/edit?usp=sharing
> >).
>I suggest explicitly marking the level for these APIs.
>3. As there are still some internal and test code dependencies on these
>APIs,  can we first gradually migrate these dependencies to alternative
>APIs to make the deprecation process relatively easy?
>
> I hope this helps.
>
> Best,
> Jane
>
> On Tue, Jul 11, 2023 at 12:08 PM Xintong Song 
> wrote:
>
> > Thanks for bringing this up, Matthias. I've replied to you in the other
> > thread[1].
> >
> > Best,
> >
> > Xintong
> >
> >
> > [1] https://lists.apache.org/thread/l3dkdypyrovd3txzodn07lgdwtwvhgk4
> >
> > On Mon, Jul 10, 2023 at 8:37 PM Matthias Pohl
> >  wrote:
> >
> > > @Xintong did you come across FLINK-3957 [1] when looking for deprecated
> > > issues? There are a few subtasks that would require deprecations as
> well
> > as
> > > far as I can see. For some I'm wondering whether we should add them to
> > the
> > > release 2.0 feature list [2] in some way and (as a consequence) missed
> > them
> > > in FLINK-32557 [3], e.g.:
> > > - FLINK-4675 [4]
> > > - FLINK-14068 [5] (listed in the release 2.0 feature list [2] but not
> > > listed in FLINK-32557 [3])
> > > - FLINK-5126 [6]
> > > - FLINK-13926 [7]
> > 

Re: [DISCUSS] Deprecate multiple APIs in 1.18

2023-07-18 Thread Jane Chan
Hi Xintong,

Thanks for driving this topic. Regarding the Table API deprecation, I can
provide some details to help with the process.

   1. This sheet
   

summarizes
   the user-facing classes and methods that need to be deprecated under the
   flink-table module, some of which are marked with a red background and
   belong to the APIs that need to be deprecated but are not explicitly
   marked in the code. This mainly includes legacy table source/sink, legacy
   table schema, legacy SQL function, and some internal APIs designed for
   Paimon but are now obsolete.
   2. In addition, during the process of organizing, it was found that some
   APIs under the flink-table-api-java and flink-table-common modules do not
   have an explicit API annotation (you can find detailed information in this
   sheet
   
).
   I suggest explicitly marking the level for these APIs.
   3. As there are still some internal and test code dependencies on these
   APIs,  can we first gradually migrate these dependencies to alternative
   APIs to make the deprecation process relatively easy?

I hope this helps.

Best,
Jane

On Tue, Jul 11, 2023 at 12:08 PM Xintong Song  wrote:

> Thanks for bringing this up, Matthias. I've replied to you in the other
> thread[1].
>
> Best,
>
> Xintong
>
>
> [1] https://lists.apache.org/thread/l3dkdypyrovd3txzodn07lgdwtwvhgk4
>
> On Mon, Jul 10, 2023 at 8:37 PM Matthias Pohl
>  wrote:
>
> > @Xintong did you come across FLINK-3957 [1] when looking for deprecated
> > issues? There are a few subtasks that would require deprecations as well
> as
> > far as I can see. For some I'm wondering whether we should add them to
> the
> > release 2.0 feature list [2] in some way and (as a consequence) missed
> them
> > in FLINK-32557 [3], e.g.:
> > - FLINK-4675 [4]
> > - FLINK-14068 [5] (listed in the release 2.0 feature list [2] but not
> > listed in FLINK-32557 [3])
> > - FLINK-5126 [6]
> > - FLINK-13926 [7]
> >
> > For some other subtasks, I'm not 100% sure whether they still apply.
> Others
> > might resolve by removing the DataSet or Scala API. But for the 4
> mentioned
> > above we might need to deprecate API in 1.18.
> >
> > Matthias
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-3957
> > [2] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > [3] https://issues.apache.org/jira/browse/FLINK-32557
> >
> > [4] https://issues.apache.org/jira/browse/FLINK-4675
> > [5] https://issues.apache.org/jira/browse/FLINK-14068
> > [6] https://issues.apache.org/jira/browse/FLINK-5126
> > [7] https://issues.apache.org/jira/browse/FLINK-13926
> >
> > On Fri, Jul 7, 2023 at 12:59 PM Jing Ge 
> > wrote:
> >
> > > Hi Alex,
> > >
> > > I would follow FLIP-197 and try to release them asap depending on dev
> > > resources and how difficult those issues are. The fastest timeline is
> the
> > > period defined in FLIP-197 in ideal conditions.
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Fri, Jul 7, 2023 at 12:20 PM Alexander Fedulov <
> > > alexander.fedu...@gmail.com> wrote:
> > >
> > > > @Xintong
> > > > > - IIUC, the testing scenario you described is like blocking the
> > source
> > > > for
> > > > > proceeding (emit data, finish, etc.) until a checkpoint is
> finished.
> > > >
> > > > It is more tricky than that - we need to prevent the Sink from
> > receiving
> > > a
> > > > checkpoint barrier until the Source is done emitting a given set of
> > > > records. In
> > > > the current tests, which are also used for V2 Sinks, SourceFunction
> > > > controls
> > > > when the Sink is "allowed" to commit by holding the checkpoint lock
> > while
> > > > producing the records. The lock is not available in the new Source by
> > > > design
> > > > and we need a solution that provides the same functionality (without
> > > > modifying
> > > > the Sinks). I am currently checking if a workaround is at all
> possible
> > > > without
> > > > adjusting anything in the Source interface.
> > > >
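
The pattern described above, sketched with a legacy SourceFunction: records are
emitted while holding the checkpoint lock, so no checkpoint barrier can be
processed until the batch is fully emitted (names are illustrative):

    import java.util.List;
    import org.apache.flink.streaming.api.functions.source.SourceFunction;

    public class FixedRecordsSource implements SourceFunction<String> {
        private final List<String> records;
        private volatile boolean running = true;

        public FixedRecordsSource(List<String> records) {
            this.records = records;
        }

        @Override
        public void run(SourceContext<String> ctx) throws Exception {
            // Holding the checkpoint lock blocks checkpoint triggering for this task
            // until all records have been collected.
            synchronized (ctx.getCheckpointLock()) {
                for (String record : records) {
                    if (!running) {
                        break;
                    }
                    ctx.collect(record);
                }
            }
        }

        @Override
        public void cancel() {
            running = false;
        }
    }
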
> > > > > I may not have understood all the details, but based on what you
> > > > described
> > > > > I'd hesitate to block the deprecation / removal of SourceFunction
> on
> > > > this.
> > > >
> > > > I don't think we should, just wanted to highlight that there are some
> > > > unknowns
> > > > with respect to estimating the amount of work required.
> > > >
> > > > @Jing
> > > > I want to understand in which release you would target graduation of
> > the
> > > > mentioned connectors to @Public/@PublicEvolving - basically the
> > > anticipated
> > > > timeline of the steps in both options with respect to releases.
> > > >
> > > > Best,
> > > > Alex
> > > >
> > > > On Fri, 7 Jul 2023 at 10:53, Xintong Song 
> > wrote:
> > > >
> > > > > Thanks all for the discussion. I've created FLINK-32557 for this.
> > > > >
> > > > > 

Re: [DISCUSS][2.0] FLIP-340: Remove rescale REST endpoint

2023-07-18 Thread ConradJam
+1

Zhu Zhu  于2023年7月19日周三 10:53写道:

> +1
>
> Thanks,
> Zhu
>
> Jing Ge  于2023年7月18日周二 19:09写道:
> >
> > +1
> >
> > On Tue, Jul 18, 2023 at 1:05 PM Maximilian Michels 
> wrote:
> >
> > > +1
> > >
> > > On Tue, Jul 18, 2023 at 12:29 PM Gyula Fóra  wrote:
> > > >
> > > > +1
> > > >
> > > > On Tue, 18 Jul 2023 at 12:12, Xintong Song 
> > > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > Best,
> > > > >
> > > > > Xintong
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Jul 18, 2023 at 4:25 PM Chesnay Schepler <
> ches...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > The endpoint hasn't been working for years and was only kept to
> > > inform
> > > > > > users about it. Let's finally remove it.
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-340%3A+Remove+rescale+REST+endpoint
> > > > > >
> > > > > >
> > > > >
> > >
>


Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

2023-07-18 Thread Hang Ruan
+1 (non-binding)

Thanks for driving.

Best,
Hang

Leonard Xu  于2023年7月19日周三 10:42写道:

> Thanks Dong for the continuous work.
>
> +1(binding)
>
> Best,
> Leonard
>
> > On Jul 18, 2023, at 10:16 PM, Jingsong Li 
> wrote:
> >
> > +1 binding
> >
> > Thanks Dong for continuous driving.
> >
> > Best,
> > Jingsong
> >
> > On Tue, Jul 18, 2023 at 10:04 PM Jark Wu  wrote:
> >>
> >> +1 (binding)
> >>
> >> Best,
> >> Jark
> >>
> >> On Tue, 18 Jul 2023 at 20:30, Piotr Nowojski 
> wrote:
> >>
> >>> +1 (binding)
> >>>
> >>> Piotrek
> >>>
> >>> wt., 18 lip 2023 o 08:51 Jing Ge 
> napisał(a):
> >>>
>  +1(binding)
> 
>  Best regards,
>  Jing
> 
>  On Tue, Jul 18, 2023 at 8:31 AM Rui Fan <1996fan...@gmail.com> wrote:
> 
> > +1(binding)
> >
> > Best,
> > Rui Fan
> >
> >
> > On Tue, Jul 18, 2023 at 12:04 PM Dong Lin 
> wrote:
> >
> >> Hi all,
> >>
> >> We would like to start the vote for FLIP-309: Support using larger
> >> checkpointing interval when source is processing backlog [1]. This
> >>> FLIP
> > was
> >> discussed in this thread [2].
> >>
> >> The vote will be open until at least July 21st (at least 72 hours),
> >> following
> >> the consensus voting process.
> >>
> >> Cheers,
> >> Yunfeng and Dong
> >>
> >> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-309
> >>
> >>
> >
> 
> >>>
> %3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog
> >> [2]
> https://lists.apache.org/thread/l1l7f30h7zldjp6ow97y70dcthx7tl37
> >>
> >
> 
> >>>
>
>


Re: Re: [DISCUSS] Release 2.0 Work Items

2023-07-18 Thread Xintong Song
I went through the remaining Jira tickets that have a 2.0.0 fix-version and are
not included in FLINK-3975.

I skipped the 3 umbrella tickets below and their subtasks, which are newly
created for the 2.0 work items.

   - FLINK-32377 Breaking REST API changes
   - FLINK-32378 Breaking Metrics system changes
   - FLINK-32383 2.0 Breaking configuration changes

I'd suggest going ahead with the following tickets.

   - Need action in 1.18
   - FLINK-29739: Already listed in the release 2.0 wiki. Needs to mark all
  Scala APIs as deprecated.
   - Need no action in 1.18
  - FLINK-23620: Already listed in the release 2.0
  - FLINK-15470/30246/32437: Behavior changes, no API to be deprecated

I'd suggest not doing the following tickets.

   - FLINK-11409: Subsumed by "Convert user-facing concrete classes into
   interfaces" in the release 2.0 wiki

I'd suggest leaving the following tickets as TBD, and would be slightly in
favor of not doing them unless someone volunteers to look more into them.

   - FLINK-10113 Drop support for pre 1.6 shared buffer state
   - FLINK-10374 [Map State] Let user value serializer handle null values
   - FLINK-13928 Make windows api more extendable
   - FLINK-17539 Migrate the configuration options which do not follow the
   xyz.max/min pattern


Best,

Xintong



On Tue, Jul 18, 2023 at 5:20 PM Wencong Liu  wrote:

> Hi Chesnay,
> Thanks for the reply. I think it is reasonable to remove the configuration
> argument
> in AbstractUdfStreamOperator#open if it is consistently empty. I'll
> propose a discussion
> about the specific actions in FLINK-6912 at a later time.
>
>
> Best,
> Wencong Liu
>
>
>
>
>
>
>
>
>
>
>
> At 2023-07-18 16:38:59, "Chesnay Schepler"  wrote:
> >On 18/07/2023 10:33, Wencong Liu wrote:
> >> For FLINK-6912:
> >>
> >>  There are three implementations of RichFunction that actually use
> >> the Configuration parameter in RichFunction#open:
> >>  1. ContinuousFileMonitoringFunction#open: It uses the configuration
> >> to configure the FileInputFormat. [1]
> >>  2. OutputFormatSinkFunction#open: It uses the configuration
> >> to configure the OutputFormat. [2]
> >>  3. InputFormatSourceFunction#open: It uses the configuration
> >>   to configure the InputFormat. [3]
> >
> >And none of them should have any effect since the configuration is empty.
> >
> >See
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator#open.
>
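
A minimal sketch of what this means for user code, assuming the call chain
described above ("my.buffer.size" is a hypothetical key): any value read from
the open() parameter falls back to its default, because
AbstractUdfStreamOperator passes an empty Configuration.

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;

    public class MyMapper extends RichMapFunction<String, String> {
        @Override
        public void open(Configuration parameters) {
            // 'parameters' is an empty Configuration when invoked via
            // AbstractUdfStreamOperator#open, so this always returns the default.
            int bufferSize = parameters.getInteger("my.buffer.size", 1024);
        }

        @Override
        public String map(String value) {
            return value;
        }
    }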


Re: [DISCUSS][2.0] FLIP-340: Remove rescale REST endpoint

2023-07-18 Thread Zhu Zhu
+1

Thanks,
Zhu

Jing Ge  于2023年7月18日周二 19:09写道:
>
> +1
>
> On Tue, Jul 18, 2023 at 1:05 PM Maximilian Michels  wrote:
>
> > +1
> >
> > On Tue, Jul 18, 2023 at 12:29 PM Gyula Fóra  wrote:
> > >
> > > +1
> > >
> > > On Tue, 18 Jul 2023 at 12:12, Xintong Song 
> > wrote:
> > >
> > > > +1
> > > >
> > > > Best,
> > > >
> > > > Xintong
> > > >
> > > >
> > > >
> > > > On Tue, Jul 18, 2023 at 4:25 PM Chesnay Schepler 
> > > > wrote:
> > > >
> > > > > The endpoint hasn't been working for years and was only kept to
> > inform
> > > > > users about it. Let's finally remove it.
> > > > >
> > > > >
> > > > >
> > > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-340%3A+Remove+rescale+REST+endpoint
> > > > >
> > > > >
> > > >
> >


Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

2023-07-18 Thread Leonard Xu
Thanks Dong for the continuous work.

+1(binding)

Best,
Leonard

> On Jul 18, 2023, at 10:16 PM, Jingsong Li  wrote:
> 
> +1 binding
> 
> Thanks Dong for continuous driving.
> 
> Best,
> Jingsong
> 
> On Tue, Jul 18, 2023 at 10:04 PM Jark Wu  wrote:
>> 
>> +1 (binding)
>> 
>> Best,
>> Jark
>> 
>> On Tue, 18 Jul 2023 at 20:30, Piotr Nowojski  wrote:
>> 
>>> +1 (binding)
>>> 
>>> Piotrek
>>> 
>>> wt., 18 lip 2023 o 08:51 Jing Ge  napisał(a):
>>> 
 +1(binding)
 
 Best regards,
 Jing
 
 On Tue, Jul 18, 2023 at 8:31 AM Rui Fan <1996fan...@gmail.com> wrote:
 
> +1(binding)
> 
> Best,
> Rui Fan
> 
> 
> On Tue, Jul 18, 2023 at 12:04 PM Dong Lin  wrote:
> 
>> Hi all,
>> 
>> We would like to start the vote for FLIP-309: Support using larger
>> checkpointing interval when source is processing backlog [1]. This
>>> FLIP
> was
>> discussed in this thread [2].
>> 
>> The vote will be open until at least July 21st (at least 72 hours),
>> following
>> the consensus voting process.
>> 
>> Cheers,
>> Yunfeng and Dong
>> 
>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-309
>> 
>> 
> 
 
>>> %3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog
>> [2] https://lists.apache.org/thread/l1l7f30h7zldjp6ow97y70dcthx7tl37
>> 
> 
 
>>> 



Re: [DISCUSS] FLIP-309: Enable operators to trigger checkpoints dynamically

2023-07-18 Thread Leonard Xu
+1 for interval-during-backlog


best,
leonard
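
A minimal sketch of how the two intervals could be configured once this lands
(key names as proposed in the FLIP; `interval-during-backlog` is not part of
any released version yet):

    // Fragment: build an environment with a short normal interval and a larger
    // interval that is used while sources report isProcessingBacklog = true.
    Configuration conf = new Configuration();
    conf.setString("execution.checkpointing.interval", "30 s");
    conf.setString("execution.checkpointing.interval-during-backlog", "10 min");
    StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment(conf);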

> On Jul 14, 2023, at 11:38 PM, Piotr Nowojski  wrote:
> 
> Hi All,
> 
> We had a lot of off-line discussions. As a result I would suggest dropping
> the idea of introducing an end-to-end-latency concept, until
> we can properly implement it, which will require more designing and
> experimenting. I would suggest starting with a more manual solution,
> where the user needs to configure concrete parameters, like
> `execution.checkpointing.max-interval` or `execution.flush-interval`.
> 
> FLIP-309 looks good to me, I would just rename
> `execution.checkpointing.interval-during-backlog` to
> `execution.checkpointing.max-interval`.
> 
> I would also reference future work, that a solution that would allow set
> `isProcessingBacklog` for sources like Kafka will be introduced via
> FLIP-328 [1].
> 
> Best,
> Piotrek
> 
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-328%3A+Allow+source+operators+to+determine+isProcessingBacklog+based+on+watermark+lag
> 
> śr., 12 lip 2023 o 03:49 Dong Lin  napisał(a):
> 
>> Hi Piotr,
>> 
>> I think I understand your motivation for suggeseting
>> execution.slow-end-to-end-latency now. Please see my followup comments
>> (after the previous email) inline.
>> 
>> On Wed, Jul 12, 2023 at 12:32 AM Piotr Nowojski 
>> wrote:
>> 
>>> Hi Dong,
>>> 
>>> Thanks for the updates, a couple of comments:
>>> 
 If a record is generated by a source when the source's
>>> isProcessingBacklog is true, or some of the records used to
 derive this record (by an operator) has isBacklog = true, then this
>>> record should have isBacklog = true. Otherwise,
 this record should have isBacklog = false.
>>> 
>>> nit:
>>> I think this conflicts with "Rule of thumb for non-source operators to
>> set
>>> isBacklog = true for the records it emits:"
>>> section later on, when it comes to a case if an operator has mixed
>>> isBacklog = false and isBacklog = true inputs.
>>> 
 execution.checkpointing.interval-during-backlog
>>> 
>>> Do we need to define this as an interval config parameter? Won't that add
>>> an option that will be almost instantly deprecated
>>> because what we actually would like to have is:
>>> execution.slow-end-to-end-latency and execution.end-to-end-latency
>>> 
>> 
>> I guess you are suggesting that we should allow users to specify a higher
>> end-to-end latency budget for those records that are emitted by two-phase
>> commit sink, than those records that are emitted by none-two-phase commit
>> sink.
>> 
>> My concern with this approach is that it will increase the complexity of
>> the definition of "processing latency requirement", as well as the
>> complexity of the Flink runtime code that handles it. Currently, the
>> FLIP-325 defines end-to-end latency as an attribute of the records that is
>> statically assigned when the record is generated at the source, regardless
>> of how it will be emitted later in the topology. If we make the changes
>> proposed above, we would need to define the latency requirement w.r.t. the
>> attribute of the operators that it travels through before its result is
>> emitted, which is less intuitive and more complex.
>> 
>> For now, it is not clear whether it is necessary to have two categories of
>> latency requirement for the same job. Maybe it is reasonable to assume that
>> if a job has two-phase commit sink and the user is OK to emit some results
>> at 1 minute interval, then more likely than not the user is also OK to emit
>> all results at 1 minute interval, include those that go through
>> none-two-phase commit sink?
>> 
>> If we do want to support different end-to-end latency depending on whether
>> the operator is emitted by two-phase commit sink, I would prefer to still
>> use execution.checkpointing.interval-during-backlog instead of
>> execution.slow-end-to-end-latency. This allows us to keep the concept of
>> end-to-end latency simple. Also, by explicitly including "checkpointing
>> interval" in the name of the config that directly affects checkpointing
>> interval, we can make it easier and more intuitive for users to understand
>> the impact and set proper value for such configs.
>> 
>> What do you think?
>> 
>> Best,
>> Dong
>> 
>> 
>>> Maybe we can introduce only `execution.slow-end-to-end-latency` (% maybe
>> a
>>> better name), and for the time being
>>> use it as the checkpoint interval value during backlog?
>> 
>> 
>>> Or do you envision that in the future users will be configuring only:
>>> - execution.end-to-end-latency
>>> and only optionally:
>>> - execution.checkpointing.interval-during-backlog
>>> ?
>>> 
>>> Best Piotrek
>>> 
>>> PS, I will read the summary that you have just published later, but I
>> think
>>> we don't need to block this FLIP on the
>>> existence of that high level summary.
>>> 
>>> wt., 11 lip 2023 o 17:49 Dong Lin  napisał(a):
>>> 
 Hi Piotr and everyone,
 
 I have documented the vision with a summary of the existing work in
>> 

[jira] [Created] (FLINK-32627) Add support for dynamic time window function

2023-07-18 Thread Jira
张一帆 created FLINK-32627:
---

 Summary: Add support for dynamic time window function
 Key: FLINK-32627
 URL: https://issues.apache.org/jira/browse/FLINK-32627
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Affects Versions: 1.18.0
Reporter: 张一帆
 Fix For: 1.18.0


When using windows for calculations, if the logic is frequently modified and 
adjusted, the entire program needs to be stopped, the code modified, and the 
program repackaged and resubmitted to the cluster. Dynamic modification of the 
logic and external injection are impossible. The window information could be 
obtained from the data to trigger redistribution of windows and achieve the 
effect of dynamic windows.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[DISCUSSION] Revival of FLIP-154 (SQL Implicit Type Coercion)

2023-07-18 Thread Sergey Nuyanzin
Hello everyone

I'd like to revive FLIP-154[1] a bit.

I failed to find any discussion/vote thread about it (please point me
to it if one exists somewhere).

Also, the FLIP itself looks abandoned (no activity for a couple of years);
however, it still seems useful.

I did a bit of investigation about that.

From one side, Calcite provides its own coercion... However, sometimes it
behaves strangely and is not ready to use in Flink as-is.
For instance:
1) it cannot handle cases with `with` subqueries and fails with an NPE (even
once that is fixed, the fix will arrive no earlier than Calcite 1.36.0);
2) under the hood it uses a hardcoded `cast`, which is especially an issue for
equals, where `cast` is invoked without a fallback to `null`. In addition, it
tries to cast `string` to `date`. All this leads to failures for queries
like `select uuid() = null;`, where it tries to cast the result of
`uuid()` to a date without a fallback option.

The good thing is that Calcite also provides a custom TypeCoercionFactory
which could be used during coercion (in case it is enabled). This would, for
instance, allow using `try_cast` instead of `cast`, embedding the fix for the
aforementioned NPE, and enabling only the required coercion rules. It would
also allow enabling coercion rules one by one instead of as a big bang.

I would volunteer to update the FLIP page with usage of a custom factory if
there are no objections and come back with a discussion thread to revive
the work on it.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-154%3A+SQL+Implicit+Type+Coercion
-- 
Best regards,
Sergey
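
A minimal sketch of the failing case from the Table API side (illustrative;
the exact behaviour depends on whether Calcite's implicit coercion is enabled):

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class CoercionExample {
        public static void main(String[] args) {
            TableEnvironment tEnv =
                    TableEnvironment.create(EnvironmentSettings.inStreamingMode());
            // The comparison needs implicit coercion of NULL; with Calcite's default
            // rules a hardcoded CAST (without a fallback to NULL) is injected, which
            // is what a custom TypeCoercionFactory using TRY_CAST would avoid.
            tEnv.executeSql("SELECT uuid() = NULL").print();
        }
    }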


Flink 1.15.1 Troubleshooting following exception

2023-07-18 Thread Tucker Harvey
Hi, we are trying to determine how to fix the following exception. This issue 
is happening repeatedly for us. We have tried looking online for solutions. One 
thread suggested setting idleTimeout, but this doesn't seem to be supported in 
the Flink source code. 

https://github.com/netty/netty/issues/8801

org.apache.flink.runtime.io.network.netty.exception.LocalTransportException: 
readAddress(..) failed: Connection timed out (connection to 
[redacted-worker])]')
at 
org.apache.flink.runtime.io.network.netty.CreditBasedPartitionRequestClientHandler.exceptionCaught(CreditBasedPartitionRequestClientHandler.java:175)
at 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302)
at 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
at 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:273)
at 
org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.exceptionCaught(DefaultChannelPipeline.java:1377)
at 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302)
at 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
at 
org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireExceptionCaught(DefaultChannelPipeline.java:907)
at 
org.apache.flink.shaded.netty4.io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.handleReadException(AbstractEpollStreamChannel.java:728)
at 
org.apache.flink.shaded.netty4.io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:821)
at 
org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480)
at 
org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at 
org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at java.lang.Thread.run(Thread.java:748)
Caused by: 
org.apache.flink.shaded.netty4.io.netty.channel.unix.Errors$NativeIoException: 
readAddress(..) failed: Connection timed out 

Keying a data stream by tenant and performing Flink ML tasks on each sub-stream ?

2023-07-18 Thread Catalin Stavaru
Hello everyone,

Please forgive me if this is not the correct mailing list for the issue
below.

Here is my use case: I have an event data stream which I need to "keyby" a
certain field (tenant id) and then for each tenant's events I need to
independently perform ML clustering using FlinkML's OnlineKMeans component.
I am using Java.

I tried different approaches but none of them seems to be correct.

Basically, I try to keep an OnlineKMeansModel instance as per-key (thus
per-tenant) state using a keyed processing function on the event
DataStream. In the processing function for the current event, if the
OnlineKMeansModel instance for the event's tenant id is not yet created, I
will create one and store it as state for that tenant id, to use it in the
future.

However, this doesn't seem to be the correct way to do it in Flink, I am
facing many hurdles using this approach.

- The OnlineKMeans takes a table (as in the Table API) as input; that table is
basically a view of the event data stream, filtered by a certain tenant id.
How do I go about this?
- The OnlineKMeansModel is provided a table to output its predictions to.
How do I go about this table?
- I get many "this class is not serializable" errors, a sign that I am not
using the correct approach.

etc.

In essence, I feel that I am overlooking a fundamental aspect when it comes
to implementing a functional approach for performing FlinkML computations
independently for each key within a keyed data stream.

In the hope that my use case was understood, I am asking you for help on
the correct approach for this scenario.

Thank you !
-- 
Catalin Stavaru


Re: [DISCUSS] Flink Kubernetes Operator cleanup procedure

2023-07-18 Thread Maximilian Michels
Hi Daren,

The behavior is consistent with the regular FlinkDeployment where the
cleanup will also cancel any running jobs. Are you intending to
recover jobs from another session cluster?

-Max

On Mon, Jul 17, 2023 at 4:48 PM Wong, Daren
 wrote:
>
> Hi devs,
>
> I would like to enquire about the cleanup procedure upon FlinkSessionJob 
> deletion. Currently, FlinkSessionJobController would trigger a cleanup in the 
> SessionJobReconciler which in turn cancels the Flink Job.
>
> Link to code: 
> https://github.com/apache/flink-kubernetes-operator/blob/371a2e6bbb8008c8ffccfff8fc338fb39bda19e2/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/sessionjob/SessionJobReconciler.java#L110
>
> This makes sense to me as we want to ensure the Flink Job is ended gracefully 
> when the FlinkSessionJob associated with it is deleted. Otherwise, the Flink 
> Job will be “leaked” in the Flink cluster without a FlinkSessionJob 
> associated to it for the Kubernetes Operator to control.
>
> That being said, I was wondering if we should consider scenarios where 
> users may not want FlinkSessionJob deletion to create a side-effect such as 
> cancelJob? For example, the user just wants to simply delete the whole 
> namespace. One way of achieving this could be to put the job in SUSPENDED 
> state instead of cancelling the job.
>
> I am opening this discussion thread to get feedback and input on whether this 
> alternative cleanup procedure is worth considering and if anyone else sees any 
> potential use case/benefits/downsides with it?
>
> Thank you very much.
>
> Regards,
> Daren


[jira] [Created] (FLINK-32626) Get Savepoint REST API doesn't distinguish non-existent job from non-existent savepoint

2023-07-18 Thread Austin Cawley-Edwards (Jira)
Austin Cawley-Edwards created FLINK-32626:
-

 Summary: Get Savepoint REST API doesn't distinguish non-existent 
job from non-existent savepoint 
 Key: FLINK-32626
 URL: https://issues.apache.org/jira/browse/FLINK-32626
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / REST
Affects Versions: 1.17.1
Reporter: Austin Cawley-Edwards


The current `GET /jobs/:jobid/savepoints/:triggerid` API endpoint [1], when 
given *either* a Job ID or a Trigger ID that does not exist, responds 
with an exception that indicates the savepoint doesn't exist, like:
{code:java}
{"errors":["org.apache.flink.runtime.rest.handler.RestHandlerException: There 
is no savepoint operation with triggerId=TRIGGER ID for job JOB ID.\n\tat 
org.apache.flink.runtime.rest.handler.job.savepoints.SavepointHandlers$SavepointStatusHandler.maybeCreateNotFoundError(SavepointHandlers.java:325)\n\tat
 
org.apache.flink.runtime.rest.handler.job.savepoints.SavepointHandlers$SavepointStatusHandler.lambda$handleRequest$1(SavepointHandlers.java:308)\n\tat
 
java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)\n\tat
 
java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)\n\tat
 
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)\n\tat
 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)\n\tat
 
org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$1(AkkaInvocationHandler.java:260)\n\tat
 
java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)\n\tat
 
java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)\n\tat
 
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)\n\tat
 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)\n\tat
 
org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1275)\n\tat
 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93)\n\tat
 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)\n\tat
 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92)\n\tat
 
java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)\n\tat
 
java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)\n\tat
 
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)\n\tat
 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)\n\tat
 
org.apache.flink.runtime.concurrent.akka.AkkaFutureUtils$1.onComplete(AkkaFutureUtils.java:45)\n\tat
 akka.dispatch.OnComplete.internal(Future.scala:299)\n\tat 
akka.dispatch.OnComplete.internal(Future.scala:297)\n\tat 
akka.dispatch.japi$CallbackBridge.apply(Future.scala:224)\n\tat 
akka.dispatch.japi$CallbackBridge.apply(Future.scala:221)\n\tat 
scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)\n\tat 
org.apache.flink.runtime.concurrent.akka.AkkaFutureUtils$DirectExecutionContext.execute(AkkaFutureUtils.java:65)\n\tat
 
scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:72)\n\tat 
scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:288)\n\tat
 
scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:288)\n\tat
 
scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:288)\n\tat
 akka.pattern.PromiseActorRef.$bang(AskSupport.scala:622)\n\tat 
akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:25)\n\tat
 
akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:23)\n\tat
 scala.concurrent.Future.$anonfun$andThen$1(Future.scala:536)\n\tat 
scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)\n\tat 
scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)\n\tat 
scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)\n\tat 
akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:63)\n\tat
 
akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:100)\n\tat
 scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)\n\tat 
scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:85)\n\tat 
akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:100)\n\tat
 akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:49)\n\tat 

[jira] [Created] (FLINK-32625) MiniClusterTestEnvironment supports customized MiniClusterResourceConfiguration

2023-07-18 Thread Zili Chen (Jira)
Zili Chen created FLINK-32625:
-

 Summary: MiniClusterTestEnvironment supports customized 
MiniClusterResourceConfiguration
 Key: FLINK-32625
 URL: https://issues.apache.org/jira/browse/FLINK-32625
 Project: Flink
  Issue Type: Improvement
  Components: Test Infrastructure
Reporter: Zili Chen
Assignee: Zili Chen






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

2023-07-18 Thread Jingsong Li
+1 binding

Thanks Dong for continuous driving.

Best,
Jingsong

On Tue, Jul 18, 2023 at 10:04 PM Jark Wu  wrote:
>
> +1 (binding)
>
> Best,
> Jark
>
> On Tue, 18 Jul 2023 at 20:30, Piotr Nowojski  wrote:
>
> > +1 (binding)
> >
> > Piotrek
> >
> > wt., 18 lip 2023 o 08:51 Jing Ge  napisał(a):
> >
> > > +1(binding)
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Tue, Jul 18, 2023 at 8:31 AM Rui Fan <1996fan...@gmail.com> wrote:
> > >
> > > > +1(binding)
> > > >
> > > > Best,
> > > > Rui Fan
> > > >
> > > >
> > > > On Tue, Jul 18, 2023 at 12:04 PM Dong Lin  wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > We would like to start the vote for FLIP-309: Support using larger
> > > > > checkpointing interval when source is processing backlog [1]. This
> > FLIP
> > > > was
> > > > > discussed in this thread [2].
> > > > >
> > > > > The vote will be open until at least July 21st (at least 72 hours),
> > > > > following
> > > > > the consensus voting process.
> > > > >
> > > > > Cheers,
> > > > > Yunfeng and Dong
> > > > >
> > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-309
> > > > >
> > > > >
> > > >
> > >
> > %3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog
> > > > > [2] https://lists.apache.org/thread/l1l7f30h7zldjp6ow97y70dcthx7tl37
> > > > >
> > > >
> > >
> >


[jira] [Created] (FLINK-32624) TieredStorageConsumerClientTest.testGetNextBufferFromRemoteTier failed on CI

2023-07-18 Thread lincoln lee (Jira)
lincoln lee created FLINK-32624:
---

 Summary: 
TieredStorageConsumerClientTest.testGetNextBufferFromRemoteTier failed on CI
 Key: FLINK-32624
 URL: https://issues.apache.org/jira/browse/FLINK-32624
 Project: Flink
  Issue Type: Bug
  Components: API / Core
Affects Versions: 1.18.0
Reporter: lincoln lee


[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51376=logs=0da23115-68bb-5dcd-192c-bd4c8adebde1=24c3384f-1bcb-57b3-224f-51bf973bbee8]

errors:

{code}
Jul 18 11:18:35 at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
 
Jul 18 11:18:35 at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
 
Jul 18 11:18:35 at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
 
Jul 18 11:18:35 at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
 
Jul 18 11:18:35 at 
org.junit.platform.engine.support.hierarchical.ForkJoinPoolHierarchicalTestExecutorService$ExclusiveTask.compute(ForkJoinPoolHierarchicalTestExecutorService.java:185)
 
Jul 18 11:18:35 at 
java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189) 
Jul 18 11:18:35 at 
java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) 
Jul 18 11:18:35 at 
java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) 
Jul 18 11:18:35 at 
java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) 
Jul 18 11:18:35 at 
java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175) 
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: FLIP-401: Remove brackets around keys returned by MetricGroup#getAllVariables

2023-07-18 Thread Maximilian Michels
Hi Chesnay,

+1 Sounds good to me!

-Max

On Tue, Jul 18, 2023 at 10:59 AM Chesnay Schepler  wrote:
>
> MetricGroup#getAllVariables returns all variables associated with the
> metric, for example:
>
> | = abcde|
> | = ||0|
>
> The keys are surrounded by brackets for no particular reason.
>
> In virtually every use-case for this method the user is stripping the
> brackets from keys, as done in:
>
>   * our datadog reporter:
> 
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DatadogHttpReporter.java#L244
> 
> 
>   * our prometheus reporter (implicitly via a character filter):
> 
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/AbstractPrometheusReporter.java#L236
> 
> 
>   * our JMX reporter:
> 
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-jmx/src/main/java/org/apache/flink/metrics/jmx/JMXReporter.java#L223
> 
> 
>
> I propose to change the method spec and implementation to remove the
> brackets around keys.
>
> For migration purposes it may make sense to add a new method with the
> new behavior (|getVariables()|) and deprecate the old method.
>
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425202
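
A minimal sketch of the bracket-stripping that reporters do today, and that the
proposal makes unnecessary (variable names are examples):

    // Today: keys come back as "<host>", "<job_name>", ... and every reporter strips them.
    Map<String, String> variables = metricGroup.getAllVariables();
    Map<String, String> cleaned = new HashMap<>();
    for (Map.Entry<String, String> e : variables.entrySet()) {
        String key = e.getKey();
        if (key.startsWith("<") && key.endsWith(">")) {
            key = key.substring(1, key.length() - 1);
        }
        cleaned.put(key, e.getValue());
    }
    // With the proposed change (or a new getVariables()), this stripping step disappears.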


Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

2023-07-18 Thread Jark Wu
+1 (binding)

Best,
Jark

On Tue, 18 Jul 2023 at 20:30, Piotr Nowojski  wrote:

> +1 (binding)
>
> Piotrek
>
> wt., 18 lip 2023 o 08:51 Jing Ge  napisał(a):
>
> > +1(binding)
> >
> > Best regards,
> > Jing
> >
> > On Tue, Jul 18, 2023 at 8:31 AM Rui Fan <1996fan...@gmail.com> wrote:
> >
> > > +1(binding)
> > >
> > > Best,
> > > Rui Fan
> > >
> > >
> > > On Tue, Jul 18, 2023 at 12:04 PM Dong Lin  wrote:
> > >
> > > > Hi all,
> > > >
> > > > We would like to start the vote for FLIP-309: Support using larger
> > > > checkpointing interval when source is processing backlog [1]. This
> FLIP
> > > was
> > > > discussed in this thread [2].
> > > >
> > > > The vote will be open until at least July 21st (at least 72 hours),
> > > > following
> > > > the consensus voting process.
> > > >
> > > > Cheers,
> > > > Yunfeng and Dong
> > > >
> > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-309
> > > >
> > > >
> > >
> >
> %3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog
> > > > [2] https://lists.apache.org/thread/l1l7f30h7zldjp6ow97y70dcthx7tl37
> > > >
> > >
> >
>


[jira] [Created] (FLINK-32623) Rest api doesn't return minimum resource requirements correctly

2023-07-18 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-32623:
--

 Summary: Rest api doesn't return minimum resource requirements 
correctly
 Key: FLINK-32623
 URL: https://issues.apache.org/jira/browse/FLINK-32623
 Project: Flink
  Issue Type: Bug
  Components: Runtime / REST
Affects Versions: 1.18.0
Reporter: Gyula Fora
Assignee: Gyula Fora
 Fix For: 1.18.0


The resource requirements returned by the REST API always contain a hardcoded 
lower bound of 1 for each vertex.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS][2.0] FLIP-341: Remove MetricGroup methods accepting an int as a name

2023-07-18 Thread Jing Ge
+1

On Tue, Jul 18, 2023 at 1:11 PM Chesnay Schepler  wrote:

> Good catch; i've fixed the list.
>
> On 18/07/2023 12:20, Xintong Song wrote:
> > +1 in general.
> >
> > I think the list of affected public interfaces in the FLIP is not
> accurate.
> >
> > - `#counter(int, Counter)` is missed
> > - `#meter(int)` should be `#meter(int, Meter)`
> > - `#group(int)` should be `#addGroup(int)`
> >
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Tue, Jul 18, 2023 at 4:39 PM Chesnay Schepler 
> wrote:
> >
> >> The MetricGroup interface contains methods to create groups and metrics
> >> using an int as a name. The original intention was to allow pattern like
> >> |group.addGroup("subtaskIndex").addGroup(0)| , but this didn't really
> >> work out, with |addGroup(String, String)|  serving this use case much
> >> better.
> >>
> >> Metric methods accept an int mostly for consistency, but there's no good
> >> use-case for it.
> >>
> >> These methods also offer hardly any convenience since all they do is
> >> save potential users from using |String.valueOf| on one argument. That
> >> doesn't seem valuable enough for something that doubles the size of the
> >> interface.
> >>
> >> I propose to remove said methods.
> >>
> >>
> >>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-341%3A+Remove+MetricGroup+methods+accepting+an+int+as+a+name
> >>
>
>


[jira] [Created] (FLINK-32622) Do not add mini-batch assigner operator if it is useless

2023-07-18 Thread Fang Yong (Jira)
Fang Yong created FLINK-32622:
-

 Summary: Do not add mini-batch assigner operator if it is useless
 Key: FLINK-32622
 URL: https://issues.apache.org/jira/browse/FLINK-32622
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Fang Yong


Currently, if users configure mini-batch for their SQL jobs, Flink will always add 
a mini-batch assigner operator to the job plan even if there are no agg/join operators in 
the job. The mini-batch assigner will generate useless events and cause performance 
issues for them. If mini-batch is useless for a specific job, Flink 
should not add the mini-batch assigner even when users turn on the mini-batch 
mechanism. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32621) Add metrics for DataGeneratorSource

2023-07-18 Thread Weijie Guo (Jira)
Weijie Guo created FLINK-32621:
--

 Summary: Add metrics for DataGeneratorSource
 Key: FLINK-32621
 URL: https://issues.apache.org/jira/browse/FLINK-32621
 Project: Flink
  Issue Type: Technical Debt
Affects Versions: 1.18.0
Reporter: Weijie Guo
Assignee: Weijie Guo


{{DataGeneratorSource}} uses {{GeneratingIteratorSourceReader}} instead of 
{{SourceReaderBase}} as its source reader. So it lacks metrics like numRecordsIn 
that {{SourceReaderBase}} provides.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32620) Migrate DiscardingSink to sinkv2

2023-07-18 Thread Weijie Guo (Jira)
Weijie Guo created FLINK-32620:
--

 Summary: Migrate DiscardingSink to sinkv2
 Key: FLINK-32620
 URL: https://issues.apache.org/jira/browse/FLINK-32620
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / Common
Affects Versions: 1.18.0
Reporter: Weijie Guo
Assignee: Weijie Guo






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

2023-07-18 Thread Piotr Nowojski
+1 (binding)

Piotrek

wt., 18 lip 2023 o 08:51 Jing Ge  napisał(a):

> +1(binding)
>
> Best regards,
> Jing
>
> On Tue, Jul 18, 2023 at 8:31 AM Rui Fan <1996fan...@gmail.com> wrote:
>
> > +1(binding)
> >
> > Best,
> > Rui Fan
> >
> >
> > On Tue, Jul 18, 2023 at 12:04 PM Dong Lin  wrote:
> >
> > > Hi all,
> > >
> > > We would like to start the vote for FLIP-309: Support using larger
> > > checkpointing interval when source is processing backlog [1]. This FLIP
> > was
> > > discussed in this thread [2].
> > >
> > > The vote will be open until at least July 21st (at least 72 hours),
> > > following
> > > the consensus voting process.
> > >
> > > Cheers,
> > > Yunfeng and Dong
> > >
> > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-309
> > >
> > >
> >
> %3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog
> > > [2] https://lists.apache.org/thread/l1l7f30h7zldjp6ow97y70dcthx7tl37
> > >
> >
>


Re: [DISCUSS] FLIP-309: Enable operators to trigger checkpoints dynamically

2023-07-18 Thread Piotr Nowojski
Thanks Dong!

Piotrek

wt., 18 lip 2023 o 06:04 Dong Lin  napisał(a):

> Hi all,
>
> I have updated FLIP-309 as suggested by Piotr to include a reference to
> FLIP-328 in the future work section.
>
> Piotra, Stephan, and I discussed offline regarding the choice
> between execution.checkpointing.max-interval and
> execution.checkpointing.interval-during-backlog.
> The advantage of using "max-interval" is that Flink runtime can have more
> flexibility to decide when/how to adjust checkpointing intervals (based on
> information other than backlog). The advantage of using
> "interval-during-backlog" is that it is clearer to the user when/how this
> configured interval is used. Since there is no immediate need for the extra
> flexibility as of this FLIP, we agreed to go with interval-during-backlog
> for now. And we can rename this config to e.g.
> execution.checkpointing.max-interval when needed in the future.
>
> Thanks everyone for all the reviews and suggestions! And special thanks to
> Piotr and Stephan for taking extra time to provide detailed reviews and
> suggestions offline!
>
> Since there is no further comment, I will open the voting thread for this
> FLIP.
>
> Cheers,
> Dong
>
>
> On Fri, Jul 14, 2023 at 11:39 PM Piotr Nowojski 
> wrote:
>
> > Hi All,
> >
> > We had a lot of off-line discussions. As a result I would suggest
> dropping
> > the idea of introducing an end-to-end-latency concept, until
> > we can properly implement it, which will require more designing and
> > experimenting. I would suggest starting with a more manual solution,
> > where the user needs to configure concrete parameters, like
> > `execution.checkpointing.max-interval` or `execution.flush-interval`.
> >
> > FLIP-309 looks good to me, I would just rename
> > `execution.checkpointing.interval-during-backlog` to
> > `execution.checkpointing.max-interval`.
> >
> > I would also reference future work, that a solution that would allow set
> > `isProcessingBacklog` for sources like Kafka will be introduced via
> > FLIP-328 [1].
> >
> > Best,
> > Piotrek
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-328%3A+Allow+source+operators+to+determine+isProcessingBacklog+based+on+watermark+lag
> >
> > śr., 12 lip 2023 o 03:49 Dong Lin  napisał(a):
> >
> > > Hi Piotr,
> > >
> > > I think I understand your motivation for suggeseting
> > > execution.slow-end-to-end-latency now. Please see my followup comments
> > > (after the previous email) inline.
> > >
> > > On Wed, Jul 12, 2023 at 12:32 AM Piotr Nowojski 
> > > wrote:
> > >
> > > > Hi Dong,
> > > >
> > > > Thanks for the updates, a couple of comments:
> > > >
> > > > > If a record is generated by a source when the source's
> > > > isProcessingBacklog is true, or some of the records used to
> > > > > derive this record (by an operator) has isBacklog = true, then this
> > > > record should have isBacklog = true. Otherwise,
> > > > > this record should have isBacklog = false.
> > > >
> > > > nit:
> > > > I think this conflicts with "Rule of thumb for non-source operators
> to
> > > set
> > > > isBacklog = true for the records it emits:"
> > > > section later on, when it comes to a case if an operator has mixed
> > > > isBacklog = false and isBacklog = true inputs.
> > > >
> > > > > execution.checkpointing.interval-during-backlog
> > > >
> > > > Do we need to define this as an interval config parameter? Won't that
> > add
> > > > an option that will be almost instantly deprecated
> > > > because what we actually would like to have is:
> > > > execution.slow-end-to-end-latency and execution.end-to-end-latency
> > > >
> > >
> > > I guess you are suggesting that we should allow users to specify a
> higher
> > > end-to-end latency budget for those records that are emitted by
> two-phase
> > > commit sink, than those records that are emitted by none-two-phase
> commit
> > > sink.
> > >
> > > My concern with this approach is that it will increase the complexity
> of
> > > the definition of "processing latency requirement", as well as the
> > > complexity of the Flink runtime code that handles it. Currently, the
> > > FLIP-325 defines end-to-end latency as an attribute of the records that
> > is
> > > statically assigned when the record is generated at the source,
> > regardless
> > > of how it will be emitted later in the topology. If we make the changes
> > > proposed above, we would need to define the latency requirement w.r.t.
> > the
> > > attribute of the operators that it travels through before its result is
> > > emitted, which is less intuitive and more complex.
> > >
> > > For now, it is not clear whether it is necessary to have two categories
> > of
> > > latency requirement for the same job. Maybe it is reasonable to assume
> > that
> > > if a job has two-phase commit sink and the user is OK to emit some
> > results
> > > at 1 minute interval, then more likely than not the user is also OK to
> > emit
> > > all results at 1 minute interval, include those that 

[jira] [Created] (FLINK-32619) ConfigOptions to support fallback configuration

2023-07-18 Thread Hong Liang Teoh (Jira)
Hong Liang Teoh created FLINK-32619:
---

 Summary: ConfigOptions to support fallback configuration
 Key: FLINK-32619
 URL: https://issues.apache.org/jira/browse/FLINK-32619
 Project: Flink
  Issue Type: Technical Debt
  Components: Runtime / Configuration
Affects Versions: 1.17.1, 1.16.2
Reporter: Hong Liang Teoh


ConfigOptions has no option to specify a "fallback configuration" as the 
default.

 

For example, if we want {{rest.cache.checkpoint-statistics.timeout}} to 
fall back to {{web.refresh-interval}} instead of a static default value, we have to 
specify

 
{code:java}
@Documentation.OverrideDefault("web.refresh-interval")
@Documentation.Section(Documentation.Sections.EXPERT_REST)
public static final ConfigOption<Duration> CACHE_CHECKPOINT_STATISTICS_TIMEOUT =
key("rest.cache.checkpoint-statistics.timeout")
.durationType()
.noDefaultValue()
.withDescription(
"");
 {code}
 

 

The {{.noDefaultValue()}} is misleading as the option actually has a default.

 

We should introduce a {{.fallbackConfiguration()}} that is handled gracefully 
by doc generators.
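
A sketch of how the proposed builder method might be used; the 
{{.fallbackConfiguration()}} method does not exist yet, so the name and 
signature below are illustrative only:

{code:java}
public static final ConfigOption<Duration> CACHE_CHECKPOINT_STATISTICS_TIMEOUT =
        key("rest.cache.checkpoint-statistics.timeout")
                .durationType()
                // Hypothetical: default to the resolved value of web.refresh-interval
                // instead of a static value, so doc generators can render it as a fallback.
                .fallbackConfiguration(WebOptions.REFRESH_INTERVAL)
                .withDescription(
                        "Timeout for the checkpoint statistics cache; falls back to web.refresh-interval.");
{code}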



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS][2.0] FLIP-341: Remove MetricGroup methods accepting an int as a name

2023-07-18 Thread Chesnay Schepler

Good catch; i've fixed the list.

On 18/07/2023 12:20, Xintong Song wrote:

+1 in general.

I think the list of affected public interfaces in the FLIP is not accurate.

- `#counter(int, Counter)` is missed
- `#meter(int)` should be `#meter(int, Meter)`
- `#group(int)` should be `#addGroup(int)`


Best,

Xintong



On Tue, Jul 18, 2023 at 4:39 PM Chesnay Schepler  wrote:


The MetricGroup interface contains methods to create groups and metrics
using an int as a name. The original intention was to allow pattern like
|group.addGroup("subtaskIndex").addGroup(0)| , but this didn't really
work out, with |addGroup(String, String)|  serving this use case much
better.

Metric methods accept an int mostly for consistency, but there's no good
use-case for it.

These methods also offer hardly any convenience since all they do is
save potential users from using |String.valueOf| on one argument. That
doesn't seem valuable enough for something that doubles the size of the
interface.

I propose to remove said methods.



https://cwiki.apache.org/confluence/display/FLINK/FLIP-341%3A+Remove+MetricGroup+methods+accepting+an+int+as+a+name





Re: FLIP-342: Remove brackets around keys returned by MetricGroup#getAllVariables

2023-07-18 Thread Jing Ge
+1

On Tue, Jul 18, 2023 at 12:24 PM Xintong Song  wrote:

> +1
>
> Best,
>
> Xintong
>
>
>
> On Tue, Jul 18, 2023 at 5:02 PM Chesnay Schepler 
> wrote:
>
> > The FLIP number was changed to 342.
> >
> > On 18/07/2023 10:56, Chesnay Schepler wrote:
> > > MetricGroup#getAllVariables returns all variables associated with the
> > > metric, for example:
> > >
> > > | = abcde|
> > > | = ||0|
> > >
> > > The keys are surrounded by brackets for no particular reason.
> > >
> > > In virtually every use-case for this method the user is stripping the
> > > brackets from keys, as done in:
> > >
> > >  * our datadog reporter:
> > >
> >
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DatadogHttpReporter.java#L244
> > > <
> >
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DatadogHttpReporter.java#L244
> > >
> > >  * our prometheus reporter (implicitly via a character filter):
> > >
> >
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/AbstractPrometheusReporter.java#L236
> > > <
> >
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/AbstractPrometheusReporter.java#L236
> > >
> > >  * our JMX reporter:
> > >
> >
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-jmx/src/main/java/org/apache/flink/metrics/jmx/JMXReporter.java#L223
> > > <
> >
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-jmx/src/main/java/org/apache/flink/metrics/jmx/JMXReporter.java#L223
> > >
> > >
> > > I propose to change the method spec and implementation to remove the
> > > brackets around keys.
> > >
> > > For migration purposes it may make sense to add a new method with the
> > > new behavior (|getVariables()|) and deprecate the old method.
> > >
> > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425202
> > >
> >
> >
>


Re: [DISCUSS][2.0] FLIP-340: Remove rescale REST endpoint

2023-07-18 Thread Jing Ge
+1

On Tue, Jul 18, 2023 at 1:05 PM Maximilian Michels  wrote:

> +1
>
> On Tue, Jul 18, 2023 at 12:29 PM Gyula Fóra  wrote:
> >
> > +1
> >
> > On Tue, 18 Jul 2023 at 12:12, Xintong Song 
> wrote:
> >
> > > +1
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > >
> > > On Tue, Jul 18, 2023 at 4:25 PM Chesnay Schepler 
> > > wrote:
> > >
> > > > The endpoint hasn't been working for years and was only kept to
> inform
> > > > users about it. Let's finally remove it.
> > > >
> > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-340%3A+Remove+rescale+REST+endpoint
> > > >
> > > >
> > >
>


Re: [DISCUSS][2.0] FLIP-340: Remove rescale REST endpoint

2023-07-18 Thread Maximilian Michels
+1

On Tue, Jul 18, 2023 at 12:29 PM Gyula Fóra  wrote:
>
> +1
>
> On Tue, 18 Jul 2023 at 12:12, Xintong Song  wrote:
>
> > +1
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Tue, Jul 18, 2023 at 4:25 PM Chesnay Schepler 
> > wrote:
> >
> > > The endpoint hasn't been working for years and was only kept to inform
> > > users about it. Let's finally remove it.
> > >
> > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-340%3A+Remove+rescale+REST+endpoint
> > >
> > >
> >


[jira] [Created] (FLINK-32618) Remove the dependency of the flink-core in the flink-sql-jdbc-driver-bundle

2023-07-18 Thread Shengkai Fang (Jira)
Shengkai Fang created FLINK-32618:
-

 Summary: Remove the dependency of the flink-core in the 
flink-sql-jdbc-driver-bundle
 Key: FLINK-32618
 URL: https://issues.apache.org/jira/browse/FLINK-32618
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / JDBC
Affects Versions: 1.18.0
Reporter: Shengkai Fang
 Fix For: 1.18.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS][2.0] FLIP-340: Remove rescale REST endpoint

2023-07-18 Thread Gyula Fóra
+1

On Tue, 18 Jul 2023 at 12:12, Xintong Song  wrote:

> +1
>
> Best,
>
> Xintong
>
>
>
> On Tue, Jul 18, 2023 at 4:25 PM Chesnay Schepler 
> wrote:
>
> > The endpoint hasn't been working for years and was only kept to inform
> > users about it. Let's finally remove it.
> >
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-340%3A+Remove+rescale+REST+endpoint
> >
> >
>


Re: FLIP-342: Remove brackets around keys returned by MetricGroup#getAllVariables

2023-07-18 Thread Xintong Song
+1

Best,

Xintong



On Tue, Jul 18, 2023 at 5:02 PM Chesnay Schepler  wrote:

> The FLIP number was changed to 342.
>
> On 18/07/2023 10:56, Chesnay Schepler wrote:
> > MetricGroup#getAllVariables returns all variables associated with the
> > metric, for example:
> >
> > | = abcde|
> > | = ||0|
> >
> > The keys are surrounded by brackets for no particular reason.
> >
> > In virtually every use-case for this method the user is stripping the
> > brackets from keys, as done in:
> >
> >  * our datadog reporter:
> >
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DatadogHttpReporter.java#L244
> > <
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DatadogHttpReporter.java#L244
> >
> >  * our prometheus reporter (implicitly via a character filter):
> >
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/AbstractPrometheusReporter.java#L236
> > <
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/AbstractPrometheusReporter.java#L236
> >
> >  * our JMX reporter:
> >
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-jmx/src/main/java/org/apache/flink/metrics/jmx/JMXReporter.java#L223
> > <
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-jmx/src/main/java/org/apache/flink/metrics/jmx/JMXReporter.java#L223
> >
> >
> > I propose to change the method spec and implementation to remove the
> > brackets around keys.
> >
> > For migration purposes it may make sense to add a new method with the
> > new behavior (|getVariables()|) and deprecate the old method.
> >
> >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425202
> >
>
>


Re: [DISCUSS][2.0] FLIP-341: Remove MetricGroup methods accepting an int as a name

2023-07-18 Thread Xintong Song
+1 in general.

I think the list of affected public interfaces in the FLIP is not accurate.

   - `#counter(int, Counter)` is missed
   - `#meter(int)` should be `#meter(int, Meter)`
   - `#group(int)` should be `#addGroup(int)`


Best,

Xintong



On Tue, Jul 18, 2023 at 4:39 PM Chesnay Schepler  wrote:

> The MetricGroup interface contains methods to create groups and metrics
> using an int as a name. The original intention was to allow pattern like
> |group.addGroup("subtaskIndex").addGroup(0)| , but this didn't really
> work out, with |addGroup(String, String)|  serving this use case much
> better.
>
> Metric methods accept an int mostly for consistency, but there's no good
> use-case for it.
>
> These methods also offer hardly any convenience since all they do is
> save potential users from using |String.valueOf| on one argument. That
> doesn't seem valuable enough for something that doubles the size of the
> interface.
>
> I propose to remove said methods.
>
>
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-341%3A+Remove+MetricGroup+methods+accepting+an+int+as+a+name
>


[jira] [Created] (FLINK-32617) FlinkResultSetMetaData throw exception for most methods

2023-07-18 Thread Shengkai Fang (Jira)
Shengkai Fang created FLINK-32617:
-

 Summary: FlinkResultSetMetaData throw exception for most methods
 Key: FLINK-32617
 URL: https://issues.apache.org/jira/browse/FLINK-32617
 Project: Flink
  Issue Type: Improvement
Reporter: Shengkai Fang


I think most methods, e.g.

 

```

boolean supportsMultipleResultSets() throws SQLException;

boolean supportsMultipleTransactions() throws SQLException;

boolean supportsMinimumSQLGrammar() throws SQLException;

```

We can just return true or false for these instead of throwing an exception.
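
For example, a minimal sketch of returning fixed capability flags instead of throwing
(this is not the actual driver code, and the concrete true/false values below are
illustrative only, not a statement about what the driver supports):

```
import java.sql.SQLException;

class CapabilityFlagsSketch {

    public boolean supportsMultipleResultSets() throws SQLException {
        return false; // illustrative value
    }

    public boolean supportsMultipleTransactions() throws SQLException {
        return false; // illustrative value
    }

    public boolean supportsMinimumSQLGrammar() throws SQLException {
        return true; // illustrative value
    }
}
```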

 

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS][2.0] FLIP-340: Remove rescale REST endpoint

2023-07-18 Thread Xintong Song
+1

Best,

Xintong



On Tue, Jul 18, 2023 at 4:25 PM Chesnay Schepler  wrote:

> The endpoint hasn't been working for years and was only kept to inform
> users about it. Let's finally remove it.
>
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-340%3A+Remove+rescale+REST+endpoint
>
>


[jira] [Created] (FLINK-32616) FlinkStatement#executeQuery resource leaks when the input sql is not query

2023-07-18 Thread Shengkai Fang (Jira)
Shengkai Fang created FLINK-32616:
-

 Summary: FlinkStatement#executeQuery resource leaks when the input 
sql is not query
 Key: FLINK-32616
 URL: https://issues.apache.org/jira/browse/FLINK-32616
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / JDBC
Affects Versions: 1.18.0
Reporter: Shengkai Fang
 Fix For: 1.18.0


The current implementation just throws an exception if the input SQL is not a 
query. No one is responsible for closing the StatementResult.
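
A minimal sketch of the intended cleanup (the names below are placeholders, not the
actual driver internals): whoever detects that the submitted statement is not a query
should close the gateway-side result before surfacing the error.

{code:java}
import java.sql.SQLException;

class ExecuteQuerySketch {

    // Close the gateway-side result before surfacing the error, instead of leaking it.
    static void failNonQuery(AutoCloseable gatewayResult, String sql) throws SQLException {
        try {
            gatewayResult.close();
        } catch (Exception e) {
            // suppressed; the primary error is the misuse of executeQuery()
        }
        throw new SQLException("executeQuery() was called with a non-query statement: " + sql);
    }
}
{code}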

 

BTW, the current implementation just submits the SQL to the gateway, which means 
the SQL is executed. I just wonder whether we need to expose this feature to the 
users?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS][2.0] FLIP-337: Remove JarRequestBody#programArgs

2023-07-18 Thread Jing Ge
got it thanks!

For @Deprecated, I meant to enforce usage like: @Deprecated(since = "1.18",
forRemoval = true)

Best regards,
Jing

On Tue, Jul 18, 2023 at 11:06 AM Hong Teoh  wrote:

> +1 to this. Nice to simplify the REST API!
>
>
> Regards,
> Hong
>
>
> > On 18 Jul 2023, at 10:00, Chesnay Schepler  wrote:
> >
> > Something to note is that the UI is using this parameter, and would have
> to be changed to the new one.
> >
> > Since we want to avoid having to split arguments ourselves, this may
> imply changes to the UI.
> >
> > On 18/07/2023 10:18, Chesnay Schepler wrote:
> >> We'll log a warn message when it is used and maybe hide it from the
> docs.
> >>
> >> Archunit rule doesn't really work here because it's not annotated with
> stability annotations (as it shouldn't since the classes aren't really
> user-facing).
> >>
> >> On 17/07/2023 21:56, Jing Ge wrote:
> >>> Hi Chesnay,
> >>>
> >>> I am trying to understand what is the right removal process with this
> >>> concrete example. Given all things about the programArgs are private or
> >>> package private except the constructor. Will you just mark it as
> deprecated
> >>> with constructor overloading in 1.18 and remove it in 2.0? Should we
> >>> describe the deprecation work in the FLIP?
> >>>
> >>> Another more general question, maybe offtrack, I don't know which
> thread is
> >>> the right place to ask, since Java 11 has been recommended, should we
> >>> always include "since" and "forRemoval" while adding @Deprecated, i.e.
> >>> ArchUnit rule?
> >>>
> >>> Best regards,
> >>> Jing
> >>>
> >>> On Mon, Jul 17, 2023 at 5:33 AM Xintong Song 
> wrote:
> >>>
>  +1
> 
>  Best,
> 
>  Xintong
> 
> 
> 
>  On Thu, Jul 13, 2023 at 9:34 PM Chesnay Schepler 
>  wrote:
> 
> > Hello,
> >
> > The request body for the jar run/plan REST endpoints accepts program
> > arguments as a string (programArgs) or a list of strings
> > (programArgsList). The latter was introduced as we kept running into
> issues
> > with splitting the string into individual arguments.
> >
> > We ideally force users to use the list argument, and we can simplify
> the
> > codebase if there'd only be 1 way to pass arguments.
> >
> > As such I propose to remove the programArgs field from the request
> body.
> >
> >
> 
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263424796
> >
> > Regards,
> >
> > Chesnay
> >
> >>
> >
>
>


[jira] [Created] (FLINK-32615) Cache AutoscalerInfo for each resource

2023-07-18 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-32615:
--

 Summary: Cache AutoscalerInfo for each resource
 Key: FLINK-32615
 URL: https://issues.apache.org/jira/browse/FLINK-32615
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Gyula Fora
Assignee: Gyula Fora


Currently we always get the autoscaler info ConfigMap through the Kubernetes 
REST API. This is a very heavy and unnecessary operation, as the ConfigMap's lifecycle is 
directly tied to the resource. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re:Re: [DISCUSS] Release 2.0 Work Items

2023-07-18 Thread Wencong Liu
Hi Chesnay,
Thanks for the reply. I think it is reasonable to remove the configuration 
argument
in AbstractUdfStreamOperator#open if it is consistently empty. I'll propose a 
discussion about the specific actions in FLINK-6912 at a later time.


Best,
Wencong Liu











At 2023-07-18 16:38:59, "Chesnay Schepler"  wrote:
>On 18/07/2023 10:33, Wencong Liu wrote:
>> For FLINK-6912:
>>
>>  There are three implementations of RichFunction that actually use
>> the Configuration parameter in RichFunction#open:
>>  1. ContinuousFileMonitoringFunction#open: It uses the configuration
>> to configure the FileInputFormat. [1]
>>  2. OutputFormatSinkFunction#open: It uses the configuration
>> to configure the OutputFormat. [2]
>>  3. InputFormatSourceFunction#open: It uses the configuration
>>   to configure the InputFormat. [3]
>
>And none of them should have any effect since the configuration is empty.
>
>See org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator#open.


Re: [DISCUSS][2.0] FLIP-337: Remove JarRequestBody#programArgs

2023-07-18 Thread Hong Teoh
+1 to this. Nice to simplify the REST API!


Regards,
Hong


> On 18 Jul 2023, at 10:00, Chesnay Schepler  wrote:
> 
> Something to note is that the UI is using this parameter, and would have to 
> be changed to the new one.
> 
> Since we want to avoid having to split arguments ourselves, this may imply 
> changes to the UI.
> 
> On 18/07/2023 10:18, Chesnay Schepler wrote:
>> We'll log a warn message when it is used and maybe hide it from the docs.
>> 
>> Archunit rule doesn't really work here because it's not annotated with 
>> stability annotations (as it shouldn't since the classes aren't really 
>> user-facing).
>> 
>> On 17/07/2023 21:56, Jing Ge wrote:
>>> Hi Chesnay,
>>> 
>>> I am trying to understand what is the right removal process with this
>>> concrete example. Given all things about the programArgs are private or
>>> package private except the constructor. Will you just mark it as deprecated
>>> with constructor overloading in 1.18 and remove it in 2.0? Should we
>>> describe the deprecation work in the FLIP?
>>> 
>>> Another more general question, maybe offtrack, I don't know which thread is
>>> the right place to ask, since Java 11 has been recommended, should we
>>> always include "since" and "forRemoval" while adding @Deprecated, i.e.
>>> ArchUnit rule?
>>> 
>>> Best regards,
>>> Jing
>>> 
>>> On Mon, Jul 17, 2023 at 5:33 AM Xintong Song  wrote:
>>> 
 +1
 
 Best,
 
 Xintong
 
 
 
 On Thu, Jul 13, 2023 at 9:34 PM Chesnay Schepler 
 wrote:
 
> Hello,
> 
> The request body for the jar run/plan REST endpoints accepts program
> arguments as a string (programArgs) or a list of strings
> (programArgsList). The latter was introduced as we kept running into issues
> with splitting the string into individual arguments.
> 
> We ideally force users to use the list argument, and we can simplify the
> codebase if there'd only be 1 way to pass arguments.
> 
> As such I propose to remove the programArgs field from the request body.
> 
> 
 https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263424796 
> 
> Regards,
> 
> Chesnay
> 
>> 
> 



Re: FLIP-342: Remove brackets around keys returned by MetricGroup#getAllVariables

2023-07-18 Thread Chesnay Schepler

The FLIP number was changed to 342.

On 18/07/2023 10:56, Chesnay Schepler wrote:
MetricGroup#getAllVariables returns all variables associated with the 
metric, for example:


| = abcde|
| = ||0|

The keys are surrounded by brackets for no particular reason.

In virtually every use-case for this method the user is stripping the 
brackets from keys, as done in:


 * our datadog reporter:
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DatadogHttpReporter.java#L244

 * our prometheus reporter (implicitly via a character filter):
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/AbstractPrometheusReporter.java#L236

 * our JMX reporter:
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-jmx/src/main/java/org/apache/flink/metrics/jmx/JMXReporter.java#L223


I propose to change the method spec and implementation to remove the 
brackets around keys.


For migration purposes it may make sense to add a new method with the 
new behavior (|getVariables()|) and deprecate the old method.



https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425202 





Re: [DISCUSS][2.0] FLIP-337: Remove JarRequestBody#programArgs

2023-07-18 Thread Chesnay Schepler
Something to note is that the UI is using this parameter, and would have 
to be changed to the new one.


Since we want to avoid having to split arguments ourselves, this may 
imply changes to the UI.


On 18/07/2023 10:18, Chesnay Schepler wrote:

We'll log a warn message when it is used and maybe hide it from the docs.

Archunit rule doesn't really work here because it's not annotated with 
stability annotations (as it shouldn't since the classes aren't really 
user-facing).


On 17/07/2023 21:56, Jing Ge wrote:

Hi Chesnay,

I am trying to understand what is the right removal process with this
concrete example. Given all things about the programArgs are private or
package private except the constructor. Will you just mark it as 
deprecated

with constructor overloading in 1.18 and remove it in 2.0? Should we
describe the deprecation work in the FLIP?

Another more general question, maybe offtrack, I don't know which 
thread is

the right place to ask, since Java 11 has been recommended, should we
always include "since" and "forRemoval" while adding @Deprecated, i.e.
ArchUnit rule?

Best regards,
Jing

On Mon, Jul 17, 2023 at 5:33 AM Xintong Song  
wrote:



+1

Best,

Xintong



On Thu, Jul 13, 2023 at 9:34 PM Chesnay Schepler 
wrote:


Hello,

The request body for the jar run/plan REST endpoints accepts program
arguments as a string (programArgs) or a list of strings
(programArgsList). The latter was introduced as we kept running into issues
with splitting the string into individual arguments.

We ideally force users to use the list argument, and we can 
simplify the

codebase if there'd only be 1 way to pass arguments.

As such I propose to remove the programArgs field from the request 
body.



https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263424796 



Regards,

Chesnay







FLIP-401: Remove brackets around keys returned by MetricGroup#getAllVariables

2023-07-18 Thread Chesnay Schepler
MetricGroup#getAllVariables returns all variables associated with the 
metric, for example:


| = abcde|
| = ||0|

The keys are surrounded by brackets for no particular reason.

In virtually every use-case for this method the user is stripping the 
brackets from keys, as done in:


 * our datadog reporter:
   
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DatadogHttpReporter.java#L244
   

 * our prometheus reporter (implicitly via a character filter):
   
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/AbstractPrometheusReporter.java#L236
   

 * our JMX reporter:
   
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-jmx/src/main/java/org/apache/flink/metrics/jmx/JMXReporter.java#L223
   


I propose to change the method spec and implementation to remove the 
brackets around keys.
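
For illustration, here is a small sketch of the stripping that callers have to do
today (not taken verbatim from any of the reporters above; the "<host>" key in the
comment is an illustrative example), which the proposed change would make unnecessary:

```
import java.util.HashMap;
import java.util.Map;

import org.apache.flink.metrics.MetricGroup;

class VariableStrippingSketch {

    // Turns a key such as "<host>" into "host"; with the proposed getVariables()
    // returning bracket-free keys, this loop disappears from every reporter.
    static Map<String, String> stripBrackets(MetricGroup group) {
        Map<String, String> stripped = new HashMap<>();
        for (Map.Entry<String, String> entry : group.getAllVariables().entrySet()) {
            String key = entry.getKey();
            stripped.put(key.substring(1, key.length() - 1), entry.getValue());
        }
        return stripped;
    }
}
```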


For migration purposes it may make sense to add a new method with the 
new behavior (|getVariables()|) and deprecate the old method.



https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425202

Re: [DISCUSS] Release 2.0 Work Items

2023-07-18 Thread Chesnay Schepler

On 18/07/2023 10:33, Wencong Liu wrote:

For FLINK-6912:

 There are three implementations of RichFunction that actually use
the Configuration parameter in RichFunction#open:
 1. ContinuousFileMonitoringFunction#open: It uses the configuration
to configure the FileInputFormat. [1]
 2. OutputFormatSinkFunction#open: It uses the configuration
to configure the OutputFormat. [2]
 3. InputFormatSourceFunction#open: It uses the configuration
  to configure the InputFormat. [3]


And none of them should have any effect since the configuration is empty.

See org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator#open.
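
To make that concrete, here is a simplified sketch (not the literal operator code) of
why the argument is effectively dead: the user function is always opened with a fresh,
empty Configuration.

```
import org.apache.flink.api.common.functions.RichFunction;
import org.apache.flink.api.common.functions.util.FunctionUtils;
import org.apache.flink.configuration.Configuration;

class OpenWithEmptyConfigSketch {

    // Roughly what AbstractUdfStreamOperator#open does for the wrapped user function:
    // whatever the three implementations read from the Configuration, they read from {}.
    static void open(RichFunction userFunction) throws Exception {
        FunctionUtils.openFunction(userFunction, new Configuration());
    }
}
```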


[DISCUSS][2.0] FLIP-341: Remove MetricGroup methods accepting an int as a name

2023-07-18 Thread Chesnay Schepler
The MetricGroup interface contains methods to create groups and metrics 
using an int as a name. The original intention was to allow pattern like 
|group.addGroup("subtaskIndex").addGroup(0)| , but this didn't really 
work out, with |addGroup(String, String)|  serving this use case much 
better.


Metric methods accept an int mostly for consistency, but there's no good 
use-case for it.


These methods also offer hardly any convenience since all they do is 
save potential users from using |String.valueOf| on one argument. That 
doesn't seem valuable enough for something that doubles the size of the 
interface.
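
For illustration (a sketch with made-up group/metric names), the int overloads only
spare callers a String.valueOf, so the calls below are interchangeable:

```
import org.apache.flink.metrics.MetricGroup;

class IntOverloadSketch {

    static void register(MetricGroup metrics) {
        // With the int overload that this FLIP removes:
        metrics.addGroup("subtaskIndex").addGroup(0).counter("numRecords");

        // Equivalent without it:
        metrics.addGroup("subtaskIndex").addGroup(String.valueOf(0)).counter("numRecords");

        // Or the key/value variant that serves this use case better:
        metrics.addGroup("subtaskIndex", "0").counter("numRecords");
    }
}
```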


I propose to remove said methods.


https://cwiki.apache.org/confluence/display/FLINK/FLIP-341%3A+Remove+MetricGroup+methods+accepting+an+int+as+a+name 


Re:Re: [DISCUSS] Release 2.0 Work Items

2023-07-18 Thread Wencong Liu
Thanks Xintong Song and Matthias for the insightful discussion!


I have double-checked the jira tickets that belong to the 
"Need action in 1.18" section and have some inputs to share.

For FLINK-4675:

The argument StreamExecutionEnvironment in 
WindowAssigner.getDefaultTrigger() 
is not used by any implementation of WindowAssigner and is no longer needed.

For FLINK-6912:

There are three implementations of RichFunction that actually use 
the Configuration parameter in RichFunction#open:
1. ContinuousFileMonitoringFunction#open: It uses the configuration 
to configure the FileInputFormat. [1]
2. OutputFormatSinkFunction#open: It uses the configuration 
to configure the OutputFormat. [2]
3. InputFormatSourceFunction#open: It uses the configuration
 to configure the InputFormat. [3]
I think RichFunction#open should still take a Configuration 
instance as an argument.

For FLINK-5336:

There are three classes that de/serialize the Path through the IOReadableWritable 
interface:
1. FileSourceSplitSerializer: It de/serializes the Path during the process 
of de/serializing FileSourceSplit. [4]
2. TestManagedSinkCommittableSerializer: It de/serializes the Path during 
the process of de/serializing TestManagedCommittable. [5]
3. TestManagedFileSourceSplitSerializer: It de/serializes the Path during 
the process of de/serializing TestManagedIterableSourceSplit. [6]
I think the Path should still implement the IOReadableWritable interface.


I plan to propose a discussion about removing the argument in FLINK-4675 and 
comment the conclusions on FLINK-6912 and FLINK-5336, WDYT?

[1] 
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/ContinuousFileMonitoringFunction.java#L199
[2] 
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/sink/OutputFormatSinkFunction.java#L63
[3] 
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/InputFormatSourceFunction.java#L64C2-L64C2
[4] 
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/FileSourceSplitSerializer.java#L67

[5] 
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-table/flink-table-common/src/test/java/org/apache/flink/table/connector/sink/TestManagedSinkCommittableSerializer.java#L113
[6] 
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-table/flink-table-common/src/test/java/org/apache/flink/table/connector/source/TestManagedFileSourceSplitSerializer.java#L56

















At 2023-07-17 12:23:51, "Xintong Song"  wrote:
>Hi Matthias,
>
>How's it going with the summary of existing 2.0.0 jira tickets?
>
>I have gone through everything listed under FLINK-3957[1], and will
>continue with other Jira tickets whose fix-version is 2.0.0.
>
>Here are my 2-cents on the FLINK-3975 subtasks. Hope this helps on your
>summary.
>
>I'd suggest going ahead with the following tickets.
>
>   - Need action in 1.18
>  - FLINK-4675: Double-check whether the argument is indeed not used.
>  Introduce a new non-argument API, and mark the original one as
>  `@Deprecated`. FLIP needed.
>  - FLINK-6912: Double-check whether the argument is indeed not used.
>  Introduce a new non-argument API, and mark the original one as
>  `@Deprecated`. FLIP needed.
>  - FLINK-5336: Double-check whether `IOReadableWritable` is indeed not
>  needed for `Path`. Mark methods from `IOReadableWritable` as
>`@Deprecated`
>  in `Path`. FLIP needed.
>   - Need no action in 1.18
>  - FLINK-4602/14068: Already listed in the release 2.0 wiki [2]
>  - FLINK-3986/3991/3992/4367/5130/7691: Subsumed by "Deprecated
>  methods/fields/classes in DataStream" in the release 2.0 wiki [2]
>  - FLINK-6375: Change the hashCode behavior of `LongValue` (and other
>  numeric types).
>
>I'd suggest not doing the following tickets.
>
>   - FLINK-4147/4330/9529/14658: These changes are non-trivial for both
>   developers and users. Also, we are taking them into consideration designing
>   the new ProcessFunction API. I'd be in favor of letting users migrate to
>   the ProcessFunction API directly once it's ready, rather than forcing users
>   to adapt to the breaking changes twice.
>   - FLINK-3610: Only affects Scala API, which will soon be removed.
>
>I don't have strong opinions on whether to work on the following tickets or
>not. Some of them are not very clear to me based on the description and
>conversation on the ticket, others may require further investigation and
>evaluation to decide. Unless someone volunteers to look into them, I'd be
>slightly 

[DISCUSS][2.0] FLIP-340: Remove rescale REST endpoint

2023-07-18 Thread Chesnay Schepler
The endpoint hasn't been working for years and was only kept to inform 
users about it. Let's finally remove it.


https://cwiki.apache.org/confluence/display/FLINK/FLIP-340%3A+Remove+rescale+REST+endpoint 



Re: [DISCUSS][2.0] FLIP-337: Remove JarRequestBody#programArgs

2023-07-18 Thread Chesnay Schepler

We'll log a warn message when it is used and maybe hide it from the docs.

Archunit rule doesn't really work here because it's not annotated with 
stability annotations (as it shouldn't since the classes aren't really 
user-facing).


On 17/07/2023 21:56, Jing Ge wrote:

Hi Chesnay,

I am trying to understand what is the right removal process with this
concrete example. Given all things about the programArgs are private or
package private except the constructor. Will you just mark it as deprecated
with constructor overloading in 1.18 and remove it in 2.0?  Should we
describe the deprecation work in the FLIP?

Another more general question, maybe offtrack, I don't know which thread is
the right place to ask, since Java 11 has been recommended, should we
always include "since" and "forRemoval" while adding @Deprecated, i.e.
ArchUnit rule?

Best regards,
Jing

On Mon, Jul 17, 2023 at 5:33 AM Xintong Song  wrote:


+1

Best,

Xintong



On Thu, Jul 13, 2023 at 9:34 PM Chesnay Schepler 
wrote:


Hello,

The request body for the jar run/plan REST endpoints accepts program
arguments as a string (programArgs) or a list of strings
(programArgsList). The latter was introduced as we kept running into issues
with splitting the string into individual arguments.

We ideally force users to use the list argument, and we can simplify the
codebase if there'd only be 1 way to pass arguments.

As such I propose to remove the programArgs field from the request body.



https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263424796


Regards,

Chesnay





[jira] [Created] (FLINK-32614) avro mappings aren't always named as pojos

2023-07-18 Thread Martin Sillence (Jira)
Martin Sillence created FLINK-32614:
---

 Summary: avro mappings aren't always named as pojos
 Key: FLINK-32614
 URL: https://issues.apache.org/jira/browse/FLINK-32614
 Project: Flink
  Issue Type: Bug
Reporter: Martin Sillence


Debezium with the flatten SMT:

    "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",

Will create Avro with the field:

    {
      "default": null,
      "name": "__deleted",
      "type": [
        "null",
        "string"
      ]
    }

This has the expected field:

  private java.lang.String __deleted;

and constructor, but the getter and setter are named:

  public java.lang.String getDeleted$1() {
    return __deleted;
  }

  public void setDeleted$1(java.lang.String value) {
    this.__deleted = value;
  }

Trying to use this generated class throws:

Exception in thread "main" java.lang.IllegalStateException: Expecting type to 
be a PojoTypeInfo
    at 
org.apache.flink.formats.avro.typeutils.AvroTypeInfo.generateFieldsFromAvroSchema(AvroTypeInfo.java:72)
    at 
org.apache.flink.formats.avro.typeutils.AvroTypeInfo.<init>(AvroTypeInfo.java:55)
    at 
org.apache.flink.formats.avro.utils.AvroKryoSerializerUtils.createAvroTypeInfo(AvroKryoSerializerUtils.java:87)
    at 
org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1939)
    at 
org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1840)
    at 
org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoWithTypeHierarchy(TypeExtractor.java:982)
    at 
org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:802)
    at 
org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfo(TypeExtractor.java:749)
    at 
org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfo(TypeExtractor.java:745)
    at 
org.apache.flink.api.common.typeinfo.TypeInformation.of(TypeInformation.java:210)
    at com.fnz.flink.AvroDeserialization.(AvroDeserialization.java:22)
    at com.fnz.flink.DataStreamJob.main(DataStreamJob.java:90)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

2023-07-18 Thread Jing Ge
+1(binding)

Best regards,
Jing

On Tue, Jul 18, 2023 at 8:31 AM Rui Fan <1996fan...@gmail.com> wrote:

> +1(binding)
>
> Best,
> Rui Fan
>
>
> On Tue, Jul 18, 2023 at 12:04 PM Dong Lin  wrote:
>
> > Hi all,
> >
> > We would like to start the vote for FLIP-309: Support using larger
> > checkpointing interval when source is processing backlog [1]. This FLIP
> was
> > discussed in this thread [2].
> >
> > The vote will be open until at least July 21st (at least 72 hours),
> > following
> > the consensus voting process.
> >
> > Cheers,
> > Yunfeng and Dong
> >
> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-309
> >
> >
> %3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog
> > [2] https://lists.apache.org/thread/l1l7f30h7zldjp6ow97y70dcthx7tl37
> >
>


Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

2023-07-18 Thread Rui Fan
+1(binding)

Best,
Rui Fan


On Tue, Jul 18, 2023 at 12:04 PM Dong Lin  wrote:

> Hi all,
>
> We would like to start the vote for FLIP-309: Support using larger
> checkpointing interval when source is processing backlog [1]. This FLIP was
> discussed in this thread [2].
>
> The vote will be open until at least July 21st (at least 72 hours),
> following
> the consensus voting process.
>
> Cheers,
> Yunfeng and Dong
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-309
>
> %3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog
> [2] https://lists.apache.org/thread/l1l7f30h7zldjp6ow97y70dcthx7tl37
>


Re: [VOTE] Release 2.0 must-have work items

2023-07-18 Thread Xintong Song
Let me try to summarize the proposed changes on the list.

   - "Eager State Declaration" should be nice-to-have
   - "Remove SourceFunction / SinkFunction / SinkV1" should be changed to
   "Remove SinkV1", and remain must-to-have
   - "Remove Queryable State" should be nice-to-have. It will be deprecated
   in 1.18, but the hard removal requires community discussion and vote, which
   may or may not happen in 2.0.
   - "Refactor the API modules" should be TBD.
   - I also noticed Zhenqiu Huang has already changed "Drop YARN-specific
   mutating GET REST endpoints" from TBD to must-have, which I'd like to bring
   up here for attention.

I'd leave this discussion open for the next couple of days. If there are no
objections, I'll update the list and start another round of voting.

In addition, I'd like to cross-post from the other thread [1] that:

I'm not aware of any decision that has already been made by the community
> regarding after which 1.x minor release we will ship the 2.0 major release.



> I also don't think we should push all the deprecation works in 1.18.
>


> Deciding to deprecate / remove an API definitely deserves thorough
> discussions and FLIP, which takes time, and I don't think we should
> compromise that for any reason.



> Assuming at some point we want to ship the 2.0 release, and there are some
> deprecated APIs that we want to remove but have not fulfilled the migration
> period required by FLIP-321, I see 3 options to move forward.

1. Not removing the deprecated APIs in 2.0, carrying them until 3.0.

2. Postpone the 2.0 release for another minor release.

3. Considering such APIs as exceptions of FLIP-321.



> Trying to deprecate things early is still helpful, because it reduces the
> number of APIs we need to consider when deciding between the options.
> However, that must not come at the price of rushed decisions. I'd suggest
> developers design / discuss / vote the breaking changes at their own pace,
> and we evaluate the status and choose between the options later (maybe
> around the time releasing 1.19).
>

Best,

Xintong


[1] https://lists.apache.org/thread/m0b1v2htd0l7oqo6ypf8lnjyjo817bmm



On Mon, Jul 17, 2023 at 6:33 PM Yuan Mei  wrote:

> Hey Yun and Xintong,
>
> (Had a quick offline discussion with Yun)
> 1. I agree the current implementation of the queryable state is not a
> blocker of anything related to disaggregated state management. They are
> different things.
> 2. On the other hand, "queryable snapshot" is not a completely equivalent
> substitution of "queryable state".
> 3. But in whatever way, I think the way how "queryable state" is designed
> is not the right way to move forward.
> 4. "Deprecating queryable state" is put as a must-have because this topic
> has been raised many times along the way. It seems to reach an agreement
> every time, as mentioned by Xintong, but no one really takes action.
>
> I am suggesting:
>
> 1. Remove "Deprecating queryable state" from the must-have list (since it
> does not meet the requirements of "must-have")
> 2. But I am still hoping we can move things forward, so let's put
> the @Deprecated annotation on it
> 3. Removal of the code follows a formal community discussion and vote.
>
> Best
> Yuan
>
>
>
>
> On Mon, Jul 17, 2023 at 3:40 PM Xintong Song 
> wrote:
>
> > Thanks for the clarification.
> >
> > If the list of "Remove deprecated APIs" means, we must remove the code in
> > > Flink-2.0 initial release, I would vote -1 for queryable state before
> we
> > > get an alternative.
> >
> >
> > FYI, the removal of queryable state is currently marked as the
> `must-have`
> > priority.  Of course it's not a final decision and that's exactly why we
> > are collecting feedback about the list now.
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Mon, Jul 17, 2023 at 3:15 PM Yun Tang  wrote:
> >
> > > Hi Xintong,
> > >
> > > If the current implementation of queryable state would not block the
> > > implementation of disaggregated state-backends,
> > > I prefer not to remove the implementation until we have a better
> > > solution (maybe based on the queryable snapshot) cc @Yuan.
> > >
> > > If the list of "Remove deprecated APIs" means, we must remove the code
> in
> > > Flink-2.0 initial release, I would vote -1 for queryable state before
> we
> > > get an alternative.
> > > And I will raise the concern in the Flink roadmap discussion.
> > >
> > >
> > > Best
> > > Yun Tang
> > > 
> > > From: Xintong Song 
> > > Sent: Monday, July 17, 2023 10:07
> > > To: dev@flink.apache.org 
> > > Subject: Re: [VOTE] Release 2.0 must-have work items
> > >
> > > @Yun,
> > > I see your point that the ability queryable state is trying to provide is
> > > meaningful but the current implementation of the feature is
> problematic.
> > So
> > > what's your opinion on deprecating the current queryable state? Do you
> > > think we need to wait until there is a new implementation of queryable
> > > state to remove