Re: [DISCUSS] Graduate the FileSink to @PublicEvolving

2023-06-22 Thread Jing Ge
Hi Galen,

Thanks for the hint, which helps us get a clear big picture.
Afaiac, this will not be a blocking issue for the graduation. There will
always be some (potential) bugs in the implementation. The API has been very
stable since 2020, so the timing is good to graduate. WDYT?
Furthermore, I'd like to hear more opinions. All opinions together will
help the community build a mature API graduation process.

Best regards,
Jing

On Tue, Jun 20, 2023 at 12:48 PM Galen Warren
 wrote:

> Is this issue still unresolved?
>
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/FLINK-30238
>
> Based on prior discussion, I believe this could lead to data loss with
> FileSink.
>
>
>
> On Tue, Jun 20, 2023, 5:41 AM Jing Ge  wrote:
>
> > Hi all,
> >
> > The FileSink has been marked as @Experimental[1] since Oct. 2020.
> > According to FLIP-197[2], I would like to propose graduating it
> > to @PublicEvolving in the upcoming 1.18 release.
> >
> > On the other hand, as a related topic, FileSource was marked
> > as @PublicEvolving[3] 3 years ago. It deserves a graduation discussion too.
> > To keep this discussion lean and efficient, let's focus on FileSink in this
> > thread. There will be another discussion thread for the FileSource.
> >
> > I was wondering if anyone might have any concerns. Looking forward to
> > hearing from you.
> >
> >
> > Best regards,
> > Jing
> >
> > [1]
> >
> >
> https://github.com/apache/flink/blob/4006de973525c5284e9bc8fa6196ab7624189261/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/sink/FileSink.java#L129
> > [2]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > [3]
> >
> >
> https://github.com/apache/flink/blob/4006de973525c5284e9bc8fa6196ab7624189261/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/FileSource.java#L95
> >
>


[jira] [Created] (FLINK-32418) ClassNotFoundException when using flink-protobuf with sql-client

2023-06-22 Thread Michael Kreis (Jira)
Michael Kreis created FLINK-32418:
-

 Summary: ClassNotFoundException when using flink-protobuf with 
sql-client
 Key: FLINK-32418
 URL: https://issues.apache.org/jira/browse/FLINK-32418
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
SQL / Client
Affects Versions: 1.16.2
Reporter: Michael Kreis


When the protobuf format of the Kafka connector is used via the sql-client, it 
is not able to load the generated protobuf classes, which are passed either via 
`-j /protobuf-classes.jar` or added in the script via ADD JAR 
'/protobuf-classes.jar'. The SHOW JARS command prints that the jar is loaded, 
but when the protobuf classes are loaded, a ClassNotFoundException occurs.

executed command:
{code:java}
sql-client.sh -f protobuf-table.sql -j /protobuf-classes.jar
{code}
protobuf-table.sql
{code:sql}
ADD JAR '/opt/sql-client/lib/flink-sql-connector-kafka-1.16.2.jar';
ADD JAR '/opt/sql-client/lib/flink-protobuf-1.16.2.jar';

SHOW JARS;

CREATE TABLE POSITIONS(id BIGINT) WITH (
  'connector' = 'kafka',
  'format' = 'protobuf',
  'topic' = 'protbuf-topic',
  'properties.bootstrap.servers' = 'kafka:9092',
  'properties.group.id' = 'flink-protobuf',
  'properties.security.protocol' = 'SASL_PLAINTEXT',
  'properties.sasl.mechanism' = 'SCRAM-SHA-512',
  'properties.sasl.jaas.config' = 
'org.apache.kafka.common.security.scram.ScramLoginModule required 
username="user" password="";',
  'scan.startup.mode' = 'earliest-offset',
  'protobuf.message-class-name' = 'com.example.protobuf.ProtoMessage',
  'protobuf.ignore-parse-errors' = 'true'
  );

SELECT * FROM POSITIONS;
{code}
exception in the log:
{code:java}
Caused by: java.lang.ClassNotFoundException: com.example.protobuf.ProtoMessage
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown 
Source)
at 
java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown 
Source)
at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Unknown Source)
at 
org.apache.flink.formats.protobuf.util.PbFormatUtils.getDescriptor(PbFormatUtils.java:89)
... 36 more
{code}
This also seems to be somehow related to FLINK-30318.
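
A minimal standalone check (this sketch assumes the jar path and class name from above) can help rule out a packaging problem independently of Flink's classloading:
{code:java}
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

// Looks up the generated protobuf class file in the jar passed via -j / ADD JAR.
// If the class file is missing here, the problem is in the packaging of the jar
// rather than in Flink's classloading.
public class ProtoClassCheck {
    public static void main(String[] args) throws Exception {
        URL jar = new File("/protobuf-classes.jar").toURI().toURL();
        try (URLClassLoader loader = new URLClassLoader(new URL[] {jar}, null)) {
            URL entry = loader.findResource("com/example/protobuf/ProtoMessage.class");
            System.out.println(entry != null ? "Found: " + entry : "Class file missing from jar");
        }
    }
}
{code}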





When does backpressure matter

2023-06-22 Thread Lu Niu
For example, suppose a Flink job reads from Kafka, does some processing, and
writes back to Kafka. Do we need to take any action when the job's Kafka
consumer lag is low or 0, but some tasks have constant backpressure? Do we
need to increase the parallelism or do some network tuning so that
backpressure is constantly 0? If so, would that lead to resource
overprovisioning?

Or do we only need to take action when the Kafka lag keeps increasing while
backpressure is happening at the same time?


Best

Lu


[jira] [Created] (FLINK-32417) DynamicKafkaSource User Documentation

2023-06-22 Thread Mason Chen (Jira)
Mason Chen created FLINK-32417:
--

 Summary: DynamicKafkaSource User Documentation
 Key: FLINK-32417
 URL: https://issues.apache.org/jira/browse/FLINK-32417
 Project: Flink
  Issue Type: Sub-task
Reporter: Mason Chen








[jira] [Created] (FLINK-32416) Initial DynamicKafkaSource Implementation

2023-06-22 Thread Mason Chen (Jira)
Mason Chen created FLINK-32416:
--

 Summary: Initial DynamicKafkaSource Implementation 
 Key: FLINK-32416
 URL: https://issues.apache.org/jira/browse/FLINK-32416
 Project: Flink
  Issue Type: Sub-task
Reporter: Mason Chen








[RESULT][VOTE] FLIP-246: Dynamic Kafka Source (originally Multi Cluster Kafka Source)

2023-06-22 Thread Mason Chen
Hi everyone,

I'm happy to announce that FLIP-246: DynamicKafkaSource [1] has been
approved! According to the voting thread [2], there are 6 approving votes,
5 of which are binding:

- Jing Ge (binding)
- Tzu-Li (Gordon) Tai (binding)
- Martijn Visser (binding)
- Ryan van Huuksloot (non-binding)
- Thomas Weise (binding)
- Rui Fan (binding)

And no disapproving votes. We've also settled on the DynamicKafkaSource
name.

Thanks all for participating! I'm excited to finally contribute this!

Best,
Mason

[1] https://cwiki.apache.org/confluence/x/CBn1D
[2] https://lists.apache.org/thread/nx00y04t9bslp4mq20x1x8h268gr44o3


[DISCUSS] Persistent SQL Gateway

2023-06-22 Thread Ferenc Csaky
Hello devs,

I would like to open a discussion about persistence possibilities for the SQL 
Gateway. At Cloudera, we are happy to see the work already done on this project 
and are looking for ways to utilize it on our platform as well, but it currently 
lacks some features that would be essential in our case, and we could help out 
there.

I am not sure if any thought has gone into gateway persistence specifics already, 
and this feature could be implemented in fundamentally different ways, so I 
think the first step could be to agree on the basics.

First, in my opinion, persistence should be an optional feature of the gateway 
that can be enabled if desired. There are a lot of implementation details, but 
there are some major directions to follow:

- Utilize the Hive catalog: The Hive catalog can already be used to have 
persistent meta-objects, so the crucial thing missing in this case is support 
for other catalogs. Personally, I would not pursue this option, because in my 
opinion it would limit the usability of this feature too much.
- Serialize the session as is: Save the whole session (or its context) [1] as 
is to durable storage, so it can be kept and picked up again.
- Serialize the required elements (catalogs, tables, functions, etc.), not 
necessarily as a whole: The main point here would be to serialize a different 
object, so the persistent data will not be that sensitive to changes of the 
session (or its context). There are numerous factors here, like trying to keep 
the model close to the session itself, so the boilerplate required for the 
mapping can be kept to a minimum, or focusing on saving only what is actually 
necessary, making the persistent storage more portable.
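
To make the third direction a bit more concrete, here is a rough, purely
hypothetical Java sketch (none of these types exist in the gateway today; all
names are made up for illustration) of how a storage-friendly model could be
decoupled from the session itself:

import java.io.Serializable;
import java.util.Optional;

// Hypothetical sketch only; all names are illustrative.
public interface SessionStateStore {

    // Persist a storage-friendly snapshot of the session elements worth keeping
    // (catalogs, tables, functions, ...), keyed by session id.
    void persist(String sessionId, SessionStateSnapshot snapshot);

    // Load a previously persisted snapshot, if one exists, when a session is re-opened.
    Optional<SessionStateSnapshot> load(String sessionId);

    // Versioned model kept deliberately separate from the Session object, so the
    // persisted data is less sensitive to changes of the session (or its context).
    interface SessionStateSnapshot extends Serializable {
        int version();
    }
}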

WDYT?

Cheers,
F

[1] 
https://github.com/apache/flink/blob/master/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/session/Session.java

[jira] [Created] (FLINK-32415) Add maven wrapper to benchmark to avoid maven version issues

2023-06-22 Thread Gabor Somogyi (Jira)
Gabor Somogyi created FLINK-32415:
-

 Summary: Add maven wrapper to benchmark to avoid maven version 
issues
 Key: FLINK-32415
 URL: https://issues.apache.org/jira/browse/FLINK-32415
 Project: Flink
  Issue Type: Improvement
  Components: Benchmarks
Reporter: Gabor Somogyi








Re: [DISCUSS] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-06-22 Thread Lijie Wang
Hi all,

Thanks for all the feedback about this FLIP. If there are no other
questions or concerns, I will start voting tomorrow (Friday, June 23rd).

Best,
Lijie

Jing Ge wrote on Wed, Jun 21, 2023 at 17:14:

> Hi Ron,
>
> Thanks for sharing your thoughts! It makes sense. It would be helpful if
> these references to Hive, PolarDB, etc. could be added to the FLIP.
>
> Best regards,
> Jing
>
> On Tue, Jun 20, 2023 at 5:41 PM liu ron  wrote:
>
> > Hi, Jing
> >
> > The default value for this ratio is a reference to other systems, such as
> > Hive. As long as the Runtime Filter can filter out more than half of the
> > data, we can benefit from it. Of course, normally, as long as we can get
> > the statistics, ndv is present and rowCount is used less often, so I think
> > the formula is valid in most cases. The formula is also borrowed from other
> > systems, such as AliCloud's PolarDB. Your concern is valuable for this
> > FLIP, but currently we don't know how to adjust it reasonably; something
> > too complex may also be hard for users to understand, so I think we should
> > first follow the simple way and optimize gradually afterwards. As a first
> > step, we can verify the reasonableness of the current formula with the
> > TPC-DS cases.
> >
> > Best,
> > Ron
> >
> > Jing Ge wrote on Tue, Jun 20, 2023 at 19:46:
> >
> > > Hi Ron,
> > >
> > > Thanks for the clarification. That answered my questions.
> > >
> > > Regarding the ratio, my gut feeling is that any value less than 0.8 or
> > > 0.9 won't help too much (I might be wrong). I was thinking of adapting
> > > the formula to somehow map the current 0.9-1 range to 0-1, i.e. if the
> > > user configures 0.5, it would be mapped to e.g. 0.95 (or e.g. 0.85; the
> > > real number needs more calculation) for the current formula described in
> > > the FLIP. But I am not sure it is a feasible solution. It deserves more
> > > discussion. Maybe some real performance tests could give us some hints.
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Tue, Jun 20, 2023 at 5:19 AM liu ron  wrote:
> > >
> > > > Hi, Jing
> > > >
> > > > Thanks for your feedback.
> > > >
> > > > > Afaiu, the runtime filter will only be injected when the gap between
> > > > > the build data size and probe data size is big enough. Let's make an
> > > > > extreme example. If the small table (build side) has one row and the
> > > > > large table (probe side) contains tens of billions of rows, this will
> > > > > be the ideal use case for the runtime filter and the improvement will
> > > > > be significant. Is this correct?
> > > >
> > > > Yes, you are right.
> > > >
> > > > > Speaking of the "Conditions of injecting Runtime Filter" in the FLIP,
> > > > > will the values of max-build-data-size and min-prob-data-size depend
> > > > > on the parallelism config? I.e., with the same data-size setting, is
> > > > > it possible to inject or not inject runtime filters by adjusting the
> > > > > parallelism?
> > > >
> > > > First, let me clarify two points. The first is that the RuntimeFilter
> > > > decides whether to inject or not in the optimization phase, but we do
> > > > not currently consider operator parallelism in the SQL optimization
> > > > phase; it is set at the ExecNode level. The second is that in batch
> > > > mode, the default AdaptiveBatchScheduler[1] is now used, which derives
> > > > the parallelism of a downstream operator from the amount of data
> > > > produced by the upstream operator, i.e. the parallelism is determined
> > > > by runtime adaptation. So in the above case, we cannot decide in the
> > > > optimization stage whether to inject the BloomFilter based on
> > > > parallelism.
> > > > A more important point is that the purpose of the Runtime Filter is to
> > > > reduce the amount of data shuffled, and thus the amount of data
> > > > processed by the downstream join operator. Therefore, regardless of the
> > > > parallelism of the probe side, the amount of shuffled data will be
> > > > reduced after inserting the Runtime Filter, which is beneficial to the
> > > > join operator, so whether to insert the RuntimeFilter does not depend
> > > > on the parallelism.
> > > >
> > > > > Does it make sense to reconsider the formula of ratio calculation to
> > > > > help users easily control the filter injection?
> > > >
> > > > Row count is only considered when ndv does not exist. When size uses
> > > > the default value and ndv cannot be obtained, it is true that this
> > > > condition may always hold, but this does not seem to affect anything,
> > > > and the user is also likely to change the value of the size. One
> > > > question: how do you think we should make it easier for users to
> > > > control the filter injection?
> > > >
> > > >
> > > > [1]:
> > > >
> > > >
> > >
> >
> 
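
For context on the technique discussed in the thread above, the following is a
generic, simplified illustration of bloom-filter-based runtime filtering (it is
not FLIP-324's design nor Flink code; all names are made up): the build side of
a join inserts its join keys into a bloom filter, and the probe side drops rows
whose keys cannot possibly match, reducing the data shuffled to the join
operator.

import java.util.BitSet;
import java.util.List;

// Toy sketch of build-side filter creation and probe-side pruning.
public class RuntimeFilterSketch {
    static final int BITS = 1 << 20;

    // Build side: insert every join key with two hash functions.
    static BitSet buildFilter(List<Long> buildKeys) {
        BitSet bits = new BitSet(BITS);
        for (long k : buildKeys) {
            bits.set(hash1(k));
            bits.set(hash2(k));
        }
        return bits;
    }

    // Probe side: rows whose key is definitely absent from the build side are skipped.
    static boolean mightMatch(BitSet filter, long probeKey) {
        return filter.get(hash1(probeKey)) && filter.get(hash2(probeKey));
    }

    static int hash1(long k) { return Long.hashCode(k) & (BITS - 1); }
    static int hash2(long k) { return Long.hashCode(k * 0x9E3779B97F4A7C15L) & (BITS - 1); }

    public static void main(String[] args) {
        BitSet filter = buildFilter(List.of(1L, 2L, 3L));
        System.out.println(mightMatch(filter, 2L));   // true
        System.out.println(mightMatch(filter, 42L));  // almost certainly false -> row filtered out
    }
}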

[RESULT][VOTE]FLIP-308: Support Time Travel

2023-06-22 Thread Feng Jin
Hi everyone,

Happy to announce that FLIP-308[1] has been approved!
According to the vote thread[2], there are 7 approving votes, out of
which 6 are binding:

- yuxia (binding)
- Lincoln Lee (binding)
- Benchao Li (binding)
- Jing Ge (binding)
- Leonard Xu (binding)
- John Roesler (non-binding)
- Yun Tang (binding)

And no disapproving ones.

Thanks all for participating!

Best,
Feng

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-308%3A+Support+Time+Travel
[2] https://lists.apache.org/thread/d8g8dmktgtmso14ljqtz9mwx1dsch8q0


[jira] [Created] (FLINK-32414) Watermark alignment will cause flink jobs to hang forever when any source subtask has no SourceSplit

2023-06-22 Thread Rui Fan (Jira)
Rui Fan created FLINK-32414:
---

 Summary: Watermark alignment will cause flink jobs to hang forever 
when any source subtask has no SourceSplit
 Key: FLINK-32414
 URL: https://issues.apache.org/jira/browse/FLINK-32414
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.17.1, 1.16.2
Reporter: Rui Fan
Assignee: Rui Fan
 Attachments: image-2023-06-22-22-43-59-671.png

Watermark alignment will cause flink jobs to hang forever when any source 
subtask has no SourceSplit.
h1. Root cause:
 # 
[SourceOperator#emitLatestWatermark|https://github.com/apache/flink/blob/274aa0debffaa57926c474f11e36be753b49cbc5/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/SourceOperator.java#L504]
 reports the lastEmittedWatermark to SourceCoordinator
 # If one subtask has no SourceSplit, its lastEmittedWatermark will be 
[Watermark.UNINITIALIZED.getTimestamp()|https://github.com/apache/flink/blob/274aa0debffaa57926c474f11e36be753b49cbc5/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/SourceOperator.java#L149]
 forever, which is Long.MIN_VALUE.
 # The SourceCoordinator combines the watermarks of all subtasks and uses the 
[minimum watermark|https://github.com/apache/flink/blob/274aa0debffaa57926c474f11e36be753b49cbc5/flink-runtime/src/main/java/org/apache/flink/runtime/source/coordinator/SourceCoordinator.java#L644]
 as the aggregated watermark.
 # Long.MIN_VALUE is therefore always the minimum, so maxAllowedWatermark = 
Long.MIN_VALUE + maxAllowedWatermarkDrift, and the [SourceCoordinator will 
announce it to all 
subtasks.|https://github.com/apache/flink/blob/274aa0debffaa57926c474f11e36be753b49cbc5/flink-runtime/src/main/java/org/apache/flink/runtime/source/coordinator/SourceCoordinator.java#L168]
 # The maxAllowedWatermark is very small, so all source subtasks will hang 
forever.
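
A tiny arithmetic sketch of the effect (illustrative only, not Flink code):
{code:java}
// One idle subtask reporting Long.MIN_VALUE drags the aggregated watermark down,
// so every active subtask exceeds maxAllowedWatermark and pauses forever.
public class AlignmentSketch {
    public static void main(String[] args) {
        long driftMs = 1_000L;                    // configured max allowed drift
        long idleSubtask = Long.MIN_VALUE;        // no split: Watermark.UNINITIALIZED
        long activeSubtask = 1_687_000_000_000L;  // subtask that is making progress

        long aggregated = Math.min(idleSubtask, activeSubtask); // = Long.MIN_VALUE
        long maxAllowed = aggregated + driftMs;                 // still close to Long.MIN_VALUE

        System.out.println(activeSubtask > maxAllowed);         // true -> source pauses forever
    }
}
{code}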

h1. How to reproduce?

When the number of Kafka partitions is less than the parallelism of the Kafka source.

Here is a demo: [code 
link|https://github.com/1996fanrui/fanrui-learning/commit/24b707f7805b3a61a70df1c70c26f8e8a16b006b]
 * Kafka partition count is 1
 * The parallelism is 2

 

!image-2023-06-22-22-43-59-671.png!





Re: [VOTE] FLIP-246: Dynamic Kafka Source (originally Multi Cluster Kafka Source)

2023-06-22 Thread Rui Fan
+1 (binding)

+1 for DynamicKafkaSource

Best,
Rui Fan

On Wed, Jun 21, 2023 at 6:57 PM Thomas Weise  wrote:

> +1 (binding)
>
>
> On Mon, Jun 19, 2023 at 8:09 AM Ryan van Huuksloot
>  wrote:
>
> > +1 (non-binding)
> >
> > +1 for DynamicKafkaSource
> >
> > Ryan van Huuksloot
> > Sr. Production Engineer | Streaming Platform
> >
> >
> > On Mon, Jun 19, 2023 at 8:15 AM Martijn Visser  >
> > wrote:
> >
> > > +1 (binding)
> > >
> > > +1 for DynamicKafkaSource
> > >
> > >
> > > On Sat, Jun 17, 2023 at 5:31 AM Tzu-Li (Gordon) Tai <
> tzuli...@apache.org
> > >
> > > wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > +1 for either DynamicKafkaSource or DiscoveringKafkaSource
> > > >
> > > > Cheers,
> > > > Gordon
> > > >
> > > > On Thu, Jun 15, 2023, 10:56 Mason Chen 
> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Thank you to everyone for the feedback on FLIP-246 [1]. Based on
> the
> > > > > discussion thread [2], we have come to a consensus on the design
> and
> > > are
> > > > > ready to take a vote to contribute this to Flink.
> > > > >
> > > > > This voting thread will be open for at least 72 hours (excluding
> > > > weekends,
> > > > > until June 20th 10:00AM PST) unless there is an objection or an
> > > > > insufficient number of votes.
> > > > >
> > > > > (Optional) If you have an opinion on the naming of the connector,
> > > please
> > > > > include it in your vote:
> > > > > 1. DynamicKafkaSource
> > > > > 2. MultiClusterKafkaSource
> > > > > 3. DiscoveringKafkaSource
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=217389320
> > > > > [2]
> https://lists.apache.org/thread/vz7nw5qzvmxwnpktnofc9p13s1dzqm6z
> > > > >
> > > > > Best,
> > > > > Mason
> > > > >
> > > >
> > >
> >
>


[jira] [Created] (FLINK-32413) Add fallback error handler to DefaultLeaderElectionService

2023-06-22 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-32413:
-

 Summary: Add fallback error handler to DefaultLeaderElectionService
 Key: FLINK-32413
 URL: https://issues.apache.org/jira/browse/FLINK-32413
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Matthias Pohl


The FLIP-285 work separated the driver lifecycle from the contender lifecycle. 
Now, a contender can be removed while the driver is still running, and errors 
could be produced on the driver's side. The {{DefaultLeaderElectionService}} 
would try to forward the error to the contender, but with no contender being 
registered, the error would be swallowed.

We should add a fallback error handler for this specific case.
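
For illustration only (these are not Flink's actual classes; all names are simplified stand-ins), the intended behavior could look roughly like this:
{code:java}
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical, self-contained sketch of the fallback idea: errors are forwarded
// to the registered contender if present, otherwise to a fallback handler instead
// of being silently dropped.
public class FallbackErrorHandling {

    private final AtomicReference<Consumer<Throwable>> contenderHandler = new AtomicReference<>();
    private final Consumer<Throwable> fallbackHandler;

    public FallbackErrorHandling(Consumer<Throwable> fallbackHandler) {
        this.fallbackHandler = fallbackHandler;
    }

    void registerContender(Consumer<Throwable> handler) { contenderHandler.set(handler); }

    void removeContender() { contenderHandler.set(null); }

    // Called for errors produced on the driver's side.
    void onDriverError(Throwable t) {
        Consumer<Throwable> handler = contenderHandler.get();
        (handler != null ? handler : fallbackHandler).accept(t);
    }

    public static void main(String[] args) {
        FallbackErrorHandling service =
                new FallbackErrorHandling(t -> System.err.println("fallback handler: " + t));
        service.onDriverError(new RuntimeException("driver error")); // no contender -> fallback
    }
}
{code}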





[jira] [Created] (FLINK-32412) JobID collisions in FlinkSessionJob

2023-06-22 Thread Fabio Wanner (Jira)
Fabio Wanner created FLINK-32412:


 Summary: JobID collisions in FlinkSessionJob
 Key: FLINK-32412
 URL: https://issues.apache.org/jira/browse/FLINK-32412
 Project: Flink
  Issue Type: Bug
  Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.5.0
Reporter: Fabio Wanner


From time to time we see {{JobId}} collisions in our deployments due to the 
low entropy of the generated {{JobId}}. The problem is that, although the 
{{uid}} from the k8s resource is a UUID (we don't know of which version), only 
its {{hashCode}} is used for the {{JobId}}. The {{hashCode}} is an integer, 
thus 32 bits. If we look at the birthday problem, we can expect a collision 
with 50% probability after only about 77,000 random integers.

In reality we seem to see the problem more often, but this could be because the 
{{uid}} might not be completely random, which increases the chances further if 
we only use parts of it.

We propose to at least use the complete 64 bits of the upper part of the 
{{JobId}}, where about 5.1×10{^}9{^} IDs are needed for a collision chance of 
50%. We could even argue that 64 bits for the generation number are most 
probably not needed, and another 32 bits could be spent on the uid to increase 
the entropy of the {{JobId}} even further (this would mean the max generation 
would be 4,294,967,295).
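
A quick back-of-the-envelope check of these numbers (illustrative only), using the usual birthday-bound approximation n ≈ 1.1774·sqrt(2^k) for a ~50% collision probability with k bits of entropy:
{code:java}
// Rough birthday-bound check: IDs needed for a ~50% collision chance with k bits of entropy.
public class BirthdayBound {
    static double idsForHalfCollisionChance(int bits) {
        return 1.1774 * Math.pow(2, bits / 2.0);
    }

    public static void main(String[] args) {
        System.out.printf("32 bits: %.0f%n", idsForHalfCollisionChance(32)); // ~77,000
        System.out.printf("64 bits: %.2e%n", idsForHalfCollisionChance(64)); // ~5.1e9
    }
}
{code}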

Our suggestion for using 64 bits would be:
{code:java}
new JobID(
UUID.fromString(Preconditions.checkNotNull(uid)).getMostSignificantBits(), 
Preconditions.checkNotNull(generation)
);
{code}
Any thoughts on this? I would create a PR once we know how to proceed.





Re: [DISCUSS] FLIP-321: Introduce an API deprecation process

2023-06-22 Thread Becket Qin
Thanks much for the input, John, Stefan and Jing.

I think Xingtong has well summarized the pros and cons of the two options.
Let's collect a few more opinions here and we can move forward with the one
more people prefer.

Thanks,

Jiangjie (Becket) Qin

On Wed, Jun 21, 2023 at 3:20 AM Jing Ge  wrote:

> Hi all,
>
> Thanks Xingtong for the summary. If I could only choose one of the given
> two options, I would go with option 1. I understand that option 2 worked
> great for Kafka. But the bridge release will still confuse users, and my
> gut feeling is that many users will skip 2.0 and wait for 3.0 or even 3.x.
> And since fewer users will use Flink 2.x, the development focus will shift
> to Flink 3.0, even though the current Flink release is 1.17 and we are
> preparing the 2.0 release. That feels weird to me.
>
> TBH, I would not call the change from @Public to @Retired a demotion. The
> purpose of @Retired is to extend the API lifecycle with one more stage,
> like in the real world: people are born, study, graduate, work, and
> retire. Afaiu from the previous discussion, there are two rules we'd like
> to follow simultaneously:
>
> 1. Public APIs can only be changed between major releases.
> 2. A smooth migration phase should be offered to users, i.e. at least 2
> minor releases after APIs are marked as @deprecated. There should be new
> APIs as the replacement.
>
> Agreed, those rules are good for improving user friendliness. The issues we
> discussed arise because we want to fulfill both of them. If we take
> deprecation very seriously, APIs can be marked as @Deprecated only when the
> new APIs that replace them provide all the functionality the deprecated
> APIs have, in the ideal case without critical bugs that might stop users
> from adopting the new APIs. Otherwise the expected "replacement" will not
> happen: users will stick to the deprecated APIs because the new APIs cannot
> be used. For big features, it will take at least 4 minor releases (in the
> ideal case), i.e. 2+ years, to remove deprecated APIs:
> 
> - 1st minor release: build the new APIs as the replacement and wait for
> feedback. It might be difficult to mark the old APIs as deprecated in this
> release, because we are not sure if the new APIs cover 100% of the
> functionality.
> - 2nd minor release (in the lucky case): mark all old APIs as deprecated.
> (I would even suggest having the new APIs released for at least two minor
> releases before marking the old ones as deprecated, to make sure they can
> really replace the old APIs, in case we care more about smooth migration.)
> - 3rd minor release: the migration period.
> - 4th release (in another lucky case, a major release): the deprecated APIs
> can be removed.
>
> The above scenario only works in an ideal case. In reality, it might take
> longer to get the new APIs ready and mark the old APIs deprecated.
> Furthermore, if the 4th release is not a major release, we will have to
> maintain both APIs for many further minor releases. The question is how to
> know the next major release in advance, especially over a 4-minor-release
> period, i.e. more than 2 years in advance. Given that Flink contains many
> modules, it is difficult to ask devs to create a 2-3 year deprecation plan
> for each case. If we want to build major releases at a fast pace, let's say
> every two years, devs must plan any API deprecation right after each major
> release. Afaiac, that is quite difficult.
>
> The major issue is, afaiu, that if we follow rule 2, we have to carry all
> @Public APIs, e.g. DataStream, that are not yet marked as deprecated into
> 2.0. Then we have to follow rule 1 and keep them unchanged until 3.0. That
> is why @Retired is useful: it gives devs more flexibility while still
> fulfilling both rules. Let's check it with an example:
> 
> - We have the @Public DataStream API in 1.18. It will not be marked
> as @Deprecated, because the new APIs that replace it are not ready.
> - We keep the DataStream API itself unchanged in 2.0, but change the
> annotation from @Public to @Retired. New APIs will be introduced too. In
> this case, rule 1 is fine, since public APIs are allowed to change between
> major releases; changing only the annotation is the minimal change we could
> make and it does not break rule 1. Rule 2 is fine too, since the DataStream
> APIs work exactly the same. Attention: the change of @Public -> @Retired
> can only be done between major releases, because @Public APIs can only be
> changed between major releases.
> - In 2.1, the DataStream API will be marked as deprecated.
> - In 2.2, DataStream will be kept for the migration period.
> - In 2.3, DataStream will be removed.
>
> Becket mentioned previously (please correct me if I didn't understand it
> correctly) that users might not check the changes of annotation and the
> upgrade process might not be smooth. I was wondering, if users don't pay
> attention to that, they will also not pay attention to @deprecated. The