[jira] [Created] (FLINK-24608) Sinks built with the unified sink framework do not receive timestamps when used in Table API

2021-10-20 Thread Fabian Paul (Jira)
Fabian Paul created FLINK-24608:
---

 Summary: Sinks built with the unified sink framework do not 
receive timestamps when used in Table API
 Key: FLINK-24608
 URL: https://issues.apache.org/jira/browse/FLINK-24608
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Common, Table SQL / Planner
Affects Versions: 1.13.3, 1.14.0, 1.15.0
Reporter: Fabian Paul


All sinks built with the unified sink framework extract the timestamp from the 
internal {{StreamRecord}}. The Table API does not populate the timestamp field of 
the {{StreamRecord}} but instead extracts the timestamp from the actual data. 

We either have to add a dedicated operator in front of the sinks to simulate this 
behavior, or allow customizable timestamp extraction during the sink 
translation.
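
For context, a hedged sketch of the field in question (the helper below is hypothetical and not Flink's actual sink code): writers built on the unified sink framework read the event time from {{SinkWriter.Context}}, which mirrors the internal {{StreamRecord}} timestamp.

{code:java}
import org.apache.flink.api.connector.sink.SinkWriter;

// Hypothetical helper, only to illustrate the affected field: a SinkWriter implementation
// typically calls context.timestamp() inside write(element, context). If the Table API
// never sets the StreamRecord timestamp, this returns null.
final class SinkTimestampSketch {
    static Long eventTimeOf(SinkWriter.Context context) {
        return context.timestamp(); // null when the StreamRecord carries no timestamp
    }
}
{code}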





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Creating an external connector repository

2021-10-20 Thread Kyle Bendickson
Hi all,

My name is Kyle and I’m an open source developer primarily focused on
Apache Iceberg.

I’m happy to help clarify or elaborate on any aspect of our experience
working on a relatively decoupled connector that is downstream and pretty
popular.

I’d also love to be able to contribute or assist in any way I can.

I don’t mean to thread jack, but are there any meetings or community sync
ups, specifically around the connector APIs, that I might join / be invited
to?

I did want to add that even though I’ve experienced some of the pain points
of integrating with an evolving system / API (catalog support is, generally
speaking, pretty new everywhere in this space), I also personally agree
that you shouldn’t slow down development velocity too much for the sake of
external connectors. Getting to a performant and stable place should be the
primary goal, and slowing that down to support stragglers will (in my
personal opinion) always be a losing game. Some folks will simply stay
behind on versions regardless until they have to upgrade.

I am working on ensuring that the Iceberg community stays within 1-2
versions of Flink, so that we can help provide more feedback or contribute
things that might improve our ability to support multiple Flink runtimes /
versions with one project / codebase and minimal to no reflection (our
desired goal).

If there’s anything I can do or any way I can be of assistance, please
don’t hesitate to reach out. Or find me on ASF slack 😀

I greatly appreciate your general concern for the needs of downstream
connector integrators!

Cheers
Kyle Bendickson (GitHub: kbendick)
Open Source Developer
kyle [at] tabular [dot] io

On Wed, Oct 20, 2021 at 11:35 AM Thomas Weise  wrote:

> Hi,
>
> I see the stable core Flink API as a prerequisite for modularity. And
> for connectors it is not just the source and sink API (source being
> stable as of 1.14), but everything that is required to build and
> maintain a connector downstream, such as the test utilities and
> infrastructure.
>
> Without the stable surface of core Flink, changes will leak into
> downstream dependencies and force lock step updates. Refactoring
> across N repos is more painful than a single repo. Those with
> experience developing downstream of Flink will know the pain, and that
> isn't limited to connectors. I don't remember a Flink "minor version"
> update that was just a dependency version change and did not force
> other downstream changes.
>
> Imagine a project with a complex set of dependencies. Let's say Flink
> version A plus Flink reliant dependencies released by other projects
> (Flink-external connectors, Beam, Iceberg, Hudi, ..). We don't want a
> situation where we bump the core Flink version to B and things fall
> apart (interface changes, utilities that were useful but not public,
> transitive dependencies etc.).
>
> The discussion here also highlights the benefits of keeping certain
> connectors outside Flink. Whether that is due to difference in
> developer community, maturity of the connectors, their
> specialized/limited usage etc. I would like to see that as a sign of a
> growing ecosystem and most of the ideas that Arvid has put forward
> would benefit further growth of the connector ecosystem.
>
> As for keeping connectors within Apache Flink: I prefer that as the
> path forward for "essential" connectors like FileSource, KafkaSource,
> ... And we can still achieve a more flexible and faster release cycle.
>
> Thanks,
> Thomas
>
>
>
>
>
> On Wed, Oct 20, 2021 at 3:32 AM Jark Wu  wrote:
> >
> > Hi Konstantin,
> >
> > > the connectors need to be adopted and require at least one release per
> > Flink minor release.
> > However, this will make the releases of connectors slower, e.g. maintain
> > features for multiple branches and release multiple branches.
> > I think the main purpose of having an external connector repository is in
> > order to have "faster releases of connectors"?
> >
> >
> > From the perspective of CDC connector maintainers, the biggest advantage
> of
> > maintaining it outside of the Flink project is that:
> > 1) we can have a more flexible and faster release cycle
> > 2) we can be more liberal with committership for connector maintainers
> > which can also attract more committers to help the release.
> >
> > Personally, I think maintaining one connector repository under the ASF
> may
> > not have the above benefits.
> >
> > Best,
> > Jark
> >
> > On Wed, 20 Oct 2021 at 15:14, Konstantin Knauf 
> wrote:
> >
> > > Hi everyone,
> > >
> > > regarding the stability of the APIs. I think everyone agrees that
> > > connector APIs which are stable across minor versions (1.13->1.14) are
> the
> > > mid-term goal. But:
> > >
> > > a) These APIs are still quite young, and we shouldn't make them @Public
> > > prematurely either.
> > >
> > > b) Isn't this *mostly* orthogonal to where the connector code lives?
> Yes,
> > > as long as there are breaking changes, the connectors need to be
> adopted
> > > 

Re: [DISCUSS] FLIP-188: Introduce Built-in Dynamic Table Storage

2021-10-20 Thread wenlong.lwl
Hi Jingsong, thanks for the proposal. Providing a built-in storage solution
for users will make Flink SQL much easier to use in production.

I have some questions which may have been missed in the FLIP but seem important
IMO:
1. Is it possible to read historical data from the file store first and
then fetch new changes from the log store? Something like a hybrid source (see
the sketch after these questions), but I think we need a mechanism to get
exactly-once semantics.
2. How would the built-in table be persisted in the Catalog?
3. Currently a catalog can provide a default table factory, which is used as
the top-priority factory; what would happen once the built-in factory is
introduced?
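
For question 1, a minimal sketch of the "history first, then log" pattern using the existing HybridSource API from FLIP-150 (class names as of Flink 1.14; paths, topics and brokers are placeholders). The exactly-once switch point is exactly what the built-in storage would still have to manage:

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.source.hybrid.HybridSource;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.TextLineFormat;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.core.fs.Path;

class HistoryThenLogSketch {
    static HybridSource<String> build() {
        // Bounded part: historical data from the file store.
        FileSource<String> history =
                FileSource.forRecordStreamFormat(new TextLineFormat(), new Path("/path/to/history"))
                        .build();

        // Unbounded part: new changes from a Kafka-backed log store.
        KafkaSource<String> log =
                KafkaSource.<String>builder()
                        .setBootstrapServers("broker:9092")
                        .setTopics("table-changelog")
                        .setStartingOffsets(OffsetsInitializer.earliest())
                        .setValueOnlyDeserializer(new SimpleStringSchema())
                        .build();

        // Read the history first, then switch to the log. Picking a consistent switch
        // point (for exactly-once semantics) is the part the built-in storage must solve.
        return HybridSource.builder(history).addSource(log).build();
    }
}
```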

On Wed, 20 Oct 2021 at 19:35, Ingo Bürk  wrote:

> Hi Jingsong,
>
> thank you for writing up the proposal. The benefits such a mechanism will
> bring will be very valuable! I haven't yet looked into this in detail, but
> one question came to my mind immediately:
>
> The DDL for these tables seems to rely on there not being a 'connector'
> option. However, catalogs can provide a custom factory, and thus tables
> don't necessarily need to contain such an option already today. How will
> this interact / work with catalogs? I think there are more points regarding
> interaction with catalogs, e.g. if tables are dropped externally rather
> than through Flink SQL DDL, how would Flink be able to remove the physical
> storage for it.
>
>
> Best
> Ingo
>
> On Wed, Oct 20, 2021 at 11:14 AM Jingsong Li 
> wrote:
>
> > Hi all,
> >
> > Kurt and I propose to introduce built-in storage support for dynamic
> > table, a truly unified changelog & table representation, from Flink
> > SQL’s perspective. We believe this kind of storage will improve the
> > usability a lot.
> >
> > We want to highlight some characteristics about this storage:
> >
> > - It’s a built-in storage for Flink SQL
> > ** Improve usability issues
> > ** Flink DDL is no longer just a mapping, but a real creation for these
> > tables
> > ** Masks & abstracts the underlying technical details, no annoying
> options
> >
> > - Supports subsecond streaming write & consumption
> > ** It could be backed by a service-oriented message queue (Like Kafka)
> > ** High throughput scan capability
> > ** Filesystem with columnar formats would be an ideal choice just like
> > iceberg/hudi does.
> >
> > - More importantly, in order to solve the cognitive bar, storage needs
> > to automatically address various Insert/Update/Delete inputs and table
> > definitions
> > ** Receive any type of changelog
> > ** Table can have primary key or no primary key
> >
> > Looking forward to your feedback.
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-188%3A+Introduce+Built-in+Dynamic+Table+Storage
> >
> > Best,
> > Jingsong Lee
> >
>


[jira] [Created] (FLINK-24607) SourceCoordinator may miss to close SplitEnumerator when failover frequently

2021-10-20 Thread Jark Wu (Jira)
Jark Wu created FLINK-24607:
---

 Summary: SourceCoordinator may miss to close SplitEnumerator when 
failover frequently
 Key: FLINK-24607
 URL: https://issues.apache.org/jira/browse/FLINK-24607
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.13.3
Reporter: Jark Wu
 Attachments: jobmanager.log

We are having a connection leak problem when using the mysql-cdc [1] source. From 
the JM log, we observed that many enumerators are not closed.

{code}
➜  test123 cat jobmanager.log | grep "SourceCoordinator \[\] - Restoring 
SplitEnumerator" | wc -l
 264
➜  test123 cat jobmanager.log | grep "SourceCoordinator \[\] - Starting split 
enumerator" | wc -l
 264
➜  test123 cat jobmanager.log | grep "MySqlSourceEnumerator \[\] - Starting 
enumerator" | wc -l
 263
➜  test123 cat jobmanager.log | grep "SourceCoordinator \[\] - Closing 
SourceCoordinator" | wc -l
 264
➜  test123 cat jobmanager.log | grep "MySqlSourceEnumerator \[\] - Closing 
enumerator" | wc -l
 195
{code}

We added "Closing enumerator" log in {{MySqlSourceEnumerator#close()}}, and 
"Starting enumerator" in {{MySqlSourceEnumerator#start()}}. From the above 
result you can see that SourceCoordinator is restored and closed 264 times, 
split enumerator is started 264 but only closed 195 times. It seems that 
{{SourceCoordinator}} misses to close enumerator when job failover frequently. 

I also went throught the code of {{SourceCoordinator}} and found some 
suspicious point:

The {{started}} flag and  {{enumerator}} is assigned in the main thread, 
however {{SourceCoordinator#close()}} is executed async by 
{{DeferrableCoordinator#closeAsync}}.  That means the close method will check 
the {{started}} and {{enumerator}} variable async. Is there any concurrency 
problem here which mean lead to dirty read and miss to close the 
{{enumerator}}? 

I'm still not sure, because it's hard to reproduce locally, and we can't deploy 
a custom Flink version to the production environment.
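
A simplified, hypothetical illustration of the kind of race suspected above (this is deliberately not the actual {{SourceCoordinator}} / {{DeferrableCoordinator}} code):

{code:java}
// One thread sets the fields, another thread reads them during the asynchronous close,
// without synchronization; the closing thread may observe stale values and skip
// closing the enumerator.
class AsyncCloseSketch {
    private boolean started;          // set on the coordinator "main" thread
    private AutoCloseable enumerator; // set on the coordinator "main" thread

    void start(AutoCloseable enumerator) {
        this.enumerator = enumerator;
        this.started = true;
    }

    // Invoked from a different thread (cf. DeferrableCoordinator#closeAsync).
    void closeAsync() throws Exception {
        if (started && enumerator != null) { // a stale 'false' or 'null' here means
            enumerator.close();              // this close is silently skipped
        }
    }
}
{code}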


[1]: https://github.com/ververica/flink-cdc-connectors



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24606) AvroDeserializationSchema buffer is not clean

2021-10-20 Thread heyu dou (Jira)
heyu dou created FLINK-24606:


 Summary: AvroDeserializationSchema buffer is not clean
 Key: FLINK-24606
 URL: https://issues.apache.org/jira/browse/FLINK-24606
 Project: Flink
  Issue Type: Bug
  Components: API / DataStream
Reporter: heyu dou


org.apache.flink.formats.avro.AvroDeserializationSchema wants to reuse its 
org.apache.avro.io.BinaryDecoder, but the way it is used is wrong: the Decoder 
should be reset before each deserialization.
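
A hedged sketch of how a reused BinaryDecoder is normally re-initialized for each message (field and method names are illustrative, not Flink's actual code):

{code:java}
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.DecoderFactory;

class ReusableDecoderSketch {
    private BinaryDecoder decoder; // reused across deserialization calls

    BinaryDecoder decoderFor(byte[] message) {
        // Passing the previous instance back to the factory re-binds (and thereby
        // resets) the reused decoder to the new input bytes.
        decoder = DecoderFactory.get().binaryDecoder(message, decoder);
        return decoder;
    }
}
{code}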



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24605) org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.NoOffsetForPartitionException: Undefined offset with no reset policy for partitions

2021-10-20 Thread Abhijit Talukdar (Jira)
Abhijit Talukdar created FLINK-24605:


 Summary: 
org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.NoOffsetForPartitionException:
 Undefined offset with no reset policy for partitions
 Key: FLINK-24605
 URL: https://issues.apache.org/jira/browse/FLINK-24605
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.14.0
Reporter: Abhijit Talukdar


Getting below issue when using 'scan.startup.mode' = 'group-offsets'.

 

WITH (
 'connector' = 'kafka',
 'topic' = 'ss7gsm-signaling-event',
 'properties.bootstrap.servers' = '**:9093',
 'properties.group.id' = 'ss7gsm-signaling-event-T5',
 'value.format' = 'avro-confluent',
 'value.avro-confluent.schema-registry.url' = 'https://***:9099',
 'scan.startup.mode' = 'group-offsets',
 'properties.auto.offset.reset' = 'earliest',
 'properties.security.protocol'= 'SASL_SSL',
 'properties.ssl.truststore.location'= '/*/*/ca-certs.jks',
 'properties.ssl.truststore.password'= '*',
 'properties.sasl.kerberos.service.name'= 'kafka'
)

 

'ss7gsm-signaling-event-T5' is a new group id. If the group id is present in ZK, 
then it works; otherwise the below exception is thrown. The 
'properties.auto.offset.reset' property is ignored.

 

2021-10-20 22:18:28,267 INFO  org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.ConsumerConfig [] - ConsumerConfig values: 
allow.auto.create.topics = false
auto.commit.interval.ms = 5000
*auto.offset.reset = none*
bootstrap.servers = [.xxx.com:9093]

 

 

Exception:

 

2021-10-20 22:18:28,620 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: KafkaSource-hiveSignaling.signaling_stg.ss7gsm_signaling_event_flink_k -> Sink: Collect table sink (1/1) (89b175333242fab8914271ad7638ba92) switched from INITIALIZING to RUNNING.
2021-10-20 22:18:28,621 INFO  org.apache.flink.connector.kafka.source.enumerator.KafkaSourceEnumerator [] - Assigning splits to readers {0=[[Partition: ss7gsm-signaling-event-2, StartingOffset: -3, StoppingOffset: -9223372036854775808], [Partition: ss7gsm-signaling-event-8, StartingOffset: -3, StoppingOffset: -9223372036854775808], [Partition: ss7gsm-signaling-event-7, StartingOffset: -3, StoppingOffset: -9223372036854775808], [Partition: ss7gsm-signaling-event-9, StartingOffset: -3, StoppingOffset: -9223372036854775808], [Partition: ss7gsm-signaling-event-5, StartingOffset: -3, StoppingOffset: -9223372036854775808], [Partition: ss7gsm-signaling-event-6, StartingOffset: -3, StoppingOffset: -9223372036854775808], [Partition: ss7gsm-signaling-event-0, StartingOffset: -3, StoppingOffset: -9223372036854775808], [Partition: ss7gsm-signaling-event-4, StartingOffset: -3, StoppingOffset: -9223372036854775808], [Partition: ss7gsm-signaling-event-1, StartingOffset: -3, StoppingOffset: -9223372036854775808], [Partition: ss7gsm-signaling-event-3, StartingOffset: -3, StoppingOffset: -9223372036854775808]]}
2021-10-20 22:18:28,716 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: KafkaSource-hiveSignaling.signaling_stg.ss7gsm_signaling_event_flink_k -> Sink: Collect table sink (1/1) (89b175333242fab8914271ad7638ba92) switched from RUNNING to FAILED on xx.xxx.xxx.xxx:42075-d80607 @ xx.xxx.com (dataPort=34120).
java.lang.RuntimeException: One or more fetchers have encountered exception
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherManager.checkErrors(SplitFetcherManager.java:225) ~[flink-table_2.11-1.14.0.jar:1.14.0]
    at org.apache.flink.connector.base.source.reader.SourceReaderBase.getNextFetch(SourceReaderBase.java:169) ~[flink-table_2.11-1.14.0.jar:1.14.0]
    at org.apache.flink.connector.base.source.reader.SourceReaderBase.pollNext(SourceReaderBase.java:130) ~[flink-table_2.11-1.14.0.jar:1.14.0]
    at org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:342) ~[flink-dist_2.11-1.14.0.jar:1.14.0]
    at org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:68) ~[flink-dist_2.11-1.14.0.jar:1.14.0]
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65) ~[flink-dist_2.11-1.14.0.jar:1.14.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:496) ~[flink-dist_2.11-1.14.0.jar:1.14.0]
    at org.apache.flink.str
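
For reference, a hedged sketch of the DataStream-side KafkaSource configuration, where the reset strategy for partitions without a committed offset can be specified explicitly instead of defaulting to auto.offset.reset = none (broker/topic/group values are placeholders mirroring the DDL above):

{code:java}
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.kafka.clients.consumer.OffsetResetStrategy;

class GroupOffsetsSketch {
    static KafkaSource<String> buildSource() {
        return KafkaSource.<String>builder()
                .setBootstrapServers("broker:9093")        // placeholder
                .setTopics("ss7gsm-signaling-event")
                .setGroupId("ss7gsm-signaling-event-T5")
                // Start from committed offsets, falling back to EARLIEST for
                // partitions that have no committed offset yet.
                .setStartingOffsets(
                        OffsetsInitializer.committedOffsets(OffsetResetStrategy.EARLIEST))
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();
    }
}
{code}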

"streaming.tests.avro does not exist"

2021-10-20 Thread Saad Ur Rahman
Hello everyone,

I am trying to set up IntelliJ IDEA so that I can start contributing but I am 
running into the following error:

flink/flink-end-to-end-tests/flink-datastream-allround-test/src/main/java/org/apache/flink/streaming/tests/DataStreamAllroundTestProgram.java:34:45

java: package org.apache.flink.streaming.tests.avro does not exist

I have followed the instructions here:
https://ci.apache.org/projects/flink/flink-docs-master/docs/flinkdev/ide_setup/

I have also tried to invalidate the cache and rebuild the project. I also tried 
to download the source through the Maven panel to no avail:
Flink : Tests
Flink : Formats : Avro


Thanks,
/Saad


Re: "streaming.tests.avro does not exist"

2021-10-20 Thread Chesnay Schepler
What you're lacking are generated files. Run mvn generate-sources (or 
mvn clean install) on the command-line.


On 20/10/2021 20:22, Saad Ur Rahman wrote:

Hello everyone,

I am trying to set up IntelliJ IDEA so that I can start contributing but I am 
running into the following error:

flink/flink-end-to-end-tests/flink-datastream-allround-test/src/main/java/org/apache/flink/streaming/tests/DataStreamAllroundTestProgram.java:34:45

java: package org.apache.flink.streaming.tests.avro does not exist

I have followed the instructions here:
https://ci.apache.org/projects/flink/flink-docs-master/docs/flinkdev/ide_setup/

I have also tried to invalidate the cache and rebuild the project. I also tried 
to download the source through the Maven panel to no avail:
Flink : Tests
Flink : Formats : Avro


Thanks,
/Saad





Re: [DISCUSS] Creating an external connector repository

2021-10-20 Thread Thomas Weise
Hi,

I see the stable core Flink API as a prerequisite for modularity. And
for connectors it is not just the source and sink API (source being
stable as of 1.14), but everything that is required to build and
maintain a connector downstream, such as the test utilities and
infrastructure.

Without the stable surface of core Flink, changes will leak into
downstream dependencies and force lock step updates. Refactoring
across N repos is more painful than a single repo. Those with
experience developing downstream of Flink will know the pain, and that
isn't limited to connectors. I don't remember a Flink "minor version"
update that was just a dependency version change and did not force
other downstream changes.

Imagine a project with a complex set of dependencies. Let's say Flink
version A plus Flink reliant dependencies released by other projects
(Flink-external connectors, Beam, Iceberg, Hudi, ..). We don't want a
situation where we bump the core Flink version to B and things fall
apart (interface changes, utilities that were useful but not public,
transitive dependencies etc.).

The discussion here also highlights the benefits of keeping certain
connectors outside Flink. Whether that is due to difference in
developer community, maturity of the connectors, their
specialized/limited usage etc. I would like to see that as a sign of a
growing ecosystem and most of the ideas that Arvid has put forward
would benefit further growth of the connector ecosystem.

As for keeping connectors within Apache Flink: I prefer that as the
path forward for "essential" connectors like FileSource, KafkaSource,
... And we can still achieve a more flexible and faster release cycle.

Thanks,
Thomas





On Wed, Oct 20, 2021 at 3:32 AM Jark Wu  wrote:
>
> Hi Konstantin,
>
> > the connectors need to be adopted and require at least one release per
> Flink minor release.
> However, this will make the releases of connectors slower, e.g. maintain
> features for multiple branches and release multiple branches.
> I think the main purpose of having an external connector repository is in
> order to have "faster releases of connectors"?
>
>
> From the perspective of CDC connector maintainers, the biggest advantage of
> maintaining it outside of the Flink project is that:
> 1) we can have a more flexible and faster release cycle
> 2) we can be more liberal with committership for connector maintainers
> which can also attract more committers to help the release.
>
> Personally, I think maintaining one connector repository under the ASF may
> not have the above benefits.
>
> Best,
> Jark
>
> On Wed, 20 Oct 2021 at 15:14, Konstantin Knauf  wrote:
>
> > Hi everyone,
> >
> > regarding the stability of the APIs. I think everyone agrees that
> > connector APIs which are stable across minor versions (1.13->1.14) are the
> > mid-term goal. But:
> >
> > a) These APIs are still quite young, and we shouldn't make them @Public
> > prematurely either.
> >
> > b) Isn't this *mostly* orthogonal to where the connector code lives? Yes,
> > as long as there are breaking changes, the connectors need to be adopted
> > and require at least one release per Flink minor release.
> > Documentation-wise this can be addressed via a compatibility matrix for
> > each connector as Arvid suggested. IMO we shouldn't block this effort on
> > the stability of the APIs.
> >
> > Cheers,
> >
> > Konstantin
> >
> >
> >
> > On Wed, Oct 20, 2021 at 8:56 AM Jark Wu  wrote:
> >
> >> Hi,
> >>
> >> I think Thomas raised very good questions and would like to know your
> >> opinions if we want to move connectors out of flink in this version.
> >>
> >> (1) is the connector API already stable?
> >> > Separate releases would only make sense if the core Flink surface is
> >> > fairly stable though. As evident from Iceberg (and also Beam), that's
> >> > not the case currently. We should probably focus on addressing the
> >> > stability first, before splitting code. A success criteria could be
> >> > that we are able to build Iceberg and Beam against multiple Flink
> >> > versions w/o the need to change code. The goal would be that no
> >> > connector breaks when we make changes to Flink core. Until that's the
> >> > case, code separation creates a setup where 1+1 or N+1 repositories
> >> > need to move lock step.
> >>
> >> From another discussion thread [1], connector API is far from stable.
> >> Currently, it's hard to build connectors against multiple Flink versions.
> >> There are breaking API changes both in 1.12 -> 1.13 and 1.13 -> 1.14 and
> >>  maybe also in the future versions,  because Table related APIs are still
> >> @PublicEvolving and new Sink API is still @Experimental.
> >>
> >>
> >> (2) Flink testability without connectors.
> >> > Flink w/o Kafka connector (and few others) isn't
> >> > viable. Testability of Flink was already brought up, can we really
> >> > certify a Flink core release without Kafka connector? Maybe those
> >> > connectors that are used in Flink e2e tests to val

"streaming.tests.avro does not exist"

2021-10-20 Thread Saad Ur Rahman
Hello everyone,

I am trying to set up IntelliJ IDEA so that I can start contributing but I am 
running into the following error:

flink/flink-end-to-end-tests/flink-datastream-allround-test/src/main/java/org/apache/flink/streaming/tests/DataStreamAllroundTestProgram.java:34:45

java: package org.apache.flink.streaming.tests.avro does not exist

I have followed the instructions here:
https://ci.apache.org/projects/flink/flink-docs-master/docs/flinkdev/ide_setup/

I have also tried to invalidate the cache and rebuild the project. I also tried 
to download the source through the Maven panel to no avail:
Flink : Tests
Flink : Formats : Avro


Thanks,
/Saad


[jira] [Created] (FLINK-24604) Failing tests for casting decimals to boolean

2021-10-20 Thread Marios Trivyzas (Jira)
Marios Trivyzas created FLINK-24604:
---

 Summary: Failing tests for casting decimals to boolean
 Key: FLINK-24604
 URL: https://issues.apache.org/jira/browse/FLINK-24604
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner, Table SQL / Runtime
Affects Versions: 1.15.0
Reporter: Marios Trivyzas
Assignee: Marios Trivyzas


Currently some tests in *CalcITCase.scala*, *SimplifyJoinConditionRuleTest.scala* 
and *FlinkRexUtilTest.scala* are failing because of the merge of 
[https://github.com/apache/flink/pull/17311] and 
[https://github.com/apache/flink/pull/17439]: the first one adds some tests with 
decimal-to-boolean casts and the latter drops this support.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Looking back at the Apache Flink 1.14 development cycle / getting better for 1.15

2021-10-20 Thread Johannes Moser
Dear Flink community,

In preparation for the 1.15 development cycle of Apache Flink (it already 
started) and preparing the release management we are collecting feedback from 
the community.
If you didn’t have a chance to look at the release announcement, you might want 
to do that now [1].

Also watch out for the 1.14 release meetup [2] tomorrow and Flink Forward [3] 
next week.

-

We’d love to improve the experience for contributors and users going forward, 
and if you want to help us do so, I’d be very thankful to get some answers to 
the following questions.

** What features/changes will make your life easier/harder?

** Would you like to be more/less informed about the development process?

** In case you have contributed to that release, how was your experience?

** Do you have any questions regarding the 1.14 release?

-

I will share an anonymous overview of the feedback at a later stage.
Thanks for the answers in advance.

Best,
Joe


[1] https://flink.apache.org/news/2021/09/29/release-1.14.0.html 

[2] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/281406310 

[3] https://www.flink-forward.org/global-2021 
 



[jira] [Created] (FLINK-24603) Add E2E test

2021-10-20 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-24603:


 Summary: Add E2E test
 Key: FLINK-24603
 URL: https://issues.apache.org/jira/browse/FLINK-24603
 Project: Flink
  Issue Type: Sub-task
  Components: API / DataSet, API / DataStream
Reporter: Chesnay Schepler
 Fix For: 1.15.0


Add an e2e test, where the job uses some Scala libraries (not Scala types!). 
The test must remove the flink-scala jar from the lib/ directory beforehand, 
and manually add some other currently unsupported Scala version (e.g., 2.13 or 
even 3.X) to the lib/ directory.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-188: Introduce Built-in Dynamic Table Storage

2021-10-20 Thread Ingo Bürk
Hi Jingsong,

thank you for writing up the proposal. The benefits such a mechanism will
bring will be very valuable! I haven't yet looked into this in detail, but
one question came to my mind immediately:

The DDL for these tables seems to rely on there not being a 'connector'
option. However, catalogs can provide a custom factory, and thus tables
don't necessarily need to contain such an option already today. How will
this interact / work with catalogs? I think there are more points regarding
interaction with catalogs, e.g. if tables are dropped externally rather
than through Flink SQL DDL, how would Flink be able to remove the physical
storage for it?
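
For illustration, a minimal, hypothetical sketch of the case described above, i.e. a table declared without any 'connector' option (the actual DDL, options and semantics are whatever FLIP-188 defines; the table below is made up):

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class BuiltInTableSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(
                        EnvironmentSettings.newInstance().inStreamingMode().build());

        // No 'connector' option: under FLIP-188 such a table would be backed by the
        // built-in storage rather than being a mapping onto an external system.
        tEnv.executeSql(
                "CREATE TABLE user_actions ("
                        + "  user_id BIGINT,"
                        + "  action  STRING,"
                        + "  ts      TIMESTAMP(3),"
                        + "  PRIMARY KEY (user_id) NOT ENFORCED"
                        + ")");
    }
}
```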


Best
Ingo

On Wed, Oct 20, 2021 at 11:14 AM Jingsong Li  wrote:

> Hi all,
>
> Kurt and I propose to introduce built-in storage support for dynamic
> table, a truly unified changelog & table representation, from Flink
> SQL’s perspective. We believe this kind of storage will improve the
> usability a lot.
>
> We want to highlight some characteristics about this storage:
>
> - It’s a built-in storage for Flink SQL
> ** Improve usability issues
> ** Flink DDL is no longer just a mapping, but a real creation for these
> tables
> ** Masks & abstracts the underlying technical details, no annoying options
>
> - Supports subsecond streaming write & consumption
> ** It could be backed by a service-oriented message queue (Like Kafka)
> ** High throughput scan capability
> ** Filesystem with columnar formats would be an ideal choice just like
> iceberg/hudi does.
>
> - More importantly, in order to solve the cognitive bar, storage needs
> to automatically address various Insert/Update/Delete inputs and table
> definitions
> ** Receive any type of changelog
> ** Table can have primary key or no primary key
>
> Looking forward to your feedback.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-188%3A+Introduce+Built-in+Dynamic+Table+Storage
>
> Best,
> Jingsong Lee
>


[jira] [Created] (FLINK-24602) Include diffs for configuration documentation page

2021-10-20 Thread Zichen Liu (Jira)
Zichen Liu created FLINK-24602:
--

 Summary: Include diffs for configuration documentation page
 Key: FLINK-24602
 URL: https://issues.apache.org/jira/browse/FLINK-24602
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Zichen Liu


We could include the diffs as an extra column on this page:

[https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/config/]

Or have another page dedicated to diffs showing what the default value of 
each configuration key has been across historical versions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Creating an external connector repository

2021-10-20 Thread Jark Wu
Hi Konstantin,

> the connectors need to be adopted and require at least one release per
Flink minor release.
However, this will make connector releases slower, e.g., maintaining
features for multiple branches and releasing multiple branches.
I think the main purpose of having an external connector repository is
to have "faster releases of connectors"?


From the perspective of CDC connector maintainers, the biggest advantage of
maintaining it outside of the Flink project is that:
1) we can have a more flexible and faster release cycle
2) we can be more liberal with committership for connector maintainers
which can also attract more committers to help the release.

Personally, I think maintaining one connector repository under the ASF may
not have the above benefits.

Best,
Jark

On Wed, 20 Oct 2021 at 15:14, Konstantin Knauf  wrote:

> Hi everyone,
>
> regarding the stability of the APIs. I think everyone agrees that
> connector APIs which are stable across minor versions (1.13->1.14) are the
> mid-term goal. But:
>
> a) These APIs are still quite young, and we shouldn't make them @Public
> prematurely either.
>
> b) Isn't this *mostly* orthogonal to where the connector code lives? Yes,
> as long as there are breaking changes, the connectors need to be adopted
> and require at least one release per Flink minor release.
> Documentation-wise this can be addressed via a compatibility matrix for
> each connector as Arvid suggested. IMO we shouldn't block this effort on
> the stability of the APIs.
>
> Cheers,
>
> Konstantin
>
>
>
> On Wed, Oct 20, 2021 at 8:56 AM Jark Wu  wrote:
>
>> Hi,
>>
>> I think Thomas raised very good questions and would like to know your
>> opinions if we want to move connectors out of flink in this version.
>>
>> (1) is the connector API already stable?
>> > Separate releases would only make sense if the core Flink surface is
>> > fairly stable though. As evident from Iceberg (and also Beam), that's
>> > not the case currently. We should probably focus on addressing the
>> > stability first, before splitting code. A success criteria could be
>> > that we are able to build Iceberg and Beam against multiple Flink
>> > versions w/o the need to change code. The goal would be that no
>> > connector breaks when we make changes to Flink core. Until that's the
>> > case, code separation creates a setup where 1+1 or N+1 repositories
>> > need to move lock step.
>>
>> From another discussion thread [1], connector API is far from stable.
>> Currently, it's hard to build connectors against multiple Flink versions.
>> There are breaking API changes both in 1.12 -> 1.13 and 1.13 -> 1.14 and
>>  maybe also in the future versions,  because Table related APIs are still
>> @PublicEvolving and new Sink API is still @Experimental.
>>
>>
>> (2) Flink testability without connectors.
>> > Flink w/o Kafka connector (and few others) isn't
>> > viable. Testability of Flink was already brought up, can we really
>> > certify a Flink core release without Kafka connector? Maybe those
>> > connectors that are used in Flink e2e tests to validate functionality
>> > of core Flink should not be broken out?
>>
>> This is a very good question. How can we guarantee the new Source and Sink
>> API are stable with only test implementation?
>>
>>
>> Best,
>> Jark
>>
>>
>>
>>
>>
>> On Tue, 19 Oct 2021 at 23:56, Chesnay Schepler 
>> wrote:
>>
>> > Could you clarify what release cadence you're thinking of? There's quite
>> > a big range that fits "more frequent than Flink" (per-commit, daily,
>> > weekly, bi-weekly, monthly, even bi-monthly).
>> >
>> > On 19/10/2021 14:15, Martijn Visser wrote:
>> > > Hi all,
>> > >
>> > > I think it would be a huge benefit if we can achieve more frequent
>> > releases
>> > > of connectors, which are not bound to the release cycle of Flink
>> itself.
>> > I
>> > > agree that in order to get there, we need to have stable interfaces
>> which
>> > > are trustworthy and reliable, so they can be safely used by those
>> > > connectors. I do think that work still needs to be done on those
>> > > interfaces, but I am confident that we can get there from a Flink
>> > > perspective.
>> > >
>> > > I am worried that we would not be able to achieve those frequent
>> releases
>> > > of connectors if we are putting these connectors under the Apache
>> > umbrella,
>> > > because that means that for each connector release we have to follow
>> the
>> > > Apache release creation process. This requires a lot of manual steps
>> and
>> > > prohibits automation and I think it would be hard to scale out
>> frequent
>> > > releases of connectors. I'm curious how others think this challenge
>> could
>> > > be solved.
>> > >
>> > > Best regards,
>> > >
>> > > Martijn
>> > >
>> > > On Mon, 18 Oct 2021 at 22:22, Thomas Weise  wrote:
>> > >
>> > >> Thanks for initiating this discussion.
>> > >>
>> > >> There are definitely a few things that are not optimal with our
>> > >> current management of connectors. I wou

Re: [DISCUSS] FLIP-176: Unified Iteration to Support Algorithms (Flink ML)

2021-10-20 Thread Yun Gao
Hi all,

After some offline discussion with Becket and Zhipeng, we have made some 
modifications to the API to provide a unified method for iteration on bounded 
streams, and also to allow users to construct the iteration with mixed types of 
operator lifecycles (namely whether we re-create users' operators for 
each round).
We have updated the FLIP to reflect the change. 

If there are no other concerns, we would like to start the vote tomorrow. Many thanks!

Best,
Yun


--
From:Yun Gao 
Send Time:2021 Oct. 5 (Tue.) 14:27
To:"David Morávek" ; dev 
Subject:Re: [DISCUSS] FLIP-176: Unified Iteration to Support Algorithms (Flink 
ML)

Hi David,

Many thanks for the feedback, and glad to see that we have the same opinions on 
a lot of points about the
iteration! :)

As for checkpointing: to support checkpoints for a job with feedback edges, we 
need to also include the records on the feedback edges in the checkpoint 
snapshot, as described in [1].
We do this by exploiting the reference-count mechanism provided by the raw 
states, so that the
asynchronous phase waits until we finish writing all the feedback records 
into the raw states, 
which is also similar to the implementation in StateFun. 

Including the feedback records in the snapshot is enough for the unbounded 
iteration, but for the bounded iteration we would also need:
1. Checkpoints after tasks have finished: for an iteration job with bounded 
inputs, most of the execution time is spent after all the sources have finished 
and the iteration body is executing, so we need to support checkpoints during 
this period. Fortunately, in 1.14 we have implemented the first version of this 
functionality.
2. Exactly-once notification of round increments: for bounded iteration we 
notify each operator of the end of a round via onEpochWatermarkIncrement(); this 
is done by inserting epoch watermarks at the end of each round. We would like to 
keep the onEpochWatermarkIncrement() notification exactly-once to simplify the 
development of algorithms. This is done by ensuring that epoch watermarks with 
the same epoch value and the barriers of the same checkpoint always keep the 
same order when transmitted through the iteration body. With this condition, 
after a failover all the operators inside the iteration body must have received 
the same number of notifications, and we can start with the next one. Also, 
since epoch watermarks might be included in the feedback-edge snapshot, we 
disable rescaling of the head / tail operators for the bounded iteration. 

Best,
Yun



[1] https://arxiv.org/abs/1506.08603



--
From:David Morávek 
Send Time:2021 Oct. 4 (Mon.) 14:05
To:dev ; Yun Gao 
Subject:Re: [DISCUSS] FLIP-176: Unified Iteration to Support Algorithms (Flink 
ML)

Hi Yun,

I did a quick pass over the design doc and it addresses all of the problems
with the current iterations I'm aware of. It's great to see that you've
been able to work around the need for vectorized watermarks by giving up
nested iterations (which IMO is more of an academic concept than something
with a solid use-case). I'll try to give it some more thought, but from a
first pass it looks great +1 ;)

One thing that I'm unsure about, how do you plan to implement exactly-once
checkpointing of the feedback edge?

Best,
D.

On Mon, Oct 4, 2021 at 4:42 AM Yun Gao  wrote:

> Hi all,
>
> If we do not have other concerns on this FLIP, we would like to start the
> voting by the end of oct 8th.
>
> Best,
> Yun.
>
>
> --
> From:Yun Gao 
> Send Time:2021 Sep. 15 (Wed.) 20:47
> To:dev 
> Subject:[DISCUSS] FLIP-176: Unified Iteration to Support Algorithms (Flink
> ML)
>
>
> Hi all,
>
> DongLin, ZhiPeng and I are opening this thread to propose designing and
> implementing a new iteration library inside the Flink-ML project, as
> described in
>  FLIP-176[1].
>
> Iteration serves as a fundamental functionality required to support the
> implementation
> of ML algorithms. Previously Flink supports bounded iteration on top of
> the
> DataSet API and unbounded iteration on top of the DataStream API. However,
> since we are going to deprecated the dataset API and the current unbounded
> iteration
> API on top of the DataStream API is not fully complete, thus we are
> proposing
> to add the new unified iteration library on top of DataStream API to
> support both
> unbounded and bounded iterations.
>
> Very thanks for your feedbacks!
>
> [1] https://cwiki.apache.org/confluence/x/hAEBCw
>
> Best,Yun



--
From:David Morávek 
Send Time:2021 Oct. 4 (Mon.) 14:05
To:dev ; Yun Gao 
Subject:Re: [DISCUSS] FLIP-176: Unified Iteration to Support Algorithms (Flink 
ML)

Hi Yun,

I did a quick pass over the design doc and it addresses all of the 

[DISCUSS] FLIP-188: Introduce Built-in Dynamic Table Storage

2021-10-20 Thread Jingsong Li
Hi all,

Kurt and I propose to introduce built-in storage support for dynamic
table, a truly unified changelog & table representation, from Flink
SQL’s perspective. We believe this kind of storage will improve the
usability a lot.

We want to highlight some characteristics about this storage:

- It’s a built-in storage for Flink SQL
** Improve usability issues
** Flink DDL is no longer just a mapping, but a real creation for these tables
** Masks & abstracts the underlying technical details, no annoying options

- Supports subsecond streaming write & consumption
** It could be backed by a service-oriented message queue (Like Kafka)
** High throughput scan capability
** A filesystem with columnar formats would be an ideal choice, just like
Iceberg/Hudi do.

- More importantly, in order to solve the cognitive bar, storage needs
to automatically address various Insert/Update/Delete inputs and table
definitions
** Receive any type of changelog
** Table can have primary key or no primary key

Looking forward to your feedback.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-188%3A+Introduce+Built-in+Dynamic+Table+Storage

Best,
Jingsong Lee


[jira] [Created] (FLINK-24601) Error in submitting the job

2021-10-20 Thread Shyam (Jira)
Shyam  created FLINK-24601:
--

 Summary: Error in submitting the job
 Key: FLINK-24601
 URL: https://issues.apache.org/jira/browse/FLINK-24601
 Project: Flink
  Issue Type: Bug
Reporter: Shyam 


I am new to Flink. I have done a Flink 1.14.0 standalone installation on an AWS 
server and written a simple job in Java 1.8:

{code:java}
DataSet<String> set = executionEnvironment.fromCollection(text);
DataSet<String> upperCase = set.map(s -> s.toUpperCase());
upperCase.count();
JobExecutionResult result = executionEnvironment.execute("sample job");
{code}

When I run the code, I get the following error in the Flink runtime:

2021-10-19 16:00:51.398  WARN 14358 --- [-netty-thread-1] o.a.f.c.program.rest.RestClusterClient   : Attempt to submit job 'Flink Java Job at Tue Oct 19 16:00:38 IST 2021' (f38d6a987100a98662e10ada551d05b9) to 'http://xx.xx.xx.xx:8081' has failed.
java.util.concurrent.CompletionException: org.apache.flink.runtime.rest.ConnectionClosedException: Channel became inactive.
    at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326) [na:1.8.0_292]
    at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338) [na:1.8.0_292]
    at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:925) [na:1.8.0_292]
    at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:913) ~[na:1.8.0_292]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) [na:1.8.0_292]
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) [na:1.8.0_292]
    at org.apache.flink.runtime.rest.RestClient$ClientHandler.channelInactive(RestClient.java:634) ~[flink-runtime-1.14.0.jar:1.14.0]

 

The client retries submitting the job multiple times and finally stops with 
another message:

 
{code:java}
org.apache.flink.runtime.rest.util.RestClientException: 
[org.apache.flink.runtime.rest.handler.RestHandlerException: The 
jobGraphFileName field must not be omitted or be null. at 
org.apache.flink.runtime.rest.handler.job.JobSubmitHandler.handleRequest(JobSubmitHandler.java:105){code}

Note: I checked the Flink and Java versions in my app and on the server. 
Everything is the same (Flink 1.4.0 & Java 1.8).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


One week until Flink Forward! -- Info & special announcements

2021-10-20 Thread Caito Scherr
Hi Everyone,

We have just one week before Flink Forward Global 2021!!

If you (or a friend) still need to register, you can do so here for free. Since we’re
virtual, you can join from anywhere in the world.


I am excited to announce that the full schedule is now up, and you can
check it out here. The lineup
includes 45+ speakers from companies like Splunk, Facebook, AWS, Intesa
Sanpaolo, Reddit and more.

As a special bonus, we also have a free, virtual “Ask Me Anything” meetup
this Thursday (10am PST) where you can get caught up on the newest features
before the conference and ask the committers anything about our newest
release! Sign up here.


Get ready for 2 days of great talks, as well as live networking, and some
fun games and surprises!

Thank you to everyone who submitted a talk along with our amazing
Program Committee
who helped put this lineup together!

Caito Scherr

*Program Committee Chair - Flink Forward Global 2021*


[jira] [Created] (FLINK-24600) job checkpoint page has two p99 which is duplicated

2021-10-20 Thread donglei (Jira)
donglei created FLINK-24600:
---

 Summary: job checkpoint page  has two p99 which is duplicated
 Key: FLINK-24600
 URL: https://issues.apache.org/jira/browse/FLINK-24600
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Web Frontend
 Environment: flink 1.14 
Reporter: donglei
 Attachments: image-2021-10-20-16-03-43-437.png

 

The Flink checkpoints page has two p99 entries, one of which is a duplicate.

!image-2021-10-20-16-03-43-437.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24599) Make checking for type root and family less verbose

2021-10-20 Thread Timo Walther (Jira)
Timo Walther created FLINK-24599:


 Summary: Make checking for type root and family less verbose
 Key: FLINK-24599
 URL: https://issues.apache.org/jira/browse/FLINK-24599
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Timo Walther


Currently, we use `LogicalTypeChecks.hasRoot()` and 
`LogicalTypeChecks.hasFamily()` for frequent checking of logical types. It was 
a conscious decision to not overload `LogicalType` with utility methods in the 
beginning. But the two mentioned methods would be nice to have available in 
`LogicalType` directly.

We suggest:
{code}
LogicalType#is(LogicalTypeRoot)
LogicalType#is(LogicalTypeFamily)
{code}
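
A hedged before/after sketch of the call sites this would simplify (`LogicalTypeChecks.hasRoot()`/`hasFamily()` exist today; `LogicalType#is` is the proposed addition and does not exist yet):

{code:java}
import org.apache.flink.table.types.logical.LogicalType;
import org.apache.flink.table.types.logical.LogicalTypeFamily;
import org.apache.flink.table.types.logical.LogicalTypeRoot;
import org.apache.flink.table.types.logical.utils.LogicalTypeChecks;

class TypeCheckSketch {
    static boolean isDecimalAndNumeric(LogicalType type) {
        // Today: static utility calls.
        boolean decimal = LogicalTypeChecks.hasRoot(type, LogicalTypeRoot.DECIMAL);
        boolean numeric = LogicalTypeChecks.hasFamily(type, LogicalTypeFamily.NUMERIC);
        // Proposed equivalent once LogicalType#is exists:
        //   type.is(LogicalTypeRoot.DECIMAL) && type.is(LogicalTypeFamily.NUMERIC)
        return decimal && numeric;
    }
}
{code}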





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24598) Current IT case do not cover fallback path for hash aggregate

2021-10-20 Thread Shuo Cheng (Jira)
Shuo Cheng created FLINK-24598:
--

 Summary: Current IT case do not cover fallback path for hash 
aggregate
 Key: FLINK-24598
 URL: https://issues.apache.org/jira/browse/FLINK-24598
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Runtime
Affects Versions: 1.14.0, 1.15.0
Reporter: Shuo Cheng


The test data in AggregateITCaseBase#testBigData is not big enough to trigger the 
hash agg to sort and spill its buffer and fall back to sort agg.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Creating an external connector repository

2021-10-20 Thread Konstantin Knauf
Hi everyone,

regarding the stability of the APIs. I think everyone agrees that
connector APIs which are stable across minor versions (1.13->1.14) are the
mid-term goal. But:

a) These APIs are still quite young, and we shouldn't make them @Public
prematurely either.

b) Isn't this *mostly* orthogonal to where the connector code lives? Yes,
as long as there are breaking changes, the connectors need to be adopted
and require at least one release per Flink minor release.
Documentation-wise this can be addressed via a compatibility matrix for
each connector as Arvid suggested. IMO we shouldn't block this effort on
the stability of the APIs.

Cheers,

Konstantin



On Wed, Oct 20, 2021 at 8:56 AM Jark Wu  wrote:

> Hi,
>
> I think Thomas raised very good questions and would like to know your
> opinions if we want to move connectors out of flink in this version.
>
> (1) is the connector API already stable?
> > Separate releases would only make sense if the core Flink surface is
> > fairly stable though. As evident from Iceberg (and also Beam), that's
> > not the case currently. We should probably focus on addressing the
> > stability first, before splitting code. A success criteria could be
> > that we are able to build Iceberg and Beam against multiple Flink
> > versions w/o the need to change code. The goal would be that no
> > connector breaks when we make changes to Flink core. Until that's the
> > case, code separation creates a setup where 1+1 or N+1 repositories
> > need to move lock step.
>
> From another discussion thread [1], connector API is far from stable.
> Currently, it's hard to build connectors against multiple Flink versions.
> There are breaking API changes both in 1.12 -> 1.13 and 1.13 -> 1.14 and
>  maybe also in the future versions,  because Table related APIs are still
> @PublicEvolving and new Sink API is still @Experimental.
>
>
> (2) Flink testability without connectors.
> > Flink w/o Kafka connector (and few others) isn't
> > viable. Testability of Flink was already brought up, can we really
> > certify a Flink core release without Kafka connector? Maybe those
> > connectors that are used in Flink e2e tests to validate functionality
> > of core Flink should not be broken out?
>
> This is a very good question. How can we guarantee the new Source and Sink
> API are stable with only test implementation?
>
>
> Best,
> Jark
>
>
>
>
>
> On Tue, 19 Oct 2021 at 23:56, Chesnay Schepler  wrote:
>
> > Could you clarify what release cadence you're thinking of? There's quite
> > a big range that fits "more frequent than Flink" (per-commit, daily,
> > weekly, bi-weekly, monthly, even bi-monthly).
> >
> > On 19/10/2021 14:15, Martijn Visser wrote:
> > > Hi all,
> > >
> > > I think it would be a huge benefit if we can achieve more frequent
> > releases
> > > of connectors, which are not bound to the release cycle of Flink
> itself.
> > I
> > > agree that in order to get there, we need to have stable interfaces
> which
> > > are trustworthy and reliable, so they can be safely used by those
> > > connectors. I do think that work still needs to be done on those
> > > interfaces, but I am confident that we can get there from a Flink
> > > perspective.
> > >
> > > I am worried that we would not be able to achieve those frequent
> releases
> > > of connectors if we are putting these connectors under the Apache
> > umbrella,
> > > because that means that for each connector release we have to follow
> the
> > > Apache release creation process. This requires a lot of manual steps
> and
> > > prohibits automation and I think it would be hard to scale out frequent
> > > releases of connectors. I'm curious how others think this challenge
> could
> > > be solved.
> > >
> > > Best regards,
> > >
> > > Martijn
> > >
> > > On Mon, 18 Oct 2021 at 22:22, Thomas Weise  wrote:
> > >
> > >> Thanks for initiating this discussion.
> > >>
> > >> There are definitely a few things that are not optimal with our
> > >> current management of connectors. I would not necessarily characterize
> > >> it as a "mess" though. As the points raised so far show, it isn't easy
> > >> to find a solution that balances competing requirements and leads to a
> > >> net improvement.
> > >>
> > >> It would be great if we can find a setup that allows for connectors to
> > >> be released independently of core Flink and that each connector can be
> > >> released separately. Flink already has separate releases
> > >> (flink-shaded), so that by itself isn't a new thing. Per-connector
> > >> releases would need to allow for more frequent releases (without the
> > >> baggage that a full Flink release comes with).
> > >>
> > >> Separate releases would only make sense if the core Flink surface is
> > >> fairly stable though. As evident from Iceberg (and also Beam), that's
> > >> not the case currently. We should probably focus on addressing the
> > >> stability first, before splitting code. A success criteria could be
> > >> that we are able to build Iceb