[jira] [Created] (FLINK-35058) Encountered change event for table db.table whose schema isn't known to this connector

2024-04-08 Thread MOBIN (Jira)
MOBIN created FLINK-35058:
-

 Summary: Encountered change event for table db.table whose schema 
isn't known to this connector
 Key: FLINK-35058
 URL: https://issues.apache.org/jira/browse/FLINK-35058
 Project: Flink
  Issue Type: Bug
  Components: Flink CDC
Affects Versions: 1.17.1
Reporter: MOBIN


Flink 1.17.1

flink-cdc:flink-sql-connector-mysql-cdc-2.4.1.jar
{code:java}
CREATE TABLE `test_cdc_timestamp` (
  `id` BIGINT COMMENT 'primary key id',
  proctime AS PROCTIME(),
  PRIMARY KEY(id) NOT ENFORCED
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = 'x',
  'scan.startup.mode' = 'timestamp',
  'scan.startup.timestamp-millis' = '171241920',
  'port' = '3306',
  'username' = 'xxx',
  'password' = 'xxx',
  'database-name' = 'xxdatabase',
  'table-name' = 'xxtablename',
  'scan.incremental.snapshot.enabled' = 'false',
  'debezium.snapshot.locking.mode' = 'none',
  'server-id' = '5701',
  'server-time-zone' = 'Asia/Shanghai',
  'debezium.skipped.operations' = 'd'
); {code}
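
For reference, a hedged DataStream API sketch of the same timestamp startup 
configuration (mysql-cdc 2.4.x). Host, credential, database and table values are 
placeholders, and the 13-digit value is an illustrative epoch-milliseconds 
timestamp (2024-04-07 00:00:00 UTC) - 'scan.startup.timestamp-millis' and 
StartupOptions.timestamp() both expect milliseconds since the Unix epoch:
{code:java}
import com.ververica.cdc.connectors.mysql.source.MySqlSource;
import com.ververica.cdc.connectors.mysql.table.StartupOptions;
import com.ververica.cdc.debezium.JsonDebeziumDeserializationSchema;

public class TimestampStartupSketch {
    public static void main(String[] args) {
        MySqlSource<String> source =
                MySqlSource.<String>builder()
                        .hostname("xxx")
                        .port(3306)
                        .databaseList("xxdatabase")
                        .tableList("xxdatabase.xxtablename")
                        .username("xxx")
                        .password("xxx")
                        .serverId("5701")
                        .serverTimeZone("Asia/Shanghai")
                        // illustrative placeholder: 2024-04-07 00:00:00 UTC in epoch millis
                        .startupOptions(StartupOptions.timestamp(1712448000000L))
                        .deserializer(new JsonDebeziumDeserializationSchema())
                        .build();
    }
}
{code}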
When I use 'scan.startup.mode' = 'latest-offset' or 'initial', data synchronizes 
normally; when I use 'scan.startup.mode' = 'timestamp', the following 
error is reported:
{code:java}
2024-04-09 11:11:15.619 [debezium-engine] INFO  io.debezium.util.Threads  - 
Requested thread factory for connector MySqlConnector, id = mysql_binlog_source 
named = change-event-source-coordinator
2024-04-09 11:11:15.621 [debezium-engine] INFO  io.debezium.util.Threads  - 
Creating thread 
debezium-mysqlconnector-mysql_binlog_source-change-event-source-coordinator
2024-04-09 11:11:15.629 
[debezium-mysqlconnector-mysql_binlog_source-change-event-source-coordinator] 
INFO  io.debezium.pipeline.ChangeEventSourceCoordinator  - Metrics registered
2024-04-09 11:11:15.630 
[debezium-mysqlconnector-mysql_binlog_source-change-event-source-coordinator] 
INFO  io.debezium.pipeline.ChangeEventSourceCoordinator  - Context created
2024-04-09 11:11:15.642 
[debezium-mysqlconnector-mysql_binlog_source-change-event-source-coordinator] 
INFO  io.debezium.connector.mysql.MySqlSnapshotChangeEventSource  - No previous 
offset has been found
2024-04-09 11:11:15.642 
[debezium-mysqlconnector-mysql_binlog_source-change-event-source-coordinator] 
INFO  io.debezium.connector.mysql.MySqlSnapshotChangeEventSource  - According 
to the connector configuration only schema will be snapshotted
2024-04-09 11:11:15.644 
[debezium-mysqlconnector-mysql_binlog_source-change-event-source-coordinator] 
INFO  io.debezium.pipeline.ChangeEventSourceCoordinator  - Snapshot ended with 
SnapshotResult [status=SKIPPED, offset=null]
2024-04-09 11:11:15.652 
[debezium-mysqlconnector-mysql_binlog_source-change-event-source-coordinator] 
INFO  io.debezium.util.Threads  - Requested thread factory for connector 
MySqlConnector, id = mysql_binlog_source named = binlog-client
2024-04-09 11:11:15.656 
[debezium-mysqlconnector-mysql_binlog_source-change-event-source-coordinator] 
INFO  io.debezium.pipeline.ChangeEventSourceCoordinator  - Starting streaming
2024-04-09 11:11:15.682 
[debezium-mysqlconnector-mysql_binlog_source-change-event-source-coordinator] 
INFO  io.debezium.connector.mysql.MySqlStreamingChangeEventSource  - GTID set 
purged on server: 
0969640a-1d48-11ed-b6cf-28dee561557c:1-27603868993,70958f24-2253-11eb-891d-f875a48ad7b1:1-50323,ec1e6593-2251-11eb-9c18-f875a48ad539:1-25345454762
2024-04-09 11:11:15.682 
[debezium-mysqlconnector-mysql_binlog_source-change-event-source-coordinator] 
INFO  io.debezium.connector.mysql.MySqlStreamingChangeEventSource  - Skip 0 
events on streaming start
2024-04-09 11:11:15.682 
[debezium-mysqlconnector-mysql_binlog_source-change-event-source-coordinator] 
INFO  io.debezium.connector.mysql.MySqlStreamingChangeEventSource  - Skip 0 
rows on streaming start
2024-04-09 11:11:15.683 
[debezium-mysqlconnector-mysql_binlog_source-change-event-source-coordinator] 
INFO  io.debezium.util.Threads  - Creating thread 
debezium-mysqlconnector-mysql_binlog_source-binlog-client
2024-04-09 11:11:15.686 [blc.mysql.com:3306] INFO  io.debezium.util.Threads 
 - Creating thread debezium-mysqlconnector-mysql_binlog_source-binlog-client
2024-04-09 11:11:15.700 [blc.mysql.com:3306] INFO  
io.debezium.connector.mysql.MySqlStreamingChangeEventSource  - Connected to 
MySQL binlog at xxx.mysql.com:3306, starting at MySqlOffsetContext 
[sourceInfoSchema=Schema{io.debezium.connector.mysql.Source:STRUCT}, 
sourceInfo=SourceInfo [currentGtid=null, currentBinlogFilename=, 
currentBinlogPosition=0, currentRowNumber=0, serverId=0, sourceTime=null, 
threadId=-1, currentQuery=null, tableIds=[], databaseName=null], 
snapshotCompleted=false, transactionContext=TransactionContext 
[currentTransactionId=null, perTableEventCount={}, totalEventCount=0], 
restartGtidSet=null, currentGtidSet=null, 
{code}

[jira] [Created] (FLINK-35057) Bump org.apache.commons:commons-compress from 1.25.0 to 1.26.1 for Flink jdbc connector

2024-04-08 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-35057:
---

 Summary: Bump org.apache.commons:commons-compress from 1.25.0 to 
1.26.1 for Flink jdbc connector
 Key: FLINK-35057
 URL: https://issues.apache.org/jira/browse/FLINK-35057
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / JDBC
Reporter: Sergey Nuyanzin
Assignee: Sergey Nuyanzin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35056) when initial sqlserver table that

2024-04-08 Thread yandufeng (Jira)
yandufeng created FLINK-35056:
-

 Summary: when initial sqlserver table that
 Key: FLINK-35056
 URL: https://issues.apache.org/jira/browse/FLINK-35056
 Project: Flink
  Issue Type: Bug
  Components: Flink CDC
Reporter: yandufeng






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Iceberg table maintenance

2024-04-08 Thread Péter Váry
Forwarding the invite for the discussion we plan to do with the Iceberg
folks, as some of you might be interested in this.

-- Forwarded message -
From: Brian Olsen 
Date: Mon, Apr 8, 2024, 18:29
Subject: Re: Flink table maintenance
To: 


Hey Iceberg nation,

I would like to share about the meeting this Wednesday to further discuss
details of Péter's proposal on Flink Maintenance Tasks.
Calendar Link: https://calendar.app.google/83HGYWXoQJ8zXuVCA

List discussion:
https://lists.apache.org/thread/10mdf9zo6pn0dfq791nf4w1m7jh9k3sl


Design Doc: Flink table maintenance




On Mon, Apr 1, 2024 at 8:52 PM Manu Zhang  wrote:

> Hi Peter,
>
> Are you proposing to create a user facing locking feature in Iceberg, or
>> just something for internal use?
>>
>
> Since it's a general issue, I'm proposing to create a general user
> interface first, while the implementation can be left to users. For
> example, we use Airflow to schedule maintenance jobs and we can check
> in-progress jobs with the Airflow API. Hive metastore lock might be another
> option we can implement for users.
>
> Thanks,
> Manu
>
> On Tue, Apr 2, 2024 at 5:26 AM Péter Váry 
> wrote:
>
>> Hi Ajantha,
>>
>> I thought about enabling post commit topology based compaction for sinks
>> using options, like we use for the parametrization of streaming reads [1].
>> I think it will be hard to do it in a user-friendly way - because of the
>> high number of parameters - but I think it is a possible solution with
>> sensible defaults.
>>
>> There is a batch-like solution for data file compaction already available
>> [2], but I do not see how we could extend Flink SQL to be able to call it.
>>
>> Writing to a branch using Flink SQL should be another thread, but by my
>> first guess, it shouldn't be hard to implement using options, like:
>> /*+ OPTIONS('branch'='b1') */
>> Since writing to a branch is already working through the Java API [3].
>>
>> Thanks, Peter
>>
>> 1 -
>> https://iceberg.apache.org/docs/latest/flink-queries/#flink-streaming-read
>> 2 -
>> https://github.com/apache/iceberg/blob/820fc3ceda386149f42db8b54e6db9171d1a3a6d/flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/actions/RewriteDataFilesAction.java
>> 3 -
>> https://iceberg.apache.org/docs/latest/flink-writes/#branch-writes
>>
>> On Mon, Apr 1, 2024, 16:30 Ajantha Bhat  wrote:
>>
>>> Thanks for the proposal Peter.
>>>
>>> I just wanted to know whether we have any plans for supporting SQL syntax for
>>> table maintenance (like a CALL procedure) for pure Flink SQL users?
>>> I didn't see any custom SQL parser plugin support in Flink. I also saw
>>> that branch writes don't have SQL support (only branch reads use options),
>>> so I am not sure about the roadmap of Iceberg SQL support in Flink.
>>> Was there any discussion before?
>>>
>>> - Ajantha
>>>
>>> On Mon, Apr 1, 2024 at 7:51 PM Péter Váry 
>>> wrote:
>>>
 Hi Manu,

 Just to clarify:
 - Are you proposing to create a user facing locking feature in Iceberg,
 or just something for internal use?

 I think we shouldn't add locking to Iceberg's user-facing scope at this
 stage. A fully featured locking system requires many more features
 (priorities, fairness, timeouts, etc.). I could be tempted when we are
 talking about the REST catalog, but I think that should be further down the
 road, if ever...

 About using the tags:
 - I whole-heartedly agree that using tags is not intuitive, and I see
 your points in most of your arguments. OTOH, introducing a new requirement
 (a locking mechanism) seems like a wrong direction to me.
 - We already defined a requirement (atomic changes on the table) for
 the Catalog implementations which could be used to achieve our goal here.
 - We also already store technical data in snapshot properties in Flink
 jobs (JobId, OperatorId, CheckpointId). Maybe technical tags/table
 properties are not a big stretch.

 Or we can look at these tags or metadata as 'technical data' which is
 internal to Iceberg, and shouldn't be expressed on the external API. My
 concern is:
 - Would it be used often enough to be worth the additional complexity?

 Knowing that Spark compaction is struggling with the same issue is a
 good indicator, but probably we would need more use cases for introducing a
 new feature with this complexity, or a simpler solution.

 Thanks, Peter


 On Mon, Apr 1, 2024, 10:18 Manu Zhang  wrote:

> What would the community think of exploiting tags for preventing
>> concurrent maintenance loop executions.
>
>
> This 

[jira] [Created] (FLINK-35055) Flink CDC connector release contains jar with incompatible licenses

2024-04-08 Thread Xiqian YU (Jira)
Xiqian YU created FLINK-35055:
-

 Summary: Flink CDC connector release contains jar with 
incompatible licenses
 Key: FLINK-35055
 URL: https://issues.apache.org/jira/browse/FLINK-35055
 Project: Flink
  Issue Type: Bug
  Components: Flink CDC
Reporter: Xiqian YU


Currently, Flink CDC connector releases include both slim and fat jars. Apart from 
CDC itself, all of its dependencies are packaged into the fat jars, including some 
with incompatible licenses:
 * Db2 connector: `com.ibm.db2.jcc:db2jcc:db2jcc4` licensed under a non-FOSS 
license (International Program License Agreement).
 * MySQL connector: `mysql:mysql-connector-java` licensed under the GPLv2 license, 
which is incompatible with Apache 2.0.
 * Oracle connector: `com.oracle.ojdbc` licensed under a non-FOSS license 
(Oracle Free Use Terms and Conditions).

To fix this problem we may:
 # Exclude the questionable dependencies from the released jars;
 # Add docs to guide users to download and place the dependencies manually.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35054) Migrate TemporalJoinRewriteWithUniqueKeyRule

2024-04-08 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-35054:
-

 Summary: Migrate TemporalJoinRewriteWithUniqueKeyRule
 Key: FLINK-35054
 URL: https://issues.apache.org/jira/browse/FLINK-35054
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[VOTE] FLIP-399: Flink Connector Doris

2024-04-08 Thread wudi
Hi devs,

I would like to start a vote about FLIP-399 [1]. The FLIP is about contributing 
the Flink Doris Connector[2] to the Flink community. Discussion thread [3].

The vote will be open for at least 72 hours unless there is an objection or
insufficient votes.


Thanks,
Di.Wu


[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-399%3A+Flink+Connector+Doris
[2] https://github.com/apache/doris-flink-connector
[3] https://lists.apache.org/thread/p3z4wsw3ftdyfs9p2wd7bbr2gfyl3xnh



[jira] [Created] (FLINK-35053) TIMESTAMP with TIME ZONE not supported by JDBC connector for Postgres

2024-04-08 Thread Pietro (Jira)
Pietro created FLINK-35053:
--

 Summary: TIMESTAMP with TIME ZONE not supported by JDBC connector 
for Postgres
 Key: FLINK-35053
 URL: https://issues.apache.org/jira/browse/FLINK-35053
 Project: Flink
  Issue Type: Bug
  Components: Connectors / JDBC
Affects Versions: jdbc-3.1.2, 1.18.1, 1.19.0
Reporter: Pietro


The JDBC sink for Postgres supports neither {{TIMESTAMP WITH TIME ZONE}} nor 
{{TIMESTAMP_LTZ}} types.

Related issues: FLINK-22199, FLINK-20869
h2. Problem Explanation

A Postgres {{target_table}} has a field {{tm_tz}} of type {{timestamptz}}.
{code:sql}
-- Postgres DDL
CREATE TABLE target_table (
tm_tz TIMESTAMP WITH TIME ZONE
)
{code}
In Flink we have a table with a column of type {{TIMESTAMP_LTZ(6)}}, and 
our goal is to sink it to {{target_table}}.
{code:sql}
-- Flink DDL
CREATE TABLE sink (
tm_tz TIMESTAMP_LTZ(6)
) WITH (
'connector' = 'jdbc',
'table-name' = 'target_table'
...
)
{code}
According to 
[AbstractPostgresCompatibleDialect.supportedTypes()|https://github.com/apache/flink-connector-jdbc/blob/7025642d88ff661e486745b23569595e1813a1d0/flink-connector-jdbc/src/main/java/org/apache/flink/connector/jdbc/dialect/AbstractPostgresCompatibleDialect.java#L109],
 {{TIMESTAMP_WITH_LOCAL_TIME_ZONE}} is supported, while 
{{TIMESTAMP_WITH_TIME_ZONE}} is not.

However, when the converter is created via 
[AbstractJdbcRowConverter.createInternalConverter()|https://github.com/apache/flink-connector-jdbc/blob/7025642d88ff661e486745b23569595e1813a1d0/flink-connector-jdbc/src/main/java/org/apache/flink/connector/jdbc/converter/AbstractJdbcRowConverter.java#L132],
 it throws an {{UnsupportedOperationException}} since 
{{TIMESTAMP_WITH_LOCAL_TIME_ZONE}} is *not* among the available types, while 
[{{TIMESTAMP_WITH_TIME_ZONE}}|https://github.com/apache/flink-connector-jdbc/blob/7025642d88ff661e486745b23569595e1813a1d0/flink-connector-jdbc/src/main/java/org/apache/flink/connector/jdbc/converter/AbstractJdbcRowConverter.java#L168]
 is.
{code:java}
Exception in thread "main" java.lang.UnsupportedOperationException: Unsupported 
type:TIMESTAMP_LTZ(6)
at 
org.apache.flink.connector.jdbc.converter.AbstractJdbcRowConverter.createInternalConverter(AbstractJdbcRowConverter.java:186)
at 
org.apache.flink.connector.jdbc.databases.postgres.dialect.PostgresRowConverter.createPrimitiveConverter(PostgresRowConverter.java:99)
at 
org.apache.flink.connector.jdbc.databases.postgres.dialect.PostgresRowConverter.createInternalConverter(PostgresRowConverter.java:58)
at 
org.apache.flink.connector.jdbc.converter.AbstractJdbcRowConverter.createNullableInternalConverter(AbstractJdbcRowConverter.java:118)
at 
org.apache.flink.connector.jdbc.converter.AbstractJdbcRowConverter.<init>(AbstractJdbcRowConverter.java:68)
at 
org.apache.flink.connector.jdbc.databases.postgres.dialect.PostgresRowConverter.<init>(PostgresRowConverter.java:47)
at 
org.apache.flink.connector.jdbc.databases.postgres.dialect.PostgresDialect.getRowConverter(PostgresDialect.java:51)
at 
org.apache.flink.connector.jdbc.table.JdbcDynamicTableSource.getScanRuntimeProvider(JdbcDynamicTableSource.java:184)
at 
org.apache.flink.table.planner.connectors.DynamicSourceUtils.validateScanSource(DynamicSourceUtils.java:478)
at 
org.apache.flink.table.planner.connectors.DynamicSourceUtils.prepareDynamicSource(DynamicSourceUtils.java:161)
at 
org.apache.flink.table.planner.connectors.DynamicSourceUtils.convertSourceToRel(DynamicSourceUtils.java:125)
at 
org.apache.flink.table.planner.plan.schema.CatalogSourceTable.toRel(CatalogSourceTable.java:118)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.toRel(SqlToRelConverter.java:4002)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:2872)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2432)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2346)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2291)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:728)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:714)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3848)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:618)
at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$rel(FlinkPlannerImpl.scala:229)
at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.rel(FlinkPlannerImpl.scala:205)
at 
{code}
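
For illustration, a hedged sketch of the kind of converter override that would 
accept {{TIMESTAMP_LTZ}}: it maps {{TIMESTAMP_WITH_LOCAL_TIME_ZONE}} to the 
{{OffsetDateTime}} that the Postgres driver returns for {{timestamptz}} columns. 
Class and method names follow the stack trace above, but this is an untested 
sketch, not the actual fix:
{code:java}
import java.time.OffsetDateTime;

import org.apache.flink.connector.jdbc.databases.postgres.dialect.PostgresRowConverter;
import org.apache.flink.table.data.TimestampData;
import org.apache.flink.table.types.logical.LogicalType;
import org.apache.flink.table.types.logical.LogicalTypeRoot;
import org.apache.flink.table.types.logical.RowType;

public class PostgresLtzRowConverter extends PostgresRowConverter {

    public PostgresLtzRowConverter(RowType rowType) {
        super(rowType);
    }

    @Override
    protected JdbcDeserializationConverter createInternalConverter(LogicalType type) {
        if (type.getTypeRoot() == LogicalTypeRoot.TIMESTAMP_WITH_LOCAL_TIME_ZONE) {
            // The Postgres JDBC driver can return timestamptz values as OffsetDateTime.
            return val -> TimestampData.fromInstant(((OffsetDateTime) val).toInstant());
        }
        return super.createInternalConverter(type);
    }
}
{code}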

[jira] [Created] (FLINK-35052) Webhook validator should reject unsupported Flink versions

2024-04-08 Thread Alexander Fedulov (Jira)
Alexander Fedulov created FLINK-35052:
-

 Summary: Webhook validator should reject unsupported Flink versions
 Key: FLINK-35052
 URL: https://issues.apache.org/jira/browse/FLINK-35052
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Alexander Fedulov
Assignee: Alexander Fedulov


The admission webhook currently does not verify whether the FlinkDeployment CR uses 
a Flink version that is not supported by the Operator. This causes the CR to be 
accepted and the failure to be postponed until the reconciliation phase. We 
should instead fail fast and give users direct feedback.
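
A hedged sketch of the intended fail-fast check; the validator entry point, 
package names and the supported-version set below are assumptions for 
illustration, not the operator's actual code:
{code:java}
import java.util.EnumSet;
import java.util.Optional;
import java.util.Set;

import org.apache.flink.kubernetes.operator.api.FlinkDeployment;
import org.apache.flink.kubernetes.operator.api.spec.FlinkVersion;

public class FlinkVersionValidation {

    // Illustrative set - the real list would come from the operator's configuration.
    private static final Set<FlinkVersion> SUPPORTED =
            EnumSet.of(FlinkVersion.v1_16, FlinkVersion.v1_17, FlinkVersion.v1_18);

    /** Returning a non-empty Optional makes the webhook reject the CR up front. */
    public static Optional<String> validate(FlinkDeployment deployment) {
        FlinkVersion version = deployment.getSpec().getFlinkVersion();
        if (version == null || !SUPPORTED.contains(version)) {
            return Optional.of("Unsupported Flink version: " + version);
        }
        return Optional.empty();
    }
}
{code}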



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35051) Weird priorities when processing unaligned checkpoints

2024-04-08 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-35051:
--

 Summary: Weird priorities when processing unaligned checkpoints
 Key: FLINK-35051
 URL: https://issues.apache.org/jira/browse/FLINK-35051
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Checkpointing, Runtime / Network, Runtime / Task
Affects Versions: 1.18.1, 1.19.0, 1.17.2
Reporter: Piotr Nowojski


While looking through the code I noticed that `StreamTask` processes 
unaligned checkpoints with a strange order/priority. The end result is that 
unaligned checkpoint `Start Delay` time can be increased, and triggering 
checkpoints in `StreamTask` can be unnecessarily delayed by other mailbox actions 
in the system, like for example:
* processing time timers
* `AsyncWaitOperator` results
* ... 

An incoming UC barrier is treated as a priority event by the network stack (it 
will be polled from the input before anything else). This is what we want, but 
polling elements from the network stack has lower priority than processing enqueued 
mailbox actions.

Secondly, when an AC barrier times out and is converted to UC, that's done via a 
mailbox action, but this mailbox action is also not prioritised in any way, so 
other mailbox actions could be unnecessarily executed first. 

On top of that there is a clash of three separate concepts here:
# Mailbox priority (yieldToDownstream) - in a sense the reverse of what we would 
like to have for triggering checkpoints, but it only kicks in on #yield() calls, 
where it's actually correct that an operator in the middle of execution can not 
yield to a checkpoint - it should only yield to downstream.
# Control mails in the mailbox executor - cancellation is done via this; it 
bypasses the whole mailbox queue.
# Priority events in the network stack.

It's unfortunate that 1. and 3. clash on naming: "priority" is used for both, 
and the highest-priority network event containing the UC barrier, when 
executed via the mailbox, actually has the lowest mailbox priority.

The control-mails mechanism is a kind of priority mail executed out of order, but 
it doesn't generalise well for use in checkpointing.

This whole thing should be re-worked at some point. Ideally what we would like 
to have is that:
* the mail to convert AC barriers to UC
* polling the UC barrier from the network input
* the checkpoint trigger via RPC for source tasks
should be processed first, with the exception of yieldToDownstream, where the 
current mailbox priorities should be adhered to.
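
To make the ordering concrete, a conceptual sketch of the loop shape described 
above - illustrative only, not Flink's actual MailboxProcessor:
{code:java}
import java.util.ArrayDeque;
import java.util.Queue;

public class MailboxLoopSketch {

    interface Mail extends Runnable {}

    interface NetworkInput {
        // Priority events (e.g. UC barriers) surface before regular data here.
        void pollNext();
    }

    private final Queue<Mail> mailbox = new ArrayDeque<>();
    private volatile boolean running = true;

    void runLoop(NetworkInput input) {
        while (running) {
            // Every enqueued mail (timers, AsyncWaitOperator results, the
            // AC->UC conversion mail, ...) runs before the input is polled
            // again, so a UC barrier that is a "priority event" at the
            // network layer can still wait behind ordinary mailbox actions.
            Mail mail;
            while ((mail = mailbox.poll()) != null) {
                mail.run();
            }
            input.pollNext();
        }
    }
}
{code}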



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Externalized Google Cloud Connectors

2024-04-08 Thread Ahmed Hamdy
Hi Claire,
I am in favor of Martijn's and Leonard's points; it is better to follow the AWS
connectors and keep it under the ASF.
Do you have suggestions for connectors other than Pub/Sub at the moment?
Best Regards
Ahmed Hamdy


On Thu, 4 Apr 2024 at 08:38, Martijn Visser 
wrote:

> Hi Lorenzo,
>
> Bahir is retired, see the homepage. It plays no role (anymore).
>
> >  This, unfortunately, is the tradeoff for developing the connectors
> outside of Apache in exchange for development velocity.
>
> I understand that. It can be considered to develop the connectors outside
> of the Flink project, in order to achieve development velocity. We've seen
> a similar thing happen with the CDC connectors, before that was ultimately
> donated to the Flink project. However, there are no guarantees that
> external contributions are considered when evaluating committers, because
> there's no visibility for the PMC on these external contributions.
>
> Best regards,
>
> Martijn
>
> On Wed, Apr 3, 2024 at 3:26 PM 
> wrote:
>
> > @Leonard @Martijn
> > Following up on @Claire question, what is the role of Bahir (
> > https://bahir.apache.org/) in this scenario?
> >
> > I am also trying to understand how connectors fit into the Flink project
> > scenario :)
> >
> > Thank you,
> > Lorenzo
> > On Apr 2, 2024 at 06:13 +0200, Leonard Xu , wrote:
> > > Hey, Claire
> > >
> > > Thanks for starting this discussion. All Flink external connector repos are
> > sub-projects of Apache Flink, including
> > https://github.com/apache/flink-connector-aws.
> > >
> > > Creating a Flink external connector repo named flink-connectors-gcp as a
> > sub-project of Apache Beam is not a good idea from my side.
> > >
> > > > Currently, we have no Flink committers on our team. We are actively
> > > > involved in the Apache Beam community and have a number of ASF
> members
> > on
> > > > the team.
> > >
> > > Not having a Flink committer should not be a strong reason in this case;
> > the Flink community welcomes contributors to contribute and maintain the
> > connectors. As a contributor, through continuous connector development and
> > maintenance work in the community, you will also have the opportunity to
> > become a Committer.
> > >
> > > Best,
> > > Leonard
> > >
> > >
> > > > On Feb 14, 2024 at 12:24 AM, Claire McCarthy  .INVALID>
> > wrote:
> > > >
> > > > Hi Devs!
> > > >
> > > > I’d like to kick off a discussion on setting up a repo for a new
> fleet
> > of
> > > > Google Cloud connectors.
> > > >
> > > > A bit of context:
> > > >
> > > > - We have a team of Google engineers who are looking to build/maintain
> > > > 5-10 GCP connectors for Flink.
> > > > - We are wondering if it would make sense to host our connectors under
> > > > the ASF umbrella following a similar repo structure as AWS
> > > > (https://github.com/apache/flink-connector-aws). In our case:
> > > > apache/flink-connectors-gcp.
> > > > - Currently, we have no Flink committers on our team. We are actively
> > > > involved in the Apache Beam community and have a number of ASF members
> > > > on the team.
> > > >
> > > >
> > > > We saw that one of the original motivations for externalizing connectors
> > > > was to encourage more activity and contributions around connectors by
> > > > easing the contribution overhead. We understand that the decision was
> > > > ultimately made to host the externalized connector repos under the ASF
> > > > organization. For the same reasons (release infra, quality assurance,
> > > > integration with the community, etc.), we would like all GCP connectors to
> > > > live under the ASF organization.
> > > >
> > > > We want to ask the Flink community what you all think of this idea, and
> > > > what would be the best way for us to go about contributing something like
> > > > this. We are excited to contribute and want to learn and follow your
> > > > practices.
> > > >
> > > > A specific issue we know of is that our changes need approval from Flink
> > > > committers. Do you have a suggestion for how best to go about a new
> > > > contribution like ours from a team that does not have committers? Is it
> > > > possible, for example, to partner with a committer (or a small cohort) for
> > > > tight engagement? We also know about ASF voting and release process, but
> > > > that doesn't seem to be as much of a potential hurdle.
> > > >
> > > > Huge thanks in advance for sharing your thoughts!
> > > >
> > > >
> > > > Claire
> > >
> >
>


Re: [DISCUSS] FLIP-441: Show the JobType and remove Execution Mode on Flink WebUI

2024-04-08 Thread Ahmed Hamdy
Acknowledged, +1 to start a vote.
Best Regards
Ahmed Hamdy


On Mon, 8 Apr 2024 at 12:04, Rui Fan <1996fan...@gmail.com> wrote:

> Sorry, it's a typo. It should be FLINK-32558[1].
>
> [1] https://issues.apache.org/jira/browse/FLINK-32558
>
> Best,
> Rui
>
> On Mon, Apr 8, 2024 at 6:44 PM Ahmed Hamdy  wrote:
>
> > Hi Rui,
> > Thanks for the proposal.
> > Is the deprecation Jira mentioned (FLINK-32258) correct?
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Sun, 7 Apr 2024 at 03:37, Rui Fan <1996fan...@gmail.com> wrote:
> >
> > > If there are no extra comments, I will start voting in three days,
> thank
> > > you~
> > >
> > > Best,
> > > Rui
> > >
> > > On Thu, Mar 28, 2024 at 4:46 PM Muhammet Orazov
> > >  wrote:
> > >
> > > > Hey Rui,
> > > >
> > > > Thanks for the detailed explanation and updating the FLIP!
> > > >
> > > > It is much clearer definitely, thanks for the proposal.
> > > >
> > > > Best,
> > > > Muhammet
> > > >
> > > > On 2024-03-28 07:37, Rui Fan wrote:
> > > > > Hi Muhammet,
> > > > >
> > > > > Thanks for your reply!
> > > > >
> > > > >> The execution mode is also used for the DataStream API [1],
> > > > >> would that also affect/hide the DataStream execution mode
> > > > >> if we remove it from the WebUI?
> > > > >
> > > > > Sorry, I didn't describe it clearly in FLIP-441[2], I have updated
> > it.
> > > > > Let me clarify the Execution Mode here:
> > > > >
> > > > > 1. Flink 1.19 website[3] also mentions the Execution mode, but it
> > > > > actually matches the JobType[4] in the Flink code. Both of them
> > > > > have 2 types: STREAMING and BATCH.
> > > > > 2. execution.runtime-mode can be set to 3 types: STREAMING,
> > > > > BATCH and AUTOMATIC. But the jobType will be inferred as
> > > > > STREAMING or BATCH when execution.runtime-mode is set
> > > > > to AUTOMATIC.
> > > > > 3. The ExecutionMode I describe is: code link[5] , as we can
> > > > > see, ExecutionMode has 4 enums: PIPELINED,
> > > > > PIPELINED_FORCED, BATCH and BATCH_FORCED.
> > > > > And we can see a flink streaming job from Flink WebUI,
> > > > > the Execution mode is PIPELINE instead of STREAMING.
> > > > > I attached a screenshot to the FLIP doc[2], you can see it there.
> > > > > 4. What this proposal wants to do is to remove the ExecutionMode
> > > > > with four enumerations on Flink WebUI and introduce the
> > > > > JobType with two enumerations (STREAMING or BATCH).
> > > > > STREAMING or BATCH is clearer and more accurate for users.
> > > > >
> > > > > Please let me know if it's not clear or anything is wrong, thanks a
> > > > > lot!
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/datastream/execution_mode/
> > > > > [2] https://cwiki.apache.org/confluence/x/agrPEQ
> > > > > [3]
> > > > >
> > > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/dev/datastream/execution_mode/
> > > > > [4]
> > > > >
> > > >
> > >
> >
> https://github.com/apache/flink/blob/f31c128bfc457b64dd7734f71123b74faa2958ba/flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/JobType.java#L22
> > > > > [5]
> > > > >
> > > >
> > >
> >
> https://github.com/apache/flink/blob/f31c128bfc457b64dd7734f71123b74faa2958ba/flink-core/src/main/java/org/apache/flink/api/common/ExecutionMode.java#L54
> > > > >
> > > > > Best,
> > > > > Rui
> > > > >
> > > > > On Thu, Mar 28, 2024 at 1:33 AM Venkatakrishnan Sowrirajan
> > > > > 
> > > > > wrote:
> > > > >
> > > > >> Rui,
> > > > >>
> > > > >> I assume the current proposal would also handle the case of mixed
> > mode
> > > > >> (BATCH + STREAMING within the same app) in the future, right?
> > > > >>
> > > > >> Regards
> > > > >> Venkat
> > > > >>
> > > > >> On Wed, Mar 27, 2024 at 10:15 AM Venkatakrishnan Sowrirajan <
> > > > >> vsowr...@asu.edu> wrote:
> > > > >>
> > > > >>> This will be a very useful addition to Flink UI. Thanks Rui for
> > > > >>> starting
> > > > >>> a FLIP for this improvement.
> > > > >>>
> > > > >>> Regards
> > > > >>> Venkata krishnan
> > > > >>>
> > > > >>>
> > > > >>> On Wed, Mar 27, 2024 at 4:49 AM Muhammet Orazov
> > > > >>>  wrote:
> > > > >>>
> > > >  Hello Rui,
> > > > 
> > > >  Thanks for the proposal! It looks good!
> > > > 
> > > >  I have minor clarification from my side:
> > > > 
> > > >  The execution mode is also used for the DataStream API [1],
> > > >  would that also affect/hide the DataStream execution mode
> > > >  if we remove it from the WebUI?
> > > > 
> > > >  Best,
> > > >  Muhammet
> > > > 
> > > >  [1]:
> > > > 
> > > > 
> > > >
> > >
> >
> https://urldefense.com/v3/__https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/datastream/execution_mode/__;!!IKRxdwAv5BmarQ!eFyqVJyje_8hu1vMSUwKGBsj8vqsFDisEvJ5AxPV0LduhhHWF3rPKYEEE-09biA0unFbfMy5AVQZMgBv1AOa5oTHmcYlkUE$
> > > > 
> > > > 
> > > >  On 2024-03-27 06:23, Rui Fan wrote:
> > > >  > Hi flink developers,
> 

Re: [DISCUSS] FLIP-441: Show the JobType and remove Execution Mode on Flink WebUI

2024-04-08 Thread Rui Fan
Sorry, it's a typo. It should be FLINK-32558[1].

[1] https://issues.apache.org/jira/browse/FLINK-32558

Best,
Rui

On Mon, Apr 8, 2024 at 6:44 PM Ahmed Hamdy  wrote:

> Hi Rui,
> Thanks for the proposal.
> Is the deprecation Jira mentioned (FLINK-32258) correct?
> Best Regards
> Ahmed Hamdy
>
>
> On Sun, 7 Apr 2024 at 03:37, Rui Fan <1996fan...@gmail.com> wrote:
>
> > If there are no extra comments, I will start voting in three days, thank
> > you~
> >
> > Best,
> > Rui
> >
> > On Thu, Mar 28, 2024 at 4:46 PM Muhammet Orazov
> >  wrote:
> >
> > > Hey Rui,
> > >
> > > Thanks for the detailed explanation and updating the FLIP!
> > >
> > > It is much clearer definitely, thanks for the proposal.
> > >
> > > Best,
> > > Muhammet
> > >
> > > On 2024-03-28 07:37, Rui Fan wrote:
> > > > Hi Muhammet,
> > > >
> > > > Thanks for your reply!
> > > >
> > > >> The execution mode is also used for the DataStream API [1],
> > > >> would that also affect/hide the DataStream execution mode
> > > >> if we remove it from the WebUI?
> > > >
> > > > Sorry, I didn't describe it clearly in FLIP-441[2], I have updated
> it.
> > > > Let me clarify the Execution Mode here:
> > > >
> > > > 1. Flink 1.19 website[3] also mentions the Execution mode, but it
> > > > actually matches the JobType[4] in the Flink code. Both of them
> > > > have 2 types: STREAMING and BATCH.
> > > > 2. execution.runtime-mode can be set to 3 types: STREAMING,
> > > > BATCH and AUTOMATIC. But the jobType will be inferred as
> > > > STREAMING or BATCH when execution.runtime-mode is set
> > > > to AUTOMATIC.
> > > > 3. The ExecutionMode I describe is: code link[5] , as we can
> > > > see, ExecutionMode has 4 enums: PIPELINED,
> > > > PIPELINED_FORCED, BATCH and BATCH_FORCED.
> > > > And we can see a flink streaming job from Flink WebUI,
> > > > the Execution mode is PIPELINE instead of STREAMING.
> > > > I attached a screenshot to the FLIP doc[2], you can see it there.
> > > > 4. What this proposal wants to do is to remove the ExecutionMode
> > > > with four enumerations on Flink WebUI and introduce the
> > > > JobType with two enumerations (STREAMING or BATCH).
> > > > STREAMING or BATCH is clearer and more accurate for users.
> > > >
> > > > Please let me know if it's not clear or anything is wrong, thanks a
> > > > lot!
> > > >
> > > > [1]
> > > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/datastream/execution_mode/
> > > > [2] https://cwiki.apache.org/confluence/x/agrPEQ
> > > > [3]
> > > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/dev/datastream/execution_mode/
> > > > [4]
> > > >
> > >
> >
> https://github.com/apache/flink/blob/f31c128bfc457b64dd7734f71123b74faa2958ba/flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/JobType.java#L22
> > > > [5]
> > > >
> > >
> >
> https://github.com/apache/flink/blob/f31c128bfc457b64dd7734f71123b74faa2958ba/flink-core/src/main/java/org/apache/flink/api/common/ExecutionMode.java#L54
> > > >
> > > > Best,
> > > > Rui
> > > >
> > > > On Thu, Mar 28, 2024 at 1:33 AM Venkatakrishnan Sowrirajan
> > > > 
> > > > wrote:
> > > >
> > > >> Rui,
> > > >>
> > > >> I assume the current proposal would also handle the case of mixed
> mode
> > > >> (BATCH + STREAMING within the same app) in the future, right?
> > > >>
> > > >> Regards
> > > >> Venkat
> > > >>
> > > >> On Wed, Mar 27, 2024 at 10:15 AM Venkatakrishnan Sowrirajan <
> > > >> vsowr...@asu.edu> wrote:
> > > >>
> > > >>> This will be a very useful addition to Flink UI. Thanks Rui for
> > > >>> starting
> > > >>> a FLIP for this improvement.
> > > >>>
> > > >>> Regards
> > > >>> Venkata krishnan
> > > >>>
> > > >>>
> > > >>> On Wed, Mar 27, 2024 at 4:49 AM Muhammet Orazov
> > > >>>  wrote:
> > > >>>
> > >  Hello Rui,
> > > 
> > >  Thanks for the proposal! It looks good!
> > > 
> > >  I have minor clarification from my side:
> > > 
> > >  The execution mode is also used for the DataStream API [1],
> > >  would that also affect/hide the DataStream execution mode
> > >  if we remove it from the WebUI?
> > > 
> > >  Best,
> > >  Muhammet
> > > 
> > >  [1]:
> > > 
> > > 
> > >
> >
> https://urldefense.com/v3/__https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/datastream/execution_mode/__;!!IKRxdwAv5BmarQ!eFyqVJyje_8hu1vMSUwKGBsj8vqsFDisEvJ5AxPV0LduhhHWF3rPKYEEE-09biA0unFbfMy5AVQZMgBv1AOa5oTHmcYlkUE$
> > > 
> > > 
> > >  On 2024-03-27 06:23, Rui Fan wrote:
> > >  > Hi flink developers,
> > >  >
> > >  > I'd like to start a discussion to discuss FLIP-441:
> > >  > Show the JobType and remove Execution Mode on Flink WebUI[1].
> > >  >
> > >  > Currently, the jobType has 2 types in Flink: STREAMING and
> BATCH.
> > >  > They work on completely different principles, such as:
> scheduler,
> > >  > shuffle, join, etc. These differences lead to different
> > > troubleshooting
> > >  

Re: [VOTE] Release flink-connector-rabbitmq, v3.0.2 release candidate #1

2024-04-08 Thread Ahmed Hamdy
Hi Martijn,

Sorry for joining late,
+1 (non-binding)

- Verified Checksums and Signatures
- Verified no binaries in archive
- Built source successfully
- Checked tag correctly in github
- reviewed web PR

Best Regards
Ahmed Hamdy


On Mon, 15 Jan 2024 at 13:54, Danny Cranmer  wrote:

> +1 (binding)
>
> - Release notes look good
> - Signatures match for binary/source archive
> - Checksums match for binary/source archive
> - Contents of Maven dist look complete
> - Verified there are no binaries in the source archive
> - Built src/tests pass running Maven locally
> - Tag is present in Github
> - CI built tag successfully [1]
> - Approved docs PR
>
> Thanks,
> Danny
>
> [1]
> https://github.com/apache/flink-connector-rabbitmq/actions/runs/7502278783
>
> On Mon, Jan 15, 2024 at 5:01 AM Zhongqiang Gong  >
> wrote:
>
> > +1 (non-binding)
> >
> > - Build with JDK 11 on ubuntu 22.04
> > - Gpg sign is correct
> > - No binary files in the source release
> > - Reviewed web pr (need to rebase)
> >
> > Best,
> > Zhongqiang Gong
> >
> > On 2024/01/12 12:50:53 Martijn Visser wrote:
> > > Hi everyone,
> > > Please review and vote on the release candidate #1 for the version
> > > 3.0.2, as follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >
> > > This version is compatible with Flink 1.16.x, 1.17.x and 1.18.x.
> > >
> > > The complete staging area is available for your review, which includes:
> > > * JIRA release notes [1],
> > > * the official Apache source release to be deployed to dist.apache.org
> > > [2], which are signed with the key with fingerprint
> > > A5F3BCE4CBE993573EC5966A65321B8382B219AF [3],
> > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > * source code tag v3.0.2-rc1 [5],
> > > * website pull request listing the new release [6].
> > >
> > > The vote will be open for at least 72 hours. It is adopted by majority
> > > approval, with at least 3 PMC affirmative votes.
> > >
> > > Thanks,
> > > Release Manager
> > >
> > > [1]
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353145
> > > [2]
> >
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-rabbitmq-3.0.2-rc1
> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [4]
> > https://repository.apache.org/content/repositories/orgapacheflink-1697
> > > [5]
> >
> https://github.com/apache/flink-connector-rabbitmq/releases/tag/v3.0.2-rc1
> > > [6] https://github.com/apache/flink-web/pull/712
> > >
> >
>


Re: [DISCUSS] FLIP-441: Show the JobType and remove Execution Mode on Flink WebUI

2024-04-08 Thread Ahmed Hamdy
Hi Rui,
Thanks for the proposal.
Is the deprecation Jira mentioned (FLINK-32258) correct?
Best Regards
Ahmed Hamdy


On Sun, 7 Apr 2024 at 03:37, Rui Fan <1996fan...@gmail.com> wrote:

> If there are no extra comments, I will start voting in three days, thank
> you~
>
> Best,
> Rui
>
> On Thu, Mar 28, 2024 at 4:46 PM Muhammet Orazov
>  wrote:
>
> > Hey Rui,
> >
> > Thanks for the detailed explanation and updating the FLIP!
> >
> > It is much clearer definitely, thanks for the proposal.
> >
> > Best,
> > Muhammet
> >
> > On 2024-03-28 07:37, Rui Fan wrote:
> > > Hi Muhammet,
> > >
> > > Thanks for your reply!
> > >
> > >> The execution mode is also used for the DataStream API [1],
> > >> would that also affect/hide the DataStream execution mode
> > >> if we remove it from the WebUI?
> > >
> > > Sorry, I didn't describe it clearly in FLIP-441[2], I have updated it.
> > > Let me clarify the Execution Mode here:
> > >
> > > 1. Flink 1.19 website[3] also mentions the Execution mode, but it
> > > actually matches the JobType[4] in the Flink code. Both of them
> > > have 2 types: STREAMING and BATCH.
> > > 2. execution.runtime-mode can be set to 3 types: STREAMING,
> > > BATCH and AUTOMATIC. But the jobType will be inferred as
> > > STREAMING or BATCH when execution.runtime-mode is set
> > > to AUTOMATIC.
> > > 3. The ExecutionMode I describe is: code link[5] , as we can
> > > see, ExecutionMode has 4 enums: PIPELINED,
> > > PIPELINED_FORCED, BATCH and BATCH_FORCED.
> > > And we can see a flink streaming job from Flink WebUI,
> > > the Execution mode is PIPELINE instead of STREAMING.
> > > I attached a screenshot to the FLIP doc[2], you can see it there.
> > > 4. What this proposal wants to do is to remove the ExecutionMode
> > > with four enumerations on Flink WebUI and introduce the
> > > JobType with two enumerations (STREAMING or BATCH).
> > > STREAMING or BATCH is clearer and more accurate for users.
> > >
> > > Please let me know if it's not clear or anything is wrong, thanks a
> > > lot!
> > >
> > > [1]
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/datastream/execution_mode/
> > > [2] https://cwiki.apache.org/confluence/x/agrPEQ
> > > [3]
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/dev/datastream/execution_mode/
> > > [4]
> > >
> >
> https://github.com/apache/flink/blob/f31c128bfc457b64dd7734f71123b74faa2958ba/flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/JobType.java#L22
> > > [5]
> > >
> >
> https://github.com/apache/flink/blob/f31c128bfc457b64dd7734f71123b74faa2958ba/flink-core/src/main/java/org/apache/flink/api/common/ExecutionMode.java#L54
> > >
> > > Best,
> > > Rui
> > >
> > > On Thu, Mar 28, 2024 at 1:33 AM Venkatakrishnan Sowrirajan
> > > 
> > > wrote:
> > >
> > >> Rui,
> > >>
> > >> I assume the current proposal would also handle the case of mixed mode
> > >> (BATCH + STREAMING within the same app) in the future, right?
> > >>
> > >> Regards
> > >> Venkat
> > >>
> > >> On Wed, Mar 27, 2024 at 10:15 AM Venkatakrishnan Sowrirajan <
> > >> vsowr...@asu.edu> wrote:
> > >>
> > >>> This will be a very useful addition to Flink UI. Thanks Rui for
> > >>> starting
> > >>> a FLIP for this improvement.
> > >>>
> > >>> Regards
> > >>> Venkata krishnan
> > >>>
> > >>>
> > >>> On Wed, Mar 27, 2024 at 4:49 AM Muhammet Orazov
> > >>>  wrote:
> > >>>
> >  Hello Rui,
> > 
> >  Thanks for the proposal! It looks good!
> > 
> >  I have minor clarification from my side:
> > 
> >  The execution mode is also used for the DataStream API [1],
> >  would that also affect/hide the DataStream execution mode
> >  if we remove it from the WebUI?
> > 
> >  Best,
> >  Muhammet
> > 
> >  [1]:
> > 
> > 
> >
> https://urldefense.com/v3/__https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/datastream/execution_mode/__;!!IKRxdwAv5BmarQ!eFyqVJyje_8hu1vMSUwKGBsj8vqsFDisEvJ5AxPV0LduhhHWF3rPKYEEE-09biA0unFbfMy5AVQZMgBv1AOa5oTHmcYlkUE$
> > 
> > 
> >  On 2024-03-27 06:23, Rui Fan wrote:
> >  > Hi flink developers,
> >  >
> >  > I'd like to start a discussion to discuss FLIP-441:
> >  > Show the JobType and remove Execution Mode on Flink WebUI[1].
> >  >
> >  > Currently, the jobType has 2 types in Flink: STREAMING and BATCH.
> >  > They work on completely different principles, such as: scheduler,
> >  > shuffle, join, etc. These differences lead to different
> > troubleshooting
> >  > processes, so when users are maintaining a job or troubleshooting,
> >  > it's needed to know whether the current job is a STREAMING or
> >  > BATCH job. Unfortunately, Flink WebUI doesn't expose it to the
> >  > users so far.
> >  >
> >  > Also, Execution Mode is related to DataSet api, it has been marked
> >  > as @Deprecated in FLINK-32258 (1.18), but it's still shown in
> Flink
> >  > WebUI.
> >  >
> >  > Looking 

[jira] [Created] (FLINK-35050) Remove Lazy Initialization of DynamoDbBeanElementConverter

2024-04-08 Thread Ahmed Hamdy (Jira)
Ahmed Hamdy created FLINK-35050:
---

 Summary: Remove Lazy Initialization of DynamoDbBeanElementConverter
 Key: FLINK-35050
 URL: https://issues.apache.org/jira/browse/FLINK-35050
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / DynamoDB
Affects Versions: aws-connector-4.3.0
Reporter: Ahmed Hamdy


h2. Description

{{DynamoDbBeanElementConverter}} implements {{ElementConverter}}, which now 
supports an {{open}} method as of FLINK-29938; we should remove the lazy 
initialization given that it is now unblocked.
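
A rough sketch of what the eager initialization could look like; it assumes the 
{{ElementConverter#open}} hook from FLINK-29938 and the AWS SDK v2 
enhanced-client bean schema, and is untested:
{code:java}
import org.apache.flink.api.connector.sink2.Sink;
import org.apache.flink.api.connector.sink2.SinkWriter;
import org.apache.flink.connector.base.sink.writer.ElementConverter;
import org.apache.flink.connector.dynamodb.sink.DynamoDbWriteRequest;
import org.apache.flink.connector.dynamodb.sink.DynamoDbWriteRequestType;

import software.amazon.awssdk.enhanced.dynamodb.TableSchema;

public class EagerBeanElementConverter<T> implements ElementConverter<T, DynamoDbWriteRequest> {

    private final Class<T> recordType;
    private transient TableSchema<T> tableSchema; // not serializable, so built in open()

    public EagerBeanElementConverter(Class<T> recordType) {
        this.recordType = recordType;
    }

    @Override
    public void open(Sink.InitContext context) {
        // Eager creation replaces the previous lazy null-check inside apply().
        tableSchema = TableSchema.fromBean(recordType);
    }

    @Override
    public DynamoDbWriteRequest apply(T element, SinkWriter.Context context) {
        return DynamoDbWriteRequest.builder()
                .setType(DynamoDbWriteRequestType.PUT)
                .setItem(tableSchema.itemToMap(element, false))
                .build();
    }
}
{code}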



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


RE: [DISCUSS] FLIP-XXX Apicurio-avro format

2024-04-08 Thread David Radley
Hi,
I have posted a Google Doc [0] to the mailing list for a discussion thread for 
a FLIP proposal to introduce an Apicurio-avro format. The discussions have been 
resolved; please could a committer/PMC member copy the contents from the Google 
Doc and create a FLIP number for this, as per the process [1].
  Kind regards, David.
[0]
https://docs.google.com/document/d/14LWZPVFQ7F9mryJPdKXb4l32n7B0iWYkcOdEd1xTC7w/edit?usp=sharing

[1]
https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals#FlinkImprovementProposals-CreateyourOwnFLIP

From: Jeyhun Karimov 
Date: Friday, 22 March 2024 at 13:05
To: dev@flink.apache.org 
Subject: [EXTERNAL] Re: [DISCUSS] FLIP-XXX Apicurio-avro format
Hi David,

Thanks a lot for clarification.
Sounds good to me.

Regards,
Jeyhun

On Fri, Mar 22, 2024 at 10:54 AM David Radley 
wrote:

> Hi Jeyhun,
> Thanks for your feedback.
>
> So for outbound messages, the message includes the global ID. We register
> the schema and match on the artifact id. So if the schema then evolved,
> adding a new version, the global ID would still be unique and the same
> version would be targeted. If you wanted to change the Flink table
> definition in line with a higher version, then you could do this – the
> artifact id would need to match for it to use the same schema and a higher
> artifact version would need to be provided. I notice that Apicurio has
> rules around compatibility that you can configure; I suppose if we attempt
> to create an artifact that breaks these rules, then the schema registration
> will fail and the associated operation should fail (e.g. an insert). I have
> not tried this.
>
>
> For inbound messages, using the global id in the header – this targets one
> version of the schema. I can create different messages on the topic built
> with different schema versions, and I can create different tables in Flink,
> as long as the reader and writer schemas are compatible as per the
> https://github.com/apache/flink/blob/779459168c46b7b4c600ef52f99a5435f81b9048/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/RegistryAvroDeserializationSchema.java#L109
> Then this should work.
>
> Does this address your question?
> Kind regards, David.
>
>
> From: Jeyhun Karimov 
> Date: Thursday, 21 March 2024 at 21:06
> To: dev@flink.apache.org 
> Subject: [EXTERNAL] Re: [DISCUSS] FLIP-XXX Apicurio-avro format
> Hi David,
>
> Thanks for the FLIP. +1 for it.
> I have a minor comment.
>
> Can you please elaborate more on mechanisms in place to ensure data
> consistency and integrity, particularly in the event of schema conflicts?
> Since each message includes a schema ID for inbound and outbound messages,
> can you elaborate more on message consistency in the context of schema
> evolution?
>
> Regards,
> Jeyhun
>
>
>
>
>
> On Wed, Mar 20, 2024 at 4:34 PM David Radley  wrote:
>
> > Thank you very much for your feedback Mark. I have made the changes in
> the
> > latest google document. On reflection I agree with you that the
> > globalIdPlacement format configuration should apply to the
> deserialization
> > as well, so it is declarative. I am also going to have a new
> configuration
> > option to work with content IDs as well as global IDs. In line with the
> > deser Apicurio IdHandler and headerHandlers.
> >
> >  kind regards, David.
> >
> >
> > On 2024/03/20 15:18:37 Mark Nuttall wrote:
> > > +1 to this
> > >
> > > A few small comments:
> > >
> > > Currently, if users have Avro schemas in an Apicurio Registry (an open
> > source Apache 2 licensed schema registry), then the natural way to work
> > with those Avro flows is to use the schemas in the Apicurio Repository.
> > > 'those Avro flows' ... this is the first reference to flows.
> > >
> > > The new format will use the global Id to look up the Avro schema that
> > the message was written during deserialization.
> > > I get the point, phrasing is awkward. Probably you're more interested
> in
> > content than word polish at this point though.
> > >
> > > The Avro Schema Registry (apicurio-avro) format
> > > The Confluent format is called avro-confluent; this should be
> > avro-apicurio
> > >
> > > How to create tables with Apicurio-avro format
> > > s/Apicurio-avro/avro-apicurio/g
> > >
> > > HEADER – globalId is put in the header
> > > LEGACY– global Id is put in the message as a long
> > > CONFLUENT - globalId is put in the message as an int.
> > > Please could we specify 'four-byte int' and 'eight-byte long' ?
> > >
> > > For a Kafka source the globalId will be looked for in this order:
> > > - In the header
> > > - After a magic byte as an int
> > > - After a magic byte as a long.
> > > but apicurio-avro.globalid-placement has a default value of HEADER :
> why
> > do we have a search order as well? Isn't apicurio-avro.globalid-placement
> > enough? Don't the two mechanisms conflict?
> > >
> > > In addition to the types listed there, Flink supports reading/writing
> > nullable types. 

Re: [DISCUSS] FLIP-XXX Apicurio-avro format

2024-04-08 Thread David Radley
Hi,
The discussions seem to have ended. I have made some minor changes to the 
document [1]; I would like to leave the discussion open until the end of this 
week. If there are no more comments, I will ask for the FLIP to be copied 
into the standard FLIP location and start the voting next week.

[1] 
https://docs.google.com/document/d/14LWZPVFQ7F9mryJPdKXb4l32n7B0iWYkcOdEd1xTC7w/edit?usp=sharing


  Kind regards, David


From: David Radley 
Date: Wednesday, 20 March 2024 at 11:03
To: dev@flink.apache.org 
Subject: [EXTERNAL] [DISCUSS] FLIP-XXX Apicurio-avro format
Hi,
As per the FLIP process I would like to raise a FLIP, but do not have 
authority to create one, so I have created a Google Doc for the FLIP to 
introduce a new Apicurio Avro format. The document is 
https://docs.google.com/document/d/14LWZPVFQ7F9mryJPdKXb4l32n7B0iWYkcOdEd1xTC7w/edit?usp=sharing

I have prototyped a lot of the content to prove that this approach is feasible. 
I look forward to the discussion,
  Kind regards, David.



Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU



[jira] [Created] (FLINK-35049) Implement Async State API for ForStStateBackend

2024-04-08 Thread Hangxiang Yu (Jira)
Hangxiang Yu created FLINK-35049:


 Summary: Implement Async State API for ForStStateBackend
 Key: FLINK-35049
 URL: https://issues.apache.org/jira/browse/FLINK-35049
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / State Backends
Reporter: Hangxiang Yu






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35048) Implement all methods of AsyncKeyedStateBackend

2024-04-08 Thread Hangxiang Yu (Jira)
Hangxiang Yu created FLINK-35048:


 Summary: Implement all methods of AsyncKeyedStateBackend
 Key: FLINK-35048
 URL: https://issues.apache.org/jira/browse/FLINK-35048
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / State Backends
Reporter: Hangxiang Yu






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35047) Introduce ForStStateBackend

2024-04-08 Thread Hangxiang Yu (Jira)
Hangxiang Yu created FLINK-35047:


 Summary: Introduce ForStStateBackend
 Key: FLINK-35047
 URL: https://issues.apache.org/jira/browse/FLINK-35047
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / State Backends
Reporter: Hangxiang Yu
Assignee: Hangxiang Yu






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35046) Introduce New KeyedStateBackend related Async interfaces

2024-04-08 Thread Hangxiang Yu (Jira)
Hangxiang Yu created FLINK-35046:


 Summary: Introduce New KeyedStateBackend related Async interfaces
 Key: FLINK-35046
 URL: https://issues.apache.org/jira/browse/FLINK-35046
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / State Backends
Reporter: Hangxiang Yu


Since we have introduced the new State API, async versions of some classes 
should be introduced to support it, e.g. AsyncKeyedStateBackend and a new 
StateDescriptor.
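
For a rough idea of the shape such async interfaces could take, a hypothetical 
sketch using {{CompletableFuture}} as a stand-in for whatever future type the 
new API ends up defining:
{code:java}
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch only - not the actual Flink API. */
public interface AsyncValueState<T> {

    /** Request the current value without blocking the task thread. */
    CompletableFuture<T> asyncValue();

    /** Request an update; completion signals the write was accepted. */
    CompletableFuture<Void> asyncUpdate(T value);
}
{code}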



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35045) Introduce ForStFileSystem to support reading and writing with ByteBuffer

2024-04-08 Thread Hangxiang Yu (Jira)
Hangxiang Yu created FLINK-35045:


 Summary: Introduce ForStFileSystem to support reading and writing 
with ByteBuffer
 Key: FLINK-35045
 URL: https://issues.apache.org/jira/browse/FLINK-35045
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / State Backends
Reporter: Hangxiang Yu
Assignee: Hangxiang Yu






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35044) Introduce statebackend-forst module

2024-04-08 Thread Hangxiang Yu (Jira)
Hangxiang Yu created FLINK-35044:


 Summary: Introduce statebackend-forst module
 Key: FLINK-35044
 URL: https://issues.apache.org/jira/browse/FLINK-35044
 Project: Flink
  Issue Type: Sub-task
Reporter: Hangxiang Yu
Assignee: Hangxiang Yu






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35043) Release beta version of ForSt

2024-04-08 Thread Hangxiang Yu (Jira)
Hangxiang Yu created FLINK-35043:


 Summary: Release beta version of ForSt
 Key: FLINK-35043
 URL: https://issues.apache.org/jira/browse/FLINK-35043
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / State Backends
Reporter: Hangxiang Yu
Assignee: Hangxiang Yu






--
This message was sent by Atlassian Jira
(v8.20.10#820010)