Re: [DISCUSS] Enabling dynamic partition discovery by default in Kafka source

2023-01-13 Thread Martijn Visser
This sounds like a no-brainer +1 Two things that seem to be obvious, but might be good to double check: 1. All newly discovered partitions will be consumed from the earliest offset possible. That's how it's documented for version 1.12 [1], but not for later versions, which is why I would like to

Re: [DISCUSS] Enabling dynamic partition discovery by default in Kafka source

2023-01-13 Thread John Roesler
Thanks for this proposal, Qingsheng! If you want to be a little conservative with the default, 5 minutes might be better than 30 seconds. The equivalent config in Kafka seems to be metadata.max.age.ms (https://kafka.apache.org/documentation/#consumerconfigs_metadata.max.age.ms), which has a de

[jira] [Created] (FLINK-30675) Decompose printing logic from Executor

2023-01-13 Thread Shengkai Fang (Jira)
Shengkai Fang created FLINK-30675: - Summary: Decompose printing logic from Executor Key: FLINK-30675 URL: https://issues.apache.org/jira/browse/FLINK-30675 Project: Flink Issue Type: Sub-task

Re: [DISCUSS] Enabling dynamic partition discovery by default in Kafka source

2023-01-13 Thread Gabor Somogyi
+1 on the overall direction, it's an important feature. I've had a look on the latest master and looks like removed partition handling is not yet added but I think this is essential. https://github.com/apache/flink/blob/28c3e1a3923ba560b559a216985c1abeb794ebaa/flink-connectors/flink-connector-kaf

Re: [VOTE] FLIP-281: Sink Supports Speculative Execution For Batch Job

2023-01-13 Thread Biao Liu
Hi everyone, I'm happy to announce that FLIP-281[1] has been accepted. Thanks for all your feedback and votes. Here is the voting result: +1 (binding), 3 in total: - Zhu Zhu - Lijie Wang - Jing Zhang +1 (non-binding), 2 in total: - yuxia - Jing Ge +0 (binding), 1 in total: - Martijn There are

Re: [VOTE] Apache Flink Table Store 0.3.0, release candidate #1

2023-01-13 Thread Jark Wu
+1 (binding) - Build and compile the source code locally: *OK* - Verified signatures and hashes: *OK* - Checked no missing artifacts in the staging area: *OK* - Reviewed the website release PR: *OK* - Checked the licenses: *OK* - Went through the quick start: *OK* * Verified with both flink 1.14

Re: [DISCUSS] Enabling dynamic partition discovery by default in Kafka source

2023-01-13 Thread Qingsheng Ren
Thanks everyone for joining the discussion! @Martijn: > All newly discovered partitions will be consumed from the earliest offset possible. Thanks for the reminder! I checked the logic of KafkaSource and found that new partitions will start from the offset initializer specified by the user inste

[jira] [Created] (FLINK-30676) Introduce Data Structures for table store

2023-01-13 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30676: Summary: Introduce Data Structures for table store Key: FLINK-30676 URL: https://issues.apache.org/jira/browse/FLINK-30676 Project: Flink Issue Type: Sub-tas

[jira] [Created] (FLINK-30677) SqlGatewayServiceStatementITCase.testFlinkSqlStatements fails

2023-01-13 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-30677: - Summary: SqlGatewayServiceStatementITCase.testFlinkSqlStatements fails Key: FLINK-30677 URL: https://issues.apache.org/jira/browse/FLINK-30677 Project: Flink

[jira] [Created] (FLINK-30678) NettyConnectionManagerTest.testManualConfiguration appears to be unstable

2023-01-13 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-30678: - Summary: NettyConnectionManagerTest.testManualConfiguration appears to be unstable Key: FLINK-30678 URL: https://issues.apache.org/jira/browse/FLINK-30678 Project:

Re: [DISCUSS] FLIP-286: Fix the API stability/scope annotation inconsistency in AbstractStreamOperator

2023-01-13 Thread Dawid Wysakowicz
Hi Becket, May I ask what is the motivation for the change? I'm really skeptical about making any of those classes `Public` or `PublicEvolving`. As far as I am concerned there is no agreement in the community if StreamOperator is part of the `Public(Evolving)` API. At least I think it should

Re: [DISCUSS] FLIP-276: Data Consistency of Streaming and Batch ETL in Flink and Table Store

2023-01-13 Thread Shammon FY
Hi Piotr, I discussed with @jinsong lee about `Timestamp Barrier` and `Aligned Checkpoint` for data consistency in FLIP, we think there are many defects indeed in using `Aligned Checkpoint` to support data consistency as you mentioned. According to our historical discussion, I think we have reach

[jira] [Created] (FLINK-30679) Can not load the data of hive dim table when project-push-down is introduced

2023-01-13 Thread hehuiyuan (Jira)
hehuiyuan created FLINK-30679: - Summary: Can not load the data of hive dim table when project-push-down is introduced Key: FLINK-30679 URL: https://issues.apache.org/jira/browse/FLINK-30679 Project: Flink

[DISCUSS] FLIP-287: Extend Sink#InitContext to expose ExecutionConfig and JobID

2023-01-13 Thread Joao Boto
Hi flink devs, I'd like to start a discussion thread for FLIP-287[1]. This comes from an offline discussion with @Lijie Wang, from FLIP-239[2] specially for the sink[3]. Basically to expose the ExecutionConfig and JobId on SinkV2#InitContext. This changes are necessary to correct migrate the cur

Re: [DISCUSS] Enabling dynamic partition discovery by default in Kafka source

2023-01-13 Thread Jing Ge
+1 for the proposal that makes users' daily work easier and therefore makes Flink more attractive. Best regards, Jing On Fri, Jan 13, 2023 at 11:27 AM Qingsheng Ren wrote: > Thanks everyone for joining the discussion! > > @Martijn: > > > All newly discovered partitions will be consumed from th

[jira] [Created] (FLINK-30680) Consider using the autoscaler to detect slow taskmanagers

2023-01-13 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-30680: -- Summary: Consider using the autoscaler to detect slow taskmanagers Key: FLINK-30680 URL: https://issues.apache.org/jira/browse/FLINK-30680 Project: Flink Issue T

Re: [VOTE] Apache Flink Table Store 0.3.0, release candidate #1

2023-01-13 Thread Jingsong Li
+1 (binding) Best, Jingsong On Fri, Jan 13, 2023 at 5:16 PM Jark Wu wrote: > > +1 (binding) > > - Build and compile the source code locally: *OK* > - Verified signatures and hashes: *OK* > - Checked no missing artifacts in the staging area: *OK* > - Reviewed the website release PR: *OK* > - Chec

[RESULT][VOTE] Apache Flink Table Store 0.3.0, release candidate #1

2023-01-13 Thread Jingsong Li
I'm happy to announce that we have unanimously approved this release. There are 3 approving votes, 3 of which are binding: * Yu Li (binding) * Jark Wu (binding) * Jingsong Lee (binding) There are no disapproving votes. Thank you for verifying the release candidate. I will now proceed to finalize

Re: [DISCUSS] Enabling dynamic partition discovery by default in Kafka source

2023-01-13 Thread Benchao Li
+1, we've enabled this by default (10mins) in our production for years. Jing Ge 于2023年1月13日周五 22:22写道: > +1 for the proposal that makes users' daily work easier and therefore makes > Flink more attractive. > > Best regards, > Jing > > > On Fri, Jan 13, 2023 at 11:27 AM Qingsheng Ren wrote: > >

[ANNOUNCE] Apache Flink Table Store 0.3.0 released

2023-01-13 Thread Jingsong Li
The Apache Flink community is very happy to announce the release of Apache Flink Table Store 0.3.0. Apache Flink Table Store is a unified storage to build dynamic tables for both streaming and batch processing in Flink, supporting high-speed data ingestion and timely data query. Please check out

Re: [DISCUSS] FLIP-287: Extend Sink#InitContext to expose ExecutionConfig and JobID

2023-01-13 Thread Gunnar Morling
Hey Joao, Thanks for this FLIP! One question on the proposed interface changes: is it expected that the configuration is *mutated* via the InitContext passed to Sink::createWriter()? If that's not the case, how about establishing a read-only contract representing the current configuration and pass

Re: [DISCUSS] FLIP-286: Fix the API stability/scope annotation inconsistency in AbstractStreamOperator

2023-01-13 Thread Becket Qin
Hi Dawid, Thanks for the reply. I am currently looking at the Beam Flink runner, and there are quite some hacks the Beam runner has to do in order to deal with the backwards incompatible changes in AbstractStreamOperator and some of the classes exposed by it. Regardless of what we think, the fact

Re: [DISCUSS] FLIP-287: Extend Sink#InitContext to expose ExecutionConfig and JobID

2023-01-13 Thread João Boto
Hi Gunnar, Thanks for your time and response... I think the problem you want to solve is the exposure of the ExecutionConfig (that can be mutated) no? The configuration is not mutated, we only need to know if objectReuse is enable. This is already expose on RuntimeContext we think to keep it s

[jira] [Created] (FLINK-30681) Pulsar-Flink connector corrupts its output topic

2023-01-13 Thread Jacek Wislicki (Jira)
Jacek Wislicki created FLINK-30681: -- Summary: Pulsar-Flink connector corrupts its output topic Key: FLINK-30681 URL: https://issues.apache.org/jira/browse/FLINK-30681 Project: Flink Issue Ty

Re: [DISCUSS] FLIP-286: Fix the API stability/scope annotation inconsistency in AbstractStreamOperator

2023-01-13 Thread Konstantin Knauf
Hi Becket, > It is a basic rule of public API that anything exposed by a public interface should also be public. I agree with this in general. Did you get an overview of where we currently violate this? Is this something that the Arc42 architecture tests could test for so that as a first measure

Re: [DISCUSS] FLIP-286: Fix the API stability/scope annotation inconsistency in AbstractStreamOperator

2023-01-13 Thread Becket Qin
I don't have an overview of all the holes in our public API surface at the moment. It would be great if there's some tool to do a scan. In addition to fixing the annotation consistency formally, I think it is equally, if not more, important to see whether the public APIs we expose tell a good stor

Re: [DISCUSS] FLIP-287: Extend Sink#InitContext to expose ExecutionConfig and JobID

2023-01-13 Thread Jing Ge
Hi Joao, Thanks for bringing this up. Exposing internal domain instances depends on your requirements. Technically, it is even possible to expose the RuntimeContext [1] (must be considered very carefully). Since you mentioned that you only need to know if objectReuse is enabled, how about just exp

Re: [DISCUSS] FLIP-286: Fix the API stability/scope annotation inconsistency in AbstractStreamOperator

2023-01-13 Thread Jing Ge
Hi Becket, Speaking of tools, we have ArchUnit integrated in Flink. Extending the defined ArchRules [1] a little bit, you will get the wished scan result. [1] https://github.com/apache/flink/blob/560b4612735a2b9cd3b5db88adf5cb223e85535b/flink-architecture-tests/flink-architecture-tests-production

[jira] [Created] (FLINK-30682) FLIP-283: Use adaptive batch scheduler as default scheduler for batch jobs

2023-01-13 Thread JunRui Li (Jira)
JunRui Li created FLINK-30682: - Summary: FLIP-283: Use adaptive batch scheduler as default scheduler for batch jobs Key: FLINK-30682 URL: https://issues.apache.org/jira/browse/FLINK-30682 Project: Flink

[jira] [Created] (FLINK-30683) Make adaptive batch scheduler as the default batch scheduler

2023-01-13 Thread Junrui Li (Jira)
Junrui Li created FLINK-30683: - Summary: Make adaptive batch scheduler as the default batch scheduler Key: FLINK-30683 URL: https://issues.apache.org/jira/browse/FLINK-30683 Project: Flink Issue

[jira] [Created] (FLINK-30684) Use the default parallelism as a flag for vertices that can automatically derive parallelism.

2023-01-13 Thread Junrui Li (Jira)
Junrui Li created FLINK-30684: - Summary: Use the default parallelism as a flag for vertices that can automatically derive parallelism. Key: FLINK-30684 URL: https://issues.apache.org/jira/browse/FLINK-30684

[jira] [Created] (FLINK-30685) Support mark the transformations whose parallelism is infected by the input transformation

2023-01-13 Thread Junrui Li (Jira)
Junrui Li created FLINK-30685: - Summary: Support mark the transformations whose parallelism is infected by the input transformation Key: FLINK-30685 URL: https://issues.apache.org/jira/browse/FLINK-30685

[jira] [Created] (FLINK-30686) Simplify the configuration of adaptive batch scheduler

2023-01-13 Thread Junrui Li (Jira)
Junrui Li created FLINK-30686: - Summary: Simplify the configuration of adaptive batch scheduler Key: FLINK-30686 URL: https://issues.apache.org/jira/browse/FLINK-30686 Project: Flink Issue Type:

Re: [DISCUSS] Enabling dynamic partition discovery by default in Kafka source

2023-01-13 Thread Yun Tang
+1 for this proposal and thanks Qingsheng for driving this. Considering the interval, we also set the value as 5min, equivalent to the default value of metadata.max.age.ms. Best Yun Tang From: Benchao Li Sent: Friday, January 13, 2023 23:06 To: dev@flink.apache

[jira] [Created] (FLINK-30687) FILTER not effect in count(*)

2023-01-13 Thread tanjialiang (Jira)
tanjialiang created FLINK-30687: --- Summary: FILTER not effect in count(*) Key: FLINK-30687 URL: https://issues.apache.org/jira/browse/FLINK-30687 Project: Flink Issue Type: Bug Compone