Re: Request to open the contributor permission!

2021-06-08 Thread Yun Tang
Hi hapihu, Welcome to the Apache Flink community! You no longer need to ask for contributor permission for Flink JIRA issues; you can comment on the issue you're interested in and ask to be assigned. You can find more details in [1]. [1] https://flink.apache.org/contributing/how-to-con

[jira] [Created] (FLINK-22940) Make SQL column max column widh configurable

2021-06-08 Thread Svend Vanderveken (Jira)
Svend Vanderveken created FLINK-22940: - Summary: Make SQL column max column widh configurable Key: FLINK-22940 URL: https://issues.apache.org/jira/browse/FLINK-22940 Project: Flink Issue

[jira] [Created] (FLINK-22939) Generalize JDK switch in azure setup

2021-06-08 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-22939: Summary: Generalize JDK switch in azure setup Key: FLINK-22939 URL: https://issues.apache.org/jira/browse/FLINK-22939 Project: Flink Issue Type: Impr

Re: [DISCUSS] Definition of idle partitions

2021-06-08 Thread Piotr Nowojski
Hi Eron, Can you elaborate a bit more on what you mean? I don't understand what you mean by a more general solution. As of now, a stream is marked idle by a source/watermark generator, which has the effect of temporarily excluding this stream/partition from the min watermark calculation in the downs
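The behaviour described in this message (an idle stream is temporarily left out of the minimum-watermark calculation) can be illustrated with a minimal sketch. This is not Flink's implementation; the function name `combined_watermark` and the `(watermark, is_idle)` input shape are assumptions made purely for illustration.

```python
from typing import Optional, Sequence, Tuple


def combined_watermark(inputs: Sequence[Tuple[int, bool]]) -> Optional[int]:
    """Return the minimum watermark over all non-idle inputs.

    `inputs` holds hypothetical (watermark, is_idle) pairs, one per
    input stream/partition. Idle inputs are skipped so they cannot hold
    back progress. If every input is idle, there is nothing to
    aggregate and None is returned.
    """
    active = [watermark for watermark, is_idle in inputs if not is_idle]
    return min(active) if active else None
```

With inputs `[(10, False), (3, True), (7, False)]` the idle input at 3 is ignored, so the combined watermark is 7 rather than 3.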

[jira] [Created] (FLINK-22938) Slot request bulk is not fulfillable! Could not allocate the required slot within slot request timeout

2021-06-08 Thread Bhagi (Jira)
Bhagi created FLINK-22938: - Summary: Slot request bulk is not fulfillable! Could not allocate the required slot within slot request timeout Key: FLINK-22938 URL: https://issues.apache.org/jira/browse/FLINK-22938

[jira] [Created] (FLINK-22937) rocksdb cause jvm to crash

2021-06-08 Thread Piers (Jira)
Piers created FLINK-22937: - Summary: rocksdb cause jvm to crash Key: FLINK-22937 URL: https://issues.apache.org/jira/browse/FLINK-22937 Project: Flink Issue Type: Bug Components: Runtime /

Re: [DISCUSS] Definition of idle partitions

2021-06-08 Thread Eron Wright
It seems to me that idleness was introduced to deal with a very specific issue. In the pipeline, watermarks are aggregated not on a per-split basis but on a per-subtask basis. This works well when each subtask has exactly one split. When a sub-task has multiple splits, various complications occu

Re: Add control mode for flink

2021-06-08 Thread Xintong Song
> > 2. There are two kinds of existing special elements, special stream > records (e.g. watermarks) and events (e.g. checkpoint barrier). They all > flow through the whole DAG, but events needs to be acknowledged by > downstream and can overtake records, while stream records are not). So I’m > wond

Re: Add control mode for flink

2021-06-08 Thread Steven Wu
> producing control events from JobMaster is similar to triggering a savepoint. Paul, here is the difference as I see it. Upon job or jobmanager recovery, we don't need to recover and replay the savepoint trigger signal. On Tue, Jun 8, 2021 at 8:20 PM Paul Lam wrote: > +1 for this feature. Setti

Re: Add control mode for flink

2021-06-08 Thread Paul Lam
+1 for this feature. Setting up a separate control stream is too much for many use cases; it would be very helpful if users could leverage the built-in control flow of Flink. My 2 cents: 1. @Steven IMHO, producing control events from JobMaster is similar to triggering a savepoint. The REST api is no

Re: Re: Add control mode for flink

2021-06-08 Thread Steven Wu
option 2 is probably not feasible, as checkpoint may take a long time or may fail. Option 1 might work, although it complicates the job recovery and checkpoint. After checkpoint completion, we need to clean up those control signals stored in HA service. On Tue, Jun 8, 2021 at 1:14 AM 刘建刚 wrote:

Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-08 Thread Yangze Guo
Thanks for the valuable suggestion, Arvid. 1) Yes, we can add a new SlotSharingGroup which includes the name and its resource. After that, we have two interfaces for configuring the slot sharing group of an operator: - #slotSharingGroup(String name)// the resource of it can be configured throu

[jira] [Created] (FLINK-22936) Support column comment in Schema and ResolvedSchema

2021-06-08 Thread Jark Wu (Jira)
Jark Wu created FLINK-22936: --- Summary: Support column comment in Schema and ResolvedSchema Key: FLINK-22936 URL: https://issues.apache.org/jira/browse/FLINK-22936 Project: Flink Issue Type: New Fea

Re: [DISCUSS] Limit size of already processed files in File Source SplitEnumerator

2021-06-08 Thread Tianxin Zhao
Thanks Till, Guowei and Arvid for the insightful discussion! 1. Regarding size and scan performance: We are in the POC stage and have not hit an OOM issue yet; the issue was discovered by reading through the FileSource implementation. Our order of magnitude is 200B per path and ~8000 files

[jira] [Created] (FLINK-22935) Can not start standalone cluster

2021-06-08 Thread Lyubing Qiang (Jira)
Lyubing Qiang created FLINK-22935: - Summary: Can not start standalone cluster Key: FLINK-22935 URL: https://issues.apache.org/jira/browse/FLINK-22935 Project: Flink Issue Type: Bug

Re: Request to open the contributor permission!

2021-06-08 Thread Yangze Guo
Hi, Welcome to the community! You don't need a contributor's permission to contribute to Apache Flink. Simply find a JIRA ticket you'd like to work on and ask a committer to assign you to the ticket. You can refer to the contribution guidelines [1]. [1] https://flink.apache.org/contributing/how-t

[jira] [Created] (FLINK-22934) Add instructions for using the " ' " escape syntax of SQL client

2021-06-08 Thread Roc Marshal (Jira)
Roc Marshal created FLINK-22934: --- Summary: Add instructions for using the " ' " escape syntax of SQL client Key: FLINK-22934 URL: https://issues.apache.org/jira/browse/FLINK-22934 Project: Flink

[jira] [Created] (FLINK-22933) Upgrade the Flink Fabric8io/kubernetes-client version to >=5.4.0 to be FIPS compliant

2021-06-08 Thread Fuyao Li (Jira)
Fuyao Li created FLINK-22933: Summary: Upgrade the Flink Fabric8io/kubernetes-client version to >=5.4.0 to be FIPS compliant Key: FLINK-22933 URL: https://issues.apache.org/jira/browse/FLINK-22933 Project

Re: [VOTE] Watermark propagation with Sink API

2021-06-08 Thread Eron Wright
Voting is re-open for FLIP-167 as-is (without idleness support as was the point of contention). On Fri, Jun 4, 2021 at 10:45 AM Eron Wright wrote: > Little update on this, more good discussion over the last few days, and > the FLIP will probably be amended to incorporate idleness. Voting will

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-08 Thread Eron Wright
Thanks, the narrowed FLIP-167 is fine for now. I'll re-activate the vote process. Thanks! On Tue, Jun 8, 2021 at 3:01 AM Till Rohrmann wrote: > Hi everyone, > > I do agree that Flink's definition of idleness is not fully thought through > yet. Consequently, I would feel a bit uneasy to make it

[jira] [Created] (FLINK-22932) RocksDBStateBackendWindowITCase fails with savepoint timeout

2021-06-08 Thread Roman Khachatryan (Jira)
Roman Khachatryan created FLINK-22932: - Summary: RocksDBStateBackendWindowITCase fails with savepoint timeout Key: FLINK-22932 URL: https://issues.apache.org/jira/browse/FLINK-22932 Project: Flink

[jira] [Created] (FLINK-22931) Migrate to flink-shaded-force-shading

2021-06-08 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-22931: Summary: Migrate to flink-shaded-force-shading Key: FLINK-22931 URL: https://issues.apache.org/jira/browse/FLINK-22931 Project: Flink Issue Type: Tec

Request to open the contributor permission!

2021-06-08 Thread w.gh123
Hi, I want to contribute to Apache Flink. Would you please give me the contributor permission? My JIRA ID is hapihu

[jira] [Created] (FLINK-22930) [flink-python]: Resources should be closed

2021-06-08 Thread wuguihu (Jira)
wuguihu created FLINK-22930: --- Summary: [flink-python]: Resources should be closed Key: FLINK-22930 URL: https://issues.apache.org/jira/browse/FLINK-22930 Project: Flink Issue Type: Bug Co

Re: [DISCUSS] FLIP-147: Support Checkpoints After Tasks Finished

2021-06-08 Thread Piotr Nowojski
Hi, Thanks for resuming this discussion. I think +1 for the proposal of dropping (deprecating) `dispose()`, and adding `flush()` to the `StreamOperator`/udfs. Semantically it would be more like new `close()` is an equivalent of old `dispose()`. Old `close()` is an equivalent of new `flush() + clos

[jira] [Created] (FLINK-22929) Change the default failover strategy to FixDelayRestartStrategy

2021-06-08 Thread Yun Gao (Jira)
Yun Gao created FLINK-22929: --- Summary: Change the default failover strategy to FixDelayRestartStrategy Key: FLINK-22929 URL: https://issues.apache.org/jira/browse/FLINK-22929 Project: Flink Issue

Re: [DISCUSS] Definition of idle partitions

2021-06-08 Thread Piotr Nowojski
Hi Arvid, Thanks for writing down this summary and proposal. I think this was the foundation of the disagreement in FLIP-167 discussion. Dawid was arguing that idleness is intermittent, strictly a task local concept and as such shouldn't be exposed in for example sinks. While me and Eron thought t

Re: Flink 1.14. Bi-weekly 2021-06-08

2021-06-08 Thread Till Rohrmann
Thanks for this update Joe, Dawid and Xintong. This is super helpful! Cheers, Till On Tue, Jun 8, 2021 at 4:18 PM Johannes Moser wrote: > Hi, > > Today we had our first bi-weekly. > Here’s a short summary of what has been discussed. > > Please watch the 1.14. Release page [1] to stay up to date

[jira] [Created] (FLINK-22928) Unexpected exception happens in RecordWriter when stopping-with-savepoint

2021-06-08 Thread Yun Gao (Jira)
Yun Gao created FLINK-22928: --- Summary: Unexpected exception happens in RecordWriter when stopping-with-savepoint Key: FLINK-22928 URL: https://issues.apache.org/jira/browse/FLINK-22928 Project: Flink

[DISCUSS] Definition of idle partitions

2021-06-08 Thread Arvid Heise
Hi devs, While discussing "Watermark propagation with Sink API" and during "[FLINK-18934] Idle stream does not advance watermark in connected stream", we noticed some drawbacks on how Flink defines idle partitions currently. To recap, idleness was always considered as a means to achieve progress

Flink 1.14. Bi-weekly 2021-06-08

2021-06-08 Thread Johannes Moser
Hi, Today we had our first bi-weekly. Here's a short summary of what has been discussed. Please watch the 1.14. Release page [1] to stay up to date. * Feature freeze date * In response to our last email, the question was raised whether to push the feature freeze date back by a month, which would mean ear

[jira] [Created] (FLINK-22927) Exception on JobClient.get_job_status().result()

2021-06-08 Thread Maciej Bryński (Jira)
Maciej Bryński created FLINK-22927: -- Summary: Exception on JobClient.get_job_status().result() Key: FLINK-22927 URL: https://issues.apache.org/jira/browse/FLINK-22927 Project: Flink Issue Ty

[jira] [Created] (FLINK-22926) IDLE source should go ACTIVE when registering a new split

2021-06-08 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-22926: Summary: IDLE source should go ACTIVE when registering a new split Key: FLINK-22926 URL: https://issues.apache.org/jira/browse/FLINK-22926 Project: Flink

[jira] [Created] (FLINK-22925) "FieldDescriptor does not match message type" ERROR when use protobuf-router

2021-06-08 Thread Bill lee (Jira)
Bill lee created FLINK-22925: Summary: "FieldDescriptor does not match message type" ERROR when use protobuf-router Key: FLINK-22925 URL: https://issues.apache.org/jira/browse/FLINK-22925 Project: Flink

[jira] [Created] (FLINK-22924) Expose create_local_environment in PyFlink

2021-06-08 Thread Maciej Bryński (Jira)
Maciej Bryński created FLINK-22924: -- Summary: Expose create_local_environment in PyFlink Key: FLINK-22924 URL: https://issues.apache.org/jira/browse/FLINK-22924 Project: Flink Issue Type: Bu

Re: [DISCUSS] FLIP-148: Introduce Sort-Merge Based Blocking Shuffle to Flink

2021-06-08 Thread Till Rohrmann
Great :-) On Tue, Jun 8, 2021 at 1:11 PM Yingjie Cao wrote: > Hi Till, > > Thanks for the suggestion. The blog post is already on the way. > > Best, > Yingjie > > On Tue, Jun 8, 2021 at 5:30 PM Till Rohrmann wrote: > >> Thanks for the update Yingjie. Would it make sense to write a short blog >> post about thi

Re: [DISCUSS] Limit size of already processed files in File Source SplitEnumerator

2021-06-08 Thread Arvid Heise
Hi Tianxin, I assigned you the ticket, so you could go ahead and create some POC PR. I would like to understand the issue first a bit better and then give some things to consider. In general, I see your point that in a potentially infinitely running application keeping track of all read entities w

Re: How to unsubscribe?

2021-06-08 Thread Leonard Xu
Hi Morgan, Sending an email with any content to user-unsubscr...@flink.apache.org will unsubscribe you from the Flink user mailing list. To leave the dev list as well, send an email with any content to dev-unsubscr...@flink.apache.org

[jira] [Created] (FLINK-22923) Queryable state (rocksdb) with TM restart end-to-end test unstable

2021-06-08 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-22923: -- Summary: Queryable state (rocksdb) with TM restart end-to-end test unstable Key: FLINK-22923 URL: https://issues.apache.org/jira/browse/FLINK-22923 Project: Flink

[jira] [Created] (FLINK-22922) Migrate flink website to hugo

2021-06-08 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-22922: Summary: Migrate flink website to hugo Key: FLINK-22922 URL: https://issues.apache.org/jira/browse/FLINK-22922 Project: Flink Issue Type: Improvement

Re: [DISCUSS] FLIP-148: Introduce Sort-Merge Based Blocking Shuffle to Flink

2021-06-08 Thread Yingjie Cao
Hi Till, Thanks for the suggestion. The blog post is already on the way. Best, Yingjie On Tue, Jun 8, 2021 at 5:30 PM Till Rohrmann wrote: > Thanks for the update Yingjie. Would it make sense to write a short blog > post about this feature including some performance improvement numbers? I > think this could

Re: [DISCUSS] Limit size of already processed files in File Source SplitEnumerator

2021-06-08 Thread Guowei Ma
It would really simplify a lot if the modification timestamp of each newly scanned file is increased. We only need to record the file list corresponding to the largest timestamp. Timestamp of each scanned file 1. It is smaller than the maximum timestamp, which means it has been processed; 2.
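The idea in this message (keep only the largest modification timestamp seen so far plus the files that carry it) can be sketched as follows. The preview is cut off after the first case, so the handling of timestamps equal to or greater than the recorded maximum is an assumption, as are the function name `select_new_files` and its argument shapes. The sketch also relies on the message's premise that newly arriving files never carry an older timestamp.

```python
from typing import Iterable, List, Set, Tuple


def select_new_files(
    scanned: Iterable[Tuple[str, int]],
    max_ts: int,
    files_at_max: Set[str],
) -> Tuple[List[Tuple[str, int]], int, Set[str]]:
    """Return (new_files, updated_max_ts, updated_files_at_max).

    Assumed rules:
      * timestamp < max_ts: already processed, skip;
      * timestamp == max_ts: new only if the path is not recorded yet;
      * timestamp > max_ts: new.
    """
    new_files = []
    for path, ts in scanned:
        if ts < max_ts:
            continue
        if ts == max_ts and path in files_at_max:
            continue
        new_files.append((path, ts))

    if not new_files:
        return [], max_ts, set(files_at_max)

    new_max = max(ts for _, ts in new_files)
    at_new_max = {path for path, ts in new_files if ts == new_max}
    if new_max == max_ts:
        # The maximum did not move, so keep the previously recorded paths too.
        at_new_max |= set(files_at_max)
    return new_files, new_max, at_new_max
```

The retained state is bounded by the number of files sharing the newest timestamp, instead of growing with every file ever processed.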

Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-08 Thread Arvid Heise
Hi Yangze, I like the general approach to bind requirements to slotsharing groups. I think the current approach is also flexible enough that a user could simply use ParameterTool or similar to use config values and wire that with their slotgroups, such that different requirements can be tested wit

Re: [DISCUSS][Statebackend][Runtime] Changelog Statebackend Configuration Proposal

2021-06-08 Thread Yu Li
+1 for option 3. IMHO persisting (operator's) state data through change log is an independent mechanism which could co-work with all kinds of local state stores (heap and rocksdb). This mechanism is similar to the WAL (write-ahead-log) mechanism in the database system. Although implement-wise we'r

How to unsubscribe?

2021-06-08 Thread Geldenhuys , Morgan Karl
How can I unsubscribe from this mailing list? The volume is just getting too much at the moment. Following the steps described on the website (https://flink.apache.org/community.html) did not appear to do anything. Sorry for the spam and thanks in advance.

[jira] [Created] (FLINK-22921) SQL Client can't resolve the escape of "'" correctly.

2021-06-08 Thread Roc Marshal (Jira)
Roc Marshal created FLINK-22921: --- Summary: SQL Client can't resolve the escape of "'" correctly. Key: FLINK-22921 URL: https://issues.apache.org/jira/browse/FLINK-22921 Project: Flink Issue Typ

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-08 Thread Till Rohrmann
Hi everyone, I do agree that Flink's definition of idleness is not fully thought through yet. Consequently, I would feel a bit uneasy to make it part of Flink's API right now. Instead, defining the proper semantics first and then exposing it sounds like a good approach forward. Hence, +1 for optio

Re: [DISCUSS] Limit size of already processed files in File Source SplitEnumerator

2021-06-08 Thread Till Rohrmann
Hi Tianxin, thanks for starting this discussion. I am pulling in Arvid who works on Flink's connectors. I think the problem you are describing can happen. From what I understand you are proposing to keep track of the watermark of processed file input splits and then filter out splits based on t

[jira] [Created] (FLINK-22920) Guava version conflict in flink-format module

2021-06-08 Thread sujun (Jira)
sujun created FLINK-22920: - Summary: Guava version conflict in flink-format module Key: FLINK-22920 URL: https://issues.apache.org/jira/browse/FLINK-22920 Project: Flink Issue Type: Bug Com

[jira] [Created] (FLINK-22919) Remove support for Hadoop1.x in HadoopInputFormatCommonBase.getCredentialsFromUGI

2021-06-08 Thread Junfan Zhang (Jira)
Junfan Zhang created FLINK-22919: Summary: Remove support for Hadoop1.x in HadoopInputFormatCommonBase.getCredentialsFromUGI Key: FLINK-22919 URL: https://issues.apache.org/jira/browse/FLINK-22919 Pro

Re: [DISCUSS] Feedback Collection Jira Bot

2021-06-08 Thread Till Rohrmann
I like this idea. It would then be the responsibility of the component maintainers to manage the lifecycle explicitly. Cheers, Till On Mon, Jun 7, 2021 at 1:48 PM Arvid Heise wrote: > One more idea for the bot. Could we have a label to exclude certain tickets > from the life-cycle? > > I'm thin

Re: [DISCUSS] FLIP-148: Introduce Sort-Merge Based Blocking Shuffle to Flink

2021-06-08 Thread Till Rohrmann
Thanks for the update Yingjie. Would it make sense to write a short blog post about this feature including some performance improvement numbers? I think this could be interesting to our users. Cheers, Till On Mon, Jun 7, 2021 at 4:49 AM Jingsong Li wrote: > Thanks Yingjie for the great effort!

[DISCUSS]FLIP-170 Adding Checkpoint Rejection Mechanism

2021-06-08 Thread Senhong Liu
Hi guys, We would like to start a discussion on the new FLIP about rejecting checkpoints on the operator level. The basic idea is to allow the operator to reject a checkpoint when it is not in a suitable state, and to return a proper failure reason. http://cwiki.apache.org/confluence/display/

Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-08 Thread Yangze Guo
@Yang In short, the external resources will participate in resource deduction and be logically ensured, but requesting an external resource must still be done through config options with the current default resource allocation strategy. In FLIP-56, we abstract the logic of resource allocation to th

[jira] [Created] (FLINK-22918) StreamingFileSink does not commit partition when no message is sent between checkpoints

2021-06-08 Thread lihe ma (Jira)
lihe ma created FLINK-22918: --- Summary: StreamingFileSink does not commit partition when no message is sent between checkpoints Key: FLINK-22918 URL: https://issues.apache.org/jira/browse/FLINK-22918 Project

Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-08 Thread Yang Wang
Thanks @Yangze for preparing this FLIP. I think this is a good starting point for community users to get a taste of fine-grained resource management, which we all believe could improve Flink job stability and cluster utilization. I have a simple question about the extended resources.

Re: Re: Add control mode for flink

2021-06-08 Thread 刘建刚
Thanks for the reply. It is a good question. There are multiple options, as follows: 1. We can persist control signals in HighAvailabilityServices and replay them after failover. 2. Only tell the users that the control signals take effect after they are checkpointed. Steven Wu [via Apach

Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-08 Thread Yangze Guo
@Xintong > introduce a general approach for overwriting such job specifics without > re-compiling the job I think that would be a good direction. Just share some cents on this topic. I'd divide the job-level specifics into two categories: - Specifics which affect how Flink executes the job, e.g. "

[jira] [Created] (FLINK-22917) Dynamically change the log level of apache flink at runtime

2021-06-08 Thread pierrexiong (Jira)
pierrexiong created FLINK-22917: --- Summary: Dynamically change the log level of apache flink at runtime Key: FLINK-22917 URL: https://issues.apache.org/jira/browse/FLINK-22917 Project: Flink Is

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-08 Thread Piotr Nowojski
Hi Eron, The FLIP-167 is narrow, but we recently discovered some problems with current idleness semantics as Arvid explained. We are planning to present a new proposal to redefine them. Probably as a part of it, we would need to rename them. Given that, I think it doesn't make sense to expose idle

[jira] [Created] (FLINK-22916) Revisit and close JIRA issues around legacy planner

2021-06-08 Thread Timo Walther (Jira)
Timo Walther created FLINK-22916: Summary: Revisit and close JIRA issues around legacy planner Key: FLINK-22916 URL: https://issues.apache.org/jira/browse/FLINK-22916 Project: Flink Issue Typ