[jira] [Created] (FLINK-22872) Remove usages of legacy planner test utilities in Python

2021-06-03 Thread Timo Walther (Jira)
Timo Walther created FLINK-22872: Summary: Remove usages of legacy planner test utilities in Python Key: FLINK-22872 URL: https://issues.apache.org/jira/browse/FLINK-22872 Project: Flink

Re: [Discuss] Planning Flink 1.14

2021-06-03 Thread Xintong Song
Thanks everyone for the feedback. @Jing, Thanks for the inputs. Could you please ask a committer who works together with you on these items to fill them into the feature collecting wiki page [1]? I assume Jark, who co-edited the flip wiki page, is working with you? @Kurt, @Till and @Seth, First

[jira] [Created] (FLINK-22871) pyflink deploy support yarn application mode

2021-06-03 Thread konwu (Jira)
konwu created FLINK-22871: - Summary: pyflink deploy support yarn application mode Key: FLINK-22871 URL: https://issues.apache.org/jira/browse/FLINK-22871 Project: Flink Issue Type: New Feature

[jira] [Created] (FLINK-22870) Grouping sets + case when + constant string throws AssertionError

2021-06-03 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-22870: --- Summary: Grouping sets + case when + constant string throws AssertionError Key: FLINK-22870 URL: https://issues.apache.org/jira/browse/FLINK-22870 Project: Flink

Re: Integrating new connector with Flink SQL.

2021-06-03 Thread Jingsong Li
Hi santhosh, 1.Yes. 2.I'm very glad if you can contribute. Best, Jingsong On Fri, Jun 4, 2021 at 1:13 AM santhosh venkat wrote: > Hi, Jingsong, > > Thanks for taking time to respond to my questions. Really appreciate it. > > 1. Just to ensure we are on the same page, are you recommending us

[jira] [Created] (FLINK-22869) SQLClientSchemaRegistryITCase timeouts on azure

2021-06-03 Thread Xintong Song (Jira)
Xintong Song created FLINK-22869: Summary: SQLClientSchemaRegistryITCase timeouts on azure Key: FLINK-22869 URL: https://issues.apache.org/jira/browse/FLINK-22869 Project: Flink Issue Type:

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-03 Thread Eron Wright
I think the rationale for end-to-end idleness (i.e. between pipelines) is the same as the rationale for idleness between operators within a pipeline. On the 'main issue' you mentioned, we entrust the source with adapting to Flink's notion of idleness (e.g. in Pulsar source, it means that no

Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-03 Thread Márton Balassi
That is an interesting idea, Till. The main issue with it is that TLS certificates have an expiration time, usually they get approved for a couple years. Forcing our users to restart jobs to reprovision TLS certificates would be weird when we could just implement a single proper strong

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-03 Thread Arvid Heise
When I was rethinking the idleness issue, I came to the conclusion that it should be inferred at the source of the respective downstream pipeline again. The main issue on propagating idleness is that you would force the same definition across all downstream pipelines, which may not be what the

Re: Flattening of events

2021-06-03 Thread Chesnay Schepler
Have a look at flatMaps: https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/operators/overview/#datastream-rarr-datastream-1 On 6/3/2021 8:28 PM, Satish Saley wrote: Hi team, I am trying to figure out a way to flatten events in my Flink app. Event that i am

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-03 Thread Eron Wright
Thanks Piotr for bringing this up. I reflected on this and I agree we should expose idleness, otherwise a multi-stage flow could stall. Regarding the latency markers, I don't see an immediate need for propagating them, because they serve to estimate latency within a pipeline, not across

Flattening of events

2021-06-03 Thread Satish Saley
Hi team, I am trying to figure out a way to flatten events in my Flink app. Event that i am consuming from Kafka is UpperLevelData { int upperId; List listOfModules } ModuleData { int moduleId; string info; } After consuming this event, i want to flatten it out in following format -

Re: Integrating new connector with Flink SQL.

2021-06-03 Thread santhosh venkat
Hi, Jingsong, Thanks for taking time to respond to my questions. Really appreciate it. 1. Just to ensure we are on the same page, are you recommending us to implement Source, SourceReader and SplitEnumberator abstractions for the new source connector. And use either the DataStreamScanProvider or

Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-03 Thread Till Rohrmann
I guess the idea would then be to let the proxy do the authentication job and only forward the request via an SSL mutually encrypted connection to the Flink cluster. Would this be possible? The beauty of this setup is in my opinion that this setup should work with all kinds of authentication

Re: [VOTE] Watermark propagation with Sink API

2021-06-03 Thread Piotr Nowojski
I would like to ask you to hold on with counting the votes until I get an answer for my one question in the dev mailing list thread (sorry if it was already covered somewhere). Best, Piotrek czw., 3 cze 2021 o 16:12 Jark Wu napisał(a): > +1 (binding) > > Best, > Jark > > On Thu, 3 Jun 2021 at

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-03 Thread Piotr Nowojski
Hi, Sorry for chipping in late in the discussion, but I would second this point from Arvid: > 4. Potentially, StreamStatus and LatencyMarker would also need to be encoded. It seems like this point was asked, but not followed? Or did I miss it? Especially the StreamStatus part. For me it sounds

Re: [DISCUSS][Statebackend][Runtime] Changelog Statebackend Configuration Proposal

2021-06-03 Thread Piotr Nowojski
Hi, I would actually prefer option 6 (or 5/4), for the sake of configuration being explicit and self explanatory. But at the same time I don't have very hard preferences and from the remaining options, option 3 seems the most reasonable. The question would be, do we want to expose to the users

Re: [VOTE] Watermark propagation with Sink API

2021-06-03 Thread Jark Wu
+1 (binding) Best, Jark On Thu, 3 Jun 2021 at 21:34, Dawid Wysakowicz wrote: > +1 (binding) > > Best, > > Dawid > > On 03/06/2021 03:50, Zhou, Brian wrote: > > +1 (non-binding) > > > > Thanks Eron, looking forward to seeing this feature soon. > > > > Thanks, > > Brian > > > > -Original

[jira] [Created] (FLINK-22868) Update training exercises for 1.13

2021-06-03 Thread David Anderson (Jira)
David Anderson created FLINK-22868: -- Summary: Update training exercises for 1.13 Key: FLINK-22868 URL: https://issues.apache.org/jira/browse/FLINK-22868 Project: Flink Issue Type:

Re: [VOTE] Watermark propagation with Sink API

2021-06-03 Thread Dawid Wysakowicz
+1 (binding) Best, Dawid On 03/06/2021 03:50, Zhou, Brian wrote: > +1 (non-binding) > > Thanks Eron, looking forward to seeing this feature soon. > > Thanks, > Brian > > -Original Message- > From: Arvid Heise > Sent: Wednesday, June 2, 2021 15:44 > To: dev > Subject: Re: [VOTE]

[jira] [Created] (FLINK-22867) I see that there is lake of documentation in python

2021-06-03 Thread Abeer Mohamed (Jira)
Abeer Mohamed created FLINK-22867: - Summary: I see that there is lake of documentation in python Key: FLINK-22867 URL: https://issues.apache.org/jira/browse/FLINK-22867 Project: Flink Issue

Re: [DISCUSS] FLIP-147: Support Checkpoints After Tasks Finished

2021-06-03 Thread Dawid Wysakowicz
Hi all, Thanks for the very insightful discussion. I'd like to revive the effort of FLIP-147. First of all, from my side I'd like to say that I am really interested in helping that happen in the upcoming 1.14 release. I agree with Till that the final checkpoints and global commits are mostly

[jira] [Created] (FLINK-22866) I need to use the current Fraud Detection example that needs flink-walkthrough-common in Python

2021-06-03 Thread Abeer Mohamed (Jira)
Abeer Mohamed created FLINK-22866: - Summary: I need to use the current Fraud Detection example that needs flink-walkthrough-common in Python Key: FLINK-22866 URL: https://issues.apache.org/jira/browse/FLINK-22866

Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-03 Thread Gabor Somogyi
Thanks for giving options to fulfil the need. Users are looking for a solution where users can be identified on the whole cluster and restrict access to resources/actions. A good example for such an action is cancelling other users running jobs. * SSL does provide mutual authentication but when

Re: [Discuss] Planning Flink 1.14

2021-06-03 Thread Seth Wiesman
Hi Everyone, +1 for the Release Managers. Thank you all for volunteering. @Till Rohrmann A common sentiment that I have heard from many users is that upgrading off of 1.9 was very difficult. In particular, a lot of people struggled to understand the new memory model. Many users who required

Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-03 Thread Till Rohrmann
Thanks for the information Gabor. If it is about securing the communication between the REST client and the REST server, then Flink already supports enabling mutual SSL authentication [1]. Would this be enough to secure the communication and to pass an audit? [1]

Re: [Discuss] Planning Flink 1.14

2021-06-03 Thread Till Rohrmann
Thanks for volunteering as our release managers Xintong, Dawid and Joe! Thanks for starting the discussion about the release date Kurt. Personally, I prefer in general shorter release cycles as it allows us to deliver features faster and people feel less pressure to merge half-done features last

Re: recover from svaepoint

2021-06-03 Thread Till Rohrmann
Thanks for this insight. So the problem might be Flink using an internal Kafka API (the connector uses reflection to get hold of the TransactionManager) which changed between version 2.4.1 and 2.5. I think this is a serious problem because it breaks our end-to-end exactly once story when using new

[jira] [Created] (FLINK-22865) Optimize state serialize/deserialize in PyFlink

2021-06-03 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-22865: Summary: Optimize state serialize/deserialize in PyFlink Key: FLINK-22865 URL: https://issues.apache.org/jira/browse/FLINK-22865 Project: Flink Issue Type:

Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-03 Thread Gabor Somogyi
Hi Till, Since I'm working in security area 10+ years let me share my thought. I would like to emphasise there are experts better than me but I have some basics. The discussion is open and not trying to tell alone things... > I mean if an attacker can get access to one of the machines, then it

Re: recover from svaepoint

2021-06-03 Thread Tianxin Zhao
I encountered the exact same issue before when experimenting in a testing environment. I was not able to spot the bug as mentioned in this thread, the solution I did was to downgrade my own kafka-client version from 2.5 to 2.4.1, matching the version of flink-connector-kafka. In 2.4.1 Kafka,

Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-03 Thread Xintong Song
Thanks Yangze for preparing the FLIP. The proposed changes look good to me. As you've mentioned in the implementation plan, I believe one of the most important tasks of this FLIP is to have the feature well documented. It would be really nice if we can keep that in mind and start drafting the

[jira] [Created] (FLINK-22864) Remove the legacy planner code base

2021-06-03 Thread Timo Walther (Jira)
Timo Walther created FLINK-22864: Summary: Remove the legacy planner code base Key: FLINK-22864 URL: https://issues.apache.org/jira/browse/FLINK-22864 Project: Flink Issue Type: Sub-task

[jira] [Created] (FLINK-22863) ArrayIndexOutOfBoundsException may happen when building rescale edges

2021-06-03 Thread Zhilong Hong (Jira)
Zhilong Hong created FLINK-22863: Summary: ArrayIndexOutOfBoundsException may happen when building rescale edges Key: FLINK-22863 URL: https://issues.apache.org/jira/browse/FLINK-22863 Project: Flink

Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-03 Thread Till Rohrmann
Thanks for updating the document Márton. Why is it that banks will consider it more secure if Flink comes with Kerberos authentication (assuming a properly secured setup)? I mean if an attacker can get access to one of the machines, then it should also be possible to obtain the right Kerberos

[DISCUSS] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-03 Thread Yangze Guo
Hi, there, We would like to start a discussion thread on "FLIP-169: DataStream API for Fine-Grained Resource Requirements"[1], where we propose the DataStream API for specifying fine-grained resource requirements in StreamExecutionEnvironment. Please find more details in the FLIP wiki document

Re: recover from svaepoint

2021-06-03 Thread Till Rohrmann
Thanks for the update. Skimming over the code it looks indeed that we are overwriting the values of the static value ProducerIdAndEpoch.NONE. I am not 100% how this will cause the observed problem, though. I am also not a Flink Kafka connector and Kafka expert so I would appreciate it if someone

Re: recover from svaepoint

2021-06-03 Thread Till Rohrmann
Forwarding 周瑞's message to a duplicate thread: After our analysis, we found a bug in the org.apache.flink.streaming.connectors.kafka.internals.FlinkKafkaInternalProducer.resumeTransaction method The analysis process is as follows: