Unsubscribe

2021-06-04 Thread lizikunn
Unsubscribe | | lizikunn | | lizik...@163.com | Signature customized by NetEase Mail Master

[jira] [Created] (FLINK-22883) Select view columns fail when store metadata with hive

2021-06-04 Thread ELLEX_SHEN (Jira)
ELLEX_SHEN created FLINK-22883: Summary: Select view columns fail when store metadata with hive Key: FLINK-22883 URL: https://issues.apache.org/jira/browse/FLINK-22883 Project: Flink Issue

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Eron Wright
I understand your scenario but I disagree with its assumptions: "However, the partition of A is empty and thus A is temporarily idle." - you're assuming that the behavior of the source is to mark itself idle if data isn't available, but that's clearly source-specific and not behavior we expect to

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Arvid Heise
At least one big motivation is having (temporarily) empty partitions. Let me give you an example of why, imho, idleness is only approximate in this case: Assume you have source subtasks A, B, C that correspond to 3 source partitions and a downstream keyed window operator W. W would usually trigger on
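
The scenario above can be pictured with a small sketch: a downstream operator advances its event-time clock to the minimum watermark across its inputs, skipping any input that is currently marked idle. This is a simplified illustration under stated assumptions, not Flink's implementation; `InputChannel` and `combined_watermark` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class InputChannel:
    """Hypothetical model of one upstream input of operator W."""
    name: str
    watermark: int      # last watermark received on this input
    idle: bool = False  # True while the input is marked idle


def combined_watermark(channels: Sequence[InputChannel]) -> Optional[int]:
    """Return the minimum watermark over non-idle inputs.

    Returns None when every input is idle, meaning this sketch has no
    active input from which to derive a watermark.
    """
    active = [c.watermark for c in channels if not c.idle]
    return min(active) if active else None


# A's partition is empty, so A is marked idle and does not hold W back.
a = InputChannel("A", watermark=0, idle=True)
b = InputChannel("B", watermark=10)
c = InputChannel("C", watermark=12)
print(combined_watermark([a, b, c]))

# Once A becomes active again, its older watermark bounds the result.
a.idle = False
print(combined_watermark([a, b, c]))
```

While A is idle, W can trigger based on B and C alone; records that later arrive on A with timestamps below the already-emitted watermark may then be treated as late, which is why idleness is described here as approximate.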

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Eron Wright
Yes I'm talking about an implementation of idleness that is unrelated to processing time. The clear example is partition assignment to subtasks, which probably motivated Flink's idleness functionality in the first place. On Fri, Jun 4, 2021 at 12:53 PM Arvid Heise wrote: > Hi Eron, > > Are you

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Arvid Heise
Hi Eron, Are you referring to an implementation of idleness that does not rely on a wall clock but on some clock baked into the partition information of the source system? If so, you are right that it invalidates my points. Do you have an example on where this is used? With a wall clock, you

Re: [VOTE] Watermark propagation with Sink API

2021-06-04 Thread Eron Wright
Little update on this, more good discussion over the last few days, and the FLIP will probably be amended to incorporate idleness. Voting will remain open until, let's say, mid-next week. On Thu, Jun 3, 2021 at 8:00 AM Piotr Nowojski wrote: > I would like to ask you to hold on with counting

Re: Add control mode for flink

2021-06-04 Thread Peter Huang
I agree with Steven. This logic can be added in a dynamic config framework that can bind into Flink operators. We probably don't need to let Flink runtime handle it. On Fri, Jun 4, 2021 at 8:11 AM Steven Wu wrote: > I am not sure if we should solve this problem in Flink. This is more like > a

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Eron Wright
Dawid, I think you're mischaracterizing the idleness signal as inherently a heuristic, but Flink does not impose that. A source-based watermark (and corresponding idleness signal) may well be entirely data-driven, entirely deterministic. Basically you're underselling what the pipeline is capable

[jira] [Created] (FLINK-22882) Tasks are blocked while emitting records

2021-06-04 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-22882: Summary: Tasks are blocked while emitting records Key: FLINK-22882 URL: https://issues.apache.org/jira/browse/FLINK-22882 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-22881) Tasks are blocked while emitting stream status

2021-06-04 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-22881: Summary: Tasks are blocked while emitting stream status Key: FLINK-22881 URL: https://issues.apache.org/jira/browse/FLINK-22881 Project: Flink Issue

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Arvid Heise
While everything I wrote before is still valid, upon further rethinking, I think that the conclusion is not necessarily correct: - If the user wants to have pipeline A and B behaving as if A+B was jointly executed in the same pipeline without the intermediate Pulsar topic, having the idleness in

Re: Add control mode for flink

2021-06-04 Thread Steven Wu
I am not sure if we should solve this problem in Flink. This is more like a dynamic config problem that probably should be solved by some configuration framework. Here is one post from google search:

Add control mode for flink

2021-06-04 Thread 刘建刚
Hi everyone, Flink jobs are always long-running. While a job is running, users may want to control it without stopping it. The reasons for control vary, for example: 1. Change the data processing logic, such as a filter condition. 2. Send trigger events to make the

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Arvid Heise
I think the core issue in this discussion is that we kind of assume that idleness is something universally well-defined. But it's not. It's a heuristic to advance data processing in event time where we would lack data to do so otherwise. Keep in mind that idleness has no real definition in terms

Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-04 Thread Till Rohrmann
As I've said I am not a security expert and that's why I have to ask for clarification, Gabor. You are saying that if we configure a truststore for the REST endpoint with a single trusted certificate which has been generated by the operator of the Flink cluster, then the attacker can generate a

[jira] [Created] (FLINK-22880) Remove "blink" term in code base

2021-06-04 Thread Timo Walther (Jira)
Timo Walther created FLINK-22880: Summary: Remove "blink" term in code base Key: FLINK-22880 URL: https://issues.apache.org/jira/browse/FLINK-22880 Project: Flink Issue Type: Sub-task

[jira] [Created] (FLINK-22879) Remove "blink" suffix from table modules

2021-06-04 Thread Timo Walther (Jira)
Timo Walther created FLINK-22879: Summary: Remove "blink" suffix from table modules Key: FLINK-22879 URL: https://issues.apache.org/jira/browse/FLINK-22879 Project: Flink Issue Type:

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Piotr Nowojski
Hi, > Imagine you're starting consuming from the result channel in a situation where you have: > record4, record3, StreamStatus.ACTIVE, StreamStatus.IDLE record2, record1, record0 > Switching to the encoded StreamStatus.IDLE is unnecessary, and might cause the record3 and record4 to be late

[jira] [Created] (FLINK-22878) Allow placeholder options in format factories

2021-06-04 Thread Timo Walther (Jira)
Timo Walther created FLINK-22878: Summary: Allow placeholder options in format factories Key: FLINK-22878 URL: https://issues.apache.org/jira/browse/FLINK-22878 Project: Flink Issue Type:

[jira] [Created] (FLINK-22877) Remove BatchTableEnvironment and related API classes

2021-06-04 Thread Timo Walther (Jira)
Timo Walther created FLINK-22877: Summary: Remove BatchTableEnvironment and related API classes Key: FLINK-22877 URL: https://issues.apache.org/jira/browse/FLINK-22877 Project: Flink Issue

Re: [Discuss] Planning Flink 1.14

2021-06-04 Thread JING ZHANG
Hi all, @Xintong Song Thanks for reminding me, I will contact Jark to update the wiki page. Besides, I'd like to provide more input by sharing our experience of upgrading our internal version of Flink. Flink has been widely used in the production environment since 2018 in our company. Our

[jira] [Created] (FLINK-22876) Adding SharedObjects junit rule to ease test development

2021-06-04 Thread Arvid Heise (Jira)
Arvid Heise created FLINK-22876: Summary: Adding SharedObjects junit rule to ease test development Key: FLINK-22876 URL: https://issues.apache.org/jira/browse/FLINK-22876 Project: Flink Issue

[jira] [Created] (FLINK-22875) Cannot work by Pickler and Unpickler

2021-06-04 Thread JYXL (Jira)
JYXL created FLINK-22875: Summary: Cannot work by Pickler and Unpickler Key: FLINK-22875 URL: https://issues.apache.org/jira/browse/FLINK-22875 Project: Flink Issue Type: Bug Reporter:

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Piotr Nowojski
Hi, Thanks for picking up this discussion. For the record, I also think we shouldn't expose latency markers. About the stream status > Persisting the StreamStatus I don't agree with the view that sinks are "storing" the data/idleness status. This nomenclature only makes sense if we are

[jira] [Created] (FLINK-22874) flink table partition trigger doesn't effect as expectation when sink into hive table

2021-06-04 Thread Spongebob (Jira)
Spongebob created FLINK-22874: Summary: flink table partition trigger doesn't effect as expectation when sink into hive table Key: FLINK-22874 URL: https://issues.apache.org/jira/browse/FLINK-22874

Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-04 Thread Gabor Somogyi
> I did not mean for the user to sign its own certificates but for the operator of the cluster. Once the user request hits the proxy, it should no longer be under his control. I think I do not fully understand yet why this would not work. I said it's not solving the authentication problem over any

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Eron Wright
I believe that the correctness of watermarks and stream status markers is determined entirely by the source (ignoring the generic assigner). Such stream elements are known not to overtake records, and aren't transient from a pipeline perspective. I do agree that recoveries may be lossy if some

[jira] [Created] (FLINK-22873) Add ToC to configuration documentation

2021-06-04 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-22873: Summary: Add ToC to configuration documentation Key: FLINK-22873 URL: https://issues.apache.org/jira/browse/FLINK-22873 Project: Flink Issue Type:

Re: [DISCUSS] FLIP-147: Support Checkpoints After Tasks Finished

2021-06-04 Thread Yun Gao
Hi all, Many thanks @Dawid for resuming the discussion and many thanks @Till for the summary! (And very sorry that I missed the mail and did not respond in time...) I also agree that we could consider the global commits later, separately, after we have addressed the final checkpoints, and

Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-04 Thread Till Rohrmann
I did not mean for the user to sign its own certificates but for the operator of the cluster. Once the user request hits the proxy, it should no longer be under his control. I think I do not fully understand yet why this would not work. What I would like to avoid is to add more complexity into

Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-04 Thread Gyula Fóra
Hi! I think there might be possible alternatives but it seems Kerberos on the rest endpoint ticks all the right boxes and provides a super clean and simple solution for strong authentication. I wouldn’t even consider sidecar proxies etc if we can solve it in such a simple way as proposed by G.

Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-04 Thread Gabor Somogyi
Till, thanks for investing time in giving further options. Marci, thanks for summarizing the use-case point of view. We've arrived back at one of the original problems. Namely, if an attacker gets access to a node, it's possible to cancel other users' jobs (and more can be done). Self-signed

Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-04 Thread Till Rohrmann
I am not saying that we shouldn't add a strong authentication mechanism if there are good reasons for it. I primarily would like to understand the context a bit better in order to give qualified feedback and come to a good decision. In order to do this, I have the feeling that we haven't fully

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Dawid Wysakowicz
Hi Eron, I might be missing some background on Pulsar partitioning but something seems off to me. What is the chunk/batch/partition that Pulsar brokers will additionally combine watermarks for? Isn't it the case that only a single Flink sub-task would write to such a chunk and thus will produce

Re: [Discuss] Planning Flink 1.14

2021-06-04 Thread Prasanna kumar
Hi all, We are using Flink for our eventing system. Overall we are very happy with the tech, documentation and community support and quick replies in mails. Here is my experience with versions over the last year. We were working on 1.10 initially during our research phase, then we stabilised with 1.11 as we