kafka: how to stop consumption temporarily

2020-01-02 Thread David Morin
Hi, Is there a way to stop temporarily to consume one kafka source in streaming mode ? Use case: I have to consume 2 topics but in fact one of them is more prioritized. One of this topic is dedicated to ingest data from db (change data capture) and one of them is dedicated to make a synchroniza

Sub-user

2020-01-02 Thread Jary Zhen

Re: Sub-user

2020-01-02 Thread Benchao Li
Hi Jary, You need to send a email to *user-subscr...@flink.apache.org * to subscribe, not user@flink.apache.org. Jary Zhen 于2020年1月2日周四 下午4:53写道: > > -- Benchao Li School of Electronics Engineering and Computer Science, Peking University Tel:+86-15650713730 Email: libenc...@gmail.com; libenc

Re: Sub-user

2020-01-02 Thread vino yang
Hi Jary, All the Flink's mailing list information can be found here[1]. [1]: https://flink.apache.org/community.html#mailing-lists Best, Vino Benchao Li 于2020年1月2日周四 下午4:56写道: > Hi Jary, > > You need to send a email to *user-subscr...@flink.apache.org > * to subscribe, not user@flink.apache.o

Re: Flink Prometheus metric doubt

2020-01-02 Thread Chesnay Schepler
In practice the documentation is incorrect. While technically the metric _would_ emit -1 if the job is in a failed/finished state, the reality is that at this point the metric is unregistered and no longer updated, since the owning component (the jobmanager) is shutting down. I can't think of

Re: [DISCUSS] Drop vendor specific repositories from pom.xml

2020-01-02 Thread Aljoscha Krettek
+1 to remove > On 20. Dec 2019, at 10:34, Robert Metzger wrote: > > Okay, I understand. I'm okay with removing the profile. > > On Thu, Dec 19, 2019 at 11:34 AM Till Rohrmann wrote: > >> The profiles make bumping ZooKeeper's version a bit more cumbersome. I >> would be interested for this rea

Session Window with dynamic gap

2020-01-02 Thread KristoffSC
Hi all, I'm exploring Flink for our new project. Currently I'm playing with Session Windows with dynamic Gap. In short, I would like to be able to change the value of the gap on demand, for example on config update. So I'm having this code: messageStream .keyBy(tradeKeySelector

Yarn Kerberos issue

2020-01-02 Thread Juan Gentile
Hello, Im trying to submit a job (batch worcount) to a Yarn cluster. I’m trying to use delegation tokens and I’m getting the following error: org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster at org.apache.flink.yarn.AbstractYarn

Re: Session Window with dynamic gap

2020-01-02 Thread vino yang
Hi KristoffSC, Firstly, IMO, you can implement this feature by customizing the `SessionWindowTimeGapExtractor`. Additionally, let me clearify a concept. A component that implements the `SessionWindowTimeGapExtractor` interface should not be an operator in Flink. In Flink's concepts, Window is an

Re: Session Window with dynamic gap

2020-01-02 Thread KristoffSC
Thank you for the answer, the thing is that I would not like to call external system for each Window, rather I woudl like to keep the gap size in Flink's state which I will be able to change from external system, for example handle configUpdate message from Kafka. So if SessionWindowTimeGapExtra

Re: [Problem] Unable to do join on TumblingEventTimeWindows using SQL

2020-01-02 Thread jeremyji
Hi, Fabian, i use time-windowed join according to the docs you give but still have the problem. Here is my flink sql look like: SELECT a.account account, SUM(a.value) + SUM(b.value), UNIX_TIMESTAMP(TUMBLE_START(a.producer_timestamp, INTERVAL '3' MINUTE)) FROM (SELECT

Flink group with time-windowed join

2020-01-02 Thread jeremyji
Two stream as table1, table2. We know that group with regular join won't work so we have to use time-windowed join. So here is my flink sql looks like: *SELECT a.account account, SUM(a.value) + SUM(b.value), UNIX_TIMESTAMP(TUMBLE_START(a.producer_timestamp, INTERVAL '3' MIN

Re: [DISCUSS] Drop vendor specific repositories from pom.xml

2020-01-02 Thread Till Rohrmann
Thanks for the feedback. As I haven't heard any objections, I'll open a JIRA issue and a PR to remove the vendor specific repositories. Cheers, Till On Thu, Jan 2, 2020 at 12:28 PM Aljoscha Krettek wrote: > +1 to remove > > > On 20. Dec 2019, at 10:34, Robert Metzger wrote: > > > > Okay, I und

Re: Session Window with dynamic gap

2020-01-02 Thread KristoffSC
So I was trying to have something like this: PipelineConfigOperator pipelineConfigOperator = new PipelineConfigOperator(); messageStream .connect(pipelineConfigStream) .process(*pipelineConfigOperator*) .keyBy(tradeKeySelector) .wind

Late outputs for Session Window

2020-01-02 Thread KristoffSC
Hi all, In my pipeline setup I cannot see side outputs for Session Window (Flink 1.9.1) What I have is: messageStream. .keyBy(tradeKeySelector) .window(ProcessingTimeSessionWindows.withDynamicGap(new TradeAggregationGapExtractor())) .sideOutputLateData(lateTradeMessages) .process

Re: Duplicate tasks for the same query

2020-01-02 Thread RKandoji
Ok thanks, does it mean version 1.9.2 is what I need to use? On Wed, Jan 1, 2020 at 10:25 PM Jingsong Li wrote: > Blink planner was introduced in 1.9. We recommend use blink planner after > 1.9. > After some bug fix, I think the latest version of 1.9 is OK. The > production environment has also

Controlling the Materialization of JOIN updates

2020-01-02 Thread Benoît Paris
Hello all! I'm trying to design a stream pipeline, and have trouble controlling when a JOIN is triggering an update: Setup: - The Event table; "probe side", "query side", the result of earlier stream processing - The DimensionAtJoinTimeX tables; of updating nature, "build side", the

Re: Best way set max heap size via env variables or program arguments?

2020-01-02 Thread Li Peng
Awesome, thanks! On Wed, Jan 1, 2020 at 6:17 PM Xintong Song wrote: > Hi Li, > > Regarding your questions: > > 1. Is there actually a way to pass in the heap size via arguments to >> taskmanager.sh? Is passing -Dtaskmanager.heap.size supposed to work? > > > No, '-Dtaskmanager.heap.size' is not s

Re: Session Window with dynamic gap

2020-01-02 Thread KristoffSC
Ok, I did some more tests and yep, it seems that there is no way to use Flink's State in class that will implement SessionWindowTimeGapExtractor. Even if I will implement this interface on a class that is an operator, whenever extract method is called it does not have any access to Flink's state

Checkpoints issue and job failing

2020-01-02 Thread Navneeth Krishnan
Hi All, We are running into checkpoint timeout issue more frequently in production and we also see the below exception. We are running flink 1.4.0 and the checkpoints are saved on NFS. Can someone suggest how to overcome this? [image: image.png] java.lang.IllegalStateException: Could not initial

Re: Duplicate tasks for the same query

2020-01-02 Thread Jingsong Li
Yes, 1.9.2 or Coming soon 1.10 Best, Jingsong Lee On Fri, Jan 3, 2020 at 12:43 AM RKandoji wrote: > Ok thanks, does it mean version 1.9.2 is what I need to use? > > On Wed, Jan 1, 2020 at 10:25 PM Jingsong Li > wrote: > >> Blink planner was introduced in 1.9. We recommend use blink planner af

Re: kafka: how to stop consumption temporarily

2020-01-02 Thread Terry Wang
Hi, I’d like to share my opinion here. It seems that you need adjust the Kafka consumer to have communication each other. When your begin the dump process, you need to notify another CDC-topic consumer to wait idle. Best, Terry Wang > 2020年1月2日 16:49,David Morin 写道: > > Hi, > > Is there

Re: Session Window with dynamic gap

2020-01-02 Thread vino yang
Hi KristoffSC, >> Are there any plans to add support of Flink State into SessionWindowTimeGapExtractor? As I said, `SessionWindowTimeGapExtractor` is neither a general UDF nor an operator. But I cannot give a clear answer. Let me ping @Aljoscha Krettek to give the answer. Best, Vino Kristoff

Re: Checkpoints issue and job failing

2020-01-02 Thread vino yang
Hi Navneeth, Did you check if the path contains in the exception is really can not be found? Best, Vino Navneeth Krishnan 于2020年1月3日周五 上午8:23写道: > Hi All, > > We are running into checkpoint timeout issue more frequently in production > and we also see the below exception. We are running flink