Re: Interval Join Late Record Metrics

2019-12-11 Thread Congxian Qiu
Hi Chris >From the code[1], currently, IntervalJoin will ignore the late data silently, maybe you can create an issue to track this. [1] https://github.com/apache/flink/blob/5c89d12849ea2aa332126b32808e363f12d436a2/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/co/Int

Re: Flink ML feature

2019-12-11 Thread Till Rohrmann
Hi guys, it is true that we dropped Flink-ML with 1.9. The reason is that the community started working on a new ML library which you can find under flink-ml-parent [1]. This module contains the framework for building ML pipelines but not yet too many algorithms iirc. The plan is to extend this li

Open Method is not being called in case of AggregateFunction UDFs

2019-12-11 Thread Arujit Pradhan
Hi all, So we are creating some User Defined Functions of type AggregateFunction. And we want to send some static metrics from the *open()* method of the UDFs as we can get *MetricGroup *by *FunctionContext *which is only exposed in the open method. Our code looks something like this(Which is an i

Re: Request for removal from subscription

2019-12-11 Thread Tom Blackwood
Please send a message to: user-unsubscr...@flink.apache.org for unsubscribing. On Wed, Dec 11, 2019 at 1:39 PM L Jainkeri, Suman (Nokia - IN/Bangalore) < suman.l_jaink...@nokia.com> wrote: > Unsubscribe >

Re: Open Method is not being called in case of AggregateFunction UDFs

2019-12-11 Thread Dawid Wysakowicz
Hi Arujit, Could you also share the query where you use this UDF? It would also help if you said which version of Flink you are using and which planner. Best, Dawid On 11/12/2019 10:21, Arujit Pradhan wrote: > Hi all, > > So we are creating some User Defined Functions of type > AggregateFunctio

Re: Thread access and broadcast state initialization in BroadcastProcessFunction

2019-12-11 Thread KristoffSC
Hi Vino, Thank you for your response and provided links. So just to clarify and small follow up. 1. Methods will be called only by one thread right? 2. The links you provided are tackling a case when we got a "fast stream" element before we received broadcast stream element. In my case we had Br

Re: Processing Events by custom rules kept in Broadcast State

2019-12-11 Thread Timo Walther
Hi, I think when it comes to the question "What data type should I put in state?", this question should usually be answered with a well-defined data structure that allows for future state upgrades. Like defining a database schema. So I would not put "arbirary" classes such as Jackson's Object

Re: Flink 'Job Cluster' mode Ui Access

2019-12-11 Thread Chesnay Schepler
Would it be possible for you to provide us with full debug log file? On 10/12/2019 18:07, Jatin Banger wrote: Yes, I did. On Tue, Dec 10, 2019 at 3:47 PM Arvid Heise > wrote: Hi Jatin, just to be sure. Did you increase the log level to debug [1] before

Re: Need help using AggregateFunction instead of FoldFunction

2019-12-11 Thread Arvid Heise
Hi Devin, for event-time based windows, you need to give Flink two types of information: - timestamp of records, which I assume is in your case already embedded into the Pulsar records - and a watermark assigner. The watermarks help Flink to determine when windows can be closed in respect to out-

Re: Thread access and broadcast state initialization in BroadcastProcessFunction

2019-12-11 Thread Timo Walther
1. Yes, methods will only be called by one thread. The FLink API aims to abstract all concurrency topics away when using the provided methods and state. 2. The open() method should always be the first method being called. If this is not the case, this is definitely a bug. Which Flink version a

Re: Open Method is not being called in case of AggregateFunction UDFs

2019-12-11 Thread Timo Walther
I remember that we fixed some bug around this topic recently. The legacy planner should not be affected. There is another user reporting this: https://issues.apache.org/jira/browse/FLINK-15040 Regards, Timo On 11.12.19 10:34, Dawid Wysakowicz wrote: Hi Arujit, Could you also share the query

Re: Apache Flink - Clarifications about late side output

2019-12-11 Thread Timo Walther
Hi Mans, I would recommend to create a little prototype to answer most of your questions in action. You can simple do: stream = env.fromElements(1L, 2L, 3L, 4L) .assignTimestampsAndWatermarks( new AssignerWithPunctuatedWatermarks{ extractTimestamp(e) = e, check

Re: Apache Flink - Clarifications about late side output

2019-12-11 Thread Timo Walther
Little mistake: The key must be any constant instead of `e`. On 11.12.19 11:42, Timo Walther wrote: Hi Mans, I would recommend to create a little prototype to answer most of your questions in action. You can simple do: stream = env.fromElements(1L, 2L, 3L, 4L)    .assignTimestampsAndWater

Re: Order events by filed that does not represent time

2019-12-11 Thread Timo Walther
Hi Krzysztof, first of all Flink does not sort events based on timestamp. The concept of watermarks just postpones the triggering of a time operation until the watermark says all events until a time t have arrived. For your problem, you can simply use a ProcessFunction and buffer the events

Re: Open Method is not being called in case of AggregateFunction UDFs

2019-12-11 Thread Timo Walther
At least I hope it has been fixed. Which version and planner are you using? On 11.12.19 11:47, Arujit Pradhan wrote: Hi Timo, Thanks for the bug reference. You mentioned that this bug has been fixed. Is the fix available for flink 1.9+ and default query planner. Thanks and regards, /Arujit

Scala case class TypeInformation and Serializer

2019-12-11 Thread 杨光
Hi, I'm working on write a flink stream job with scala api , how should I find out which class is serialied by flink type serializer and which is falled back to generic Kryo serializer. And if one class falls back to Kryo serializer, how can I make some extend the TypeInfo classes of Flink or some

Re: Localenvironment jobcluster ha High availability

2019-12-11 Thread Gary Yao
Hi Eric, What you say should be possible because your job will be executed in a MiniCluster [1] which has HA support. I have not tried this out myself, and I am not aware that people are doing this in production. However, there are integration tests that use MiniCluster + ZooKeeper [2]. Best, Gar

Re: Apache Flink - Clarifications about late side output

2019-12-11 Thread M Singh
Thanks Timo for your answer.  I will try the prototype but was wondering if I can find some theoretical documentation to give me a sound understanding. Mans On Wednesday, December 11, 2019, 05:44:15 AM EST, Timo Walther wrote: Little mistake: The key must be any constant instead of `e`

Re: Apache Flink - Retries for async processing

2019-12-11 Thread M Singh
Thanks Zhu for your advice.  Mans On Tuesday, December 10, 2019, 09:32:01 PM EST, Zhu Zhu wrote: Hi M Singh, I think you would be able to know the request failure cause and whether it is recoverable or not.You can handle the error as you like. For example, if you think the error is un

Re: Order events by filed that does not represent time

2019-12-11 Thread David Anderson
Krzysztof, Note that if you want to have Flink treat these sequence numbers as event time timestamps, you probably can, so long as they are generally increasing, and there's some bound on how out-of-order they can be. The advantage to doing this is that you might be able to use Flink SQL's event

Re: Apache Flink - Clarifications about late side output

2019-12-11 Thread David Anderson
I'll attempt to answer your questions. If we have allowed lateness to be greater than 0 (say 5), then if an event > which arrives at window end + 3 (within allowed lateness), > (a) it is considered late and included in the window function as a > late firing ? > An event with a timestamp that

Re: Apache Flink - Clarifications about late side output

2019-12-11 Thread David Anderson
> > If we have allowed lateness to be greater than 0 (say 5), then if an event > which arrives at window end + 3 (within allowed lateness), > > (a) it is considered late and included in the window function as a > late firing ? > An event with a timestamp that falls within the window's boundarie

Re: Apache Flink - Clarifications about late side output

2019-12-11 Thread M Singh
Thanks David for your detailed answers.   Mans On Wednesday, December 11, 2019, 08:12:51 AM EST, David Anderson wrote: If we have allowed lateness to be greater than 0 (say 5), then if an event which arrives at window end + 3 (within allowed lateness),      (a) it is considered late

Re: Open Method is not being called in case of AggregateFunction UDFs

2019-12-11 Thread Jark Wu
Hi Arujit, Thanks for reporting this. Are you using this UDF in window aggregation in old planner ? AFAIK, open() method of UDAF is only not called in window aggregation in old planner, because old planner uses DataStream WindowOperator which will not call open() on AggregateFunction [1]. I also

Re: Scala case class TypeInformation and Serializer

2019-12-11 Thread Yun Tang
Hi Would you please give related code? I think it might due to insufficient hint to type information. Best Yun Tang From: 杨光 Date: Wednesday, December 11, 2019 at 7:20 PM To: user Subject: Scala case class TypeInformation and Serializer Hi, I'm working on write a flink stream job with scal

Re: Interval Join Late Record Metrics

2019-12-11 Thread Chris Gillespie
Thanks Congxian, I made a JIRA to track this request. https://issues.apache.org/jira/browse/FLINK-15202 On Wed, Dec 11, 2019 at 12:56 AM Congxian Qiu wrote: > Hi Chris > > From the code[1], currently, IntervalJoin will ignore the late data > silently, maybe you can create an issue to track this.

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-11 Thread Li Peng
Hey Yang, here are the commands: "/opt/flink/bin/taskmanager.sh", "start-foreground", "-Djobmanager.rpc.address={{ .Chart.Name }}-job-manager", "-Dtaskmanager.numberOfTaskSlots=1" "/opt/flink/bin/standalone-job.sh", "start-foreground", "-Djobmanager.rpc.address={{ .Chart.Name }}-job-manager", "-D

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-11 Thread Yang Wang
Hi Peng, What i mean is to use `docker exec` into the running pod and `ps` to get the real command that is running for jobmanager. Do you have checked the /opt/flink/conf/log4j.properties is right? I have tested standalone per-job on my kubernetes cluster, the logs show up as expected. Best, Ya

[ANNOUNCE] Apache Flink 1.8.3 released

2019-12-11 Thread Hequn Cheng
Hi, The Apache Flink community is very happy to announce the release of Apache Flink 1.8.3, which is the third bugfix release for the Apache Flink 1.8 series. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streamin

Re: [ANNOUNCE] Apache Flink 1.8.3 released

2019-12-11 Thread Dian Fu
Thanks Hequn for being the release manager and everyone who contributed to this release. Regards, Dian > 在 2019年12月12日,下午2:24,Hequn Cheng 写道: > > Hi, > > The Apache Flink community is very happy to announce the release of Apache > Flink 1.8.3, which is the third bugfix release for the Apach

Re: [ANNOUNCE] Apache Flink 1.8.3 released

2019-12-11 Thread Jeff Zhang
Great work, Hequn Dian Fu 于2019年12月12日周四 下午2:32写道: > Thanks Hequn for being the release manager and everyone who contributed to > this release. > > Regards, > Dian > > 在 2019年12月12日,下午2:24,Hequn Cheng 写道: > > Hi, > > The Apache Flink community is very happy to announce the release of Apache > F

Re: [ANNOUNCE] Apache Flink 1.8.3 released

2019-12-11 Thread Jark Wu
Thanks Hequn for helping out this release and being the release manager. Great work! Best, Jark On Thu, 12 Dec 2019 at 15:02, Jeff Zhang wrote: > Great work, Hequn > > Dian Fu 于2019年12月12日周四 下午2:32写道: > >> Thanks Hequn for being the release manager and everyone who contributed >> to this relea

Re: [ANNOUNCE] Apache Flink 1.8.3 released

2019-12-11 Thread jincheng sun
Thanks for being the release manager and the great work Hequn :) Also thanks to the community making this release possible! Best, Jincheng Jark Wu 于2019年12月12日周四 下午3:23写道: > Thanks Hequn for helping out this release and being the release manager. > Great work! > > Best, > Jark > > On Thu, 12 De

Re: [ANNOUNCE] Apache Flink 1.8.3 released

2019-12-11 Thread Jingsong Li
Thanks Hequn for your driving, 1.8.3 fixed a lot of issues and it is very useful to users. Great work! Best, Jingsong Lee On Thu, Dec 12, 2019 at 3:25 PM jincheng sun wrote: > Thanks for being the release manager and the great work Hequn :) > Also thanks to the community making this release pos

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-11 Thread ouywl
HI yang,   Could you give more info detail? log4j.properties content, and The k8s yaml. Is use the dockerfile in flink-container? When I test it use the default per-job yaml in flick-container? It is only show logs in docker infos. And not logs in /opt/flink/log.

Re: [ANNOUNCE] Apache Flink 1.8.3 released

2019-12-11 Thread Wei Zhong
Thanks Hequn for being the release manager. Great work! Best, Wei > 在 2019年12月12日,15:27,Jingsong Li 写道: > > Thanks Hequn for your driving, 1.8.3 fixed a lot of issues and it is very > useful to users. > Great work! > > Best, > Jingsong Lee > > On Thu, Dec 12, 2019 at 3:25 PM jincheng sun