Re: Multiple Exceptions during Load Test in State Access APIs with RocksDB

2021-06-09 Thread Yun Gao
Hi Chirag, Logically Integer type should not have this issue. Sorry that from the current description I have not found other issues, could you also share the code in the main method that adds the KeyProcessFunction into the job ? Very thanks! Best, Yun ---

Re: Re: Re: Re: Failed to cancel a job using the STOP rest API

2021-06-09 Thread Yun Gao
Very thanks Kezhu for the catch, it also looks to me the same issue as FLINK-21028. -- From:Piotr Nowojski Send Time:2021 Jun. 9 (Wed.) 22:12 To:Kezhu Wang Cc:Thomas Wang ; Yun Gao ; user Subject:Re: Re: Re: Re: Failed to cance

FileSource may cause akka.pattern.AskTimeoutException, and akka.ask.timeout not workable

2021-06-09 Thread 陳樺威
Hello all, Our team encounter *akka.pattern.AskTimeoutException *when start jobmanager. Base on the error message, we try to setup *akka.ask.timeout * and* web.timeout *to 360s, but both of them doesn't work. We guess the issue may cause by *FileSource.forRecordFileFormat.* The application will l

How to gracefully handle job recovery failures

2021-06-09 Thread Li Peng
Hey folks, we have a cluster with HA mode enabled, and recently after doing a zookeeper restart, our Kafka cluster (Flink v. 1.11.3, Scala v. 2.12) crashed and was stuck in a crash loop, with the following error: 2021-06-10 02:14:52.123 [cluster-io-thread-1] ERROR org.apache.flink.runtime.entrypoi

confused about `TO_TIMESTAMP` document description

2021-06-09 Thread Tony Wei
Hi Expert, this document [1] said `TO_TIMESTAMP` will use the session time zone to convert date time string into a timestamp. If I understand correctly, when I set session time zone to `Asia/Shanghai` and query `SELECT TO_TIMESTAMP('1970-01-01 08:00:00');`, the result should be epoch timestamp `0`

Re: Records Are Never Emitted in a Tumbling Event Window When Each Key Only Has One Record

2021-06-09 Thread Joseph Lorenzini
Hi Arvid,   I may have figured out the problem.   When using a tumbling window on a keyed stream and event time is being used, does time only move forward when there’s an event with a newer timestamp? Said another way: if I have a 5 second tumbling window, the job would need to consume

Re: DataStream API in Batch Execution mode

2021-06-09 Thread Marco Villalobos
That worked. Thank you very much. On Mon, Jun 7, 2021 at 9:23 PM Guowei Ma wrote: > Hi, Macro > > I think you could try the `FileSource` and you could find an example from > [1]. The `FileSource` would scan the file under the given > directory recursively. > Would you mind opening an issue for

Re: Records Are Never Emitted in a Tumbling Event Window When Each Key Only Has One Record

2021-06-09 Thread Arvid Heise
Hi Joe, Yes, that is correct! Only when a new record arrives and we know that timestamp, we can deduce the watermark and advance it. The window operator would close the old window and open a new one. Sorry that I haven't seen that immediately. It's sometimes hard to think in terms of individual r

Re: Records Are Never Emitted in a Tumbling Event Window When Each Key Only Has One Record

2021-06-09 Thread Joseph Lorenzini
Hi Arvid,   I am on 1.11.2.   The flink job has four operators:   Source from kakfa topic one: sent 14 recordsSource from kafka topic two: sent 6 recordsMap: received 15 records/sent 14 recordsMap: received 6 records/sent 6 recordsTumbling Window to Filesink: received 20 records/se

Re: Re: Re: Re: Failed to cancel a job using the STOP rest API

2021-06-09 Thread Piotr Nowojski
Yes good catch Kezhu, IllegalStateException sounds very much like FLINK-21028. Thomas, could you try upgrading to Flink 1.13.1 or 1.12.4? (1.11.4 hasn't been released yet)? Piotrek wt., 8 cze 2021 o 17:18 Kezhu Wang napisał(a): > Could it be same as FLINK-21028[1] (titled as “Streaming applica

NPE when restoring from savepoint in Flink 1.13.1 application

2021-06-09 Thread 陳昌倬
Hi, We have NullPointerException when trying to restore from savepoint for the same jar, or different jar, or different parallelism. We have workaround this issue by changing UIDs in almost all operators. We want to know if there is anyway to avoid this problem so that it will not cause service m

PyFlink: Upload resource files to Flink cluster

2021-06-09 Thread Sumeet Malhotra
Hi, I'm using UDTFs in PyFlink, that depend upon a few resource files (JSON schema files actually). The path of this file can be passed into the UDTF, but essentially this path needs to exist on the Task Manager node where the task executes. What's the best way to upload these resource files? As o

ProcessFunctionTestHarnesses for testing Python functions

2021-06-09 Thread Bogdan Sulima
Hi all, in Java/Scala i was using ProcessFunctionTestHarnesses to test my ProcessFunctions with timers based on event timestamps. Now i am switching to Python (my favourite language). Is there a similar TestHarness to support testing Python ProcessFunctions? Thanks for your answers in advance. R

Re: Records Are Never Emitted in a Tumbling Event Window When Each Key Only Has One Record

2021-06-09 Thread Arvid Heise
Hi Joe, could you please check (in web UI) if the watermark is advancing past the join? The window operator would not trigger if it doesn't advance. On which Flink version are you running? On Tue, Jun 8, 2021 at 10:13 PM Joseph Lorenzini wrote: > Hi all, > > > > I have observed behavior joinin

Re: [table-walkthrough exception] Unable to create a source for reading table...

2021-06-09 Thread Arvid Heise
Hi Lingfeng, could you try org.apache.flink flink-sql-connector-kafka_${scala.binary.version} ${flink.version} to your pom? On Wed, Jun 9, 2021 at 5:04 AM Lingfeng Pu wrote: > Hi, > > I'm following the tutorial to run the "flink-playgroun

RE: Issue with onTimer method of KeyedProcessFunction

2021-06-09 Thread V N, Suchithra (Nokia - IN/Bangalore)
Hi Zhang, Please find the code snippet. private ReducingState aggrRecord; // record being aggregated @Override public void onTimer(long timestamp, OnTimerContext ctx, Collector out) throws Exception { // FIXME timer is not working? or re-registration not working? NatLogData event = aggrRecord.g

Re: Using s3 bucket for high availability

2021-06-09 Thread Kurtis Walker
Thank you, I figured it out. My IAM policy was missing some actions. Seems I needed to give it “*” for it to work. From: Tamir Sagi Date: Wednesday, June 9, 2021 at 6:02 AM To: Yang Wang , Kurtis Walker Cc: user Subject: Re: Using s3 bucket for high availability EXTERNAL EMAIL I'd try seve

Re: Persisting state in RocksDB

2021-06-09 Thread Arvid Heise
Hi Paul, Welcome to the club! What's your SinkFunction? Is it custom? If so, you could also implement CheckpointedFunction to read and write data. Here you could use OperatorStateStore and with it the BroadcastState. However, broadcast state is quite heavy (it sends all data to all instances, so

Re: subscribe

2021-06-09 Thread Arvid Heise
To subscribe, please send a mail to user-subscr...@flink.apache.org On Fri, Jun 4, 2021 at 4:56 AM Boyang Chen wrote: > >

Re: Flink kafka consumers stopped consuming messages

2021-06-09 Thread Ilya Karpov
Hi Arvid, thanks for reply, thread dump + logs research didn’t help. We suggested that problem was in async call to remote key-value storage because we (1) found that async client timeout was set to 0 (effectively no timeout, idle infinitely), (2) async client threads we sleeping, (3) AsyncWait

RE: Issue with onTimer method of KeyedProcessFunction

2021-06-09 Thread V N, Suchithra (Nokia - IN/Bangalore)
Hi Zhang, Please find the code snippet. private ReducingState aggrRecord; // record being aggregated @Override public void onTimer(long timestamp, OnTimerContext ctx, Collector out) throws Exception { // FIXME timer is not working? or re-registration not working? NatLogData event = aggrRecord.g

Re: behavior differences when sinking side outputs to a custom KeyedCoProcessFunction

2021-06-09 Thread Arvid Heise
Hi Jin, as you have figured out, if something goes wrong with watermarks it's usually because of the watermark generator (sorry for not receiving any feedback whatsoever). Thank you very much for sharing your solution! On Thu, Jun 3, 2021 at 8:51 PM Jin Yi wrote: > just to resolve this thread,

Re: streaming file sink OUT metrics not populating

2021-06-09 Thread Arvid Heise
For reference, the respective FLIP shows the ideas [1]. It's on our agenda for 1.14. [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-33%3A+Standardize+Connector+Metrics On Thu, Jun 3, 2021 at 6:41 PM Chesnay Schepler wrote: > This is a known issue, and cannot be fixed on the user sid

Re: Events triggering JobListener notification

2021-06-09 Thread Arvid Heise
Hi Barak, I think the answer to your question is lies in the javadoc: /** * Callback on job execution finished, successfully or unsuccessfully. It is only called back * when you call {@code execute()} instead of {@code executeAsync()} methods of execution * environments. * * Exactly one of t

Re: Flink kafka consumers stopped consuming messages

2021-06-09 Thread Arvid Heise
Hi Ilya, These messages could pop up when a Kafka broker is down but should eventually disappear. So I'm a bit lost. If there was a bug, it's also most likely fixed in the meantime. So if you want to be on the safe side, I'd try to upgrade to more recent versions (Flink + Kafka consumer). Best,

Re: Jupyter PyFlink Web UI

2021-06-09 Thread Jeff Zhang
BTW, you can also send email to zeppelin user maillist to join zeppelin slack channel to discuss more details. http://zeppelin.apache.org/community.html Jeff Zhang 于2021年6月9日周三 下午6:34写道: > Hi Maciek, > > You can try zeppelin which support pyflink and display flink job url > inline. > > http://z

Re: Jupyter PyFlink Web UI

2021-06-09 Thread Jeff Zhang
Hi Maciek, You can try zeppelin which support pyflink and display flink job url inline. http://zeppelin.apache.org/docs/0.9.0/interpreter/flink.html Maciej Bryński 于2021年6月9日周三 下午1:53写道: > Nope. > I found the following solution. > > conf = Configuration() > env = > StreamExecutionEnvironment(

Re: Using s3 bucket for high availability

2021-06-09 Thread Tamir Sagi
I'd try several things try accessing the bucket from CLI first locally https://docs.aws.amazon.com/cli/latest/reference/s3/ If it does not work please check your credentials under ~/.aws/credentials file + ~/.aws/config = since the AWS clients read the credentials from these files by default(u

Re: State migration for sql job

2021-06-09 Thread Yuval Itzchakov
As my company is also a heavy user of Flink SQL, the state migration story is very important to us. I as well believe that adding new fields should start to accumulate state from the point in time of the change forward. Is anyone actively working on this? Is there anyway to get involved? On Tue,

Re: Issue with onTimer method of KeyedProcessFunction

2021-06-09 Thread JING ZHANG
Hi Suchithra, Would you please provide more information in detail or paste the main code? Best regards, JING ZHANG V N, Suchithra (Nokia - IN/Bangalore) 于2021年6月9日周三 下午3:42写道: > Hello, > > > > We are using apache flink 1.7 and now trying to upgrade to flink 1.12.3 > version. After upgrading to

Re: Using s3 bucket for high availability

2021-06-09 Thread Yang Wang
It seems to be a S3 issue. And I am not sure it is the root cause. Could you please share more details of the JobManager log? Or could you verify that the Flink cluster could access the S3 bucket successfully(e.g. store the checkpoint) when HA is disabled? Best, Yang Kurtis Walker 于2021年6月8日周二

Issue with onTimer method of KeyedProcessFunction

2021-06-09 Thread V N, Suchithra (Nokia - IN/Bangalore)
Hello, We are using apache flink 1.7 and now trying to upgrade to flink 1.12.3 version. After upgrading to 1.12.3 version, the onTimer method of KeyedProcessFunction is not behaving correctly, the value of ReducingState and ValueState always return null. Could you please help in debugging th

Re: Questions about implementing a flink source

2021-06-09 Thread Arvid Heise
Hi Evan, 1. I'd recommend supporting DeserializationSchema in any case similar to KafkaRecordDeserializationSchema. First, it aligns with other sources and user expectations. Second, it's a tad faster and the plan looks easier if you omit a chained task. Third, you can avoid quite a bit of boilerp

Re: Multiple Exceptions during Load Test in State Access APIs with RocksDB

2021-06-09 Thread Chirag Dewan
Thanks for the reply Yun. The key is an Integer type. Do you think there can be hash collisions for Integers? It somehow works on single TM now. No errors for 1m records.But as soon as we move to 2 TMs, we get all sort of errors - 'Position Out of Bound', key not in Keygroup etc. This also caus