Re: rocksdb max open file descriptor issue crashed application

2020-02-11 Thread Apoorv Upadhyay
Hi, Below is the error I am getting : 2020-02-08 05:40:24,543 INFO org.apache.flink.runtime.taskmanager.Task - order-steamBy-api-order-ip (3/6) (34c7b05d5a75dbbcc5718acf6b18) switched from RUNNING to CANCELING. 2020-02-08 05:40:24,543 INFO

Re: [VOTE] Release Flink Python API(PyFlink) 1.9.2 to PyPI, release candidate #1

2020-02-11 Thread Becket Qin
+1 (binding) - verified signature - Ran word count example successfully. Thanks, Jiangjie (Becket) Qin On Wed, Feb 12, 2020 at 1:29 PM Jark Wu wrote: > +1 > > - checked/verified signatures and hashes > - Pip installed the package successfully: pip install > apache-flink-1.9.2.tar.gz > - Run

Using multithreaded library within ProcessFunction with callbacks relying on the out parameter

2020-02-11 Thread Salva Alcántara
I am working on a `CoProcessFunction` that uses a third-party library for detecting certain patterns of events based on some rules. So, in the end, the `processElement1` method is basically forwarding the events to this library and registering a callback so that, when a match is detected, the
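The usual caveat with this pattern is that the `Collector` (`out`) handed to `processElement1` is not safe to use from the library's own callback thread; a common workaround is to have the callback push matches into a thread-safe buffer that the operator drains later from its own thread (e.g. in the next `processElement` call or an `onTimer` firing). A minimal sketch of that buffering idea, with illustrative names that are not part of any Flink API:

```python
import queue
import threading

class CallbackBuffer:
    """Buffers matches produced on a third-party library's callback thread
    so they can be emitted later from the operator's own thread, where the
    Collector is safe to use."""

    def __init__(self):
        self._pending = queue.Queue()

    def on_match(self, match):
        # Invoked from the library's thread: never touch `out` here,
        # only enqueue the result.
        self._pending.put(match)

    def drain(self):
        # Invoked from processElement()/onTimer(): safe place to collect.
        matches = []
        while True:
            try:
                matches.append(self._pending.get_nowait())
            except queue.Empty:
                return matches

buf = CallbackBuffer()
t = threading.Thread(target=buf.on_match, args=("rule-42 matched",))
t.start()
t.join()
print(buf.drain())  # ['rule-42 matched']
```

A timer registered per element (or a periodic processing-time timer) then calls `drain()` and forwards each buffered match through the real collector.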

Re: [VOTE] Release Flink Python API(PyFlink) 1.9.2 to PyPI, release candidate #1

2020-02-11 Thread Jark Wu
+1 - checked/verified signatures and hashes - Pip installed the package successfully: pip install apache-flink-1.9.2.tar.gz - Run word count example successfully through the documentation [1]. Best, Jark [1]:

Dedup all data in stream

2020-02-11 Thread Akshay Shinde
Hi Community, In our Flink job's source we generate our own stream to process n objects every 2 minutes. In a process function, for each object from the generated source stream, we perform an operation that we expect to finish within 2 minutes. Every 2 minutes we are
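Deduplication in a keyed stream is typically done with a per-key state flag: the first occurrence of a key is forwarded and the flag set, later occurrences are dropped. A minimal sketch of that logic in plain Python (mimicking a keyed `ValueState` with a set; names are illustrative, not Flink API):

```python
def dedup(events, key=lambda e: e):
    """Emit each event only the first time its key is seen, mimicking a
    boolean ValueState flag in a Flink KeyedProcessFunction."""
    seen = set()
    out = []
    for e in events:
        k = key(e)
        if k not in seen:   # state lookup: first time for this key?
            seen.add(k)     # mark the key as seen (state update)
            out.append(e)   # forward only the first occurrence
    return out

print(dedup([3, 1, 3, 2, 1]))  # [3, 1, 2]
```

In a real job the "seen" state would live in keyed state (and would need a TTL or timer-based cleanup so it does not grow without bound).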

Flink complaining when trying to write to s3 in Parquet format

2020-02-11 Thread Fatima Omer
I have a java app that is using a flink SQL query to perform aggregations on a data stream being read in from Kafka. Attached is the java file for reference. The query results are being written to s3. I can write successfully in Json format but when I try to use Parquet format, flink complains

Re: Aggregation for last n seconds for each event

2020-02-11 Thread Fanbin Bu
Can you do RANGE BETWEEN INTERVAL '1' SECOND PRECEDING AND CURRENT ROW? On Tue, Feb 11, 2020 at 12:15 PM oleg wrote: > Hi Community, > > I do streaming in event time and I want to preserve ordering and late > events. I have a use case where I need to fire an aggregation function > for events of

Re: Exactly once semantics for hdfs sink

2020-02-11 Thread Vishwas Siravara
Hi Khachatryan, Thanks for your reply. Can you help me understand how it works with HDFS specifically; even a link to a document would help. Best, Vishwas On Mon, Feb 10, 2020 at 10:32 AM Khachatryan Roman < khachatryan.ro...@gmail.com> wrote: > Hi Vishwas, > > Yes, Streaming File Sink does
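Roughly, the StreamingFileSink gets exactly-once output on HDFS by writing to hidden in-progress part files and only publishing them under their final name once the checkpoint covering that data completes, so readers never observe partial output. A minimal sketch of the underlying write-then-atomic-rename idea on a local filesystem (file names are illustrative; this is an analogy, not Flink's actual code):

```python
import os
import tempfile

def commit_write(data: bytes, final_path: str) -> None:
    """Write to a hidden in-progress file, then atomically rename it into
    place -- a rough analogy to the in-progress -> finished part-file
    transition StreamingFileSink performs on checkpoint completion."""
    directory = os.path.dirname(final_path) or "."
    fd, tmp = tempfile.mkstemp(prefix=".in-progress-", dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())    # make the bytes durable before publishing
    os.rename(tmp, final_path)  # atomic within one filesystem: all-or-nothing

out_dir = tempfile.mkdtemp()
target = os.path.join(out_dir, "part-0-0")
commit_write(b"exactly-once bytes", target)
print(open(target, "rb").read())  # b'exactly-once bytes'
```

On failure and restart, in-progress files that were never renamed are discarded (or truncated back to a checkpointed offset), which is what prevents duplicates or partial records from becoming visible.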

Flink job fails with org.apache.kafka.common.KafkaException: Failed to construct kafka consumer

2020-02-11 Thread John Smith
Just wondering, is this on the client side in the Flink job? I rebooted the task and the job deployed correctly on another node. Is there a specific ulimit that we should set for Flink task nodes? org.apache.kafka.common.KafkaException: Failed to construct kafka consumer at

Aggregation for last n seconds for each event

2020-02-11 Thread oleg
Hi Community, I do streaming in event time and I want to preserve ordering and late events. I have a use case where I need to fire an aggregation function for events of the last n seconds (time units in general) for every incoming event. It seems to me that windowing is not suitable since it may
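The per-event "last n seconds" aggregation described here (which the OVER-window suggestion elsewhere in the thread also targets) can be modelled as a sliding buffer: on each event, evict everything older than n time units, then aggregate what remains. A minimal sketch in plain Python, assuming events arrive ordered by event time and using a sum as the example aggregate:

```python
from collections import deque

def last_n_seconds_sum(events, n):
    """For each (timestamp, value) event, emit (timestamp, sum of values)
    over the window (ts - n, ts], i.e. the last n seconds including the
    current event. Assumes events are sorted by timestamp."""
    window = deque()
    results = []
    for ts, v in events:
        window.append((ts, v))
        # Evict events that fell out of the n-second range.
        while window and window[0][0] <= ts - n:
            window.popleft()
        results.append((ts, sum(val for _, val in window)))
    return results

events = [(0, 1), (1, 2), (2, 3), (5, 4)]
print(last_n_seconds_sum(events, 2))  # [(0, 1), (1, 3), (2, 5), (5, 4)]
```

In Flink terms, the buffer would live in keyed state and late events would require either watermark-driven buffering before this step or re-emitting corrected results.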

Re: rocksdb max open file descriptor issue crashed application

2020-02-11 Thread Congxian Qiu
Hi, From the given description, you use RocksDBStateBackend and normally have around 20k files open on one machine, and the app suddenly opened about 35k files and then crashed. Could you please share which files are open, and what the exception is (the full taskmanager.log would be helpful)? Best, Congxian

Re: [VOTE] Release Flink Python API(PyFlink) 1.9.2 to PyPI, release candidate #1

2020-02-11 Thread Hequn Cheng
+1 (non-binding) - Check signature and checksum. - Install package successfully with Pip under Python 3.7.4. - Run wordcount example successfully under Python 3.7.4. Best, Hequn On Tue, Feb 11, 2020 at 12:17 PM Dian Fu wrote: > +1 (non-binding) > > - Verified the signature and checksum > -

Re: Rescaling a running topology

2020-02-11 Thread Andrey Zagrebin
Hi Stephen, I am sorry that you had this experience with the rescale API. Unfortunately, the rescale API was always experimental and had some flaws. Recently, Flink community decided to disable it temporarily with the 1.9 release, see more explanation here [1]. I would advise the manual

Re: FlinkCEP questions - architecture

2020-02-11 Thread Arvid Heise
Hi Juergen, 1) yes, you are using a changelog of events. If you need more information, you could search for change data capture architecture. For all CEP questions, I'm pulling in Kostas. 12) It depends on which format the data is exported in. If you use a format with schema evolution (e.g. Avro),

Re: Re: Flink DataTypes json parse exception

2020-02-11 Thread sunfulin
Hi, I am using the latest Flink 1.10 RC. When I run the same code using Flink 1.8.2, there is no problem, but with 1.10 the issue occurs. I am confused about the underlying reason. At 2020-02-11 18:33:50, "Timo Walther" wrote: >Hi, > >from which Flink version are you upgrading? There were

Re: Flink DataTypes json parse exception

2020-02-11 Thread Timo Walther
Hi, from which Flink version are you upgrading? There were some changes in 1.9 for how to parse timestamps in JSON format. Your error might be related to those changes: https://issues.apache.org/jira/browse/FLINK-11727 I hope this helps. Timo On 07.02.20 07:57, sunfulin wrote: Hi, guys
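My reading of the linked FLINK-11727 change is that newer Flink versions expect RFC 3339 / ISO-8601-style timestamps in JSON (e.g. with a `T` separator and zone offset) rather than the plain `yyyy-MM-dd HH:mm:ss` form, which is why code that worked on 1.8.2 can break on upgrade; treat the exact formats as an assumption to verify against the JSON format docs. The difference between the two shapes is easy to illustrate:

```python
from datetime import datetime, timezone

# RFC 3339 / ISO-8601 style, the shape stricter JSON parsers expect:
rfc3339 = "2020-02-11T18:33:50+00:00"
parsed = datetime.fromisoformat(rfc3339)
print(parsed)

# A plain SQL-style timestamp string needs an explicit pattern instead,
# and carries no timezone information:
sql_style = "2020-02-11 18:33:50"
parsed_sql = datetime.strptime(sql_style, "%Y-%m-%d %H:%M:%S")
print(parsed_sql)
```

If the incoming JSON uses the SQL-style form, either the producer's format or the consuming schema's expectations has to change; feeding one shape to a parser expecting the other is exactly the kind of parse exception reported here.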

Re: Flink 1.10 on MapR secure cluster with high availability

2020-02-11 Thread Arvid Heise
Hi Maxim, in general, we have used shading in the past to avoid dependency hells for users. We are in the process to replace more and more relocations with plugins having a different classloader. However, I can't tell you if that will also work for zookeeper and when that would happen. If you

Re: Question: Determining Total Recovery Time

2020-02-11 Thread Morgan Geldenhuys
Thanks for the advice, I will look into it. I had a quick think about another simple solution, but we would need a hook into the checkpoint process from the task/operator perspective, which I haven't looked into yet. It would work like this: - The sink operators (?) would keep a local copy of

Re: NoClassDefFoundError when an exception is throw inside invoke in a Custom Sink

2020-02-11 Thread Arvid Heise
Hi David, this seems to be a bug in our s3 plugin. The joda dependency should be bundled there. Are you using s3 as a plugin by any chance? Which Flink version are you using? If you are using s3 as a plugin, you could put joda in your plugin folder like this flink-dist ├── conf ├── lib ... └──

rocksdb max open file descriptor issue crashed application

2020-02-11 Thread ApoorvK
The Flink app is crashing due to a "too many open files" issue. The app currently has 300 operators and a 60 GB state size. It suddenly opened around 35k files, up from 20k a few weeks before, so the app is crashing. I have raised the machine limit as well as the YARN limit to 60k, hoping it will not
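Since RocksDB keeps many SST files open, the per-process file-descriptor limit (`ulimit -n`) is the knob that matters here; RocksDB's own `max_open_files` option can also cap usage from the other side. A small sketch of inspecting and raising that limit from inside a process, the same limits a TaskManager process is subject to:

```python
import resource

# Query the per-process open-file limits (what "ulimit -n" reports).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft={soft} hard={hard}")

# A process may raise its own soft limit up to the hard limit without
# root privileges; going beyond the hard limit requires administrator
# action (e.g. limits.conf, or the YARN container settings mentioned here).
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```

Note that raising the limit only buys headroom; if the open-file count grew from 20k to 35k without a code change, checking which files are open (e.g. via `lsof` on the TaskManager PID) is still worthwhile.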