Datastream api implementation of a third party pyflink connector

2021-07-19 Thread Wang, Zhongle
Hi, I'm working on a PyFlink DataStream connector for Pravega and wish to use a data source other than Kafka. Currently the Kafka connector for the Python DataStream API is implemented using a `get_gateway` function which creates a binding to Java in `FlinkKafkaConsumer`. So if I want to cre

Apache Flink Kafka Connector not found Error

2021-07-19 Thread Taimoor Bhatti
I'm having some trouble with using the Flink DataStream API with the Kafka Connector. There don't seem to be great resources on the internet which can explain the issue I'm having. My project is here: https://github.com/sysarcher/flink-scala-tests I want to I'm unable to use FlinkKafkaConsumer

Re: Delay data elements in pipeline by X minutes

2021-07-19 Thread Dario Heinisch
Hey Jan, No, it isn't a logical constraint. The reason is that there are different kinds of users: some pay for live data, while others want a cheaper version where the data is delayed. But what happens if I add a random key (let's say a UUID)? Isn't that bad for performance? Then for every Objec

Re: Delay data elements in pipeline by X minutes

2021-07-19 Thread Jan Lukavský
I don't want to speak for Apache Flink - I'm using it via Apache Beam only - but generally speaking, each key will have to be held in state up to some moment when it can be garbage collected. This moment is defined (at least in the Apache Beam case) as the timestamp of end of window + allowed l

Re: Set job specific resources in one StreamTableEnvironment

2021-07-19 Thread Yun Gao
Hi Paul, For parallelism, you should be able to set it with `table.exec.resource.default-parallelism` [1], and an example of setting the parameter is in the first few paragraphs. But regarding the total process memory, I think it should only be set at the cluster level since it is per-cluster

Re: Apache Flink Kafka Connector not found Error

2021-07-19 Thread Yun Gao
Hi Taimoor, I think that is right regarding the provided dependencies: we need to manually include them in the classpath via the IDEA options. Regarding the FlinkKafkaConsumer issue, I tried locally and it seems to work after adding the import, namely import org.apache.flink.stre

Re: Subpar performance of temporal joins with RocksDB backend

2021-07-19 Thread Adrian Bednarz
Thanks Maciej, I think this has helped a bit. We are now at 2k/3k eps on a single node. Now, I just wonder if this isn't too slow for a single node and such a simple query. On Sat, Jul 10, 2021 at 9:28 AM Maciej Bryński wrote: > Could you please set 2 configuration options: > - state.backend.roc

Re: Kafka Consumer Retries Failing

2021-07-19 Thread Piotr Nowojski
Thanks for the update. > Could the backpressure timeout and heartbeat timeout be because of Heap Usage close to Max configured? Could be. This is one of the things I had in mind under overloaded in: > might be related to one another via some different deeper problem (broken network environment,

Re: Apache Flink Kafka Connector not found Error

2021-07-19 Thread Taimoor Bhatti
Hello Yun, Many thanks for the reply... For some reason I'm not able to import org.apache.flink.streaming.connectors within the IDE... I get the following errors: object connectors is not a member of package org.apache.flink.streaming import org.apache.flink.streaming.connectors.kafka.FlinkKafk

Re: Kafka Consumer Retries Failing

2021-07-19 Thread Rahul Patwari
Hi Piotrek, I was just about to update. You are right. The issue is because of a stalled task manager due to High Heap Usage. And the High Heap Usage is because of a Memory Leak in a library we are using. Thanks for your help. On Mon, Jul 19, 2021 at 8:31 PM Piotr Nowojski wrote: > Thanks for

Re: Apache Flink Kafka Connector not found Error

2021-07-19 Thread Yun Gao
Hi Taimoor, It seems IntelliJ sometimes does not index the project correctly. Perhaps you could choose mvn -> reimport project from the context menu; if that still does not work, you might try removing the .idea and .iml files and re-opening the project. Best, Yun -

Re: Apache Flink Kafka Connector not found Error

2021-07-19 Thread Taimoor Bhatti
THANKSS... This was it!! I did:- CTRL+SHIFT+A and typed "Reload All Maven Projects" Building the project didn't result in errors. I don't think I could've resolved this... Thanks again Yun!!! From: Yun Gao Sent: Monday, July 19, 2021 5:23 PM To: Taimoor Bh

Re: Kafka Consumer Retries Failing

2021-07-19 Thread Piotr Nowojski
Ok, thanks for the update. Great that you managed to resolve this issue :) Best, Piotrek On Mon, 19 Jul 2021 at 17:13, Rahul Patwari wrote: > Hi Piotrek, > > I was just about to update. > You are right. The issue is because of a stalled task manager due to High > Heap Usage. And the High Heap

Can we share state between different keys in the same window?

2021-07-19 Thread Sweta Kalakuntla
Hi, I need to query a database (not as a source, but for additional information) in a ProcessFunction. I want to save the results in state, or in some other way, so that I can use the data for other keys in the same window. What are my options? Thanks, sweta

Stateful Functions Status

2021-07-19 Thread Omid Bakhshandeh
Hi, We are evaluating Flink Stateful Functions in our company and we are trying to see if it fits our needs. I'm hoping to get some help from the community as we do this. There are a couple of primary questions that can speed up our process: 1- It seems in version 2.2.0, in the Python SDK, it wa

Re: Datastream api implementation of a third party pyflink connector

2021-07-19 Thread Xingbo Huang
Hi Zhongle Wang, Your understanding is correct. First, you need to provide an implementation of a Java connector, then add its jar as a dependency [1], and finally add a Python connector wrapper. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/python/faq/#adding-jar-
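The three steps described above might look roughly like the following in PyFlink. This is only a sketch under stated assumptions: the Java class name `com.example.MyJavaSource` and the jar path are hypothetical placeholders (not Pravega's real class names), and the Java class is assumed to implement Flink's Java `SourceFunction` with a public no-argument constructor. `get_gateway`, `SourceFunction`, `add_jars` and `add_source` are existing PyFlink APIs; the helper names are made up for illustration.

```python
from pathlib import Path

# Hypothetical fully qualified name of a Java SourceFunction shipped in your jar.
JAVA_SOURCE_CLASS = "com.example.MyJavaSource"


def jar_url(path):
    """Turn a local jar path into the file:// URL form that add_jars() expects."""
    return Path(path).resolve().as_uri()


def make_source():
    """Step 3: a thin Python wrapper around the Java connector class."""
    # Imported lazily so jar_url() can be used without a Flink installation.
    from pyflink.datastream.functions import SourceFunction
    from pyflink.java_gateway import get_gateway

    # Look the class up through the Py4J gateway and instantiate it
    # (assumes a public no-argument constructor on the hypothetical class).
    j_source = get_gateway().jvm.__getattr__(JAVA_SOURCE_CLASS)()
    return SourceFunction(j_source)


def build_job(jar_path):
    """Steps 2 and 3 together: register the jar, then attach the wrapped source."""
    from pyflink.datastream import StreamExecutionEnvironment

    env = StreamExecutionEnvironment.get_execution_environment()
    # Step 2: make the Java connector (step 1) visible before it is looked up.
    env.add_jars(jar_url(jar_path))
    stream = env.add_source(make_source(), "my-java-source")
    return env, stream
```

A real connector would normally pass configuration (endpoints, deserialization schema, and so on) to the Java constructor through the gateway instead of relying on a no-argument constructor.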

RE: Datastream api implementation of a third party pyflink connector

2021-07-19 Thread Wang, Zhongle
Hi Xingbo, Thanks for the reassurance. PS: The Java implementation of the Pravega connector is at https://github.com/pravega/flink-connectors/packages/ Best, Zhongle Wang From: Xingbo Huang Sent: Tuesday, July 20, 2021 9:58 AM To: Wang, Zhongle Cc: user@flink.apache.org Subject: Re: Datastream ap

Re: Can we share state between different keys in the same window?

2021-07-19 Thread JING ZHANG
Hi sweta, The state of different keys is isolated from each other. This means you can read/write the state of the current key in `ProcessFunction`/`KeyedProcessFunction`/`ProcessWindowFunction`, but it is not possible to read/write the state of other keys. Would you please describe your business requirement, let's
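The isolation described above can be illustrated with a toy model in plain Python. This is not Flink API: `KeyedValueState`, `process` and the `lookup` callback are hypothetical names that only mimic how keyed state is scoped to the key of the element currently being processed.

```python
class KeyedValueState:
    """Toy stand-in for keyed state: one slot per key, scoped to the current key."""

    def __init__(self):
        self._slots = {}
        self._current_key = None

    def set_current_key(self, key):
        # In Flink the runtime sets the key context before each element is processed.
        self._current_key = key

    def value(self):
        # Only the slot of the current key is reachable.
        return self._slots.get(self._current_key)

    def update(self, value):
        self._slots[self._current_key] = value


def process(state, key, lookup):
    """Cache the result of an external lookup in state for the element's key."""
    state.set_current_key(key)
    cached = state.value()
    if cached is None:
        cached = lookup(key)  # e.g. a database query
        state.update(cached)
    return cached
```

In this model a result cached while processing key `"a"` is invisible while processing key `"b"`, so the lookup runs once per key rather than once per window.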

Some question of RocksDB state backend on ARM os

2021-07-19 Thread Wanghui (HiCampus)
Hi all: When I use RocksDB as state backend on an aarch64 system, the following error occurs: 1. Does the aarch64 system not support rocksdb? 2. If not, is there a support plan for later versions of flink? Caused by: java.lang.Exception: Exception while creating StreamOper

Topic assignment across Flink Kafka Consumer

2021-07-19 Thread Prasanna kumar
Hi, We have a Flink job reading from multiple Kafka topics based on a regex pattern. What we have found is that the topics are not distributed evenly among the Kafka consumers. For example, if there are 8 topics and 4 Kafka consumer operators, 1 consumer is assigned 6 topics, 2 consume
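One way to explore this kind of skew is to model the assignment in plain Python. The sketch below assumes that each topic's starting subtask is derived from a hash of the topic name and that partitions are then spread round-robin from there; this mirrors what Flink's `KafkaTopicPartitionAssigner` is understood to do, so verify it against the Flink version you run. The function names and the topic names in the usage note are made up for illustration.

```python
from collections import Counter


def java_string_hash(s):
    """Java's String.hashCode() for ASCII/BMP strings, as a signed 32-bit int."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - (1 << 32) if h >= (1 << 31) else h


def assign(topic, partition, num_subtasks):
    """Subtask index for one topic partition under the assumed hash-based scheme."""
    start = ((java_string_hash(topic) * 31) & 0x7FFFFFFF) % num_subtasks
    return (start + partition) % num_subtasks


def topics_per_subtask(topics, num_subtasks):
    """Count how many single-partition topics land on each subtask."""
    counts = Counter(assign(t, 0, num_subtasks) for t in topics)
    return [counts.get(i, 0) for i in range(num_subtasks)]
```

Calling `topics_per_subtask([f"events-{i}" for i in range(8)], 4)` shows the distribution for eight hypothetical single-partition topics. Because the starting subtask depends only on the topic name's hash, nothing in this scheme guarantees an even spread when every topic has few partitions.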