Flink serialization errors in a batch job

2022-05-03 Thread Yunus Olgun
Hi, We're running a large Flink batch job and it sometimes throws serialization errors in the middle of the job. It is always the same operator, but the error can differ. Subsequent attempts then succeed, or sometimes the attempts get exhausted and the whole job is retried. The job is basically

Flink recovers even cancelled jobs after zookeeper failure

2018-05-15 Thread Yunus Olgun
Hi, We are using Flink 1.4.0 in ZooKeeper high-availability mode with externalized checkpoints. Today, after we restarted a ZooKeeper node, several Flink clusters lost their connection to ZooKeeper. This triggered a leader election at the affected clusters. After the leader election, the

Kafka Producer writeToKafkaWithTimestamps; name, uid, parallelism fails

2017-10-05 Thread Yunus Olgun
Hi, I am using Flink 1.3.2. When I try to use the Kafka producer with timestamps, it fails to set name, uid, or parallelism; it uses the default values. ——— FlinkKafkaProducer010.FlinkKafkaProducer010Configuration producer = FlinkKafkaProducer010 .writeToKafkaWithTimestamps(stream, topicName,

Re: CustomPartitioner that simulates ForwardPartitioner and watermarks

2017-09-28 Thread Yunus Olgun
introduce some latency, e.g. >> measuring throughput in taskA >> and sending it to a side output with a taskID, then broadcasting the side >> output to a downstream operator >> which is sth like a coprocess function (taskB) and receives the original >> stream and the s

Re: CustomPartitioner that simulates ForwardPartitioner and watermarks

2017-09-27 Thread Yunus Olgun
more what is your use case? > This way we may figure out another approach to achieve your goal. > In fact, I am not sure if you earn anything by broadcasting the watermark, > other than > re-implementing (to some extent) Flink’s windowing mechanism. > > Thanks, > Kos

CoGroupedStreams.WithWindow sideOutputLateData and allowedLateness

2017-08-28 Thread Yunus Olgun
Hi, WindowedStream has sideOutputLateData and allowedLateness methods to handle late data. Similar functionality on CoGroupedStreams would be nice. As it is, late data is silently ignored, which is error-prone. - Is there a reason it does not exist? - Any suggested workaround?
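One workaround sometimes suggested for this gap is to filter late elements yourself before the co-group, using the current watermark and an allowed-lateness bound. The sketch below is not Flink API — `LatenessSplitter` and its fields are hypothetical names — it only illustrates, in self-contained Java, the lateness test that WindowedStream's sideOutputLateData/allowedLateness apply internally.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch: split elements into on-time and late,
 * mimicking what allowedLateness + sideOutputLateData do on a
 * WindowedStream. In a real job this logic would sit in a
 * ProcessFunction in front of the coGroup, with the late list
 * replaced by a side output.
 */
final class LatenessSplitter {
    final long allowedLatenessMs;
    final List<Long> onTime = new ArrayList<>(); // would flow into the co-group
    final List<Long> late = new ArrayList<>();   // would go to a side output

    LatenessSplitter(long allowedLatenessMs) {
        this.allowedLatenessMs = allowedLatenessMs;
    }

    /**
     * An element counts as late when its timestamp is older than
     * the current watermark minus the allowed lateness.
     */
    void accept(long elementTimestamp, long currentWatermark) {
        if (elementTimestamp < currentWatermark - allowedLatenessMs) {
            late.add(elementTimestamp);
        } else {
            onTime.add(elementTimestamp);
        }
    }
}
```

The same predicate can be reused on both input streams of the co-group, so neither side silently drops data.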

Synchronized Kafka sources

2017-08-04 Thread Yunus Olgun
Hi, Is it possible to synchronize two Kafka sources, so they can consume from different Kafka topics at close-enough event times? My use case: I have two Kafka topics, A (very large) and B (large). There is a one-to-one-or-zero mapping between A and B. The topology is simply: join A and B in a
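A common approach to this kind of source synchronization is to throttle whichever source has run ahead in event time until the other catches up. The sketch below is self-contained Java with hypothetical names (`SourceAligner`, `maxDriftMs`), not Flink API; it only shows the alignment decision that a custom source or operator would apply.

```java
/**
 * Hypothetical sketch: decide whether one of two sources should
 * pause reading so that their event times stay within maxDriftMs
 * of each other. A real implementation would call this from each
 * source, comparing its own event time against the other source's
 * last known event time (shared e.g. via an external store).
 */
final class SourceAligner {
    final long maxDriftMs;

    SourceAligner(long maxDriftMs) {
        this.maxDriftMs = maxDriftMs;
    }

    /**
     * Returns true when the calling source is more than maxDriftMs
     * ahead of the other source and should pause (or sleep) before
     * fetching the next batch.
     */
    boolean shouldPause(long myEventTime, long otherEventTime) {
        return myEventTime - otherEventTime > maxDriftMs;
    }
}
```

With, say, a one-minute drift bound, the large topic A stops racing ahead of B, which keeps the join's window state from growing unboundedly while B catches up.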