Re: getting duplicate messages from duplicate jobs

2019-01-28 Thread Tzu-Li (Gordon) Tai
Hi, Yes, Dawid is correct. The "group.id" setting in Flink's Kafka Consumer is only used for group offset fetching and committing offsets back to Kafka (only for exposure purposes, not used for processing guarantees). The Flink Kafka Consumer uses static partition assignment on the KafkaConsumer

Re: KeyBy is not creating different keyed streams for different keys

2019-01-28 Thread Kumar Bolar, Harshith
Typo: lines 1, 2 and 5 From: Harshith Kumar Bolar Date: Tuesday, 29 January 2019 at 10:16 AM To: "user@flink.apache.org" Subject: KeyBy is not creating different keyed streams for different keys Hi all, I'm reading a simple JSON string as input and keying the stream based on two fields A and

KeyBy is not creating different keyed streams for different keys

2019-01-28 Thread Kumar Bolar, Harshith
Hi all, I'm reading a simple JSON string as input and keying the stream based on two fields A and B. But KeyBy is generating the same keyed stream for different values of B but for a particular combination of A and B. The input: { "A": "352580084349898", "B": "1546559127", "C": "A"

Issue setting up Flink in Kubernetes

2019-01-28 Thread Timothy Victor
Hi - Has there been any update on the below issue? I am also facing the same problem. http://mail-archives.apache.org/mod_mbox/flink-user/201812.mbox/%3ccac2r2948lqsyu8nab5p7ydnhhmuox5i4jmyis9g7og6ic-1...@mail.gmail.com%3E There is a similar issue ( https://stackoverflow.com/questions/50806228

How autoscaling works on Kinesis Data Analytics for Java ?

2019-01-28 Thread Maxim Parkachov
Hi, I had impression, that in order to change parallelism, one need to stop Flink streaming job and re-start with new settings. According to https://docs.aws.amazon.com/kinesisanalytics/latest/java/how-scaling.html auto-scaling works out of the box. Could someone with experience of running Flink

Re: SQL Client (Streaming + Checkpoint)

2019-01-28 Thread Yun Tang
Hi Vijay Unfortunately, current sql-client does not support to configure to enable checkpoint. Current execution properties for sql-client support both batch and streaming environment while batch environment dose not support checkpoint. I prefer current sql-client as a tool for prototyping, not

SQL Client (Streaming + Checkpoint)

2019-01-28 Thread Vijay Srinivasaraghavan
It looks like the SQL client does not configure enable checkpoint while submitting the streaming job query. Did anyone notice this behavior? FYI, I am using 1.6.x branch. RegardsVijay

How to load multiple same-format files with single batch job?

2019-01-28 Thread françois lacombe
Hi all, I'm wondering if it's possible and what's the best way to achieve the loading of multiple files with a Json source to a JDBC sink ? I'm running Flink 1.7.0 Let's say I have about 1500 files with the same structure (same format, schema, everything) and I want to load them with a *batch* jo

Re: Trouble migrating state from 1.6.3 to 1.7.1

2019-01-28 Thread pwestermann
Just using a copy of AvroSerializer with the serialVersionUID set to 1 did not work. There was a NullPointerException on the next checkpoint, probably because previousSchema doesn't exist in the old serializer. However, the version from the PR with serialVersionUID set to 1 worked. (I didn't want

Re: Sampling rate higher than 1Khz

2019-01-28 Thread Piotr Nowojski
Hi, Maybe stupid idea, but does anything prevents a user from pretending that watermarks/event times are in different unit, for example microseconds? Of course assuming using row/event time and not using processing time for anything? Piotrek > On 28 Jan 2019, at 14:58, Tzu-Li (Gordon) Tai wr

Re: Sampling rate higher than 1Khz

2019-01-28 Thread Tzu-Li (Gordon) Tai
Hi! Yes, Flink's watermark timestamps are in milliseconds, which means that time-based operators such as time window operators will be fired at a per-millisecond granularity. Whether or not this introduces "latency" in the pipeline depends on the granularity of your time window operations; if you

Re: How Flink prioritise read from kafka topics and partitions ?

2019-01-28 Thread Tzu-Li (Gordon) Tai
Hi Sohi! On Wed, Jan 23, 2019 at 9:01 PM sohimankotia wrote: > Hi, > > Let's say I have flink Kafka consumer read from 3 topics , [ T-1 ,T-2,T-3 > ] > . > > - T1 and T2 are having partitions equal to 100 > - T3 is having partitions equal to 60 > - Flink Task (parallelism is 50) > There isn't a

Re: Trouble migrating state from 1.6.3 to 1.7.1

2019-01-28 Thread Tzu-Li (Gordon) Tai
Thanks Peter! Yes, it would also be great if you try the patch in https://github.com/apache/flink/pull/7580 out and see if that works for you. On Mon, Jan 28, 2019 at 7:47 PM pwestermann wrote: > Hi Gordon, > > We should be able to wait for 1.7.2 but I will also test the workaround and > post if

Sampling rate higher than 1Khz

2019-01-28 Thread Nicholas Walton
Flinks watermarks are in milliseconds. I have time sampled off a sensor at a rate exceeding 1Khz or 1 per millisecond. Is there a way to handle timestamp granularity below milliseconds, or will I have to generate timestamp for the millisecond value preceding that associated with the sensor readi

Re: How to infer table schema from Avro file

2019-01-28 Thread Yun Tang
+ Flink Users From: Yun Tang Sent: Monday, January 28, 2019 19:46 To: Soheil Pourbafrani Subject: Re: How to infer table schema from Avro file Hi Soheil You should provide your generated Avro record class as the type of AvroInputFormat not Avro's GenericRecord c

Re: Trouble migrating state from 1.6.3 to 1.7.1

2019-01-28 Thread pwestermann
Hi Gordon, We should be able to wait for 1.7.2 but I will also test the workaround and post if I run into further issues. Thanks a lot! Peter -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Trouble migrating state from 1.6.3 to 1.7.1

2019-01-28 Thread Tzu-Li (Gordon) Tai
Hi, Thanks for all the information and reporting this. We've identified this to be an actual issue: https://issues.apache.org/jira/browse/FLINK-11436. There's also a PR opened to fix this, and is currently under review: https://github.com/apache/flink/pull/7580. I'll make sure that this is fixed

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-28 Thread Ufuk Celebi
Hey Aaron, sorry for the late reply (again). (1) I think that your final result is in line with what I have reproduced in https://issues.apache.org/jira/browse/FLINK-11402. (2) I think renaming the file would not help as it will still be loaded multiple times when the jobs restarts (as it happen

CfP: LASCAR 2019 - Workshop on Large Scale RDF Analytics || @ESWC 2019 || 2nd – 6th June 2019 || Portorož, Slovenia

2019-01-28 Thread Gezim Sejdiu
Dear all **Apologies for cross-posting** Call for Papers & Posters ESWC - Workshop on Large Scale RDF Analytics *June 3, 2019* Website: http://lascar.sda.tech/ Twitter: @lascarworkshop ABSTRACT: LASCAR-19, Workshop on Large Scale RDF Analytics, at the