Re: Correlation between data streams/operators and threads

2017-11-17 Thread Nico Kruber
regarding 3. a) The taskmanager logs are missing, are there any? b) Also, the JobManager logs say you have 4 slots available in total - is this enough for your 5 devices scenario? c) The JobManager log, however, does not really reveal what it is currently doing, can you set the log level to DEBUG

Re: Correlation between data streams/operators and threads

2017-11-17 Thread Piotr Nowojski
Sorry for not responding but I was away. Regarding 1. One source operator, followed by multiple tasks with parallelism 1 (as visible on your screen shot) that share resource group will collapse to one task slot - only one TaskManager will execute all of your job. Because all of your events ar

Re: Flink memory leak

2017-11-17 Thread Piotr Nowojski
Thank you for those screenshots, they help a lot. However I do not see any obvious candidate for a memory leak. There is a slight upward trend in "G1 Old Gen”, but this can be misleading. To further analyse what’s going you need to run your test case for a longer time. Also you will need to tak

Re: org.apache.flink.runtime.io.network.NetworkEnvironment causing memory leak?

2017-11-17 Thread Piotr Nowojski
Hi, If the TM is not responding check the TM logs if there is some long gap in logs. There might be three main reasons for such gaps: 1. Machine is swapping - setup/configure your machine/processes that machine never swap (best to disable swap altogether) 2. Long GC full stops - look how to ana

Re: Akka configuration setting missing if RemoteEnvironment job is started from CLI

2017-11-17 Thread Lukas Kircher
Hi again, jfyi - not sure if this is correct, but it seems that differing flink jars seemed to be the problem. Here is how I get the RemoteEnvironment running on my setup: * create a maven project with the flink-quickstart-java archetype * no changes to pom.xml * add my custom job to the projec

Accessing Cassandra for reading and writing

2017-11-17 Thread André Schütz
Hi, we want to read and write from and to Cassandra. We found the Flink-Cassandra connector and added the JAR to the lib folder of the running Flink cluster (local machine). Trying to access the Cassandra database by adding the import to a notebook within Apache Zeppelin, resulted in the followi

all task managers reading from all kafka partitions

2017-11-17 Thread r. r.
Hi I have this strange problem: 4 task managers each with one task slot, attaching to the same Kafka topic which has 10 partitions. When I post a single message to the Kafka topic it seems that all 4 consumers fetch the message and start processing (confirmed by TM logs). If I run kafka-consum

Re: Apache Flink - Question about TriggerResult.FIRE

2017-11-17 Thread M Singh
Thanks Stefan and Aljoscha for your responses. Stefan - When I mentioned "new window" - I meant the next window being created.  Eg:  if the event was in w1 based processing time and the trigger returned FIRE - then after the window function is computed, what happens to the events in that window (

Re: Apache Flink - Question about TriggerResult.FIRE

2017-11-17 Thread M Singh
Also, Stefan - You mentioned  "In the first case, it is a new window without the previous elements, in the second case the window reflects the old contents plus all changes since the last trigger." I am assuming the first case is FIRE and second case is FIRE_AND_PURGE - I was thinking that in th

Re: all task managers reading from all kafka partitions

2017-11-17 Thread r. r.
Hi it's Flink 1.3.2, Kafka 0.10.2.0 I am starting 1 JM and 4 TM (with 1 task slot each). Then I deploy 4 times (via ./flink run -p1 x.jar), job parallelism is set to 1. A new thing I just noticed: if I start in parallel to the Flink jobs two kafka-console-consumer (with --consumer-propert

Re: all task managers reading from all kafka partitions

2017-11-17 Thread Gary Yao
Forgot to hit "reply all" in my last email. On Fri, Nov 17, 2017 at 8:26 PM, Gary Yao wrote: > Hi Robert, > > To get your desired behavior, you should start a single job with > parallelism set to 4. > > Flink does not rely on Kafka's consumer groups to distribute the > partitions to the parallel

Re: all task managers reading from all kafka partitions

2017-11-17 Thread r. r.
Hmm, but I want single slot task managers and multiple jobs so that if one job fails it doesn't bring the whole setup (for example 30+ parallel consumers) down. What setup would you advise? The job is quite heavy and might bring the VM down if run with such concurency in one JVM. Thanks!

Re: ElasticSearch 6

2017-11-17 Thread Fritz Budiyanto
Hi, I've tried Flink with ES6, and its causing exception thrown in ES6. Is the fix just matter of bumping es client version to 5.6 ? Could anyone familiar with ES connector confirm ? If this is just a matter of bumping the es client version, can we have this simple change in Flink 1.4 ? Thanks,