Re: how is data partitioned and distributed for connected stream

2017-08-22 Thread Till Rohrmann
Hi, if all operators have the same parallelism, then there will be a pointwise connection. This means all elements arriving at s1_x and s2_x will be forwarded to s3_x with _x denoting the parallel subtask. Thus, to answer your second question, the single s1 element will only be present at one subt

Re: Global State and Scaling

2017-08-22 Thread Till Rohrmann
Hi Elias, you're right, we currently don't support proper broadcast state. Hope to add support for this in the near future. The maximum parallelism only affects the keyed state because it defines how many key groups there are. The key groups are the smallest unit of state which can be re-partitio
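A rough illustration of the key-group idea described in the message above, written as a standalone Python model rather than Flink code. The assumption is that a key is first hashed into one of `max_parallelism` key groups, and that each parallel subtask then owns a contiguous range of key groups. The CRC32 hash and the function names are stand-ins chosen for this sketch; the range formula follows what Flink's key-group assignment does as far as I know.

```python
import zlib


def key_group_for(key: str, max_parallelism: int) -> int:
    """Map a key to a key group in [0, max_parallelism).

    Assumption: CRC32 is only a stand-in for the hash Flink applies to the key.
    """
    return zlib.crc32(key.encode("utf-8")) % max_parallelism


def subtask_for(key_group: int, max_parallelism: int, parallelism: int) -> int:
    """Map a key group to the parallel subtask that owns it.

    Each subtask gets one contiguous range of key groups.
    """
    return key_group * parallelism // max_parallelism


def assignment(keys, max_parallelism: int, parallelism: int) -> dict:
    """Return {key: subtask index} for the given keys."""
    return {
        k: subtask_for(key_group_for(k, max_parallelism), max_parallelism, parallelism)
        for k in keys
    }
```

In this model, rescaling changes only `parallelism`: a key's key group stays the same, and whole key groups move between subtasks. That is why the maximum parallelism bounds how far keyed state can be scaled out.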

Re: Deleting files in continuous processing

2017-08-22 Thread Till Rohrmann
Hi Mohit, Flink does not support this behaviour out of the box afaik. I think you have to write your own source function or extend ContinuousFileMonitoringFunction in order to do that. Cheers, Till On Mon, Aug 21, 2017 at 11:07 PM, Mohit Anchlia wrote: > Just checking to see if there is a wa

Re: Prioritize DataStream

2017-08-22 Thread Till Rohrmann
Hi Elias, sorry for the slow answer. You were right that the answer is currently no. However, people are currently working on changing the way the stream operators work. This will allow the operator to decide from which input to read next. Having such a functionality will enable us to implement p

Re: Question about parallelism

2017-08-22 Thread Till Rohrmann
Hi Jerry, you can set the global parallelism via the ExecutionEnvironment#setParallelism. If you call setParallelism on an operator, then it only changes the parallelism of this operator. The parallelism of an operator means how many parallel instances of this operator will be executed. Thus, it a

Re: akka timeout

2017-08-22 Thread Till Rohrmann
Hi Steven, quick correction for Flink 1.2. Indeed the MetricFetcher does not pick up the right timeout value from the configuration. Instead it uses a hardcoded 10s timeout. This has only been changed recently and is already committed in the master. So with the next release 1.4 it will properly pi

Re: Measure Job Execution Time

2017-08-22 Thread Till Rohrmann
Hi Paolo, could it be that there is an exception being thrown and that's why the last println is not executed? I assume that you want to measure the time of a batch program, right? Cheers, Till On Fri, Aug 18, 2017 at 6:10 PM, Paolo Cristofanelli < cristofanelli.pa...@gmail.com> wrote: > I woul

Re: StandaloneResourceManager failed to associate with JobManager leader

2017-08-22 Thread Till Rohrmann
Hi Hao Sun, have you checked that one can resolve the hostname flink_jobmanager from within the container? This is required to connect to the JobManager. If this is the case, then log files with DEBUG log level would be helpful to track down the problem. Cheers, Till On Wed, Aug 16, 2017 at 5:35

Re: Flink doesn't free YARN slots after restarting

2017-08-22 Thread Till Rohrmann
Hi Bowen, sorry for my late answer. I dug through some of the logs and it seems that you have the following problem: 1. Once in a while the Kinesis producer fails with a UserRecordFailedException saying “Expired while waiting in HttpClient queue Record has reached expiration”. This s

HBase connection problems on Flink 1.3.1

2017-08-22 Thread Flavio Pompermaier
Hi to all, I'm trying to connect to HBase on Flink 1.3.1 but it seems that *HBaseConfiguration.create()* doesn't work correctly (because the ZooKeeper properties are not read from hbase-site.xml). I've also tried to put the hbase-site.xml in the flink conf folder but it didn't work. What should I do?

Re: HBase connection problems on Flink 1.3.1

2017-08-22 Thread Flavio Pompermaier
I was able to fix the problem by adding the following line within bin/config.sh: HBASE_CONF_DIR="/etc/hbase/conf" Indeed, Cloudera 5.9 doesn't set HBASE_CONF_DIR env variable automatically. Another possible solution could be to set this parameter manually into .bash_profile or .profile (not .bash

Re: HBase connection problems on Flink 1.3.1

2017-08-22 Thread Till Rohrmann
Thanks for sharing your solution with the community Flavio. Cheers, Till On Tue, Aug 22, 2017 at 2:34 PM, Flavio Pompermaier wrote: > I was able to fix the problem by adding the following line within > bin/config.sh: > > HBASE_CONF_DIR="/etc/hbase/conf" > > Indeed, Cloudera 5.9 doesn't set HBAS

Exception with Avro Serialization on RocksDBStateBackend

2017-08-22 Thread Biplob Biswas
Hi, I am getting the following exception in my code. I can see that something goes wrong while serializing my object, whose class looks something like this: https://gist.github.com/revolutionisme/1eea5ccf5e1d4a5452f27a1fd5c05ff1 The exact cause it seems is some field inside my nes

Exception when trying to run flink twitter example

2017-08-22 Thread Krishnanand Khambadkone
Hi, I have created a fat jar with my twitterexample classes and am running it like this: ~/flink-1.3.2/build-target/bin/flink run -c TwitterExample ./flinktwitter.jar --twitter-source.consumerKey --twitter-source.consumerSecret --twitter-source.token --twitter-source.tokenSecret I am pro

Re: Exception with Avro Serialization on RocksDBStateBackend

2017-08-22 Thread Till Rohrmann
Hi Biplob, have you told Avro to allow null for fields in your schema? If yes, then could you share the Avro schema, the version of Flink as well as the Avro version with us? This would help with further understanding the problem. Cheers, Till On Tue, Aug 22, 2017 at 5:42 PM, Biplob Biswas wrot

Re: Exception when trying to run flink twitter example

2017-08-22 Thread Till Rohrmann
Hi Krishnanand, could you check that you have the build.properties file in your fat jar containing the field version=? Cheers, Till On Tue, Aug 22, 2017 at 6:19 PM, Krishnanand Khambadkone < kkhambadk...@yahoo.com> wrote: > Hi, I have created a fat jar with my twitterexample classes and am > ru

Re: StandaloneResourceManager failed to associate with JobManager leader

2017-08-22 Thread Hao Sun
Thanks Till, the DEBUG log level is a good idea. I figured it out. I made a mistake with `-` and `_`. On Tue, Aug 22, 2017 at 1:39 AM Till Rohrmann wrote: > Hi Hao Sun, > > have you checked that one can resolve the hostname flink_jobmanager from > within the container? This is required to connec

Re: Exception when trying to run flink twitter example

2017-08-22 Thread Krishnanand Khambadkone
Till, thank you for the prompt response. Yes, including the build.properties (version = 2.2.0) made the exception go away. Now there is no exception, but no tweets are output either. The program just sits there doing nothing. I have not specified an output directory, so the tweets are sent to stdout.

Re: Flink HA with Kubernetes, without Zookeeper

2017-08-22 Thread James Bucher
Just wanted to throw in a couple more details here from what I have learned from working with Kubernetes. All processes restart (a lost JobManager restarts eventually). Should be given in Kubernetes: * This works very well, we run multiple jobs with a single Jobmanager and Flink/Kubernetes

Re: Flink HA with Kubernetes, without Zookeeper

2017-08-22 Thread Hao Sun
Great suggestions, the etcd operator is very interesting, thanks James. On Tue, Aug 22, 2017, 12:42 James Bucher wrote: > Just wanted to throw in a couple more details here from what I have > learned from working with Kubernetes. > > *All processes restart (a lost JobManager restarts eventually)

Job submission timeout

2017-08-22 Thread Vishnu Viswanath
Hi, after I submit the job, the client times out after 10 seconds (I guess the JobManager is taking a long time to build the graph; it is a pretty big JobGraph). *Caused by: org.apache.flink.runtime.client.JobTimeoutException: JobManager did not respond within 1 milliseconds* * at org.apache.flink.runt

Question about windowing

2017-08-22 Thread Jerry Peng
Hello, I have a question regarding windowing and triggering. I am trying to connect the dots between the simple windowing api e.g. stream.countWindow(1000, 100) to the underlying representation using triggers and evictors api: stream.window(GlobalWindows.create()) .evictor(CountEvictor.of(10
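To make the relationship in the question above concrete, here is a small standalone Python simulation for a single key. It assumes that `countWindow(size, slide)` behaves like a global window combined with a count evictor of `size` and a count trigger of `slide`. The function name and the list-based buffer are inventions of this sketch, not Flink API.

```python
def sliding_count_window(elements, size, slide):
    """Simulate countWindow(size, slide) for one key.

    Returns the list of window contents emitted at each firing.
    """
    buffer = []      # global window: collects elements and never closes on its own
    since_fire = 0   # elements seen since the last firing
    fired = []
    for element in elements:
        buffer.append(element)
        since_fire += 1
        if since_fire == slide:          # count trigger: fire every `slide` elements
            since_fire = 0
            buffer = buffer[-size:]      # count evictor: keep only the newest `size`
            fired.append(list(buffer))
    return fired
```

Under these assumptions, `sliding_count_window(range(1, 8), size=4, slide=2)` fires three times. The first firing holds only two elements, because the trigger counts arrivals and does not wait for the window to fill up.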

Reset Kafka Consumer using Flink Consumer 10 API

2017-08-22 Thread sohimankotia
Hi, I am trying to replay Kafka logs from a specific offset, but I am not able to make it work. Using ref: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/connectors/kafka.html#kafka-consumers-start-position-configuration My code: import org.apache.flink.streaming.api.data

Re: Job submission timeout

2017-08-22 Thread Vishnu Viswanath
Never mind, it was a silly mistake, I used "=" instead of ":" while setting akka.ask.timeout. Now it works fine! On Tue, Aug 22, 2017 at 5:10 PM, Vishnu Viswanath < vishnu.viswanat...@gmail.com> wrote: > Hi, > > After I submit the job the client timeout after 10 seconds( Guess Job > manager is t
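The "=" versus ":" mix-up described above can be illustrated with a small standalone Python sketch of a `key: value` line parser. This is a simplified model, not Flink's actual configuration loader, and the value `60 s` is only an illustrative assumption. A line written with "=" does not match the `key: value` form, so the setting never takes effect and the default stays in place.

```python
def parse_conf(text):
    """Parse 'key: value' lines; return (config dict, list of skipped lines)."""
    config, skipped = {}, []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and surrounding blanks
        if not line:
            continue
        parts = line.split(": ", 1)           # key and value are separated by ': '
        if len(parts) != 2:
            skipped.append(line)              # e.g. 'key=value' lands here
            continue
        config[parts[0].strip()] = parts[1].strip()
    return config, skipped
```

For example, `parse_conf("akka.ask.timeout: 60 s")` yields the setting, while `parse_conf("akka.ask.timeout=60 s")` yields an empty configuration and reports the line as skipped.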