Re: The implementation of the RichSinkFunction is not serializable.

2017-08-28 Thread Federico D'Ambrosio
Hello everyone, I solved my issue by using an Array[Byte] as a parameter, instead of the explicit HTableDescriptor parameter. This way I can instantiate the TableDescriptor inside the open method of OutputFormat using the static method HTableDescriptor.parseFrom. In the end, marking conf, table an
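
The technique described above can be sketched stand-alone: keep only a serializable byte[] field in the function, mark the heavy object transient, and rebuild it in open(). FakeDescriptor below is a hypothetical stand-in for HTableDescriptor (in the real code the rebuild would call HTableDescriptor.parseFrom(bytes)); this is only an illustration of the pattern, not HBase or Flink code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

// Pattern sketch: ship only serializable bytes, rebuild the
// non-serializable object per task in open().
class SerializableSinkSketch implements Serializable {
    private final byte[] descriptorBytes;        // serializable state
    private transient FakeDescriptor descriptor; // rebuilt in open(), not shipped

    // hypothetical stand-in for a non-serializable descriptor class
    static class FakeDescriptor {
        final String name;
        FakeDescriptor(String name) { this.name = name; }
        static FakeDescriptor parseFrom(byte[] bytes) {
            return new FakeDescriptor(new String(bytes, StandardCharsets.UTF_8));
        }
    }

    SerializableSinkSketch(byte[] descriptorBytes) {
        this.descriptorBytes = descriptorBytes;
    }

    // analogous to RichSinkFunction#open / OutputFormat#open
    public void open() {
        descriptor = FakeDescriptor.parseFrom(descriptorBytes);
    }

    public String descriptorName() { return descriptor.name; }

    // Java-serialize and deserialize, as happens when a function is shipped to a task
    static SerializableSinkSketch roundTrip(SerializableSinkSketch f) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
                out.writeObject(f);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bos.toByteArray()))) {
                return (SerializableSinkSketch) in.readObject();
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```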

Re: Flink Elastic Sink AWS ES

2017-08-28 Thread arpit srivastava
Hi Ant, Can you try this: curl -XGET 'http:///_cat/nodes?v&h=ip,port' This should give you ip and port. On Mon, Aug 28, 2017 at 3:42 AM, ant burton wrote: > Hi Arpit, > > The response from _nodes doesn’t contain an ip address in my case. Is > this something that you experienced? > > curl -XGE

Re: Issues in recovering state from last crash using custom sink

2017-08-28 Thread Aljoscha Krettek
Hi, How are you testing the recovery behaviour? Are you taking a savepoint, then shutting down, and then restarting the job from the savepoint? Best, Aljoscha > On 28. Aug 2017, at 00:28, vipul singh wrote: > > Hi all, > > I am working on a flink archiver application. In a gist this applicat

Re: Even out the number of generated windows

2017-08-28 Thread Aljoscha Krettek
Hi Bowen, There is no built-in TTL, but you can use a ProcessFunction to set a timer that clears state. ProcessFunction docs: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/process_function.html Best, Aljoscha > On 27. Aug 2017, at 19:19, Bowen Li wrote: > > Hi Rober

Re: Specific sink behaviour based on tuple key

2017-08-28 Thread Aljoscha Krettek
Hi, The key is not available directly to a user function. You would have to use, within that function, the same code that you use for your KeySelector. Best, Aljoscha > On 26. Aug 2017, at 10:01, Alexis Gendronneau wrote: > > Hi all, > > I am looking to customize a sink behaviour based on tupl
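
The suggestion above can be sketched stand-alone: factor the key extraction into one shared function and call it both where you key the stream and inside the sink. The Event record and selector below are illustrative assumptions, not Flink's KeySelector API.

```java
import java.util.function.Function;

// Sketch: one shared key-extraction function, reused by keyBy(...) and the sink.
class KeyReuseSketch {
    static class Event {
        final String user;
        final long amount;
        Event(String user, long amount) { this.user = user; this.amount = amount; }
    }

    // single source of truth for the key
    static final Function<Event, String> KEY_SELECTOR = e -> e.user;

    // stand-in for sink logic that branches on the re-derived key
    static String sinkTarget(Event e) {
        String key = KEY_SELECTOR.apply(e); // same code as the KeySelector
        return key.startsWith("vip-") ? "priority-table" : "default-table";
    }
}
```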

Re: [Error]TaskManager -RECEIVED SIGNAL 1: SIGHUP. Shutting down as requested

2017-08-28 Thread Ted Yu
See http://docs.oracle.com/cd/E19253-01/816-5166/6mbb1kq04/index.html Cheers On Sun, Aug 27, 2017 at 11:47 PM, Samim Ahmed wrote: > Hello Ted Yu, > > Thanks for your response and a sincere apology for the late reply. > > OS version : Solaris10. > Flink Version : flink-1.2.0-bin-hadoop2-scala_2.10 >

Re: Thoughts - Monitoring & Alerting if a Running Flink job ever kills

2017-08-28 Thread Aljoscha Krettek
Hi, There is no built-in feature for this, but you would use your metrics system for that, in my opinion. Best, Aljoscha > On 26. Aug 2017, at 00:49, Raja.Aravapalli wrote: > > Hi, > > Is there a way to set up alerting for when a running Flink job dies, for any > reason? > > Any thoughts p

Re: Question about watermark and window

2017-08-28 Thread Aljoscha Krettek
Hi Tony, I think your analyses are correct. Especially, yes, if you re-read the data the (ts=3) data should still be considered late if both consumers read with the same speed. If, however, (ts=3) is read before the other consumer reads (ts=8) then it should not be considered late, as you said.

Re: Database connection from job

2017-08-28 Thread Aljoscha Krettek
Hi Bart, I think you might be interested in the (admittedly short) section of the doc about RichFunctions: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/api_concepts.html#rich-functions

Re: Question about windowing

2017-08-28 Thread Aljoscha Krettek
Yes, this is a very good explanation, Tony! I'd like to add that "Evictor" is not really a good name for what it does. It should be more like "Keeper" or "Retainer" because what a "CountEvictor.of(1000)" really does is to evict everything but the last 1000 elements, so it should be called "Coun
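
The "retainer" semantics described above can be illustrated stand-alone: keep the last n elements and evict everything older. This sketch mimics that behavior on a plain deque; it is not the Flink Evictor API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustration: a count "evictor" of n really keeps the last n elements.
class KeepLastNSketch {
    static <T> List<T> keepLastN(List<T> elements, int n) {
        Deque<T> window = new ArrayDeque<>();
        for (T e : elements) {
            window.addLast(e);
            while (window.size() > n) {
                window.removeFirst(); // evict everything but the last n
            }
        }
        return new ArrayList<>(window);
    }
}
```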

Classloader issue with UDF's in DataStreamSource

2017-08-28 Thread Edward
I need help debugging a problem with using user defined functions in my DataStreamSource code. Here's the behavior: The first time I upload my jar to the Flink cluster and submit the job, it runs fine. For any subsequent runs of the same job, it's giving me a NoClassDefFound error on one of my UDF

Null Pointer Exception on Trying to read a message from Kafka

2017-08-28 Thread Sridhar Chellappa
Folks, I have a KafkaConsumer that I am trying to read messages from. When I try to create a DataStream from the KafkaConsumer (env.addSource()) I get the following exception. Any idea how this can happen? java.lang.NullPointerException at org.apache.flink.streaming.runtime.tasks.Ope

Re: Null Pointer Exception on Trying to read a message from Kafka

2017-08-28 Thread Ted Yu
Which version of Flink / Kafka are you using ? Can you show the snippet of code where you create the DataStream ? Cheers On Mon, Aug 28, 2017 at 7:38 AM, Sridhar Chellappa wrote: > Folks, > > I have a KafkaConsumer that I am trying to read messages from. When I try > to create a DataStream fro

Off heap memory issue

2017-08-28 Thread Javier Lopez
Hi all, we are starting a lot of Flink jobs (streaming), and after we have started 200 or more jobs we see that the non-heap memory in the taskmanagers increases a lot, to the point of killing the instances. We found out that every time we start a new job, the committed non-heap memory increases b

CoGroupedStreams.WithWindow sideOutputLateData and allowedLateness

2017-08-28 Thread Yunus Olgun
Hi, WindowedStream has sideOutputLateData and allowedLateness methods to handle late data. A similar functionality at CoGroupedStreams would have been nice. As it is, it silently ignores late data and it is error-prone. - Is there a reason it does not exist? - Any suggested workaround?
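
One hypothetical workaround for the missing allowedLateness/sideOutputLateData on coGroup is to drop (or divert) late records yourself before the coGroup, by comparing each record's timestamp to the current watermark. The sketch below models the watermark as a plain long and records as (timestamp, payload) pairs; it is not the Flink API.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: pre-filter late elements before a coGroup, instead of letting
// the coGroup silently drop them.
class LateFilterSketch {
    // rec[0] = event timestamp, rec[1] = payload
    static List<long[]> dropLate(List<long[]> timestampedRecords, long watermark) {
        List<long[]> onTime = new ArrayList<>();
        for (long[] rec : timestampedRecords) {
            if (rec[0] >= watermark) {
                onTime.add(rec);  // late records could instead go to a side sink
            }
        }
        return onTime;
    }
}
```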

Re: Sink - Cassandra

2017-08-28 Thread nragon
Nick, Can you send some of your examples using phoenix? Thanks -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Sink-Cassandra-tp4107p15197.html Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.

Re: Null Pointer Exception on Trying to read a message from Kafka

2017-08-28 Thread Sridhar Chellappa
DataStream MyKafkaMessageDataStream = env.addSource( getStreamSource(env, parameterTool) ); public RichParallelSourceFunction getStreamSource(StreamExecutionEnvironment env, ParameterTool parameterTool) { // MyKafkaMessage is a ProtoBuf message env.g

Re: Null Pointer Exception on Trying to read a message from Kafka

2017-08-28 Thread Ted Yu
Which Flink version are you using (so that line numbers can be matched with source code) ? On Mon, Aug 28, 2017 at 9:16 AM, Sridhar Chellappa wrote: > DataStream MyKafkaMessageDataStream = env.addSource( > getStreamSource(env, parameterTool); > ); > > > > public R

metrics for Flink sinks

2017-08-28 Thread Martin Eden
Hi all, Just 3 quick questions, all related to Flink metrics, especially around sinks: 1. In the Flink UI, sources always show 0 input records/bytes and sinks always show 0 output records/bytes. Why is it like that? 2. What is the best practice for instrumenting off-the-shelf Flink sinks? Cu

Flink Yarn Session failures

2017-08-28 Thread Chan, Regina
Hi, I was trying to understand why it takes about 9 minutes between the last attempt to start a container and when it finally gets the SIGTERM to kill the YarnApplicationMasterRunner. Client: Calc Engine: 2017-08-28 12:39:23,596 INFO org.apache.flink.yarn.YarnClusterClient

Re: Issues in recovering state from last crash using custom sink

2017-08-28 Thread vipul singh
Hi Aljoscha, Yes. I am running the application till a few checkpoints are complete. I am stopping the application between two checkpoints, so there will be messages in the list state, which should be checkpointed when *snapshot* is called. I am able to see a checkpoint file on S3 (I am saving the

Re: Even out the number of generated windows

2017-08-28 Thread Bowen Li
That's exactly what I found yesterday! Thank you Aljoscha for confirming it! On Mon, Aug 28, 2017 at 2:57 AM, Aljoscha Krettek wrote: > Hi Bowen, > > There is not built-in TTL but you can use a ProcessFunction to set a timer > that clears state. > > ProcessFunction docs: https://ci.apache.org/pr

Default chaining & uid

2017-08-28 Thread Emily McMahon
Does setting uid affect the default chaining (i.e. if I have two maps in a row and set uid on both)? This makes me think there's no effect: All operators that are part of a chain should be assigned an ID as > described in the

Union limit

2017-08-28 Thread boci
Hi guys! I have one input (from Mongo) and I split the incoming data into multiple datasets (each created dynamically from configuration), and before I write back the result I want to merge them into one dataset (there is some common transformation). So the flow: DataSet from Mongod => Create Mappers dy
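
Merging a dynamic number of datasets back into one is typically a pairwise fold over union (with Flink's DataSet API, e.g. datasets.stream().reduce(DataSet::union)). The sketch below shows the same folding pattern on plain lists so it runs stand-alone; it is an illustration, not the Flink API.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: fold an arbitrary number of datasets into one by repeated union.
class UnionFoldSketch {
    static <T> List<T> unionAll(List<List<T>> datasets) {
        List<T> merged = new ArrayList<>();
        for (List<T> ds : datasets) {
            merged.addAll(ds); // stand-in for DataSet#union
        }
        return merged;
    }
}
```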

Re: Flink Elastic Sink AWS ES

2017-08-28 Thread ant burton
Hey Arpit, > _cat/nodes?v&h=ip,port returns the following. Note that I have not added the x’s; they were returned in the response: ip port x.x.x.x 9300 Thank you for your help, Anthony > On 28 Aug 2017, at 10:34, arpit srivastava wrote: > > Hi Ant, > > Can you try this. > > curl -XG

Example build error

2017-08-28 Thread Jakes John
When I am trying to build and run the streaming wordcount example (the example in the Flink GitHub), I am getting the following error: StreamingWordCount.java:[56,59] incompatible types: org.apache.flink.api.java.operators.DataSource cannot be converted to org.apache.flink.streaming.api.datastream.DataStr

Re: Example build error

2017-08-28 Thread Ted Yu
Looking at: https://github.com/kl0u/flink-examples/blob/master/src/main/java/com/dataartisans/flinksolo/simple/StreamingWordCount.java there is no line 56. Which repo do you get StreamingWordCount from ? On Mon, Aug 28, 2017 at 3:58 PM, Jakes John wrote: > When I am trying to build and run str

Re: Null Pointer Exception on Trying to read a message from Kafka

2017-08-28 Thread Sridhar Chellappa
1.3.0 On Mon, Aug 28, 2017 at 10:09 PM, Ted Yu wrote: > Which Flink version are you using (so that line numbers can be matched > with source code) ? > > On Mon, Aug 28, 2017 at 9:16 AM, Sridhar Chellappa > wrote: > >> DataStream MyKafkaMessageDataStream = env.addSource( >> getSt

Re: Null Pointer Exception on Trying to read a message from Kafka

2017-08-28 Thread Sridhar Chellappa
Kafka Version is 0.10.0 On Tue, Aug 29, 2017 at 6:43 AM, Sridhar Chellappa wrote: > 1.3.0 > > On Mon, Aug 28, 2017 at 10:09 PM, Ted Yu wrote: > >> Which Flink version are you using (so that line numbers can be matched >> with source code) ? >> >> On Mon, Aug 28, 2017 at 9:16 AM, Sridhar Chellap

Re: Question about watermark and window

2017-08-28 Thread Tony Wei
Hi Aljoscha, It is very helpful to understand the behavior in such a scenario. Thank you very much! Best Regards, Tony Wei 2017-08-28 20:00 GMT+08:00 Aljoscha Krettek : > Hi Tony, > > I think your analyses are correct. Especially, yes, if you re-read the > data the (ts=3) data should sti

Re: Null Pointer Exception on Trying to read a message from Kafka

2017-08-28 Thread Ted Yu
The NPE came from this line: StreamRecord copy = castRecord.copy(serializer.copy(castRecord.getValue())); Either serializer or castRecord was null. I wonder if this has been fixed in the 1.3.2 release. On Mon, Aug 28, 2017 at 7:24 PM, Sridhar Chellappa wrote: > Kafka Version is 0.10.0 > >

Re: Null Pointer Exception on Trying to read a message from Kafka

2017-08-28 Thread Sridhar Chellappa
OK. I got past the problem. Basically, I had to change public class MyKafkaMessageSerDeSchema implements DeserializationSchema, SerializationSchema { @Override public MyKafkaMessage deserialize(byte[] message) throws IOException { MyKafkaMessage MyKafkaMessage = null; try
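
The fix pattern in this thread can be sketched stand-alone: a deserialization schema whose deserialize() returns null on bad input can surface much later as a NullPointerException deep inside the runtime, so failing fast at the source makes the real cause visible. The class below is a hypothetical stand-in, not the Flink DeserializationSchema interface (whose deserialize() throws IOException).

```java
import java.nio.charset.StandardCharsets;

// Sketch: fail fast on undeserializable records instead of returning null.
class SafeDeserializeSketch {
    static String deserialize(byte[] message) {
        if (message == null || message.length == 0) {
            // surfacing the error here beats a later NullPointerException downstream
            throw new IllegalArgumentException("empty Kafka record; cannot deserialize");
        }
        return new String(message, StandardCharsets.UTF_8);
    }
}
```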

Re: Flink Elastic Sink AWS ES

2017-08-28 Thread arpit srivastava
It seems the AWS ES setup is hiding the nodes' IPs. In that case, I think you can try @vinay patil's solution. Thanks, Arpit On Tue, Aug 29, 2017 at 3:56 AM, ant burton wrote: > Hey Arpit, > > _cat/nodes?v&h=ip,port > > > returns the following which I have not added the x’s they were returned on > the resp

Consuming a Kafka topic with multiple partitions from Flink

2017-08-28 Thread Isuru Suriarachchi
Hi all, I'm trying to implement a Flink consumer which consumes a Kafka topic with 3 partitions. I've set the parallelism of the execution environment to 3 as I want to make sure that each Kafka partition is consumed by a separate parallel task in Flink. My first question is whether it's always gu
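
A simplified illustration of the question above: with source parallelism equal to the partition count, each parallel subtask can end up with exactly one partition. The modulo mapping below is only illustrative; the actual assignment logic inside the Flink Kafka connector differs in detail.

```java
// Illustrative round-robin mapping of Kafka partitions onto source subtasks.
// Not the connector's real assignment code.
class PartitionAssignmentSketch {
    static int subtaskFor(int partition, int parallelism) {
        return partition % parallelism;
    }
}
```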

"Unable to find registrar for hdfs" on Flink cluster

2017-08-28 Thread P. Ramanjaneya Reddy
Hi All, I built the jar file from the Beam quickstart. While running the jar on a Flink cluster I got the error below. Has anybody seen this error? Could you please help me resolve it? root1@master:~/NAI/Tools/flink-1.3.0$ *bin/flink run -c org.apache.beam.examples.WordCount /home/root1/NAI/Tools/word-count-beam