Re: How can I merge more than one Flink stream

2017-07-19 Thread Kien Truong
Hi, To expand on Fabian's answer, there are a few APIs for joining. * connect - you have to provide a CoProcessFunction. * window join/cogroup - you provide key selector functions, a time window and a join/cogroup function. With the first method, you have to write more code, in exchange for much
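A minimal sketch of the connect route Kien describes; the Click and Install event types, fields, and output format are hypothetical placeholders, not from the original thread:

```java
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
import org.apache.flink.util.Collector;

public class ConnectExample {
    // Hypothetical event types for the two input streams
    public static class Click { public String appId; }
    public static class Install { public String appId; }

    public static DataStream<String> merge(DataStream<Click> clicks,
                                           DataStream<Install> installs) {
        return clicks
            .connect(installs)
            // key both inputs the same way so state is shared per key
            .keyBy(
                new KeySelector<Click, String>() {
                    @Override public String getKey(Click c) { return c.appId; }
                },
                new KeySelector<Install, String>() {
                    @Override public String getKey(Install i) { return i.appId; }
                })
            .process(new CoProcessFunction<Click, Install, String>() {
                @Override
                public void processElement1(Click c, Context ctx, Collector<String> out) {
                    out.collect("click:" + c.appId);
                }
                @Override
                public void processElement2(Install i, Context ctx, Collector<String> out) {
                    out.collect("install:" + i.appId);
                }
            });
    }
}
```

The CoProcessFunction gets a callback per input stream and can keep keyed state across both, which is the extra flexibility you pay for with more code.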

Re: FlinkKafkaConsumer010 - Memory Issue

2017-07-19 Thread Tzu-Li (Gordon) Tai
Hi Pedro, Seems like a memory leak. The only issue I’m currently aware of that may be related is [1]. Could you tell if this JIRA relates to what you are bumping into? The JIRA mentions Kafka 0.9, but a fix is only available for Kafka 0.10 once we bump our Kafka 0.10 dependency to the latest

Re: Flink rolling upgrade support

2017-07-19 Thread Moiz Jinia
No. This is the thread that answers my question - http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Does-FlinkKafkaConsumer010-care-about-consumer-group-td14323.html Moiz — sent from phone On 19-Jul-2017, at 10:04 PM, Ted Yu wrote: This was the other

Re: Re: Problem with Flink restoring from checkpoints

2017-07-19 Thread 周思华
Hi Fran, does the DateTimeBucketer act like a memory buffer that isn't managed by Flink's state? If so, then I think the problem is not about Kafka, but about the DateTimeBucketer. Flink won't take a snapshot of the DateTimeBucketer if it is not in any state. Best, Sihua Zhou At 2017-07-20

Re: FlinkKafkaConsumer010 - Memory Issue

2017-07-19 Thread Fabian Hueske
Hi, Gordon (in CC) knows the details of Flink's Kafka consumer. He might know how to solve this issue. Best, Fabian 2017-07-19 20:23 GMT+02:00 PedroMrChaves : > Hello, > > Whenever I submit a job to Flink that retrieves data from Kafka the memory > consumption

Re: Flink ML with DataStream

2017-07-19 Thread Fabian Hueske
Hi, unfortunately, it is not possible to convert a DataStream into a DataSet. Flink's DataSet and DataStream APIs are distinct APIs that cannot be used together. The FlinkML library is only available for the DataSet API. There is some ongoing work to add a machine learning library for streaming

Re: Problem with Flink restoring from checkpoints

2017-07-19 Thread Fabian Hueske
Hi Fran, did you observe actual data loss due to the problem you are describing or are you discussing a possible issue based on your observations? AFAIK, Flink's Kafka consumer keeps track of the offsets itself and includes these in the checkpoints. In case of a recovery, it does not rely on the

Re: How can I merge more than one Flink stream

2017-07-19 Thread Fabian Hueske
Hi, there are basically two operations to merge streams. 1. Union simply merges the input streams such that the resulting stream has the records of all input streams. Union is a built-in operator in the DataStream API. For that, all streams must have the same data type. 2. Join connects records
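Fabian's first option (union) in a minimal, self-contained sketch; the element values and job name are made up for illustration:

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class UnionExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<String> clicks = env.fromElements("click-1", "click-2");
        DataStream<String> downloads = env.fromElements("download-1");

        // union requires all inputs to have the same type; the result
        // simply contains the records of all input streams
        DataStream<String> all = clicks.union(downloads);
        all.print();

        env.execute("union example");
    }
}
```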

FlinkKafkaConsumer010 - Memory Issue

2017-07-19 Thread PedroMrChaves
Hello, Whenever I submit a job to Flink that retrieves data from Kafka the memory consumption continuously increases. I've changed the max heap memory from 2GB to 4GB, and even to 6GB, but the memory consumption keeps reaching the limit. An example of a simple job that shows this behavior is

Re: Flink monitor rest API question

2017-07-19 Thread Will Du
Thanks. Seems this has been fixed before in FLINK-5109 Sent from my iPhone > On Jul 19, 2017, at 07:47, Chesnay Schepler wrote: > > Hello, > > Looks like you stumbled upon a bug in our REST API and use a client that is > stricter than others. > > I will create a JIRA

Flink ML with DataStream

2017-07-19 Thread Branham, Jeremy [IT]
Hello - I've been successful working with Flink in Java, but have some trouble trying to leverage the ML library, specifically with KNN. From my understanding, this is easier in Scala [1], so I've been converting my code. One issue I've encountered is - How do I get a DataSet[Vector] from a

Re: Flink rolling upgrade support

2017-07-19 Thread Ted Yu
This was the other thread, right ? http://search-hadoop.com/m/Flink/VkLeQ0dXIf1SkHpY?subj=Re+Does+job+restart+resume+from+last+known+internal+checkpoint+ On Wed, Jul 19, 2017 at 9:02 AM, Moiz Jinia wrote: > Yup! Thanks. > > Moiz > > — > sent from phone > > On 19-Jul-2017,

Re: Flink rolling upgrade support

2017-07-19 Thread Moiz Jinia
Yup! Thanks. Moiz — sent from phone On 19-Jul-2017, at 9:21 PM, Aljoscha Krettek [via Apache Flink User Mailing List archive.] wrote: This was now answered in your other Thread, right? Best, Aljoscha >> On 18. Jul 2017, at 11:37, Moiz Jinia <[hidden

Re: Flink rolling upgrade support

2017-07-19 Thread Aljoscha Krettek
This was now answered in your other Thread, right? Best, Aljoscha > On 18. Jul 2017, at 11:37, Moiz Jinia wrote: > > Aljoscha Krettek wrote >> Hi, >> zero-downtime updates are currently not supported. What is supported in >> Flink right now is a savepoint-shutdown-restore

Re: Custom Kryo serializer

2017-07-19 Thread Chesnay Schepler
Raw state can only be used when implementing an operator, not a function. For functions you have to use Managed Operator State. Your function will have to implement the CheckpointedFunction interface, and create a ValueStateDescriptor that you register in initializeState. On 19.07.2017
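A minimal sketch of the pattern Chesnay describes: a keyed function implementing CheckpointedFunction and registering a ValueStateDescriptor in initializeState(). The input/output types, state name, and logic are illustrative, not from the original code:

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
import org.apache.flink.util.Collector;

public class DataProcessorKeyed
        extends CoProcessFunction<String, String, Double>
        implements CheckpointedFunction {

    private transient ValueState<Double> modelState;

    @Override
    public void initializeState(FunctionInitializationContext context) {
        // register the state descriptor here, as suggested in the reply
        ValueStateDescriptor<Double> descriptor =
            new ValueStateDescriptor<>("model", Double.class);
        modelState = context.getKeyedStateStore().getState(descriptor);
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) {
        // managed keyed state is snapshotted automatically; nothing to do
    }

    @Override
    public void processElement1(String record, Context ctx, Collector<Double> out) throws Exception {
        Double model = modelState.value();
        if (model != null) {
            out.collect(model); // score the record with the current model
        }
    }

    @Override
    public void processElement2(String model, Context ctx, Collector<Double> out) throws Exception {
        modelState.update(Double.parseDouble(model)); // update the model
    }
}
```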

Re: Custom Kryo serializer

2017-07-19 Thread Boris Lublinsky
Thanks for the reply, but I am not using it for managed state, but rather for the raw state. In my implementation I have the following: class DataProcessorKeyed extends CoProcessFunction[WineRecord, ModelToServe, Double] { // The managed keyed state see

Re: Flink monitor rest API question

2017-07-19 Thread Chesnay Schepler
Hello, Looks like you stumbled upon a bug in our REST API and use a client that is stricter than others. I will create a JIRA for this. Regards, Chesnay On 19.07.2017 13:31, Will Du wrote: Hi folks, I am using a java rest client - unirest lib to GET from flink rest API to get a Job

Flink monitor rest API question

2017-07-19 Thread Will Du
Hi folks, I am using a Java REST client - the unirest lib - to GET a job status from the Flink REST API. I got an exception: unsupported content encoding - UTF8. Do you know how to resolve it? The Postman client works fine. Thanks, Will

Re: Custom Kryo serializer

2017-07-19 Thread Chesnay Schepler
Hello, I assume you're passing the class of your serializer in a StateDescriptor constructor. If so, you could add a breakpoint in StateDescriptor#initializeSerializerUnlessSet, and check what typeInfo is created and which serializer is created as a result. One thing you could try right
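For context, one common way to make Flink use a custom Kryo serializer is to register it on the ExecutionConfig. A hedged sketch; MyType and MyTypeSerializer are hypothetical placeholders:

```java
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.Serializer;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KryoRegistration {
    // Hypothetical user type handled by the custom serializer
    public static class MyType { public String value; }

    public static class MyTypeSerializer extends Serializer<MyType> {
        @Override
        public void write(Kryo kryo, Output output, MyType obj) {
            output.writeString(obj.value);
        }
        @Override
        public MyType read(Kryo kryo, Input input, Class<MyType> type) {
            MyType t = new MyType();
            t.value = input.readString();
            return t;
        }
    }

    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // tell Flink to serialize MyType with the custom Kryo serializer
        env.getConfig().registerTypeWithKryoSerializer(MyType.class, MyTypeSerializer.class);
    }
}
```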

Re: Executing Flink server From IntelliJ

2017-07-19 Thread Chesnay Schepler
Hello, this problem is described in https://issues.apache.org/jira/browse/FLINK-6689. Basically, if you want to use the LocalFlinkMiniCluster you should use a TestStreamEnvironment instead. The RemoteStreamEnvironment only works with a proper Flink cluster. Regards, Chesnay On 14.07.2017

Re: AVRO Union type support in Flink

2017-07-19 Thread Timo Walther
Hi Vishnu, I took a look into the code. Actually, we should support it. However, those types might be mapped to Java Objects that will be serialized with our generic Kryo serializer. Have you tested it? Regards, Timo Am 19.07.17 um 06:30 schrieb Martin Eden: Hey Vishnu, For those of us

Re: Does FlinkKafkaConsumer010 care about consumer group?

2017-07-19 Thread Moiz S Jinia
Great, thanks, that was very helpful. One last question - > If your job code hasn’t changed across the restores, then it should be > fine even if you didn’t set the UID. What kind of code change? What if the operator pipeline is still the same but there's some business logic change? On Wed,

Re: Does FlinkKafkaConsumer010 care about consumer group?

2017-07-19 Thread Tzu-Li (Gordon) Tai
Does this mean I can use the same consumer group G1 for the newer version A'? And in spite of the same consumer group, A' will receive messages from all partitions when it's started from the savepoint? Yes. That’s true. Flink internally uses static partition assignment, and the clients are assigned

Re: Does FlinkKafkaConsumer010 care about consumer group?

2017-07-19 Thread Moiz S Jinia
Does this mean I can use the same consumer group G1 for the newer version A'? And in spite of the same consumer group, A' will receive messages from all partitions when it's started from the savepoint? I am using Flink 1.2.1. Does the above plan require setting uid on the Kafka source in the job? Thanks,
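For reference, assigning an explicit uid to the Kafka source looks like this (a sketch; the topic name, properties, and uid string are placeholders):

```java
import java.util.Properties;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class UidExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "G1");

        DataStream<String> stream = env
            .addSource(new FlinkKafkaConsumer010<>("events", new SimpleStringSchema(), props))
            // a stable uid lets savepoint state be matched to this operator
            // even if the job graph changes between versions
            .uid("kafka-source");

        stream.print();
        env.execute("uid example");
    }
}
```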

Re: Does FlinkKafkaConsumer010 care about consumer group?

2017-07-19 Thread Tzu-Li (Gordon) Tai
Hi! The only occasions in which the consumer group is used are: 1. When committing offsets back to Kafka. Since Flink 1.3, this can be disabled completely (both when checkpointing is enabled or disabled). See [1] on details about that. 2. When starting fresh (not starting from some savepoint), if
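The offset-committing switch Gordon mentions can be set on the consumer itself. A sketch; topic name and connection properties are placeholders:

```java
import java.util.Properties;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class OffsetCommitConfig {
    public static FlinkKafkaConsumer010<String> build() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "G1");

        FlinkKafkaConsumer010<String> consumer =
            new FlinkKafkaConsumer010<>("events", new SimpleStringSchema(), props);

        // Flink 1.3+: stop committing offsets back to Kafka on checkpoints.
        // Flink tracks offsets in its own checkpoints, so exactly-once
        // guarantees do not depend on the offsets committed to Kafka.
        consumer.setCommitOffsetsOnCheckpoints(false);
        return consumer;
    }
}
```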

Does FlinkKafkaConsumer010 care about consumer group?

2017-07-19 Thread Moiz Jinia
Below is a plan for downtime-free upgrade of a Flink job. The downstream consumer of the Flink job is duplicate proof. Scenario 1 - 1. Start Flink job A with consumer group G1 (12 slot job) 2. While job A is running, take a savepoint AS. 3. Start newer version of Flink job A' from savepoint AS

How can I merge more than one Flink stream

2017-07-19 Thread Jone Zhang
I have three data streams 1. app exposed and click 2. app download 3. app install How can I merge the streams to create a unified stream, then compute it over time-based windows? Thanks
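The question's scenario can be sketched by mapping each stream to a common record shape, unioning them, and aggregating per key and window. The (appId, count) encoding, element values, and window size are illustrative assumptions:

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class MergeThreeStreams {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // In a real job these would come from three sources, each mapped
        // to the common (appId, count) shape
        DataStream<Tuple2<String, Integer>> clicks =
            env.fromElements(Tuple2.of("app-1", 1), Tuple2.of("app-2", 1));
        DataStream<Tuple2<String, Integer>> downloads =
            env.fromElements(Tuple2.of("app-1", 1));
        DataStream<Tuple2<String, Integer>> installs =
            env.fromElements(Tuple2.of("app-2", 1));

        clicks.union(downloads, installs)   // one unified stream
              .keyBy(0)                     // key by appId
              .timeWindow(Time.minutes(5))  // 5-minute tumbling windows
              .sum(1)                       // count events per app and window
              .print();

        env.execute("merge three streams");
    }
}
```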