Re: How can i merge more than one flink stream

2017-07-24 Thread Jörn Franke
What do you mean by "consistent"? Of course you can do this only at the time the timpstamp is defined (e.g. Using NTP). However, this is never perfect . Then it is unrealistic that they always end up in the same window because of network delays etc. you will need here a global state that is defi

Re: How can i merge more than one flink stream

2017-07-24 Thread Jone Zhang
Thanks for your reply. I have another question: In my situation, each of the three streams contains a local timestamp segment. How can I ensure that their timestamps are consistent in each time window before the merging operation? And how to ensure the arrival of all the streams with consistent tim

Re: FlinkKafkaConsumer subscribes to partitions in restoredState only.

2017-07-24 Thread Tzu-Li (Gordon) Tai
One other note: the functionality is actually already merged to the master branch. You can also take a look at the feature documentation here [1]. [1]  https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/connectors/kafka.html#kafka-consumers-partition-discovery On 25 July 2017 at 1:22

Re: FlinkKafkaConsumer subscribes to partitions in restoredState only.

2017-07-24 Thread Tzu-Li (Gordon) Tai
Hi, Sorry for not replying to this earlier, it seems like this thread hadn’t been noticed earlier. What you are experiencing is expected behavior. In Flink 1.3, new partitions will not be picked up, only partitions that are in checkpoints state will be subscribed to on restore runs. One main r

Re: a lot of connections in state "CLOSE_WAIT"

2017-07-24 Thread XiangWei Huang
hi, Sorry for replying so late. I have met this issue again and the list is constantly keep growing even if i close the page ,until the website is been unavailable. This issue appeared each time i add metrics for a job from web ui. by the way ,the version of Flink is 1.3.1 Regards, Xia

Re: Split Streams not working

2017-07-24 Thread Kien Truong
Hi, I meant adding a select function between the two consecutive select. Or if you use Flink 1.3, you can use the new side output functionality. Regards, Kien On 7/25/2017 7:54 AM, Kien Truong wrote: Hi, I think you're hitting this bug https://issues.apache.org/jira/browse/FLINK-5031 Try

Re: Split Streams not working

2017-07-24 Thread Kien Truong
Hi, I think you're hitting this bug https://issues.apache.org/jira/browse/FLINK-5031 Try the workaround mentioned in a bug: add a map function between map and select Regards, Kien On 7/25/2017 3:14 AM, smandrell wrote: Basically, we are not splitting the streams correctly because when we t

Connect more than two streams

2017-07-24 Thread Govindarajan Srinivasaraghavan
Hi, I have two streams reading from kafka, one for data and other for control. The data stream is split by type and there are around six types. Each type has its own processing logic and finally everything has to be merged to get the collective state per device. I was thinking I could connect mul

Split Streams not working

2017-07-24 Thread smandrell
Basically, we are not splitting the streams correctly because when we try to select the stream we want from our splitStream (using the select() operation), it never returns a DataStream with just ERROR_EVENT's or a DataStream with just SUCCESS_EVENT's. Instead it returns a DataStream with both ERRO

Re: Type erasure exception

2017-07-24 Thread Ziyad Muhammed
Hi Gabriele Type extraction of java 8 lambdas is not yet supported in IntelliJ Idea IDE. You may solve the issues by following one of the below options. 1. Provide type hints https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/types_serialization.html#type-hints-in-the-java-api 2. Se

Re: Count Different Codes in a Window

2017-07-24 Thread Fabian Hueske
Hi Raj, You can use ReduceFunction in combination with a WindowFunction [1]. Best, Fabian [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/windows.html#windowfunction-with-incremental-aggregation 2017-07-24 20:31 GMT+02:00 Raj Kumar : > Thanks Fabian. That helped. > > But I

Re: Gelly PageRank implementations in 1.2 to 1.3

2017-07-24 Thread Kaepke, Marc
Thanks for your explanation. The vertex-centric, sg and gsa PageRank need a Double as vertex value. A VertexDegree function generate a vertex with a LongValue as value. Maybe I can iterate over the graph and remove all edges with a degree of zero?! Am 24.07.2017 um 16:36 schrieb Greg Hogan mail

Re: Count Different Codes in a Window

2017-07-24 Thread Raj Kumar
Thanks Fabian. That helped. But I want to access the window start time. AFAIK, reduce can not give this details as it doesn't have timewindow object passed to the reduce method. How can I achieve this ? -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.n

Type erasure exception

2017-07-24 Thread Gabriele Di Bernardo
Hi guys, When I run my Flink topology (locally) I get this error: The return type of function 'main(Job.java:69)' could not be determined automatically, due to type erasure. You can give type information hints by using the returns(...) method on the result of the transformation call, or by let

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-24 Thread Stephan Ewen
Hi Prashant! Flink's S3 integration currently goes through Hadoop's S3 file system (as you probably noticed). It seems that the Hadoop's S3 file system is not really well suited for what we want to do, and we are looking to drop it and replace it by something direct (independent of Hadoop) in the

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-24 Thread Stephan Ewen
Hi Prashant! I assume you are using Flink 1.3.0 or 1.3.1? Here are some things you can do: - I would try and disable the incremental checkpointing for a start and see what happens then. That should reduce the number of files already. - Is it possible for you to run a patched version of Flin

Re: FlinkKafkaConsumer subscribes to partitions in restoredState only.

2017-07-24 Thread ninad
Any update on this guys? -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/FlinkKafkaConsumer-subscribes-to-partitions-in-restoredState-only-tp14233p14410.html Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabbl

Re: Is that possible for flink to dynamically read and change configuration?

2017-07-24 Thread Chesnay Schepler
So I don't know why it doesn't work (it should, afaik), but as a workaround you could maintain an ArrayList or similar in your function, and only add/read elements from the ListState in snapshot/initialize state. On 24.07.2017 17:10, ZalaCheung wrote: Hi all, Does anyone have idea about the n

Re: Is that possible for flink to dynamically read and change configuration?

2017-07-24 Thread ZalaCheung
Hi all, Does anyone have idea about the non-keyed managed state problem below? I think all the function in the testFunc class should share the ListState “metrics”. But after I add element to ListState at flatMap2 function, I cannot retrieve the element added to ListState. Desheng Zhang > On

Re: Custom Kryo serializer

2017-07-24 Thread Chesnay Schepler
The docs provide a somewhat good overview on how to interact with managed state: https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/state.html#using-managed-operator-state To use the custom serializer you can supply the type class in the StateDescriptor constructor when ini

Re: Gelly PageRank implementations in 1.2 to 1.3

2017-07-24 Thread Greg Hogan
The current algorithm is unweighted though we should definitely look to add a weighted variant and consider PersonalizedPageRank as well. Looking at your results, PageRank scores should sum to 1.0, should be positive unless the damping factor is 1.0, and use of the convergence threshold will gu

Re: Custom Kryo serializer

2017-07-24 Thread Boris Lublinsky
Thanks Chesney, Can you, please, point me to any example? Boris Lublinsky FDP Architect boris.lublin...@lightbend.com https://www.lightbend.com/ > On Jul 24, 2017, at 9:27 AM, Chesnay Schepler wrote: > > Copy of a mail i sent to the user mailing list only: > > Raw state can only be used when

Re: Flink Beam runner

2017-07-24 Thread Stephan Ewen
Hi Boris! Flink deploys Scala 2.11 artifacts already. This is about the beam runner, so this should probably go to the beam dev mailing list... Cheers, Stephan On Mon, Jul 24, 2017 at 4:16 PM, Boris Lublinsky < boris.lublin...@lightbend.com> wrote: > Current runner for Beam 2.0.0 is still on S

Re: Custom Kryo serializer

2017-07-24 Thread Chesnay Schepler
Copy of a mail i sent to the user mailing list only: Raw state can only be used when implementing an operator, not a function. For functions you have to use Managed Operator State. Your function will have to implement the CheckpointedFunction interface, and create a ValueStateDescriptor that y

Custom Kryo serializer

2017-07-24 Thread Boris Lublinsky
Is there a chance, this can be answered? Boris Lublinsky FDP Architect boris.lublin...@lightbend.com https://www.lightbend.com/ > Begin forwarded message: > > From: Boris Lublinsky > Subject: Re: Custom Kryo serializer > Date: July 19, 2017 at 8:28:16 AM CDT > To: user@flink.apache.org, ches...

Flink Beam runner

2017-07-24 Thread Boris Lublinsky
Current runner for Beam 2.0.0 is still on Scala version 2.10. Are there any plans (and dates) to provide runner for Scala 2.11 Boris Lublinsky FDP Architect boris.lublin...@lightbend.com https://www.lightbend.com/

Re: Is that possible for flink to dynamically read and change configuration?

2017-07-24 Thread ZalaCheung
Hi Chesnay, Thank you very much. Now I tried to ignore the default value of ListState and Try to use the CoFlatmap function with managed state. But what surprised me is that it seems the state was not shared by two streams. My test code is shown below. DataStream result = stream .conne

Re: Is that possible for flink to dynamically read and change configuration?

2017-07-24 Thread Chesnay Schepler
Hello, That's an error in the documentation, only the ValueStateDescriptor has a defaultValue constructor argument. Regards, Chesnay On 24.07.2017 14:56, ZalaCheung wrote: Hi Martin, Thanks for your advice. That’s really helpful. I am using the push scenario. I am now having some trouble b

Re: Is that possible for flink to dynamically read and change configuration?

2017-07-24 Thread ZalaCheung
Hi Martin, Thanks for your advice. That’s really helpful. I am using the push scenario. I am now having some trouble because of the state I want to maintain. For me, the simplest way is to maintain to ValueState in a CoFlatMapFunction(Actually RichCoFlatMapFunction). But the rich function can o

Re: Is that possible for flink to dynamically read and change configuration?

2017-07-24 Thread Martin Eden
Hey Desheng, Some options that come to mind: - Cave man style: Stop and restart job with new config. - Poll scenario: You could build your own thread that periodically loads from the db into a per worker accessible cache. - Push scenario: have a config stream (based off of some queue) which you co

Is that possible for flink to dynamically read and change configuration?

2017-07-24 Thread ZalaCheung
Hi all, I am now trying to implement a anomaly detection algorithm on Flink, which is actually implement a Map operator to do anomaly detection based on timeseries. At first I want to read configuration(like which kafka source host to read datastream from and which sink address to write data t

Re: notNext() and next(negation) not yielding same output in Flink CEP

2017-07-24 Thread Yassine MARZOUGUI
Hi Dawid, Thanks a lot for the explanation, it's all clear now. Best, Yassine 2017-07-23 13:11 GMT+02:00 Dawid Wysakowicz : > Hi Yassine, > > First of all notNext(A) is not equal to next(not A). notNext should be > considered as a “stopCondition” which tells if an event matching the A > conditi

Re: Count Different Codes in a Window

2017-07-24 Thread Fabian Hueske
Hi Raj, I would recommend to use a ReduceFunction instead of a WindowFunction. The benefit of ReduceFunction is that it can be eagerly computed whenever an element is put into the window such that the state of the window is only one element. In contrast, the WindowFunction collects all elements of

Re: Find the running median from a data stream

2017-07-24 Thread Fabian Hueske
Hi Gabriele, I don't think you can compute the exact running median on a stream. This would require to collect all elements of the stream so you would basically need to put the complete stream into the ValueState. Even if the state is backed by RocksDB, the state for a specific key needs to fit on