Re: large sliding window perf question

2017-05-29 Thread Chen Qin
B.T.W It might be better off to pre aggregation via slidingWindow with controlled bucket size and batch update as well as retention. Thanks, Chen > On May 29, 2017, at 3:05 PM, Chen Qin wrote: > > I see, not sure this this hack works. It utilize operator state to hold all

Re: [BUG?] Cannot Load User Class on Local Environment

2017-05-29 Thread Matt
Thanks for looking into it Till! I'll try changing that line locally and then send a JIRA issue. When it gets officially fixed I'll probably create an Ignite-Flink connector to replace the older and less efficient one [1]. Users will be able to create Flink jobs on Ignite nodes, right where the

Re: large sliding window perf question

2017-05-29 Thread Chen Qin
I see, not sure this this hack works. It utilize operator state to hold all mapping assigned to that operator instance. If key by can generate determined mapping between upstream events to fixed operator parallelism, then the operator state could hold mapping between keys and

Re: Flink and swapping question

2017-05-29 Thread Flavio Pompermaier
Hi to all, I'm still trying to understand what's going on our production Flink cluster. The facts are: 1. The Flink cluster runs on 5 VMWare VMs managed by ESXi 2. On a specific job we have, without limiting the direct memory to 5g, the TM gets killed by the OS almost immediately because the

Re: VertexUpdateFunction

2017-05-29 Thread Martin Junghanns
Hi Ali :) You could compute the degrees beforehand (e.g. using the Graph.[in|out|get]degrees()) methods and use the resulting dataset as a new vertex dataset. You can now run your vertex-centric computation and access the degrees as vertex value. Cheers, Martin On 29.05.2017 09:28,

Re: How can I increase Flink managed memory?

2017-05-29 Thread Sathi Chowdhury
For got to mention I am running 10 slots in each machine and taskmanager.network.numberOfBuffers: 1 Is there a scope of improving memory usage? From: Sathi Chowdhury > Date: Monday, May 29, 2017 at 9:55 AM To:

Flink Docker Kubernetes Gitlab CI CDeployment

2017-05-29 Thread Nancy Estrada
Hi all, Has someone successfully run Flink jobs with this type of setup (Gitlab CI CD and Kubernetes)? Since Flink jobs can’t be dockerized and deployed in a natural way as part of the container (according to Flip-6) I am not very sure of how is the best way of doing this. We are thinking of

How can I increase Flink managed memory?

2017-05-29 Thread Sathi Chowdhury
Hello Flink Dev and Community I have 5 task managers each tie 64 GB of memory I am running flink on yarn with task manager heap taskmanager.heap.mb: 563200 Link still shows that it is using about 21 GB memory leaving 35 GB available..how and what can I do to fix it? Please suggest Thanks Sathi

Porting batch percentile computation to streaming window

2017-05-29 Thread William Saar
I am porting a calculation from Spark batches that uses broadcast variables to compute percentiles from metrics and curious for tips on doing this with Flink streaming. I have a windowed computation where I am compute metrics for IP-addresses (a windowed stream of metrics objects grouped by

Re: Flink and swapping question

2017-05-29 Thread Nico Kruber
FYI: taskmanager.sh sets this parameter but also states the following: # Long.MAX_VALUE in TB: This is an upper bound, much less direct memory will be used TM_MAX_OFFHEAP_SIZE="8388607T" Nico On Monday, 29 May 2017 15:19:47 CEST Aljoscha Krettek wrote: > Hi Flavio, > > Is this running on

Re: State in Custom Tumble Window Class

2017-05-29 Thread Aljoscha Krettek
Hi, If you use tumbling windows or sliding windows then Flink will not keep additional meta data besides the actual window contents. Also, if you use a Trigger that purges when firing Flink will clean the window contents after firing a window. This means that you can set allowed lateness to

Re: Flink and swapping question

2017-05-29 Thread Aljoscha Krettek
Hi Flavio, Is this running on YARN or bare metal? Did you manage to find out where this insanely large parameter is coming from? Best, Aljoscha > On 25. May 2017, at 19:36, Flavio Pompermaier wrote: > > Hi to all, > I think we found the root cause of all the problems.

Re: Does RichFilterFunction work on multiple thread?

2017-05-29 Thread Aljoscha Krettek
Hi, To expand a bit on that, each function is only invoked by one Thread concurrently but the function might be serialised and sent to several executors (TaskManagers) for parallel execution. For more information, check out [1] and [2]. Best, Aljoscha [1]

Re: large sliding window perf question

2017-05-29 Thread Aljoscha Krettek
Hi Chen, How to you update the ValueState during checkpointing. I’m asking because a keyed state should always be scoped to a key and when checkpointing there is no key scope because we are not processing any incoming element and we’re not firing a timer (the two cases where we have a key

VertexUpdateFunction

2017-05-29 Thread rostami
Hi, I want to write an iterative algorithm using Gelly (spargel), like: https://ci.apache.org/projects/flink/flink-docs-release-0.8/spargel_guide.html My question is how I can access the actual vertex information like the vertex degree (in- or outdegree) under the subclass of