Re: "Too many open files" in Job Manager

2016-11-14 Thread Ufuk Celebi
Hey Max! Thanks for reporting this issue. Can you give more details about how you are running your job? If you are doing checkpoints to HDFS, could you please report how many checkpoints you find in your configured directory? Is everything properly cleaned up there? – Ufuk On 12 November 2016

Re: Csv to windows?

2016-11-14 Thread Ufuk Celebi
I think this is independent of streaming. If you want to compute the aggregate over all keys and data you need to do this in a single task, e.g. use a (flat)map with parallelism 1, do the aggregation there and then broadcast to downstream operators. Does this make sense or am I overlooking somet

Re: Task managers cant start on YARN cluster

2016-11-14 Thread Ufuk Celebi
Good to know that you solved this. :) Do you think there is something we can do to help users noticing this situation faster? – Ufuk On 13 November 2016 at 00:23:21, Gyula Fóra (gyula.f...@gmail.com) wrote: > Hi, > > What happened is that I compiled Flink with the wrong hadoop version... > >

Re: WindowOperator - element's timestamp

2016-11-14 Thread Ufuk Celebi
Looping in Kostas and Aljoscha who should know what's the expected behaviour here ;) On 11 November 2016 at 16:17:23, Petr Novotnik (petr.novot...@firma.seznam.cz) wrote: > Hello, > > I'm struggling to understand the following behaviour of the > `WindowOperator` and would appreciate some insig

Re: Order by which windows are processed on event time

2016-11-14 Thread Ufuk Celebi
I think there are no ordering guarantees for this. @Aljoscha: is this correct? On 11 November 2016 at 19:57:43, Saiph Kappa (saiph.ka...@gmail.com) wrote: > Hi, > > I have a streaming application based on event time. When I issue a > watermark that will close more than 1 window (and trigger thei

Re: Programmatically abort checkpoint

2016-11-14 Thread Ufuk Celebi
Hey Lorenzo, internally Flink is able to abort checkpoints, but this is not possible from the user code. There is currently no way to be explicitly notified about an incoming barrier. You can check out this PR (https://github.com/apache/flink/pull/2629) and see whether it addresses your questi

Re: Window PURGE Behaviour 1.0.2 vs 1.1.3

2016-11-14 Thread Ufuk Celebi
On 11 November 2016 at 18:19:30, Aljoscha Krettek (aljos...@apache.org) wrote: > Hi, > I think the jar files in the lib folder are checked first so shipping the > WindowOperator with the job should not work. The WindowOperator is instantiated on the client side and shipped as user code to the clu

Re: Flink work with raw S3 (S3FileSystem or other), not a HDFS backed by S3 (S3AFileSystem, NativeS3FileSystem)?

2016-11-14 Thread Ufuk Celebi
The Flink docs show how to setup Flink's internal file system operations to use the S3FileSystem (the StackOverflow question actually shows that it is working, see answer there).\ This is configuration is independent of what you are doing in your user code. If you want to use your own S3 based

Re: Too few memory segments provided. Hash Table needs at least 33 memory segments.

2016-11-14 Thread Ufuk Celebi
What do the TaskManager logs say wrt to allocation of managed memory? Something like: Limiting managed memory to ... of the currently free heap space ..., memory will be allocated lazily. What else did you configure in flink-conf? Looping in Greg and Vasia who maintain Gelly and are most-famil

Re: Window PURGE Behaviour 1.0.2 vs 1.1.3

2016-11-14 Thread Aljoscha Krettek
Not sure, because there's going to be two WindwoOperator classes on the class path when deserialising this on the cluster. On Mon, 14 Nov 2016 at 10:07 Ufuk Celebi wrote: > On 11 November 2016 at 18:19:30, Aljoscha Krettek (aljos...@apache.org) > wrote: > > Hi, > > I think the jar files in the l

Re: Kafka Stream to Database batch inserts

2016-11-14 Thread Ufuk Celebi
You can specify a custom trigger that extends the default ProcessingTimeTrigger (if you are working with processing time) or EventTimeTrigger (if you are working with event time). You do it like this: stream.timeWindow(Time.of(1, SECONDS)).trigger(new MyTrigger()) Check out the Trigger impleme

Re: Task managers cant start on YARN cluster

2016-11-14 Thread Gyula Fóra
Hi, The main problem was that whatever was going wrong was not apparent in the Flink Application master runner but it was only shown in the YarnClient debug log. If you run with the default INFO log level all you see that the Yarn client is trying to fail over again and again as if something was

Flink job restart at checkpoint interval

2016-11-14 Thread Satish Chandra Gupta
Hi, I am using Value State, backed by FsStateBackend on hdfs, as following: env.setStateBackend(new FsStateBackend(stateBackendPath)) env.enableCheckpointing(checkpointInterval) It is non-iterative job running Flink/Yarn. The job restarts at checkpointInterval, I have tried interval varying fro

Re: Compile for Java 1.8

2016-11-14 Thread Ufuk Celebi
As far as I know, there were no concrete discussions about this yet. I would not expect it to happen this year, but maybe/probably in the course of next year. We would have to have a release where we explicitly say that this will be the last major release with 1.7 support, so users running old v

Re: Task managers cant start on YARN cluster

2016-11-14 Thread Ufuk Celebi
What was the log message shown on DEBUG level? Maybe it makes sense to promote it to INFO. ;) I guess there is no easy way to verify the version, right Max or Robert? On 14 November 2016 at 10:45:52, Gyula Fóra (gyula.f...@gmail.com) wrote: > Hi, > > The main problem was that whatever was goin

Re: Compile for Java 1.8

2016-11-14 Thread Andrey Melentyev
Ufuk, do you think it's still worth fixing the build errors and maybe adding a Travis configuration to build and test against 1.8? Or do you think that would introduce too much additional overheads? I mean without deprecating 1.7 support of course. Andrey On Mon, Nov 14, 2016 at 10:51 AM, Ufuk C

Re: Task managers cant start on YARN cluster

2016-11-14 Thread Gyula Fóra
What I mean is the logs coming from org.apache.hadoop.ipc.Client if you look at my original email (at JM logs) Gyula Ufuk Celebi ezt írta (időpont: 2016. nov. 14., H, 10:52): > What was the log message shown on DEBUG level? > > Maybe it makes sense to promote it to INFO. ;) > > I guess there is

Re: get trigger context from WindowFunction

2016-11-14 Thread Ufuk Celebi
I don't think that this is possible right now. There are a proposal and discussion to extend the window function meta data here:  https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending+Window+Function+Metadata http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FL

Re: Compile for Java 1.8

2016-11-14 Thread Ufuk Celebi
Sounds good to me if possible without breaking anything :-) On 14 November 2016 at 10:57:26, Andrey Melentyev (andrey.melent...@gmail.com) wrote: > Ufuk, > > do you think it's still worth fixing the build errors and maybe adding a > Travis configuration to build and test against 1.8? Or do you

Re: Many operations cause StackOverflowError with AWS EMR YARN cluster

2016-11-14 Thread Ufuk Celebi
The Python API is in alpha state currently, so we would have to check if it is related specifically to that. Looping in Chesnay who worked on that. The JVM GC error happens on the client side as that's where the optimizer runs. How much memory does the client submitting the job have? How do you

Re: Task managers cant start on YARN cluster

2016-11-14 Thread Ufuk Celebi
Ah, sorry. I thought it was something related to Flink. ;) On 14 November 2016 at 10:59:44, Gyula Fóra (gyula.f...@gmail.com) wrote: > What I mean is the logs coming from org.apache.hadoop.ipc.Client if you > look at my original email (at JM logs) > > Gyula > > Ufuk Celebi ezt írta (időpont: 2

Re: Flink job restart at checkpoint interval

2016-11-14 Thread Ufuk Celebi
There seems to be an Exception happening when Flink tries to serialize the state of your operator (see the stack trace). What are you trying to store via the ValueState? Maybe you can share a code excerpt? – Ufuk On 14 November 2016 at 10:51:06, Satish Chandra Gupta (scgupt...@gmail.com) wrot

Re: Programmatically abort checkpoint

2016-11-14 Thread Lorenzo Affetti
Thank you for the pointers and the clarification! One question, when you say: "There is currently no way to be explicitly notified about an incoming barrier”, isn’t `snapshotOperatorState` invoked when a barrier approaches the operator? Lorenzo Affetti --- MD in computer engineering PhD St

Re: Programmatically abort checkpoint

2016-11-14 Thread Ufuk Celebi
On 14 November 2016 at 11:30:13, Lorenzo Affetti (lorenzo.affe...@polimi.it) wrote: > Thank you for the pointers and the clarification! > > One question, when you say: "There is currently no way to be explicitly > notified about > an incoming barrier”, isn’t `snapshotOperatorState` invoked w

apply with fold- and window function

2016-11-14 Thread Stephan Epping
Hello, I wondered if there is a particular reason for the window function to have explicitly the same input/output type? public SingleOutputStreamOperator apply(R initialValue, FoldFunction foldFunction, WindowFunction function) for example (the following does not work): DataStream aggregate

Re: WindowOperator - element's timestamp

2016-11-14 Thread Aljoscha Krettek
Hi, I'm afraid the ContinuousEventTimeTrigger is a bit misleading and should probably be removed. The problem is that a watermark T signals that we won't see elements with a timestamp < T in the future. It does not signal that we haven't already seen elements with a timestamp > T. So this cannot be

Re: Maintaining watermarks per key, instead of per operator instance

2016-11-14 Thread Fabian Hueske
Hi Stephan, I'm skeptical about two things: - using processing time will result in inaccurately bounded aggregates (or do you want to group by event time in a processing time window?) - writing to and reading from Cassandra might be expensive (not sure what you mean by cheaper in the end) and it i

Re: Order by which windows are processed on event time

2016-11-14 Thread Aljoscha Krettek
Hi, event-time windows are being processed in the order of their end timestamp. If several windows have the same end timestamp then no ordering across those windows is guaranteed. Cheers, Aljoscha On Mon, 14 Nov 2016 at 10:01 Ufuk Celebi wrote: > I think there are no ordering guarantees for thi

Re: get trigger context from WindowFunction

2016-11-14 Thread Aljoscha Krettek
Yup, we're currently working on getting ProcessWindowFunction into master. Then we would work on getting additional information available, such as the current watermark or a firing reason. On Mon, 14 Nov 2016 at 11:07 Ufuk Celebi wrote: > I don't think that this is possible right now. > > There

Re: apply with fold- and window function

2016-11-14 Thread Aljoscha Krettek
Hi, this is a known bug: https://issues.apache.org/jira/browse/FLINK-3869. I'm still hoping that we can get a workaround in for Flink 1.2. See my last comment in the Jira Issue. Cheers, Aljoscha On Mon, 14 Nov 2016 at 14:49 Stephan Epping wrote: > Hello, > > I wondered if there is a particular

Re: Maintaining watermarks per key, instead of per operator instance

2016-11-14 Thread kaelumania
Hey Fabian, thank you very much. - yes, I would window by event time and fire/purge by processing time - Cheaper in the end meant, that having too much state in the flink cluster would be more expensive, as we store all data in cassandra too.I think the fault tolerance would be okay, as we wou

Re: Processing streams of events with unpredictable delays

2016-11-14 Thread Aljoscha Krettek
Hi, in additional to what you mentioned (having a very large allowed lateness) you can also try another approach: adding a custom operator in front of the window operation and splitting the stream by normal elements and very late elements. Then, in the stream of very late elements you have some cus

RE: Flink - Nifi Connectors - Class not found

2016-11-14 Thread PACE, JAMES
bin/flink run -c com.att.flink.poc.NifiTest jars/flinkpoc-0.0.1-SNAPSHOT.jar I have another entry point in this jar that uses readFileStream and that works fine. From: Robert Metzger [mailto:rmetz...@apache.org] Sent: Sunday, November 13, 2016 12:53 AM To: user@flink.apache.org Subject: Re: Flin

Re: Re: Lot of RocksDB files in tmp Directory

2016-11-14 Thread Aljoscha Krettek
Hi, could it be that the job has restarted due to failures a large number of times? Cheers, Aljoscha On Mon, 7 Nov 2016 at 09:21 Dominique Rondé wrote: > First of all, thanks for the explanation. That sounds reasonable. > > But I started the flink routes 3 days ago and went out for the weekend.

Proper way of adding external jars

2016-11-14 Thread Gyula Fóra
Hi, I have been trying to use the -C flag to add external jars with user code and I have observed some strange behaviour. What I am trying to do is the following: I have 2 jars, JarWithMain.jar contains the main class and UserJar.jar contains some classes that the main method will eventually exec

Re: Many operations cause StackOverflowError with AWS EMR YARN cluster

2016-11-14 Thread Geoffrey Mon
Hi Ufuk, The master instance of the cluster was also a m3.xlarge instance with 15 GB RAM, which I would've expected to be enough. I have gotten the program to run successfully on a personal virtual cluster where each node has 8 GB RAM and where the master node was also a worker node, so the proble

Re: Proper way of adding external jars

2016-11-14 Thread Scott Kidder
Hi Gyula, I've typically added external library dependencies to my own application JAR as shaded-dependencies. This ensures that all dependencies are included with my application while being distributed to Flink Job Manager & Task Manager instances. Another approach is to place these external JAR

Re: Maintaining watermarks per key, instead of per operator instance

2016-11-14 Thread Aljoscha Krettek
Hi Stephan, I was going to suggest that using a flatMap and tracking the timestamp of each key yourself is a bit like having a per-key watermark. I wanted to wait a bit before answering because I'm currently working on a new type of Function that will be release with Flink 1.2: ProcessFunction. Thi

Retrieving values from a dataset of datasets

2016-11-14 Thread otherwise777
Hey There, I'm trying to calculate the betweenness in a graph with Flink and Gelly, the way I tried this was by calculating the shortest path from every node to the rest of the nodes. This results in a Dataset of vertices which all have Datasets of their own with all the other vertices and their p

Re: WindowOperator - element's timestamp

2016-11-14 Thread Petr Novotnik
Aljoscha, thanks for your response. The use-case I'm after is basically providing "early" (inaccurate) results to downstream consumers. Suppose we're running aggregations for daily time windows, but we don't want to wait a whole day to see results. The idea is to fire the windows continuously

Re: apply with fold- and window function

2016-11-14 Thread Anchit Jatana
Hi Stephan, I faced the similar issue, the way implemented this(though a workaround) is by making the input to fold function i.e. the initial value to fold symmetric to what goes into the window function. I made the initial value to fold function a tuple with all non required/available index valu

Re: Flink job restart at checkpoint interval

2016-11-14 Thread Satish Chandra Gupta
Most of the ValueState I am using are of Long or Boolean, except one which is a map of Long to Scala case class: ValueState[Map[Long, AnScalaCaseClass]] Does this serialization happen only for the value state members of operators, or also other private fields? Thanks +satish On Mon, Nov 14, 201

Why use Kafka after all?

2016-11-14 Thread Dromit
Hello, As far as I've seen, there are a lot of projects using Flink and Kafka together, but I'm not seeing the point of that. Let me know what you think about this. 1. If I'm not wrong, Kafka provides basically two things: storage (records retention) and fault tolerance in case of failure, while