Re: Code related to spilling data to disk

2016-06-21 Thread Tae-Geon Um
I have another question. Is the spilling only executed on batch mode? What happen on streaming mode? > On Jun 22, 2016, at 1:48 PM, Tae-Geon Um wrote: > > Hi, all > > As far as I know, Flink spills data (states?) to disk if the data exceeds > memory threshold or there exists memory pressur

Code related to spilling data to disk

2016-06-21 Thread Tae-Geon Um
Hi, all As far as I know, Flink spills data (states?) to disk if the data exceeds memory threshold or there exists memory pressure. i’d like to know the detail of how Flink spills data to disk. Could you please let me know which codes do I have to investigate? Thanks, Taegeon

Re: Documentation for translation of Job graph to Execution graph

2016-06-21 Thread Bajaj, Abhinav
Thanks Robert for helpful reply. I have follow up on the Q2 - "In general, we recommend running one JobManager per job” I understand how this can be achieved while running in Yarn, I.e. by submitting single Flink Jobs. Is their some other way of setting Flink to configure single Jobmanager per

TaskManager information during runtime

2016-06-21 Thread omaralvarez
Hi, I have a program that creates one StreamingSource for each port in which I want to receive data. I'm trying to know the hostname of all the created sources, so I can send data to them from external processes. Is there any other way during runtime other than using the REST API for obtaining

Re: Lazy Evaluation

2016-06-21 Thread Aljoscha Krettek
Great to hear that it works now! :-) On Sun, 19 Jun 2016 at 16:33 Paschek, Robert wrote: > Hi Mailing List, > > after "upgrading" the flink version in my pom.xml to 1.0.3, i get two > error messages for these output variants, which don't work: > > org.apache.flink.api.common.functions.InvalidTyp

Re: Start cluster in different modes

2016-06-21 Thread Robert Metzger
Hi Ravinder, the streaming mode has been removed, because Flink now starts in the streaming mode by default. This means that the system is lazily allocating managed memory when user's are executing batch jobs. If you want to preallocate the managed memory, there is a new configuration option for t

Re: Caused by: java.lang.Exception: Serialized representation of java.lang.StackOverflowError: null

2016-06-21 Thread Robert Metzger
Hi, maybe Kryo is not able to serialize the class. Is the class holding the data outside of the heap? In those cases, I would recommend implementing a custom serializer (either by using Flink's TypeInformation system or Kryo). On Tue, Jun 21, 2016 at 5:43 PM, Debaditya Roy wrote: > Hi, > > I

Re: Caused by: java.lang.Exception: Serialized representation of java.lang.StackOverflowError: null

2016-06-21 Thread Debaditya Roy
Hi, I am using flink-1.0.3. Warm Regards, Debaditya On Tue, Jun 21, 2016 at 5:29 PM, Robert Metzger wrote: > Hi, > which version of Flink are you using? There has been a recent fix for the > issue: https://issues.apache.org/jira/browse/FLINK-3762 > > Regards, > Robert > > On Tue, Jun 21, 2016

Re: Caused by: java.lang.Exception: Serialized representation of java.lang.StackOverflowError: null

2016-06-21 Thread Robert Metzger
Hi, which version of Flink are you using? There has been a recent fix for the issue: https://issues.apache.org/jira/browse/FLINK-3762 Regards, Robert On Tue, Jun 21, 2016 at 5:22 PM, Debaditya Roy wrote: > Hello users, > > I am getting an error from the flat map function while running my progra

Re: Documentation for translation of Job graph to Execution graph

2016-06-21 Thread Robert Metzger
Hi, the link has been added newly, yes. Regarding Q1, since there is no documentation right now, I have to refer you to our code. In the JobManager.scala class, there is a method "private def submitJob(jobGraph, ...") where the ExecutionGraph is created. I think that's a good starting point for lo

Caused by: java.lang.Exception: Serialized representation of java.lang.StackOverflowError: null

2016-06-21 Thread Debaditya Roy
Hello users, I am getting an error from the flat map function while running my program. My program is sending an object of type Mat(OpenCV) from the custom source function and passing it to the flat map function for processing. However while executing I am getting this error:

Re: Keeping latest data point in a data stream variable

2016-06-21 Thread Robert Metzger
Hi Biplob, would you like to send the last value somewhere? is there a way of detecting when the stream ends? (Something like a marker element, or could you use a timeout?) Anyways, what you can do is use a flatMap() function, always store the current element. Once the stream is over, you emit th

Re: NPE in JDBCInputFormat

2016-06-21 Thread Robert Metzger
Hi, I also think this is a bug. Can you file a JIRA issue for it? Regards, Robert On Tue, Jun 21, 2016 at 12:15 PM, Flavio Pompermaier wrote: > I think the problem is somehow related to > val DB_ROWTYPE = new RowTypeInfo( > Seq(BasicTypeInfo.INT_TYPE_INFO), > Seq("id")) > > You have

What is the best way of sharing a dataset between the nodes in Apache flink?

2016-06-21 Thread ahmad Sa P
Hi everyone, I am using Apache Flink to process a stream of data and I need to share an index between all the nodes that process the input data. The index is getting updated by the nodes frequently. I would like to know, is it a good practice, from the point of efficiency, to share the Dataset thr

Start cluster in different modes

2016-06-21 Thread Ravinder Kaur
Hello community, I have been working with Flink for a while and have updated from version 0.10 to 1.0 but now I don't see the scripts to start cluster specifically in batch and stream mode like in version 0.10 Could someone tell me the difference and how I could achieve this? I tried to look thro

Keeping latest data point in a data stream variable

2016-06-21 Thread Biplob Biswas
Hi, I want to keep the latest data point which is processed in a datastream variable. So technically I need just one value in the variable and discard all the older ones. Can this be done somehow? I was thinking about using filters but i don't think i can use it for this scenario. Any ideas as to

Re: NPE in JDBCInputFormat

2016-06-21 Thread Flavio Pompermaier
I think the problem is somehow related to val DB_ROWTYPE = new RowTypeInfo( Seq(BasicTypeInfo.INT_TYPE_INFO), Seq("id")) You have only one filed, I think Seq("id") should be removed. However this is a bug IMHO, this kind of error should be checked with and handler with a proper error.

Re: Getting the NumberOfParallelSubtask

2016-06-21 Thread Robert Metzger
Hi Robert, the number of parallel subtasks is the parallelism of the job or the individual operator. Only when executing Flink locally, the parallelism is set to the CPU cores. The number of groups generated by the groupBy() transformation doesn't affect the parallelism. Very often the number of g

NPE in JDBCInputFormat

2016-06-21 Thread Martin Scholl
Hello everyone, JDBCInputFormat of flink 1.1-SNAPSHOT fails with an NPE in Row.productArity: %% snip %% java.io.IOException: Couldn't access resultSet at org.apache.flink.api.java.io.jdbc.JDBCInputFormat.nextRecord(JDBCInputFormat.java:288) at org.apache.flink.api.java.io.jdbc.