Re: 1.1-snapshot issues

2016-05-17 Thread Henry Saputra
Looks like it has been resolved. Could you try it again? On Tue, May 17, 2016 at 7:02 AM, Stefano Baghino < stefano.bagh...@radicalbit.io> wrote: > I believe that Apache repo is having some issues right now: > https://status.apache.org/ > > On Tue, May 17, 2016 at 3:55 PM, aris kol

custom sources

2016-05-17 Thread Abhishek R. Singh
Hi, Can we define custom sources in Flink? Control the barriers and (thus) checkpoints at good watermark points? -Abhishek-

Re: Flink recovery

2016-05-17 Thread Fabian Hueske
Thanks for reporting back, Naveen! 2016-05-17 18:55 GMT+02:00 Madhire, Naveen : > Hi Robert, With the use of manual save points, I was able to obtain > exactly-once output with Kafka and HDFS rolling sink. > > Thanks to you and Fabian for the help. > > > From:

Re: Multi-tenant, deploying flink cluster

2016-05-17 Thread Robert Metzger
Hi, your problem description is very brief. Can you explain a bit more in detail what you need? You can group messages by a certain key (tenant id) and process them together on the same machine. On Fri, May 13, 2016 at 11:23 PM, Alexander Smirnov < alexander.smirn...@gmail.com> wrote: > Hi, >
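Robert's suggestion corresponds to key-based grouping (keyBy in the DataStream API): every record with the same key is routed to the same parallel instance. A minimal pure-Java sketch of that routing idea — the tenant IDs and the modulo partitioning here are illustrative, not Flink's actual key-group hashing:

```java
import java.util.*;

public class KeyRouting {
    // Route each record to a partition derived from its tenant id,
    // so all records of one tenant land on the same "machine".
    static Map<Integer, List<String>> route(List<String[]> records, int parallelism) {
        Map<Integer, List<String>> partitions = new HashMap<>();
        for (String[] rec : records) {
            String tenantId = rec[0]; // key field
            int p = Math.floorMod(tenantId.hashCode(), parallelism);
            partitions.computeIfAbsent(p, k -> new ArrayList<>()).add(rec[1]);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<String[]> records = Arrays.asList(
            new String[]{"tenantA", "msg1"},
            new String[]{"tenantB", "msg2"},
            new String[]{"tenantA", "msg3"});
        Map<Integer, List<String>> parts = route(records, 4);
        // Both tenantA messages end up in the same partition.
        int pA = Math.floorMod("tenantA".hashCode(), 4);
        System.out.println(parts.get(pA));
    }
}
```

In Flink itself this is simply `stream.keyBy(<key selector>)`, after which per-tenant state and processing stay co-located.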

Re: Flink recovery

2016-05-17 Thread Robert Metzger
Hi, Savepoints are exactly for that use case: https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/savepoints.html http://data-artisans.com/how-apache-flink-enables-new-streaming-applications/ Regards, Robert On Tue, May 17, 2016 at 4:25 PM, Madhire, Naveen <

Re: Flink recovery

2016-05-17 Thread Madhire, Naveen
Hey Robert, What is the best way to stop the streaming job in production if I want to upgrade the application without losing messages and causing duplicates? How can I test this scenario? We are testing a few recovery mechanisms like job failure, application upgrade and node failure. Thanks,

Re: killing process in Flink cluster

2016-05-17 Thread Robert Metzger
Hi, so you tried to stop Flink by killing the processes? I assume you've started Flink in the standalone cluster mode? If you do a kill, and certainly a kill -9, it should definitely stop Flink. Did you check the log files of the task manager? The Flink services are logging when they are receiving signals from

1.1-snapshot issues

2016-05-17 Thread aris kol
Hi guys, Since yesterday, I am getting this: [warn] apache.snapshots: tried [warn] http://repository.apache.org/snapshots/org/apache/flink/flink-scala_2.11/1.1-SNAPSHOT/flink-scala_2.11-1.1-SNAPSHOT.pom [error] SERVER ERROR: Proxy Error

Re: Another serialization error

2016-05-17 Thread Flavio Pompermaier
Yes I am On Tue, May 17, 2016 at 3:45 PM, Robert Metzger wrote: > Are you using 1.0.2 on the cluster as well? > > On Tue, May 17, 2016 at 3:40 PM, Flavio Pompermaier > wrote: > >> I tried to debug my application from Eclipse and I got an infinite >>

Re: Another serialization error

2016-05-17 Thread Robert Metzger
Are you using 1.0.2 on the cluster as well? On Tue, May 17, 2016 at 3:40 PM, Flavio Pompermaier wrote: > I tried to debug my application from Eclipse and I got an infinite > recursive call in the TypeExtractor during the analysis of TreeNode (I'm > using Flink 1.0.2): > >

Re: Another serialization error

2016-05-17 Thread Flavio Pompermaier
I tried to debug my application from Eclipse and I got an infinite recursive call in the TypeExtractor during the analysis of TreeNode (I'm using Flink 1.0.2): Exception in thread "main" java.lang.StackOverflowError at
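The StackOverflowError described is characteristic of reflectively walking a self-referential type: if TreeNode contains fields of type TreeNode, a naive recursive analysis never terminates. A hedged, self-contained sketch of guarding such a walk with a visited set — the TreeNode shape here is illustrative, not Flavio's actual class:

```java
import java.lang.reflect.Field;
import java.util.*;

public class TypeWalk {
    // Illustrative self-referential type, like the TreeNode in the report.
    static class TreeNode {
        String value;
        TreeNode parent; // back-reference: naive recursion loops forever here
    }

    // Collect all field types reachable from a class, guarding against cycles.
    static Set<Class<?>> reachableTypes(Class<?> root, Set<Class<?>> seen) {
        if (!seen.add(root)) {
            return seen; // already visited: stop instead of recursing infinitely
        }
        for (Field f : root.getDeclaredFields()) {
            Class<?> t = f.getType();
            if (!t.isPrimitive()) {
                reachableTypes(t, seen);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Terminates despite the TreeNode -> TreeNode cycle.
        Set<Class<?>> types = reachableTypes(TreeNode.class, new HashSet<>());
        System.out.println(types.contains(TreeNode.class));
    }
}
```

Flink's TypeExtractor in 1.0.x handles many recursive types, but a cycle it does not recognize produces exactly the infinite recursion reported above.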

Re: flink snapshotting fault-tolerance

2016-05-17 Thread Robert Metzger
Hi Stavros, I haven't implemented our checkpointing mechanism and I didn't participate in the design decisions while implementing it, so I cannot compare it in detail to other approaches. From a "does it work" perspective: Checkpoints are only confirmed if all parallel subtasks successfully

Re: Flink performance tuning

2016-05-17 Thread Robert Metzger
Hi, Flink is not using all available slots by default. You have to pass the "parallelism" as a parameter "-p 21" when submitting the job. This might also explain the performance difference compared to MapReduce. The datatypes you are using look okay. I don't see a performance issue there.

Re: Flink recovery

2016-05-17 Thread Robert Metzger
Hi Naveen, I think cancelling a job is not the right approach for testing our exactly-once guarantees. By cancelling a job, you are discarding the state of your job. Restarting from scratch (without using a savepoint) will cause duplicates. What you can do to validate the behavior is randomly

Re: Another serialization error

2016-05-17 Thread Robert Metzger
The last one is C or A? How often is it failing (every nth run?) Is it always failing at the same execute() call, or at different ones? Is it always the exact same exception or is it different ones? Does the error behave differently depending on the input data? Sorry for asking so many

Re: rocksdb backend on s3 window operator checkpoint issue

2016-05-17 Thread Robert Metzger
I tried reproducing the issue using the org.apache.hadoop.fs.s3a.S3AFileSystem and it worked. I had some dependency issues with the S3AFileSystem so I didn't follow that path for now. If you've used the S3AFileSystem, I can try to get that one working as well. On Tue, May 17, 2016 at 11:59 AM,

Re: Another serialization error

2016-05-17 Thread Flavio Pompermaier
Ah sorry, I forgot to mention that I don't use any custom Kryo serializers. On Tue, May 17, 2016 at 12:39 PM, Flavio Pompermaier wrote: > I got those exceptions running 3 different types of jobs... I could have > tracked the job and the error... my bad! > However, the most

Re: Another serialization error

2016-05-17 Thread Robert Metzger
Hi Flavio, thank you for providing additional details. I don't think that missing hashCode() / equals() implementations cause such an error. They can cause wrong sorting or partitioning of the data, but the serialization should still work properly. I suspect the issue is somewhere in the serialization
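Robert's distinction can be shown in plain Java: a type without equals()/hashCode() still serializes and deserializes fine, but identical-looking values are no longer grouped together. A small self-contained sketch (the Key class is illustrative):

```java
import java.io.*;
import java.util.*;

public class EqualsVsSerialization {
    // POJO deliberately lacking equals()/hashCode().
    static class Key implements Serializable {
        String id;
        Key(String id) { this.id = id; }
    }

    // Serialization round-trips fine even without equals()/hashCode().
    static Key roundTrip(Key k) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new ObjectOutputStream(bos).writeObject(k);
        return (Key) new ObjectInputStream(
            new ByteArrayInputStream(bos.toByteArray())).readObject();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(new Key("a")).id); // prints a

        // But grouping misbehaves: two equal-looking keys stay distinct,
        // because HashSet falls back to identity semantics.
        Set<Key> set = new HashSet<>();
        set.add(new Key("a"));
        set.add(new Key("a"));
        System.out.println(set.size()); // prints 2, not 1
    }
}
```

This mirrors the point in the thread: missing equals()/hashCode() corrupts grouping and partitioning semantics, not the byte-level (de)serialization itself.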

Re: closewith(...) not working in DataStream error, but works in DataSet

2016-05-17 Thread Biplob Biswas
I am also a newbie, but from what I experienced during my experiments: the same implementation doesn't work for the streaming context because 1) in the streaming context the stream is assumed to be infinite, so the process of iteration is also infinite, and the part with which you close your
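The contrast Biplob describes can be sketched in plain Java: a batch (DataSet) iteration can close because the data is finite and a termination criterion can be evaluated over it, whereas an unbounded stream has no such end. A minimal illustration of the bounded case, with an illustrative convergence rule:

```java
public class BoundedIteration {
    // Batch-style iteration: feed the result back until a termination
    // criterion holds; this is only decidable because the input is bounded.
    static int stepsUntilConverged(double value) {
        int steps = 0;
        while (value > 1.0) {    // termination criterion
            value = value / 2.0; // the "closeWith" feedback edge
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(stepsUntilConverged(100.0)); // prints 7
        // On an infinite stream there is no point at which such a criterion
        // can be evaluated over "all" the data, so the loop never closes.
    }
}
```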

Re: java.lang.ClassCastException: org.apache.flink.streaming.api.watermark.Watermark cannot be cast to org.apache.flink.streaming.runtime.streamrecord.StreamRecord

2016-05-17 Thread Robert Metzger
I've filed a JIRA to improve the error message: https://issues.apache.org/jira/browse/FLINK-3918 On Fri, Apr 22, 2016 at 11:17 PM, Fabian Hueske wrote: > Hi Konstantin, > > this exception is thrown if you do not set the time characteristic to > event time and assign

Re: Another serialization error

2016-05-17 Thread Flavio Pompermaier
Hi Robert, in this specific case the classes involved are: - Tuple3 (IndexAttributeToExpand is a POJO extending another class, and neither of them implements equals and hashCode) - Tuple3>>

Re: Another serialization error

2016-05-17 Thread Robert Metzger
Hi Flavio, which datatype are you using? On Tue, May 17, 2016 at 11:42 AM, Flavio Pompermaier wrote: > Hi to all, > during these days we've run a lot of Flink jobs and from time to time > (apparently randomly) a different Exception arise during their executions... > I

Re: rocksdb backend on s3 window operator checkpoint issue

2016-05-17 Thread Robert Metzger
Hi, from the code you've provided, everything seems to look okay. I'm currently trying to reproduce the issue. Which Flink version are you using? Which S3 implementation did you configure in the Hadoop configuration? Regards, Robert On Mon, May 16, 2016 at 11:52 PM, Chen Qin

Another serialization error

2016-05-17 Thread Flavio Pompermaier
Hi to all, during these days we've run a lot of Flink jobs, and from time to time (apparently randomly) a different exception arises during their execution... I hope one of them can help in finding the source of the problem. This time the exception is: An error occurred while reading the next

Re: Flink recovery

2016-05-17 Thread Stephan Ewen
Hi Naveen! I assume you are using Hadoop 2.7+? Then you should not see the ".valid-length" file. The fix you mentioned is part of later Flink releases (like 1.0.3) Stephan On Mon, May 16, 2016 at 11:46 PM, Madhire, Naveen < naveen.madh...@capitalone.com> wrote: > Thanks Fabian. Actually I

RE: Flink performance tuning

2016-05-17 Thread Serhiy Boychenko
Cheerz, Basically the data is stored in CSV format. The flatMap which I have implemented does: String[] tokens = value.split(","); out.collect(new Tuple2(tokens[0], Double.valueOf(tokens[2]))); The result calculation looks like: DataSet> statistics =
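The inline snippet above lost its generic type parameters in the archive. A self-contained sketch of the described flatMap logic, assuming (as in the message) the key sits in column 0 and the numeric value in column 2 of each CSV line:

```java
import java.util.AbstractMap;
import java.util.Map;

public class CsvParse {
    // Equivalent of the flatMap body: split a CSV line, emit (key, value).
    static Map.Entry<String, Double> parseLine(String value) {
        String[] tokens = value.split(",");
        return new AbstractMap.SimpleEntry<>(tokens[0], Double.valueOf(tokens[2]));
    }

    public static void main(String[] args) {
        Map.Entry<String, Double> out = parseLine("sensor1,2016-05-17,42.5");
        System.out.println(out.getKey() + " -> " + out.getValue()); // prints sensor1 -> 42.5
    }
}
```

In the Flink job this would be `out.collect(new Tuple2<String, Double>(tokens[0], Double.valueOf(tokens[2])))` inside a `FlatMapFunction<String, Tuple2<String, Double>>`.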

Re: Flink Kafka Streaming - Source [Bytes/records Received] and Sink [Bytes/records sent] show zero messages

2016-05-17 Thread Robert Metzger
Hi, currently, Flink doesn't have code built in for throughput (rate) and latency measurements. It's a planned feature for the future, and some first steps have been taken in that direction. Check out this code for measuring throughput:
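Until built-in rate metrics exist, a common workaround is a counting pass inside an operator that logs records per interval. A self-contained sketch of the idea — the interval and logging choices here are illustrative, not Flink's planned API:

```java
public class ThroughputMeter {
    private long count = 0;
    private long windowStart = System.nanoTime();
    private final long intervalNanos;

    ThroughputMeter(long intervalMillis) {
        this.intervalNanos = intervalMillis * 1_000_000L;
    }

    // Call once per record; prints the rate each time the interval elapses.
    void record() {
        count++;
        long now = System.nanoTime();
        if (now - windowStart >= intervalNanos) {
            double seconds = (now - windowStart) / 1e9;
            System.out.printf("%.0f records/sec%n", count / seconds);
            count = 0;
            windowStart = now;
        }
    }

    long currentCount() { return count; }

    public static void main(String[] args) {
        ThroughputMeter meter = new ThroughputMeter(1000);
        for (int i = 0; i < 5; i++) {
            meter.record();
        }
        System.out.println(meter.currentCount()); // prints 5: interval not yet elapsed
    }
}
```

Embedding such a counter in a map or flatMap operator gives a rough per-subtask rate without touching the job's logic.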

Re: Broadcast and read broadcast variable in DataStream

2016-05-17 Thread Aljoscha Krettek
Hi, something like .withBroadcastSet() is not yet available in the DataStream API. I'm working on it, however. Using a (global) static variable will not work for this case since the computation is distributed. The iteration does not work because the head of the iteration (the "loop" variable) is

Re: Managing Flink on Yarn programmatically

2016-05-17 Thread Robert Metzger
Hi, There is currently no officially supported API for managing Flink on YARN programmatically. I think the ideal solution would be something like a YarnExecutionEnvironment. But you can still do it by using Flink's internal YARN abstraction. Check out this YARN test case, which is