Re: flink-1.2 and unit testing / flinkspector

2017-03-17 Thread Ted Yu
You're welcome. Maybe send out a pull request to flinkspector so that other people don't have to go through this route. On Fri, Mar 17, 2017 at 7:16 PM, Tarandeep Singh wrote: > Thank you Ted. It worked! > > Best, > Tarandeep > > On Fri, Mar 17, 2017 at 5:31 PM, Ted Yu wrote: > >> Here is where th

Re: flink-1.2 and unit testing / flinkspector

2017-03-17 Thread Tarandeep Singh
Thank you Ted. It worked! Best, Tarandeep On Fri, Mar 17, 2017 at 5:31 PM, Ted Yu wrote: > Here is where the dependency comes in: > > [INFO] org.apache.flink:flink-test-utils_2.10:jar:1.3-SNAPSHOT > [INFO] +- org.apache.hadoop:hadoop-minikdc:jar:2.7.2:compile > [INFO] | +- org.apache.directory

Job fails to start with S3 savepoint

2017-03-17 Thread Bajaj, Abhinav
Hi, I am exploring the use of S3 for storing checkpoints and savepoints. I can get Flink to store the checkpoints and savepoints in S3. However, when I try to submit the same job using the stored savepoint, it fails with the exception below. I am using Flink 1.2 and submitted the job from the UI
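
For reference, a minimal sketch of pointing a job at an S3-backed FsStateBackend in Flink 1.2. The bucket and path are placeholders (not taken from the report above), and the Hadoop S3 filesystem and credentials are assumed to be configured on the JobManager and all TaskManagers:

    import org.apache.flink.runtime.state.filesystem.FsStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class S3CheckpointingSketch {

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Take a checkpoint every 60 seconds.
            env.enableCheckpointing(60_000L);

            // Keep checkpoint (and savepoint) data in S3.
            env.setStateBackend(new FsStateBackend("s3://my-bucket/flink/checkpoints"));

            // Trivial pipeline so the sketch actually runs; a real job goes here.
            env.fromElements(1L, 2L, 3L).print();

            env.execute("s3-checkpointing-sketch");
        }
    }

Resuming from the stored savepoint is then done from the web UI, as in the report, or with the CLI's -s <savepointPath> option.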

Re: flink-1.2 and unit testing / flinkspector

2017-03-17 Thread Ted Yu
Here is where the dependency comes in: [INFO] org.apache.flink:flink-test-utils_2.10:jar:1.3-SNAPSHOT [INFO] +- org.apache.hadoop:hadoop-minikdc:jar:2.7.2:compile [INFO] | +- org.apache.directory.server:apacheds-core-api:jar:2.0.0-M15:compile [INFO] | | \- org.apache.directory.jdbm:apacheds-jdb

Re: flink-1.2 and unit testing / flinkspector

2017-03-17 Thread Tarandeep Singh
Hi Ted, See the attached patch. I am able to run test examples (e.g. org.flinkspector.datastream.examples.TestMapper) via IntelliJ. But when I try to build via Maven (mvn clean install) I get that dependency issue ([ERROR] Failed to execute goal on project flinkspector-core_2.11: Could not resolv

Re: flink-1.2 and unit testing / flinkspector

2017-03-17 Thread Ted Yu
Can you post the patch for flinkspector where the mini cluster is replaced? I assume you upgraded the version of Flink in the pom. Cheers > On Mar 17, 2017, at 4:26 PM, Tarandeep Singh wrote: > > Hi, > > Is someone using flinkspector unit testing framework with flink-1.2? > I added the fol

flink-1.2 and unit testing / flinkspector

2017-03-17 Thread Tarandeep Singh
Hi, Is anyone using the flinkspector unit testing framework with flink-1.2? I added the following dependencies in my pom.xml file: org.flinkspector:flinkspector-datastream_2.10:0.5 and org.flinkspector:flinkspector-co

Re: Return of Flink shading problems in 1.2.0

2017-03-17 Thread Foster, Craig
Ping. So I’ve built with Maven 3.0.5 and it does give proper shading. That gets me yet another workaround, where my only recourse is to use a maximum Maven version. Still, I feel there should be a long-term fix at some point. I also believe there is a regression in Flink 1.2.0 for Maven 3.3

Re: register metrics reporter on application level

2017-03-17 Thread Chesnay Schepler
I'm not aware of any plans to implement this in the near future. It is a cool feature, but the implementation isn't trivial and would make the metric system a lot more complex, specifically with regard to concurrency and resource clean-up. Regards, Chesnay On 17.03.2017 21:13, Mingmin Xu wrote:

Re: register metrics reporter on application level

2017-03-17 Thread Mingmin Xu
Thanks @Chesnay, do you know any plan to add it in future version? Mingmin On Fri, Mar 17, 2017 at 1:05 PM, Chesnay Schepler wrote: > Hello, > > there is currently no way to specify a reporter for a specific job. > > Regards, > Chesnay > > > On 17.03.2017 19:18, Mingmin Xu wrote: > >> Hello all

Re: register metrics reporter on application level

2017-03-17 Thread Chesnay Schepler
Hello, there is currently no way to specify a reporter for a specific job. Regards, Chesnay On 17.03.2017 19:18, Mingmin Xu wrote: Hello all, I'm following https://ci.apache.org/projects/flink/flink-docs-release-1.2/monitoring/metrics.html to collect metrics. It seems the reporter is set on

Re: Checkpointing with RocksDB as statebackend

2017-03-17 Thread Stephan Ewen
@vinay Let's see how fast we get this fix in - I hope we do. It may also depend a bit on the RocksDB community. In any case, if it does not make it in, we can do a 1.2.2 release immediately after (I think the problem is big enough to warrant that), or at least release a custom version of the RocksDB

register metrics reporter on application level

2017-03-17 Thread Mingmin Xu
Hello all, I'm following https://ci.apache.org/projects/flink/flink-docs-release-1.2/monitoring/metrics.html to collect metrics. It seems the reporter is set at the cluster level. Is there an option to specify a different reporter at the application level? Thanks! Mingmin
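
While reporter selection is cluster-wide (see the replies above), metrics themselves can be registered from user code per the linked documentation. A minimal, illustrative sketch (class and metric names are made up):

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Counter;

    // Registers a per-subtask counter on the operator's metric group; whichever
    // reporter is configured cluster-wide in flink-conf.yaml will ship it along
    // with the built-in metrics.
    public class CountingMapper extends RichMapFunction<String, String> {

        private transient Counter recordsSeen;

        @Override
        public void open(Configuration parameters) {
            recordsSeen = getRuntimeContext().getMetricGroup().counter("recordsSeen");
        }

        @Override
        public String map(String value) {
            recordsSeen.inc();
            return value;
        }
    }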

Re: Checkpointing with RocksDB as statebackend

2017-03-17 Thread vinay patil
Hi Stephan, Is the performance-related RocksDB change going to be part of Flink 1.2.1? Regards, Vinay Patil On Thu, Mar 16, 2017 at 6:13 PM, Stephan Ewen [via Apache Flink User Mailing List archive.] wrote: > The only immediate workaround is to use windows with "reduce" or "fold" or > "ag
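
For context, the workaround quoted above (incremental aggregation, so the window state holds a single value per key instead of buffering every element) looks roughly like this; the tuple type and window size are illustrative:

    import org.apache.flink.api.common.functions.ReduceFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.windowing.time.Time;

    public class IncrementalWindowSketch {

        // The window keeps only the running sum per key, which keeps the state
        // handled by the RocksDB backend small.
        static DataStream<Tuple2<String, Long>> sumPerKey(DataStream<Tuple2<String, Long>> input) {
            return input
                    .keyBy(0)
                    .timeWindow(Time.minutes(1))
                    .reduce(new ReduceFunction<Tuple2<String, Long>>() {
                        @Override
                        public Tuple2<String, Long> reduce(Tuple2<String, Long> a, Tuple2<String, Long> b) {
                            return Tuple2.of(a.f0, a.f1 + b.f1);
                        }
                    });
        }
    }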

Re: ProcessingTimeTimer in ProcessFunction after a savepoint

2017-03-17 Thread Aljoscha Krettek
When restoring, processing-time timers that would have fired already should immediately fire. @Florian what Flink version are you using? In Flink 1.1 there was a bug that led to processing-time timers not being reset when restoring. > On 17 Mar 2017, at 15:39, Florian König wrote: > > Hi, > >

Re: SQL + flatten (or .*) quality docs location?

2017-03-17 Thread Stu Smith
Thank you! Just in case someone else stumbles onto this, I figured out what was giving me trouble. The object I wanted to flatten happened to be null at times, at which point it would error out with an exception along the lines of: "method flatten() not found" (Sorry, I'll try to follow up wi

Re: ProcessingTimeTimer in ProcessFunction after a savepoint

2017-03-17 Thread Florian König
Hi, funny coincidence, I was just about to ask the same thing. I have noticed this with restored checkpoints in one of my jobs. The timers seem to be gone. My window trigger registers a processing-time timer, but it seems that these don’t get restored - even if the timer is set to fire in the future

Re: Return of Flink shading problems in 1.2.0

2017-03-17 Thread Foster, Craig
Hey Stephan: I am building twice in every case described in my previous mail. Well, building and then rebuilding the flink-dist submodule. This was fixed in BigTop, but I started seeing this issue again with Flink 1.2.0. I was wondering if there's something else in the environment that could prevent

ProcessingTimeTimer in ProcessFunction after a savepoint

2017-03-17 Thread Yassine MARZOUGUI
Hi all, How does the processing time timer behave when a job is taken down with a savepoint and then restarted after the timer was supposed to fire? Will the timer fire at restart because it was missed during the savepoint? I'm wondering because I would like to schedule periodic timers in the fut
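
For readers following along, the pattern under discussion is roughly the following, written against the later, stabilized form of the ProcessFunction API (Flink 1.2 still split this across ProcessFunction/RichProcessFunction); the one-minute delay is just an example:

    import org.apache.flink.streaming.api.functions.ProcessFunction;
    import org.apache.flink.util.Collector;

    // Schedules a processing-time timer one minute after each element; the thread
    // asks whether such a timer still fires if the job is savepointed and only
    // restarted after the scheduled time has already passed.
    public class DelayedReminder extends ProcessFunction<String, String> {

        @Override
        public void processElement(String value, Context ctx, Collector<String> out) throws Exception {
            long fireAt = ctx.timerService().currentProcessingTime() + 60_000L;
            ctx.timerService().registerProcessingTimeTimer(fireAt);
            out.collect(value);
        }

        @Override
        public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
            out.collect("timer fired at " + timestamp);
        }
    }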

Re: Telling if a job has caught up with Kafka

2017-03-17 Thread Gyula Fóra
Hi Gordon, Thanks for the suggestions. I think in general it would be good to make this periodic (with a configurable interval), and also show the latest committed (checkpointed) offset lag. I think it's better to show both, not only one of them, as they both carry useful information. So we would h

Re: Telling if a job has caught up with Kafka

2017-03-17 Thread Tzu-Li (Gordon) Tai
One other possibility for reporting “consumer lag” is to update the metric only at a configurable interval, if use cases can tolerate a certain delay in realizing the consumer has caught up. Or we could also piggyback the consumer lag update onto the checkpoint interval - I think in the case t

Re: Telling if a job has caught up with Kafka

2017-03-17 Thread Tzu-Li (Gordon) Tai
Hi, I was thinking of something similar to what Ufuk suggested, but if we want to report a “consumer lag” metric, we would essentially need to request the latest offset on every record fetch (because the latest offset advances as well), so I wasn’t so sure of the performance tradeoffs there (the parti

Re: Telling if a job has caught up with Kafka

2017-03-17 Thread Ufuk Celebi
@Gordon: What's your take on integrating this directly into the consumer? Can't we poll the latest offset via the Offset API [1] and report a consumer lag metric for the consumer group of the application? We could also display this in the web frontend. In the first version, users would have to pol
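
As a rough illustration of the idea, run outside the Flink consumer and using the newer KafkaConsumer API (kafka-clients 0.10.1+) as a stand-in for the low-level Offset API mentioned above: lag per partition is the broker's latest offset minus the offset committed by the job's consumer group. Class name and property contents here are hypothetical:

    import java.util.Collection;
    import java.util.Map;
    import java.util.Properties;

    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;

    public final class ConsumerLagProbe {

        // Sums "latest offset - committed offset" over the given partitions.
        // kafkaProps must contain bootstrap.servers, the job's group.id, and
        // byte-array deserializers.
        public static long totalLag(Properties kafkaProps, Collection<TopicPartition> partitions) {
            try (KafkaConsumer<byte[], byte[]> probe = new KafkaConsumer<>(kafkaProps)) {
                Map<TopicPartition, Long> latest = probe.endOffsets(partitions);
                long lag = 0L;
                for (TopicPartition tp : partitions) {
                    OffsetAndMetadata committed = probe.committed(tp);
                    long committedOffset = committed == null ? 0L : committed.offset();
                    lag += Math.max(0L, latest.get(tp) - committedOffset);
                }
                return lag;
            }
        }
    }

As noted in the thread, the committed offset only advances with the offset-commit/checkpoint interval, so this number is a conservative view of progress rather than an exact lag.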

Re: Return of Flink shading problems in 1.2.0

2017-03-17 Thread Stephan Ewen
Hi Craig! Maven 3.3.x has a shading problem. You need to build twice: once from the root, and once inside "flink-dist". Have a look here: https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/building.html#dependency-shading Maybe that was missed in BigTop? I am wondering if we should a

Re: Telling if a job has caught up with Kafka

2017-03-17 Thread Bruno Aranda
Hi, We are interested in this too. So far we tag records with timestamps at different points of the pipeline and use metric gauges to measure the latency between the different components, but it would be good to know if there is something more specific to Kafka that we can do out of the box in Flin
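
A stripped-down sketch of that gauge approach (assuming each record carries an event timestamp; all names below are invented): expose the gap between wall-clock time and the timestamp of the last record seen by an operator.

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Gauge;

    /** Hypothetical record type that carries its own event timestamp. */
    interface TimestampedEvent {
        long getTimestampMillis();
    }

    // Reports, per subtask, how far wall-clock time is ahead of the event
    // timestamp of the last record this operator processed.
    public class EventTimeLagMapper extends RichMapFunction<TimestampedEvent, TimestampedEvent> {

        private transient volatile long lastEventTime = Long.MIN_VALUE;

        @Override
        public void open(Configuration parameters) {
            getRuntimeContext().getMetricGroup().gauge("eventTimeLagMillis", new Gauge<Long>() {
                @Override
                public Long getValue() {
                    return lastEventTime == Long.MIN_VALUE ? 0L : System.currentTimeMillis() - lastEventTime;
                }
            });
        }

        @Override
        public TimestampedEvent map(TimestampedEvent value) {
            lastEventTime = value.getTimestampMillis();
            return value;
        }
    }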

Re: Telling if a job has caught up with Kafka

2017-03-17 Thread Florian König
Hi, thank you Gyula for posting that question. I’d also be interested in how this could be done. You mentioned the dependency on the commit frequency. I’m using https://github.com/quantifind/KafkaOffsetMonitor. With the 0.8 Kafka consumer, a job's offsets as shown in the diagrams updated a lot m

Telling if a job has caught up with Kafka

2017-03-17 Thread Gyula Fóra
Hi All, I am wondering if anyone has some nice suggestions on what would be the simplest/best way of telling if a job is caught up with the Kafka input. An alternative question would be how to tell if a job is caught up to another job reading from the same topic. The first thing that comes to my

Re: Batch stream Sink delay ?

2017-03-17 Thread Fabian Hueske
Yes, an AssignerWithPeriodicWatermarks is called periodically. The interval is configured with StreamExecutionEnvironment.getConfig().setAutoWatermarkInterval(). When you configure event-time mode (setStreamTimeCharacteristic(TimeCharacteristic.EventTime)), the default is set to 200 ms. Best, Fabian
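
Putting Fabian's pointers together, a small self-contained example; the 500 ms interval and 5 s lateness bound are arbitrary choices, and the toy source simply uses each Long element as its own event timestamp:

    import org.apache.flink.streaming.api.TimeCharacteristic;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
    import org.apache.flink.streaming.api.watermark.Watermark;

    public class PeriodicWatermarkExample {

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Event-time mode implicitly sets the auto-watermark interval to 200 ms ...
            env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
            // ... but it can also be set explicitly.
            env.getConfig().setAutoWatermarkInterval(500L);

            env.fromElements(1_000L, 2_000L, 63_000L)
                    .assignTimestampsAndWatermarks(new BoundedLatenessAssigner())
                    .print();

            env.execute("periodic-watermark-example");
        }

        /** Emits watermarks that trail the highest timestamp seen by 5 seconds. */
        public static class BoundedLatenessAssigner implements AssignerWithPeriodicWatermarks<Long> {

            private long maxTimestamp = Long.MIN_VALUE;

            @Override
            public long extractTimestamp(Long element, long previousElementTimestamp) {
                maxTimestamp = Math.max(maxTimestamp, element);
                return element;
            }

            @Override
            public Watermark getCurrentWatermark() {
                return maxTimestamp == Long.MIN_VALUE
                        ? new Watermark(Long.MIN_VALUE)
                        : new Watermark(maxTimestamp - 5_000L);
            }
        }
    }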

Re: Data+control stream from kafka + window function - not working

2017-03-17 Thread Tarandeep Singh
Hi Gordon, When I use getInput (input created from a collection), the watermarks are always Long.MAX_VALUE: WM: Watermark @ 9223372036854775807 This is understandable, as the input source has finished, so a watermark of value Long.MAX_VALUE is emitted. When I use getKafkaInput, I get this watermark: WM:

Re: Return of Flink shading problems in 1.2.0

2017-03-17 Thread Ufuk Celebi
Pulling in Robert and Stephan who know the project's shading setup the best. On Fri, Mar 17, 2017 at 6:52 AM, Foster, Craig wrote: > Hi: > > A few months ago, I was building Flink and ran into shading issues for > flink-dist as described in your docs. We resolved this in BigTop by adding > the co