Re: Different result on running Flink in local mode and Yarn cluster

2018-04-25 Thread Jörn Franke
The problem may be that it is still static. How will the parser use this HashMap? > On 26. Apr 2018, at 06:42, Soheil Pourbafrani wrote: > > I run code using the Flink Java API that gets some bytes from Kafka and parses > it, followed by inserting into a Cassandra database

Different result on running Flink in local mode and Yarn cluster

2018-04-25 Thread Soheil Pourbafrani
I run code using the *Flink* Java API that gets some bytes from *Kafka* and parses them, followed by inserting into a *Cassandra* database using another library's *static* method (both parsing and inserting are done by the library). Running the code locally in the IDE, I get the desired answer, but

Reply: Why assignTimestampsAndWatermarks parallelism same as map, it will not fired?

2018-04-25 Thread 潘 功森
Got it. Thanks! From: 周思华 Sent: April 26, 2018 10:50 To: TechnoMage Cc: Fabian Hueske; Timo Walther; user; 潘 功森 Subject: Re: Why

Re: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?

2018-04-25 Thread 周思华
Hi 潘, could you please check the number of Kafka partitions? I think if the {{number of Kafka partitions}} < {{parallelism of the source node}}, then there can be some idle parallel instances which won't receive any data... Best Regards, Sihua Zhou On 04/26/2018
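The stall described above can be illustrated without Flink. The names below are hypothetical, but the mechanism matches how Flink combines watermarks: a downstream operator's watermark is the minimum over all of its input channels, so one idle parallel source instance that never emits data keeps the combined watermark at its initial value and prevents event-time windows from firing.

```java
// Flink-free sketch: the combined watermark is the minimum over all input
// channels. A channel fed by an empty Kafka partition never advances past
// Long.MIN_VALUE, so the combined watermark never advances either.
import java.util.Arrays;

public class CombinedWatermark {
    static long combined(long[] perChannelWatermarks) {
        return Arrays.stream(perChannelWatermarks).min().orElse(Long.MIN_VALUE);
    }

    public static void main(String[] args) {
        long[] allActive = {1000L, 1200L, 1100L};
        long[] oneIdle = {1000L, 1200L, 1100L, Long.MIN_VALUE};
        System.out.println(combined(allActive)); // 1000
        System.out.println(combined(oneIdle));   // stuck at Long.MIN_VALUE
    }
}
```

This is why reducing the source parallelism to the partition count (or shuffling before the assigner, as discussed elsewhere in this thread) makes the watermark advance.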

Re: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?

2018-04-25 Thread TechnoMage
If you are using keyed messages in Kafka, or keyed streams in Flink, then only the partitions that keys hash to will get data. If the messages are not keyed, then yes, they should all get data. Michael > On Apr 25, 2018, at 8:25 PM, 潘 功森 wrote: > > The event is
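A small sketch of the keyed-partitioning point above. This is not Kafka's actual default partitioner (which hashes the key bytes with murmur2); it is an illustrative stand-in showing why a small or skewed key space can leave some partitions with no data at all.

```java
// Illustrative keyed partitioning: the same key always maps to the same
// partition, so with few distinct keys some partitions never receive data.
public class KeyedPartitioning {
    static int partitionFor(String key, int numPartitions) {
        // Mask the sign bit so the modulo result is non-negative.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // With only two distinct keys and four partitions, at most two
        // partitions ever receive data.
        for (String key : new String[]{"user-a", "user-b", "user-a"}) {
            System.out.println(key + " -> partition " + partitionFor(key, 4));
        }
    }
}
```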

Reply: Why assignTimestampsAndWatermarks parallelism same as map, it will not fired?

2018-04-25 Thread 潘 功森
The events are arriving all the time, in order; I don't know why one of the partitions does not receive data if I don't change the parallelism? From: Fabian Hueske Sent: April 25, 2018 10:56 To: Timo Walther Cc: user Subject: Re: Why assignTimestampsAndWatermarks

Reply: Why assignTimestampsAndWatermarks parallelism same as map, it will not fired?

2018-04-25 Thread 潘 功森
Yes. From: Timo Walther Sent: April 25, 2018 8:43 To: user@flink.apache.org Subject: Re: Why assignTimestampsAndWatermarks parallelism same as map, it will not fired? Hi, did you set your time characteristic to event-time?

RE: KafkaProducer with generic (Avro) serialization schema

2018-04-25 Thread Nortman, Bill
The first things I would try: check that your classes Person and Address have getters and setters and a no-argument constructor. From: Wouter Zorgdrager [mailto:zorgdrag...@gmail.com] Sent: Wednesday, April 25, 2018 7:17 AM To: user@flink.apache.org Subject: KafkaProducer with generic (Avro)

Re: data enrichment with SQL use case

2018-04-25 Thread Ken Krugler
Hi Michael, Windowing works when you’re joining timestamped metadata and non-metadata. The common case I’m referring to is where there’s some function state (e.g. rules to process data, machine learning models, or in my case clusters), where you want to process the non-metadata with the

Re: Task Manager detached under load

2018-04-25 Thread Steven Wu
Till, We ran into the same issue. It started with a high GC pause that caused the jobmanager to lose its ZK connection and leadership, and caused the jobmanager to quarantine the taskmanager in Akka. Once quarantined, the Akka association between jobmanager and taskmanager is locked forever. Your suggestion of "

Re: data enrichment with SQL use case

2018-04-25 Thread TechnoMage
I agree in the general case you need to operate on the stream data based on the metadata you have. The side input feature coming some day may help you, in that it would give you a means to receive inputs out of band. But, given changing metadata and changing stream data I am not sure this is

Re: data enrichment with SQL use case

2018-04-25 Thread TechnoMage
Using a flat map function, you can always buffer the non-metadata stream in operator state until the metadata is aggregated, and then process any collected data. It would require a RichFlatMap to hold the data. Michael > On Apr 25, 2018, at 1:20 PM, Ken Krugler
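The buffering idea above can be sketched without any Flink dependencies. The class below is hypothetical: the list stands in for the operator state a RichFlatMapFunction (or a CoFlatMapFunction over a connected stream) would hold, and the two methods stand in for the two inputs.

```java
// Flink-free sketch of buffering until metadata is ready: elements arriving
// before the metadata are held back; once metadata arrives, the buffer is
// flushed (enriched) and later elements are enriched directly.
import java.util.ArrayList;
import java.util.List;

public class BufferingEnricher {
    private final List<String> buffer = new ArrayList<>(); // stand-in for operator state
    private String metadata; // null until the metadata side delivers it

    public void onMetadata(String m) { metadata = m; }

    public List<String> onElement(String element) {
        List<String> out = new ArrayList<>();
        if (metadata == null) {
            buffer.add(element); // metadata not ready: buffer, emit nothing
            return out;
        }
        for (String buffered : buffer) out.add(buffered + "/" + metadata);
        buffer.clear();
        out.add(element + "/" + metadata);
        return out;
    }
}
```

In a real job the buffer would live in Flink managed state so it survives checkpoints and restores.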

Re: data enrichment with SQL use case

2018-04-25 Thread Ken Krugler
Hi Fabian, > On Apr 24, 2018, at 3:01 AM, Fabian Hueske wrote: > > Hi Alex, > > An operator that has to join two input streams obviously requires two inputs. > In case of an enrichment join, the operator should first read the meta-data > stream and build up a data

Re: Help with OneInputStreamOperatorTestHarness

2018-04-25 Thread Chris Schneider
Hi Gang, FWIW, the code below works just fine using Flink 1.5-SNAPSHOT. I also tried cherry-picking the commit that fixed FLINK-8268 to Flink 1.4.0, but that resulted in the same failure mode. I guess

Re: Externalized checkpoints and metadata

2018-04-25 Thread hao gao
Hi Juan, We modified the Flink code a little bit to change the Flink checkpoint structure so we can easily identify which is which. You can read my note or the PR: https://medium.com/hadoop-noob/flink-externalized-checkpoint-eb86e693cfed https://github.com/BranchMetrics/flink/pull/6/files Hope it

Multiple Streams Connect Watermark

2018-04-25 Thread Chengzhi Zhao
Hi, everyone, I am trying to build a join-like pipeline using the Flink connect operator and CoProcessFunction, and I have a use case where I need to connect 3+ streams. So I have something like this: A ===> C B ==> E D So two streams A and B connect at first with 3

Re: Weird Kafka Connector issue

2018-04-25 Thread TechnoMage
So Flink is not using the group id, but at the same time all other topics do read all published records. Note that the number of partitions is equal to the parallelism of the Flink job, so there should be one Flink task per partition. Michael > On Apr 25, 2018, at 11:40 AM, Nortman, Bill

Re: Weird Kafka Connector issue

2018-04-25 Thread TechnoMage
Just in case it is a metrics bug, I will add a step to do my own counting in the Flink job. Michael > On Apr 25, 2018, at 9:52 AM, TechnoMage wrote: > > I have another java program reading the topic to monitor the test. It > receives 60,000 records on the “travel”

Re: Weird Kafka Connector issue

2018-04-25 Thread TechnoMage
I have another java program reading the topic to monitor the test. It receives 60,000 records on the “travel” topic, while the kafka consumer only reports 4,138. That and the incongruity of the source to the maps are what seems very weird. I have several other topics where the source is

Re: Beam quickstart

2018-04-25 Thread Jörn Franke
Have you tried with a fat jar to see if it works in general? > On 25. Apr 2018, at 15:32, Gyula Fóra wrote: > > Hey, > Is there an end-to-end guide somewhere on how to run a simple beam-on-flink > application (preferably using Gradle)? I want to run it using the standard >

Re: Beam quickstart

2018-04-25 Thread Ufuk Celebi
Hey Gyula, including Aljoscha (cc) here who is a committer at the Beam project. Did you also ask on the Beam mailing list? – Ufuk On Wed, Apr 25, 2018 at 3:32 PM, Gyula Fóra wrote: > Hey, > Is there somewhere an end to end guide how to run a simple beam-on-flink >

Beam quickstart

2018-04-25 Thread Gyula Fóra
Hey, Is there an end-to-end guide somewhere on how to run a simple Beam-on-Flink application (preferably using Gradle)? I want to run it using the standard per-job YARN cluster setup but I can't seem to get it to work. I always end up with strange NoSuchMethod errors from protobuf and have spent

Re: Run programs w/ params including comma via REST api

2018-04-25 Thread Chesnay Schepler
We may be able to modify the handler to just concatenate the arguments again with a comma. I'll look into that tomorrow. On 25.04.2018 11:02, Dongwon Kim wrote: Hi Chesnay, I already modified our application to use semicolon as delimiter and now I can run a job using rest API via dispatcher.

CEP: use of values of previously accepted event

2018-04-25 Thread Esa Heikkinen
Hi, I have tried to read [1] and understand how to get values of a previously accepted event to use in the current event (or pattern). Iterative conditions (with context.getEventsForPattern) do something like that, but they get all previously accepted events. How do I get only the last one (in Scala)?

Externalized checkpoints and metadata

2018-04-25 Thread Juan Gentile
Hello, We are trying to use externalized checkpoints, using RocksDB on Hadoop HDFS. We would like to know the proper way to resume from a saved checkpoint, as we are currently running many jobs in the same Flink cluster. The problem is that when we want to restart the jobs and pass the

Re: Class loading issues when using Remote Execution Environment

2018-04-25 Thread kedar mhaswade
Thank you for your response! I have not tried the flink run app.jar route because the way the app is set up does not allow me to do it. Basically, the app is a web application which serves the UI and also submits a Flink job for running Cypher queries. It is a proof-of-concept app, but IMO, a

Re: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?

2018-04-25 Thread Fabian Hueske
Hi, This sounds like one of the partitions does not receive data. Watermark generation is data driven, i.e., the watermark can only advance if the TimestampAndWatermarkAssigner sees events. By changing the parallelism between the map and the assigner, the events are shuffled across and hence

Re: Kafka to Flink Avro Deserializer

2018-04-25 Thread Lehuede sebastien
Hi Timo, Thanks for your response! I have defined my Avro schema in "toKafka.avsc" and created my "toKafka.java" file with: *#java -jar avro-tools-1.8.2.jar compile schema toKafka.avsc* Then I import the Avro serialization schema and my generated "toKafka.java" file: *import

Re: Wrong endpoints to cancel a job

2018-04-25 Thread Timo Walther
Hi Dongwon, please send such mails to dev@ instead of user@, as Flink 1.5.0 is not released yet. As far as I know the documentation around deployment and FLIP-6 has not been updated yet. But thank you for letting us know! Regards, Timo On 25.04.18 at 11:03, Dongwon Kim wrote: Hi,

Re: Kafka to Flink Avro Deserializer

2018-04-25 Thread Timo Walther
Hi Sebastien, for me this seems more an Avro issue than a Flink issue. You can ignore the shaded exception; we shade Google utilities to avoid dependency conflicts. The root cause is this: java.lang.NullPointerException     at org.apache.avro.specific.SpecificData.getSchema

Wrong endpoints to cancel a job

2018-04-25 Thread Dongwon Kim
Hi, 1.5.0 needs to update its web page for rest APIs. I'm testing YARN dispatcher and had difficulty canceling jobs today. I've been sending DELETE requests to dispatcher according to https://ci.apache.org/projects/flink/flink-docs-master/monitoring/rest_api.html#cancel-job

Re: Run programs w/ params including comma via REST api

2018-04-25 Thread Dongwon Kim
Hi Chesnay, I already modified our application to use semicolon as delimiter and now I can run a job using rest API via dispatcher. Nevertheless, though I understand the difficulty of modifying API, it needs to be adjusted as many users are highly likely to use a list of brokers as an argument

Kafka to Flink Avro Deserializer

2018-04-25 Thread Lehuede sebastien
Hi Guys, I tried to implement my Avro deserializer following these links: - https://github.com/okkam-it/flink-examples/blob/master/src/main/java/org/okkam/flink/avro/AvroDeserializationSchema.java -

Re: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?

2018-04-25 Thread Timo Walther
Hi, did you set your time characteristic to event-time? env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime); Regards, Timo On 25.04.18 at 05:15, 潘 功森 wrote: Hi all, I use the same parallelism between map and assignTimestampsAndWatermarks, and it does not fire; I saw the

Re: Run programs w/ params including comma via REST api

2018-04-25 Thread Chesnay Schepler
Currently I don't see a way to circumvent the splitting. You will have to use a different delimiter; I guess a semicolon could work? The code is rather optimistic in that it assumes commas do not occur within a parameter value, and doesn't support any kind of escaping or quoting. (And this
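The splitting problem discussed in this thread can be shown in a couple of lines. The helper below is a stand-in for what the REST handler effectively does to the program-args string, not the actual handler code: a single value containing commas (like a Kafka broker list) is torn apart, while the semicolon workaround survives.

```java
// Minimal illustration: splitting program args on commas destroys any single
// value that itself contains commas; joining brokers with ';' survives it.
import java.util.Arrays;

public class ArgSplitting {
    static String[] split(String programArgs) {
        return programArgs.split(",");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("--brokers,host1:9092,host2:9092")));
        // -> [--brokers, host1:9092, host2:9092]   (broker list torn apart)
        System.out.println(Arrays.toString(split("--brokers,host1:9092;host2:9092")));
        // -> [--brokers, host1:9092;host2:9092]    (broker list intact)
    }
}
```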

Re: Trigger state clear

2018-04-25 Thread Chesnay Schepler
Do the taskmanager logs contain any exceptions? On 24.04.2018 17:26, miki haiat wrote: Hi, I have some issue, possibly a memory issue, that is causing the task manager to crash. Full code: https://gist.github.com/miko-code/6d7010505c3cb95be122364b29057237 I defined fire_and_purge on element and

Re: How to run flip-6 on mesos

2018-04-25 Thread Gary Yao
Hi Miki, We are currently working on resolving the last blocking issues for a second Release Candidate. Afterwards it depends on how many new blocking issues will be found during the testing. Best, Gary On Tue, Apr 24, 2018 at 9:42 AM, miki haiat wrote: > Its 1.4.2 ... >

Re: Class loading issues when using Remote Execution Environment

2018-04-25 Thread Chesnay Schepler
I couldn't spot any error in what you tried to do. Does the job-submission succeed if you submit the jar through the command-line client? Can you share the project, or a minimal reproducing version? On 25.04.2018 00:41, kedar mhaswade wrote: I am trying to get gradoop_demo

Re: Operators in Flink

2018-04-25 Thread Robert Metzger
Hi Felipe, Operators are fused by the system, yes. We call it operator chaining, read more about it here: https://ci.apache.org/projects/flink/flink-docs-master/concepts/runtime.html On Fri, Apr 20, 2018 at 11:58 AM, Felipe Gutierrez < felipe.o.gutier...@gmail.com> wrote: > Hi, > > I have a

Re: Testing Metrics

2018-04-25 Thread Chesnay Schepler
+1 to using reporters. You will have to explicitly pass a configuration with the reporter settings to the environment via StreamExecutionEnvironment#createLocalEnvironment(int, Configuration). The reporter can verify registrations/values and pass this information back to the main test
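A hedged configuration sketch of the approach above. `createLocalEnvironment(int, Configuration)` is named in the message itself; the reporter class name here is hypothetical, standing in for a test reporter that records registrations for the test to inspect.

```java
// Sketch: configure a custom metrics reporter on a local environment so a
// test can observe metric registrations. CollectingReporter is hypothetical.
Configuration config = new Configuration();
config.setString("metrics.reporters", "test");
config.setString("metrics.reporter.test.class", CollectingReporter.class.getName());

StreamExecutionEnvironment env =
    StreamExecutionEnvironment.createLocalEnvironment(1, config);
// ... build and execute the job, then assert against CollectingReporter's records.
```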

Re: Testing Metrics

2018-04-25 Thread Tzu-Li (Gordon) Tai
Hi, Do you mean tests to verify that some metric is actually registered? AFAIK, this is not really easy to do as a unit test. One possible way is to have an integration test that uses a metrics reporter, which you verify against. For example, the Kafka consumer integration tests that use