How should I choose the deployment model in product environment

2019-03-06 Thread 126
Hi all: I want to deploy my flink cluster in product environment. How should I choose the deployment model? Standalone Cluster? yarn session?(Start a long-running Flink cluster on YARN) yarn-cluster ?(Run a single Flink job on YARN) Thanks

Re: How to monitor Apache Flink in AWS EMR (ElasticMapReduce)?

2019-03-06 Thread Yun Tang
Hi Jack How about extracting flink-metrics-prometheus-1.6.1.jar from downloaded distribution tar https://archive.apache.org/dist/flink/flink-1.6.1/ and upload it to `/usr/lib/flink/lib` on EMR? Otherwise, I believe setup a customized Flink cluster on EMR [1] should work if no other convenient

答复: Flink 在什么情况下产生乱序问题?

2019-03-06 Thread 戴嘉诚
你可以了解下触发器,默认的触发器是按照你发现的做,如果你要实时输出,可以吧触发器更改为ContinuonsEventTimeTrigger ,然后设置你的时间间隔。 发件人: 刘 文 发送时间: 2019年3月6日 22:55 收件人: user-zh@flink.apache.org 抄送: qcx978132...@gmail.com 主题: Re: Flink 在什么情况下产生乱序问题? ).在验证EventTime 加watermark 处理中,我发现往socket发送的数据,不能及时输出或没有输出 ).验证发现,只有当前发送的数据的

sql-client batch 模式执行报错

2019-03-06 Thread yuess_coder
我在sql-client提交任务: create table csv_source1( id varchar, name varchar ) with ( type ='csv', path = '/Users/IdeaProjects/github/apache-flink/build-target/bin/test1.csv' ); create table csv_sink( id varchar, name varchar ) with ( type ='csv', path =

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-06 Thread Ufuk Celebi
I like Shaoxuan's idea to keep this a static site first. We could then iterate on this and make it a dynamic thing. Of course, if we have the resources in the community to quickly start with a dynamic site, I'm not apposed. – Ufuk On Wed, Mar 6, 2019 at 2:31 PM Robert Metzger wrote: > >

Schema Evolution on Dynamic Schema

2019-03-06 Thread shkob1
Hey, My job is built on SQL that is injected as an input to the job. so lets take an example of Select a,max(b) as MaxB,max(c) as MaxC FROM registered_table GROUP BY a (side note: in order for the state not to grow indefinitely i'm transforming to a retracted stream and filtering based on a

How to monitor Apache Flink in AWS EMR (ElasticMapReduce)?

2019-03-06 Thread Jack Tuck
I currently have Flink setup and have a Job running on EMR and I'm now trying to add monitoring by sending metrics off to prometheus. I have come across an issue with running Flink on EMR. I'm using Terraform to provision EMR (I run ansible after to download and run a job). Out the box, it

How to monitor Apache Flink in AWS EMR (ElasticMapReduce)?

2019-03-06 Thread Jack Tuck
I currently have Flink setup and have a Job running on EMR and I'm now trying to add monitoring by sending metrics off to prometheus. I have come across an issue with running Flink on EMR. I'm using Terraform to provision EMR (I run ansible after to download and run a job). Out the box, it

Re: Task slot sharing: force reallocation

2019-03-06 Thread Le Xu
1.3.2 -- should I update to the latest version? Thanks, Le On Wed, Mar 6, 2019 at 4:24 AM Till Rohrmann wrote: > Which version of Flink are you using? > > On Tue, Mar 5, 2019 at 10:58 PM Le Xu wrote: > >> Hi Till: >> >> Thanks for the reply. The setup of the jobs is roughly as follows: For a

Re: Job continuously failing after Checkpoint Restore

2019-03-06 Thread Yun Tang
Hi Laura >From the exception stack, there exist two possible reasons causing this NPE. >Either the KafkaTopicPartition is null or field topic of that >KafkaTopicPartition form the union state is null. No matter what reason, the >problem might existed in the KryoSerializer which used to

Re: [1.7.1] job stuck in suspended state

2019-03-06 Thread Till Rohrmann
Hi Steven, I think I found the problem. It is caused by a JobMaster which takes a long time to suspend the job and multiple leader changes. So what happens after the first leadership revoking and regaining is that the Dispatcher recovers the submitted job but waits to execute it because the

Joining two streams of different priorities

2019-03-06 Thread Aggarwal, Ajay
My main input stream (inputStream1) gets processed using a pipeline that looks like below inputStream1 .keyBy("some-key") .window(TumblingEventTimeWindows.of(Time.seconds(Properties.WINDOW_SIZE)))

Re: Problems with restoring from savepoint

2019-03-06 Thread Павел Поцелуев
Thanks. We'll try it with 1.8.0 and let you know. ---Best regards,Pavel PotseluevSoftware developer, Yandex.Classifieds LLC 06.03.2019, 16:44, "Tzu-Li (Gordon) Tai" :Hi Pavel,As you already discovered, this problem occurs still because in 1.7.x, the KryoSerializer is still using the deprecated

Re: Flink 在什么情况下产生乱序问题?

2019-03-06 Thread 刘 文
).在验证EventTime 加watermark 处理中,我发现往socket发送的数据,不能及时输出或没有输出 ).验证发现,只有当前发送的数据的 getCurrentWatermark()的时间戳 > TimeWindow + maxOutOfOrderness 时,才会触发结束上一次window ).可是最新的记录是不能及时被处理,或者是不能被处理 ).请问这个问题怎么处理? --- > 在

DataStream EventTime last data cannot be output?

2019-03-06 Thread 刘 文
DataStream EventTime last data cannot be output ? In the verification of EventTime plus watermark processing, I found that the data sent to the socket cannot be output in time or output. ). The verification found that only the timestamp of the current send data of getCurrentWatermark() >

Re: Broadcast state with WindowedStream

2019-03-06 Thread Aggarwal, Ajay
Still looking for ideas as to how I can use broadcast state in my use case. From: "Aggarwal, Ajay" Date: Monday, March 4, 2019 at 4:52 PM To: "user@flink.apache.org" Subject: Re: Broadcast state with WindowedStream It sort of makes sense that broadcast state is not available with

4 Apache Events in 2019: DC Roadshow soon; next up Chicago, Las Vegas, and Berlin!

2019-03-06 Thread Rich Bowen
Dear Apache Enthusiast, (You’re receiving this because you are subscribed to one or more user mailing lists for an Apache Software Foundation project.) TL;DR: * Apache Roadshow DC is in 3 weeks. Register now at https://apachecon.com/usroadshowdc19/ * Registration for Apache Roadshow Chicago is

Re: Setting source vs sink vs window parallelism with data increase

2019-03-06 Thread Padarn Wilson
Thanks a lot for your suggestion. I’ll dig into it and update for the mailing list if I find anything useful. Padarn On Wed, 6 Mar 2019 at 6:03 PM, Piotr Nowojski wrote: > Re-adding user mailing list. > > > Hi, > > If it is a GC issue, only GC logs or some JVM memory profilers (like > Oracle’s

Re: Problems with restoring from savepoint

2019-03-06 Thread Tzu-Li (Gordon) Tai
Hi Pavel, As you already discovered, this problem occurs still because in 1.7.x, the KryoSerializer is still using the deprecated TypeSerializerConfigSnapshot as its snapshot, which relies on the serializer being Java-serialized into savepoints as state metadata. In 1.8.0, all Flink's built-in

Re: Using Flink in an university course

2019-03-06 Thread Wouter Zorgdrager
Hi all, Thanks for the input. Much appreciated. Regards, Wouter Op ma 4 mrt. 2019 om 20:40 schreef Addison Higham : > Hi there, > > As far as a runtime for students, it seems like docker is your best bet. > However, you could have them instead package a jar using some interface > (for example,

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-06 Thread Robert Metzger
Awesome! Thanks a lot for looking into this Becket! The VMs hosted by Infra look suitable. @Shaoxuan: There is actually already a static page. It used to be linked, but has been removed from the navigation bar for some reason. This is the page: https://flink.apache.org/ecosystem.html We could

Re: How to check validity or completeness of created checkpoint/savepoint

2019-03-06 Thread Chesnay Schepler
The existence of a _metadata file is a good indicator that Flink has finished writing the checkpoint/savepoint; IIRC we use this in our tests. I'm not aware of any other mechanism. On 06.03.2019 10:21, Parth Sarathy wrote: Hi, I am running flink 1.7.2 and working on resuming a job from a

Re: [1.7.1] job stuck in suspended state

2019-03-06 Thread Till Rohrmann
Hi Steven, a quick update from my side after looking through the logs. The problem seems to be that the Dispatcher does not start recovering the jobs after regaining the leadership after it lost it before. I cannot yet tell why this is happening and I try to further debug the problem. If you

Re: RMQSource synchronous message ack

2019-03-06 Thread Chesnay Schepler
The acknowledgement has to be synchronous since Flink assume that after notifyCheckpointComplete() all data has been persisted to external systems. For example, if record 1 to 100 were passed to the sink and a checkpoint occurs and completed, on restart Flink would continue with record 101.

RE: Checkpoints and catch-up burst (heavy back pressure)

2019-03-06 Thread LINZ, Arnaud
Hi, I like the idea, will give it a try. Thanks, Arnaud De : Stephen Connolly Envoyé : mardi 5 mars 2019 13:55 À : LINZ, Arnaud Cc : zhijiang ; user Objet : Re: Checkpoints and catch-up burst (heavy back pressure) On Tue, 5 Mar 2019 at 12:48, Stephen Connolly

Re: EXT :Re: Flink 1.7.1 Inaccessible

2019-03-06 Thread Seye Jin
You will have to copy and the link in it's entirety,Gmail not recognizing correctly http://mail-archives.apache.org/mod_mbox/flink-user/201709.mbox/< 533686a2-71ee-4356-8961-68cf3f858...@expedia.com> On Wed, Mar 6, 2019, 5:26 AM Till Rohrmann wrote: > Hmm this is strange. Retrieving more

Re: EXT :Re: Flink 1.7.1 Inaccessible

2019-03-06 Thread Till Rohrmann
Hmm this is strange. Retrieving more information from the logs would be helpful to better understand the problem. The link to the related discussion does not work. Maybe you could repost it. Cheers, Till On Wed, Mar 6, 2019 at 4:32 AM Seye Jin wrote: > > Hi till, there were no warn or error

Re: Task slot sharing: force reallocation

2019-03-06 Thread Till Rohrmann
Which version of Flink are you using? On Tue, Mar 5, 2019 at 10:58 PM Le Xu wrote: > Hi Till: > > Thanks for the reply. The setup of the jobs is roughly as follows: For a > cluster with N machines, we deploy X simple map/reduce style jobs (the job > DAG and settings are exactly the same, except

Re: Setting source vs sink vs window parallelism with data increase

2019-03-06 Thread Piotr Nowojski
Re-adding user mailing list. Hi, If it is a GC issue, only GC logs or some JVM memory profilers (like Oracle’s Mission Control) can lead you to the solution. Once you confirm that it’s a GC issue, there are numerous resources online how to analyse the cause of the problem. For that, it is

Re: Flink 在什么情况下产生乱序问题?

2019-03-06 Thread Congxian Qiu
hi 对于 kafka 来说,单 partition 内的消息可以保证顺序,但是 partition A 和 partition B 之间的消息顺序是没法保证的。 Best, Congxian On Mar 5, 2019, 18:35 +0800, 刘 文 , wrote: > 请教一下,大家说的Flink 乱序问题,是什么情况下产生,我没明白? > ).谁给我一下会产生乱序问题的场景吗? > ).以下是读取kafka中的数据,三个并行度 > ).输出的结果如下:(总数据20条) > > 3> Message_3 > 1> Message_1 > 2> Message_2 > 1>

How to check validity or completeness of created checkpoint/savepoint

2019-03-06 Thread Parth Sarathy
Hi, I am running flink 1.7.2 and working on resuming a job from a retained checkpoint / savepoint. I want to enquire if there is any reliable method which can be used to know the validity or completeness of the checkpoint / savepoint created by flink. Thanks, Parth Sarathy -- Sent from:

Problems with restoring from savepoint

2019-03-06 Thread Pavel Potseluev
Hi! We use flink-1.7.1 and have some problems with restoring from savepoint. We use custom kryo serializer which relies on protobuf representation of our model classes. It had been working fine but when we made some change in our model class it broke because of changed serialVersionUID. We can see

Re: submit job failed on Yarn HA

2019-03-06 Thread 孙森
Hi Gary: Yes, it’s the second case, the client host is different from the session cluster got started. I’ve tried the way by using" flink run -yid “, it really works. Best! Sen > 在 2019年3月6日,下午3:19,Gary Yao 写道: > > Hi Sen, > > I took a look at your CLI logs again, and saw that