Hi all:
I want to deploy my Flink cluster in a production environment. How should I
choose the deployment mode?
Standalone cluster?
YARN session? (start a long-running Flink cluster on YARN)
yarn-cluster? (run a single Flink job on YARN)
Thanks
Hi Jack
How about extracting flink-metrics-prometheus-1.6.1.jar from the downloaded
distribution tar (https://archive.apache.org/dist/flink/flink-1.6.1/) and
uploading it to `/usr/lib/flink/lib` on EMR?
Otherwise, I believe setting up a customized Flink cluster on EMR [1] should
work if there is no more convenient option.
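If you go the extraction route, something like this could pull the jar out of the tarball (a rough sketch; the member path `flink-1.6.1/opt/flink-metrics-prometheus-1.6.1.jar` is an assumption, check the layout of the archive you actually downloaded):

```python
import os
import shutil
import tarfile

def extract_member(tar_path, member_name, dest_dir):
    """Extract a single file from a .tgz into dest_dir, flattening the path."""
    with tarfile.open(tar_path, "r:*") as tar:
        member = tar.getmember(member_name)
        fileobj = tar.extractfile(member)
        dest = os.path.join(dest_dir, os.path.basename(member_name))
        with open(dest, "wb") as out:
            shutil.copyfileobj(fileobj, out)
    return dest
```

After extracting you would still need to copy the jar to `/usr/lib/flink/lib` on the EMR master/core nodes and restart the cluster for the reporter to be picked up.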
You can look into triggers. The default trigger behaves the way you observed;
if you want continuous output, you can change the trigger to
ContinuousEventTimeTrigger and set your desired interval.
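To illustrate the difference, here is a plain-Python sketch of the firing semantics (not the Flink API; the window bounds and interval are made-up values): the default EventTimeTrigger fires a window once, when the watermark passes the window end, while a ContinuousEventTimeTrigger-style trigger additionally fires every `interval` along the way.

```python
def continuous_event_time_firings(window_start, window_end, interval, watermark):
    """Return the event-time firing timestamps a ContinuousEventTimeTrigger-style
    trigger would have fired for so far: one early firing per `interval` within
    the window, plus the final firing at window_end that closes the window."""
    firings = []
    t = window_start + interval
    while t < window_end:
        if watermark >= t:
            firings.append(t)  # early firing: emit the partial window result
        t += interval
    if watermark >= window_end:
        firings.append(window_end)  # final firing closes the window
    return firings
```

With the default trigger you would only ever see the last of those firings, which is why results appear to arrive "late".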
From: 刘 文
Sent: March 6, 2019, 22:55
To: user-zh@flink.apache.org
Cc: qcx978132...@gmail.com
Subject: Re: Under what circumstances does Flink produce out-of-order data?
). In verifying EventTime plus watermark processing, I found that data sent to
the socket is not output promptly, or not output at all.
). Verification found that only when the currently sent data's
我在sql-client提交任务:
create table csv_source1(
id varchar,
name varchar
) with (
type ='csv',
path = '/Users/IdeaProjects/github/apache-flink/build-target/bin/test1.csv'
);
create table csv_sink(
id varchar,
name varchar
) with (
type ='csv',
path =
I like Shaoxuan's idea to keep this a static site first. We could then
iterate on this and make it a dynamic thing. Of course, if we have the
resources in the community to quickly start with a dynamic site, I'm
not opposed.
– Ufuk
On Wed, Mar 6, 2019 at 2:31 PM Robert Metzger wrote:
>
>
Hey,
My job is built on SQL that is injected as an input to the job, so let's take
this as an example:
Select a,max(b) as MaxB,max(c) as MaxC FROM registered_table GROUP BY a
(side note: in order for the state not to grow indefinitely, I'm transforming
to a retracted stream and filtering based on a
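For what it's worth, the retraction behaviour of a streaming MAX can be sketched in plain Python (a toy model, not the Table API; the `'+'`/`'-'` ops mirror Flink's add/retract messages for the `max(b)` column of the query above):

```python
def max_retract_stream(records):
    """Emit (op, key, max_b) messages for a streaming MAX aggregation:
    whenever a new maximum arrives, the stale result is retracted ('-')
    and the new result is added ('+')."""
    current = {}
    out = []
    for a, b in records:
        prev = current.get(a)
        if prev is None:
            current[a] = b
            out.append(('+', a, b))
        elif b > prev:
            out.append(('-', a, prev))  # retract the stale max
            current[a] = b
            out.append(('+', a, b))    # add the new max
    return out
```

Filtering on the op flag downstream is what keeps only the current result per key.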
I currently have Flink set up and have a job running on EMR, and I'm now trying
to add monitoring by sending metrics off to Prometheus.
I have come across an issue with running Flink on EMR. I'm using Terraform to
provision EMR (I run Ansible afterwards to download and run a job). Out of the box, it
1.3.2 -- should I update to the latest version?
Thanks,
Le
On Wed, Mar 6, 2019 at 4:24 AM Till Rohrmann wrote:
> Which version of Flink are you using?
>
> On Tue, Mar 5, 2019 at 10:58 PM Le Xu wrote:
>
>> Hi Till:
>>
>> Thanks for the reply. The setup of the jobs is roughly as follows: For a
Hi Laura
From the exception stack, there are two possible reasons for this NPE:
either the KafkaTopicPartition is null, or the topic field of that
KafkaTopicPartition from the union state is null. Either way, the
problem might exist in the KryoSerializer which is used to
Hi Steven,
I think I found the problem. It is caused by a JobMaster which takes a long
time to suspend the job and multiple leader changes. So what happens after
the first leadership revoking and regaining is that the Dispatcher recovers
the submitted job but waits to execute it because the
My main input stream (inputStream1) gets processed using a pipeline that looks
like below
inputStream1
.keyBy("some-key")
.window(TumblingEventTimeWindows.of(Time.seconds(Properties.WINDOW_SIZE)))
Thanks. We'll try it with 1.8.0 and let you know.
---
Best regards,
Pavel Potseluev
Software developer, Yandex.Classifieds LLC
06.03.2019, 16:44, "Tzu-Li (Gordon) Tai":
Hi Pavel,
As you already discovered, this problem occurs still because in 1.7.x, the KryoSerializer is still using the deprecated
). In verifying EventTime plus watermark processing, I found that data sent to
the socket is not output promptly, or not output at all.
). Verification found that the previous window is only closed when the
currently sent data's getCurrentWatermark() timestamp > TimeWindow + maxOutOfOrderness.
). But the latest records cannot be processed promptly, or cannot be processed at all.
). How should this problem be handled?
---
> On
DataStream EventTime: why can't the last data be output?
In verifying EventTime plus watermark processing, I found that the data sent
to the socket cannot be output in time, or is not output at all.
). The verification found that a window is only triggered when the timestamp
of getCurrentWatermark() for the currently sent data >
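The behaviour described above follows directly from how bounded-out-of-orderness watermarks work: a window only fires once a *later* record pushes the watermark past the window end, so the most recent records sit buffered until more data arrives. A small simulation (window size and lateness bound are made-up values, not from the original job):

```python
def fired_windows(timestamps, window_size, max_out_of_orderness):
    """Simulate event-time tumbling windows with a bounded-out-of-orderness
    watermark: a window [start, start + window_size) fires once the watermark
    (max seen timestamp - max_out_of_orderness) reaches its end."""
    fired = []
    max_ts = float("-inf")
    for ts in timestamps:
        max_ts = max(max_ts, ts)
        watermark = max_ts - max_out_of_orderness
        start = 0
        while start + window_size <= watermark:
            if start not in fired:
                fired.append(start)  # window [start, start+size) is emitted
            start += window_size
    return fired
```

Note how, with events 1, 2, 3, 11, only the window starting at 0 fires: the event at 11 itself stays buffered forever unless further data (or an idle-source watermark) advances the watermark past its window end. That is exactly the "last data cannot be output" symptom.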
Still looking for ideas as to how I can use broadcast state in my use case.
From: "Aggarwal, Ajay"
Date: Monday, March 4, 2019 at 4:52 PM
To: "user@flink.apache.org"
Subject: Re: Broadcast state with WindowedStream
It sort of makes sense that broadcast state is not available with
Dear Apache Enthusiast,
(You’re receiving this because you are subscribed to one or more user
mailing lists for an Apache Software Foundation project.)
TL;DR:
* Apache Roadshow DC is in 3 weeks. Register now at
https://apachecon.com/usroadshowdc19/
* Registration for Apache Roadshow Chicago is
Thanks a lot for your suggestion. I’ll dig into it and update for the
mailing list if I find anything useful.
Padarn
On Wed, 6 Mar 2019 at 6:03 PM, Piotr Nowojski wrote:
> Re-adding user mailing list.
>
>
> Hi,
>
> If it is a GC issue, only GC logs or some JVM memory profilers (like
> Oracle’s
Hi Pavel,
As you already discovered, this problem occurs still because in 1.7.x, the
KryoSerializer is still using the deprecated TypeSerializerConfigSnapshot
as its snapshot, which relies on the serializer being Java-serialized into
savepoints as state metadata.
In 1.8.0, all Flink's built-in
Hi all,
Thanks for the input. Much appreciated.
Regards,
Wouter
Op ma 4 mrt. 2019 om 20:40 schreef Addison Higham :
> Hi there,
>
> As far as a runtime for students, it seems like docker is your best bet.
> However, you could have them instead package a jar using some interface
> (for example,
Awesome! Thanks a lot for looking into this Becket! The VMs hosted by Infra
look suitable.
@Shaoxuan: There is actually already a static page. It used to be linked,
but has been removed from the navigation bar for some reason. This is the
page: https://flink.apache.org/ecosystem.html
We could
The existence of a _metadata file is a good indicator that Flink has
finished writing the checkpoint/savepoint; IIRC we use this in our
tests. I'm not aware of any other mechanism.
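That check is trivial to script (a minimal sketch, assuming the `_metadata` file sits directly inside the checkpoint/savepoint directory, which is the usual layout):

```python
import os

def is_complete_snapshot(path):
    """A checkpoint/savepoint directory is considered complete once Flink
    has finished writing its _metadata file."""
    return os.path.isfile(os.path.join(path, "_metadata"))
```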
On 06.03.2019 10:21, Parth Sarathy wrote:
Hi,
I am running flink 1.7.2 and working on resuming a job from a
Hi Steven,
a quick update from my side after looking through the logs. The problem
seems to be that the Dispatcher does not start recovering the jobs after
regaining the leadership after it lost it before. I cannot yet tell why
this is happening and am trying to debug the problem further.
If you
The acknowledgement has to be synchronous since Flink assumes that after
notifyCheckpointComplete() all data has been persisted to external
systems. For example, if record 1 to 100 were passed to the sink and a
checkpoint occurs and completed, on restart Flink would continue with
record 101.
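That contract can be shown with a toy simulation (record numbers taken from the example above; this models exactly-once restart from the last *completed* checkpoint, nothing more):

```python
def replay_after_failure(records, checkpoint_every):
    """Simulate a sink that persists records and acknowledges checkpoints.
    On failure, Flink rewinds to the last completed checkpoint, so only the
    records after it are replayed (returned here as the replay set)."""
    persisted = []
    last_completed = 0
    for i, rec in enumerate(records, start=1):
        persisted.append(rec)
        if i % checkpoint_every == 0:
            # notifyCheckpointComplete: records 1..i are now assumed durable
            last_completed = i
    return records[last_completed:]
```

If the acknowledgement were asynchronous, records before `last_completed` might not actually be durable, and the restart at `last_completed + 1` would lose them.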
Hi,
I like the idea, will give it a try.
Thanks,
Arnaud
From: Stephen Connolly
Sent: Tuesday, March 5, 2019 13:55
To: LINZ, Arnaud
Cc: zhijiang ; user
Subject: Re: Checkpoints and catch-up burst (heavy back pressure)
On Tue, 5 Mar 2019 at 12:48, Stephen Connolly
You will have to copy the link in its entirety; Gmail is not recognizing it
correctly:
http://mail-archives.apache.org/mod_mbox/flink-user/201709.mbox/<
533686a2-71ee-4356-8961-68cf3f858...@expedia.com>
On Wed, Mar 6, 2019, 5:26 AM Till Rohrmann wrote:
> Hmm this is strange. Retrieving more
Hmm this is strange. Retrieving more information from the logs would be
helpful to better understand the problem.
The link to the related discussion does not work. Maybe you could repost it.
Cheers,
Till
On Wed, Mar 6, 2019 at 4:32 AM Seye Jin wrote:
>
> Hi till, there were no warn or error
Which version of Flink are you using?
On Tue, Mar 5, 2019 at 10:58 PM Le Xu wrote:
> Hi Till:
>
> Thanks for the reply. The setup of the jobs is roughly as follows: For a
> cluster with N machines, we deploy X simple map/reduce style jobs (the job
> DAG and settings are exactly the same, except
Re-adding user mailing list.
Hi,
If it is a GC issue, only GC logs or some JVM memory profilers (like Oracle’s
Mission Control) can lead you to the solution. Once you confirm that it’s a GC
issue, there are numerous resources online how to analyse the cause of the
problem. For that, it is
hi
For Kafka, message order is guaranteed within a single partition, but the
order of messages between partition A and partition B cannot be guaranteed.
Best, Congxian
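The point above can be seen in a toy merge of two partitions (the partition contents are made up; a round-robin consumer stands in for Flink's Kafka source subtasks):

```python
import itertools

def interleave(*partitions):
    """Round-robin consume from several partitions: per-partition order is
    preserved in the merged stream, but cross-partition order is arbitrary."""
    return [msg
            for group in itertools.zip_longest(*partitions)
            for msg in group if msg is not None]

def preserves_order(merged, partition):
    """Check that the partition's messages appear as an in-order
    subsequence of the merged stream."""
    it = iter(merged)
    return all(msg in it for msg in partition)
```

Each partition's messages survive in order, but whether `a2` comes before or after `b1` in the merged stream depends entirely on consumption timing, which is the out-of-orderness people observe.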
On Mar 5, 2019, 18:35 +0800, 刘 文 , wrote:
> Could someone explain under what circumstances the Flink out-of-order problem
> people mention is produced? I don't understand.
> ). Can anyone give me a scenario that produces out-of-order data?
> ). The following reads data from Kafka with three parallel subtasks.
> ). The output is as follows (20 records in total):
>
> 3> Message_3
> 1> Message_1
> 2> Message_2
> 1>
Hi,
I am running flink 1.7.2 and working on resuming a job from a retained
checkpoint / savepoint.
I want to enquire if there is any reliable method which can be used to know
the validity or completeness of the checkpoint / savepoint created by flink.
Thanks,
Parth Sarathy
Hi! We use flink-1.7.1 and have some problems with restoring from savepoint. We use custom kryo serializer which relies on protobuf representation of our model classes. It had been working fine but when we made some change in our model class it broke because of changed serialVersionUID. We can see
Hi Gary:
Yes, it's the second case: the client host is different from the one where the
session cluster was started. I've tried using `flink run -yid`, and it really
works.
Best!
Sen
> On March 6, 2019, at 15:19, Gary Yao wrote:
>
> Hi Sen,
>
> I took a look at your CLI logs again, and saw that