Re: 如何每五分钟统计一次当天某个消息的总条数

2019-03-05 Thread 王涛
是要在.window(TumblingEventTimeWindows.of(Time.days(1), Time.hours(+16))) 后面加一个自定义Trigger,对每一个元素触发。我自定义的Trigger如下: public class WindowTrigger extends Trigger @Override public TriggerResult onElement(final T element, final long timestamp, final TimeWindow window, final TriggerContext ctx) { return

Re: submit job failed on Yarn HA

2019-03-05 Thread Gary Yao
Hi Sen, I took a look at your CLI logs again, and saw that it uses the "default" Flink namespace in ZooKeeper: 2019-02-28 11:18:05,255 INFO org.apache.flink.runtime.util.ZooKeeperUtils - Using '/flink/default' as Zookeeper namespace. However, since you are using YARN, the

Re: Command exited with status 1 in running Flink on marathon

2019-03-05 Thread marzieh ghasemi
Ok, thanks. On Tue, Mar 5, 2019 at 2:43 PM Piotr Nowojski wrote: > Hi, > > Flink per se doesn’t require Hadoop to work, however keep in mind that you > need some way to provide some kind of distributed/remote file system for > checkpoint mechanism to work. If one node writes a file for >

Re: EXT :Re: Flink 1.7.1 Inaccessible

2019-03-05 Thread Seye Jin
Hi till, there were no warn or error log messages. We have been using Flink for a long time now and never experienced this issue(we just migrated to 1.7 from 1.4 though).It was a critical app and after multiple tries to try and resolve, we updated the *high-availabilty.cluster-id* and attached the

Re: submit job failed on Yarn HA

2019-03-05 Thread 孙森
Hi Gary: Thanks very much! I have tried it as the way you said. It works. Hopes that the bug can be fixed as soon as possible. Best! Sen > 在 2019年3月5日,下午3:15,Gary Yao 写道: > > Hi Sen, > > In that email I meant that you should disable the ZooKeeper configuration in > the

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-05 Thread Becket Qin
Forgot to provide the link... [1] https://www.apache.org/dev/services.html#blogs (Apache infra services) [2] https://www.apache.org/dev/freebsd-jails (FreeBSD Jail provided by Apache Infra) On Wed, Mar 6, 2019 at 10:46 AM Becket Qin wrote: > Hi Robert, > > Thanks for the feedback. These are

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-05 Thread Becket Qin
Hi Robert, Thanks for the feedback. These are good points. We should absolutely shoot for a dynamic website to support more interactions in the community. There might be a few things to solve: 1. The website code itself. An open source solution would be great. TBH, I do not have much experience

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-05 Thread Shaoxuan Wang
Hi Becket and Robert, I like this idea! Let us roll this out with Flink connectors at the first beginning. We can start with a static page, and upgrade it when we find a better solution for dynamic one with rich functions. Regards, Shaoxuan On Wed, Mar 6, 2019 at 1:36 AM Robert Metzger

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-05 Thread vino yang
Hi Becket, Great idea! +1 for this proposal. Best, Vino Bowen Li 于2019年3月6日周三 上午6:24写道: > Thanks for bring it up, Becket. That sounds very good to me. Spark also > has such a page for ecosystem project > https://spark.apache.org/third-party-projects.html and a hosted website >

Re: ????????????????????????????????????????

2019-03-05 Thread ??????
streamOperator .assignTimestampsAndWatermarks(new AscendingTimestampExtractor() { @Override public long extractAscendingTimestamp(EventItem eventItem) { return eventItem.getWindowEnd(); } }) .map(eventItem ->

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-05 Thread Bowen Li
Thanks for bring it up, Becket. That sounds very good to me. Spark also has such a page for ecosystem project https://spark.apache.org/third-party-projects.html and a hosted website https://spark-packages.org/ with metadata, categories/tags and stats mentioned in the doc. Bowen On Tue, Mar 5,

Re: Task slot sharing: force reallocation

2019-03-05 Thread Le Xu
Hi Till: Thanks for the reply. The setup of the jobs is roughly as follows: For a cluster with N machines, we deploy X simple map/reduce style jobs (the job DAG and settings are exactly the same, except they consumes different data). Each job has N mappers (they are evenly distributed, one mapper

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-05 Thread Robert Metzger
Hey Becket, This is a great idea! For this to be successful, we need to make sure the page is placed prominently so that the people submitting something will get attention for their contributions. I think a dynamic site would probably be better, if we want features such as up and downvoting or

Re: Checkpoint recovery and state external to flink

2019-03-05 Thread Aggarwal, Ajay
Hi Yun, This is good information. Thank you. However looks like it only applies to SinkFunction. Any thoughts for when intermediate operators are also interacting with external systems? Thanks. Ajay From: Yun Tang Date: Tuesday, March 5, 2019 at 4:04 AM To: "Aggarwal, Ajay" ,

Re: Task slot sharing: force reallocation

2019-03-05 Thread Till Rohrmann
Hard to tell whether this is related to FLINK-11815. To me the setup is not fully clear. Let me try to sum it up: According to Le Xu's description there are n jobs running on a session cluster. I assume that every TaskManager has n slots. The observed behaviour is that every job allocates the

Apache Flink - Task Manager HA setup

2019-03-05 Thread josefernandes
In mission critical setups, constant availability is a must. We are thinking on embracing Apache Flink as our Streaming engine but we have challenging SLAs. Our job plan will be based on a single window thus replicating the job could be an answer. What is the advised setup for high

RMQSource synchronous message ack

2019-03-05 Thread Gabriel Candal
Hi, Recently I've opened a Stack Overflow question about latency spikes (~500ms) after a checkpoint operation, even though the operation itself was relatively fast (~50ms). I've come to realize that the

Re: EventCountJob for Flink 1.7.2

2019-03-05 Thread Flavio Pompermaier
I've just uploaded a Flink-1.7.2 compatible version of the original code at [1]. However FoldingStateDescriptor is now deprecated and should be migrated to AggregatingStateDescriptor (according to the javadoc). Is there any guide about how to do this? I've drafted it in [2] but I didn't know the

Re: Checkpoints and catch-up burst (heavy back pressure)

2019-03-05 Thread Stephen Connolly
On Tue, 5 Mar 2019 at 12:48, Stephen Connolly < stephen.alan.conno...@gmail.com> wrote: > > > On Fri, 1 Mar 2019 at 13:05, LINZ, Arnaud > wrote: > >> Hi, >> >> >> >> I think I should go into more details to explain my use case. >> >> I have one non parallel source (parallelism = 1) that list

Re: Checkpoints and catch-up burst (heavy back pressure)

2019-03-05 Thread Stephen Connolly
On Fri, 1 Mar 2019 at 13:05, LINZ, Arnaud wrote: > Hi, > > > > I think I should go into more details to explain my use case. > > I have one non parallel source (parallelism = 1) that list binary files in > a HDFS directory. DataSet emitted by the source is a data set of file > names, not file

Re: EventCountJob for Flink 1.7.2

2019-03-05 Thread Fabian Hueske
Thanks Flavio! Am Di., 5. März 2019 um 11:23 Uhr schrieb Flavio Pompermaier < pomperma...@okkam.it>: > I discovered that now (in Flink 1.7.2( queryable state server is enabed if > queryable state client is found on the classpath, i.e.: > > > org.apache.flink >

Re: S3 parquet sink - failed with S3 connection exception

2019-03-05 Thread Averell
Hello Kostas, Thanks for your time. I started that job from fresh, set checkpoint interval to 15 minutes. It completed the first 13 checkpoints successfully, only started failing from the 14th. I waited for about 20 more checkpoints, but all failed. Then I cancelled the job, restored from the

Re: Command exited with status 1 in running Flink on marathon

2019-03-05 Thread Piotr Nowojski
Hi, Flink per se doesn’t require Hadoop to work, however keep in mind that you need some way to provide some kind of distributed/remote file system for checkpoint mechanism to work. If one node writes a file for checkpoint/savepoint, in case of restart/crash this file must be accessible from

Re: event time timezone is not correct

2019-03-05 Thread Piotr Nowojski
Hi, Yes, unfortunately this is still not resolved issue :( Piotrek > On 5 Mar 2019, at 04:34, 孙森 wrote: > > Thanks Piotrek. > > It seems the question has not been solved. I will try to use the > TIMESTAMPADD(timeUnit, integer, datetime) instead . > > Best > Sen > >> 在

[DISCUSS] Create a Flink ecosystem website

2019-03-05 Thread Becket Qin
Hi folks, I would like to start a discussion thread about creating a Flink ecosystem website. The website aims to help contributors who have developed projects around Flink share their work with the community. Please see the following doc for more details.

Re: Task slot sharing: force reallocation

2019-03-05 Thread Piotr Nowojski
Hi Le, As I wrote, you can try running Flink in job mode, which spawns separate clusters per each job. Till, is this issue covered by FLINK-11815 ? Is this the same as: > Known issues: > 1. (…) > 2. if task slots are registered before slot

Flink 在什么情况下产生乱序问题?

2019-03-05 Thread 刘 文
请教一下,大家说的Flink 乱序问题,是什么情况下产生,我没明白? ).谁给我一下会产生乱序问题的场景吗? ).以下是读取kafka中的数据,三个并行度 ).输出的结果如下:(总数据20条) 3> Message_3 1> Message_1 2> Message_2 1> Message_4 2> Message_5 3> Message_6 2> Message_8 1> Message_7 2> Message_11 3> Message_9 2> Message_14 1> Message_10 2> Message_17 3> Message_12 2> Message_20

Re: EventCountJob for Flink 1.7.2

2019-03-05 Thread Flavio Pompermaier
I discovered that now (in Flink 1.7.2( queryable state server is enabed if queryable state client is found on the classpath, i.e.: org.apache.flink flink-queryable-state-client-java_${scala.version} ${flink.version} provided I hope this could help someone else.. On Mon, Mar 4, 2019 at 6:54 PM

Re: S3 parquet sink - failed with S3 connection exception

2019-03-05 Thread Kostas Kloudas
Hi Averell, Did you have other failures before (from which you managed to resume successfully)? Can you share a bit more details about your job and potentially the TM/JM logs? The only thing I found about this is here https://forums.aws.amazon.com/thread.jspa?threadID=130172 but Flink does not

Re: Checkpoint recovery and state external to flink

2019-03-05 Thread Yun Tang
Hi Ajay I think two phase commit protocol could solve your concern for the exactly-once external system, Flink already support this feature in some sinks [1], e.g. you could refer to [2] to know which version of Kafaka producer could support exactly-once. [1]